<img src='https://www.icos-cp.eu/sites/default/files/2017-11/ICOS_CP_logo.png' width=400 align=right>

# ICOS Carbon Portal Python Library
## Example: STILT CH$_4$ & CO$_2$ time series

In this example, we load STILT time series data, create some plots, and compare the STILT data with observed data.


### Documentation

Full documentation for the library is available on the [project page](https://icos-carbon-portal.github.io/pylib/). Installation instructions and the wheel package can be found on [pypi.org](https://pypi.org/project/icoscp/), and the source code is available on [GitHub](https://github.com/ICOS-Carbon-Portal/pylib).

In [None]:
# Import STILT tools:
from icoscp.stilt import stiltstation
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# allow for interactive plots
#%matplotlib widget

# default size for plots
plt.rcParams['figure.figsize'] = [10, 5] 


## Model results

### Create a STILT station object

Find available stations and create new ones in the [STILT viewer](https://stilt.icos-cp.eu/).

In [None]:
STILT_station = stiltstation.get(id='KIT100')

# the STILT station object has information that is used in later cells, such as its id.
print(STILT_station)

### Retrieve the CO$_2$ and CH$_4$ time series

For details on using `get_ts` to retrieve time series data for the station object, see the [library documentation](https://icos-carbon-portal.github.io/pylib/modules/#get_tsstart_date-end_date-hours-columns).

In [None]:
# the start and end dates can be changed, as long as model results are available for that period (see the link to the STILT viewer above)
start_date = '2018-01-01'  
end_date  = '2018-12-31'

STILT_co2 = STILT_station.get_ts(start_date, end_date, columns = 'co2')

# .head() shows the first five rows of the dataframe
display(STILT_co2.head())

STILT_ch4 = STILT_station.get_ts(start_date, end_date, columns = 'ch4')
display(STILT_ch4.head())

#### Information about the time series columns 

<b>CO$_2$</b>

`co2.stilt = co2.bio + co2.fuel + co2.cement + co2.background`

Where the biospheric natural fluxes `co2.bio` are split into photosynthetic uptake and release by respiration:

`co2.bio = co2.bio.gee + co2.bio.resp`

The anthropogenic emissions related to fuel burning are split up according to the fuel types:

`co2.fuel = co2.fuel.coal + co2.fuel.oil + co2.fuel.gas + co2.fuel.bio + co2.fuel.waste`

The same anthropogenic emissions can alternatively be split by source category:

`co2.fuel + co2.cement = co2.energy + co2.transport + co2.industry + co2.residential + co2.other_categories`

<br>

<b>CH$_4$</b>

`ch4.stilt = ch4.anthropogenic + ch4.natural + ch4.background`

Where the natural fluxes `ch4.natural` are split into microbial release (wetlands) and uptake (soils), emissions from wildfires, and other natural sources:

`ch4.natural = ch4.wetlands + ch4.soil_uptake + ch4.wildfire + ch4.other_natural`

The anthropogenic emissions are split into:

`ch4.anthropogenic = ch4.agriculture + ch4.waste + ch4.energy + ch4.other_categories`
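
As a rough illustration of how the CO$_2$ relation can be checked, the sketch below tests whether `co2.stilt` equals the sum of its four components. The numbers are made up (they are not STILT output) and the helper name `co2_budget_closes` is hypothetical; only the column names come from the formulas above.

```python
import numpy as np
import pandas as pd

# Made-up values in ppm, NOT real STILT results.
# Row 0 is consistent with the formula, row 1 is deliberately inconsistent.
example = pd.DataFrame({
    'co2.stilt': [411.75, 415.0],
    'co2.bio': [-1.0, -0.5],
    'co2.fuel': [2.7, 1.5],
    'co2.cement': [0.05, 0.02],
    'co2.background': [410.0, 411.0],
})


def co2_budget_closes(df, tol=1e-6):
    """Hypothetical helper: True where co2.stilt equals the sum of its components."""
    components = ['co2.bio', 'co2.fuel', 'co2.cement', 'co2.background']
    total = df[components].sum(axis=1)
    # absolute tolerance only, to absorb floating-point rounding
    return np.isclose(df['co2.stilt'], total, rtol=0.0, atol=tol)


closes = co2_budget_closes(example)
print(closes)
```

The same pattern should carry over to the CH$_4$ relations by swapping in the corresponding column names.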


#### Plot STILT time series 

In [None]:
# plot the modelled concentrations 
STILT_co2.plot(y=['co2.stilt','co2.background'], 
               title = STILT_station.id + ' modelled CO$_2$', 
               ylabel = 'CO$_2$ [ppm]');

# plot the fuel components 
columns_to_plot = ['co2.fuel.coal', 'co2.fuel.oil', 'co2.fuel.gas', 'co2.fuel.bio', 'co2.fuel.waste']
STILT_co2.plot(y=columns_to_plot, 
               title = STILT_station.id + ' modelled CO$_2$ components', 
               ylabel = 'CO$_2$ [ppm]');

# un-comment to display the modelled concentrations for CH4
#STILT_ch4.plot(y=['ch4.stilt'], 
              #title = STILT_station.id + ' modelled CH$_4$', ylabel = 'ppb');

#### Retrieve and plot the average over the entire time period


In [None]:
# select what columns to average and visualize
columns_to_plot = ['co2.fuel.coal', 'co2.fuel.oil', 'co2.fuel.gas', 'co2.fuel.bio', 'co2.fuel.waste']

STILT_co2_mean = STILT_co2[columns_to_plot].agg('mean').to_frame(name='Mean values CO$_2$ [ppm]')

display(STILT_co2_mean)

# total of the mean values, used in autopct to convert each percentage back to an absolute value
total = STILT_co2_mean['Mean values CO$_2$ [ppm]'].sum()

# plot pie chart with actual values displayed
STILT_co2_mean.plot.pie(y='Mean values CO$_2$ [ppm]', 
                        autopct=lambda p: '{:.1f}'.format(p * total / 100),
                        title = STILT_station.id + ' average modelled CO$_2$ components')

plt.show()

# un-comment for CH4 example
#columns_to_plot = ['ch4.agriculture','ch4.waste','ch4.energy','ch4.other_categories']

#STILT_ch4_mean = STILT_ch4[columns_to_plot].agg('mean').to_frame(name='Mean values (ppb)')

#total = STILT_ch4_mean['Mean values (ppb)'].sum()

#STILT_ch4_mean.plot.pie(y='Mean values (ppb)', 
#                        autopct=lambda p: '{:.1f}'.format(p * total / 100), 
#                        title = STILT_station.id + ' average modelled CH$_4$ components')

#plt.show()
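
The `autopct` callback used above receives each wedge's share in percent, so multiplying by the total and dividing by 100 recovers the absolute mean value. A minimal sketch of that conversion, using made-up numbers and a hypothetical helper named `make_absolute_label`:

```python
def make_absolute_label(total, fmt='{:.1f}'):
    """Hypothetical helper: build an autopct-style callback that shows absolute values."""
    def label(pct):
        # pct is the wedge share in percent (0-100)
        return fmt.format(pct * total / 100)
    return label


# Made-up mean values, not STILT output.
values = [2.0, 1.0, 1.0]
total = sum(values)

# The percentages a pie chart would pass to the callback for these values.
percentages = [100 * v / total for v in values]

label = make_absolute_label(total)
labels = [label(p) for p in percentages]
print(labels)
```

Passing `autopct=make_absolute_label(total)` to `plot.pie` would be one way to reuse this for both the CO$_2$ and CH$_4$ charts.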

#### Retrieve and plot the daily averages

In [None]:
# create dataframe with daily averages
STILT_ch4_daily_average = STILT_ch4.resample('D').mean().reset_index()

# convert to datetime for proper display on x-axis
STILT_ch4_daily_average['date'] = pd.to_datetime(STILT_ch4_daily_average['date'], format='%Y-%m-%d')

# filter for specific month
month_to_plot = 6

# choose components to plot
columns_to_plot = ['ch4.wetlands', 'ch4.soil_uptake', 'ch4.wildfire', 'ch4.other_natural']

# subset to specific month
STILT_ch4_daily_average_subset = STILT_ch4_daily_average[STILT_ch4_daily_average['date'].dt.month == month_to_plot]

# set 'date' as the index which is the x-axis in the final graph
STILT_ch4_daily_average_subset.set_index('date', inplace=True)

ax = STILT_ch4_daily_average_subset[columns_to_plot].plot.bar(stacked=True, 
                                                              ylabel = 'CH$_4$ [ppb]', 
                                                              title = STILT_station.id + ' modelled natural CH$_4$ components \n Month: ' + str(month_to_plot))

# label each bar with its day of the month (bar plots place bars at categorical positions, not at date values)
ax.set_xticklabels(STILT_ch4_daily_average_subset.index.strftime('%d'))

# adjust the x-axis date labels to prevent overlap
plt.xticks(rotation=45)

plt.show()
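
To make the daily averaging step concrete, here is a small sketch with a synthetic 3-hourly series (made-up values, not STILT output; only the column name `ch4.wetlands` is borrowed from the notebook). It shows how `resample('D').mean()` collapses the eight 3-hourly values of each day into one value.

```python
import pandas as pd

# Two days of synthetic 3-hourly data (8 values per day).
index = pd.date_range('2018-06-01 00:00', periods=16, freq=pd.Timedelta(hours=3))
series = pd.DataFrame({'ch4.wetlands': [float(i) for i in range(16)]}, index=index)

# One row per calendar day, holding the mean of that day's values.
daily = series.resample('D').mean()
print(daily)

# Selecting a month works on the DatetimeIndex, similar to the .dt.month filter above.
june = daily[daily.index.month == 6]
```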


## Observations

Find observations that match the modelled concentrations from STILT footprints by selecting the same station and elevation, in this case Karlsruhe at 100 m (STILT code KIT100). CO$_2$ and CH$_4$ are loaded separately in this case.

Browse data in the <a href="https://data.icos-cp.eu" target = "_blank">ICOS data portal</a> and paste the PID of the dataset for the desired station. 

#### Load the data

In [None]:
from icoscp.cpb.dobj import Dobj

In [None]:
observations_co2 = Dobj('https://meta.icos-cp.eu/objects/X5BbGcMRUH6pNLIPHvUB7utv')

# the CH4 observations at KIT100 are loaded the same way
observations_ch4 = Dobj('https://meta.icos-cp.eu/objects/3AhyD7AhmO4YLTkJd2FY1WmT')

# observations_co2 is an object that contains data comparable to the STILT data
display(observations_co2.data.head())

# it also contains metadata, such as the citation string
print(observations_co2.citation)


#### Compare observations and STILT model result
ICOS observations are hourly, whereas the STILT results are only available 3-hourly. Merging the two datasets on their timestamps keeps only the times present in both, which puts them on a common time axis for further analysis.
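
The effect of merging an hourly and a 3-hourly dataset can be seen in a small synthetic sketch. All values are made up; the column names `co2`, `co2.stilt` and `TIMESTAMP` mirror the notebook. Note that this sketch uses `left_index=True`, which is an assumption of convenience and differs slightly from how the index is passed in the cell that follows.

```python
import pandas as pd

# Synthetic hourly "observations" with a TIMESTAMP column.
obs = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2018-01-01 00:00', periods=6, freq=pd.Timedelta(hours=1)),
    'co2': [410.0, 411.0, 412.0, 413.0, 414.0, 415.0],
})

# Synthetic 3-hourly "model results" indexed by time.
model = pd.DataFrame(
    {'co2.stilt': [409.5, 413.5]},
    index=pd.date_range('2018-01-01 00:00', periods=2, freq=pd.Timedelta(hours=3)),
)

# The default inner join keeps only timestamps that exist in both datasets.
merged = model.merge(obs, left_index=True, right_on='TIMESTAMP').set_index('TIMESTAMP')
print(merged)
```

Of the six hourly observations, only the two that coincide with a model time step (00:00 and 03:00) remain after the merge.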

In [None]:
# merge the observations to the STILT data
STILT_observations_co2 = STILT_co2.merge(observations_co2.data, left_on = STILT_co2.index, right_on = 'TIMESTAMP')
STILT_observations_co2.set_index('TIMESTAMP', inplace=True)

ax = STILT_observations_co2.plot(y = ['co2.stilt', 'co2'], grid=True, linewidth=0.5, ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT vs CO$_2$ observations')

plt.show()

In [None]:
# add a column with the model data mismatch
STILT_observations_co2['Model data mismatch'] = STILT_observations_co2['co2.stilt']-STILT_observations_co2['co2']

ax = STILT_observations_co2.plot(y = ['Model data mismatch'], grid=True, linewidth=0.5, ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT - CO$_2$ observation')

plt.show()
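
Beyond plotting, the mismatch column can be condensed into summary numbers. The sketch below, on made-up values, computes the mean bias and the root-mean-square error; these two statistics are a suggestion for illustration and are not part of the original analysis.

```python
import pandas as pd

# Made-up modelled and observed CO2 values in ppm.
data = pd.DataFrame({
    'co2.stilt': [412.0, 415.0, 409.0, 411.0],
    'co2': [410.0, 414.0, 411.0, 411.0],
})

# Same definition as in the cell above: model minus observation.
data['Model data mismatch'] = data['co2.stilt'] - data['co2']

# Mean bias: positive means the model is higher than the observations on average.
mean_bias = data['Model data mismatch'].mean()

# Root-mean-square error: typical size of the mismatch regardless of sign.
rmse = (data['Model data mismatch'] ** 2).mean() ** 0.5

print(f'mean bias: {mean_bias:.2f} ppm, RMSE: {rmse:.2f} ppm')
```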

#### Zoom options
If you would like to zoom in on just a few days or weeks, <br>
- either enable interactive plots by activating this line in the first cell 
    
    `%matplotlib widget`  
    
    and rerun the notebook
- or restrict the plot to a specific time span like in the next cell.
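
A third possibility, sketched here on synthetic data, is to slice the dataframe itself before plotting instead of limiting the axis. This assumes the dataframe has a sorted `DatetimeIndex`, as the merged STILT and observation data does after `set_index('TIMESTAMP')`.

```python
import pandas as pd

# Synthetic daily series from 2018-05-30 onwards (values are made up).
index = pd.date_range('2018-05-30', periods=40, freq='D')
frame = pd.DataFrame({'co2': [400.0 + i for i in range(40)]}, index=index)

# Label-based slicing on a DatetimeIndex includes both end points.
subset = frame.loc['2018-06-01':'2018-06-30']
print(subset.index.min(), subset.index.max(), len(subset))
```

Calling `.plot()` on the subset then draws only the selected period, and the y-axis scales to that period's values rather than to the whole year.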

In [None]:
# start and end date of plot
plot_start = '2018-06-01 00:00:00'
plot_end = '2018-07-01 00:00:00'

ax = STILT_observations_co2.plot(y = ['co2.stilt', 'co2'], 
                                 grid=True, linewidth=0.5, xlim = (plot_start, plot_end), ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT vs CO$_2$ observations')
plt.show()

### Compare CO$_2$ and CH$_4$

Combine the CO$_2$ and CH$_4$ datasets into one dataset.

In [None]:
# but first merge the observations to the STILT data
STILT_observations_ch4 = STILT_ch4.merge(observations_ch4.data, left_on = STILT_ch4.index, right_on = 'TIMESTAMP')
STILT_observations_ch4.set_index('TIMESTAMP', inplace=True)

# merge the CH4 dataset to the CO2 dataset
STILT_observations_co2_ch4 = STILT_observations_co2.merge(STILT_observations_ch4, left_on = 'TIMESTAMP', right_on = 'TIMESTAMP')

# plot the CO2 and CH4 time series in one plot with shared time axis
fig, axs = plt.subplots(2, 1, figsize=(12,5), sharex=True)

# Remove horizontal space between axes
fig.subplots_adjust(hspace=0)

plot_start = '2018-02-01 00:00:00'
plot_end = '2018-03-29 00:00:00'

i=0
STILT_observations_co2_ch4.plot(y=['co2.stilt'], xlim = (plot_start, plot_end), color= 'tab:blue', ax=axs[i]) 
STILT_observations_co2_ch4.plot(y=['co2'], xlim = (plot_start, plot_end), color= 'k', ax=axs[i])
axs[i].set_ylabel('CO$_2$ [ppm]')
axs[i].grid(axis='both',color='lightgray')
axs[i].set_title('STILT vs Observations at '+STILT_station.name)

i+=1
STILT_observations_co2_ch4.plot(y=['ch4.stilt'], xlim = (plot_start, plot_end), color= 'tab:orange', ax=axs[i])
STILT_observations_co2_ch4.plot(y=['ch4'], xlim = (plot_start, plot_end), color ='k', ax=axs[i])
axs[i].set_ylabel('CH$_4$ [ppb]')
axs[i].grid(axis='both',color='lightgray')
plt.show()