<img src='https://www.icos-cp.eu/sites/default/files/2017-11/ICOS_CP_logo.png' width=400 align=right>

# ICOS Carbon Portal Python Libraries

This example uses the `icoscp_stilt` library, which provides access to results from the [STILT footprint tool](https://www.icos-cp.eu/data-services/tools/stilt-footprint).

General information on all ICOS Carbon Portal Python libraries can be found on our [help pages](https://icos-carbon-portal.github.io/pylib/). 

Documentation of the `icoscp_stilt` library, including information on running it locally, can also be found on [PyPI.org](https://pypi.org/project/icoscp_stilt/).

Note that for running this example locally, authentication is required (see the `how_to_authenticate.ipynb` notebook).


# Example: STILT CH$_4$ & CO$_2$  timeseries

In this example, we load STILT timeseries data, create some plots, and compare the STILT data with observed data.


## Imports

In [None]:
# Import STILT tools:
from icoscp_stilt import stilt

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# allow for interactive plots
#%matplotlib widget

# default size for plots
plt.rcParams['figure.figsize'] = [10, 5] 


## Model results

### Select a STILT station from the list of all available STILT results

You can also find available stations and create new ones in the [STILT viewer](https://stilt.icos-cp.eu/).

In [None]:
# list of STILT stations
stations = stilt.list_stations()

station_info_lookup = {s.id: s for s in stations}

# example: KIT Karlsruhe station, sampling height 100 m above ground
STILT_station = station_info_lookup['KIT100']

# STILT_station contains information on the station location in the STILT model 
# and also the id and sampling height of the corresponding measurement station - if one exists
STILT_station

### Retrieve the CO$_2$ and CH$_4$ time series

For more information on retrieving time series data for a station, run `help(stilt)`.

In [None]:
STILT_time_series = stilt.fetch_result_ts('KIT100', '2018-01-01', '2018-12-31')

# use date and time information (i.e. column 'isodate') as index
STILT_time_series.set_index('isodate', inplace=True)

columns = STILT_time_series.columns.to_list()
columns_co2 = [s for s in columns if 'co2' in s]
columns_ch4 = [s for s in columns if 'ch4' in s]

# display all co2 columns
# .head() shows the first five rows of the dataframe
display(STILT_time_series[columns_co2].head())

# display all ch4 columns
display(STILT_time_series[columns_ch4].head())


#### Information about the time series columns 

<b>CO$_2$</b>

`co2.stilt = co2.bio + co2.fuel + co2.cement + co2.non_fuel + co2.background`

Where the biospheric natural fluxes `co2.bio` are split into photosynthetic uptake and release by respiration:

`co2.bio = co2.bio.gee + co2.bio.resp`

The anthropogenic emissions related to fuel burning are split up according to the fuel types:

`co2.fuel = co2.fuel.coal + co2.fuel.oil + co2.fuel.gas + co2.fuel.bio + co2.fuel.waste`

The same anthropogenic emissions can alternatively be split by source category, so that both breakdowns add up to the same total:

`co2.fuel + co2.cement + co2.non_fuel = co2.energy + co2.transport + co2.industry + co2.residential + co2.other_categories`
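
The first relation above can be checked numerically. The following is a minimal sketch on a small synthetic table (the values are invented for illustration, and `co2_closure_residual` is a hypothetical helper, not part of `icoscp_stilt`). It returns the largest absolute difference between `co2.stilt` and the sum of its components.

```python
import pandas as pd

# components that should add up to co2.stilt according to the relation above
CO2_PARTS = ['co2.bio', 'co2.fuel', 'co2.cement', 'co2.non_fuel', 'co2.background']


def co2_closure_residual(df):
    """Largest absolute difference between co2.stilt and the sum of its parts."""
    return (df['co2.stilt'] - df[CO2_PARTS].sum(axis=1)).abs().max()


# synthetic stand-in for a few STILT rows (ppm, invented values)
demo_co2 = pd.DataFrame({
    'co2.bio': [-2.0, 1.5, 0.3],
    'co2.fuel': [4.0, 6.5, 3.2],
    'co2.cement': [0.1, 0.2, 0.1],
    'co2.non_fuel': [0.05, 0.1, 0.02],
    'co2.background': [405.0, 406.0, 404.5],
    'co2.stilt': [407.15, 414.3, 408.12],
})

residual = co2_closure_residual(demo_co2)
print(residual < 1e-9)
```

Once the time series has been loaded, the same helper could be applied as `co2_closure_residual(STILT_time_series)`. A result close to zero would be expected if the columns follow the relation above, although small rounding differences are possible.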

<br>

<b>CH$_4$</b>

`ch4.stilt = ch4.anthropogenic + ch4.natural + ch4.background`

Where the natural fluxes `ch4.natural` are split into microbial release and uptake, emissions from wildfires, and other natural sources:

`ch4.natural = ch4.wetlands + ch4.soil_uptake + ch4.wildfire + ch4.other_natural`

The anthropogenic emissions are split into:

`ch4.anthropogenic = ch4.agriculture + ch4.waste + ch4.energy + ch4.other_categories`
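
Rearranging the first CH$_4$ relation gives the regional enhancement above the background: `ch4.stilt - ch4.background = ch4.anthropogenic + ch4.natural`. A minimal sketch with invented values follows (`ch4_enhancement` is a hypothetical helper name used only for this illustration).

```python
import pandas as pd


def ch4_enhancement(df):
    """Regional CH4 signal: modelled total minus background (ppb)."""
    return df['ch4.stilt'] - df['ch4.background']


# synthetic stand-in for two STILT rows (ppb, invented values)
demo_ch4 = pd.DataFrame({
    'ch4.anthropogenic': [20.0, 35.0],
    'ch4.natural': [5.0, -1.0],
    'ch4.background': [1900.0, 1910.0],
    'ch4.stilt': [1925.0, 1944.0],
})

enhancement = ch4_enhancement(demo_ch4)

# the enhancement equals the sum of the anthropogenic and natural contributions
matches = (enhancement == demo_ch4['ch4.anthropogenic'] + demo_ch4['ch4.natural']).all()
print(enhancement.tolist(), matches)
```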


#### Plot STILT time series 

In [None]:
# plot the modelled concentrations 
STILT_time_series.plot(y=['co2.stilt','co2.background'], 
                       title = STILT_station.id + ' modelled CO$_2$', 
                       ylabel = 'CO$_2$ [ppm]');

# plot the fuel components 
columns_to_plot = ['co2.fuel.coal', 'co2.fuel.oil', 'co2.fuel.gas', 'co2.fuel.bio', 'co2.fuel.waste']
STILT_time_series.plot(y=columns_to_plot, 
                       title = STILT_station.id + ' modelled CO$_2$ components', 
                       ylabel = 'CO$_2$ [ppm]');

# un-comment to display the modelled concentrations for CH4
#STILT_time_series.plot(y=['ch4.stilt','ch4.background'], 
#                       title = STILT_station.id + ' modelled CH$_4$', ylabel = 'CH$_4$ [ppb]');

#### Retrieve and plot the average over the entire time period
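
The pie chart in the next cell labels each wedge with its mean value instead of a percentage. Matplotlib passes the wedge's percentage to `autopct`, so multiplying by the total and dividing by 100 recovers the absolute value. The sketch below illustrates this conversion without plotting. `make_value_formatter` and the numbers are assumptions made for this example.

```python
def make_value_formatter(total, fmt='{:.1f}'):
    """Return a function that turns a percentage of `total` into a formatted value."""
    def formatter(pct):
        return fmt.format(pct * total / 100)
    return formatter


# invented mean values (ppm) for three components
means = {'coal': 2.0, 'oil': 1.0, 'gas': 1.0}
total = sum(means.values())

to_value = make_value_formatter(total)

# simulate what matplotlib does: compute each percentage, then call the formatter
labels = [to_value(100 * value / total) for value in means.values()]
print(labels)
```

A formatter built this way could be passed as `autopct=make_value_formatter(total)` in place of the lambda.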


In [None]:
# select what columns to average and visualize
columns_to_plot = ['co2.fuel.coal', 'co2.fuel.oil', 'co2.fuel.gas', 'co2.fuel.bio', 'co2.fuel.waste']

STILT_co2_mean = STILT_time_series[columns_to_plot].agg('mean').to_frame(name='Mean values CO$_2$ [ppm]')

display(STILT_co2_mean)

# calculate the total, used to convert the percentages passed to autopct back to absolute values
total = STILT_co2_mean['Mean values CO$_2$ [ppm]'].sum()

# plot pie chart with actual values displayed
STILT_co2_mean.plot.pie(y='Mean values CO$_2$ [ppm]', 
                        autopct=lambda p: '{:.1f}'.format(p * total / 100),
                        title = STILT_station.id + ' average modelled CO$_2$ components')

plt.show()

# un-comment for CH4 example
#columns_to_plot = ['ch4.agriculture','ch4.waste','ch4.energy','ch4.other_categories']

#STILT_ch4_mean = STILT_time_series[columns_to_plot].agg('mean').to_frame(name='Mean values (ppb)')

#total = STILT_ch4_mean['Mean values (ppb)'].sum()

#STILT_ch4_mean.plot.pie(y='Mean values (ppb)', 
#                        autopct=lambda p: '{:.1f}'.format(p * total / 100), 
#                        title = STILT_station.id + ' average modelled CH$_4$ components')

#plt.show()

#### Retrieve and plot the daily averages
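
As a minimal sketch of the daily averaging step, the example below builds two days of synthetic 3-hourly values (invented for illustration) and averages them per calendar day with `resample('D').mean()`.

```python
import pandas as pd

# two days of synthetic 3-hourly data: 8 values per day
idx = pd.date_range('2018-06-01 00:00', periods=16, freq=pd.Timedelta(hours=3))
demo_3h = pd.DataFrame({'ch4.wetlands': [float(v) for v in range(16)]}, index=idx)
demo_3h.index.name = 'isodate'

# one row per calendar day, each the mean of that day's 8 values
demo_daily = demo_3h.resample('D').mean()
print(demo_daily)
```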

In [None]:
# create dataframe with daily averages
STILT_ch4_daily_average = STILT_time_series[columns_ch4].resample('D').mean().reset_index()

# convert to datetime for proper display on x-axis
STILT_ch4_daily_average['date'] = pd.to_datetime(STILT_ch4_daily_average['isodate'], format='%Y-%m-%d')

# filter for specific month
month_to_plot = 6

# choose components to plot
columns_to_plot = ['ch4.wetlands', 'ch4.soil_uptake', 'ch4.wildfire', 'ch4.other_natural']

# subset to specific month
STILT_ch4_daily_average_subset = STILT_ch4_daily_average[STILT_ch4_daily_average['date'].dt.month == month_to_plot]

# set 'date' as the index which is the x-axis in the final graph
STILT_ch4_daily_average_subset.set_index('date', inplace=True)

ax = STILT_ch4_daily_average_subset[columns_to_plot].plot.bar(stacked=True, 
                                                              ylabel = 'CH$_4$ [ppb]', 
                                                              title = STILT_station.id + ' modelled natural CH$_4$ components \n Month: ' + str(month_to_plot))

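# label each bar with its day of month; pandas bar plots place bars at integer positions
# (0, 1, 2, ...), so the labels are set explicitly from the datetime index
ax.set_xticklabels(STILT_ch4_daily_average_subset.index.strftime('%d'))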

# adjust the x-axis date labels to prevent overlap
plt.xticks(rotation=45)

plt.show()


## Observations

Get the observations that match modelled concentrations from STILT. CO$_2$ and CH$_4$ are in this case loaded separately and afterwards combined into one dataset. 

In this example we use the official ICOS CO$_2$ and CH$_4$ release data sets.


### Load the data

In [None]:
# URL for official ICOS CO2 molar fraction release data
datatype_co2='http://meta.icos-cp.eu/resources/cpmeta/atcCo2L2DataObject'
# stilt.fetch_observations_pandas retrieves observation data and metadata for a list of STILT stations, 
# but in this example our list has only a single station
obs_data_meta_co2 = stilt.fetch_observations_pandas(datatype_co2, [STILT_station])
STILT_station_obs_meta_co2 = obs_data_meta_co2[STILT_station.id].dobj
# show the metadata for the observation dataset
print(STILT_station_obs_meta_co2)
STILT_station_obs_data_co2 = obs_data_meta_co2[STILT_station.id].df
# show the observation data
display(STILT_station_obs_data_co2.head())

# URL for official ICOS CH4 molar fraction release data
datatype_ch4='http://meta.icos-cp.eu/resources/cpmeta/atcCh4L2DataObject'
obs_data_meta_ch4 = stilt.fetch_observations_pandas(datatype_ch4, [STILT_station])
STILT_station_obs_meta_ch4 = obs_data_meta_ch4[STILT_station.id].dobj
# show the metadata for the observation dataset
print(STILT_station_obs_meta_ch4)
STILT_station_obs_data_ch4 = obs_data_meta_ch4[STILT_station.id].df
# show the observation data
display(STILT_station_obs_data_ch4.head())

In [None]:
# Combine CO2 and CH4 observations into one dataset
STILT_station_obs = STILT_station_obs_data_co2.merge(STILT_station_obs_data_ch4,on='TIMESTAMP',suffixes=('_co2','_ch4'))
display(STILT_station_obs.head())

## Compare observations and STILT model result
ICOS observations are hourly, whereas the STILT results are only available every 3 hours. Merging the two datasets on their timestamps keeps only the time steps present in both, so that model and observations can be compared directly.
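
The effect of merging on timestamps can be sketched with synthetic data (all values invented for illustration). Only the hourly observations that coincide with a 3-hourly model time step survive the default inner merge.

```python
import pandas as pd

# synthetic 3-hourly "model" values indexed by time
model = pd.DataFrame(
    {'co2.stilt': [410.0, 412.0, 411.0]},
    index=pd.date_range('2018-01-01', periods=3, freq=pd.Timedelta(hours=3)),
)
model.index.name = 'isodate'

# synthetic hourly "observations" with the time in a TIMESTAMP column
obs = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2018-01-01', periods=7, freq=pd.Timedelta(hours=1)),
    'co2': [409.0, 409.5, 410.5, 411.0, 411.5, 412.5, 413.0],
})

# inner merge (the pandas default): keep only timestamps present in both tables
merged = (
    model.reset_index()
    .merge(obs, left_on='isodate', right_on='TIMESTAMP')
    .set_index('TIMESTAMP')
)
print(merged[['co2.stilt', 'co2']])
```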

In [None]:
# merge the observations to the STILT model results
STILT_model_obs = STILT_time_series.merge(STILT_station_obs, left_on = STILT_time_series.index, right_on = 'TIMESTAMP')
STILT_model_obs.set_index('TIMESTAMP', inplace=True)

ax = STILT_model_obs.plot(y = ['co2.stilt', 'co2'], grid=True, linewidth=0.5, ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT vs CO$_2$ observations')

plt.show()

In [None]:
# add a column with the model data mismatch
STILT_model_obs['Model data mismatch co2'] = STILT_model_obs['co2.stilt']-STILT_model_obs['co2']

ax = STILT_model_obs.plot(y = ['Model data mismatch co2'], grid=True, linewidth=0.5, ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT - CO$_2$ observation')

plt.show()

#### Zoom options
To zoom in on just a few days or weeks:
- either enable interactive plots by activating this line in the first cell 
    
    `%matplotlib widget`  
    
    and rerun the notebook
- or restrict the plot to a specific time span like in the next cell.
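
A third possibility, sketched here on synthetic data, is to slice the dataframe by date before plotting. Unlike `xlim`, this removes the rows outside the window, so the y-axis scales to the selected period only. The frame and values below are invented for illustration.

```python
import pandas as pd

# 40 days of synthetic daily values starting on 30 May 2018
idx = pd.date_range('2018-05-30', periods=40, freq=pd.Timedelta(days=1))
demo = pd.DataFrame({'co2.stilt': range(40)}, index=idx)

# label-based slicing on a DatetimeIndex includes both end dates
window = demo.loc['2018-06-01':'2018-06-30']
print(window.index.min(), window.index.max(), len(window))
```

Applied to the merged data, the slice could be plotted directly, for example `STILT_model_obs.loc['2018-06-01':'2018-06-30'].plot(y=['co2.stilt', 'co2'])`.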

In [None]:
# start and end date of plot
plot_start = '2018-06-01 00:00:00'
plot_end = '2018-07-01 00:00:00'

ax = STILT_model_obs.plot(y = ['co2.stilt', 'co2'], 
                                 grid=True, linewidth=0.5, xlim = (plot_start, plot_end), ylabel = 'CO$_2$ [ppm]')

plt.title('CO$_2$ STILT vs CO$_2$ observations')
plt.show()

### Compare CO$_2$ and CH$_4$


In [None]:
# plot the CO2 and CH4 time series in one plot with shared time axis
fig, axs = plt.subplots(2, 1, figsize=(12,5), sharex=True)

# Remove horizontal space between axes
fig.subplots_adjust(hspace=0)

plot_start = '2018-02-01 00:00:00'
plot_end = '2018-03-29 00:00:00'

i=0
STILT_model_obs.plot(y=['co2.stilt'], xlim = (plot_start, plot_end), color= 'tab:blue', ax=axs[i]) 
STILT_model_obs.plot(y=['co2'], xlim = (plot_start, plot_end), color= 'k', ax=axs[i])
axs[i].set_ylabel('CO$_2$ [ppm]')
axs[i].grid(axis='both',color='lightgray')
axs[i].set_title('STILT vs Observations at ' + STILT_station.name + ' ' + str(STILT_station.alt) + ' m')

i+=1
STILT_model_obs.plot(y=['ch4.stilt'], xlim = (plot_start, plot_end), color= 'tab:orange', ax=axs[i])
STILT_model_obs.plot(y=['ch4'], xlim = (plot_start, plot_end), color ='k', ax=axs[i])
axs[i].set_ylabel('CH$_4$ [ppb]')
axs[i].grid(axis='both',color='lightgray')
plt.show()