<img src='https://www.icos-cp.eu/sites/default/files/2017-11/ICOS_CP_logo.png' width=400 align=right>

# ICOS Carbon Portal Python Library
## Example: STILT footprints and timeseries

This example shows how to load timeseries data and footprints, and make some plots using Holoviews and Geoviews to create a map.

## Documentation
Full documentation for the library is on the [project page](https://icos-carbon-portal.github.io/pylib/). Installation instructions and the wheel are on [pypi.org](https://pypi.org/project/icoscp/), and the source is available on [GitHub](https://github.com/ICOS-Carbon-Portal/pylib).

In [None]:
# import matplotlib
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

import pandas as pd

#Import STILT tools:
from icoscp.stilt import stiltstation

### Create a STILT station object

In [None]:
st = stiltstation.get(id='kit100')
print(st)

### Retrieve the default time series data

In [None]:
yearstart = '2018-01-01'
yearend = '2018-12-31'

# When retrieving the station data the default is to load the columns
# ["isodate","co2.stilt","co2.fuel","co2.bio", "co2.background"]
data = st.get_ts(yearstart, yearend)
data.head()

### Plot STILT time series

In [None]:
data.plot(y=['co2.stilt', 'co2.background'], title=st.id, ylabel='ppm', figsize=(8,4))

### Extract time series with columns = 'co2'
See the documentation for the columns you can request:
[https://icos-carbon-portal.github.io/pylib/modules/#get_tsstart_date-end_date-hours-columns](https://icos-carbon-portal.github.io/pylib/modules/#get_tsstart_date-end_date-hours-columns)

In [None]:
# These date constraints will be used in the rest of this example notebook
start = '2018-01-01'
end = '2018-01-31'

In [None]:
stiltdata = st.get_ts(start, end, columns='co2')
stiltdata.head()

In [None]:
# This dataset has 16 columns given by
stiltdata.columns

### A comment on relations between the columns 
The columns are related by
- `co2.stilt = co2.bio + co2.fuel + co2.cement + co2.background`

The natural biospheric flux `co2.bio` is split into photosynthetic uptake (`co2.bio.gee`) and release by respiration (`co2.bio.resp`):
- `co2.bio = co2.bio.gee + co2.bio.resp`

The anthropogenic emissions related to fuel burning are split by fuel type:
- `co2.fuel = co2.fuel.coal + co2.fuel.oil + co2.fuel.gas + co2.fuel.bio + "waste burning"`<br>
However, waste burning is currently *not explicitly stored* in the STILT dataset.

Other anthropogenic source category emissions are related according to the formula: 
- `co2.fuel + co2.cement = co2.energy + co2.transport + co2.industry + co2.others`
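
The relations above can be checked numerically. The following is a minimal sketch on a small synthetic table: the numbers are made up and are not STILT output, and the helper name `relation_residuals` is hypothetical (it is not part of `icoscp`). With real data the residuals may differ slightly from zero, for example because of rounding.

```python
import pandas as pd

# Synthetic values (ppm) for illustration only; not real STILT output.
demo = pd.DataFrame({
    'co2.bio.gee': [-3.0, -1.5],
    'co2.bio.resp': [2.0, 2.5],
    'co2.fuel': [4.0, 6.0],
    'co2.cement': [0.5, 0.25],
    'co2.background': [405.0, 406.0],
})
# Build the dependent columns from the relations listed above.
demo['co2.bio'] = demo['co2.bio.gee'] + demo['co2.bio.resp']
demo['co2.stilt'] = demo[['co2.bio', 'co2.fuel', 'co2.cement', 'co2.background']].sum(axis=1)


def relation_residuals(df):
    """Hypothetical helper: left-hand side minus right-hand side per relation."""
    return pd.DataFrame({
        'stilt': df['co2.stilt'] - (df['co2.bio'] + df['co2.fuel']
                                    + df['co2.cement'] + df['co2.background']),
        'bio': df['co2.bio'] - (df['co2.bio.gee'] + df['co2.bio.resp']),
    })


residuals = relation_residuals(demo)
# Largest absolute residual per relation; zero means the relation holds.
print(residuals.abs().max())
```

Applied to `stiltdata`, the same helper would show how closely the stored columns follow the relations.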


In [None]:
df_bio = stiltdata[['co2.bio','co2.bio.gee', 'co2.bio.resp']]
df_fuel = stiltdata[['co2.fuel','co2.fuel.coal','co2.fuel.oil','co2.fuel.gas','co2.fuel.bio']]
# Waste burning is not stored, so compute it as the remainder:
# co2.fuel minus the sum of the stored fuel types
waste = (df_fuel['co2.fuel'] - df_fuel.iloc[:, 1:].sum(axis=1)).rename('calc.waste')
df_fuel = pd.concat([df_fuel, waste], axis=1)
df_source = stiltdata[['co2.fuel','co2.cement','co2.energy','co2.transport','co2.industry','co2.others']]


### Pie charts
In the next example we sum each column and visualize the totals in pie charts.<br>
**Note:** We take the *absolute value* of the biospheric sums because the columns contain both positive and negative values. The biospheric flux relation should then be replaced by `co2.bio = - co2.bio.gee + co2.bio.resp`; the proportions, however, are the same.
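
To illustrate the note: pie wedges need non-negative sizes, while the summed uptake column is negative. The sketch below uses made-up sums (not STILT output) for the two biospheric components and converts their absolute values into shares.

```python
import pandas as pd

# Made-up column sums (ppm), standing in for the gee and resp sums.
demo_sums = pd.Series({'co2.bio.gee': -30.0, 'co2.bio.resp': 50.0})

# Absolute values give non-negative wedge sizes.
demo_sizes = demo_sums.abs()

# Share of each component in the pie.
demo_shares = demo_sizes / demo_sizes.sum()
print(demo_shares)
```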

In [None]:
biocomponent = abs(df_bio.agg('sum'))
fuelcomponent = df_fuel.agg('sum')
sourcecomponent = df_source.agg('sum')

# Show the summed biospheric components (absolute values)
biocomponent

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(8, 2))

components = [biocomponent, fuelcomponent, sourcecomponent]
titles = ['STILT bio components', 'STILT fuel components', 'STILT source components']

# One pie chart per component group
for ax, current_data, current_title in zip(axes, components, titles):
    ax.pie(current_data, labels=current_data.index, textprops={'fontsize': 8})
    ax.set_xlabel(current_title)
fig.subplots_adjust(wspace=1.3)
fig.suptitle('STILT CO$_2$ components', fontsize=16)

plt.show()

### Plotting the co2 components

In [None]:
ax = pd.concat([df_bio,df_fuel,df_source.drop(['co2.fuel'],axis=1)],axis=1).plot()
ax.legend(bbox_to_anchor=(1.01,1), loc='upper left', fontsize=8)
plt.show()

### Aggregate by day
As an example, we now aggregate the data by day and create a stacked bar graph.
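
A minimal sketch of the daily aggregation on synthetic data: the 3-hourly time step and the values are assumptions made for illustration, not STILT output.

```python
import numpy as np
import pandas as pd

# Two days of synthetic 3-hourly values (16 time steps).
demo_index = pd.date_range('2018-01-01', periods=16, freq=pd.Timedelta(hours=3))
demo_3h = pd.DataFrame(
    {'co2.fuel': np.ones(16), 'co2.cement': np.full(16, 0.5)},
    index=demo_index,
)

# One row per calendar day, each holding the sum of that day's 8 time steps.
demo_daily = demo_3h.resample('D').sum()
print(demo_daily)
```

Because the result is a sum, its magnitude depends on the number of time steps per day; `mean()` in place of `sum()` would give a daily average instead.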

In [None]:
day = stiltdata.iloc[:,2:15].resample('D').sum()
day

In [None]:
# plot the bar graph
ax1 = day.plot.bar(stacked='True')
ax1.legend(loc='best', fontsize=8)

# adjust the xticks
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
ax1.set_ylabel('$\mathbf{Note:}$ These values are not independent')
ax1.set_title('$\mathbf{Note:}$ These values are not independent')
# display
ax1.plot()

## Load observation and compare to model result

In [None]:
from icoscp.cpb.dobj import Dobj

In [None]:
kit100 = Dobj('https://hdl.handle.net/11676/LJ4uetvEho7-k9K9TUnLHfFh')

In [None]:
kit100.citation

### Create a mask to get the same timeframe as the STILT data
Here, the `start` and `end` dates are the ones defined earlier in this notebook.
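
A minimal sketch of such a mask on a synthetic table. The frame below only mimics the `TIMESTAMP` and `co2` columns; its values and the `demo_` names are assumptions made for illustration.

```python
import pandas as pd

# Synthetic hourly observations from 2017-12-31 00:00 to 2018-01-02 23:00.
demo_obs = pd.DataFrame({
    'TIMESTAMP': pd.date_range('2017-12-31', periods=72, freq=pd.Timedelta(hours=1)),
    'co2': list(range(72)),
})

demo_start = pd.Timestamp('2018-01-01')
demo_end = pd.Timestamp('2018-01-01 21:00')

# Both comparisons are inclusive, so rows at exactly start and end are kept.
demo_mask = (demo_obs['TIMESTAMP'] >= demo_start) & (demo_obs['TIMESTAMP'] <= demo_end)
demo_selected = demo_obs[demo_mask].set_index('TIMESTAMP')
print(demo_selected.index.min(), demo_selected.index.max(), len(demo_selected))
```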

In [None]:
# The last timestamp in our STILT data is 2018-01-31 21:00:00.
# To avoid discrepancies when comparing observations with the STILT data,
# we extend the end date to that hour
enddate = pd.to_datetime(end) + pd.DateOffset(hours=21)
mask = (kit100.data['TIMESTAMP'] >= start) & (kit100.data['TIMESTAMP'] <= enddate)

# set_index returns a new frame, which avoids modifying a slice in place
obsdata = kit100.data[mask].set_index('TIMESTAMP')
obsdata['co2']

### Resample STILT data
Because the observations are hourly aggregates, we resample the STILT output to hourly values, which makes it easier to compare observations with the model.
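
A minimal sketch of this upsampling step, assuming the model output is coarser than hourly (3-hourly in this synthetic example; the values are made up).

```python
import pandas as pd

# Three synthetic 3-hourly values; not STILT output.
demo_index = pd.date_range('2018-01-01', periods=3, freq=pd.Timedelta(hours=3))
demo_coarse = pd.Series([400.0, 403.0, 409.0], index=demo_index, name='co2.stilt')

# Upsample to hourly: the new time steps start as NaN and are then
# filled by linear interpolation between the neighbouring values.
demo_hourly = demo_coarse.resample(pd.Timedelta(hours=1)).mean().interpolate('linear')
print(demo_hourly)
```

The interpolated hours are estimates rather than model output, which is worth keeping in mind when interpreting differences at those time steps.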

In [None]:
stilthourly = stiltdata.resample('1H').mean().interpolate('linear')
stilthourly['co2.stilt']

### Data harmonization and plot
Next, we merge our dataframes. Their lengths may differ: the observations may contain fewer data points because of measurement interruptions or values removed by QA/QC. After the merge, missing values are represented as NaN.
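
A minimal sketch of how the merge handles a gap, using two small synthetic frames (values made up; one observation is deliberately missing).

```python
import pandas as pd

demo_index = pd.date_range('2018-01-01', periods=4, freq=pd.Timedelta(hours=1))
demo_model = pd.DataFrame({'co2.stilt': [410.0, 411.0, 412.0, 413.0]}, index=demo_index)

# Observations with the 02:00 value missing (e.g. removed by QC).
demo_obs = pd.DataFrame({'co2': [409.0, 411.5, 412.0]}, index=demo_index[[0, 1, 3]])

# join() is a left join on the index by default: every model time step
# is kept and missing observations become NaN.
demo_merged = demo_model.join(demo_obs)
demo_merged['diff'] = demo_merged['co2.stilt'] - demo_merged['co2']
print(demo_merged)
print('missing observations:', demo_merged['co2'].isna().sum())
```

The NaN propagates into `diff`, so the gap shows up as a break in the plotted line rather than as a spurious value.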

In [None]:
harmonized = stilthourly.join(obsdata)
harmonized.plot(y = ['co2.stilt', 'co2'], grid=True, linewidth=0.5)

### Plot difference

In [None]:
harmonized['diff'] = harmonized['co2.stilt']-harmonized['co2']

In [None]:
fig_dif, ax_dif = plt.subplots()
ax_dif.plot(harmonized['diff'])
ax_dif.grid(color='0.9')
ax_dif.set_ylabel('diff (ppm)')

# adjust the xticks
ax_dif.xaxis.set_major_formatter(mdates.DateFormatter('%d'))