<img src="https://www.icos-cp.eu/sites/default/files/2017-11/ICOS_CP_logo.png" width="300" align="right"/>

# ICOS Carbon Portal Python Library

# Example: Load data based on a SPARQL query


## Documentation, installation, source
- Documentation for the library is on the [project page](https://icos-carbon-portal.github.io/pylib/)
- Install with pip from [pypi.org](https://pypi.org/project/icoscp/)
- Source is available on [GitHub](https://github.com/ICOS-Carbon-Portal/pylib)

## Import the libraries 

In [None]:
# represent a digital object, containing the data
from icoscp.cpb.dobj import Dobj 

# execute a sparql query and return the result
from icoscp.sparql.runsparql import RunSparql  

<img src="img/sparql.png" width="400"  style="float:right">

## Copy your own SPARQL query, based on your search criteria

- Go to https://data.icos-cp.eu and find the datasets you want
- Press the icon in the middle of the screen (see image to the right) to copy your SPARQL query
- Come back here and paste it into the variable `query`

For the following example, we have searched for:<br>
 - Project: ICOS
 - Theme: EcoSystem data 
 - Ecosystem type: Deciduous Broadleaf Forests 
 - Data type: ETC L2 Fluxnet (half-hourly)
 - Data level: 2
 - Responsible country: Belgium, France, Germany, Italy<br>
https://data.icos-cp.eu/portal/#%7B%22filterCategories%22%3A%7B%22project%22%3A%5B%22icos%22%5D%2C%22theme%22%3A%5B%22ecosystem%22%5D%2C%22ecosystem%22%3A%5B%22igbp_DBF%22%5D%2C%22level%22%3A%5B2%5D%2C%22type%22%3A%5B%22etcL2Fluxnet%22%5D%2C%22countryCode%22%3A%5B%22BE%22%2C%22FR%22%2C%22DE%22%2C%22IT%22%5D%7D%7D
The goal is to plot Net Ecosystem Exchange (NEE) over time.
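
The search criteria listed above are encoded in the fragment of the portal URL (the part after `#`) as URL-encoded JSON. As a minimal sketch, using only the Python standard library, the fragment can be decoded to check which filters a link carries. The variable names are illustrative and not part of the `icoscp` library:

```python
import json
from urllib.parse import urlsplit, unquote

# Portal link copied from the example above
portal_url = (
    "https://data.icos-cp.eu/portal/#%7B%22filterCategories%22%3A%7B%22project%22"
    "%3A%5B%22icos%22%5D%2C%22theme%22%3A%5B%22ecosystem%22%5D%2C%22ecosystem%22"
    "%3A%5B%22igbp_DBF%22%5D%2C%22level%22%3A%5B2%5D%2C%22type%22%3A%5B%22"
    "etcL2Fluxnet%22%5D%2C%22countryCode%22%3A%5B%22BE%22%2C%22FR%22%2C%22DE%22"
    "%2C%22IT%22%5D%7D%7D"
)

# The fragment is percent-encoded JSON; decode it and parse it
fragment = urlsplit(portal_url).fragment
filters = json.loads(unquote(fragment))["filterCategories"]

# Print one line per filter category
for category, values in filters.items():
    print(f"{category}: {values}")
```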

In [None]:
query = '''
prefix cpmeta: <http://meta.icos-cp.eu/ontologies/cpmeta/>
prefix prov: <http://www.w3.org/ns/prov#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?dobj ?hasNextVersion ?spec ?fileName ?size ?submTime ?timeStart ?timeEnd
where {
    VALUES ?spec {<http://meta.icos-cp.eu/resources/cpmeta/etcL2Fluxnet>}
    ?dobj cpmeta:hasObjectSpec ?spec .
    BIND(EXISTS{[] cpmeta:isNextVersionOf ?dobj} AS ?hasNextVersion)
    VALUES ?station {<http://meta.icos-cp.eu/resources/stations/ES_DE-Hai> <http://meta.icos-cp.eu/resources/stations/ES_IT-BFt> <http://meta.icos-cp.eu/resources/stations/ES_DE-HoH> <http://meta.icos-cp.eu/resources/stations/ES_FR-Fon> <http://meta.icos-cp.eu/resources/stations/ES_BE-Lcr>}
    ?dobj cpmeta:wasAcquiredBy/prov:wasAssociatedWith ?station .
    ?dobj cpmeta:hasSizeInBytes ?size .
    ?dobj cpmeta:hasName ?fileName .
    ?dobj cpmeta:wasSubmittedBy/prov:endedAtTime ?submTime .
    ?dobj cpmeta:hasStartTime | (cpmeta:wasAcquiredBy / prov:startedAtTime) ?timeStart .
    ?dobj cpmeta:hasEndTime | (cpmeta:wasAcquiredBy / prov:endedAtTime) ?timeEnd .
    FILTER NOT EXISTS {[] cpmeta:isNextVersionOf ?dobj}
}
order by desc(?submTime)
'''
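
The query restricts the results to five stations through a `VALUES ?station {...}` clause. If you want to adapt the station list without copying a new query from the portal, the clause can be generated from plain station ids. This is a minimal sketch: `station_values` is a hypothetical helper, not part of `icoscp`, and it assumes that ecosystem station URIs follow the `ES_<id>` pattern seen in the query above.

```python
# Assumption: ecosystem station URIs share this prefix, as in the query above
STATION_PREFIX = 'http://meta.icos-cp.eu/resources/stations/ES_'

def station_values(station_ids):
    """Hypothetical helper: build a SPARQL VALUES clause for the given station ids."""
    uris = ' '.join(f'<{STATION_PREFIX}{sid}>' for sid in station_ids)
    return f'VALUES ?station {{{uris}}}'

# The five stations used in the example query
clause = station_values(['DE-Hai', 'IT-BFt', 'DE-HoH', 'FR-Fon', 'BE-Lcr'])
print(clause)
```

The resulting string can replace the `VALUES ?station` line in `query`.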

In [None]:
result = RunSparql(query, 'pandas')   # see the library documentation for other output formats
result.run()
result.data()

In [None]:
result.data()['dobj'].values

In [None]:
# create one digital object per result row
dobj_list = [Dobj(d) for d in result.data()['dobj']]

dobj_list

In [None]:
for o in dobj_list:
    display(o.data.head())

In [None]:
# bokeh for plotting the data
from bokeh.plotting import figure, show
from bokeh.layouts import gridplot, column, row
from bokeh.io import output_notebook
from bokeh.models import Div
output_notebook()

In [None]:
figure_list = []
title = Div(text='<h2>Net Ecosystem Exchange</h2>')

# keep the records from 2021; the end date is exclusive so that all of 31 December is kept
# (a bare date string is interpreted as midnight at the start of that day)
start_date = '2021-01-01'
end_date = '2022-01-01'

for dobj in dobj_list:
    mask = (dobj.data['TIMESTAMP'] >= start_date) & (dobj.data['TIMESTAMP'] < end_date)
    data = dobj.data[mask]  # filter by date

    # create a figure with title (width/height replace plot_width/plot_height, removed in Bokeh 3.0)
    fig = figure(width=350, height=300, title=dobj.station['id'], x_axis_type='datetime')
    # plot data (scatter draws circle markers by default)
    fig.scatter(data['TIMESTAMP'], data['NEE_VUT_REF'], size=1, alpha=0.3)

    # append to list
    figure_list.append(fig)

# link x & y axes of all figures to the first one
for i in range(1, len(figure_list)):
    figure_list[i].x_range = figure_list[0].x_range
    figure_list[i].y_range = figure_list[0].y_range

grid = gridplot([figure_list])
show(column(title, grid))
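
When filtering half-hourly data with date strings, keep in mind that pandas parses a bare date such as `'2021-12-31'` as midnight at the start of that day. The following minimal sketch uses synthetic timestamps (not ICOS data) to compare an inclusive bound on the last day with an exclusive bound on the following day:

```python
import pandas as pd

# Synthetic half-hourly timestamps for the last two days of 2021 (illustrative only)
ts = pd.Series(pd.date_range('2021-12-30 00:00', '2021-12-31 23:30', freq='30min'))

# '2021-12-31' is parsed as 2021-12-31 00:00, so later records that day are dropped
inclusive_mask = ts <= '2021-12-31'
# an exclusive bound on the next day keeps every record of 2021-12-31
exclusive_mask = ts < '2022-01-01'

n_inclusive = int(inclusive_mask.sum())
n_exclusive = int(exclusive_mask.sum())
print(n_inclusive, n_exclusive)
```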