<img src='https://www.icos-cp.eu/sites/default/files/2017-11/ICOS_CP_logo.png' width=400 align=right>
<img src='https://naavre.net/img/hero-light.svg' width=200 align=right>


# ICOS Carbon Portal Python Library in NaaVRE

## Documentation

- [ICOS Carbon Portal Python Library](https://icos-carbon-portal.github.io/pylib/)
- [NaaVRE](https://qcdis.github.io/NaaVRE-website/) (Notebook as a Virtual Research Environment)

## Install libraries

In [None]:
!pip install icoscp==0.2.2 icoscp_core==0.3.9 matplotlib python-slugify minio

## NaaVRE workflow parameters

To get an ICOS API token for `param_cpauth_token`, go to https://cpauth.icos-cp.eu, log in (creating an account if needed), accept the ICOS data license, and copy the API token shown at the bottom of the page (`cpauthToken=xyzABC...`).
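
An empty or partially copied token only fails later, at authentication time, so a quick shape check before running the workflow can save a round trip. The sketch below assumes the token is pasted in the `cpauthToken=...` form shown above; `looks_like_cpauth_token` is a hypothetical helper, not part of `icoscp` or `icoscp_core`, and it cannot tell whether a token is actually valid.

```python
def looks_like_cpauth_token(token: str) -> bool:
    """Return True if `token` is non-empty and has the `cpauthToken=` prefix.

    Assumption: the value is pasted exactly as displayed on the cpauth page.
    This checks the shape only, not whether the token is accepted by ICOS.
    """
    prefix = "cpauthToken="
    token = token.strip()
    return token.startswith(prefix) and len(token) > len(prefix)


# Placeholder values for illustration, not real tokens
print(looks_like_cpauth_token(""))                    # False
print(looks_like_cpauth_token("cpauthToken=xyzABC"))  # True
```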

In [None]:
# (Do not containerize, NaaVRE workflow parameters cell)

param_cpauth_token = ''

param_ecosystem_type = 'Wetland'
param_data_type = 'ETC L2 Fluxes'
param_variable = 'CO2'

## List stations

In [None]:
# use-case-icos-list-stations
# ---
# NaaVRE:
#  cell:
#   inputs: []
#   outputs:
#    - stations_id_list: List
#   params:
#    - param_ecosystem_type:
#       type: String
#       default_value: "Wetland"
#   secrets: []
#   confs: []
#   dependencies:
#    - module: icoscp.station
#      name: station
# ...

from icoscp.station import station

# Fetch the table of all ICOS stations and keep those of the requested ecosystem type
stations = station.getIdList()
stations = stations[stations.siteType == param_ecosystem_type]
stations_id_list = list(stations.id)

stations_id_list

## List data products

In [None]:
# use-case-icos-list-data-products
# ---
# NaaVRE:
#  cell:
#   inputs:
#    - stations_id_list: List
#   outputs:
#    - dobj_list: List
#   params:
#    - param_data_type:
#       type: String
#       default_value: "ETC L2 Fluxes"
#   secrets: []
#   confs: []
#   dependencies:
#    - module: icoscp.station
#      name: station
#    - name: warnings
# ...

import warnings
from icoscp.station import station

dobj_list = []
for station_id in stations_id_list:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        s = station.get(station_id)
        datasets = s.data()
    if isinstance(datasets, str) and (datasets == 'no data available'):
        print(f'No datasets for station {station_id}')
        continue
    datasets = datasets[datasets.specLabel == param_data_type]
    dobj_list += list(datasets.dobj)

dobj_list
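
The loop above has to cope with `s.data()` returning either a table or the string `'no data available'`. A minimal sketch of that branch as a standalone function, which can be tried without network access, might look as follows. `select_dobjs` is a hypothetical helper, and the rows of the sample table are invented placeholders; only the column names `specLabel` and `dobj` come from the cell above.

```python
import pandas as pd


def select_dobjs(datasets, data_type):
    """Return the `dobj` values of rows whose `specLabel` equals `data_type`.

    Anything that is not a DataFrame (such as the 'no data available'
    string handled in the cell above) yields an empty list.
    """
    if not isinstance(datasets, pd.DataFrame):
        return []
    return list(datasets.loc[datasets.specLabel == data_type, "dobj"])


# Invented sample rows; real tables come from station.get(...).data()
sample = pd.DataFrame({
    "specLabel": ["ETC L2 Fluxes", "Other type", "ETC L2 Fluxes"],
    "dobj": ["pid-a", "pid-b", "pid-c"],
})

print(select_dobjs(sample, "ETC L2 Fluxes"))               # ['pid-a', 'pid-c']
print(select_dobjs("no data available", "ETC L2 Fluxes"))  # []
```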

## Plot data

In [None]:
# use-case-icos-plot-time-series
# ---
# NaaVRE:
#  cell:
#   inputs:
#    - dobj_list: List
#   outputs:
#    - plot_files: List
#   params:
#    - param_cpauth_token:
#       type: String
#    - param_variable:
#       type: String
#       default_value: "CO2"
#   secrets: []
#   confs: []
#   dependencies:
#     - module: icoscp.cpb.dobj
#       name: Dobj
#     - module: icoscp_core.icos
#       name: bootstrap
#     - module: icoscp
#       name: cpauth
#     - name: matplotlib.pyplot
#       asname: plt
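#     - name: slugify
#     - name: os
# ...

import os

from icoscp_core.icos import bootstrap
from icoscp import cpauth
from icoscp.cpb.dobj import Dobj
import matplotlib.pyplot as plt
import slugify

# Authenticate with the ICOS Carbon Portal using the API token
meta, data = bootstrap.fromCookieToken(param_cpauth_token)
cpauth.init_by(data.auth)

# Make sure the output directory exists before saving plots
os.makedirs('/tmp/data', exist_ok=True)

plot_files = []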
for dobj_pid in dobj_list:
    dobj = Dobj(dobj_pid)
    unit = dobj.variables[dobj.variables.name == param_variable].unit.values[0]
    name = dobj.station['org']['name']
    uri = dobj.station['org']['self']['uri']
    title = f"{name} \n {uri}"
    plot = dobj.data.plot(x='TIMESTAMP', y=param_variable, grid=True, title=title)
    plot.set(ylabel=unit)
    filename = f'/tmp/data/{slugify.slugify(dobj_pid)}.pdf'
    plt.savefig(filename)
    plot_files.append(filename)
    plt.show()

plot_files

## Upload plots (optional)

To retrieve the generated files after the workflow has completed, they need to be uploaded to a persistent storage location. For testing purposes, we provide an S3-compatible MinIO server.

Follow this [tutorial](https://docs.google.com/presentation/d/112Vs-vsOonVq1TlC4WprzWR6s9XntOBGgoFGbbT_FDg) (part "Minio Setup", slides 2 to 9) to generate `param_s3_user_prefix`, `param_s3_access_key` and `param_s3_secret_key`.

In [None]:
# (Do not containerize, NaaVRE workflow parameters cell)

param_s3_server = "scruffy.lab.uvalight.net:9000"
param_s3_bucket = "naa-vre-user-data"

param_s3_user_prefix = ""
param_s3_access_key = ""
param_s3_secret_key = ""

In [None]:
# use-case-icos-upload-files
# ---
# NaaVRE:
#  cell:
#   inputs:
#    - plot_files: List
#   outputs: []
#   params:
#    - param_s3_server:
#       type: String
#       default_value: "scruffy.lab.uvalight.net:9000"
#    - param_s3_bucket:
#       type: String
#       default_value: "naa-vre-user-data"
#    - param_s3_user_prefix:
#       type: String
#    - param_s3_access_key:
#       type: String
#    - param_s3_secret_key:
#       type: String
#   secrets: []
#   confs: []
#   dependencies:
#     - module: minio
#       name: Minio
#     - name: os
# ...

import os
from minio import Minio

minio_client = Minio(param_s3_server, access_key=param_s3_access_key, secret_key=param_s3_secret_key, secure=True)

for plot_file in plot_files:
    print("Uploading", plot_file)
    object_name = f'{param_s3_user_prefix}/vl-openlab/icos-naavre-demo/{os.path.basename(plot_file)}'
    minio_client.fput_object(param_s3_bucket, object_name=object_name, file_path=plot_file)
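
When `param_s3_user_prefix` is left empty, the f-string in the cell above produces an object name that starts with `/`. A more defensive way to build the key, using only the standard library, could be sketched as follows. `build_object_name` is a hypothetical helper; the folder `vl-openlab/icos-naavre-demo` is taken from the cell above, and the prefix and file name in the usage lines are placeholders.

```python
import posixpath


def build_object_name(user_prefix: str, plot_file: str,
                      folder: str = "vl-openlab/icos-naavre-demo") -> str:
    """Join prefix, folder and file name into an S3 object key.

    Empty parts and surrounding slashes are dropped, so the key never
    starts with '/' or contains '//'.
    """
    parts = [user_prefix.strip("/"), folder.strip("/"),
             posixpath.basename(plot_file)]
    return "/".join(part for part in parts if part)


# The user prefix and file name below are placeholders
print(build_object_name("someone@example.org", "/tmp/data/plot.pdf"))
# someone@example.org/vl-openlab/icos-naavre-demo/plot.pdf
print(build_object_name("", "/tmp/data/plot.pdf"))
# vl-openlab/icos-naavre-demo/plot.pdf
```

The returned string could then be passed as `object_name` to `fput_object` in place of the f-string.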