<img src='./img/LogoWekeo_Copernicus_RGB_0.png' align='right' width='20%'></img>

# Tutorial on climate data access through WEkEO

This tutorial focusses on how to access climate data through [WEkEO](https://www.wekeo.eu/), the EU Copernicus DIAS (Data and Information Access Service) reference service for environmental data. In addition to data, WEkEO provides virtual processing environments and skilled user support.

WEkEO offers access to satellite data from the Sentinel missions, and many products produced by the Copernicus climate change, atmosphere, marine and land monitoring services. 

The Copernicus Climate Change service provides authoritative information about the past, present and future climate. Its product portfolio includes the following:

- **Satellite and in-situ observations**
- **Reanalysis**
- **Seasonal forecasts**
- **Climate projections**
- **Climate indices**

This data is accessible through the [C3S Climate Data Store (CDS)](https://cds.climate.copernicus.eu/), and in future also through WEkEO. While some C3S datasets are available through WEkEO, this is still work in progress. This [Jupyter Notebook](https://www.wekeo.eu/docs/using-jupyter) demonstrates how to access data through WEkEO, but please be aware that for the time being, C3S data should be accessed via the CDS. 

Users can access data from WEkEO either directly from the [WEkEO web platform](https://www.wekeo.eu/), or through the [Harmonised Data Access (HDA) API](https://www.wekeo.eu/docs/harmonised-data-access-api), which is a REST interface.

This Jupyter Notebook is a step-by-step guide on how to search for and download data from WEkEO using the `HDA API`.

The tutorial consists of the following steps:
1. [Install the WEkEO HDA client](#wekeo_hda_install)
2. [Search for datasets on WEkEO](#wekeo_search)
3. [Configure the WEkEO API Authentication](#wekeo_hda_auth)
4. [Download requested data](#wekeo_download)
 
Having downloaded data, follow the step below to view it and create a simple plot:
 
5. [View and plot data](#wekeo_view)

#### Load required libraries

In [None]:
import os
import sys
import json
from zipfile import ZipFile
import time
import base64
from IPython.core.display import HTML

import requests
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import xarray as xr

<hr>

### <a id='wekeo_hda_install'></a>1. Install the WEkEO HDA client

The WEkEO HDA client is a Python-based library. It provides support for both Python 2.7.x and Python 3.

To install the WEkEO HDA client with the package manager `pip`, run the command shown below on Unix/Linux.

In [None]:
pip install hda

Please verify that the following requirements are installed before moving on to the next step:
   - Python 3
   - requests
   - tqdm
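One way to check these requirements from within Python is sketched below. It only tests whether each package can be found by the import system and does not check versions. The helper name `check_requirements` is made up for this illustration.

```python
import importlib.util


def check_requirements(packages):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


# Packages listed above, plus the HDA client itself
status = check_requirements(["requests", "tqdm", "hda"])
for name, available in status.items():
    print(f"{name}: {'found' if available else 'MISSING'}")
```

Any package reported as missing can then be installed with `pip` before continuing.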

#### Load WEkEO HDA client

The `hda` client is a Python 3 client that can be used to search for and download products using the WEkEO Harmonised Data Access (HDA) API.
HDA is a RESTful interface that allows users to search for and download WEkEO datasets.
Documentation about its usage can be found at https://www.wekeo.eu/.

In [None]:
from hda import Client

<hr>

### <a id='wekeo_search'></a>2. Search for datasets on WEkEO

Go to [WEkEO DATA](https://wekeo.eu/data?view=catalogue). Clicking the + to add a layer opens a catalogue search. Here you can use free text, or you can use the filter options on the left to refine your search by satellite platform, sensor, Copernicus service, area (region of interest), general time period (past or future), as well as through a variety of flags.

You can click on the dataset you are interested in and you will be guided to a range of details including the dataset temporal and spatial extent, collection ID, and metadata.

Now search for the product `Sea level daily gridded data for the global ocean from 1993 to present`. You can find it more easily by selecting 'C3S (Climate)' in the 'COPERNICUS SERVICE' filter group. 

Once you have found it, select 'Details' to read the dataset description.

<br>

<div style='text-align:center;'>
<figure><img src='./img/WEkEO_data.png' width='70%' />
    <figcaption><i>WEkEO interface to search for datasets</i></figcaption>
</figure>
</div>

The dataset description provides the following information:
- **Abstract**, containing a general description of the dataset,
- **Classification**, including the Dataset ID,
- **Resources**, such as a link to the Product Data Format Specification guide, and JSON metadata,
- **Contacts**, where you can find further information about the data source from its provider.  

You need the `Dataset ID` to request data from the Harmonised Data Access API. 

<br>

<div style='text-align:center;'>
<figure><img src='./img/SeaLevel_info.png' width='40%' />
    <figcaption><i>Dataset information on WEkEO</i></figcaption>
</figure>
</div>
<br>

Let's store the Dataset ID as a variable called `dataset_id` to be used later.

In [None]:
dataset_id = "EO:ECMWF:DAT:SEA_LEVEL_DAILY_GRIDDED_DATA_FOR_GL"

Now select `Add to map` in the data description to add the selected dataset to the list of layers in your map view. Once the dataset appears as a layer, select the `subset and download` icon. This will enable you to specify the variables, temporal and in some cases geographic extent of the data you would like to download. Select `2019` as year, `August` as month, and `15` as day. Then select `Zip file` as format.

Now select `Show API request`. This will show the details of your selection in `JSON` format. If you now select `Copy`, you can copy these details to the clipboard and then paste them either into a text file to create a `JSON` file (see example [here](./SeaLevel_data_descriptor.json)), or directly into a notebook cell.

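The Harmonised Data Access API can read this information, which is in the form of a dictionary. Depending on the dataset, it may contain the following keys:
- `datasetId`: the dataset's collection ID
- `multiStringSelectValues`: selections that accept several values, e.g. variable, year, month and day
- `stringChoiceValues`: single-choice options, e.g. the type of dataset ('Non Time Critical') or the download format
- `dateRangeSelectValues`: the time period for which you would like to retrieve data
- `boundingBoxValues`: optional, defines a geographic subset of a global field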

<br>

<div style='text-align:center;'>
<figure><img src='./img/SeaLevel_params_json.png' width='60%' />
    <figcaption><i>Displaying a JSON query from a request made to the Harmonised Data Access API through the data portal</i></figcaption>
</figure>
</div>
<br>

If you created a `JSON` file, you can load it with `json.load()`:

In [None]:
try:
    with open('./SeaLevel_data_descriptor.json', 'r') as f:
        data = json.load(f)
    print('Your JSON file:')
    print(data)
except FileNotFoundError:
    print('Your JSON file was not found, please check its path!')
except json.JSONDecodeError:
    print('Your JSON file is not in the correct format, please check it!')
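If you would rather create the `JSON` file from Python than from a text editor, the round trip could look like the sketch below. The dataset ID and the file name `my_request.json` are placeholders invented for this illustration, and the file is written to a temporary directory so that nothing in your working directory is overwritten.

```python
import json
import os
import tempfile

# Minimal request dictionary; the dataset ID is a hypothetical placeholder
request = {
    "datasetId": "EXAMPLE:DATASET:ID",
    "stringChoiceValues": [{"name": "format", "value": "zip"}],
}

# Hypothetical file location inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "my_request.json")

# Write the dictionary to disk as JSON
with open(path, "w") as f:
    json.dump(request, f, indent=2)

# Read it back, as done with json.load() above
with open(path, "r") as f:
    reloaded = json.load(f)

print(reloaded == request)
```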

Alternatively, you can paste the dictionary describing your data into a cell, as done below:

In [None]:
data = {
  "datasetId": "EO:ECMWF:DAT:SEA_LEVEL_DAILY_GRIDDED_DATA_FOR_GLOBAL_OCEAN_1993_PRESENT",
  "multiStringSelectValues": [
    {
      "name": "variable",
      "value": [
        "all"
      ]
    },
    {
      "name": "year",
      "value": [
        "2019"
      ]
    },
    {
      "name": "month",
      "value": [
        "08"
      ]
    },
    {
      "name": "day",
      "value": [
        "15"
      ]
    }
  ],
  "stringChoiceValues": [
    {
      "name": "format",
      "value": "zip"
    }
  ]
}
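When you need the same kind of request for several dates, it can be convenient to build the dictionary with a small function instead of editing it by hand. The sketch below mirrors the structure of the dictionary above. The function name `build_request` and its parameters are made up for this illustration, and other datasets may expect different keys.

```python
def build_request(dataset_id, year, month, day, variable="all", file_format="zip"):
    """Assemble a request dictionary with the same layout as the example above."""

    def selection(name, value):
        # Each selection is a small dictionary with a name and a value
        return {"name": name, "value": value}

    return {
        "datasetId": dataset_id,
        "multiStringSelectValues": [
            selection("variable", [variable]),
            selection("year", [year]),
            selection("month", [month]),
            selection("day", [day]),
        ],
        "stringChoiceValues": [selection("format", file_format)],
    }


request = build_request(
    "EO:ECMWF:DAT:SEA_LEVEL_DAILY_GRIDDED_DATA_FOR_GLOBAL_OCEAN_1993_PRESENT",
    year="2019",
    month="08",
    day="15",
)
print(request)
```

Note that year, month and day are passed as strings, as in the request copied from the WEkEO data portal.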

### <a id='wekeo_hda_auth'></a>3. Configure the WEkEO API Authentication

To interact with WEkEO's Harmonised Data Access API, first make sure that the file `$HOME/.hdarc` exists and contains the URL of the API endpoint together with your user name and password.

For example, to check whether `.hdarc` already exists in your `$HOME` directory, open a terminal and list the hidden files there, for instance with `ls -a $HOME`.

If the file is missing, create `$HOME/.hdarc` (in your Unix/Linux environment) and enter the API URL and the credentials of your WEkEO account.

If you do not have a WEkEO account, please register at the WEkEO registration page https://my.wekeo.eu/web/guest/user-registration.
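The credentials file can also be written from Python. The sketch below assumes that the file holds one `key: value` pair per line with the keys `url`, `user` and `password`. This layout, the helper name `write_hdarc` and the placeholder values are assumptions, so please check the HDA client documentation for the exact format and endpoint URL expected by your version.

```python
import os
import tempfile


def write_hdarc(path, url, user, password):
    """Write a credentials file with one 'key: value' pair per line (assumed layout)."""
    content = f"url: {url}\nuser: {user}\npassword: {password}\n"
    with open(path, "w") as f:
        f.write(content)
    # Restrict access to the owner, since the file contains a password
    os.chmod(path, 0o600)
    return content


# For a real setup the path would be os.path.expanduser("~/.hdarc").
# A temporary location is used here so that no existing file is overwritten.
demo_path = os.path.join(tempfile.mkdtemp(), ".hdarc")
written = write_hdarc(
    demo_path,
    url="<API endpoint URL>",      # placeholder, replace with the real endpoint
    user="<your username>",        # placeholder
    password="<your password>",    # placeholder
)
print(written)
```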

### <a id='wekeo_download'></a>4. Download requested data

As a final step, you can use the client directly to download the data, as in the following example.

In [None]:
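# Create the client; it uses the credentials stored in $HOME/.hdarc
c = Client(debug=True)

# Search for products matching the request, then download them
matches = c.search(data)
print(matches)
matches.download()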

### <a id='wekeo_view'></a>5. View and plot data

First we need to unzip the file we downloaded:

In [None]:
zip_file = r'dataset-satellite-sea-level-global-48ffff29-fc03-4dc0-bf38-6e071421c012.zip'

# Open the downloaded zip file (adapt the file name above to your own download)
with ZipFile(zip_file, 'r') as zipObj:
    # Extract all the contents of the zip file into the current directory
    zipObj.extractall()
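The name of the downloaded zip file contains an identifier that will differ for each request, so the hard-coded name above needs adapting. As an alternative, the sketch below picks the most recently modified zip file in a directory and extracts it. The helper name `extract_latest_zip` is made up for this illustration, and the demonstration runs on a throwaway archive rather than on real data.

```python
import glob
import os
import tempfile
from zipfile import ZipFile


def extract_latest_zip(directory, pattern="*.zip"):
    """Extract the most recently modified zip file in `directory`; return member names."""
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        raise FileNotFoundError(f"No file matching {pattern!r} in {directory!r}")
    latest = max(candidates, key=os.path.getmtime)
    with ZipFile(latest, "r") as zip_obj:
        zip_obj.extractall(directory)
        return zip_obj.namelist()


# Demonstration on a throwaway archive in a temporary directory
demo_dir = tempfile.mkdtemp()
with ZipFile(os.path.join(demo_dir, "demo.zip"), "w") as zip_obj:
    zip_obj.writestr("example.nc", b"placeholder bytes, not real NetCDF data")

extracted = extract_latest_zip(demo_dir)
print(extracted)
```

For the tutorial data you would call `extract_latest_zip('.')`, assuming the download was saved to the current directory.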

Having unzipped the file, notice that the data is in NetCDF format (.nc file). This is a commonly used format for array-oriented scientific data. 

To read and view this data we will make use of the Xarray library. Xarray is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun! We will read the data from our NetCDF file into an Xarray **"dataset"**.

In [None]:
nc_file = r'dt_global_twosat_phy_l4_20190815_vDT2018.nc'
ds = xr.open_dataset(nc_file)

Now we can query our newly created Xarray dataset:

In [None]:
ds

We see that the dataset has multiple variables and coordinates. We would like to plot a map of the absolute dynamic topography (the sea surface height above geoid). The variable for this is **'adt'**.

While an Xarray **dataset** may contain multiple variables, an Xarray **data array** holds a single multi-dimensional variable and its coordinates. To make the processing of the **adt** data easier, we convert it into an Xarray data array.

In [None]:
da = ds['adt']

We can now use the "plot" function of Xarray to create a simple plot of this variable.

In [None]:
da.plot()
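To look at a region rather than the whole globe, the data array can be subset by coordinate values before plotting. The sketch below uses a synthetic array in place of the downloaded data. It assumes the dimensions are named `time`, `latitude` and `longitude`, with longitudes running from 0 to 360 degrees, so please check the names and ranges shown when you query `ds`.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the 'adt' data array on a 1-degree global grid
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
time = np.array(["2019-08-15"], dtype="datetime64[ns]")
values = np.random.default_rng(0).normal(size=(time.size, lat.size, lon.size))

da_demo = xr.DataArray(
    values,
    coords={"time": time, "latitude": lat, "longitude": lon},
    dims=("time", "latitude", "longitude"),
    name="adt",
)

# Take the single time step, then select a latitude/longitude box by label
region = da_demo.isel(time=0).sel(latitude=slice(30, 46), longitude=slice(0, 36))
print(region.shape)

# region.plot() would then draw only the selected area
```

With the real data, the same `.isel()` and `.sel()` calls can be applied to `da` instead of `da_demo`.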

<hr>

<p><img src='./img/all_partners_wekeo.png' align='left' alt='Logo EU Copernicus' width='100%'></img></p>