<div style='text-align:center;'>
<figure><img src='https://raw.githubusercontent.com/wekeo/wekeo4data/main/img/LogoWekeo_Copernicus_RGB_0.png' alt='Logo EU Copernicus WEkEO' align='right' width='20%'>
</figure>
</div>

<h1><center><code>How To Download WEkEO Data</code></center></h1>

Follow the steps below to download data from WEkEO via the __HDA API__.  
For further details, see the following resources:
- [How to use the HDA API in Python?](https://help.wekeo.eu/en/articles/6751608-how-to-use-the-hda-api-in-python)
- [How to download WEkEO data?](https://help.wekeo.eu/en/articles/6416936-how-to-download-wekeo-data)
- [Official documentation of HDA API](https://hda.readthedocs.io/en/latest/usage.html)

## Step 1. Install the latest version of `hda`

You can run the next cell to install the latest version of `hda`:

In [None]:
!pip install hda -U

*__Note__: the version used in this notebook is `2.17`.*
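
To confirm which version is actually installed in the current environment, the standard library can be queried. This is a minimal sketch; `installed_version` is a helper name introduced here for illustration and is not part of `hda`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if hda is installed, otherwise None
print(installed_version("hda"))
```

If the result is `None`, the install cell above has not been run in this environment.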

## Step 2. Import `hda` module

The `hda` package provides a Python 3 client for searching and downloading products through the WEkEO Harmonized Data Access (HDA) API. First, import the `Client` and `Configuration` classes:

In [1]:
from hda import Client, Configuration

## Step 3. Configure credentials and load `hda` Client

### Method 1 (occasional users)

Pass your credentials directly in the script:

In [3]:
# Configure your credentials without a .hdarc file
# (replace the empty strings with your WEkEO username and password)
conf = Configuration(user="", password="")
hda_client = Client(config=conf)
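
Typing a password into a notebook cell makes it easy to share by accident. One possible alternative, sketched below, is to read the credentials from environment variables. The variable names `WEKEO_USERNAME` and `WEKEO_PASSWORD` and the helper `credentials_from_env` are hypothetical names chosen for this example; they are not defined by `hda`.

```python
import os


def credentials_from_env(user_var="WEKEO_USERNAME", password_var="WEKEO_PASSWORD"):
    """Read credentials from environment variables (names are hypothetical)."""
    user = os.environ.get(user_var)
    password = os.environ.get(password_var)
    if not user or not password:
        raise RuntimeError(f"Set {user_var} and {password_var} before running.")
    return user, password


# Usage, assuming both variables are set and `hda` is imported as above:
# user, password = credentials_from_env()
# conf = Configuration(user=user, password=password)
# hda_client = Client(config=conf)
```

The helper fails early with a clear message when a variable is missing, rather than sending empty credentials to the server.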

### Method 2 (regular users)

If you have not yet created the `.hdarc` file that enables **automatic login**, run this cell (otherwise skip it):

In [2]:
from pathlib import Path

# Location of the HDA credentials file in the user's home directory
hdarc = Path.home() / '.hdarc'
if not hdarc.is_file():
    import getpass
    USERNAME = input('Enter your username: ')
    PASSWORD = getpass.getpass('Enter your password: ')

    # Store the credentials so that later sessions can log in automatically
    with open(hdarc, 'w') as f:
        f.write(f'user: {USERNAME}\n')
        f.write(f'password: {PASSWORD}\n')
else:
    print('Configuration file already exists.')

hda_client = Client()

Enter your username:  nbilliet
Enter your password:  ········
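
Because `.hdarc` stores the password in plain text, it may be worth restricting the file to its owner. The sketch below writes the same two-line layout used above and then tightens the permissions. `write_hdarc` is a helper name introduced for this example, and the demonstration runs in a temporary directory so the real `~/.hdarc` is left untouched.

```python
import os
import tempfile
from pathlib import Path


def write_hdarc(path, user, password):
    """Write credentials in the `.hdarc` layout and make the file owner-only."""
    path = Path(path)
    path.write_text(f"user: {user}\npassword: {password}\n")
    # Owner read/write only; on Windows chmod has limited effect
    os.chmod(path, 0o600)
    return path


# Demonstration with placeholder credentials in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo = write_hdarc(Path(tmp) / ".hdarc", "demo_user", "demo_password")
    print(demo.read_text())
```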


## Step 4. Create the request and download data

### Get the dataset metadata

Here we are going to download the following Copernicus Land dataset: __EO:EEA:DAT:CLMS_HRVPP_VPP__.

To create our request, we can ask the API which parameters are needed.
To do so, we use the `metadata()` function:

In [3]:
help(hda_client.metadata)

Help on method metadata in module hda.api:

metadata(dataset_id) method of hda.api.Client instance
    Returns the metadata object for the given dataset.

    :param dataset_id: The dataset ID
    :type dataset_id: str



In [4]:
# Request metadata of a dataset
hda_client.metadata(dataset_id="EO:EEA:DAT:CLMS_HRVPP_VPP")

{'type': 'object',
 'title': 'Queryable',
 'properties': {'dataset_id': {'title': 'dataset_id',
   'type': 'string',
   'oneOf': [{'const': 'EO:EEA:DAT:CLMS_HRVPP_VPP',
     'title': 'EO:EEA:DAT:CLMS_HRVPP_VPP',
     'group': None}]},
  'itemsPerPage': {'title': 'Items PerPage',
   'type': 'string',
   'pattern': '^[0-9]*$'},
  'startIndex': {'title': 'Start Index',
   'type': 'string',
   'pattern': '[1-9][0-9]*'},
  'httpAccept': {'title': 'Http Accept',
   'type': 'string',
   'oneOf': [{'const': 'application%2Fatom%2Bxml',
     'title': 'Atom',
     'group': None},
    {'const': 'application%2Fgeo%2Bjson', 'title': 'GeoJson', 'group': None}]},
  'recordSchema': {'title': 'Record Schema',
   'type': 'string',
   'oneOf': [{'const': 'OM', 'title': 'OM', 'group': None},
    {'const': 'OM11', 'title': 'OM11', 'group': None},
    {'const': 'ISO', 'title': 'ISO', 'group': None},
    {'const': 'DC', 'title': 'DC', 'group': None},
    {'const': 'geojson', 'title': 'GeoJson', 'group': None}
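
The metadata dictionary is long, so a compact summary of each parameter and its allowed values can be easier to read. The sketch below assumes the full response follows the same shape as the excerpt shown above (`properties`, with either a `oneOf` list or a `pattern`). `summarize_queryables` is a helper introduced here, and `sample` is a hand-copied excerpt rather than a live API response.

```python
def summarize_queryables(metadata):
    """Map each queryable parameter to its allowed values or pattern."""
    summary = {}
    for name, spec in metadata.get("properties", {}).items():
        if "oneOf" in spec:
            # Enumerated parameter: collect the accepted constants
            summary[name] = [option["const"] for option in spec["oneOf"]]
        else:
            # Free-form parameter: fall back to its pattern or its type
            summary[name] = spec.get("pattern", spec.get("type"))
    return summary


# Small excerpt of the structure shown above, used as stand-in input
sample = {
    "type": "object",
    "properties": {
        "itemsPerPage": {"title": "Items PerPage", "type": "string", "pattern": "^[0-9]*$"},
        "recordSchema": {
            "title": "Record Schema",
            "type": "string",
            "oneOf": [
                {"const": "OM", "title": "OM", "group": None},
                {"const": "geojson", "title": "GeoJson", "group": None},
            ],
        },
    },
}
print(summarize_queryables(sample))
```

With a live client, the same helper could be applied to the value returned by `hda_client.metadata(...)`, provided it behaves like the dictionary displayed above.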

### Create the request

Based on this information we can create the request below.

<div class="alert alert-block alert-info">
    📌 <b>Note</b>: to learn how to get your query from the Data Viewer, please check <a href="https://help.wekeo.eu/en/articles/6416936-how-to-download-wekeo-data#h_85849dcd7a">this article</a>.
</div>

In [6]:
query = {
  "dataset_id": "EO:EEA:DAT:CLMS_HRVPP_VPP",
  "productType": "TPROD",
  "productGroupId": "s1",
  "start": "2020-01-01T00:00:00.000Z",
  "end": "2021-01-01T00:00:00.000Z",
  "bbox": [
    -9.53592042,
    42.46825465,
    -7.0363102799999995,
    43.99700636
  ]
}
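
The `start` and `end` values in the query above are UTC timestamps with millisecond precision and a `Z` suffix. When dates are computed in Python, a small formatter can produce strings in that form. This is a sketch only; `to_query_timestamp` is a name introduced here, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timezone


def to_query_timestamp(moment):
    """Format an aware datetime as UTC with millisecond precision and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


start = to_query_timestamp(datetime(2020, 1, 1, tzinfo=timezone.utc))
end = to_query_timestamp(datetime(2021, 1, 1, tzinfo=timezone.utc))
print(start, end)  # 2020-01-01T00:00:00.000Z 2021-01-01T00:00:00.000Z
```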

<div class="alert alert-block alert-info">
    📌 <b>Note</b>: the geographical coordinates in the <code>bbox</code> are ordered as: <code>[longitude_min, latitude_min, longitude_max, latitude_max]</code>
</div>
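
Swapping latitude and longitude is an easy mistake to make, so a quick check before sending the query can save a round trip. The sketch below validates the ordering described in the note above. `validate_bbox` is a helper introduced for illustration, and it assumes the box does not cross the antimeridian.

```python
def validate_bbox(bbox):
    """Check [lon_min, lat_min, lon_max, lat_max] ordering and value ranges."""
    if len(bbox) != 4:
        raise ValueError("bbox must contain exactly four numbers")
    lon_min, lat_min, lon_max, lat_max = bbox
    if not (-180 <= lon_min < lon_max <= 180):
        raise ValueError("longitudes must satisfy -180 <= lon_min < lon_max <= 180")
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError("latitudes must satisfy -90 <= lat_min < lat_max <= 90")
    return bbox


# The bounding box from the query above passes the check
validate_bbox([-9.53592042, 42.46825465, -7.0363102799999995, 43.99700636])
```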

### Search data

The `search()` function submits the query to the server and returns the matching items. It may take some time while the server processes the request.

In [7]:
matches = hda_client.search(query)
print(matches)

SearchResults[items=12,volume=1.1GB]


The search matched **12 items**, for a total **volume of 1.1 GB**.
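
Before downloading, it can help to compare the reported volume with the free space at the destination. The sketch below uses the standard library for that. `has_room` is a helper introduced here, and reading "1.1 GB" as `1.1 * 1024**3` bytes is an assumption. The check reports free space on the underlying filesystem, which may differ from a per-user quota.

```python
import shutil


def has_room(directory, required_bytes, reserve_bytes=0):
    """Return True if `directory` has space for the download plus a reserve."""
    free = shutil.disk_usage(directory).free
    return free >= required_bytes + reserve_bytes


# 1.1 GB as reported by the search, taken here as 1.1 * 1024**3 bytes
required = int(1.1 * 1024**3)
print(has_room(".", required))
```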

### Download file(s)

WEkEO's JupyterHub limits you to 20 GB of storage space, so check the total size of the files matched by your request before downloading.  

#### Download files in the current working directory

You can run `matches.download()` to download all the files of your request.  
Please [read the documentation](https://hda.readthedocs.io/en/latest/usage.html#advanced-client-usage) for advanced usage such as:
- downloading first result: `matches[0].download()`
- downloading last result: `matches[-1].download()`
- downloading first 10 results: `matches[:10].download()`
- downloading every other result (even indices): `matches[::2].download()`
- etc.
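
The selections listed above use standard Python indexing and slicing. The sketch below shows what each expression picks, using a plain list as a stand-in for the 12 search results; real results are objects returned by `search()`, not strings.

```python
# Stand-in for the 12 search results; real results are not plain strings
items = [f"item_{i}" for i in range(12)]

first = items[0]           # first result
last = items[-1]           # last result
first_ten = items[:10]     # results 0 to 9
every_other = items[::2]   # results 0, 2, 4, ... (even indices)

print(first, last, len(first_ten), every_other)
```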

For the purpose of this example, we are going to fetch the last result:

In [None]:
OUTPUT_PATH = '/tmp'
matches[-1].download(OUTPUT_PATH)

The `download()` function downloads the file(s) matched by your request. Files are saved in the same folder as this notebook unless you pass an existing directory, such as `OUTPUT_PATH` above.
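
Since the target directory has to exist, it can be created up front. The sketch below does this with `pathlib`. `ensure_directory` and the folder name `wekeo_downloads` are made up for this example, and the demonstration uses a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path


def ensure_directory(path):
    """Create `path` (and any missing parents) if needed and return it."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Demonstration under a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    output_path = ensure_directory(Path(tmp) / "wekeo_downloads")
    print(output_path.is_dir())
    # With a real search result: matches[-1].download(str(output_path))
```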

## Additional Information
---

#### Compatible Data Science Toolkits

In [None]:
from importlib.metadata import version; version("hda")

#### Last Modified and Tested

In [None]:
from datetime import date; print(date.today())

<img src='https://github.com/wekeo/ai4EM_MOOC/raw/04147f290cfdcce341f819eab7ad037b95f25600/img/ai4eo_logos.jpg' alt='Logo EU Copernicus WEkEO' align='center' width='100%'></img>