<img src='../../media/common/LogoWekeo_Copernicus_RGB_0.png' align='left' height='50px'></img>

<hr>

# Tutorial on Basic Land Applications (Data Download)

In this tutorial, we will use the WEkEO JupyterHub to access and download data from the Copernicus Sentinel-2 mission and the <a href='https://land.copernicus.eu/' target='_blank'>Copernicus Land Monitoring Service (CLMS)</a>.  
We have chosen a region in northern Corsica because it features representative landscape characteristics and processes that highlight the strengths and capabilities of Copernicus space components and services.

The tutorial guides you through the process of selecting and downloading a Sentinel-2 scene and <a href='https://sdi.eea.europa.eu/catalogue/srv/eng/catalog.search#/metadata/a5144888-ee2a-4e5d-a7b0-2bbf21656348' target='_blank'>CLMS CORINE Land Cover (CLC)</a> data from WEkEO, using the Harmonised Data Access (HDA) API.

<img src='../../media/land/Intro_banner.jpg' align='center' height='400px'></img>

### Environment Setup
Before we begin, we need to prepare our environment by installing and importing the necessary Python libraries:

In [1]:
#Uncomment and run if necessary
#!pip install hda
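
If you are unsure whether the `hda` package is already available in your kernel, a quick check avoids an unnecessary reinstall. The following is a minimal sketch using only the standard library; the helper name `is_installed` is our own and not part of the HDA client.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# Report whether the HDA client still needs to be installed
if is_installed("hda"):
    print("hda is available.")
else:
    print("hda is missing: uncomment and run the pip install line above.")
```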

In [6]:
#Load required libraries
import os
import sys
import json
import time
import base64
import shutil
import warnings
import zipfile
from pathlib import Path
#Suppress warnings to keep the notebook output readable
warnings.filterwarnings('ignore')
#Import HDA API client
from hda import Client

### WEkEO Account Registration

If you don't have a WEkEO account, please self-register at the <a href='https://wekeo.copernicus.eu/register' target='_blank'>WEkEO registration page</a>.

### HDA API Authentication

To interact with WEkEO's Harmonised Data Access API, make sure that a `.hdarc` file containing your username and password exists in your home directory. A step-by-step guide on how to set this up is available <a href='https://help.wekeo.eu/en/articles/6751608-how-to-use-the-hda-api-in-python' target='_blank'>here</a>. 
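
Before creating the client, you may want to confirm that the credentials file is in place. The sketch below only checks that the file exists and contains `user` and `password` entries. The one-pair-per-line `key: value` layout is an assumption and should be checked against the linked help article, and `check_hdarc` is a hypothetical helper, not part of the `hda` package.

```python
from pathlib import Path


def check_hdarc(path: Path) -> bool:
    """Return True if the file exists and defines non-empty 'user' and 'password' entries.

    Assumes one 'key: value' pair per line (an assumption, see the help article).
    """
    if not path.is_file():
        return False
    entries = {}
    for line in path.read_text().splitlines():
        # Split at the first colon only, so passwords containing ':' stay intact
        key, sep, value = line.partition(":")
        if sep:
            entries[key.strip()] = value.strip()
    return bool(entries.get("user")) and bool(entries.get("password"))


hdarc_path = Path.home() / ".hdarc"
if check_hdarc(hdarc_path):
    print(f"Found credentials file: {hdarc_path}")
else:
    print(f"No usable credentials found in {hdarc_path}; see the help article.")
```

The check never prints the stored values, so it is safe to run in a shared notebook.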

<hr>

## Process Data with HDA Client

### Search for the Dataset ID from the WEkEO Landing Platform

<a href='https://wekeo.eu/' target='_blank'>WEkEO</a> offers access to a vast amount of data. Under <a href='https://wekeo.eu/data' target='_blank'>WEkEO DATA</a>, clicking the "+" to add a layer opens a catalog search.  
Here, you can use free text or the filter options on the left to refine your search by satellite platform, sensor, Copernicus service, area (region of interest), general time period (past or future), and various other flags.

<img src='../../media/land/WEkEO_data_01.jpg' align='middle' height='400px'></img>

You can click on the datasets you are interested in to view detailed information, including the dataset's temporal and spatial extent, collection ID, and metadata.

When searching for Sentinel-2 products, click under "Platform" in the Filters on the left-hand side of the catalog panel.  
Two datasets are available, but we will use “SENTINEL-2 Level-1C”. Once you have found it, select 'Details' to read the dataset description. 

The dataset description provides the following information:
* Abstract: A general description of the dataset.
* Classification: Including the Dataset ID.
* Resources: Links to the Product Data Format Specification guide, and JSON metadata.
* Contacts: Information about the data source from its provider.
* Raw Metadata: Details of the dataset in XML format.

<img src='../../media/land/WEkEO_data_02.jpg' align='centre' height='400px'></img>

You will need this information to request data from the Harmonised Data Access API.

This process is explained in a previous training session, which can be found on the <a href='https://www.youtube.com/channel/UCvS3VvKmMKs1M2ZkmQPyRlw' target='_blank'>WEkEO YouTube Channel</a>. The channel also contains many other useful training and support materials,
such as how to <a href='https://www.youtube.com/watch?v=pmCkvXcnZxY&list=PLAT-b7DuvMgogqJa5_ii5GteOYmXCce24&index=2' target='_blank'>clone the GitHub repository to refresh the training materials</a>.

For this session, the details of the required datasets have already been prepared as JSON files, which will be used below.

In [7]:
#Dataset IDs for reference; the IDs used in the requests are read from the JSON files below
dataset_id_S2 = "EO:EO:ESA:DAT:SENTINEL-2:MSI"
dataset_id_corine = "EO:CLMS:DAT:CORINE"

filename_json_S2 = os.path.join(os.getcwd(), '../../data/raw/land/S2_request.json')
filename_json_corine = os.path.join(os.getcwd(), '../../data/raw/land/corine_corsica.json')

### Load Data Descriptor File and Request Data


The Harmonised Data Access API can read your data request from a JSON file. In this JSON file, you can specify the dataset you want to download.  
The file is essentially a dictionary. The request files used in this tutorial contain the following keys:

- **dataset_id**: The dataset's collection ID.
- **bbox**: Optional bounding box (west, south, east, north) to restrict the request to a region of interest.
- **startDate** and **completionDate**: The time period for which you want to retrieve data.
- **processingLevel**, **product_type**, **format**: Dataset-specific options, such as the processing level or the file format.

You can also obtain a specific example of a JSON file for a particular query from the WEkEO DATA portal.

### Displaying a JSON Query from a Request Made to the Harmonised Data Access API Through the Data Portal

You can load the JSON file using `json.load()`. Alternatively, you can copy and paste the dictionary describing your data directly into a cell, as demonstrated in the YouTube video.
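
As a sketch of the copy-and-paste route, the request can be written as a Python dictionary and saved to disk so that it can be reloaded later. The values mirror the Sentinel-2 request used in this session; the output file name `my_request.json` and the use of the system's temporary folder are only illustrations.

```python
import json
import os
import tempfile

# Request written directly as a dictionary (same keys as the prepared Sentinel-2 file)
query = {
    "dataset_id": "EO:ESA:DAT:SENTINEL-2",
    "bbox": [9.425764239078317, 42.74275713340862,
             9.735642520957134, 43.05969192516483],
    "startDate": "2017-08-02T00:00:00.000Z",
    "completionDate": "2017-08-02T23:00:00.000Z",
    "processingLevel": "S2MSI2A",
}

# Save it to a JSON file (a temporary folder is used here to avoid clutter)
request_path = os.path.join(tempfile.gettempdir(), "my_request.json")
with open(request_path, "w") as f:
    json.dump(query, f, indent=4)

# Reload it the same way the prepared files are read
with open(request_path, "r") as f:
    reloaded = json.load(f)

# Confirm that nothing was lost in the round trip
print(reloaded == query)
```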

For this training session, two JSON files have already been prepared to select the appropriate Sentinel-2 scene and CLC data for the subsequent tasks. The paths to these files were defined earlier in the notebook. 

The following cell reads these JSON files and displays their contents.  

In [8]:
try:
    with open(filename_json_S2, 'r') as f:
        data_S2 = json.load(f)
    print('Your JSON file:')
    print(json.dumps(data_S2, indent=4))
except (OSError, json.JSONDecodeError):
    #The file is missing or unreadable, or its content is not valid JSON
    print('Your JSON file is not in the correct format, or is not found, please check it!')

Your JSON file:
{
    "dataset_id": "EO:ESA:DAT:SENTINEL-2",
    "bbox": [
        9.425764239078317,
        42.74275713340862,
        9.735642520957134,
        43.05969192516483
    ],
    "startDate": "2017-08-02T00:00:00.000Z",
    "completionDate": "2017-08-02T23:00:00.000Z",
    "processingLevel": "S2MSI2A"
}
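
Before submitting a request, a few sanity checks can catch common mistakes such as a swapped bounding box or an inverted time range. The sketch below assumes the bounding box is ordered west, south, east, north, which matches the values printed above, and that dates use the timestamp format shown. `validate_request` is a hypothetical helper, not part of the HDA client.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # format of the dates in the request above


def validate_request(query: dict) -> list:
    """Return a list of problems found in the request (empty if none)."""
    problems = []
    if "dataset_id" not in query:
        problems.append("missing dataset_id")
    bbox = query.get("bbox")
    if bbox is not None:
        if len(bbox) != 4:
            problems.append("bbox must have four values")
        else:
            west, south, east, north = bbox
            if not (west < east and south < north):
                problems.append("bbox must be ordered west, south, east, north")
    if "startDate" in query and "completionDate" in query:
        start = datetime.strptime(query["startDate"], TIME_FORMAT)
        end = datetime.strptime(query["completionDate"], TIME_FORMAT)
        if start > end:
            problems.append("startDate is after completionDate")
    return problems


example = {
    "dataset_id": "EO:ESA:DAT:SENTINEL-2",
    "bbox": [9.425764239078317, 42.74275713340862,
             9.735642520957134, 43.05969192516483],
    "startDate": "2017-08-02T00:00:00.000Z",
    "completionDate": "2017-08-02T23:00:00.000Z",
    "processingLevel": "S2MSI2A",
}
print(validate_request(example))
```

These checks are purely local; whether the service accepts a request still depends on the dataset and its available options.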


In [9]:
try:
    with open(filename_json_corine, 'r') as f:
        data_corine = json.load(f)
    print('Your JSON file:')
    print(json.dumps(data_corine, indent=4))
except (OSError, json.JSONDecodeError):
    #The file is missing or unreadable, or its content is not valid JSON
    print('Your JSON file is not in the correct format, or is not found, please check it!')

Your JSON file:
{
    "dataset_id": "EO:EEA:DAT:CORINE",
    "product_type": "Corine Land Change 2006 2012",
    "format": "GeoTiff100mt"
}


### Download Requested Data

You can use the client directly to download the data, as shown in the following example.
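
The search-then-download pattern used in the next cell can also be wrapped in a small helper that creates the target folder first. This is a sketch only: `fetch` is a hypothetical function name, and it relies solely on the `search()` and `download()` calls used in this tutorial.

```python
import os


def fetch(client, query: dict, target_dir: str):
    """Search for a request and download all matches into target_dir.

    'client' is expected to behave like hda.Client: search(query) returns
    an object that offers a download(directory) method.
    """
    os.makedirs(target_dir, exist_ok=True)  # make sure the folder exists
    matches = client.search(query)
    print(matches)                          # summary of the search results
    matches.download(target_dir)
    return matches
```

Once the client and the download path have been defined in the next cell, a call would look like `fetch(hda_client, data_S2, download_dir_path)`.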

In [10]:
hda_client = Client()

download_dir_path = os.path.join(os.getcwd(), '../../data/download/land')
#Create the download folder if it does not exist yet
os.makedirs(download_dir_path, exist_ok=True)

matches = hda_client.search(data_S2)
print("Sentinel 2:")
print(matches)
matches.download(download_dir_path)

matches = hda_client.search(data_corine)
print("\nCorine Land Cover:")
print(matches)
matches.download(download_dir_path)

Sentinel 2:
SearchResults[items=1,volume=832.9MB]

Corine Land Cover:
SearchResults[items=1,volume=25.1MB]

### Decompressing Sentinel-2 and Corine Land Cover Data

In [11]:
processing_dir_path = os.path.join(os.getcwd(), '../../data/processing/land')

extension = ".zip"
for item in os.listdir(download_dir_path):
    #Only ZIP archives are extracted; any other file is skipped silently
    if item.endswith(extension):
        print("Decompressing " + item + " ... ", end = '')
        file_name = os.path.join(download_dir_path, item)
        with zipfile.ZipFile(file_name, 'r') as zip_ref:
            zip_ref.extractall(processing_dir_path)
        print("DONE")

Decompressing u2012_cha0612_v2020_20u1_raster100m.zip ... DONE
Decompressing S2A_MSIL2A_20170802T101031_N0500_R022_T32TNN_20231002T122411.zip ... DONE
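
The same step can be expressed with `pathlib`, returning the names of the archives that were extracted so that later cells can check what was unpacked. This is a minimal sketch: `extract_archives` is a hypothetical helper name, and files that are not valid ZIP archives are simply skipped.

```python
import zipfile
from pathlib import Path


def extract_archives(source_dir, target_dir) -> list:
    """Extract every valid .zip file in source_dir into target_dir.

    Returns the names of the archives that were extracted, sorted alphabetically.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the output folder if needed
    extracted = []
    for archive in sorted(Path(source_dir).glob("*.zip")):
        if not zipfile.is_zipfile(archive):
            continue  # skip incomplete or corrupted downloads
        with zipfile.ZipFile(archive, "r") as zip_ref:
            zip_ref.extractall(target)
        extracted.append(archive.name)
    return extracted
```

With the paths defined in this notebook, the call would be `extract_archives(download_dir_path, processing_dir_path)`.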


<hr>

## Cleanup

To ensure a clean workspace and remove all downloaded files and processing artifacts created during this session, run the following code. This will delete any files that were downloaded and processed within this notebook.

In [12]:
paths_to_cleanup = [
    download_dir_path,
    processing_dir_path
]

for path in paths_to_cleanup:
    if os.path.isfile(path):
        os.remove(path)
    elif os.path.isdir(path):
        shutil.rmtree(path)

print("Cleanup complete. All downloaded and processed files have been removed.")

Cleanup complete. All downloaded and processed files have been removed.


<hr>

## Data Reference

CORINE Land Cover Change 2006-2012 (raster 100 m), Europe, 6-yearly. European Union's Copernicus Land Monitoring Service information, https://www.wekeo.eu/. https://doi.org/10.2909/32883574-90dd-4021-843f-f9ea6b22bfce (Accessed on 28.01.2025)

<p><img src='../../media/land/all_partners_wekeo_2.png' align='left' alt='Logo EU Copernicus' height='400px'></img></p>