<img src='../../media/common/LogoWekeo_Copernicus_RGB_0.png' align='left' height='96px'></img>

<hr>

# Tutorial on Basic Land Applications (Data Download)

In this tutorial, we will use the WEkEO JupyterHub to access and download data from Copernicus Sentinel-2 and the <a href='https://land.copernicus.eu/' target='_blank'>Copernicus Land Monitoring Service (CLMS)</a>.  
We have chosen a region in northern Corsica because it features representative landscape characteristics and processes that highlight the strengths and capabilities of Copernicus space components and services.

The tutorial guides you through the process of selecting and downloading a Sentinel-2 scene and CLMS CORINE Land Cover (CLC) data from their original archives on WEkEO, using the Harmonised Data Access (HDA) API.

<img src='../../media/land/Intro_banner.jpg' align='center' height='400px'></img>

### Environment Setup
Before we begin, we need to prepare our environment by installing and importing the necessary R packages.

In [1]:
# Define the list of required packages
required_packages <- c(
  
  # Data handling
  "zip", "jsonlite",
      
  # hdar data access
  "hdar"
)

In [2]:
# Check and install missing packages
install_if_missing <- function(pkg) 
{
  if (!requireNamespace(pkg, quietly = TRUE)) 
  {
    install.packages(pkg, dependencies = TRUE)
  }
}

# Apply the function to the list of required packages
invisible(sapply(required_packages, install_if_missing))

also installing the dependencies ‘lazyeval’, ‘pkgbuild’, ‘diffobj’, ‘rex’, ‘httr’, ‘yaml’, ‘ps’, ‘brio’, ‘callr’, ‘desc’, ‘evaluate’, ‘pkgload’, ‘praise’, ‘waldo’, ‘withr’, ‘covr’, ‘processx’, ‘testthat’


Updating HTML index of packages in '.Library'

Making 'packages.html' ...
 done



In [3]:
# load packages
load_required_packages <- function(pkg) 
{
  library(pkg, character.only = TRUE)  # Load the package
}
  
# Iterate over the list of required packages
invisible(sapply(required_packages, load_required_packages))


Attaching package: ‘zip’


The following objects are masked from ‘package:utils’:

    unzip, zip




### WEkEO Account Registration

If you don't have a WEkEO account, please self-register at the <a href='https://my.wekeo.eu/web/guest/user-registration' target='_blank'>WEkEO registration page</a>.

### HDA API Authentication

To interact with WEkEO's Harmonised Data Access API, make sure that a file named `.hdarc`, containing your username and password, exists in your home directory. The <a href='https://help.wekeo.eu/en/articles/7035318-how-to-use-the-hdar-package-for-accessing-the-wekeo-hda-api-in-r' target='_blank'>WEkEO help article on the hdar package</a> explains how to set it up.
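
As a quick sanity check before creating a client, the sketch below tests whether a credentials file is present. The helper name `hdarc_exists()` is hypothetical and not part of `hdar`; it only checks that the file exists and does not validate its contents.

```r
# Hypothetical helper (not part of hdar): check for the '.hdarc' credentials file.
# `home` defaults to the current user's home directory.
hdarc_exists <- function(home = path.expand("~")) {
  file.exists(file.path(home, ".hdarc"))
}

if (hdarc_exists()) {
  message("Found .hdarc in the home directory.")
} else {
  message("No .hdarc found; create it as described in the linked help article.")
}
```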

<hr>

## Process Data with the HDA Client

### Search for the Dataset ID from the WEkEO Landing Platform

<a href='https://wekeo.eu/' target='_blank'>WEkEO</a> offers access to a vast amount of data. Under <a href='https://wekeo.eu/data' target='_blank'>WEkEO DATA</a>, clicking the "+" to add a layer opens a catalog search.  
Here, you can use free text or the filter options on the left to refine your search by satellite platform, sensor, Copernicus service, area (region of interest), general time period (past or future), and various other flags.

<img src='../../media/land/WEkEO_data_01.jpg' align='middle' height='400px'></img>

You can click on the datasets you are interested in to view detailed information, including the dataset's temporal and spatial extent, collection ID, and metadata.

When searching for Sentinel-2 products, use the "Platform" filter on the left-hand side of the catalog panel.  
Two datasets are available; we will use "SENTINEL-2 Level-1C". Once you have found it, select "Details" to read the dataset description. 

The dataset description provides the following information:
* Abstract: A general description of the dataset.
* Classification: Including the Dataset ID.
* Resources: Links to the Product Data Format Specification guide, and JSON metadata.
* Contacts: Information about the data source from its provider.
* Raw Metadata: Details of the dataset in XML format.

<img src='../../media/land/WEkEO_data_02.jpg' align='center' height='400px'></img>

You will need this information to request data from the Harmonised Data Access API.

This process is explained in a previous training session, which is available on the <a href='https://www.youtube.com/channel/UCvS3VvKmMKs1M2ZkmQPyRlw' target='_blank'>WEkEO YouTube Channel</a>. The channel also hosts other training and support materials, such as how to <a href='https://www.youtube.com/watch?v=pmCkvXcnZxY&list=PLAT-b7DuvMgogqJa5_ii5GteOYmXCce24&index=2' target='_blank'>clone the GitHub repository to refresh the training materials</a>.

For this session, the details of the required datasets have already been prepared as JSON files, which will be used below.

In [None]:
dataset_id_S2 <- "EO:ESA:DAT:SENTINEL-2:MSI"
dataset_id_corine <- "EO:EEA:DAT:CORINE"

filename_json_S2 <- file.path(getwd(), "../../data/raw/land/S2_request.json")
filename_json_corine <- file.path(getwd(), "../../data/raw/land/corine_corsica.json")
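
Since both paths are relative to the working directory, it may be worth confirming that the request files can be found before going further. The following is a minimal sketch; `missing_files()` is a hypothetical helper name, and the paths simply repeat the ones defined above.

```r
# Hypothetical helper: return the subset of paths that do not exist on disk.
missing_files <- function(paths) {
  paths[!file.exists(paths)]
}

# Same relative paths as in the cell above
request_files <- c(
  file.path(getwd(), "../../data/raw/land/S2_request.json"),
  file.path(getwd(), "../../data/raw/land/corine_corsica.json")
)

not_found <- missing_files(request_files)
if (length(not_found) > 0) {
  message("Missing request files:\n", paste(not_found, collapse = "\n"))
} else {
  message("All request files found.")
}
```

If a file is reported missing, check the working directory with `getwd()` first, because the relative `../../` prefix depends on where the notebook is run from.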

### Load Data Descriptor File and Request Data


The Harmonised Data Access API can read your data request from a JSON file. In this JSON file, you can specify the dataset you want to download.  
The file is essentially a set of key-value pairs (a dictionary) and can include the following keys:

- **datasetID**: The dataset's collection ID.
- **stringChoiceValues**: The type of dataset, e.g., 'Non Time Critical'.
- **dateRangeSelectValues**: The time period for which you want to retrieve data.
- **boundingBoxValues**: Optional, to define a subset of a global field.

You can also obtain a specific example of a JSON file for a particular query from the WEkEO DATA portal.
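
To illustrate the general shape of such a request, the sketch below builds a query as an R list and serialises it with `jsonlite`. The key names follow the list above, but every value, and the nested layout of each entry, is a placeholder assumed for illustration. The exact schema accepted by the API may differ, so compare it against a query exported from the WEkEO DATA portal.

```r
library(jsonlite)

# Illustrative query only: all values and the nested layout are assumptions.
query <- list(
  datasetID = "EO:EEA:DAT:CORINE",
  stringChoiceValues = list(
    list(name = "example_choice", value = "example_value")  # hypothetical entry
  ),
  dateRangeSelectValues = list(
    list(name = "position",
         start = "2017-08-01T00:00:00.000Z",
         end = "2017-08-03T00:00:00.000Z")
  ),
  boundingBoxValues = list(
    # Rough box around northern Corsica (west, south, east, north); placeholder values
    list(name = "bbox", bbox = c(8.5, 42.3, 9.6, 43.1))
  )
)

# Serialise the list to a JSON string; auto_unbox keeps scalars as plain values
query_json <- toJSON(query, pretty = TRUE, auto_unbox = TRUE)
cat(query_json, "\n")

# Parse it back to confirm the top-level keys survived the round trip
parsed <- fromJSON(query_json, simplifyVector = FALSE)
names(parsed)
```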

### Displaying a JSON Query from a Request Made to the Harmonised Data Access API Through the Data Portal

You can load the JSON file using `jsonlite::fromJSON()`. Alternatively, you can copy and paste the query describing your data directly into a cell, as demonstrated in the YouTube video.

For this training session, we have prepared two methods to create the query for selecting the appropriate Sentinel-2 scene and CLC data for the subsequent tasks. The first method, used here for Sentinel-2, reads the query from a pre-prepared JSON file. The second method, used here for CLC, calls the client's `generate_query_template()` function to create the query automatically.

In [18]:
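# Method 1: read the Sentinel-2 query from the prepared JSON file
tryCatch({
  data_S2 <- fromJSON(filename_json_S2, simplifyVector = TRUE)
  # Serialise the parsed query back to a JSON string for the client
  data_S2 <- toJSON(data_S2, pretty = TRUE, auto_unbox = TRUE)
}, error = function(e) {
  cat("The JSON file was not found or is not valid JSON, please check it!\n")
  cat("Details:", conditionMessage(e), "\n")
})

# Method 2: let the client generate a query template for CORINE Land Cover
c <- Client$new()
corine_query <- c$generate_query_template(dataset_id_corine)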
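
A generated template usually needs adjusting before it is submitted. Assuming the template is returned as JSON text, one way to adapt it is to parse it, change the relevant fields, and serialise it again. The sketch below uses a stand-in JSON string instead of a real template so that it runs without credentials. The field names `dataset_id`, `bbox`, and `itemsPerPage` are assumptions for illustration and may not match the actual CORINE template, and `edit_query()` is a hypothetical helper.

```r
library(jsonlite)

# Stand-in for the text returned by generate_query_template(); field names are assumed.
template_json <- '{"dataset_id": "EO:EEA:DAT:CORINE", "bbox": [-180, -90, 180, 90], "itemsPerPage": 11}'

# Hypothetical helper: parse a JSON query and overwrite selected fields.
edit_query <- function(json_text, changes) {
  query <- fromJSON(json_text, simplifyVector = TRUE)
  modifyList(query, changes)
}

# Narrow the placeholder bounding box to a rough area around northern Corsica
edited <- edit_query(template_json, list(bbox = c(8.5, 42.3, 9.6, 43.1)))

# Serialise back to JSON text
edited_json <- toJSON(edited, auto_unbox = TRUE, pretty = TRUE)
cat(edited_json, "\n")
```

Printing the real template first shows which fields are actually available for the dataset.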

### Download Requested Data

You can use the client directly to download the data, as shown in the following example.

In [21]:
download_dir_path <- file.path(getwd(), "../../data/download/land")

matches <- c$search(data_S2)
cat("Sentinel 2:\n")
matches$download(download_dir_path)

matches <- c$search(corine_query)
cat("\nCorine Land Cover:\n")
matches$download(download_dir_path)

Found 1 files

Total Size 873.4 MB



Sentinel 2:


The total size is 873.4 MB. Do you want to proceed? (Y/N): 



ERROR: Error in if (answer %in% c("y", "n")) {: argument is of length zero
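
Because downloads of this size take a while, it can help to confirm that the target directory exists beforehand and to list what arrived afterwards. The following is a minimal sketch using base R only; `ensure_dir()` and `summarise_files()` are hypothetical helpers, not part of `hdar`, and the demo uses a temporary directory that you would replace with `download_dir_path`.

```r
# Hypothetical helper: create a directory (including parents) if it is missing.
ensure_dir <- function(path) {
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  invisible(path)
}

# Hypothetical helper: list files in a directory with their size in MB.
summarise_files <- function(path) {
  files <- list.files(path, full.names = TRUE)
  data.frame(
    file = basename(files),
    size_mb = round(file.size(files) / 1024^2, 1),
    stringsAsFactors = FALSE
  )
}

# Demo on a temporary directory; substitute download_dir_path in the notebook
demo_download_dir <- ensure_dir(file.path(tempdir(), "land_download_demo"))
print(summarise_files(demo_download_dir))
```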


### Decompressing Sentinel-2 and Corine Land Cover Data

In [24]:
processing_dir_path <- file.path(getwd(), "../../data/processing/land")

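# Create the output directory if it does not exist yet
dir.create(processing_dir_path, recursive = TRUE, showWarnings = FALSE)

# Only process .zip archives (the dot is escaped so that it matches literally)
zip_files <- list.files(download_dir_path, pattern = "\\.zip$", full.names = TRUE)
for (file_name in zip_files) {
  cat("Decompressing", basename(file_name), "... ")
  unzip(file_name, exdir = processing_dir_path)
  cat("DONE\n")
}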

Decompressing S2A_MSIL2A_20170802T101031_N0500_R022_T32TNN_20231002T122411.zip ... DONE
Decompressing u2000_cha9000_v2020_20u1_fgdb.zip ... DONE
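
After extraction, a quick look at the output directory confirms that the archives unpacked as expected. The sketch below assumes the Sentinel-2 archive unpacks to a folder ending in `.SAFE`, which is the usual product layout. `find_safe_dirs()` is a hypothetical helper, and the demo creates an empty placeholder folder in a temporary directory rather than reading real data.

```r
# Hypothetical helper: list Sentinel-2 product folders (*.SAFE) directly under a directory.
find_safe_dirs <- function(path) {
  list.files(path, pattern = "\\.SAFE$", full.names = TRUE)
}

# Demo with a placeholder folder; substitute processing_dir_path in the notebook
demo_dir <- file.path(tempdir(), "land_processing_demo")
dir.create(file.path(demo_dir, "EXAMPLE_PRODUCT.SAFE"),
           recursive = TRUE, showWarnings = FALSE)

basename(find_safe_dirs(demo_dir))
```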


<hr>

## Cleanup

To ensure a clean workspace and remove all downloaded files and processing artifacts created during this session, run the following code. This will delete any files that were downloaded and processed within this notebook.

In [25]:
paths_to_cleanup <- list(
  download_dir_path,
  processing_dir_path
)

for (path in paths_to_cleanup) {
  if (file.exists(path)) {
    if (file.info(path)$isdir) {
      unlink(path, recursive = TRUE)
    } else {
      file.remove(path)
    }
  }
}

cat("Cleanup complete. All downloaded and processed files have been removed.\n")

Cleanup complete. All downloaded and processed files have been removed.


<hr>

## Data Reference

CORINE Land Cover Change 2006-2012 (raster 100 m), Europe, 6-yearly. European Union's Copernicus Land Monitoring Service information, https://www.wekeo.eu/. https://doi.org/10.2909/32883574-90dd-4021-843f-f9ea6b22bfce (Accessed on 28.01.2025)

<p><img src='../../media/land/all_partners_wekeo_2.png' align='left' alt='Logo EU Copernicus' height='400px'></img></p>