### How to get started 👩‍🏫
In the following, you can find a beginner-friendly guide on how to retrieve ARLIE data.
The ARLIE data is stored in a persistent PostGIS database. Users can request the data through a REST API using two URLs: one returns the lake and river geometries within a specified area of interest (AOI), the other returns the ARLIE data for that AOI. Note that the geometry in a data request must be given in Well-Known Text (WKT) format.<br>
<br>
This notebook contains the following steps:<br>
1) Initial setup
2) Pick your AOI
3) Import your geometry and display it
4) Transform to WKT format
5) Request ARLIE data
6) Display ARLIE data
7) Display EU Hydro data


#### 1) Initial setup
1. First of all, pick your preferred platform for running Jupyter Notebooks 👩‍💻 <br>
*(For instance, you can opt for the [WEkEO platform](https://help.wekeo.eu/en/articles/6337538-what-is-the-wekeo-jupyterhub))*<br>
2. Next, copy this GitHub repository and upload all the included notebooks to your chosen platform<br>
3. Make sure that [clms_hrsi_arlie_downloader.py](clms_hrsi_arlie_downloader.py) is located next to this notebook<br>
*You can see an example of the folder structure below*<br>
<img src="images/folders.png"/><br>
4. In addition to some standard Python libraries, you will need [Folium](https://pypi.org/project/folium/), [Geopandas](https://pypi.org/project/geopandas/) & [Shapely](https://pypi.org/project/shapely/).<br>
If any libraries are missing while you work through this tutorial, simply install them by running `%pip install {package name}`.
See the example below:

In [1]:
%pip install folium
%pip install Geopandas
%pip install Shapely
# Note: press shift+enter to execute a cell

Collecting folium
  Downloading folium-0.15.1-py2.py3-none-any.whl (97 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.0/97.0 kB 763.6 kB/s eta 0:00:00
Installing collected packages: folium
Successfully installed folium-0.15.1
Note: you may need to restart the kernel to use updated packages.
Collecting Geopandas
  Using cached geopandas-0.14.1-py3-none-any.whl (1.1 MB)
Collecting fiona>=1.8.21 (from Geopandas)
  Using cached fiona-1.9.5-cp310-cp310-manylinux2014_x86_64.whl (15.7 MB)
Collecting pandas>=1.4.0 (from Geopandas)
  Downloading pandas-2.1.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.3/12.3 MB 27.8 MB/s eta 0:00:00
Collecting pyproj>=3.3.0 (from Geopandas)
  Using cached pyproj-3.6.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
Collecting click~=8.0 (from f
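
Before installing anything, it can be handy to check which of the three libraries are already available. The cell below is a minimal sketch using only the Python standard library; the helper name `missing_packages` is made up for this illustration and is not part of the ARLIE downloader.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the import names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Import names of the three libraries used in this tutorial
required = ["folium", "geopandas", "shapely"]

missing = missing_packages(required)
if missing:
    print("Install with: %pip install " + " ".join(missing))
else:
    print("All required libraries are available.")
```

Only the names reported as missing need to be installed with `%pip install`.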

#### 2) Pick your AOI
To access ARLIE data for your specific AOI, you have two options: you can either upload a predefined geometry (e.g., a shapefile or GeoJSON) or draw a custom geometry on a map following the steps below. If you have a predefined geometry, jump directly to step 4). If you have a file in WKT format, jump to step 5).

The geometry is used to request all lake and river polygons from the EU Hydro database within your AOI, which in turn are used to retrieve ARLIE data from the PostGIS database.

In [1]:
# Import some general libraries
import os
from pathlib import Path

Draw a custom geometry using Folium

In [2]:
# Import relevant libraries
import folium
from folium.plugins import Draw

# Initialise an interactive map
m = folium.Map(location=[50, 20], zoom_start=4)

#  Add a function to draw and export your polygon
draw = Draw(export=True, filename='aoi.geojson', position='topleft')
draw.add_to(m)

m

When you execute the cell above, an interactive map appears. This map can be used to draw your AOI. Use the toolbar on the left to zoom, draw and edit a geometry. After choosing your area, export the geometry using the small 'Export' button on the right. A window will appear and ask you to save the file in GeoJSON format. Save it as `aoi.geojson` in the same folder as this notebook.
<br>
<br>


<img src="images/folium1.png" alt="Drawing" style="width: 400px;"/>

🤫 Note: the geometry should have an appropriate size and complexity, so try a smaller area if you experience issues later in the workflow.
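
To get a feeling for the size and complexity of an AOI, you can count its vertices and look at its bounding box. The cell below is a rough sketch using only the standard library; the helper `summarise_aoi` and the example rectangle are made up for illustration, and no particular limit for the ARLIE service is implied.

```python
import json


def summarise_aoi(feature_collection):
    """Return the vertex count and bounding box of all (Multi)Polygon features."""
    xs, ys = [], []
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Polygon":
            polygons = [geometry["coordinates"]]
        elif geometry["type"] == "MultiPolygon":
            polygons = geometry["coordinates"]
        else:
            continue  # points and lines are ignored in this sketch
        for rings in polygons:
            for ring in rings:
                for x, y, *_ in ring:
                    xs.append(x)
                    ys.append(y)
    if not xs:
        raise ValueError("No polygon coordinates found")
    bbox = (min(xs), min(ys), max(xs), max(ys))
    return {
        "vertices": len(xs),  # includes the repeated closing vertex of each ring
        "bbox": bbox,         # (min lon, min lat, max lon, max lat)
        "width_deg": bbox[2] - bbox[0],
        "height_deg": bbox[3] - bbox[1],
    }


# Example with a small rectangle; for your own AOI use
# json.load(open("aoi.geojson")) instead of json.loads(example)
example = """
{"type": "FeatureCollection", "features": [{"type": "Feature", "properties": {},
 "geometry": {"type": "Polygon", "coordinates": [[[7.106781, 60.426055],
 [7.106781, 60.657032], [7.68631, 60.657032], [7.68631, 60.426055],
 [7.106781, 60.426055]]]}}]}
"""
summary = summarise_aoi(json.loads(example))
print(summary)
```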

#### 3) Import your geometry and display it

In [20]:
# Import relevant libraries
import folium
import geopandas as gpd

# Set the path to your AOI
aoi = "aoi.geojson"

# Read the AOI as a geodataframe
gdf = gpd.read_file(aoi)

# Retrieve the center points of the AOI to display it later
aoiX = gdf["geometry"].centroid.x
aoiY = gdf["geometry"].centroid.y

# Add your AOI to an interactive map and display it
m = folium.Map(
    location=[aoiY[0],aoiX[0]],
    tiles='OpenStreetMap',
    zoom_start=7,
    min_zoom=1,
    max_zoom=15,
    control_scale=True)

folium.GeoJson(aoi, name="AOI", style_function=lambda x: {'fillOpacity': 0.1,'color': 'Crimson'}).add_to(m)

m




#### 4) Transform to WKT format
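Transform your AOI from GeoJSON to WKT format using Shapely. Note that the cell below reads GeoJSON only; if you uploaded a shapefile, first convert it to GeoJSON (for example by reading it with GeoPandas `read_file` and saving it with `to_file(..., driver="GeoJSON")`).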

In [21]:
# Import relevant libraries
import json
from shapely.geometry import shape

# Set the path to your AOI
aoi = "aoi.geojson"

# Use Shapely to transform each feature to WKT and save one geometry per line
# (after the loop, `wkt` holds the geometry of the last feature)
with open(Path(aoi)) as fin, open(Path(aoi).with_suffix(".wkt"), "w") as fout:
    features = json.load(fin)["features"]
    for feature in features:
        geo = shape(feature["geometry"])
        wkt = geo.wkt
        fout.write(wkt + "\n")

In [22]:
# WKT (Well-Known Text) is a standard text representation of a geometry, as you can see below
wkt

'POLYGON ((7.106781 60.426055, 7.106781 60.657032, 7.68631 60.657032, 7.68631 60.426055, 7.106781 60.426055))'
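
To see how the GeoJSON coordinates map onto the WKT string, the conversion of a simple polygon can also be written by hand. The cell below is only an illustrative sketch (the helper `polygon_to_wkt` is made up here); Shapely remains the better choice in practice because it handles all geometry types and edge cases.

```python
def polygon_to_wkt(coordinates):
    """Format GeoJSON Polygon coordinates (a list of rings) as a WKT string."""
    rings = []
    for ring in coordinates:
        # Each position is written as "x y"; positions are separated by commas
        points = ", ".join(f"{x} {y}" for x, y, *_ in ring)
        rings.append(f"({points})")
    # The rings (outer boundary first, then holes) are wrapped in POLYGON (...)
    return "POLYGON (" + ", ".join(rings) + ")"


# The same rectangle as in the output above, in GeoJSON coordinate layout
coords = [[
    [7.106781, 60.426055],
    [7.106781, 60.657032],
    [7.68631, 60.657032],
    [7.68631, 60.426055],
    [7.106781, 60.426055],
]]

manual_wkt = polygon_to_wkt(coords)
print(manual_wkt)
```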

#### 5) Request ARLIE data
The retrieval of the ARLIE data can take a while, depending on the size and complexity of your AOI. At the end, you should see a message stating where the output has been saved and how many ARLIE records have been found.

In [23]:
# Import the API to download ARLIE data
from clms_hrsi_arlie_downloader import download_arlie_products

# Set the following variables to retrieve ARLIE data:
geometryWkt = wkt               # The WKT you have generated above, or simply copy+paste a geometry directly (e.g. 'POLYGON ((9.667968999999999 52.48278, 10.722656 51.289406, 6.855469 49.496675, 10.283203 47.457809, 15.205078 48.922499, 14.238281 51.344339, 9.667968999999999 52.48278))')
outputDir = "./output"          # Your preferred output folder
startDate = '2017-01-01'        # Start date for ARLIE data
completionDate = '2023-01-01'   # End date for ARLIE data
cloudCoverageMax = 100          # Maximum cloud cover to consider
requestGeometries = True        # True returns polygons from EU Hydro, False only returns the ARLIE data
returnMode = 'csv_and_variable' # Choose between "csv", "csv_and_variable", "variable"

geom, arlie = download_arlie_products(
    returnMode,
    outputDir=outputDir,
    startDate=startDate,
    completionDate=completionDate,
    geometryWkt=geometryWkt,
    cloudCoverageMax=cloudCoverageMax,
    requestGeometries=requestGeometries,
)



Executing request for geometries: https://cryo.land.copernicus.eu/arlie/get_geometries?geometrywkt=POLYGON+((7.106781+60.426055%2C+7.106781+60.657032%2C+7.68631+60.657032%2C+7.68631+60.426055%2C+7.106781+60.426055))&getonlyids=True
Executing request for geometries: https://cryo.land.copernicus.eu/arlie/get_geometries?geometrywkt=POLYGON+((7.106781+60.426055%2C+7.106781+60.657032%2C+7.68631+60.657032%2C+7.68631+60.426055%2C+7.106781+60.426055))
Writing geometries in /home/jovyan/Arlie_repoEL/output/geometries.csv
Executing request for ARLIE: https://cryo.land.copernicus.eu/arlie/get_arlie?geometrywkt=POLYGON+((7.106781+60.426055%2C+7.106781+60.657032%2C+7.68631+60.657032%2C+7.68631+60.426055%2C+7.106781+60.426055))&cloudcoveragemax=100&startdate=2017-01-01&completiondate=2023-01-01
Done
Writing ARLIE in /home/jovyan/Arlie_repoEL/output/arlie.csv
Found 441725 ARLIE products.
Writing metadata link into /home/jovyan/Arlie_repoEL/output/ARLIE_MTD.xml
End.
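
The log above shows the two request URLs that were called. As a rough sketch, such URLs can be reproduced with the standard library as shown below. The base URL and parameter names are copied from the log; the helper `build_url` is made up for this illustration, and how `clms_hrsi_arlie_downloader.py` builds its requests internally is not shown here.

```python
from urllib.parse import urlencode

BASE_URL = "https://cryo.land.copernicus.eu/arlie"  # taken from the log output above


def build_url(endpoint, **params):
    """Build a request URL; parentheses stay unescaped to match the logged URLs."""
    return f"{BASE_URL}/{endpoint}?" + urlencode(params, safe="()")


example_wkt = (
    "POLYGON ((7.106781 60.426055, 7.106781 60.657032, 7.68631 60.657032, "
    "7.68631 60.426055, 7.106781 60.426055))"
)

# URL for the lake and river geometries within the AOI
geometries_url = build_url("get_geometries", geometrywkt=example_wkt)

# URL for the ARLIE records within the AOI, cloud cover limit and date range
arlie_url = build_url(
    "get_arlie",
    geometrywkt=example_wkt,
    cloudcoveragemax=100,
    startdate="2017-01-01",
    completiondate="2023-01-01",
)

print(geometries_url)
print(arlie_url)
```

Spaces in the WKT become `+` and commas become `%2C`, which is why the geometry looks different in the log than in the notebook variable.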


#### 6) Display ARLIE data
We have now retrieved the ARLIE data in CSV format and saved it to the folder <code style="background:yellow;color:black">output</code>.
The data should look similar to this when opened as a list:
> [[1, 46493, '2019-01-01T17:16:05', 27, 56, 0, 0, 17, 1, 'Sentinel-1 Sentinel-2'],
 [2, 46493, '2019-01-02T17:07:26', 62, 23, 0, 0, 15, 1, 'Sentinel-1 Sentinel-2'],
 [3, 46493, '2019-01-03T16:59:52', 51, 35, 0, 0, ...

Pass the statistics into a Pandas DataFrame to explore and organise the data more easily by running the cell below.

In [24]:
# Import relevant libraries
import pandas as pd

# Read the semicolon-delimited csv file into a dataframe; parse_dates converts
# the ISO timestamps (e.g. '2017-01-01T05:48:00') into pandas datetimes
arlie = pd.read_csv(os.path.join(outputDir, 'arlie.csv'), parse_dates=['datetime'], delimiter=";")

# Note: this overwrites arlie.csv as a comma-delimited file with an extra index column,
# so re-running this cell requires a fresh download or an adapted delimiter
arlie.to_csv(os.path.join(outputDir, 'arlie.csv'))
arlie

Unnamed: 0,id,river_km_id,datetime,water_perc,ice_perc,other_perc,cloud_perc,nd_perc,qc,source
0,1,378906,2017-01-01 05:48:00,67,0,0,0,33,2,Sentinel-1
1,2,378888,2017-01-01 05:48:00,85,0,0,0,15,2,Sentinel-1
2,3,378927,2017-01-01 05:48:00,91,0,0,0,9,2,Sentinel-1
3,4,378991,2017-01-01 05:48:00,71,8,0,0,21,2,Sentinel-1
4,5,378996,2017-01-01 05:48:00,91,0,0,0,9,2,Sentinel-1
...,...,...,...,...,...,...,...,...,...,...
441720,441721,379971,2022-02-28 17:03:54,82,14,0,0,4,2,Sentinel-1 Sentinel-2
441721,441722,379972,2022-02-28 17:03:54,96,0,0,0,4,2,Sentinel-1 Sentinel-2
441722,441723,379974,2022-02-28 17:03:54,86,9,0,0,5,2,Sentinel-1 Sentinel-2
441723,441724,379887,2022-02-28 17:03:54,68,17,0,0,15,2,Sentinel-1 Sentinel-2


The output should now look similar to the table above, with the following columns:<p>

>**id** = unique identification number for each ARLIE record <br>
**river_km_id** = identification number for the lake or river section polygon from the EU Hydro database<br>
**datetime** = date and time of ARLIE record<br>
**water_perc** = amount of water per waterbody in %<br>
**ice_perc** = amount of ice or snow cover per waterbody in %<br>
**other_perc** = amount of other features per waterbody in %<br>
**cloud_perc** = cloud cover per waterbody in %<br>
**nd_perc** = no data pixels per waterbody in %<br>
**source** = satellite data used to derive ARLIE record<br>
**qc** = quality control value with the following quality levels:
>> high quality: 0.5 > ARLIE_confidence ≥ 0<br>
>> medium quality: 1.5 > ARLIE_confidence ≥ 0.5<br>
>> low quality: 2.5 > ARLIE_confidence ≥ 1.5<br>
>> minimal quality: 3.0 ≥ ARLIE_confidence ≥ 2.5<br>
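
The quality ranges listed above can be written as a small helper function. The cell below is a sketch that simply encodes these four ranges; the function name `quality_level` is made up, and the assumption that the `qc` column holds the integers 0 to 3 is based only on the values visible in the tables of this notebook.

```python
def quality_level(confidence):
    """Map an ARLIE confidence value (0 to 3) to the quality level listed above."""
    if not 0 <= confidence <= 3.0:
        raise ValueError(f"confidence must be between 0 and 3, got {confidence}")
    if confidence < 0.5:
        return "high"
    if confidence < 1.5:
        return "medium"
    if confidence < 2.5:
        return "low"
    return "minimal"


# Assumption: qc flags are stored as the integers 0, 1, 2 and 3
for qc in (0, 1, 2, 3):
    print(qc, quality_level(qc))
```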
    
To organise the data, you can sort it by date and filter it for the highest quality level. To do this, run the cell below.

In [25]:
# Sort records by datetime
arlie = arlie.sort_values(by="datetime", ascending=True)

# Only get records with high confidence (i.e. QC flag = 0)
arlie = arlie.where(arlie.qc == 0).dropna()
display(arlie)

Unnamed: 0,id,river_km_id,datetime,water_perc,ice_perc,other_perc,cloud_perc,nd_perc,qc,source
5921,5922.0,379724.0,2017-01-26 16:54:35,0.0,41.0,0.0,51.0,8.0,0.0,Sentinel-1 Sentinel-2
6094,6095.0,379449.0,2017-01-26 16:54:35,0.0,0.0,0.0,83.0,17.0,0.0,Sentinel-1 Sentinel-2
6091,6092.0,379500.0,2017-01-26 16:54:35,0.0,9.0,0.0,80.0,11.0,0.0,Sentinel-1 Sentinel-2
5813,5814.0,379076.0,2017-01-26 16:54:35,0.0,0.0,0.0,88.0,12.0,0.0,Sentinel-1 Sentinel-2
5808,5809.0,379040.0,2017-01-26 16:54:35,0.0,25.0,0.0,56.0,19.0,0.0,Sentinel-1 Sentinel-2
...,...,...,...,...,...,...,...,...,...,...
406747,406748.0,378965.0,2022-12-31 10:54:38,0.0,0.0,0.0,82.0,18.0,0.0,Sentinel-2
406746,406747.0,378960.0,2022-12-31 10:54:38,0.0,0.0,0.0,93.0,7.0,0.0,Sentinel-2
406745,406746.0,378950.0,2022-12-31 10:54:38,0.0,0.0,0.0,88.0,12.0,0.0,Sentinel-2
406743,406744.0,378941.0,2022-12-31 10:54:38,0.0,0.0,0.0,82.0,18.0,0.0,Sentinel-2


#### 7) Display EU Hydro data

Along with the ARLIE statistics, we retrieved EU Hydro data in CSV format. It contains vector polygon data with information about the lake/river ID, lake/river basin name, EU Hydro ID, object name, area, and river kilometre from the mouth (only for rivers which are split into 10 km long sections). Note that we only receive the individual EU Hydro ID numbers and no river/lake names. Read more about the full EU Hydro dataset and retrieve the river/lake names [here](https://www.eea.europa.eu/en/datahub/datahubitem-view/2e782ca5-c7b2-4b48-8928-03031b642176).

By passing the CSV file into a GeoPandas GeoDataFrame, we can display the geometries on a Folium map. Run the cells below to see how this works.

In [26]:
from shapely.wkt import loads

# Read the geometries from csv file
geometries = pd.read_csv(os.path.join(outputDir, 'geometries.csv'), delimiter=";")

# Convert the WKT geometry strings to Shapely geometry objects
geometries['geometry'] = geometries['geometry'].apply(loads)

# Create a GeoDataFrame. As the ARLIE data comes in the LAEA European projection, we use the crs EPSG:3035
EUhydro_gdf = gpd.GeoDataFrame(geometries, geometry='geometry', crs='EPSG:3035')
EUhydro_gdf.head()

Unnamed: 0,id,geometry,basin_name,eu_hydro_id,object_nam,area,river_km
0,379338,"MULTIPOLYGON (((4164983.064 4162193.428, 41649...",Vorma,IW40032787,UNK,19931.63,
1,379412,"MULTIPOLYGON (((4163971.649 4163710.841, 41639...",Vorma,IW40032861,UNK,20109.02,
2,379666,"MULTIPOLYGON (((4163542.631 4168932.734, 41635...",Vorma,IW40033115,UNK,20765.77,
3,379927,"MULTIPOLYGON (((4168470.905 4174170.158, 41684...",Vorma,IW40033376,UNK,28225.79,
4,379866,"MULTIPOLYGON (((4163919.531 4172605.933, 41638...",Vorma,IW40033315,UNK,44720.36,
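
The ARLIE records and the EU Hydro polygons can be linked to each other. The cell below is a sketch on small made-up samples; it assumes that `river_km_id` in the ARLIE table refers to the `id` column of the geometries table, which is inferred from the column descriptions above and not confirmed here. The IDs are taken from the table above, while the `ice_perc` values are invented for illustration.

```python
import pandas as pd

# Made-up ARLIE records (ice_perc values are invented)
arlie_sample = pd.DataFrame({
    "river_km_id": [379338, 379338, 379412],
    "ice_perc": [10, 30, 0],
})

# Subset of the geometries table without the geometry column
geometries_sample = pd.DataFrame({
    "id": [379338, 379412],
    "basin_name": ["Vorma", "Vorma"],
    "eu_hydro_id": ["IW40032787", "IW40032861"],
})

# Average ice cover per polygon
mean_ice = arlie_sample.groupby("river_km_id", as_index=False)["ice_perc"].mean()

# Attach the averages to the polygons (assumed key: id == river_km_id)
joined = geometries_sample.merge(mean_ice, left_on="id", right_on="river_km_id", how="left")
print(joined)
```

A left join keeps every polygon, so polygons without any ARLIE record would show up with a missing `ice_perc` value.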


In [27]:
# Set the path to your AOI in geojson format and read it as GeoDataFrame
aoi = "aoi.geojson"
aoi_gdf = gpd.read_file(aoi)

# Retrieve the center points of the AOI to display it later
aoiX = aoi_gdf["geometry"].centroid.x
aoiY = aoi_gdf["geometry"].centroid.y

# Create an empty map
m = folium.Map(
    location=[aoiY[0],aoiX[0]],
    tiles='OpenStreetMap',
    zoom_start=7,
    min_zoom=1,
    max_zoom=15,
    control_scale=True
)

# Add the EU Hydro data and your AOI to an interactive map and display it
folium.GeoJson(EUhydro_gdf, name="EU Hydro rivers & Lakes", style_function=lambda x: {'color': 'RoyalBlue'}).add_to(m)
folium.GeoJson(aoi, name="AOI", style_function=lambda x: {'fillOpacity': 0.1,'color': 'Crimson'}).add_to(m)

folium.LayerControl().add_to(m)

m




Awesome! 🥳 You have now requested ARLIE data and the associated EU Hydro polygons for your AOI! Use them as you like, or continue with the next notebook if you are curious about how to analyse and manipulate the data further.