# 3.1 Data Processing
In this exercise we will build a complete EO workflow on a cloud platform, from accessing the data to obtaining the result. 
In this example we will analyse snow cover in the Alps. 
**MORE DETAILS HERE**: This exercise should be more repetition, and the goal is that everybody arrives at the result - without coding very much themselves. Then the transfer application will be done in the sharing exercise

We are going to follow these steps in our analysis:
- Load relevant data sources
- Specify the spatial, temporal extents and the features we are interested in
- Process the satellite data to retrieve snow cover information
- Aggregate information in data cubes
- Track the resources we use for our computation
- Visualize and analyse the results


## Login

In [5]:
# platform libraries
import openeo
from sentinelhub import (SHConfig, SentinelHubRequest, DataCollection, MimeType, CRS, BBox, bbox_to_dimensions, geometry)

# utility libraries
from datetime import date
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import folium

In [2]:
config = SHConfig()
config.sh_client_id = %env SH_CLIENT_ID
config.sh_client_secret = %env SH_CLIENT_SECRET

In [3]:
conn = openeo.connect('https://jjdxlu8vwl.execute-api.eu-central-1.amazonaws.com/production')

In [4]:
conn.authenticate_basic(username=config.sh_client_id, password=config.sh_client_secret)

<Connection to 'https://jjdxlu8vwl.execute-api.eu-central-1.amazonaws.com/production/' with BasicBearerAuth>
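The `%env` magic used above only works inside IPython or Jupyter. As a minimal sketch for running the same login from a plain Python script, the credentials could be read from the environment as follows. The helper name `read_credentials` is hypothetical and not part of `openeo` or `sentinelhub`; only the variable names `SH_CLIENT_ID` and `SH_CLIENT_SECRET` are taken from the cell above.

```python
import os


def read_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper: it only mirrors the variable names used in the
    notebook cell above and fails early if one of them is not set.
    """
    required = ("SH_CLIENT_ID", "SH_CLIENT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["SH_CLIENT_ID"], env["SH_CLIENT_SECRET"]
```

The returned pair can then be passed to `conn.authenticate_basic(username=..., password=...)` in the same way as in the cell above.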

In [None]:
# See these resources for more examples:
# https://github.com/openEOPlatform/sample-notebooks/blob/main/openEO%20Platfrom%20-%20Basics.ipynb
# https://github.com/Open-EO/openeo-community-examples/tree/main/python

## Select a region of interest
- Select a fixed region for all students -> easier evaluation, easier catchment analysis, easier validation
- Everybody chooses a region within a predefined area -> more fun, reusable for the next exercise (ideally a STAC catalog with all entries of the course participants shows which regions are already computed). Size limit: X x X pixels

--> We will start with a fixed region and recalculate the result in the sharing lesson

Load the catchment area.
**Possible Question: What is the city at the outlet of the catchment? a) Meran, b) Innsbruck, c) Grenoble**

In [7]:
catchment_outline = gpd.read_file('data/catchment_outline.geojson')

In [8]:
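# Centre the map on the middle of the catchment's bounding box.
# Plain floats (instead of pandas Series) avoid folium's coordinate warnings.
minx, miny, maxx, maxy = catchment_outline.total_bounds
m = folium.Map(location=[(miny + maxy) / 2, (minx + maxx) / 2])  # folium expects [lat, lon]
folium.GeoJson(data=catchment_outline.to_json(), name='catchment').add_to(m)
m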



## Configuring the data content of the cube
We need to set the following configurations to define the content of the data cube we want to access:
- dataset name
- band names
- time range
- the area of interest, specified via bounding box coordinates
- spatial resolution
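As a rough sketch of how the area of interest can be expressed, the bounding box of the catchment can be turned into the `west`/`south`/`east`/`north` dictionary that openEO uses for spatial extents. The helper `bounds_to_spatial_extent` is hypothetical, the coordinates below are placeholders for illustration only, and the sketch assumes the GeoJSON is stored in WGS84 (EPSG:4326).

```python
def bounds_to_spatial_extent(bounds, crs=4326):
    """Convert (minx, miny, maxx, maxy) into an openEO-style spatial extent dict."""
    west, south, east, north = (float(value) for value in bounds)
    if west >= east or south >= north:
        raise ValueError("bounds must be ordered as (minx, miny, maxx, maxy)")
    return {"west": west, "south": south, "east": east, "north": north, "crs": crs}


# Placeholder coordinates, not the real catchment; in the notebook the bounds
# would come from catchment_outline.total_bounds.
example_extent = bounds_to_spatial_extent((11.0, 46.5, 11.5, 46.9))
print(example_extent)
```

In the notebook, a dictionary like this would typically be passed as the `spatial_extent` argument of `conn.load_collection(...)`, together with the dataset name, band names and time range.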

To select the correct dataset we can first list all the available datasets.

In [9]:
print(conn.list_collection_ids())

['SENTINEL2_L2A_MOSAIC_120', 'COPERNICUS_30', 'MAPZEN_DEM', 'SENTINEL1_GRD', 'CDS_2M_TEMP_2020', 'ALOS_PALSAR2_RICE_PADDY_FIELD_MAP', 'ALOS_PALSAR2_AGRICULTURE', 'ALOS_PALSAR2_L2_1_3M', 'ALOS_PALSAR2_L2_1_10M', 'CAMS_GLC', 'CNR_CHL', 'CNES_LAND_COVER_MAP', 'SENTINEL_5P_CO_T3D_AVERAGE', 'CORINE_LAND_COVER', 'CORINE_LAND_COVER_ACCOUNTING_LAYERS', 'E12C_MOTORWAY', 'E12D_PRIMARY', 'ESA_WORLDCOVER_10M_2020_V1', 'GHS_BUILT_S2', 'GLOBAL_LAND_COVER', 'GLOBAL_SURFACE_WATER', 'NASA_HARMONIZED_LANDSAT_SENTINEL', 'ICEYE_GRD_E11', 'ICEYE_GRD_E11A', 'ICEYE_GRD_E13B', 'ICEYE_GRD_E3', 'JAXA_WQ_CHLA_ANOMALY', 'JAXA_WQ_CHLA_AVERAGE', 'JAXA_WQ_TSM_ANOMALY', 'JAXA_WQ_TSM_AVERAGE', 'LANDSAT1-5_MSS_L1', 'LANDSAT4-5_TM_L1', 'LANDSAT4-5_TM_L2', 'LANDSAT7_ETM_L1', 'LANDSAT7_ETM_L2', 'LANDSAT8-9_L1', 'LANDSAT8-9_L2', 'MODIS', 'LTK_NATIONAL_HIGH_RESOLUTION_LAYER', 'POPULATION_DENSITY', 'SENTINEL_5P_CH4_T7D_AVERAGE', 'SENTINEL_5P_NO2_T14D_AVERAGE', 'SEA_ICE_INDEX', 'SEASONAL_TRAJECTORIES', 'SENTINEL1_CARD4L', 'SE
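Because the full list is long, it can help to filter it by keyword. The following is a minimal sketch in plain Python; `find_collections` is a hypothetical helper, and `sample_ids` is a small subset copied from the output above. In the notebook, the list returned by `conn.list_collection_ids()` would be passed instead.

```python
def find_collections(collection_ids, keyword):
    """Return all collection IDs that contain the keyword (case-insensitive)."""
    keyword = keyword.upper()
    return [cid for cid in collection_ids if keyword in cid.upper()]


# A few IDs taken from the printed list above, for illustration.
sample_ids = [
    "SENTINEL2_L2A_MOSAIC_120",
    "COPERNICUS_30",
    "MAPZEN_DEM",
    "SENTINEL1_GRD",
    "LANDSAT8-9_L2",
]
print(find_collections(sample_ids, "sentinel2"))
```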

We want to use the Sentinel-2 L2A product. Its name is `'SENTINEL2_L2A_SENTINELHUB'`. 

We get the metadata for this collection as follows.

In [11]:
conn.describe_collection("SENTINEL2_L2A_SENTINELHUB")
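The metadata is useful for finding the available band names. The sketch below assumes that the returned metadata behaves like a dictionary and describes its dimensions under the `cube:dimensions` key, as in the STAC datacube layout used by openEO collections. The helper `band_names` and the content of `sample_metadata` are hypothetical stand-ins, not the actual response of this backend.

```python
def band_names(metadata):
    """Return the band names listed in the 'bands' dimension of collection metadata."""
    dimensions = metadata.get("cube:dimensions", {})
    for dimension in dimensions.values():
        if dimension.get("type") == "bands":
            return list(dimension.get("values", []))
    return []  # no band dimension described in the metadata


# Hypothetical, heavily shortened metadata used only to illustrate the structure.
sample_metadata = {
    "id": "EXAMPLE_COLLECTION",
    "cube:dimensions": {
        "x": {"type": "spatial", "axis": "x"},
        "y": {"type": "spatial", "axis": "y"},
        "t": {"type": "temporal"},
        "bands": {"type": "bands", "values": ["B03", "B11"]},
    },
}
print(band_names(sample_metadata))
```

In the notebook, the result of `conn.describe_collection("SENTINEL2_L2A_SENTINELHUB")` would be passed to the helper instead of `sample_metadata`.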

As a time range we will focus on the snow melting season 2018, in particular from February to June 2018.
**How many images are available in the time range?**
**How many pixels are in the data cube?** (`time * x * y * bands`)
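The size of the data cube follows from multiplying the lengths of its dimensions. As a minimal sketch of that arithmetic, assuming the extent is given in metres and the pixel size is the spatial resolution, the count could be computed as below. The helper `cube_pixel_count` is hypothetical, and the numbers in the example are placeholders, not the values for the catchment.

```python
import math


def cube_pixel_count(width_m, height_m, resolution_m, n_times, n_bands):
    """Number of pixels in a cube: time * x * y * bands."""
    nx = math.ceil(width_m / resolution_m)   # pixels along x
    ny = math.ceil(height_m / resolution_m)  # pixels along y
    return n_times * nx * ny * n_bands


# Placeholder example: a 1 km x 2 km area at 10 m resolution,
# with 3 time steps and 2 bands.
print(cube_pixel_count(1000, 2000, 10, n_times=3, n_bands=2))  # 3 * 100 * 200 * 2 = 120000
```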
