# Team work: Handling of available data

For testing your hypothesis in the team work part of the course, different datasets are provided:

- [EODC STAC Catalog](https://services.eodc.eu/browser/#/v1/): Sentinel-1 and other datasets
- Data on JupyterHub:
    - Data used within the hands-on exercise (`~/shared/datasets/rs`): ALOS, ASCAT, soil moisture, ...
    - Austrian Datacube (`~/shared/datasets/fe/data`): Sentinel-1, Sentinel-2, Corine Land Cover, ...
- Additionally, you can bring in your own datasets or access other STAC catalogs.

This notebook gives examples of how to access the different data sources through the JupyterHub. All methods shown load the data into an `xarray` object, which provides many predefined functions and comes with detailed [documentation](https://docs.xarray.dev/). You can also work in QGIS, run a local Python environment on your own PC, or use other tools you are familiar with. Please be aware that your work needs to end up in a report and a presentation at the end!
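
To give an idea of what working with an `xarray` object looks like, here is a minimal sketch on synthetic data. The array `demo`, its values and its dates are made up for illustration and are not part of the course datasets.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a loaded dataset: 3 time steps on a 2 x 2 grid.
values = np.arange(12, dtype="float64").reshape(3, 2, 2)
times = np.array(["2022-03-01", "2022-03-13", "2022-03-25"], dtype="datetime64[ns]")
demo = xr.DataArray(values, dims=("time", "y", "x"), coords={"time": times}, name="demo")

# Label-based selection: keep the acquisitions from the first half of March.
first_half = demo.sel(time=slice("2022-03-01", "2022-03-15"))

# Reduction along a named dimension: per-pixel mean over time.
temporal_mean = demo.mean(dim="time")

print(first_half.sizes["time"])
print(temporal_mean.values)
```

The same `sel` and `mean` calls work on the datasets loaded in the rest of this notebook, because they refer to dimensions by name rather than by position.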

## STAC data access

Here is a quick recap of [Unit 1]() of the course on how to access data available through STAC. For more details, revisit the corresponding notebook!

In [None]:
import gc

import pystac_client
from odc import stac as odc_stac

EODC offers a lot of datasets through STAC, but be aware that not all might be accessible. To get an overview, have a look at the collections:

In [None]:
# collections provided by the EODC STAC Catalog
eodc_catalog = pystac_client.Client.open("https://stac.eodc.eu/api/v1")
# get_collections() returns a one-shot iterator, so keep the result in a list
collections = list(eodc_catalog.get_collections())

max_length = max(len(collection.id) for collection in collections)
for collection in collections:
    print(f"{collection.id.ljust(max_length)}: {collection.title}")
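
With a long list of collections, a keyword filter can help to find the relevant ones. The following is a small sketch, not part of `pystac_client`: `find_collections` is a hypothetical helper that only relies on the `id` and `title` attributes used above. The stand-in objects and their titles are invented so that the example runs without network access.

```python
from types import SimpleNamespace


def find_collections(collections, keyword):
    """Return (id, title) pairs whose id or title contains the keyword."""
    keyword = keyword.lower()
    matches = []
    for collection in collections:
        title = collection.title or ""  # a collection may have no title
        if keyword in collection.id.lower() or keyword in title.lower():
            matches.append((collection.id, title))
    return matches


# Stand-ins for catalog collections; all titles and the last two ids are made up.
demo_collections = [
    SimpleNamespace(id="SENTINEL1_SIG0_20M", title="Hypothetical title A"),
    SimpleNamespace(id="HYPOTHETICAL_DEM", title=None),
    SimpleNamespace(id="HYPOTHETICAL_S1_COHERENCE", title="Sentinel1 demo title"),
]

sentinel1_matches = find_collections(demo_collections, "sentinel1")
print(sentinel1_matches)
```

In the notebook, the list of collections retrieved from the catalog can be passed in place of `demo_collections`.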

We will now load Sentinel-1 data for the area of Innsbruck from STAC.

In [None]:
time_range = "2022-03-01/2022-03-31"
innsbruck_bbox = (11.070099, 47.148400, 11.729279, 47.380219)
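
STAC bounding boxes are given as (west, south, east, north) in degrees. As a quick plausibility check, the extent can be converted to approximate kilometres. This is a rough sketch: `bbox_size_km` is a hypothetical helper, and the 111.32 km per degree is a spherical approximation, so the result is only good for orientation.

```python
import math

innsbruck_bbox = (11.070099, 47.148400, 11.729279, 47.380219)


def bbox_size_km(bbox):
    """Approximate width and height of a lon/lat bounding box in kilometres."""
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError("expected (west, south, east, north) in degrees")
    km_per_degree = 111.32  # rough length of one degree of latitude
    mid_latitude = math.radians((south + north) / 2)
    # Meridians converge towards the poles, so longitude degrees shrink with cos(lat).
    width_km = (east - west) * km_per_degree * math.cos(mid_latitude)
    height_km = (north - south) * km_per_degree
    return width_km, height_km


width_km, height_km = bbox_size_km(innsbruck_bbox)
print(f"{width_km:.1f} km x {height_km:.1f} km")
```

Swapping latitude and longitude is a common mistake; the check inside the function catches only part of those cases, so it is worth looking at the printed size as well.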

In [None]:
s1_collection_id = "SENTINEL1_SIG0_20M"
search = eodc_catalog.search(
    collections=s1_collection_id,
    bbox=innsbruck_bbox,
    datetime=time_range,
)
s1_items = search.item_collection()
len(s1_items)

In [None]:
bands = "VV"
# the data is loaded in a projected CRS, so the spatial dimensions are "x" and "y"
chunks = {"time": 1, "x": 500, "y": 500}

sig0_dc = odc_stac.load(
    s1_items,
    bands=bands,
    crs="EPSG:27704",
    resolution=20,
    bbox=innsbruck_bbox,
    chunks=chunks,
    resampling="bilinear",
)

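# scale factor and nodata value are stored in the STAC item metadata
vv_band_meta = s1_items[0].assets["VV"].extra_fields.get("raster:bands")[0]
scale = vv_band_meta["scale"]
nodata = vv_band_meta["nodata"]
# mask nodata pixels, then convert the stored values to physical values
sig0_dc = sig0_dc.where(sig0_dc != nodata) / scale
# drop time steps that contain missing values (default: how="any")
sig0_dc = sig0_dc.dropna(dim="time")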
sig0_dc.VV
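
The order of the two decoding steps matters: the nodata value refers to the stored numbers, so it has to be masked before the division by the scale factor. The sketch below shows this on a small NumPy array. The values of `nodata` and `scale` are assumed for illustration; the real ones come from the item metadata as shown above.

```python
import numpy as np

# Assumed example values; the real ones are read from the STAC item metadata.
nodata = -9999
scale = 10

raw = np.array([[-9999, -85], [-120, -9999]], dtype="int16")

# Replace nodata by NaN first, then divide by the scale factor.
decoded = np.where(raw != nodata, raw, np.nan) / scale

print(decoded)
```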

In [None]:
sig0_mean = sig0_dc.mean(dim="time", skipna=True)
sig0_mean.VV.plot(robust=True, cmap="Greys_r")
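
A note on averaging: if the backscatter values are stored in decibels (an assumption here, please check the collection description), the arithmetic mean of the dB values differs from the mean taken in the linear domain. The sketch below illustrates the difference with two made-up values; the helper functions are hypothetical and not part of any library.

```python
import math


def db_to_linear(value_db):
    """Convert a decibel value to the linear domain."""
    return 10 ** (value_db / 10)


def linear_to_db(value_linear):
    """Convert a linear value to decibels."""
    return 10 * math.log10(value_linear)


def mean_db_via_linear(values_db):
    """Average decibel values in the linear domain and return the result in dB."""
    linear = [db_to_linear(v) for v in values_db]
    return linear_to_db(sum(linear) / len(linear))


samples_db = [-10.0, -20.0]  # made-up backscatter values in dB
mean_of_db = sum(samples_db) / len(samples_db)
mean_via_linear = mean_db_via_linear(samples_db)
print(f"{mean_of_db:.2f} dB vs {mean_via_linear:.2f} dB")
```

Which of the two is appropriate depends on your hypothesis; for a quick visual overview, as in the plot above, the difference is usually not critical.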

In [None]:
del sig0_dc
gc.collect()

## CDSE STAC data access

We will now load Sentinel-2 data for the same area from the STAC catalog of the Copernicus Data Space Ecosystem (CDSE). Reading the data requires S3 credentials, which are passed on through the environment variables set below.

In [None]:
import os
import numpy as np
from matplotlib import pyplot as plt

In [None]:
# S3 access to the CDSE data store; GDAL reads these settings from the environment
os.environ["GDAL_HTTP_TCP_KEEPALIVE"] = "YES"
os.environ["AWS_S3_ENDPOINT"] = "eodata.dataspace.copernicus.eu"
os.environ["AWS_ACCESS_KEY_ID"] = ""  # Add your key
os.environ["AWS_SECRET_ACCESS_KEY"] = ""  # Add your key
os.environ["AWS_HTTPS"] = "YES"
os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE"
# note: this disables SSL certificate verification for GDAL HTTP requests
os.environ["GDAL_HTTP_UNSAFESSL"] = "YES"
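
Empty keys typically only show up as errors later, when the data is actually read. A small check before loading can save some debugging time. This is a sketch: `missing_variables` is a hypothetical helper, and the list of required names is taken from the cell above.

```python
import os

REQUIRED_VARIABLES = ("AWS_S3_ENDPOINT", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_variables(environment, required=REQUIRED_VARIABLES):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environment.get(name, "").strip()]


# Demo with a plain dict that mimics the cell above with the keys left empty.
demo_environment = {
    "AWS_S3_ENDPOINT": "eodata.dataspace.copernicus.eu",
    "AWS_ACCESS_KEY_ID": "",
    "AWS_SECRET_ACCESS_KEY": "",
}
print(missing_variables(demo_environment))

# The same check on the running session.
session_missing = missing_variables(os.environ)
```

If `session_missing` is not empty, add your keys before continuing.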

In [None]:
# collections provided by the CDSE STAC Catalog
cdse_catalog = pystac_client.Client.open("https://stac.dataspace.copernicus.eu/v1")

# get_collections() returns a one-shot iterator, so keep the result in a list
cdse_collections = list(cdse_catalog.get_collections())
max_length = max(len(collection.id) for collection in cdse_collections)

for collection in cdse_collections:
    print(f"{collection.id.ljust(max_length)}: {collection.title}")

In [None]:
time_range = "2022-03-01/2022-03-31"
innsbruck_bbox = (11.070099, 47.148400, 11.729279, 47.380219)

In [None]:
s2_collection_id = "sentinel-2-l2a"
search = cdse_catalog.search(
    collections=s2_collection_id,
    bbox=innsbruck_bbox,
    datetime=time_range,
)
s2_items = search.item_collection()
len(s2_items)

In [None]:
# the data is loaded in a projected CRS, so the spatial dimensions are "x" and "y"
chunks = {"time": 1, "x": 500, "y": 500}
bands = ["B02_10m", "B03_10m", "B04_10m"]

s2_dc = odc_stac.load(
    s2_items,
    bands=bands,
    crs="EPSG:27704",
    resolution=20,
    bbox=innsbruck_bbox,
    chunks=chunks,
    resampling="bilinear",
)

s2_dc

In [None]:
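s2_first_date = s2_dc.isel(time=0)

# true-colour composite: red (B04), green (B03), blue (B02)
rgb = np.dstack(
    [s2_first_date["B04_10m"], s2_first_date["B03_10m"], s2_first_date["B02_10m"]]
)
# divide by 10000 and clip to the 0-1 range that imshow expects for float RGB data
scaled_rgb = np.clip(rgb / 10000, 0, 1)
plt.imshow(scaled_rgb)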
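
Dividing by a fixed value often gives a rather dark image. A common alternative, shown here as a sketch and not as part of the course method, is a percentile stretch per band. `percentile_stretch` is a hypothetical helper, and the random array only stands in for real Sentinel-2 values.

```python
import numpy as np


def percentile_stretch(image, lower=2, upper=98):
    """Rescale each band to 0-1 between its lower and upper percentiles."""
    image = np.asarray(image, dtype="float64")
    stretched = np.empty_like(image)
    for band in range(image.shape[-1]):
        low, high = np.nanpercentile(image[..., band], [lower, upper])
        if high == low:
            stretched[..., band] = 0.0  # constant band: nothing to stretch
        else:
            stretched[..., band] = (image[..., band] - low) / (high - low)
    return np.clip(stretched, 0, 1)


# Synthetic values in place of the Sentinel-2 bands.
rng = np.random.default_rng(0)
demo_rgb = rng.uniform(0, 3000, size=(50, 50, 3))
stretched_rgb = percentile_stretch(demo_rgb)
print(stretched_rgb.shape)
```

With the array from the cell above, `plt.imshow(percentile_stretch(rgb))` would display the stretched composite. Note that the stretch changes the colour balance between bands, so it is meant for visualisation only.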
