# MRS-UE: Handling of the additional datasets available for the team work

Besides the data used in the hands-on exercises, additional data (the ACube data) is provided for the team work part. You can find it under: `~/shared/datasets/fe/data`

There are 3 types of datasets available:

- Sentinel-1 data
- Sentinel-2 data
- Auxiliary data

In [1]:
from pathlib import Path
import rioxarray
import xarray as xr
import pandas as pd
import hvplot.xarray

## Data structure

The data is projected and tiled based on the Equi7Grid. Therefore, all datasets follow the same folder structure:

PRODUCT / SUBGRID / TILE / LAYER

In [2]:
source_path = Path(r'~/shared/datasets/fe/data').expanduser()  # additional data for team part

res = 10  # 10m or 500m
sentinel1_parameter_path = source_path / 'sentinel1' / 'parameters' / f'EU{res:03d}M'
sentinel1_preprocessed_path = source_path / 'sentinel1' / 'preprocessed' / f'EU{res:03d}M'
sentinel2_path = source_path / 'sentinel2' / 'L2A' / 'EU010M'

sentinel1_parameter_path

PosixPath('/home/froth/shared/datasets/fe/data/sentinel1/parameters/EU010M')
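
To see which tiles and layers exist below one of these paths, the PRODUCT / SUBGRID / TILE / LAYER layout can be walked with `pathlib`. The following is a minimal sketch: the helper `list_tiles_and_layers` is a hypothetical name, and the demo runs on a temporary directory tree with partly made-up tile and layer names rather than on the shared data.

```python
from pathlib import Path
import tempfile


def list_tiles_and_layers(subgrid_path):
    """Map each tile folder below `subgrid_path` to its sorted layer folders."""
    subgrid_path = Path(subgrid_path)
    return {
        tile.name: sorted(layer.name for layer in tile.iterdir() if layer.is_dir())
        for tile in sorted(subgrid_path.iterdir())
        if tile.is_dir()
    }


# Demo on a throwaway tree that mimics PRODUCT / SUBGRID / TILE / LAYER.
# 'E052N015T1' and 'other_layer' are hypothetical names used for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_subgrid = Path(tmp) / "sentinel1" / "preprocessed" / "EU010M"
    for tile_name in ("E051N015T1", "E052N015T1"):
        for layer_name in ("sig0", "other_layer"):
            (demo_subgrid / tile_name / layer_name).mkdir(parents=True)

    demo_listing = list_tiles_and_layers(demo_subgrid)

print(demo_listing)
```

On the JupyterHub, the same helper could be called with `sentinel1_preprocessed_path` or `sentinel1_parameter_path` to get an overview before loading any raster.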

## Data loading
Examples of how to load the additional data.

### Sentinel-1 (single file)
Sentinel-1 data is stored under: `~/shared/datasets/fe/data/sentinel1`

How to load a single Sentinel-1 observation file:
- Collect the file from the file system
- Load it as an xarray Dataset
- Prepare the data for further usage (decoding and clean-up)
- Work with the data

In [3]:
def acube_s1_preprocess(x):

    '''
    Decode and clean up Sentinel-1 ACube data.

    Parameters
    ----------
    x : xarray.Dataset
    
    Returns
    -------
    xarray.Dataset
    '''

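    # xarray stores the path of the opened file in the dataset encoding
    path = Path(x.encoding["source"])
    filename = path.name

    # Parse the timestamp from the first two filename fields,
    # e.g. 'D20210122_165830--_...' -> 2021-01-22 16:58:30
    date_str = filename.split('_')[0][1:]
    time_str = filename.split('_')[1][:6]
    datetime_str = date_str + time_str
    date = pd.to_datetime(datetime_str, format='%Y%m%d%H%M%S')
    x = x.expand_dims(dim={"time": [date]})

    # Name the variable after the layer folder (e.g. 's1_sig0')
    # and drop the single-element band dimension
    x = x.rename({"band_data": "s1_" + path.parent.stem}).\
        squeeze("band").\
        drop_vars("band")

    # Decode the stored values by applying the scale factor of 0.01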
    return x * 0.01
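
The timestamp handling in `acube_s1_preprocess` can be checked in isolation. The sketch below repeats the same string slicing on the example filename used in this notebook, with the standard library `datetime` module standing in for `pd.to_datetime`. The function name `parse_acquisition_time` is hypothetical.

```python
from datetime import datetime


def parse_acquisition_time(filename):
    """Return the datetime encoded in the first two fields of an ACube filename."""
    fields = filename.split("_")
    date_str = fields[0][1:]  # drop the leading letter: 'D20210122' -> '20210122'
    time_str = fields[1][:6]  # keep HHMMSS only: '165830--' -> '165830'
    return datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")


example_name = (
    "D20210122_165830--_SIG0-----_S1BIWGRDH1VVA_044_A0105_EU010M_E051N015T1.tif"
)
acquired = parse_acquisition_time(example_name)
print(acquired.isoformat())  # 2021-01-22T16:58:30
```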

In [6]:
single_path = sentinel1_preprocessed_path / 'E051N015T1' / 'sig0' / 'D20210122_165830--_SIG0-----_S1BIWGRDH1VVA_044_A0105_EU010M_E051N015T1.tif'

s1 = xr.open_dataset(
    single_path,
    engine="rasterio",
    )
s1 = acube_s1_preprocess(s1)
s1

In [7]:
# visualize data
s1.hvplot.image(robust=True, data_aspect=1, cmap="Greys_r", rasterize=True)

### Sentinel-1 (multiple files)
Similar to the hands-on exercises, one can load multiple Sentinel-1 files at once, but be aware of the limited memory available in the JupyterHub. It is suggested to use the provided statistical parameters (introduced below) instead of loading multi-temporal data.
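
To get a feeling for the memory limit, the size of a stack can be estimated before loading it. This is a rough sketch that assumes tiles of 10000 x 10000 pixels held as float32 (4 bytes per value), which is the size of the example tile used in this notebook. The function name `estimate_mib` is hypothetical, and the estimate ignores any overhead from intermediate copies.

```python
def estimate_mib(n_files, rows=10_000, cols=10_000, bytes_per_value=4):
    """Estimate the in-memory size in MiB of `n_files` stacked single-band rasters."""
    return n_files * rows * cols * bytes_per_value / 2**20


for n in (1, 2, 30):
    print(f"{n:>3} file(s): {estimate_mib(n):9.2f} MiB")
```

A single observation already takes about 381 MiB once decoded, so a time series of a few dozen observations quickly reaches several GiB.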

In [8]:
tile_path = sentinel1_preprocessed_path / 'E051N015T1' / 'sig0'
sig0_day_paths = list(tile_path.glob('D20210122*S1*IWGRDH1VV*.tif'))
len(sig0_day_paths)

2
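
What the glob pattern selects can be tried out without touching the file system. In the sketch below, `fnmatchcase` from the standard library is used as a stand-in for `Path.glob`, on the assumption that both treat `*` the same way for file names inside a single folder. Only the first filename is taken from this notebook; the other two are hypothetical.

```python
from fnmatch import fnmatchcase

pattern = "D20210122*S1*IWGRDH1VV*.tif"

candidates = [
    # Taken from the single-file example in this notebook.
    "D20210122_165830--_SIG0-----_S1BIWGRDH1VVA_044_A0105_EU010M_E051N015T1.tif",
    # Hypothetical: same acquisition, but VH polarization.
    "D20210122_165830--_SIG0-----_S1BIWGRDH1VHA_044_A0105_EU010M_E051N015T1.tif",
    # Hypothetical: VV polarization, but acquired on a different day.
    "D20210123_052113--_SIG0-----_S1AIWGRDH1VVD_022_A0105_EU010M_E051N015T1.tif",
]

# Keep only the names that match the date and the VV polarization.
matches = [name for name in candidates if fnmatchcase(name, pattern)]
print(matches)
```

Only the first name passes: the date prefix restricts the selection to one day, and `IWGRDH1VV` restricts it to VV polarization.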

In [9]:
s1_day = xr.open_mfdataset(
    sig0_day_paths,
    engine="rasterio",
    combine='nested',
    chunks=-1,
    preprocess=acube_s1_preprocess
    )

s1_day

| | Array | Chunk |
|---|---|---|
| Bytes | 762.94 MiB | 381.47 MiB |
| Shape | (2, 10000, 10000) | (1, 10000, 10000) |
| Dask graph | 2 chunks in 21 graph layers | 2 chunks in 21 graph layers |
| Data type | float32 numpy.ndarray | float32 numpy.ndarray |


In [10]:
s1_day = s1_day.mean(dim='time', skipna=True)

In [11]:
s1_day.hvplot.image(robust=True, data_aspect=1, cmap="Greys_r", rasterize=True)

### Sentinel-1 (parameters)
Parameters are statistics derived from the Sentinel-1 backscatter. They are provided under: `~/shared/datasets/fe/data/sentinel1/parameters`

The most relevant statistical parameters are:

- tmaxsig0: Maximum SIG0 backscatter per relative orbit
- tmaxsig38: Maximum SIG0 backscatter normalized to an incidence angle of 38 degrees
- tmensig0: Average SIG0 backscatter per relative orbit
- tmensig38: Average SIG0 backscatter normalized to an incidence angle of 38 degrees
- tminsig0: Minimum SIG0 backscatter per relative orbit
- tminsig38: Minimum SIG0 backscatter normalized to an incidence angle of 38 degrees

In [14]:
par_path = sentinel1_parameter_path / 'E051N015T1' / 'tmensig38' / 'M20160101_20171231_TMENSIG38_S1-IWGRDH1VV-_---_B0104_EU010M_E051N015T1.tif'

par = xr.open_dataset(
    par_path,
    engine="rasterio",
    )
par = acube_s1_preprocess(par)

In [15]:
par.hvplot.image(robust=True, data_aspect=1, cmap="Greys_r", rasterize=True)
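
Note that `acube_s1_preprocess` reads the first two filename fields as date and time. For the parameter file above, this gives a `time` coordinate of 2016-01-01 20:17:12, built from `20160101` and the first six digits of `20171231`, so it should not be read as an acquisition time. The sketch below assumes instead that the two fields are the start and end date of the aggregation period, which is how the example filename reads. The function name `parse_parameter_name` is hypothetical.

```python
from datetime import datetime


def parse_parameter_name(filename):
    """Split an ACube parameter filename into start date, end date and variable."""
    fields = filename.split("_")
    # Assumption: field 0 is a letter plus the start date, field 1 the end date.
    start = datetime.strptime(fields[0][1:], "%Y%m%d").date()
    end = datetime.strptime(fields[1], "%Y%m%d").date()
    # Field 2 holds the parameter name, padded with '-' in other fields' style.
    variable = fields[2].strip("-").lower()
    return start, end, variable


start_date, end_date, variable_name = parse_parameter_name(
    "M20160101_20171231_TMENSIG38_S1-IWGRDH1VV-_---_B0104_EU010M_E051N015T1.tif"
)
print(start_date, end_date, variable_name)  # 2016-01-01 2017-12-31 tmensig38
```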

### Sentinel-2
Sentinel-2 data is available under `~/shared/datasets/fe/data/sentinel2`

In this example, we will load the true-color image (TCI), but the dataset contains all the available bands from Sentinel-2.

In [16]:
s2_path = sentinel2_path / 'E051N015T1' / 'tci' / 'TCI-------_SEN2COR_S2B_L2A------_20210125_20210125_EU010M_E051N015T1.tif'

s2 = xr.open_dataset(
    s2_path,
    engine="rasterio",
    )
s2.coords['band'] = [0, 1, 2]
s2

In [17]:
s2.hvplot.rgb(
    x="x",
    y="y",
    bands='band',
    rasterize=True,
    xlabel="x",
    ylabel="y",
).redim.nodata(z=0)