## 3. Compile Time Series by Units of Analysis

**Notebook Description:**

This notebook aggregates the pixels of a Sentinel-1 image time series by landscape unit. The landscape units were extracted by segmenting a Sentinel-2 composite. The result is an "average" time series for each landscape unit that represents the general temporal pattern of Sentinel-1 backscatter over a given year. The temporal resolution of the "average" time series is determined by the interval between Sentinel-1 images, which is 10 days in this case.

In [26]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import glob
import geopandas as gpd

import xarray as xr
import rioxarray as rxr

from rasterio.features import geometry_mask

base_dir = '/geoanalytics_user_shared_data/vtang/PEOPLE-ER_Vietnam/'

# Load Sentinel-1 Data

- Read all GeoTIFF images from a given folder.

- It is recommended to keep all Sentinel-1 images for a year in a dedicated folder.

- The GeoTIFF file name should end with the image order number, such as "***02***" in "*an_giang_2022_10d_**02**.tif*".
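
The naming convention above can be illustrated with a minimal sketch. The helper name `order_number` and the sample file names are hypothetical; only the `_NN.tif` suffix pattern is taken from the example above. Sorting by the parsed number matters because a file listing is not guaranteed to come back in acquisition order.

```python
from pathlib import Path


def order_number(fpath):
    """Return the integer after the last underscore in the file stem."""
    return int(Path(fpath).stem.split('_')[-1])


# hypothetical file names that follow the pattern described above
sample_paths = [
    'an_giang_2022_10d_10.tif',
    'an_giang_2022_10d_02.tif',
    'an_giang_2022_10d_01.tif',
]

# sort numerically by order number rather than relying on listing order
sorted_paths = sorted(sample_paths, key=order_number)
print(sorted_paths)
```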

In [84]:
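# find all the .tif files in a given folder
fpath_list = glob.glob(base_dir + 'sentinel-1/2022/10d_composition/*.tif')

# glob returns files in arbitrary order, so sort them by image order number
fpath_list.sort(key=lambda fpath: int(fpath.split('_')[-1].split('.')[0]))

# extract image order number from file name
x = [int(item.split('_')[-1].split('.')[0]) for item in fpath_list]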

# load all the tif files as a list of xarray DataArray
da_list = []
for fpath in fpath_list:
    da = rxr.open_rasterio(fpath)  # load Sentinel-1 image
    da = da.sel(band=1)  # select VH band
    da_list.append(da)

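# concatenate the list of xarray DataArray along a new leading dimension
# (named 'time' here; its coordinate holds the image order numbers)
da = xr.concat(da_list, pd.Index(x, name='time'))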

# convert unit of Sentinel-1 values from linear to decibel
da.values = 10 * np.log10(da.values)
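
The conversion above applies dB = 10 · log10(linear) to every pixel. As a hedged sketch on a plain NumPy array, the hypothetical helper `to_decibel` below does the same while mapping non-positive values to NaN. Treating such pixels as missing is an assumption made here, not something the notebook states. Without it, zeros become `-inf` and negative values trigger a warning.

```python
import numpy as np


def to_decibel(linear):
    """Convert linear backscatter to dB; non-positive input becomes NaN."""
    linear = np.asarray(linear, dtype=float)
    out = np.full(linear.shape, np.nan)
    positive = linear > 0
    out[positive] = 10 * np.log10(linear[positive])
    return out


# made-up linear values, including a zero that has no valid dB value
print(to_decibel([1.0, 0.1, 0.01, 0.0]))
```

Note that `np.median` propagates NaN, so a polygon containing such pixels would need `np.nanmedian` if NaN is used as the missing-value marker.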

# Load Polygons of Landscape Units

In [85]:
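# load the landscape unit polygons
gdf = gpd.read_file('data/an_giang_segmentation.geojson')
# reproject the polygons to the CRS of the Sentinel-1 stack
gdf = gdf.to_crs(da.rio.crs)
# use the polygon ID as the index
gdf = gdf.set_index('PID')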

# Aggregate Pixels by Units

- Select one landscape unit (i.e. polygon) and use it to clip the Sentinel-1 image stack.
- Calculate the median of the pixels within the polygon for each layer.
    - Layers are the Sentinel-1 images from different dates.
- Collect the medians into a 1D array: the time series that represents the temporal profile of the selected landscape unit.
- Repeat the above steps for all polygons.
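
The steps above can be sketched on a toy array. Everything in this example is made up for illustration: the stack values, the 4 x 4 grid, and the hand-built boolean mask that stands in for a rasterized polygon.

```python
import numpy as np

# toy stack: 3 time steps, 4 x 4 pixels (values are made up)
stack = np.arange(3 * 4 * 4, dtype=float).reshape(3, 4, 4)

# boolean mask, True for pixels inside the (hypothetical) polygon
inside = np.zeros((4, 4), dtype=bool)
inside[1:3, 1:3] = True

# boolean indexing on the last two axes flattens them: (time, n_pixels)
pixels = stack[:, inside]

# one median per time step gives the unit's time series
series = np.median(pixels, axis=-1)
print(pixels.shape, series)
```

Indexing with a 2D boolean mask collapses the two spatial axes into one, which is why the median is taken along the last axis.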

In [86]:
n = da.shape[0]  # number of images (i.e. time steps)
df = pd.DataFrame(index=gdf.index, columns=np.arange(1, n + 1))

for pid, row in gdf.iterrows():
    
    # subset image to improve efficiency
    xmin, ymin, xmax, ymax = row.geometry.bounds
    da2 = da.sel(x=slice(xmin, xmax), y=slice(ymax, ymin))

    # rasterize the polygon; invert=True marks pixels inside it as True
    mask = geometry_mask(
        [row.geometry],
        out_shape=da2.shape[-2:],  # get dimension of x and y
        transform=da2.rio.transform(),
        invert=True,
    )

    data = da2.values[:, mask]  # select pixels within polygon mask
    df.loc[pid] = np.median(data, axis=-1)  # aggregate by median

In [None]:
# save the time series table (uncomment to write the file)
# df.to_csv('data/vh_2022.csv')