# 2: Example of Sentinel-1 RTC time series analysis


Now that we have done so much work to organize these two datasets and prepare them for analysis, let's explore the data with a scientific question in mind.

In this example, we have data over an area of interest covering two glaciers and two proglacial lakes in the Central Himalaya near the India-China border. As a glaciologist, I might be interested in questions about the conditions of these surfaces: is there a seasonal pattern to proglacial lake conditions? Do the lakes freeze during the winter, and if so, at similar times? What is the surface of the glacier like at certain times of year? SAR backscatter imagery may not conclusively answer any of these questions by itself, but it can provide important insights into surface conditions and how they change over time, which could help answer some of them.

This notebook will walk through some initial steps of how you could use the data objects we've created to explore backscatter dynamics over time and space for the area of interest we have identified. 

## Learning objectives 

### Concepts
- Subset a larger dataset to spatial areas of interest
- Computations and reductions
- Data visualization
  
### Techniques
- [GeoPandas](https://geopandas.org/en/stable/):
  - Handling projections
  - Spatial joins of multiple vector datasets
  - Interactive data visualization
- [Xarray](https://xarray.dev/) and [RioXarray](https://corteva.github.io/rioxarray/stable/):
  - Clip raster object by a vector object
  - Computations and reductions along different dimensions
  - Data visualization

## Software and setup

In [None]:
%xmode minimal
import geopandas as gpd
import xarray as xr
import rioxarray  # noqa: F401, importing registers the .rio accessor used below
from shapely import geometry
import matplotlib.pyplot as plt
import numpy as np

### Utility functions

In [2]:
def get_bbox_single(input_xr, buffer=0):
    """Return the bounding box of an xarray object as a GeoDataFrame.

    The box is built from the min/max of the x and y coordinates, optionally
    expanded by `buffer` (in CRS units), and carries the CRS of `input_xr`."""

    xmin = input_xr.coords["x"].data.min()
    xmax = input_xr.coords["x"].data.max()

    ymin = input_xr.coords["y"].data.min()
    ymax = input_xr.coords["y"].data.max()

    crs = input_xr.rio.crs

    # expand the footprint by the buffer, if one was requested
    bounds_xmin = xmin - buffer
    bounds_xmax = xmax + buffer
    bounds_ymin = ymin - buffer
    bounds_ymax = ymax + buffer

    # closed ring of corner points, counter-clockwise from the lower left
    bounds_ls = [
        (bounds_xmin, bounds_ymin),
        (bounds_xmax, bounds_ymin),
        (bounds_xmax, bounds_ymax),
        (bounds_xmin, bounds_ymax),
        (bounds_xmin, bounds_ymin),
    ]

    bounds_geom = geometry.Polygon(bounds_ls)
    bound_gdf = gpd.GeoDataFrame(index=[0], crs=crs, geometry=[bounds_geom])

    return bound_gdf

In [3]:
def power_to_db(input_arr):
    """Convert backscatter from linear power units to decibels (dB)."""
    return 10 * np.log10(np.abs(input_arr))
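The conversion above is the standard `10 * log10(power)` decibel formula. The sketch below uses made-up power values (not drawn from the dataset) to show the conversion and why the order of averaging and conversion matters. Averaging in linear power units and then converting, as the plotting cells in this notebook do, gives a different result than averaging values that are already in dB.

```python
import numpy as np


def power_to_db(input_arr):
    """Convert linear power to decibels (same formula as the notebook's helper)."""
    return 10 * np.log10(np.abs(input_arr))


# Illustrative backscatter values in linear power units (hypothetical numbers)
power = np.array([0.01, 0.1, 1.0])

db = power_to_db(power)

# Reduce first, then convert: the mean is taken in the linear power domain
mean_then_db = power_to_db(power.mean())

# Convert first, then reduce: this is a geometric-style mean and is lower
db_then_mean = power_to_db(power).mean()

print(db)
print(mean_then_db, db_then_mean)
```

Because the logarithm is concave, the mean of dB values can never exceed the dB of the mean power, so the two approaches should not be mixed within one analysis.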

### Read in prepared RTC data

- This example uses the ASF dataset.

In [None]:
asf_cube = xr.open_dataset("../data/tutorial2/s1_asf_cube_updated.zarr")

In [None]:
asf_cube

In [6]:
asf_cube = asf_cube.where(asf_cube.vv != 0.0, np.nan, drop=False)
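The cell above replaces pixels where `vv` is exactly zero with NaN, on the assumption that zeros mark missing data rather than real measurements. Because the condition is applied to the whole dataset, the other data variables are masked at those same pixels. The following is a minimal NumPy analogue with made-up values, showing how unmasked zeros would bias a spatial mean:

```python
import numpy as np

# Tiny hypothetical backscatter tile in linear power; 0.0 stands in for nodata
vv = np.array([[0.0, 0.04], [0.06, 0.0]])

# Keep values where the condition holds, otherwise substitute NaN
vv_masked = np.where(vv != 0.0, vv, np.nan)

# Treating the zeros as data drags the mean down
naive_mean = vv.mean()

# NaN-aware reduction ignores the masked pixels
masked_mean = np.nanmean(vv_masked)

print(naive_mean, masked_mean)
```

Xarray reductions such as `.mean()` skip NaN by default for floating-point data, which is why masking up front is enough for the later steps in this notebook.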

### Read in vector data 

- Manually-drawn outlines of proglacial lakes

In [None]:
lakes = gpd.read_file(
    "https://github.com/e-marshall/sentinel1_rtc/raw/main/proglacial_lake_outline.geojson"
)
lakes_prj = lakes.to_crs("EPSG:32645")
lakes_prj

- Glacier outlines from the Randolph Glacier Inventory (RGI)

In [8]:
da_bbox = get_bbox_single(asf_cube)

In [9]:
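rgi = gpd.read_parquet("../data/tutorial1/rgi7_region15_south_asia_east.parquet")
# reproject the glacier outlines to the CRS of the RTC data cube
rgi_prj = rgi.to_crs("epsg:32645")

# keep only the glaciers that intersect the data cube footprint
rgi_sub = gpd.sjoin(rgi_prj, da_bbox, how="inner")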

In [None]:
rgi_sub.explore()

In [None]:
rgi_sub

In [12]:
rgi_2 = rgi_sub.loc[rgi_sub["glims_id"].isin(["G088279E27984N", "G088259E27982N"])]

In [None]:
rgi_2

## TODO
The lake outlines were hand-drawn from the RGI6 glacier outlines. The RGI7 outlines appear to extend slightly farther, so the lake outlines should be updated.


In [None]:
fig, ax = plt.subplots(figsize=(9, 8))

power_to_db(asf_cube.vv.mean(dim=["acq_date"])).plot(ax=ax, cmap=plt.cm.Greys_r)

rgi_2.plot(edgecolor="r", facecolor="none", ax=ax)

lakes_prj.plot(ax=ax, facecolor="none", edgecolor="blue")

fig.suptitle("ASF RTC VV backscatter, mean over all acquisitions", fontsize=14);

## Clip to lake extent

In [15]:
lake1 = lakes_prj.loc[lakes_prj["id"] == 1]
lake2 = lakes_prj.loc[lakes_prj["id"] == 2]

In [16]:
lake1_asf = asf_cube.rio.clip(lake1.geometry, lake1.crs)
lake2_asf = asf_cube.rio.clip(lake2.geometry, lake2.crs)

In [None]:
lake1_asf

In [None]:
lakes_prj["color"] = ["r", "b"]
lakes_prj

## Data visualization

In [None]:
from matplotlib.lines import Line2D

fig, axs = plt.subplots(ncols=2, figsize=(18, 7))

# scatter plot VV
power_to_db(lake2_asf.vv.mean(dim=["x", "y"])).plot(
    ax=axs[0], color="blue", marker="o", linewidth=0, alpha=0.6
)
power_to_db(lake1_asf.vv.mean(dim=["x", "y"])).plot(
    ax=axs[0], color="red", marker="o", linewidth=0, alpha=0.6
)
# scatter plot VH
power_to_db(lake2_asf.vh.mean(dim=["x", "y"])).plot(
    ax=axs[1], color="blue", marker="o", linewidth=0, alpha=0.6
)
power_to_db(lake1_asf.vh.mean(dim=["x", "y"])).plot(
    ax=axs[1], color="red", marker="o", linewidth=0, alpha=0.6
)
axs[0].set_title("VV backscatter over proglacial lakes 2021-2022")
axs[1].set_title("VH backscatter over proglacial lakes 2021-2022")

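legend_elements = [
    Line2D([0], [0], color="r", lw=3, label="lake 1"),
    Line2D([0], [0], color="b", lw=3, label="lake 2"),
]
# attach the legend to both panels so the lake colors are identified
for ax in axs:
    ax.legend(handles=legend_elements)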

What observations can we make about VV and VH variability in the above plots? What would we want to look at next to further explore those observations? 
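One possible next step, sketched here with synthetic data, is to aggregate the lake-averaged series by calendar month to look for a seasonal cycle. The dates, the 12-day sampling interval, and the cosine-shaped power values below are all illustrative assumptions, not values from the dataset. With the xarray objects in this notebook, a comparable grouping could likely be done with `groupby("acq_date.month")`, assuming `acq_date` is a datetime coordinate.

```python
import numpy as np


def power_to_db(input_arr):
    """Convert linear power to decibels (same formula as the notebook's helper)."""
    return 10 * np.log10(np.abs(input_arr))


# Synthetic stand-in for a lake-averaged backscatter series (linear power):
# one acquisition every 12 days over one year. Values are illustrative only.
dates = np.arange(
    np.datetime64("2021-06-01"), np.datetime64("2022-06-01"), np.timedelta64(12, "D")
)

# Calendar month (1-12) of each acquisition
months = dates.astype("datetime64[M]").astype(int) % 12 + 1

# Made-up seasonal signal that peaks in January and bottoms out in July
power = 0.03 + 0.02 * np.cos(2 * np.pi * (months - 1) / 12)

# Average in power units within each calendar month, then convert to dB
monthly_db = {
    int(m): float(power_to_db(power[months == m].mean())) for m in np.unique(months)
}

for m, val in monthly_db.items():
    print(f"month {m:2d}: {val:6.2f} dB")
```

Comparing such monthly summaries between the two lakes, and between VV and VH, is one way to check whether apparent winter changes line up in time.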