# Individual glacier surface velocity analysis

This notebook will build upon the data access and inspection steps in the earlier notebooks and demonstrate basic data analysis and visualization of surface velocity data at the scale of an individual glacier using xarray. 

*Learning goals*: 
- using xarray label-based indexing and selection tools
- computation and grouped computation
- visualization

In [None]:
import os
import json
import urllib.request
import numpy as np
import xarray as xr
import rioxarray as rxr
import geopandas as gpd
import pandas as pd
import seaborn as sns 

import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

from shapely.geometry import Polygon
from shapely.geometry import Point
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy
import cartopy.feature as cfeature

from skimage.morphology import skeletonize
import flox

%config InlineBackend.figure_format='retina'

In [None]:
import itslivetools

In [None]:
with urllib.request.urlopen('https://its-live-data.s3.amazonaws.com/datacubes/catalog_v02.json') as url_catalog:
    itslive_catalog = json.loads(url_catalog.read().decode())
itslive_catalog.keys()

In [None]:
url = itslivetools.find_granule_by_point(itslive_catalog, [84.56, 28.54])
url

In [None]:
dc = itslivetools.read_in_s3(url[0])
dc

In [None]:
dc_timesorted = dc.sortby(dc['mid_date'])


## Read in vector data

In [None]:
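se_asia = gpd.read_file('https://github.com/scottyhq/rgi/raw/main/15_rgi60_SouthAsiaEast.gpkg')
# Reproject the glacier outlines to WGS 84 / UTM zone 45N (EPSG:32645)
se_asia_prj = se_asia.to_crs('EPSG:32645')
# Keep the preview as the last expression so the notebook displays it
se_asia_prj.head(3)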

In [None]:
se_asia_prj.explore()

Choose a single glacier by its RGI ID and subset the vector data to it.

In [None]:
sample_glacier_vec = se_asia_prj.loc[se_asia_prj['RGIId'] == 'RGI60-15.04714']
sample_glacier_vec

### Clip ITS_LIVE data to extent of sample glacier

First, we need to write the CRS of the datacube so that rioxarray can use it. The EPSG code is stored in the attributes of the `mapping` variable.

In [None]:
dc_timesorted = dc_timesorted.rio.write_crs(f"epsg:{dc_timesorted.mapping.attrs['spatial_epsg']}", inplace=True)

In [None]:
sample_glacier_raster = dc_timesorted.rio.clip(sample_glacier_vec.geometry, sample_glacier_vec.crs)

In [None]:
# 'mapping' must be passed as a string; the bare name is undefined
sample_glacier_raster = sample_glacier_raster.drop_vars('mapping')
sample_glacier_raster

## Taking a look at a velocity time series

Now that we have the velocity data clipped to a single glacier, let's explore the clipped dataset. The cell below plots the velocity averaged over the x and y dimensions as a function of time.

In [None]:
sample_glacier_raster.v.mean(dim=['x','y']).plot()

It looks like there is a large amount of variability in the mean velocity over time. Let's use xarray tools to resample the time dimension.

In [None]:
resample_obj = sample_glacier_raster.resample(mid_date = '1M')
resample_obj

`.resample()` is another grouping operation. It returns an object of type `xarray.core.resample.DatasetResample`, and nothing is computed until we apply a reduction such as `mean()` to it.

In [None]:
sample_glacier_resample_1m = resample_obj.mean(dim='mid_date')
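
As a minimal sketch of what this resampling step does, the cell below builds a small synthetic, irregularly sampled speed series (the values are made up, not ITS_LIVE data) and averages it into monthly bins. It uses the `'MS'` (month start) frequency alias rather than `'1M'`, because `'MS'` is accepted by both older and newer pandas versions; this is a choice made for the sketch, not something the notebook requires.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic, irregularly spaced observation dates within one year (illustrative only)
rng = np.random.default_rng(0)
days = np.sort(rng.choice(365, size=200, replace=False))
times = pd.Timestamp("2020-01-01") + pd.to_timedelta(days, unit="D")

# Made-up speeds: a seasonal cycle plus noise
speed = xr.DataArray(
    50 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 5, size=days.size),
    dims=("mid_date",),
    coords={"mid_date": times},
    name="v",
)

# Group the observations into calendar-month bins and average each bin
monthly = speed.resample(mid_date="MS").mean()
print(monthly.sizes)
```

The resampled series keeps the `mid_date` dimension but has one value per month instead of one value per observation.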


The plot below shows the original velocity time series in blue and the velocity data resampled to 1-month intervals in orange.

In [None]:
sample_glacier_raster.v.mean(dim=['x','y']).plot(label = 'Mean glacier speed')
sample_glacier_resample_1m.v.mean(dim=['x','y']).plot(label = '1 month resample')
plt.legend()

This is interesting! The full time series looks like a pretty noisy signal, but once the velocity data is resampled into 1-month bins we can start to pick out a seasonal signal and sub-annual velocity variability.

### We could also calculate velocity anomalies... 

To do this, we will use xarray's `groupby()` and `map()`, following an example from the xarray tutorial.

We first define a function that subtracts the mean along the `mid_date` dimension from every observation it is given.

In [None]:
def remove_time_mean(x):
    return x-x.mean(dim='mid_date')

We then group the dataset by month and apply the function to each group, so each observation's anomaly is computed relative to the mean of its calendar month.

In [None]:
sample_glacier_anom = sample_glacier_raster.groupby('mid_date.month').map(remove_time_mean)
sample_glacier_anom
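
To make the split-apply-combine pattern concrete, here is a small sketch on a synthetic one-dimensional dataset (the values are made up for illustration). It applies the same `remove_time_mean` function per calendar month and then checks that the anomalies within each month average to roughly zero.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic speed series with a seasonal cycle (illustrative values only)
times = pd.date_range("2018-01-01", "2020-12-31", freq="10D")
seasonal = 10 * np.sin(2 * np.pi * times.dayofyear.to_numpy() / 365.25)
ds = xr.Dataset({"v": ("mid_date", 50 + seasonal)}, coords={"mid_date": times})

def remove_time_mean(x):
    # x is the subset of observations belonging to one group (one calendar month)
    return x - x.mean(dim="mid_date")

# Split by month, subtract each month's mean, and combine back along mid_date
anom = ds.groupby("mid_date.month").map(remove_time_mean)

# Within every calendar month the anomalies should average to ~0
months = anom.mid_date.dt.month.values
for m in range(1, 13):
    print(m, round(float(anom.v.values[months == m].mean()), 6))
```

The result has the same length along `mid_date` as the input, because `map()` here returns one anomaly per original observation rather than one value per group.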

Let's observe the velocity anomaly alongside the velocity time series. 

In [None]:
fig, axs = plt.subplots(ncols = 2, figsize=(17,7))
sample_glacier_anom.v.mean(dim=['x','y']).plot(ax=axs[1]);
sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0]);
axs[1].axhline(y=0, c = 'red', alpha = 0.5)
axs[0].set_title('Glacier mean magnitude of velocity (m/y) over time series')
axs[1].set_title('Glacier mean velocity anomaly (m/y) over time series')

In the above plot we took the mean over the x and y dimensions. Let's instead take the mean along the `mid_date` dimension to look at the spatial pattern:

In [None]:
fig, axs = plt.subplots(ncols =2 , figsize=(16,7))
sample_glacier_raster.mean(dim='mid_date').v.plot(ax = axs[0]);
sample_glacier_anom.mean(dim='mid_date').v.plot(ax=axs[1]);
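
The difference between the two reductions can be seen on a tiny synthetic array (made-up values, with one pixel set to NaN to stand in for a cell outside the clipped glacier outline). This is only a sketch of how the `dim` argument determines which dimensions remain.

```python
import numpy as np
import xarray as xr

# Tiny synthetic cube with dims (mid_date, y, x); values are illustrative
data = np.arange(24, dtype=float).reshape(2, 3, 4)
data[:, 0, 0] = np.nan  # mimic a pixel masked out by clipping
v = xr.DataArray(data, dims=("mid_date", "y", "x"), name="v")

# Averaging over space leaves a time series; NaNs are skipped by default
spatial_mean = v.mean(dim=["x", "y"])

# Averaging over time leaves a map; a pixel that is NaN at every time step stays NaN
temporal_mean = v.mean(dim="mid_date")

print(spatial_mean.dims, temporal_mean.dims)
```

Reducing over `x` and `y` gives one value per time step, while reducing over `mid_date` gives one value per pixel.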

## Grouped analysis by season
We have a dense time series of surface velocity data for a single glacier. We can use xarray's `groupby()` to examine velocity variability further. We will start with using `groupby()` to break the velocity time series into seasonal means.

In [None]:
seasons_gb = sample_glacier_raster.groupby(sample_glacier_raster.mid_date.dt.season).mean()
# copy the attributes of the original dataset onto the reduced result
seasons_gb.attrs = sample_glacier_raster.attrs 
seasons_gb

Breaking down the above cell, we defined how we wanted to group our data (`sample_glacier_raster.mid_date.dt.season`) and the reduction we wanted to apply to each group (`mean()`). After the apply step, xarray automatically combines the groups into a single object. We can see that the `seasons_gb` object is an `xarray.Dataset` with the same `x` and `y` dimensions as the `sample_glacier_raster` object, but the `mid_date` dimension has been replaced by a `season` dimension with one entry per season.

 If you'd like to see another example of this with more detailed explanations, go [here](https://tutorial.xarray.dev/fundamentals/03.2_groupby_with_xarray.html).
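
As a quick sketch of the same pattern on synthetic data (random values on a 2 x 2 grid, not ITS_LIVE data), grouping by `mid_date.dt.season` and taking the mean replaces the time dimension with a four-element `season` dimension:

```python
import numpy as np
import pandas as pd
import xarray as xr

# One year of made-up daily speeds on a 2 x 2 grid
times = pd.date_range("2019-01-01", "2019-12-31", freq="D")
rng = np.random.default_rng(1)
v = xr.DataArray(
    rng.normal(50, 5, size=(times.size, 2, 2)),
    dims=("mid_date", "y", "x"),
    coords={"mid_date": times},
    name="v",
)

# Split by meteorological season (DJF, MAM, JJA, SON) and average each group
seasonal = v.groupby(v.mid_date.dt.season).mean()

print(seasonal.dims)
print(seasonal.season.values)
```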



To visualize velocity data across the seasonal groups we just defined, we can use xarray's faceting functionality, which is enabled by passing a dimension name to the `col` (or `row`) argument of `.plot()`. Faceting is a great way to visualize your data in 'small multiples' format.

In [None]:
fg = seasons_gb.v.plot(
    col='season',
);

In [None]:
fig, axs = plt.subplots(ncols =3, figsize=(20,5))
sample_glacier_raster.v.sel(x = 246052.5, y= 3181987.5).plot(ax=axs[0])
sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0], alpha = 0.5)
sample_glacier_raster.v.mean(dim='mid_date').plot(ax=axs[1])
axs[1].axvline(x=246052.5, c= 'red')
axs[1].axhline(y=3181987.5, c='red')
(sample_glacier_raster.v.sel(x = 246052.5, y= 3181987.5) - sample_glacier_raster.v.mean(dim=['x','y'])).plot(ax=axs[2], linewidth=0, marker='o', alpha = 0.5)
axs[0].set_title('Time series of average glacier speed (orange) \n and speed at point in accumulation zone (blue)')
axs[2].set_title('Point speed - mean glacier speed')
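
The `.sel()` calls above use exact coordinate values, which only works when the requested `x` and `y` match grid-cell centers. A hedged sketch of the alternative, on a synthetic grid whose spacing and values are assumptions made for illustration, is to pass `method='nearest'` so that xarray picks the closest cell to an arbitrary point:

```python
import numpy as np
import xarray as xr

# Synthetic grid centered on the point used above; the 120 m spacing is an assumption
x = 246052.5 + 120.0 * np.arange(-3, 4)
y = 3181987.5 - 120.0 * np.arange(-3, 4)
v = xr.DataArray(
    np.arange(49, dtype=float).reshape(7, 7),
    dims=("y", "x"),
    coords={"y": y, "x": x},
    name="v",
)

# An off-grid coordinate fails with exact (default) label matching
try:
    v.sel(x=246100.0, y=3182000.0)
    exact_match_failed = False
except KeyError:
    exact_match_failed = True

# Nearest-neighbor lookup snaps to the closest cell center instead
pt = v.sel(x=246100.0, y=3182000.0, method="nearest")
print(exact_match_failed, float(pt.x), float(pt.y))
```

Nearest-neighbor selection is convenient when the point of interest comes from a map click or another dataset rather than from the grid itself.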
