# Tutorial Content


This notebook walks through the steps to read in and organize raster velocity data using xarray and rioxarray.

First, let's import the Python libraries that were listed on the [Software](software.ipynb) page:

In [None]:
import geopandas as gpd
import os
import numpy as np
import xarray as xr
import rioxarray as rxr
import matplotlib.pyplot as plt
from geocube.api.core import make_geocube

In [None]:
!pwd

In [None]:
gen_path = '/Users/emarshall/Desktop/siparcs/xr_book1/'

## ITS_LIVE (netcdf)

This section contains a workflow for reading in and organizing ITS_LIVE glacier velocity data that is accessed in netcdf format from the NSIDC DAAC. Previously, we needed to build the magnitude of velocity variable from the velocity component variables (individual geotiff files). The netcdf file already contains a variable for magnitude of velocity, as well as many other variables representing land cover types, error estimates and metadata.
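As a reminder of what "building the magnitude from the components" means, here is a minimal sketch on made-up data. The arrays `vx` and `vy` are hypothetical stand-ins for the component rasters, not values from ITS_LIVE.

```python
import numpy as np
import xarray as xr

# Synthetic x and y velocity components (made-up values, for illustration only)
vx = xr.DataArray(np.array([[3.0, 0.0], [-6.0, 5.0]]), dims=("y", "x"))
vy = xr.DataArray(np.array([[4.0, 2.0], [8.0, 12.0]]), dims=("y", "x"))

# The magnitude of velocity is the Euclidean norm of the two components
v = np.sqrt(vx**2 + vy**2)
print(v.values)
```

With the netcdf file this step is unnecessary, because the magnitude is already stored as its own variable.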

In [None]:
itslive = rxr.open_rasterio(gen_path[:-9] + 'data/HMA_G0120_0000.nc').squeeze()

In [None]:
itslive

What is the CRS of this object?

One way to check is with the `rio.crs` accessor:

In [None]:
itslive.rio.crs

The netcdf object is in a different CRS than the geotiff object. Because **Asia North Lambert Conformal Conic** covers a larger spatial extent than a single UTM zone (the projection of the geotiff object), we will use that projection.
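To make the comparison between the two projections concrete, the sketch below inspects both with `pyproj`. It assumes the Lambert Conformal Conic parameters that are used later in this notebook, and it uses UTM zone 45N (EPSG:32645) purely as an example zone; the zone of the geotiff object is not stated here.

```python
from pyproj import CRS

# Lambert Conformal Conic definition, written out as a PROJ string
lcc = CRS.from_proj4(
    "+proj=lcc +lat_1=15 +lat_2=65 +lat_0=30 +lon_0=95 "
    "+x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
)

# Example UTM zone (an assumption for illustration, not taken from the data)
utm = CRS.from_epsg(32645)

# Both are projected systems with coordinates in metres, but they are not the same CRS
print(lcc.is_projected, utm.is_projected)
print(lcc.axis_info[0].unit_name)
print(lcc == utm)

# A single UTM zone is only six degrees of longitude wide
aou = utm.area_of_use
print(aou.west, aou.east)
```

The narrow area of use of a UTM zone is the reason a conic projection is more convenient for a dataset spanning a large region.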

In [None]:
itslive.dims

In [None]:
itslive.coords

## Vector data 

In [None]:
#read in vector data 
se_asia = gpd.read_file(gen_path[:-9] + 'data/nsidc0770_15.rgi60.SouthAsiaEast/15_rgi60_SouthAsiaEast.shp')

How many glaciers are in this dataframe?

In [None]:
len(se_asia['RGIId'])

What coordinate reference system is this dataframe in? 

In [None]:
se_asia.crs

The vector dataset is in WGS 84, meaning that its coordinates are in degrees latitude and longitude rather than meters N and E. We will project this dataset to match the projection of the netcdf dataset.

## Handling projections

Let's project this dataframe to match the CRS of the itslive dataset.

In [None]:
# Specifying an EPSG code did not work here, so the full PROJ description is passed instead
se_asia_prj = se_asia.to_crs('+proj=lcc +lat_1=15 +lat_2=65 +lat_0=30 +lon_0=95 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs')
se_asia_prj
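To see what the reprojection does to coordinates, here is a small sketch on two made-up points rather than the glacier outlines. The GeoDataFrame `pts` and its column names are hypothetical. The first point sits at the projection origin (`lon_0=95`, `lat_0=30`), so it should land at roughly (0, 0) metres.

```python
import geopandas as gpd
from shapely.geometry import Point

# Same Lambert Conformal Conic definition as in the cell above
lcc = ("+proj=lcc +lat_1=15 +lat_2=65 +lat_0=30 +lon_0=95 "
       "+x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs")

# Two illustrative points in WGS 84 (longitude, latitude in degrees)
pts = gpd.GeoDataFrame(
    {"name": ["origin", "one_degree_east"]},
    geometry=[Point(95.0, 30.0), Point(96.0, 30.0)],
    crs="EPSG:4326",
)

# Reproject to the conic projection; coordinates are now in metres
pts_prj = pts.to_crs(lcc)
print(pts_prj.crs.is_projected)
print(pts_prj.geometry.x.tolist())
print(pts_prj.geometry.y.tolist())
```

Two hedged notes: `to_crs` accepts CRS objects as well as strings, so passing `itslive.rio.crs` directly may avoid typing the definition by hand. Also, this projection appears to be registered under the ESRI authority (as `ESRI:102027`) rather than EPSG, which may be why an EPSG code did not work.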

## Let's start this analysis on a single glacier

We'll demonstrate analysis on a single glacier before scaling up to multiple glaciers. To start with, let's select the largest glacier in the dataframe.

In [None]:
se_asia_prj['Area'].idxmax()

In [None]:
se_asia_prj.loc[11908]  # idxmax() returns an index label, so look it up with .loc
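The selection above can also be written without a hard-coded row number. The sketch below uses a tiny made-up table (the `glaciers` dataframe and its values are hypothetical) with a non-default index, to show why the label returned by `idxmax()` pairs with `.loc` rather than `.iloc`.

```python
import pandas as pd

# Toy table with index labels that differ from row positions
glaciers = pd.DataFrame(
    {"RGIId": ["a", "b", "c"], "Area": [1.2, 9.7, 3.4]},
    index=[10, 20, 30],
)

# idxmax() returns the index label of the largest value, not its position
label = glaciers["Area"].idxmax()

# .loc looks rows up by label, so it works for any index
largest = glaciers.loc[label]
print(label, largest["RGIId"])
```

With the default integer index the label and the position are identical, so either accessor returns the same row in that case.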

### So, our sample glacier is:

In [None]:
sample_glacier = se_asia_prj.loc[se_asia_prj['RGIId'] == 'RGI60-15.11909']
sample_glacier

#### Clip raster data to vector (sample glacier)

We'll be following [this example](https://corteva.github.io/rioxarray/stable/examples/clip_geom.html). Go check it out for more info.

In [None]:
itslive

In [None]:
glacier_raster = itslive.rio.clip(sample_glacier.geometry, sample_glacier.crs)

In [None]:
glacier_raster

In [None]:
glacier_raster.ice.plot()

In [None]:
fig, ax = plt.subplots()

#sample_glacier.plot(ax=ax, facecolor='white', edgecolor='red')
glacier_raster.v.plot(ax=ax, cmap=plt.cm.cividis)

In [None]:
glacier_raster.v.data.min()

### Handling missing data / selecting data (xr.where)
The plot above isn't very informative because the non-glaciated terrain surrounding the glacier is assigned negative values that skew the color scale. Assigning missing or non-target data points a distinctive numeric value can be useful in some cases, but for our purposes we don't want them to show up in our plots.

In [None]:
glacier_raster.ice.data.shape

glacier_raster.v.data[51]


In [None]:
# Wherever v holds the fill value -32767, set every variable to NaN
glacier_raster_x = xr.where(glacier_raster.v != -32767., glacier_raster, np.nan)

In [None]:
glacier_raster_x.v.plot()
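The masking step can be checked on a small synthetic dataset. The sketch below assumes the fill value is -32767, as in the cell above, and uses the `Dataset.where` method, which is an alternative spelling of the same idea. The dataset `ds` and its values are made up.

```python
import numpy as np
import xarray as xr

FILL = -32767.0  # fill value assumed from the cell above

# Tiny dataset with a velocity variable and an ice mask
ds = xr.Dataset(
    {
        "v": (("y", "x"), np.array([[FILL, 12.0], [30.5, FILL]])),
        "ice": (("y", "x"), np.array([[0, 1], [1, 0]])),
    }
)

# Keep values where the condition is True; everything else becomes NaN
masked = ds.where(ds.v != FILL)

print(masked.v.values)
print(float(masked.v.min()))  # reductions skip NaN by default
```

After masking, the minimum reflects real velocities instead of the fill value, so the color scale of a plot is no longer skewed.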