# **ITS_LIVE** Global Glacier Velocity Exploration and Analysis. <a class="anchor" id="chapter_1"/>
<img title="ITS_LIVE" src="img/header.png" width="50%"/>

Luis Lopez[<sup>1</sup>](#fn1), Alex Gardner[<sup>2</sup>](#fn2), Mark Fahnestock[<sup>3</sup>](#fn3), Ted Scambos[<sup>4</sup>](#fn4), Maria Liukis[<sup>2</sup>](#fn2), Chad Greene[<sup>2</sup>](#fn2), Yang Lei[<sup>5</sup>](#fn5), Joe Kennedy[<sup>3</sup>](#fn3),  Bruce Wallin[<sup>1</sup>](#fn1)
 
[![Binder](https://binder.pangeo.io/badge_logo.svg)](https://binder.pangeo.io/v2/gh/nasa-jpl/itslive-explorer/earthcube?filepath=LL_01_ITS_LIVE_global_glacier_velocity_exploration_and_analysis.ipynb)

<span id="fn1" style="font-size: small">1. National Snow and Ice Data Center</span><BR>
<span id="fn2" style="font-size: small">2. NASA Jet Propulsion Laboratory</span><BR>
<span id="fn3" style="font-size: small">3. University of Alaska Fairbanks</span><BR>
<span id="fn4" style="font-size: small">4. University of Colorado Earth Science and Observation Center</span><BR>
<span id="fn5" style="font-size: small">5. California Institute of Technology</span>

## Author(s) <a class="anchor" id="section_1_1"/>

- Author1 = {"name": "Luis Lopez", "affiliation": "National Snow and Ice Data Center", "email": "luis.lopez@nsidc.org", "orcid": ""}
- Author2 = {"name": "Alex Gardner", "affiliation": "NASA Jet Propulsion Laboratory", "email": "alex.s.gardner@jpl.nasa.gov", "orcid": "0000-0002-8394-8889"}
- Author3 = {"name": "Mark Fahnestock", "affiliation": "University of Alaska Fairbanks", "email": "mfahnestock@alaska.edu", "orcid": ""}
- Author4 = {"name": "Ted Scambos", "affiliation": "University of Colorado Earth Science and Observation Center", "email": "tascambos@colorado.edu", "orcid": ""}
- Author5 = {"name": "Maria Liukis", "affiliation": "NASA Jet Propulsion Laboratory", "email": "maria.liukis@jpl.nasa.gov", "orcid": ""}
- Author6 = {"name": "Chad Greene", "affiliation": "NASA Jet Propulsion Laboratory", "email": "chad.a.greene@jpl.nasa.gov", "orcid": ""}
- Author7 = {"name": "Yang Lei", "affiliation": "California Institute of Technology", "email": "ylei@caltech.edu", "orcid": ""}
- Author8 = {"name": "Joe Kennedy", "affiliation": "University of Alaska ASF", "email": "jhkennedy@alaska.edu", "orcid": ""}


## Table of Contents

* [1 ITS_LIVE global glacier velocity exploration and analysis.](#chapter_1)
    * [1.1 Author(s)](#section_1_1)
    * [1.2 Purpose](#section_1_2)
    * [1.3 Technical contributions](#section_1_3)
    * [1.4 Methodology](#section_1_4)
    * [1.5 Results](#section_1_5)
    * [1.6 Funding](#section_1_6)
    * [1.7 Keywords](#section_1_7)
    * [1.8 Citation](#section_1_8)
    * [1.9 Work In Progress - improvements](#section_1_9)
    * [1.10 Suggested next steps](#section_1_10)
    * [1.11 Acknowledgements](#section_1_11)
* [2 Setup](#chapter_2)
    * [2.1 Library import](#section_2_1)
    * [2.2 Local library import](#section_2_2)
* [3 Parameter definitions](#chapter_3)
* [4 Data import](#chapter_4)
* [5 Data processing and analysis](#chapter_5)
    * [5.1 Building a data cube](#section_5_1)
    * [5.2 Plotting a time series with xarray](#section_5_2)
* [6 References](#chapter_6)

## Purpose <a class="anchor" id="section_1_2"/>

The itslive-explorer notebook allows users to explore and visualize glacier surface velocity at global scale using data produced by NASA's JPL autonomous Repeat Image Feature Tracking algorithm (Gardner et al., 2018).

Because of its high temporal and spatial resolution, ITS_LIVE data amounts to 8+ million NetCDF files (and growing) stored in AWS S3. For the users' convenience, exploration and filtering of this data can be done using an ipyleaflet-based widget that leverages the [ITS_LIVE search API](https://nsidc.org/apps/itslive-search/docs). After the relevant data granules for an area (for example, a glacier of interest) are downloaded, the notebook provides users with an xarray-powered method to build a data cube that is ready for time series analysis and from which scientific insights can be gathered.

## Technical contributions <a class="anchor" id="section_1_3"/>

* **Demonstration of time series analysis using the ITS_LIVE velocity pairs dataset**
  * The main contribution of the notebook is to provide users with a transparent way to access and process glacier surface velocity data. The notebook uses the ITS_LIVE search API to retrieve a list of granules of interest stored on AWS S3. This can be done using the client library or using an ipyleaflet-based widget. The core example of this notebook creates a time series that enables scientists to work on the science and worry less about the code.
* **Development of underlying search API that is exposed in the notebook**
  * The [ITS_LIVE search API](https://nsidc.org/apps/itslive-search/docs) is a [FastAPI](https://fastapi.tiangolo.com/) application that ingests the GeoJSON metadata produced by the AutoRIFT processing pipeline and indexes it using PostGIS.
* **Contributing back to the open source community**
  * Since a considerable number of glaciers are in polar latitudes, the visualization of their boundaries gets distorted, and visual overlaps make them difficult to work with. ITS_LIVE wanted to give users who are not familiar with APIs and Python a way to get the data onto their machine, whether that machine is a laptop or a VM in the cloud. For this reason we **implemented custom map projections** for the **[ipyleaflet](https://github.com/jupyter-widgets/ipyleaflet)** map widget. This way a user can search and download data without writing a single line of Python, since some scientists have their own way of processing data (e.g. MATLAB).

## Methodology <a class="anchor" id="section_1_4"/>

### Data Processing 

Since its conception, ITS_LIVE has aimed to use a cloud-native approach to generate and analyze data. Because the input sources are stored in AWS S3, all the processing occurs on AWS infrastructure using the [hyp3](https://hyp3.asf.alaska.edu/) pipeline developed by the **Alaska Satellite Facility**. A dockerized implementation of AutoRIFT is applied to n input files in parallel, generating output files that are stored back in S3 along with GeoJSON metadata for each granule.

<img title="ITS_LIVE hyp3 pipeline" src="img/processing-pipeline.png" width="50%"/> <a class="anchor" id="figure_2"/>

[ITS_LIVE's AutoRIFT algorithm](https://github.com/leiyangleon/autoRIFT) can be applied to optical and radar input files. Use cases include the measurement of surface displacements occurring between two repeat satellite images as a result of glacier flow, large earthquake displacements, and landslides. Currently we use Landsat optical and Sentinel radar imagery as input sources for glacier velocity extraction.

Image pairs collected from the same satellite position ("same-path-row") are searched if they have a time separation of fewer than 546 days. This approach was used for all satellites in the Landsat series (L4 to L8). To increase data density prior to the launch of Landsat 8, images acquired from differing satellite positions, generally in adjacent or near-adjacent orbit swaths ("cross-path-row"), are also processed if they have a time separation between 10 and 96 days and an acquisition date prior to 14 June 2013 (the beginning of regular Landsat 8 data). Feature tracking of cross-path-row image pairs produces velocity fields with a lower signal-to-noise ratio due to residual parallax from imperfect terrain correction. Same-path-row and cross-path-row preprocessed pairs of images are searched for matching features by finding local normalized cross-correlation (NCC) maxima at sub-pixel resolution, oversampling the correlation surface by a factor of 16 using a Gaussian kernel. A sparse grid pixel-integer NCC search (1/16 of the density of the full search grid) is used to determine areas of coherent correlation between image pairs. For more information, see the Normalized Displacement Coherence (NDC) filter described in Gardner et al. (2018).
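
As a rough illustration of the pixel-integer NCC search described above, the following sketch finds the integer offset that maximizes the normalized cross-correlation of one image chip. It is not the autoRIFT implementation: the function names, chip size, and search radius are assumptions made for this example, the images are synthetic, and the sub-pixel step (oversampling the correlation surface by 16 with a Gaussian kernel) is deliberately left out.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross-correlation of two equally sized image chips."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def best_offset(reference, search, top_left, size, max_shift):
    """Return (score, (d_row, d_col)) of the NCC maximum for one chip."""
    r0, c0 = top_left
    chip = reference[r0:r0 + size, c0:c0 + size]
    best_score, best_shift = -2.0, (0, 0)
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            r, c = r0 + d_row, c0 + d_col
            # Skip candidate windows that fall outside the search image.
            if r < 0 or c < 0 or r + size > search.shape[0] or c + size > search.shape[1]:
                continue
            score = ncc(chip, search[r:r + size, c:c + size])
            if score > best_score:
                best_score, best_shift = score, (d_row, d_col)
    return best_score, best_shift


# Synthetic pair: the second "image" is the first one moved 3 rows down
# and 2 columns to the left.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
search = np.roll(reference, shift=(3, -2), axis=(0, 1))

score, (d_row, d_col) = best_offset(reference, search, top_left=(24, 24),
                                    size=16, max_shift=5)
print(d_row, d_col, round(score, 3))
```

An exhaustive search like this is only practical for small chips and small search radii, which is one reason a sparse grid is searched first to find areas of coherent correlation.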

Fig 1 shows vertical and horizontal pixel displacements being correlated to create a final velocity field normalized to meters per year.

<img title="AutoRIFT" src="https://raw.githubusercontent.com/leiyangleon/autoRIFT/master/figures/regular_grid_optical.png" width="50%"/> <a class="anchor" id="figure_1"/>Fig 1.
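
The normalization to meters per year can be illustrated with a small worked example. This is a sketch under stated assumptions, not the pipeline's code: the function name, the 15 m pixel size, the 16-day separation, and the 365.25-day year are all illustrative values chosen here.

```python
import math

DAYS_PER_YEAR = 365.25  # assumption: mean year length used for normalization


def speed_m_per_year(dx_pixels, dy_pixels, pixel_size_m, separation_days):
    """Convert a pixel displacement between two scenes into a speed in m/yr."""
    if separation_days <= 0:
        raise ValueError("separation_days must be positive")
    # Magnitude of the horizontal/vertical pixel offsets, converted to meters.
    displacement_m = math.hypot(dx_pixels, dy_pixels) * pixel_size_m
    return displacement_m / separation_days * DAYS_PER_YEAR


# Hypothetical values: a (3, 4) pixel offset on a 15 m grid over 16 days.
speed = speed_m_per_year(3.0, 4.0, pixel_size_m=15.0, separation_days=16)
print(f"{speed:.1f} m/yr")
```

The same displacement measured over a longer scene-pair separation yields a proportionally lower speed, which is why the separation is part of each granule's metadata.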

### ITS_LIVE velocity pair granules

The velocity pair granule is distributed in NetCDF format. 

* Coverage: All land ice
* Date range: 1985-present
* Resolution: 240 m
* Scene-pair separation: 6 to 546 days

<img title="ITS_LIVE hyp3 pipeline" src="img/velocity-granule.png" width="50%"/> <a class="anchor" id="figure_3"/>

### Granule Search and Analysis
The ITS_LIVE search API, which is described with an OpenAPI specification, has 2 endpoints for finding granules of interest:

  * `/velocities/coverage/`
    * gets an aggregate by year (a faceted result) of how many granules will be returned given the user's spatiotemporal parameters 
  * `/velocities/urls/`
    * produces a list of S3 file URLs that match the user's spatiotemporal parameters
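
A minimal sketch of how a request to either endpoint can be assembled with the standard library is shown here. The base URL and the parameter names (`polygon`, `start`, `end`, `percent_valid_pixels`, `min_interval`, `max_interval`) are taken from the query that the notebook's search client logs in the parameter definitions section; the `build_query` helper is hypothetical, and the assumption that `/velocities/coverage/` accepts the same parameters should be checked against the API documentation. No request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://nsidc.org/apps/itslive-search"


def build_query(endpoint, **params):
    """Build a search URL for the 'urls' or 'coverage' endpoint."""
    # Keep commas literal so the polygon stays readable in the URL.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}/velocities/{endpoint}/?{query}"


pine_island = {
    "polygon": "-101.1555,-74.7387,-99.1172,-75.0879,-99.8797,-75.46,"
               "-102.425,-74.925,-101.1555,-74.7387",
    "start": "1984-01-01",
    "end": "2020-01-01",
    "percent_valid_pixels": 40,
    "min_interval": 6,
    "max_interval": 120,
}

urls_query = build_query("urls", **pine_island)
coverage_query = build_query("coverage", **pine_island)  # assumed to take the same parameters
print(urls_query)
```

Checking the coverage counts first is a cheap way to see how many granules a search would return before asking for the full list of S3 URLs.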

## Results <a class="anchor" id="section_1_5"/>

## Funding <a class="anchor" id="section_1_6"/>

- Award1 = {"agency": "NASA", "award_code": "# Making Earth System Data Records for Use in Research Environments (MEaSUREs) Program", "award_URL": "https://earthdata.nasa.gov/esds/competitive-programs/measures/its-live"}

## Keywords <a class="anchor" id="section_1_7"/>

keywords=["glacier", "surface", "velocity", "dataset", "nasa"]

## Citation <a class="anchor" id="section_1_8"/>


## Work In Progress - improvements <a class="anchor" id="section_1_9"/>

The current analysis workflow still relies on the dated download-and-analyze paradigm, but the idea is to follow Pangeo's approach and process the granules on a Dask cluster. A dedicated Dask cluster for our users is out of scope for ITS_LIVE; however, it would be very convenient to adapt the current client library and examples to use one.

#### TODOs:
- **On demand cube generation**: An intermediate step between the current workflow to analyze the velocity granules locally and in the cloud is to implement the data cube generation on the ITS_LIVE back-end. This way the users will only care about downloading the slice of data they need with the variables they need.
- **Velocity basemap**: The current map widget does not have a basemap that reflects the global velocity mosaics. The trick is to adapt gdal2tiles or something like it to process enough regional maps into a partition that is compatible with NASA GIBS grid definitions (the basemaps used in the widget).
- **Include elevation change datasets**: Velocity is not the only ingredient scientists need in order to analyze glacier movements. Elevation change data is also part of ITS_LIVE but is not yet included in the current notebook. Once we have both datasets, more accurate and interesting analyses will be possible, e.g. mass balance trends.
- **NASA's Harmony integration**: [NASA's Harmony](https://harmony.earthdata.nasa.gov/) is NASA's next-generation data processing tool that will allow scientists to get subsetted, analysis-ready data in cloud data formats. Making ITS_LIVE a Harmony-compatible dataset will close the cloud-native circle described earlier.

## Acknowledgements <a class="anchor" id="section_1_11"/>

Daniel Tiger and Peppa Pig et al. for entertaining our twin toddlers while finishing this notebook.


# Setup <a class="anchor" id="chapter_2"/>



## Library import <a class="anchor" id="section_2_1"/>

> This notebook was designed to build data cubes on a glacier scale rather than larger areas. This notebook will use Pine Island Glacier for demo purposes.


In [6]:
# data manipulation and plotting.
import xarray as xr

# ITS_LIVE Search client which can be used as a widget or just code.
from SearchWidget import map
# orientation='horizontal' renders in the notebook; 'vertical' renders in a sidecar
m = map(orientation='horizontal')

# Parameter definitions  <a class="anchor" id="chapter_3"/>

ITS_LIVE data come from multiple scenes and satellites, which means a lot of overlap. In this case, all the scenes that match our spatial criteria will be returned.

### Search parameters

* **polygon/bbox**: LON-LAT pairs separated by a comma.
* **start**: YYYY-mm-dd start date
* **end**: YYYY-mm-dd end date
* **min_separation**: minimum days of separation between scenes
* **max_separation**: maximum days of separation between scenes
* **percent_valid_pixels**: % of valid pixel coverage on glaciers
* **serialization**: json,html,text
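
Before sending a search it can help to sanity-check the `polygon` string locally. The sketch below is a hypothetical helper, not part of the search client: it splits the comma-separated values into (lon, lat) pairs and checks coordinate ranges. The requirement that the ring be closed (first point equal to last) is an assumption based on the Pine Island example used in this notebook, not on documented API behavior.

```python
def parse_polygon(text):
    """Parse 'lon,lat,lon,lat,...' into a list of (lon, lat) tuples."""
    values = [float(v) for v in text.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("polygon needs an even number of values (lon-lat pairs)")
    ring = list(zip(values[0::2], values[1::2]))
    for lon, lat in ring:
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"coordinate out of range: {(lon, lat)}")
    # Assumption: a polygon is a closed ring with at least 3 distinct vertices.
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("polygon must be a closed ring (first point == last point)")
    return ring


ring = parse_polygon(
    "-101.1555,-74.7387,-99.1172,-75.0879,-99.8797,-75.46,"
    "-102.425,-74.925,-101.1555,-74.7387"
)
print(len(ring), ring[0])
```

A common mistake this catches is swapping the order to LAT-LON, which for polar glaciers often still produces values inside the valid ranges for longitude but not for latitude.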

In [4]:
# Pine Island Glacier
params = {
    'polygon': '-101.1555,-74.7387,-99.1172,-75.0879,-99.8797,-75.46,-102.425,-74.925,-101.1555,-74.7387',
    'start': '1984-01-01',
    'end': '2020-01-01',
    'percent_valid_pixels': 40,
    'min_separation': 6,
    'max_separation': 120
}

granule_urls = m.Search(params)
print(f'Total granules found: {len(granule_urls)}')

Querying: https://nsidc.org/apps/itslive-search/velocities/urls/?polygon=-101.1555,-74.7387,-99.1172,-75.0879,-99.8797,-75.46,-102.425,-74.925,-101.1555,-74.7387&start=1984-01-01&end=2020-01-01&percent_valid_pixels=40&min_interval=6&max_interval=120
Total granules found: 1643


# Data import  <a class="anchor" id="chapter_4"/>

## Data Filtering

More than a thousand granules may not seem like much, but downloading them is not trivial if you only want a glance at the behavior of a particular glacier over the years. For this reason we can limit the number of granules per year and download only those whose mid-date falls in a given month, which is useful if the glacier is affected by seasonal cycles.

In [5]:
# filter_urls requires a list of urls, the result is stored in the m.filtered_urls attribute
filtered_granules_by_year = m.filter_urls(granule_urls,
                                          max_files_per_year=10,
                                          months=['November', 'December', 'January'],
                                          by_year=True)

# We print the counts per year
for k in filtered_granules_by_year:
    print(k, len(filtered_granules_by_year[k]))
print(f'Total granules after filtering: {len(m.filtered_urls)}')

1997 1
2000 1
2001 7
2002 10
2003 5
2007 5
2008 8
2009 7
2010 10
2011 6
2012 10
2013 10
2014 10
2015 10
2016 10
2017 10
2018 10
2019 10
Total granules after filtering: 140
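
The filtering idea can be sketched in plain Python. This is not the implementation of `m.filter_urls`; it is an illustration that assumes Landsat-style granule names, where the two scene names are joined by `_X_` and the fourth underscore-separated field of each is the acquisition date. The host `example.com` and every file name except the first (which is the sample granule opened later in this notebook) are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

MONTH_NAMES = ("January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December")


def mid_date(url):
    """Midpoint between the two acquisition dates encoded in a granule name."""
    name = url.rsplit("/", 1)[-1]
    first, second = name.split("_X_")
    d1 = datetime.strptime(first.split("_")[3], "%Y%m%d")
    d2 = datetime.strptime(second.split("_")[3], "%Y%m%d")
    return min(d1, d2) + abs(d1 - d2) / 2


def filter_by_month(urls, months, max_files_per_year):
    """Group URLs by mid-date year, keeping only the requested months."""
    by_year = defaultdict(list)
    for url in urls:
        mid = mid_date(url)
        if MONTH_NAMES[mid.month - 1] in months and len(by_year[mid.year]) < max_files_per_year:
            by_year[mid.year].append(url)
    return dict(by_year)


base = "https://example.com/velocity/"  # hypothetical host
url_a = base + "LE07_L1GT_001113_20121118_20161127_01_T2_X_LE07_L1GT_232113_20121104_20161127_01_T2_G0240V01_P059.nc"
url_b = base + "LC08_L1GT_001113_20131201_20170428_01_T2_X_LC08_L1GT_001113_20131217_20170427_01_T2_G0240V01_P070.nc"
url_c = base + "LC08_L1GT_001113_20140610_20170422_01_T2_X_LC08_L1GT_001113_20140626_20170421_01_T2_G0240V01_P065.nc"
url_d = base + "LC08_L1GT_001113_20131217_20170427_01_T2_X_LC08_L1GT_001113_20140102_20170427_01_T2_G0240V01_P081.nc"

selected = filter_by_month([url_a, url_b, url_c, url_d],
                           months=["November", "December", "January"],
                           max_files_per_year=1)
for year, urls in selected.items():
    print(year, len(urls))
```

Here `url_c` is dropped because its mid-date falls in June, and `url_d` is dropped because the 2013 quota is already full.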


## Downloading data

We have 2 options to download data: we can download the filtered URLs (by year or as a whole) or we can download the whole set of URLs returned by our original search.

Single year example:

```python
files = m.download_velocity_granules(urls=filtered_granules_by_year['2010'],
                                     path_prefix='data/pine-glacier-2010',
                                     params=params)
```

The `path_prefix` is the directory to which the NetCDF files will be downloaded, and `params` keeps track of which parameters were used to download a particular set of files.

We can also download the whole series:

```python
files = m.download_velocity_granules(urls=m.filtered_urls,
                                     path_prefix='data/pine-glacier-1996-2019',
                                     params=params)
```

In [None]:
filtered_urls = m.filtered_urls # or filtered_granules_by_year
project_folder = 'data/pine-1996-2019'

# if we are using our own parameters (not the widget) we assign our own dict, i.e. params=my_params
files = m.download_velocity_granules(urls=filtered_urls,
                                     path_prefix=project_folder,
                                     params=params)

# Data processing and analysis  <a class="anchor" id="chapter_5"/>



In [None]:
# First let's open a single data granule (included in this notebook)
velocity_granule = xr.open_dataset('data/LE07_L1GT_001113_20121118_20161127_01_T2_X_LE07_L1GT_232113_20121104_20161127_01_T2_G0240V01_P059.nc')
velocity_granule

In [None]:
# xarray has built-in methods to plot our variables 
velocity_granule.v.plot(x='x', y='y')

## Building a data cube <a class="anchor" id="section_5_1"/>
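
One way to think about a data cube is as a stack of velocity granules along a time dimension. The sketch below shows that idea with xarray on tiny synthetic datasets. It is an assumption-laden illustration, not the notebook's cube builder: `fake_granule` stands in for `xr.open_dataset`, the dimension name `mid_date` is chosen here, and all granules share the same grid. Real granules can differ in extent and projection and would need to be aligned or reprojected first.

```python
import numpy as np
import xarray as xr


def fake_granule(mid_date, speed):
    """Stand-in for xr.open_dataset(path): a tiny dataset with a 2-D `v` field."""
    x = np.arange(0.0, 960.0, 240.0)    # 4 columns at 240 m spacing
    y = np.arange(720.0, 0.0, -240.0)   # 3 rows, north to south
    v = np.full((y.size, x.size), speed, dtype="float32")
    ds = xr.Dataset({"v": (("y", "x"), v)}, coords={"x": x, "y": y})
    # Add a length-1 time dimension so that granules can be concatenated.
    return ds.expand_dims(mid_date=[np.datetime64(mid_date, "ns")])


# Granules usually arrive in arbitrary order, so sort after stacking.
granules = [
    fake_granule("2013-12-01", 3900.0),
    fake_granule("2012-11-11", 3800.0),
    fake_granule("2014-12-15", 4000.0),
]
cube = xr.concat(granules, dim="mid_date").sortby("mid_date")
print(cube.v.dims, cube.v.shape)
```

With files on disk, the same pattern applies after replacing `fake_granule` with a function that opens each NetCDF file and attaches its mid-date.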

## Plotting a time series with xarray <a class="anchor" id="section_5_2"/>
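
Once velocities are stacked along a time dimension, a time series is a reduction over the spatial dimensions or a selection of a single pixel. The following sketch uses a small synthetic `DataArray`; the values, coordinates, and the `mid_date` dimension name are assumptions made for the example. Calling `.plot()` on either result draws the series and requires matplotlib, so it is left as a comment.

```python
import numpy as np
import xarray as xr

times = np.array(["2012-11-11", "2013-12-01", "2014-12-15"], dtype="datetime64[ns]")
speeds = np.array([
    [[3800.0, np.nan], [3700.0, 3600.0]],   # NaN marks a pixel with no valid match
    [[3900.0, 3850.0], [3800.0, 3650.0]],
    [[4000.0, 3950.0], [3900.0, 3750.0]],
])
cube = xr.DataArray(
    speeds,
    dims=("mid_date", "y", "x"),
    coords={"mid_date": times, "y": [240.0, 0.0], "x": [0.0, 240.0]},
    name="v",
)

# Time series at the grid cell nearest to a point of interest.
point_series = cube.sel(x=100.0, y=200.0, method="nearest")

# Time series of the spatial mean, ignoring missing pixels.
area_mean = cube.mean(dim=["y", "x"], skipna=True)

print(point_series.values, area_mean.values)
# point_series.plot()  # uncomment in a notebook with matplotlib installed
```

A spatial mean is sensitive to outliers from poor matches, so a median over the same dimensions may be a more robust choice for noisy granules.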

# References <a class="anchor" id="chapter_6"/>
