# *IceFlow & icepyx*: Altimetry Time Series Tutorial
### NASA Earthdata Webinar - April 2021

This tutorial demonstrates how to harmonize several NASA altimetry data sets with varying temporal coverage, formats, and coordinate reference frames using the IceFlow and icepyx Python tools. Please refer to the 0_introduction.ipynb notebook for detailed information on the data sets you will be exploring in this tutorial. 

#### Objectives:
1. Use the IceFlow map widget to select and visualize an area of interest.
2. Access coincident ICESat/GLAS, Operation IceBridge, and ICESat-2 data over the same spatial region.
3. Use the community-developed icepyx Python library to subset ICESat-2 data.
4. Learn about advanced icepyx capabilities, including data value visualization prior to download.
5. Extract common data variables into a GeoPandas dataframe.
6. Plot and visualize the altimetry time series to detect glacial elevation change over time.

<b>Authors:</b><br />
<span style="font-size:larger;">Jessica Scheick</span>, *University of New Hampshire*, Durham, New Hampshire<br />
<span style="font-size:larger;">Nicholas Kotlinski & Amy Steiker</span>, *NASA National Snow and Ice Data Center DAAC*, Boulder, Colorado, USA

---

#### Running this tutorial locally

To run this notebook locally, you must first set up your computing environment. Please see the [repository readme](https://github.com/nsidc/NSIDC-Data-Tutorials#usage-with-binder) for instructions on several ways (using Binder, Docker, or Conda) to do this.

## 1. NASA's Earthdata Credentials

To access data using the *IceFlow* library and *icepyx* package, it is necessary to log into [Earthdata Login](https://urs.earthdata.nasa.gov/). To do this, enter your NASA Earthdata credentials in the next step after executing the following code cell.

**Note**: If you don't have NASA Earthdata credentials you will need to register first at the link above. An account is free and available to everyone!

In [None]:
# This cell will prompt you for your Earthdata username and password. Press
from iceflow.ui import IceFlowUI
client = IceFlowUI()
client.display_credentials()

In [None]:
# This cell will verify if your credentials are valid. 
# This may take a little while. If it fails, please try again.

authorized = client.authenticate()
if authorized is None:
    print('Earthdata Login not successful')
else:
    print('Earthdata Login successful!')

**Note:** If the output shows "Earthdata Login successful!", then you are ready to proceed!

---

## 2. Accessing and harmonizing data across missions

#### 2.1. Available data

| IceFlow Name | Data Set | Spatial Coverage | Temporal Coverage | Mission | Sensors |
|--------------|----------|------------------|-------------------|---------|---------|
| **ATM1B** | [BLATM L1B](https://nsidc.org/data/BLATM1B) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 23 Jun. 1993 - 30 Oct. 2008 | Pre-IceBridge | ATM |
| **ATM1B** | [ILATM L1B V1](https://nsidc.org/data/ILATM1B/versions/1) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 31 Mar. 2009 - 8 Nov. 2012 <br> (updated 2013) | IceBridge | ATM |
| **ATM1B** | [ILATM L1B V2](https://nsidc.org/data/ILATM1B/versions/2) | South: N:-53, S: -90, E:180, W:-180 <br> North: N:90, S: 60, E:180, W:-180 | 20 Mar. 2013 - 16 May 2019 <br> (updated 2020) | IceBridge | ATM |
| **ILVIS2** | [ILVIS2](https://nsidc.org/data/ILVIS2) | North: N:90, S: 60, E:180, W:-180 | 25 Aug. 2017 - 20 Sept. 2017 | IceBridge | ALTIMETERS, LASERS, LVIS |
| **GLAH06** | [GLAH06](https://nsidc.org/data/GLAH06/) | Global: N:86, S: -86, E:180, W:-180 | 20 Feb. 2003 - 11 Oct. 2009 | ICESat/GLAS | ALTIMETERS, CD, GLAS, GPS, <br> GPS Receiver, LA, PC |

**Notes**:
* Due to the nature of the **ILVIS2** product, IceFlow doesn't provide a common dictionary. Data is accessible, but the user will need to harmonize the data to their own specifications.
* If you have questions about the data sets, please refer to the user guides or contact NSIDC User Services at nsidc@nsidc.org.

--- 


#### 2.2. Choosing Corrections: Using the ITRF and Epoch values
* The differences between ITRF corrections are negligible in most cases, and corrections should only be applied by users who are familiar with the procedures behind them.

* The optional ***ITRF*** parameter selects the ITRF realization to which the data will be transformed using the published ITRF transformation parameters. It must be set if you want to specify an epoch. Available values are: **ITRF2000, ITRF2008, ITRF2014**<br>
Example: `'itrf': 'ITRF2014'`
* The optional ***epoch*** parameter is a decimal year to which the data will be transformed using the ITRF Plate Motion Model that corresponds to the chosen ITRF. It can only be used when the ***ITRF*** parameter is set to ITRF2008 or ITRF2014, as only these two have a plate motion model.<br>
Example: `'epoch': '2018.1'` (a decimal year, which corresponds to early February 2018)

ICESat-2: `ITRF2014`

ICESat/GLAS: `ITRF2008`
  
IceBridge/Pre-IceBridge ILATM1B: `ITRF2008`

IceBridge ILVIS2: `ITRF2000`
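
The parameter rules in this section can be checked before an order is placed. The following is a minimal sketch that uses only the Python standard library. `decimal_year` and `check_corrections` are hypothetical helper names that are not part of *IceFlow*, and the lowercase `itrf` and `epoch` keys are assumed from the request dictionaries used later in this notebook.

```python
from datetime import date

# Frames listed as available in this section; only two have a plate motion model.
VALID_FRAMES = {"ITRF2000", "ITRF2008", "ITRF2014"}
PLATE_MOTION_FRAMES = {"ITRF2008", "ITRF2014"}


def decimal_year(day: date) -> float:
    """Convert a calendar date to a decimal year (hypothetical helper)."""
    start = date(day.year, 1, 1)
    end = date(day.year + 1, 1, 1)
    return day.year + (day - start).days / (end - start).days


def check_corrections(params: dict) -> dict:
    """Raise ValueError if the optional itrf/epoch values break the rules above."""
    itrf = params.get("itrf")
    epoch = params.get("epoch")
    if itrf is not None and itrf not in VALID_FRAMES:
        raise ValueError(f"Unknown ITRF realization: {itrf}")
    if epoch is not None:
        if itrf not in PLATE_MOTION_FRAMES:
            raise ValueError("epoch requires itrf set to ITRF2008 or ITRF2014")
        float(epoch)  # raises ValueError if epoch is not a decimal year
    return params


checked = check_corrections({"itrf": "ITRF2014", "epoch": "2018.12"})
mid_year = round(decimal_year(date(2018, 7, 2)), 2)
```

Validating locally like this only catches obviously inconsistent combinations; the service itself remains the authority on which values it accepts.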

---
#### 2.3. Accessing Data with the *IceFlow* Access Widget
The *IceFlow* access widget is a user interface tool for visualizing IceBridge flight paths, drawing a region of interest, setting spatio-temporal parameters, and placing data orders through the *IceFlow* API and *icepyx* package without writing code.
The output of the operations performed in the widget can be seen in the log window, which you can open with the right-most icon at the bottom of your browser
<img src='./img/log-icons.png'> or by selecting "Show log console" in the _View_ menu.

In [None]:
# Let's start with the user interface. Using 'horizontal' will add the widget inline.
client.display_map('horizontal', extra_layers=True)

---
#### 2.4. Accessing data with the *IceFlow* API

In [None]:
# Small example subset over Sermeq Kujalleq (Jakobshavn Isbrae):
my_params1 = {
    'datasets': ['GLAH06', 'ATL06'],
    'start': '2007-01-01',
    'end': '2018-12-31',
    'bbox': '-49.6,69.1,-49.3,69.17',

    # Select ITRF2014 to match ICESat-2, with an epoch matching the most recent ICESat-2 granule we are ordering
    'itrf': 'ITRF2014',
    'epoch': '2018.12'
}

# Returns a JSON dictionary with the request parameters and the order's response.
granules_metadata = client.query_cmr(params=my_params1)

In [None]:
# Since the IceBridge data is so dense, we order a smaller subset to decrease order and download times
my_params2 = {
    'datasets': ['ATM1B'],
    'start': '2007-01-01',
    'end': '2018-12-31',
    'bbox': '-49.53,69.12,-49.51,69.135',
    # Select ITRF2014 to match ICESat-2, with an epoch matching the most recent ICESat-2 granule we are ordering
    'itrf': 'ITRF2014',
    'epoch': '2018.12'
}

# Returns a JSON dictionary with the request parameters and the order's response.
granules_metadata = client.query_cmr(params=my_params2)
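
The `bbox` strings in the two request dictionaries appear to follow a west, south, east, north order in decimal degrees. This is an assumption inferred from the values used here, not a statement of the *IceFlow* specification. Under that assumption, a minimal standard-library sketch with the hypothetical helpers `parse_bbox` and `bbox_contains` validates such a string and confirms that the smaller ATM1B box sits inside the first request:

```python
def parse_bbox(bbox: str) -> tuple:
    """Parse 'west,south,east,north' (assumed order) into four floats."""
    west, south, east, north = (float(value) for value in bbox.split(","))
    if not (-180.0 <= west < east <= 180.0):
        raise ValueError("longitudes must satisfy -180 <= west < east <= 180")
    if not (-90.0 <= south < north <= 90.0):
        raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
    return west, south, east, north


def bbox_contains(outer: tuple, inner: tuple) -> bool:
    """Return True if the inner box lies entirely within the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


large = parse_bbox("-49.6,69.1,-49.3,69.17")       # bbox from my_params1
small = parse_bbox("-49.53,69.12,-49.51,69.135")   # bbox from my_params2
small_inside_large = bbox_contains(large, small)
```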

In [None]:
orders1 = client.place_data_orders(params=my_params1)
print(orders1)
orders2 = client.place_data_orders(params=my_params2)
print(orders2)

#### Check Order Status
The following cell will show you the status of your data order. You can proceed in the notebook once all orders are "COMPLETE". If you proceed earlier only the completed data orders will be downloaded.

In [None]:
for order in orders1:
    status = client.order_status(order)
    print(order['dataset'], order['id'], status['status'])
    
for order in orders2:
    status = client.order_status(order)
    print(order['dataset'], order['id'], status['status'])
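
Instead of re-running the status cell by hand, the check can be wrapped in a polling loop. The sketch below is a hedged illustration and not part of *IceFlow*: `wait_for_orders` is a hypothetical helper, and it assumes only what the cell above shows, namely that a status call returns a dictionary with a `'status'` key that eventually reads `"COMPLETE"`. The status function is passed in as an argument so the loop can be tried without a live connection.

```python
import time


def wait_for_orders(orders, get_status, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll until every order reports COMPLETE; return False if max_polls is reached first."""
    pending = list(orders)
    for _ in range(max_polls):
        # Keep only the orders that are not finished yet.
        pending = [order for order in pending if get_status(order)["status"] != "COMPLETE"]
        if not pending:
            return True
        sleep(poll_seconds)
    return False
```

In this notebook the call might look like `wait_for_orders(list(orders1) + list(orders2), client.order_status)`, assuming both order collections are iterable as in the loops above.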

#### Download Data
Once all data orders are "COMPLETE", you can proceed to download the data. The data are downloaded to the `data/` folder of this notebook directory.

In [None]:
for order in orders1:
    status = client.order_status(order)
    if status['status'] == 'COMPLETE':
        client.download_order(order)

In [None]:
for order in orders2:
    status = client.order_status(order)
    if status['status'] == 'COMPLETE':
        client.download_order(order)

---
#### 2.5. Downloading ICESat-2 data [directly] with ***icepyx***
Behind the scenes, *IceFlow* uses the [*icepyx*](https://icepyx.readthedocs.io/en/latest/) Python package to download ICESat-2 data. *icepyx* is a standalone library with its own examples and documentation, and it welcomes contributions from data users (no previous GitHub or software development experience required!). It offers additional functionality for querying, subsetting, ordering, and downloading ICESat-2 data sets, including easier programmatic downloads from multiple regions; data ingest into multiple formats is in the works. Here we highlight some of the data visualization capabilities for exploring data prior to order and download.

In [None]:
# Access the icepyx query object and import icepyx
# Note: if you would like to order additional ICESat-2 data using icepyx, you'll need to attach an Earthdata
# session to your icepyx query object (or re-login to Earthdata). See [icepyx examples](https://icepyx.readthedocs.io/en/latest/getting_started/example_link.html) for more details.
import icepyx as ipx
bbox_list = [float(val) for val in (my_params1["bbox"].split(","))]
is2_obj = ipx.Query(str(my_params1["datasets"][-1]), bbox_list, [my_params1["start"], my_params1["end"]])

In [None]:
# Visualize the query extent. The map is only interactive if geoviews and the dev version of icepyx are installed.
# On the static world map, very small areas can be difficult to see (an area for future development!)
is2_obj.visualize_spatial_extent()

## 3. Working with the data
Now that we have downloaded our data, we need to make sure that they are in a common format to do analysis across missions.

Although typically we would include all import statements at the start of the workflow, here we have separated them into this section for instructional clarity.

The main Python packages/libraries that will be used in this notebook are:

* [*cartopy*](https://scitools.org.uk/cartopy/docs/latest/):
A Python package designed for geospatial data processing in order to produce maps and other geospatial data analyses.
* [*geopandas*](https://geopandas.org/): 
Library to simplify working with geospatial data in Python (using pandas).
* [*h5py*](https://github.com/h5py/h5py):
Pythonic wrapper around the [HDF5 library](https://en.wikipedia.org/wiki/Hierarchical_Data_Format)
* [*matplotlib*](https://matplotlib.org/):
Comprehensive library for creating static, animated, and interactive visualizations in Python
* [*vaex*](https://github.com/vaexio/vaex):
High performance Python library for lazy Out-of-Core dataframes (similar to *pandas*), to visualize and explore big tabular data sets
* [*pandas*](https://pandas.pydata.org/):
Open source data analysis and manipulation tool
* [*icepyx*](https://icepyx.readthedocs.io/en/latest/):
Library for ICESat-2 data users
 
**Note**: *Warnings* are being ignored to suppress verbose warnings from some libraries (e.g., vaex, h5py). This will not prevent users from seeing errors.
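
Ignoring warnings globally, as this notebook does, silences them for the rest of the session. As a hedged alternative, the sketch below (standard library only; `noisy_divide` is a made-up stand-in for a verbose library call) limits the suppression to a `with` block and shows that exceptions still surface:

```python
import warnings


def noisy_divide(a, b):
    """Stand-in for a library call that emits a warning before doing its work."""
    warnings.warn("verbose library message", UserWarning)
    return a / b


# Warnings raised inside the block are ignored; record=True lets us confirm none got through.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore")
    result = noisy_divide(6, 3)

# Errors are not affected by the warning filter.
error_seen = False
try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        noisy_divide(1, 0)
except ZeroDivisionError:
    error_seen = True
```

Once the `with` block exits, the previous warning filters are restored, so later cells still report warnings.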

In [None]:
import cartopy.crs as ccrs #geospatial (mapping) plotting library
import cartopy.io.img_tiles as cimgt
import geopandas as gpd #add geospatial awareness/functionality to pandas
import h5py
from iceflow.processing import IceFlowProcessing as ifp
%matplotlib widget
import matplotlib.pyplot as plt #Python visualization
import vaex
import pandas as pd #data analysis and manipulation tool
import numpy as np
import warnings #Python warnings module
warnings.filterwarnings("ignore")

#### 3.1. Import data and convert to a geopandas data frame
ICESat, ICESat-2 and IceBridge data can be read in using preconfigured common dictionaries.
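
Conceptually, a common dictionary maps each product's native variable names onto one shared set of names. The sketch below illustrates the idea with plain Python dictionaries. `ILLUSTRATIVE_KEYS` and `harmonize` are hypothetical, and the native field names are illustrative placeholders that may not match the product files or the dictionaries *IceFlow* actually ships. The ICESat-2 values are made up.

```python
# Hypothetical mapping: common name -> native variable name, per product.
ILLUSTRATIVE_KEYS = {
    "GLAH06": {"latitude": "d_lat", "longitude": "d_lon", "elevation": "d_elev"},
    "ATL06": {"latitude": "latitude", "longitude": "longitude", "elevation": "h_li"},
}


def harmonize(record: dict, product: str) -> dict:
    """Rename a record's native fields to the shared latitude/longitude/elevation keys."""
    mapping = ILLUSTRATIVE_KEYS[product]
    return {common: record[native] for common, native in mapping.items()}


glas_point = harmonize({"d_lat": 69.16859, "d_lon": -49.32669, "d_elev": 644.33}, "GLAH06")
is2_point = harmonize({"latitude": 69.12, "longitude": -49.52, "h_li": 250.0}, "ATL06")
```

After this step both records expose the same keys, which is what makes stacking data from different missions straightforward.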

In [None]:
# ICESat granule data
glas_gdf = ifp.to_geopandas('data/GLAH06-20210423-Sample.h5')
glas_gdf['mission'] = "IS"

# Pre-IceBridge/IceBridge ATM granule data
#preib_gdf = ifp.to_geopandas('data/ATM1B-20210423-Sample.h5')
#preib_gdf['mission'] = "IB"

In [None]:
# ICESat-2 granule data
is2_gdf = ifp.to_geopandas('data/ATL06_20190201020557_05290204_004_01.h5')
is2_gdf['mission'] = "IS2"

In [106]:
# Let's see what's in the harmonized dataframe and its shape.
display(glas_gdf.head(), glas_gdf.shape)

| time | latitude | longitude | elevation | geometry | mission |
|------|----------|-----------|-----------|----------|---------|
| 2007-03-22 19:50:29.545575 | 69.16859 | -49.32669 | 644.33 | POINT (-49.32669 69.16859) | IS |
| 2007-03-22 19:50:29.570575 | 69.167068 | -49.327619 | 648.844 | POINT (-49.32762 69.16707) | IS |
| 2007-03-22 19:50:29.595575 | 69.165547 | -49.328545 | 651.84 | POINT (-49.32855 69.16555) | IS |
| 2007-03-22 19:50:29.620575 | 69.164027 | -49.32947 | 654.344 | POINT (-49.32947 69.16403) | IS |
| 2007-03-22 19:50:29.645575 | 69.162507 | -49.330395 | 661.877 | POINT (-49.33040 69.16251) | IS |


(212, 5)

In [None]:
# and again for ICESat-2 ATL06
display(is2_gdf.head(), is2_gdf.shape)

#### 3.2. Down sample IceBridge Data
Due to the size of the IceBridge ATM1B point cloud, it is often difficult to work with or plot data in a Notebook environment. We will downsample the data in this example for faster plotting.
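
The idea behind this kind of downsampling, often called decimation, is simply to keep every *n*-th point and discard the rest; nothing is averaged. A minimal standard-library sketch follows, in which `decimate` is a hypothetical helper and the record count is the number of rows in the ATM1B sample used in this notebook:

```python
def decimate(records, step=100):
    """Keep every `step`-th record, starting with the first."""
    return [record for position, record in enumerate(records) if position % step == 0]


n_points = 3_385_812  # rows in the ATM1B sample shown in this notebook
kept_positions = decimate(range(n_points), step=100)
```

For sequences that support slicing, `records[::100]` selects the same elements.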

In [103]:
# Read in the data and common dictionary
filepath = 'data/ATM1B-20210423-Sample.h5'
atm_key = ifp.get_common_dictionary('ATM')

f = h5py.File(filepath, 'r')
preib_vx = vaex.open(filepath)

preib_vx['date'] = preib_vx.utc_datetime.values.astype('datetime64[ns]')
preib_df = preib_vx[atm_key['latitude'], atm_key['longitude'], atm_key['elevation'], 'date']
preib_df.add_column('index', vaex.vrange(0, len(preib_vx)))
display(preib_df)

| # | latitude | longitude | elevation | date | index |
|---|----------|-----------|-----------|------|-------|
| 0 | 69.127761 | -49.493261 | -1816.333008 | 2007-09-20 16:38:11.122000000 | 0.0 |
| 1 | 69.129082 | -49.493192 | -1671.847046 | 2007-09-20 16:38:11.394000000 | 1.0 |
| 2 | 69.130029 | -49.541076 | 232.735001 | 2008-06-27 16:52:40.187000000 | 2.0 |
| 3 | 69.130081 | -49.540974 | 233.444 | 2008-06-27 16:52:40.187000000 | 3.0 |
| 4 | 69.130265 | -49.540582 | 236.263 | 2008-06-27 16:52:40.240000000 | 4.0 |
| ... | ... | ... | ... | ... | ... |
| 3,385,808 | 69.120001 | -49.52292 | 257.800995 | 2018-04-30 15:00:36.663000000 | 3385808.0 |
| 3,385,809 | 69.11999 | -49.522447 | 258.140991 | 2018-04-30 15:00:36.664000000 | 3385809.0 |
| 3,385,810 | 69.11999 | -49.522402 | 256.713013 | 2018-04-30 15:00:36.665000000 | 3385810.0 |
| 3,385,811 | 69.119989 | -49.522359 | 255.509995 | 2018-04-30 15:00:36.665000000 | 3385811.0 |


In [107]:
# Here we subsample, or "decimate", the data (keeping every 100th point) to make it smaller for our purposes
preib_dec = preib_df[(preib_df.index % 100 == 0)]
ib = np.array(["IB"]*len(preib_df))
preib_dec.add_column('mission', ib)
display(preib_dec)

| # | latitude | longitude | elevation | date | index | mission |
|---|----------|-----------|-----------|------|-------|---------|
| 0 | 69.127761 | -49.493261 | -1816.333008 | 2007-09-20 16:38:11.122000000 | 0.0 | IB |
| 1 | 69.130141 | -49.540565 | 240.117004 | 2008-06-27 16:52:40.509000000 | 100.0 | IB |
| 2 | 69.129584 | -49.541695 | 238.651993 | 2008-06-27 16:52:40.673000000 | 200.0 | IB |
| 3 | 69.129798 | -49.541006 | 244.223999 | 2008-06-27 16:52:40.833000000 | 300.0 | IB |
| 4 | 69.129254 | -49.542389 | 240.735992 | 2008-06-27 16:52:40.943000000 | 400.0 | IB |
| ... | ... | ... | ... | ... | ... | ... |
| 33,854 | 69.120412 | -49.523633 | 247.108002 | 2018-04-30 15:00:36.251000000 | 3385400.0 | IB |
| 33,855 | 69.119995 | -49.525912 | 247.442001 | 2018-04-30 15:00:36.327000000 | 3385500.0 | IB |
| 33,856 | 69.120157 | -49.521689 | 246.391006 | 2018-04-30 15:00:36.379000000 | 3385600.0 | IB |
| 33,857 | 69.120001 | -49.521329 | 250.070007 | 2018-04-30 15:00:36.462000000 | 3385700.0 | IB |


In [109]:
# Now we need to convert our downsampled data back into a GeoPandas GeoDataFrame so we can merge it with the other missions
preib_pandas = preib_dec.to_pandas_df(["latitude","longitude", "elevation", "date", "mission"])

preib_gdf = gpd.GeoDataFrame(preib_pandas,
                                geometry=gpd.points_from_xy(preib_pandas['longitude'],
                                                            preib_pandas['latitude'],
                                                            crs='epsg:4326'))
display(preib_gdf.head(), preib_gdf.shape)

|   | latitude | longitude | elevation | date | mission | geometry |
|---|----------|-----------|-----------|------|---------|----------|
| 0 | 69.127761 | -49.493261 | -1816.333008 | 2007-09-20 16:38:11.122 | IB | POINT (-49.49326 69.12776) |
| 1 | 69.130141 | -49.540565 | 240.117004 | 2008-06-27 16:52:40.509 | IB | POINT (-49.54057 69.13014) |
| 2 | 69.129584 | -49.541695 | 238.651993 | 2008-06-27 16:52:40.673 | IB | POINT (-49.54169 69.12958) |
| 3 | 69.129798 | -49.541006 | 244.223999 | 2008-06-27 16:52:40.833 | IB | POINT (-49.54101 69.12980) |
| 4 | 69.129254 | -49.542389 | 240.735992 | 2008-06-27 16:52:40.943 | IB | POINT (-49.54239 69.12925) |


(33859, 6)

#### 3.3. Plot the data from each mission together

In [None]:
# Now, let's plot the IceBridge and ICESat datasets together, using color to show elevation
# Note that although these data are projected, this map is not recommended as a basis for geospatial analysis

# Create a Stamen terrain background instance.
stamen_terrain = cimgt.Stamen('terrain-background')

map_fig = plt.figure()
# Create a GeoAxes in the tile's projection.
map_ax = map_fig.add_subplot(111, projection=stamen_terrain.crs)

# Limit the extent of the map to a small longitude/latitude range.
map_ax.set_extent([-56, -45, 67, 71], crs=ccrs.Geodetic())

# Add the Stamen data at zoom level 8.
map_ax.add_image(stamen_terrain, 8)

# world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# world.plot(ax=map_ax, facecolor="lightgray", edgecolor="gray")

# ####ATTN: Need to get the correct projection in here!
for onegdf, lab, shp in zip([preib_gdf, glas_gdf],["preib","glas"], ['P','o','D']):
    ms=map_ax.scatter(onegdf["longitude"], onegdf["latitude"],  2, c=onegdf["elevation"],
                      vmin=0, vmax=1000, label=lab, marker=shp,
                      transform=ccrs.Geodetic())

plt.colorbar(ms, label='elevation')

#### 3.4. Stack the dataframes

In [None]:
# Thanks to the harmonization, we can stack our geopandas dataframes to have a unified dataframe for analysis.
stacked_df = gpd.GeoDataFrame(pd.concat([preib_gdf, glas_gdf]))
display(stacked_df.head(), stacked_df.shape)
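
Stacking only works cleanly when every frame uses the same column names. In the outputs shown earlier, the ICESat frame is indexed by `time` while the downsampled IceBridge frame carries a `date` column, so one of them likely needs renaming before any time-based grouping. The following is a minimal standard-library sketch of that alignment, using plain dictionaries in place of GeoDataFrames. `with_time_key` is a hypothetical helper, and the two rows are taken from the tables in this notebook.

```python
from datetime import datetime


def with_time_key(record: dict, time_field: str) -> dict:
    """Return a copy of the record whose timestamp is stored under 'time'."""
    renamed = dict(record)
    renamed["time"] = renamed.pop(time_field)
    return renamed


glas_rows = [{"time": datetime(2007, 3, 22, 19, 50, 29, 545575), "latitude": 69.16859,
              "longitude": -49.32669, "elevation": 644.33, "mission": "IS"}]
preib_rows = [{"date": datetime(2008, 6, 27, 16, 52, 40, 509000), "latitude": 69.130141,
               "longitude": -49.540565, "elevation": 240.117004, "mission": "IB"}]

# Align the timestamp key, stack both missions, and order the result by time.
stacked = sorted(
    [with_time_key(row, "time") for row in glas_rows]
    + [with_time_key(row, "date") for row in preib_rows],
    key=lambda row: row["time"],
)
```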

#### 3.5. Time series analysis

In [None]:
# We zoom in on a very small section of our data to plot a time series
tsgdf_full = stacked_df.loc[(stacked_df["longitude"]>=-49.526) & (stacked_df["longitude"]<=-49.521)
              & (stacked_df["latitude"]>=69.121) & (stacked_df["latitude"]<=69.125) ]
# filter out erroneous elevation values
tsgdf_full = tsgdf_full.loc[tsgdf_full["elevation"] > 0]

print(len(tsgdf_full))

In [None]:
# In order to plot as a time series, we cannot have duplicate x (time) values. Where multiple records
# share a time value, we keep their average (numeric columns only)
tsgdf = tsgdf_full.groupby('time').mean(numeric_only=True)

# we also need to make "time" a non-index column
tsgdf["time_col"] = tsgdf.index

In [None]:
tsgdf.plot(x="time_col",y="elevation", kind="scatter")
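
Grouping on the raw timestamp only merges records whose times are exactly equal. If the goal is one value per second, the timestamps first need to be truncated to whole seconds. The sketch below shows that step with the standard library only. `mean_per_second` is a hypothetical helper, and the samples are the first five ICESat/GLAS rows from the table in section 3.1.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_per_second(samples):
    """Average elevations that fall in the same whole second; return sorted (time, mean) pairs."""
    buckets = defaultdict(list)
    for timestamp, elevation in samples:
        buckets[timestamp.replace(microsecond=0)].append(elevation)
    return sorted((second, mean(values)) for second, values in buckets.items())


samples = [
    (datetime(2007, 3, 22, 19, 50, 29, 545575), 644.33),
    (datetime(2007, 3, 22, 19, 50, 29, 570575), 648.844),
    (datetime(2007, 3, 22, 19, 50, 29, 595575), 651.84),
    (datetime(2007, 3, 22, 19, 50, 29, 620575), 654.344),
    (datetime(2007, 3, 22, 19, 50, 29, 645575), 661.877),
]
series = mean_per_second(samples)
```

All five samples fall within the same second, so they collapse to a single point in the series.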

---

Use code and example from https://nbviewer.jupyter.org/github/nicholas-kotlinski/2020_ICESat-2_Hackweek_Tutorials/blob/master/05.Geospatial_Analysis/shean_ICESat-2_hackweek_tutorial_GeospatialAnalysis_rendered.ipynb

## [Draft notebook for April 2021 Earthdata Webinar]

#### Learning Objectives (tbd)
* (goal from discussion) Introduce audience to the “Ice cubed” (ICESat/OIB/ICESat-2) missions and products
* (goal from discussion) Promote icepyx and IceFlow as a way to address the need to access/harmonize multiple data products/resolutions/formats/structure


#### Webinar Outline
* Intro of harmonization challenge
    - Moving one level deeper into the need for time series analysis across disparate platforms
* Intro to the altimetry missions
    - Use the IceFlow intro notebook to guide this
    - Highlight the impressive time series
* Overview of IceFlow and icepyx
    - Audience and use cases of these tools
    - Icepyx genesis, open science and community development principles
        - working on that balance between community development and end use
    - IceFlow overview
    - How do these tools address challenges
        - IceFlow aims to make it easier to harmonize OIB/ICESat data with ICESat-2 using backend IceFlow API
* Use Case / Application
    - Leverage ICESat-2 hackweek topics
    - https://github.com/ICESAT-2HackWeek/geospatial-analysis/blob/master/shean_ICESat-2_hackweek_tutorial_GeospatialAnalysis.ipynb
* Short navigation of nsidc site, how to get to tools, notebook
    - Have Jennifer post links while we talk.
* Demos
    - **Use this notebook to pull pieces of IceFlow / icepyx using a specific science application**