# *IceFlow & icepyx*: Altimetry Time Series Tutorial
### NASA Earthdata Webinar - April 2021

This tutorial demonstrates how to harmonize several NASA altimetry data sets with varying temporal coverage, formats, and coordinate reference frames using the IceFlow and icepyx Python tools. Please refer to the 0_introduction.ipynb notebook for detailed information on the data sets you will be exploring in this tutorial. 

#### Objectives:
1. Use the IceFlow map widget to select and visualize an area of interest.
2. Access coincident ICESat/GLAS, Operation IceBridge, and ICESat-2 data over the same spatial region.
3. Use the community-developed icepyx Python library to subset ICESat-2 data.
4. Learn about advanced icepyx capabilities, including data value visualization prior to download.
5. Extract common data variables into a GeoPandas dataframe.
6. Plot and visualize the altimetry time series to detect glacial elevation change over time.

<b>Authors:</b><br />
<span style="font-size:larger;">Jessica Scheick</span>, *University of New Hampshire*, Durham, New Hampshire<br />
<span style="font-size:larger;">Nicholas Kotlinski & Amy Steiker</span>, *NASA National Snow and Ice Data Center DAAC*, Boulder, Colorado, USA

---

#### Running this tutorial locally

To run this notebook locally, you must first set up your computing environment. Please see the [repository readme](https://github.com/nsidc/NSIDC-Data-Tutorials#usage-with-binder) for instructions on several ways (using Binder, Docker, or Conda) to do this.

### 1. NASA's Earthdata Credentials

To access data using the *IceFlow* library and *icepyx* package, it is necessary to log into [Earthdata Login](https://urs.earthdata.nasa.gov/). To do this, enter your NASA Earthdata credentials in the next step after executing the following code cell.

**Note**: If you don't have NASA Earthdata credentials you will need to register first at the link above. An account is free and available to everyone!

In [None]:
# This cell will prompt you for your Earthdata username and password. Press
from iceflow.ui import IceFlowUI
client = IceFlowUI()
client.display_credentials()

In [None]:
# This cell will verify if your credentials are valid. 
# This may take a little while. If it fails, please try again.

authorized = client.authenticate()
if authorized is None:
    print('Earthdata Login not successful')
else:
    print('Earthdata Login successful!')

**Note:** If the output shows "Earthdata Login successful!", then you are ready to proceed!

---

### 2. Accessing and harmonizing data across missions
#### 2.1. Accessing Data with the *IceFlow* Access Widget
The *IceFlow* access widget is a user interface tool for visualizing IceBridge flight paths, drawing a region of interest, setting spatio-temporal parameters, and placing data orders to the *IceFlow* API and *icepyx* package without the need to write code.
The output of the operations performed in the widget can be seen in the log console. Open it with the right-most icon at the bottom of your browser
<img src='./img/log-icons.png'> or by selecting "Show log console" in the View menu.

**Note:** The access widget is currently stateless, so if you change any parameter you will have to redraw your bounding box or polygon.

In [None]:
# Let's start with the user interface. Using 'horizontal' will add the widget inline.
client.display_map('horizontal', extra_layers=True)

#### 2.2. Accessing data with the *IceFlow* API

### ToDo: Need to explain what the itrf and epoch values are and which ones are best to utilize in conjunction with ICESat-2 analysis. 

### For Jessica's clarification (and maybe others')?
ICESat = GLAH06 (GLAS sensor)

pre-IceBridge = ATM

IceBridge = ILVIS2

ICESat-2 = ATL0X (ATLAS sensor)

In [None]:
# Small example subset over Sermeq Kujalleq (Jakobshavn Isbrae):
my_params1 ={
    'datasets': ['GLAH06', 'ILVIS2', 'ATL06'],
    'start': '2007-01-01',
    'end': '2018-12-31',
    'bbox': '-49.6,69.1,-49.3,69.17',

    # Here we select ITRF2014 to match ICESat-2, and the epoch of the most recent ICESat-2 granule we are ordering
    'itrf': 'ITRF2014',
    'epoch': '2018.12'
}

# returns a json dictionary, the request parameters and the order's response.
granules_metadata = client.query_cmr(params=my_params1)

In [None]:
# Since the IceBridge data is so dense, we order a smaller subset to decrease order and download times
my_params2 ={
    'datasets': ['ATM1B'],
    'start': '2007-01-01',
    'end': '2018-12-31',
    'bbox': '-49.53,69.12,-49.51,69.135',
    # Here we select ITRF2014 to match ICESat-2, and the epoch of the most recent ICESat-2 granule we are ordering
    'itrf': 'ITRF2014',
    'epoch': '2018.12'
}

# returns a json dictionary, the request parameters and the order's response.
granules_metadata = client.query_cmr(params=my_params2)

In [None]:
orders1 = client.place_data_orders(params=my_params1)
print(orders1)
orders2 = client.place_data_orders(params=my_params2)
print(orders2)

#### Check Order Status
The following cell shows the status of each data order. You can proceed in the notebook once all orders are "COMPLETE". If you proceed earlier, only the completed data orders will be downloaded.

In [None]:
for order in orders1:
    status = client.order_status(order)
    print(order['dataset'], order['id'], status['status'])
    
for order in orders2:
    status = client.order_status(order)
    print(order['dataset'], order['id'], status['status'])

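If you would rather not re-run the status cell by hand, the check can be wrapped in a small polling loop. The sketch below is only an illustration: `wait_for_orders` is a hypothetical helper (not part of *IceFlow*), the `"PENDING"` status string is a placeholder for whatever non-complete status the service reports, and the status function is passed in as an argument so the example runs without network access.

```python
import time


def wait_for_orders(orders, get_status, poll_seconds=30.0, max_polls=20,
                    sleep=time.sleep):
    """Poll each order until all report 'COMPLETE' or max_polls is reached.

    `orders` is an iterable of dicts with an 'id' key, and `get_status` is a
    callable that takes one order and returns a dict with a 'status' key.
    Returns a dict mapping order id to the last status seen.
    """
    pending = {order["id"]: order for order in orders}
    last_status = {}
    for attempt in range(max_polls):
        for order_id in list(pending):
            status = get_status(pending[order_id])["status"]
            last_status[order_id] = status
            if status == "COMPLETE":
                del pending[order_id]
        if not pending:
            break
        # Wait before the next round, but not after the final attempt.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return last_status


# Demonstration with a stand-in status function (hypothetical statuses).
_calls = {"n": 0}


def fake_status(order):
    _calls["n"] += 1
    return {"status": "COMPLETE" if _calls["n"] >= 3 else "PENDING"}


demo = wait_for_orders([{"id": "a1", "dataset": "ATL06"}], fake_status,
                       poll_seconds=0, sleep=lambda seconds: None)
print(demo)
```

In this notebook the call would look something like `wait_for_orders(list(orders1) + list(orders2), client.order_status)`, assuming the order objects behave like the dictionaries indexed in the cell above.
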
#### Download Data
Once all data orders are "COMPLETE", you can proceed to download the data. The data are downloaded to the /data folder of this notebook directory.

In [None]:
for order in orders1:
    status = client.order_status(order)
    if status['status'] == 'COMPLETE':
        client.download_order(order)

In [None]:
for order in orders2:
    status = client.order_status(order)
    if status['status'] == 'COMPLETE':
        client.download_order(order)

#### 2.3. Downloading ICESat-2 data [directly] with *icepyx*
Behind the scenes, *IceFlow* is using the [*icepyx*](https://icepyx.readthedocs.io/en/latest/) Python package to download ICESat-2 data. *icepyx* is a standalone library that includes its own examples and documentation and welcomes contributions from data users (no previous GitHub or software development experience required!). Thus, it has a lot of additional functionality for querying, subsetting, ordering, and downloading ICESat-2 datasets (with in-the-works additions for data ingest into multiple formats), including making it easier to programmatically download data from multiple regions. Here we highlight some of the data visualization capabilities for exploring data prior to order and download.

In [None]:
# Access the icepyx query object and import icepyx
# Note: if you would like to order additional ICESat-2 data using icepyx, you'll need to attach an Earthdata
# session to your icepyx query object (or re-login to Earthdata). See [icepyx examples](https://icepyx.readthedocs.io/en/latest/getting_started/example_link.html) for more details.
import icepyx as ipx
bbox_list = [float(val) for val in (my_params1["bbox"].split(","))]
is2_obj = ipx.Query(str(my_params1["datasets"][-1]), bbox_list, [my_params1["start"], my_params1["end"]])

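The bounding box strings used in this tutorial list the minimum longitude, minimum latitude, maximum longitude, and maximum latitude, separated by commas. A typo in that string is easy to miss, so a small check before building a query can help. The following is a minimal sketch. `parse_bbox` is a hypothetical helper written for this example and is not part of *IceFlow* or *icepyx*.

```python
def parse_bbox(bbox):
    """Parse 'min_lon,min_lat,max_lon,max_lat' into a list of four floats."""
    values = [float(val) for val in bbox.split(",")]
    if len(values) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(values)}")
    min_lon, min_lat, max_lon, max_lat = values
    # Longitudes are only range-checked (not ordered), so boxes that cross
    # the antimeridian are not rejected here.
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= min_lat <= max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min_lat <= max_lat <= 90")
    return values


# Same bounding box string as in my_params1.
bbox_list = parse_bbox("-49.6,69.1,-49.3,69.17")
print(bbox_list)
```
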
In [None]:
# Visualize the query extent (this map won't be interactive if you don't have geoviews and the dev version of icepyx installed)
# Thus, for very small areas it can be difficult to see the specified region on a static world map (an area for future development!)
is2_obj.visualize_spatial_extent()

### 3. Working with the data
Now that we have downloaded our data, we need to make sure that they are in a common format to do analysis across missions.

Although typically we would include all import statements at the start of the workflow, here we have separated them into this section for instructional clarity.

### **Jessica update**: why are we ignoring warnings? I addressed the rest of the comments 

**Amy Note**: We should briefly explain the libraries we're importing. I generally think it's better practice to put all imports at the very top of the notebook, but I don't have any documentation on hand to back that up! In particular, we should explain why we're ignoring warnings (and this could probably be in the same code block as the previous block).

In [None]:
import cartopy.crs as ccrs #geospatial (mapping) plotting library
import cartopy.io.img_tiles as cimgt
import geopandas as gpd #add geospatial awareness/functionality to pandas
from iceflow.processing import IceFlowProcessing as ifp
%matplotlib widget
import matplotlib.pyplot as plt #Python visualization
import pandas as pd #data analysis and manipulation tool
import warnings #Python warnings module
warnings.filterwarnings("ignore")

In [None]:
import h5py

In [None]:
with h5py.File('data/ILVIS2-20210422-6a3b3ac4-f1f9-40a0-b19d-7b34022e0960.h5', 'r') as h5:
    print(h5.keys())

## We don't have a reader for the ILVIS data - should we not download it then?

In [None]:
# Pre-IceBridge ATM granule data
preib_gdf = ifp.to_geopandas('data/ATM1B-20210423-8fc1edba-1022-4fa1-937b-43d3457fbd93.h5')
preib_gdf['mission'] = "IB"

# We don't have a reader for the ILVIS data
# preib_gdf = ifp.to_geopandas('data/ILVIS2-20210423-c099b967-fd33-4104-85fc-11b0edc218c5.h5')
# preib_gdf['mission'] = "IB"

# ICESat granule data
glas_gdf = ifp.to_geopandas('data/GLAH06-20210423-dc59a3db-e79b-4723-8272-f461bdc02dc1.h5')
glas_gdf['mission'] = "IS"

# ICESat-2 granule data
is2_gdf = ifp.to_geopandas('data/processed_ATL06_20181214041627_11690105_004_01.h5')
is2_gdf['mission'] = "IS2"

In [None]:
# first, let's see what's in the harmonized dataframe and its shape.
display(preib_gdf.head(), preib_gdf.shape)

## Nic, can you introduce the preib data downscaling here? The plotting is super slow because of the data density for preib_gdf

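One simple way to reduce the point density before plotting is to keep every n-th record. The sketch below uses a synthetic table in place of the real ATM data. `thin_every_nth` and the step of 100 are illustrative choices, not the approach the authors settled on.

```python
import numpy as np
import pandas as pd


def thin_every_nth(df, step):
    """Return every `step`-th row of `df`, preserving column order."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return df.iloc[::step]


# Synthetic stand-in for a dense point cloud (values are made up).
rng = np.random.default_rng(0)
dense = pd.DataFrame({
    "longitude": rng.uniform(-49.53, -49.51, 10_000),
    "latitude": rng.uniform(69.12, 69.135, 10_000),
    "elevation": rng.uniform(0, 1000, 10_000),
})

thinned = thin_every_nth(dense, 100)
print(len(dense), "->", len(thinned))
```

Because this is plain positional slicing, it should work the same way on a GeoDataFrame, for example `thin_every_nth(preib_gdf, 100)`. Regular thinning ignores spatial structure, so a spatial binning approach may be preferable for analysis rather than display.
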
In [None]:
# we do the same for the ICESat/GLAS dataframe
display(glas_gdf.head(), glas_gdf.shape)

In [None]:
# and again for ICESat-2 ATL06
display(is2_gdf.head(), is2_gdf.shape)

In [None]:
# Now, let's plot all three datasets together, using color to show elevation
# Note that although this data is projected, it is not recommended you use this map as a basis for geospatial analysis

# Create a Stamen terrain background instance.
stamen_terrain = cimgt.Stamen('terrain-background')

map_fig = plt.figure()
# Create a GeoAxes in the tile's projection.
map_ax = map_fig.add_subplot(111, projection=stamen_terrain.crs)

# Limit the extent of the map to a small longitude/latitude range.
map_ax.set_extent([-56, -45, 67, 71], crs=ccrs.Geodetic())

# Add the Stamen data at zoom level 8.
map_ax.add_image(stamen_terrain, 8)

# world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# world.plot(ax=map_ax, facecolor="lightgray", edgecolor="gray")

# ####ATTN: Need to get the correct projection in here!
for onegdf, lab, shp in zip([preib_gdf, glas_gdf, is2_gdf],["preib","glas","is2"], ['P','o','D']):
    ms=map_ax.scatter(onegdf["longitude"], onegdf["latitude"],  2, c=onegdf["elevation"],
                      vmin=0, vmax=1000, label=lab, marker=shp,
                      transform=ccrs.Geodetic())

plt.colorbar(ms, label='elevation')

In [None]:
# Thanks to the harmonization, we can stack our geopandas dataframes to have a unified dataframe for analysis.
stacked_df = gpd.GeoDataFrame(pd.concat([preib_gdf, glas_gdf, is2_gdf]))
display(stacked_df.head(), stacked_df.shape)

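After stacking, a per-mission summary is a quick way to confirm that every data set made it into the combined table. The sketch below uses small synthetic frames with made-up values, and the column names mirror those used in this notebook. It passes `ignore_index=True` so the stacked table has a unique index, which the cell above does not do.

```python
import pandas as pd

# Synthetic stand-ins for the harmonized dataframes (values are made up).
frames = {
    "IB": pd.DataFrame({"elevation": [10.0, 12.0]}),
    "IS": pd.DataFrame({"elevation": [20.0]}),
    "IS2": pd.DataFrame({"elevation": [30.0, 31.0, 32.0]}),
}

# Tag each frame with its mission label, then stack them.
tagged = [frame.assign(mission=mission) for mission, frame in frames.items()]
stacked = pd.concat(tagged, ignore_index=True)

# Count and mean elevation per mission.
summary = stacked.groupby("mission")["elevation"].agg(["count", "mean"])
print(summary)
```
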
In [None]:
# We zoom in on a very small section of our data to plot a time series
tsgdf_full = stacked_df.loc[(stacked_df["longitude"]>=-49.526) & (stacked_df["longitude"]<=-49.521)
              & (stacked_df["latitude"]>=69.121) & (stacked_df["latitude"]<=69.125) ]
# filter out erroneous elevation values
tsgdf_full = tsgdf_full.loc[tsgdf_full["elevation"] > 0]

print(len(tsgdf_full))

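The same spatial filter can be written with `Series.between`, which is inclusive at both ends by default and keeps the bounds in one place. This is a sketch on synthetic points. `clip_to_box` is a hypothetical helper, and the `-9999.0` value stands in for an erroneous elevation.

```python
import pandas as pd


def clip_to_box(df, lon_range, lat_range):
    """Keep rows whose longitude and latitude fall inside the given ranges."""
    in_lon = df["longitude"].between(*lon_range)
    in_lat = df["latitude"].between(*lat_range)
    return df.loc[in_lon & in_lat]


# Synthetic points (values are made up for illustration).
points = pd.DataFrame({
    "longitude": [-49.530, -49.525, -49.523, -49.522],
    "latitude": [69.122, 69.123, 69.124, 69.130],
    "elevation": [150.0, 210.0, -9999.0, 300.0],
})

clipped = clip_to_box(points, (-49.526, -49.521), (69.121, 69.125))
# Drop non-physical elevations, as in the cell above.
subset = clipped.loc[clipped["elevation"] > 0]
print(subset)
```
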
In [None]:
# in order to plot as a time series, we cannot have duplicate x (time) values. Since the data collection rates
# are on the order of seconds, we keep the average where there are multiple records per second
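# numeric_only=True skips non-numeric columns (such as "mission"), which recent pandas versions refuse to average
tsgdf = tsgdf_full.groupby('time').mean(numeric_only=True)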

# we also need to make "time" a non-index column
tsgdf["time_col"] = tsgdf.index

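To see what the per-timestamp averaging does, it can help to try it on a tiny synthetic table. The sketch below uses made-up values. It also shows a variant that groups by mission and time together, which keeps the mission label that a plain numeric mean would drop.

```python
import pandas as pd

# Synthetic records with a duplicated timestamp (values are made up).
records = pd.DataFrame({
    "time": pd.to_datetime([
        "2008-07-01 12:00:00",
        "2008-07-01 12:00:00",
        "2018-12-14 04:16:27",
    ]),
    "elevation": [100.0, 110.0, 90.0],
    "mission": ["IB", "IB", "IS2"],
})

# Average numeric columns per timestamp; reset_index makes "time" a column.
per_time = records.groupby("time").mean(numeric_only=True).reset_index()
print(per_time)

# Variant: group by mission and time so the mission label is retained.
by_mission = records.groupby(["mission", "time"], as_index=False)["elevation"].mean()
print(by_mission)
```
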
In [None]:
tsgdf.plot(x="time_col",y="elevation", kind="scatter")

---

Use code and example from https://nbviewer.jupyter.org/github/nicholas-kotlinski/2020_ICESat-2_Hackweek_Tutorials/blob/master/05.Geospatial_Analysis/shean_ICESat-2_hackweek_tutorial_GeospatialAnalysis_rendered.ipynb

## [Draft notebook for April 2021 Earthdata Webinar]

#### Learning Objectives (tbd)
* (goal from discussion) Introduce audience to the “Ice cubed” (ICESat/OIB/ICESat-2) missions and products
* (goal from discussion) Promote icepyx and IceFlow as a way to address the need to access/harmonize multiple data products/resolutions/formats/structures


#### Webinar Outline
* Intro of harmonization challenge
    - Moving one level deeper into the need for time series analysis across disparate platforms
* Intro to the altimetry missions
    - Use the IceFlow intro notebook to guide this
    - Highlight the impressive time series
* Overview of IceFlow and icepyx
    - Audience and use cases of these tools
    - Icepyx genesis, open science and community development principles
        - working on that balance between community development and end use
    - IceFlow overview
    - How do these tools address challenges
        - IceFlow aims to make it easier to harmonize OIB/ICESat data with ICESat-2 using backend IceFlow API
* Use Case / Application
    - Leverage ICESat-2 hackweek topics
    - https://github.com/ICESAT-2HackWeek/geospatial-analysis/blob/master/shean_ICESat-2_hackweek_tutorial_GeospatialAnalysis.ipynb
* Short navigation of nsidc site, how to get to tools, notebook
    - Have Jennifer post links while we talk.
* Demos
    - **Use this notebook to pull pieces of IceFlow / icepyx using a specific science application**