# Reproducible investigations of maritime glaciers using open-source tools
This notebook ({nb-download}`download <IS2_OSS-FAIR-Resources_Workflow.ipynb>`) illustrates the use of multiple open-source tools (icepyx, IceFlow, SlideRule) for exploring an Alaskan maritime glacier.

The notebook was designed for a presentation at the June 2022 International Glaciological Society (IGS) [International Symposium on Maritime Glaciers](https://www.igsoc.org/event/international-symposium-on-maritime-glaciers) in Juneau, Alaska, USA.

**Symposium Abstract:**
Multiple open-source software (OSS) packages developed by and for the glaciological community enable rapid investigations of maritime glaciers. Focusing on Alaskan maritime glaciers, we illustrate how icepyx and other community-built software packages can be leveraged to quickly explore ICESat-2 data in combination with data from other sensors for a given glacier. The first tool showcased, the Python package icepyx, was created by the author in response to challenges faced by the glaciology community in accessing ICESat-2 data programmatically. With icepyx, we query and quickly visualize ICESat-2 data of the glacier. Then, we construct a time series of elevations spanning the ICESat, IceBridge, and ICESat-2 sensors using the NSIDC IceFlow package. Last, we customize our ICESat-2 data analyses with in-cloud processing using SlideRule. The workflow, encapsulated within an executable Jupyter Notebook, showcases the tools' ease of use for data access, analysis, and visualization while demonstrating the application of FAIR (Findable, Accessible, Interoperable, Reusable) principles and collaborative development in glaciological research.


### Tools Showcased
 1. [icepyx]()
 2. [IceFlow]()
 3. [SlideRule]()


### Objectives
 1. Showcase several open-source tools useful to glaciologists
 2. Demonstrate the application of FAIR principles in glaciological research
 3. Investigate a maritime glacier...

# TO DO:
- add links to tools
- update science and FAIR objectives
- add refs/source material
- figures/images (use table Mikala made)
- clean up!
- open issues as needed/track changes to code base

## Maybe to-do:
- get RGI from NSIDC via CMR, if it's an option

### Environment

In [None]:
# import needed packages
%load_ext autoreload
%autoreload 2
import geopandas as gpd
import icepyx as ipx


### Regional Extent

The Randolph Glacier Inventory ([RGI](https://nsidc.org/data/nsidc-0770)), part of Global Land Ice Measurements from Space ([GLIMS](https://www.glims.org/)), provides glacier outlines. Here we'll read the Alaska glacier outlines into a GeoPandas GeoDataFrame.

In [None]:
# get RGI glacier polygons
rgi_zip_fn = '01_rgi60_Alaska.zip'
url = 'https://www.glims.org/RGI/rgi60_files/' + rgi_zip_fn
ak_rgi_gdf = gpd.read_file(url)

In [None]:
ak_rgi_gdf.head()

In [None]:
ak_rgi_gdf.plot()

In [None]:
# choose a glacier
# ak_rgi_gdf[~ak_rgi_gdf["Name"].isnull()]
glac = ak_rgi_gdf[ak_rgi_gdf["Name"] == "Mendenhall Glacier"]

In [None]:
glac.plot()

### ICESat-2 data via icepyx

In [None]:
# get exterior coordinates of the glacier polygon
poly = list(glac.geometry.values[0].exterior.coords)

# simplify polygon for CMR
simp_poly = list(glac.simplify(0.01).geometry.values[0].exterior.coords)
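
To give a feel for what simplifying the outline does, here is a minimal pure-Python sketch of the classic Douglas-Peucker line simplification. It is an illustration only: the `douglas_peucker` function and the sample coordinates are made up for this example, and the geometry library used above may apply a topology-preserving variant rather than this exact routine. The tolerance is in the units of the coordinates, so `0.01` would mean 0.01 degrees if the outlines are stored as longitude/latitude.

```python
import math


def _point_segment_distance(pt, start, end):
    """Shortest distance from pt to the segment start-end."""
    (x, y), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # degenerate segment (e.g. a closed ring): distance to the single point
        return math.hypot(x - x1, y - y1)
    # projection parameter, clamped so the nearest point stays on the segment
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / (dx * dx + dy * dy)))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))


def douglas_peucker(coords, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(coords) < 3:
        return list(coords)
    start, end = coords[0], coords[-1]
    dists = [_point_segment_distance(p, start, end) for p in coords[1:-1]]
    max_dist = max(dists)
    if max_dist <= tolerance:
        return [start, end]
    # keep the farthest vertex and simplify both halves recursively
    split = dists.index(max_dist) + 1
    left = douglas_peucker(coords[: split + 1], tolerance)
    right = douglas_peucker(coords[split:], tolerance)
    return left[:-1] + right


# made-up line: the vertex at (1, 0.001) is nearly collinear with its neighbours
line = [(0, 0), (1, 0.001), (2, 0), (3, 1), (4, 0)]
simplified = douglas_peucker(line, 0.01)
print(len(line), "->", len(simplified), "vertices")
```

Fewer vertices mean a shorter polygon string in the spatial query, at the cost of an outline that can deviate from the original by up to the tolerance.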

In [None]:
simp_poly

In [None]:
# create an icepyx Query object
is2_glac = ipx.Query(spatial_extent=simp_poly, 
                     date_range=['2021-06-01','2021-07-01'], 
                     product="ATL06")

In [None]:
is2_glac.avail_granules(ids=True)

In [None]:
# visualize our outline on a map
is2_glac.visualize_spatial_extent()

In [None]:
# quick-view available ICESat-2 data with icepyx
cyclemap, rgtmap = is2_glac.visualize_elevation()
cyclemap

In [None]:
rgtmap

In [None]:
# download the data with icepyx
path = "./is2-download"

is2_glac.earthdata_login(uid='icepyx_devteam', email='icepyx.dev@gmail.com')

In [None]:
is2_glac.download_granules(path=path)

In [None]:
# do some basic data read in and analysis!
pattern = "processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5"
reader = ipx.Read(path, "ATL06", pattern) # or ipx.Read(filepath, "ATLXX") if your filenames match the default pattern
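
The filename pattern above names the fields encoded in each granule filename. As a rough illustration of that layout, the sketch below pulls the same fields out with a regular expression. This is not how icepyx parses filenames internally; `parse_is2_filename` and the sample filename are hypothetical and exist only for this example.

```python
import re
from datetime import datetime

# field widths follow the pattern string used in the cell above
FILENAME_RE = re.compile(
    r"^(?:processed_)?ATL(?P<product>\d{2})_"
    r"(?P<datetime>\d{14})_"
    r"(?P<rgt>\d{4})(?P<cycle>\d{2})(?P<orbitsegment>\d{2})_"
    r"(?P<version>\d{3})_(?P<revision>\d{2})\.h5$"
)


def parse_is2_filename(name):
    """Split a granule filename into its named fields."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected filename layout: {name}")
    fields = match.groupdict()
    fields["datetime"] = datetime.strptime(fields["datetime"], "%Y%m%d%H%M%S")
    return fields


# hypothetical filename, made up for illustration only
info = parse_is2_filename("processed_ATL06_20210615123456_12341102_005_01.h5")
print(info["rgt"], info["cycle"], info["datetime"].date())
```

If your files lack the `processed_` prefix, the optional group in the expression still matches them.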

In [None]:
reader._filelist

In [None]:
import h5py

In [None]:
h5pt = h5py.File(reader._filelist[1],'r')

In [None]:
print(
    list(h5pt['gt1r'].keys()),
    list(h5pt['gt1l'].keys()),
    list(h5pt['gt2r'].keys()),
    list(h5pt['gt2l'].keys()),
    list(h5pt['gt3r'].keys()),
    list(h5pt['gt3l'].keys()),
)

# New Issue (see line 599 of read module)
### Why are gt1r and gt1l (e.g. with only a residual histogram) included in the subset file from NSIDC if there's no geospatial data there?
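
One way to work around beams that are present but carry no segment data is to check each beam group before reading from it. The sketch below assumes the ATL06 layout in which along-track variables live under `gtXX/land_ice_segments`. The helper `beams_with_segments` is hypothetical, and nested dictionaries stand in for an open `h5py.File`, which supports the same `in` and `[]` operations.

```python
BEAMS = ("gt1l", "gt1r", "gt2l", "gt2r", "gt3l", "gt3r")


def beams_with_segments(h5file, group="land_ice_segments"):
    """Return the beam names whose top-level group contains `group`."""
    return [b for b in BEAMS if b in h5file and group in h5file[b]]


# stand-in for a subsetted granule: gt1l/gt1r hold only a residual histogram
fake_file = {
    "gt1l": {"residual_histogram": {}},
    "gt1r": {"residual_histogram": {}},
    "gt2l": {"land_ice_segments": {}, "residual_histogram": {}},
    "gt2r": {"land_ice_segments": {}, "residual_histogram": {}},
    "gt3l": {"land_ice_segments": {}, "residual_histogram": {}},
    "gt3r": {"land_ice_segments": {}, "residual_histogram": {}},
}

usable = beams_with_segments(fake_file)
print(usable)
```

With a real file, the same call would be `beams_with_segments(h5pt)`.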

In [None]:
# get desired variables
reader.vars.append(var_list=['h_li', "latitude", "longitude"])
reader.vars.wanted

In [None]:
# load the data
is2_ds = reader.load()
is2_ds

In [None]:
import xarray as xr
is2_merge = xr.merge(is2_ds, compat='override')
is2_merge

In [None]:
# quick preview!
is2_merge.plot.scatter(x="longitude", y="latitude", hue="h_li", vmin=-100, vmax=2000)

In [None]:
# make a better map/plot here so can see where the points fall!!

-------------
## IceFlow

Use IceFlow to get a longer time series of data from ICESat, IceBridge, and ICESat-2.

For more details on the inputs selected here, see [this time series tutorial notebook](https://github.com/nsidc/NSIDC-Data-Tutorials/blob/main/notebooks/iceflow/4_time_series_tutorial.ipynb).

In [None]:
# add packages to environment (if you didn't when creating your environment)
!pip install ipyleaflet ipympl python-cmr sidecar

In [None]:
# add location of iceflow module to path and import 
# (note this is system-dependent and requires that you first clone the library from GitHub)
import importlib
import pandas as pd
from pathlib import Path
import sys

sys.path.append("/Users/jessica/computing/misc/github/NSIDC-Data-Tutorials/notebooks/iceflow/")
iceflow = importlib.import_module("iceflow")

In [None]:
# Earthdata authentication (someday this will hopefully be streamlined with icepyx so you only need to log in once!)
client = iceflow.ui.IceFlowUI()
client.display_credentials()

authorized = client.authenticate()
if authorized is None:
    print('Earthdata Login not successful')
else:
    print('Earthdata Login successful!')

In [None]:
bound_box_poly = list(is2_glac._spat_extent.envelope.exterior.coords)
bound_box_poly

In [None]:
bound_box = ','.join([str(bound_box_poly[0][0]), str(bound_box_poly[0][1]), 
                      str(bound_box_poly[2][0]), str(bound_box_poly[2][1])])
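
Picking corners by index depends on the order in which the envelope's vertices are stored. A sketch that avoids that dependence takes the minimum and maximum of each coordinate instead. It assumes the west, south, east, north ordering that the indexing above appears to produce; `bbox_string` and the sample ring are hypothetical.

```python
def bbox_string(coords):
    """Build a 'west,south,east,north' string from (lon, lat) pairs."""
    lons = [c[0] for c in coords]
    lats = [c[1] for c in coords]
    corners = (min(lons), min(lats), max(lons), max(lats))
    return ",".join(str(v) for v in corners)


# made-up closed ring, listed in an arbitrary vertex order
ring = [(-134.4, 58.6), (-134.6, 58.6), (-134.6, 58.4), (-134.4, 58.4), (-134.4, 58.6)]
print(bbox_string(ring))
```

Because only the extremes are used, the function gives the same result for any polygon outline, not just a rectangle.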

In [None]:
# input parameters needed for ordering data via IceFlow
ifl_params ={
    'datasets': ['GLAH06', 'ATM1B', 'ILVIS2'],
    'start': '1993-01-01',
    'end': '2018-12-31',
    'bbox': bound_box,
    # Here we will select ITRF2014 to match the Epoch of the most recent ICESat-2 granule we are ordering
    'itrf': 'ITRF2014',
    'epoch': '2018.12'
}

# returns a json dictionary, the request parameters, and the order's response.
granules_metadata = client.query_cmr(params=ifl_params)
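
The `epoch` value looks like a decimal year, which is how reference-frame epochs are commonly written. Assuming that reading, a date can be converted with a few lines of standard-library Python. The `decimal_year` helper below is hypothetical and is not part of IceFlow.

```python
from datetime import datetime


def decimal_year(moment):
    """Express a datetime as a fractional year, accounting for leap years."""
    start = datetime(moment.year, 1, 1)
    end = datetime(moment.year + 1, 1, 1)
    return moment.year + (moment - start) / (end - start)


# noon on 2 July is the midpoint of a non-leap year
mid_2018 = decimal_year(datetime(2018, 7, 2, 12))
print(f"{mid_2018:.2f}")
```

Formatting the result to two decimal places gives a string in the same style as the value used in `ifl_params`.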

In [None]:
# update input parameters so an order is not placed for datasets with no granules
ifl_params['datasets'] = ['GLAH06']

In [None]:
# place order
ifl_order = client.place_data_orders(params=ifl_params)

In [None]:
# check order status
for order in ifl_order:
    status = client.order_status(order)
    print(order['dataset'], order['id'], status['status'])

In [None]:
# download data (once all orders are COMPLETE)
for order in ifl_order:
    status = client.order_status(order)
    if status['status'] == 'COMPLETE':
        client.download_order(order)
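
Rather than re-running the status cell by hand, the check can be wrapped in a small polling loop. The sketch below is generic: `wait_for_orders` is a hypothetical helper, the `PENDING` status and the fake orders are made up, and the only thing assumed from the cells above is that the status call returns a dictionary whose `'status'` becomes `'COMPLETE'`.

```python
import time


def wait_for_orders(orders, get_status, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll until every order reports COMPLETE, or raise after `timeout` seconds."""
    pending = list(orders)
    waited = 0.0
    while pending:
        # keep only the orders that are not finished yet
        pending = [o for o in pending if get_status(o)["status"] != "COMPLETE"]
        if not pending:
            break
        if waited >= timeout:
            raise TimeoutError(f"{len(pending)} order(s) still not COMPLETE")
        sleep(interval)
        waited += interval
    return True


# fake status function: each order completes after a fixed number of checks
calls = {"a": 0, "b": 0}


def fake_status(order):
    calls[order["id"]] += 1
    done = calls[order["id"]] >= order["ready_after"]
    return {"status": "COMPLETE" if done else "PENDING"}


orders = [{"id": "a", "ready_after": 1}, {"id": "b", "ready_after": 3}]
finished = wait_for_orders(orders, fake_status, interval=0.0, sleep=lambda s: None)
print(finished, calls)
```

In this notebook the equivalent call would be `wait_for_orders(ifl_order, client.order_status)`. The sketch does not handle failed orders, which would keep it waiting until the timeout.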

In [None]:
# this notebook is running outside the IceFlow module
!pwd

# get the path where IceFlow data was automatically downloaded
ifl_path = Path(iceflow.__file__).parent.joinpath('../data')
print('\n', ifl_path)

# get the list of downloaded files
# (Path.match tests the filename against the pattern; Path.glob on a file is always truthy)
ifl_filelist = [p for p in ifl_path.iterdir() if (p.is_file() and p.match("*-2022*.h5"))]
print('\n', ifl_filelist)
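
Filtering a directory listing by filename pattern can be sketched with `pathlib` alone. The example below builds a temporary directory with made-up filenames and keeps only the HDF5 files whose names contain `-2022`. The filenames are hypothetical and do not reflect IceFlow's actual naming.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    # create empty placeholder files with made-up names
    for name in ("GLAH06-20220601-abc.h5", "GLAH06-20210423-Sample.h5", "notes.txt"):
        (data_dir / name).touch()

    # glob on the directory applies the pattern to each entry's name
    matches = sorted(p.name for p in data_dir.glob("*-2022*.h5") if p.is_file())

print(matches)
```

Sorting the result makes positional indexing into the list repeatable between runs, since directory iteration order is not guaranteed.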

In [None]:
ifl_filelist[4]

In [None]:
# read in the data

# ICESat granule data
glas_gdf = iceflow.processing.IceFlowProcessing.to_geopandas(ifl_filelist[4]) # UPDATE PATH BASED ON YOUR OUTPUTTED FILENAME
glas_gdf['mission'] = "IS"
glas_gdf['time'] = pd.to_datetime(glas_gdf.index.astype(str))

# #Pre-IceBridge/IceBridge ATM granule data
# preib_gdf = ifp.to_geopandas('data/ATM1B-20210423-Sample.h5') # UPDATE PATH BASED ON YOUR OUTPUTTED FILENAME
# preib_gdf['mission'] = "IB"

In [None]:
glas_gdf

In [None]:
is2_merge # instead, consider turning it into a dataframe for plotting?

In [None]:
%matplotlib widget
import cartopy.crs as ccrs  # geospatial (mapping) plotting library
import cartopy.io.img_tiles as cimgt
import matplotlib.pyplot as plt  # Python visualization

In [None]:
# Note that although this data is projected, it is not recommended you use this map as a basis for geospatial analysis

# Create a Stamen terrain background instance.
stamen_terrain = cimgt.Stamen('terrain-background')

map_fig = plt.figure()
# Create a GeoAxes in the tile's projection.
map_ax = map_fig.add_subplot(111, projection=stamen_terrain.crs)

# Limit the extent of the map to a small longitude/latitude range.
map_ax.set_extent([-135, -134, 58.1, 58.9], crs=ccrs.Geodetic())

# Add the Stamen data at zoom level 8.
map_ax.add_image(stamen_terrain, 8)

for onegdf, lab, shp in zip([glas_gdf],["is"], ['o']):
    ms=map_ax.scatter(onegdf["longitude"], onegdf["latitude"],  2, c=onegdf["elevation"],
                      vmin=0, vmax=1000, label=lab, marker=shp,
                      transform=ccrs.Geodetic())

for oneds, lab, shp in zip([is2_merge],["IS2"], ['D']):
    ms=map_ax.scatter(oneds["longitude"], oneds["latitude"],  2, c=oneds["h_li"],
                      vmin=0, vmax=1000, label=lab, marker=shp,
                      transform=ccrs.Geodetic())
plt.colorbar(ms, label='elevation');

In [None]:
# look at a time series or something (as in NSIDC example)?

In [None]:
# get higher-resolution ICESat-2 data with SlideRule

**Credits**
* notebook by: Jessica Scheick
* notebook contributors: 
* source material: []() by ???