## Using SlideRule with ICESat-2 data

ICESat-2 has several products that could potentially be used to retrieve surface height and snow depth over SnowEx Alaska sites. The ATL03 product has the finest along-track resolution at 0.7 m, but it is also very noisy without filtering. The ATL06 and ATL08 products are less noisy, and ATL08 can be used to differentiate between vegetation and bare earth, but both are at a coarser resolution than ATL03 (40 m and 100 m, respectively).

SlideRule is an ICESat-2 data querying tool that offers a compromise between ATL03 and the higher-level products. It allows users to access ICESat-2 data through the cloud, given spatial bounds and a set of data parameters. Crucially, it also processes ATL03 data in a manner similar to ATL06 (i.e. line segments generated from aggregated signal photons), though the user has more flexibility in the along-track resolution and photon selection criteria (see below).

This is a short Jupyter Notebook designed to show how SlideRule could be used to reduce noise in ICESat-2 ATL03 data. We will use Creamer's Field, AK as a testbed.

In [None]:
import ipywidgets as widgets
import logging
import concurrent.futures
import time
from datetime import datetime
import numpy as np
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
from pyproj import Transformer, CRS
from shapely.geometry import Polygon, Point
from sliderule import icesat2
from sliderule import sliderule, ipysliderule, io

import xarray as xr
import geoviews as gv
import geoviews.feature as gf
from geoviews import dim, opts
import geoviews.tile_sources as gts
from bokeh.models import HoverTool
import hvplot.pandas

gv.extension('bokeh')

In [None]:
# Initialize the package
icesat2.init("slideruleearth.io", verbose=False)

The cell below contains bounding boxes for several SnowEx field sites across Alaska (plus Grand Mesa, CO). The tutorial uses the code for Creamer's Field ('cffl').

In [None]:
# Define the region of interest. Currently given as one hard-coded bounding box per site, but will
# (hopefully) update this to use polygons instead.
# cpcrw = Caribou/Poker Creek, AK
# cffl = Creamer's Field/Farmer's Loop, AK
# bcef = Bonanza Creek, AK
# acp = Arctic Coastal Plain (Deadhorse area), AK
# mesa = Grand Mesa, CO
field_id = 'cffl'

if field_id == 'cpcrw':
    # Caribou/Poker Creek, AK
    region = [ {"lon":-147.66633, "lat": 65.114884},
               {"lon":-147.379038, "lat": 65.114884},
               {"lon":-147.379038, "lat": 65.252394},
               {"lon":-147.66633, "lat": 65.252394},
               {"lon":-147.66633, "lat": 65.114884} ]
elif field_id == 'cffl':
    # Creamer's Field/Farmer's Loop, AK
    region = [ {"lon":-147.750873, "lat": 64.858387},
               {"lon":-147.661642, "lat": 64.858901},
               {"lon":-147.661642, "lat": 64.888732},
               {"lon":-147.750873, "lat": 64.888732},
               {"lon":-147.750873, "lat": 64.858387} ]
elif field_id == 'bcef':
    # Bonanza Creek, AK
    region = [ {"lon":-148.337216, "lat": 64.687819},
               {"lon":-148.243277, "lat": 64.687819},
               {"lon":-148.243277, "lat": 64.749681},
               {"lon":-148.337216, "lat": 64.749681},
               {"lon":-148.337216, "lat": 64.687819} ]
elif field_id == 'acp':
    # Arctic Coastal Plain, AK
    region = [ {"lon":-148.85, "lat": 69.985},
               {"lon":-148.527, "lat": 69.985},
               {"lon":-148.527, "lat": 70.111},
               {"lon":-148.85, "lat": 70.111},
               {"lon":-148.85, "lat": 69.985} ]
elif field_id == 'mesa':
    # Grand Mesa, CO
    region = [ {"lon":-108.275, "lat": 38.837},
               {"lon":-108.0, "lat": 38.837},
               {"lon":-108.0, "lat": 39.127},
               {"lon":-108.275, "lat": 39.127},
               {"lon":-108.275, "lat": 38.837} ]
else:
    raise ValueError('Field ID not recognized.')
    
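# Read the Creamer's Field lidar footprint from GeoJSON. Note that this replaces the
# bounding box selected above, regardless of field_id.
# To be added: GeoJSON files for the other sites, "buffer zones" for TOO request preparation
region = icesat2.toregion('jsons-shps/cffl_lidar_box.geojson')["poly"]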
print(region)
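
The hand-typed vertex lists above can be generated from four numbers per site. The sketch below is one possible way to do that. `bbox_to_region` and `SITE_BOUNDS` are hypothetical names that are not part of SlideRule, and the output only mimics the list-of-dicts layout used for `region` in this notebook. Creamer's Field is left out because its listed vertices do not form an exact rectangle.

```python
def bbox_to_region(min_lon, min_lat, max_lon, max_lat):
    """Return a closed, counter-clockwise ring of {"lon", "lat"} dicts."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("Bounding box minimums must be smaller than maximums.")
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first vertex to close the ring
    ]
    return [{"lon": lon, "lat": lat} for lon, lat in corners]


# Hypothetical lookup table: (min_lon, min_lat, max_lon, max_lat) per site,
# copied from the bounding boxes in the cell above.
SITE_BOUNDS = {
    "cpcrw": (-147.66633, 65.114884, -147.379038, 65.252394),
    "bcef": (-148.337216, 64.687819, -148.243277, 64.749681),
    "acp": (-148.85, 69.985, -148.527, 70.111),
    "mesa": (-108.275, 38.837, -108.0, 39.127),
}

example_region = bbox_to_region(*SITE_BOUNDS["bcef"])
print(example_region)
```

A lookup table like this keeps each site on one line and removes the risk of mistyping a repeated coordinate.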

We are now going to build TWO SlideRule requests over Creamer's Field. The first one will include all signal photons (high, medium, low) and will not filter out tree canopies. The second query will only include high-confidence signal photons that are recognized as "ground" photons by the ATL08 algorithm.

A brief rundown of each of the parameters given below:
* "poly": The polygon defining our region of interest (defined above as "region")
* "srt": The surface type: land, land ice, sea ice, ocean, or inland water.
* "cnf": Confidence level of the retrieved ICESat-2 photons. The lower the confidence threshold, generally the noisier the data.
* "ats": Minimum along-track spread (uncertainty) in photon aggregates (units of meters).
* "len": Length of line segments of photon aggregates (units of meters).
* "res": Distance between line segment midpoints, or along-track resolution (units of meters).
* "maxi": Maximum number of times for the SlideRule algorithm to process photon aggregates into elevation estimates.
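
To make "len", "res", and "ats" more concrete, here is a simplified sketch of windowed line fitting on synthetic photons. It is not SlideRule's actual algorithm (there is no iterative refinement and no photon classification), and `fit_segments` and its arguments are hypothetical names chosen for illustration. With a window length of 20 m and a spacing of 10 m, consecutive segments overlap by half.

```python
import numpy as np


def fit_segments(x, h, seg_len=20.0, res=10.0, min_spread=5.0):
    """Fit a straight line to the photons inside each along-track window.

    x, h       : along-track distance and photon height (meters)
    seg_len    : window length, analogous to "len"
    res        : spacing between window centers, analogous to "res"
    min_spread : minimum along-track spread of photons, analogous to "ats"
    Returns the window centers and the fitted height at each center.
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    half = seg_len / 2.0
    centers = np.arange(x.min() + half, x.max() - half + 1e-9, res)
    h_center = np.full(centers.shape, np.nan)
    for i, c in enumerate(centers):
        in_window = np.abs(x - c) <= half
        # Skip windows with too few photons or too little along-track spread
        if in_window.sum() < 2 or np.ptp(x[in_window]) < min_spread:
            continue
        # Center x on the window midpoint so the intercept is the height there
        slope, intercept = np.polyfit(x[in_window] - c, h[in_window], 1)
        h_center[i] = intercept
    return centers, h_center


# Synthetic photons: a gentle slope plus random noise (made-up values)
rng = np.random.default_rng(0)
x_demo = np.sort(rng.uniform(0.0, 100.0, 400))
h_demo = 130.0 + 0.02 * x_demo + rng.normal(0.0, 0.3, x_demo.size)

centers, h_center = fit_segments(x_demo, h_demo)
print(np.column_stack([centers, h_center]).round(2))
```

Shortening `seg_len` gives finer detail but fewer photons per fit, which is the trade-off the "len" and "res" parameters expose.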

In [None]:
# Build first request with the specified parameters
parms = {
    "poly": region,
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_LOW,
    "ats": 5.0,
    "len": 20.0,
    "res": 10.0,
    "maxi": 5
}

# Request ATL06 Data (first request)
df = icesat2.atl06p(parms, "nsidc-s3")

# Build second request
parms = {
    "poly": region,
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_HIGH,
    "atl08_class": ["atl08_ground"],
    "ats": 5.0,
    "len": 20.0,
    "res": 10.0,
    "maxi": 5
}

df2 = icesat2.atl06p(parms, "nsidc-s3")

In [None]:
df.head()

In [None]:
df2.head()

There are a few things that we can notice right away from the DataFrame previews above. First, the "h_sigma" parameter, i.e. the uncertainty in the estimated surface height, is lower in the second DataFrame. Second, the surface height (elevation) estimate ("h_mean") differs by several decimeters between the two queries. We will look at this in more detail.

Just for reference, let's take a look at the coverage of Creamer's Field from all tracks since 2018.

In [None]:
# Sample plot for all of the ICESat-2 tracks since its launch

# Calculate Extent
lons = [p["lon"] for p in region]
lats = [p["lat"] for p in region]
lon_margin = (max(lons) - min(lons)) * 0.1
lat_margin = (max(lats) - min(lats)) * 0.1

# Create Plot
fig,(ax1,ax2) = plt.subplots(num=None, ncols=2, figsize=(12, 6))
box_lon = [e["lon"] for e in region]
box_lat = [e["lat"] for e in region]

# Plot SlideRule Ground Tracks
ax1.set_title("SlideRule Zoomed Ground Tracks")
df2.plot(ax=ax1, column=df2["h_mean"], cmap='winter_r', s=1.0, zorder=3)
ax1.plot(box_lon, box_lat, linewidth=1.5, color='r', zorder=2)
ax1.set_xlim(min(lons) - lon_margin, max(lons) + lon_margin)
ax1.set_ylim(min(lats) - lat_margin, max(lats) + lat_margin)
ax1.set_aspect('equal', adjustable='box')

# Plot SlideRule Global View
ax2.set_title("SlideRule Global Reference")
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
world.plot(ax=ax2, color='0.8', edgecolor='black')
df2.plot(ax=ax2, marker='o', color='red', markersize=2.5, zorder=3)
ax2.set_xlim(-160,-145)
ax2.set_ylim(60,70)
ax2.set_aspect('equal', adjustable='box')

# Show Plot
plt.tight_layout()

# List the RGTs that are within the region of interest
print(np.unique(df2['rgt']))

The Creamer's Field site is fairly small, so there are only a few ICESat-2 tracks that fly over (without TOOs). To keep things simple, let's look at just one of the tracks: RGT 1356.

In [None]:
# Set up a dataframe that is only applicable for an RGT of interest
rgt = 1356
rgt_pd = df[df['rgt']==rgt]
rgt_pd2 = df2[df2['rgt']==rgt]

rgt_pd.head()

Let's take a look at the height comparisons between our two queries. The plots below show (a) the along-track surface height in March 2022, and (b) the uncertainty ("h_sigma") along the same track.

In [None]:
# Only look at one beam (gt == 50, which is gt3l in SlideRule's numbering), for simplicity
rgt_pd_ctr = rgt_pd[rgt_pd['gt']==50]
rgt_pd_ctr2 = rgt_pd2[rgt_pd2['gt']==50]

#%matplotlib inline
plt.plot(rgt_pd_ctr.geometry.y.loc['2022-03'], rgt_pd_ctr['h_mean'].loc['2022-03'], '.', label='Unfiltered')
plt.plot(rgt_pd_ctr2.geometry.y.loc['2022-03'], rgt_pd_ctr2['h_mean'].loc['2022-03'], '.', label='Filtered')
plt.xlabel('Latitude')
plt.ylabel('Elevation [m]')
plt.legend()
plt.tight_layout()
plt.show()
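
The `.loc['2022-03']` calls above rely on pandas partial-string indexing, which selects every row whose timestamp falls in that month. This assumes the returned GeoDataFrame is indexed by time, as the plotting code implies. A minimal sketch with made-up values:

```python
import pandas as pd

# Hypothetical stand-in for a time-indexed SlideRule result
times = pd.to_datetime([
    "2021-07-14 03:00",
    "2022-03-21 10:15",
    "2022-03-21 10:16",
    "2022-06-20 08:00",
])
demo = pd.DataFrame({"h_mean": [131.2, 130.8, 130.9, 131.5]}, index=times)

# Sorting the index first keeps string-based selection well defined
demo = demo.sort_index()

# Select all rows from March 2022
march_2022 = demo.loc["2022-03"]
print(march_2022)
```

The same pattern works at other granularities, e.g. `demo.loc["2022"]` for a whole year.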

In [None]:
uf_sigma_mean = rgt_pd_ctr['h_sigma'].loc['2022-03'].mean()
f_sigma_mean = rgt_pd_ctr2['h_sigma'].loc['2022-03'].mean()

plt.plot(rgt_pd_ctr.geometry.y.loc['2022-03'], rgt_pd_ctr['h_sigma'].loc['2022-03'], '.', label='Unfiltered')
plt.plot(rgt_pd_ctr2.geometry.y.loc['2022-03'], rgt_pd_ctr2['h_sigma'].loc['2022-03'], '.', label='Filtered')
plt.xlabel('Latitude')
plt.ylabel('Height uncertainty [m]')
#plt.text(38.85, 37, 'Unfiltered mean $\sigma_h$ = {:.2f} m'.format(uf_sigma_mean))
#plt.text(38.85, 30, 'Filtered mean $\sigma_h$ = {:.2f} m'.format(f_sigma_mean))
plt.legend()
plt.tight_layout()
plt.show()

Looking at these plots, limiting our ICESat-2 data to only high-confidence ground photons clearly reduces noise. At least over Creamer's Field, the noise appears to be caused by vegetation, as the unfiltered along-track profile shows elevations several meters above the ground.
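
One way to put a number on this difference is to pair each filtered segment with the nearest unfiltered segment and subtract the heights. The sketch below uses synthetic values and assumes plain DataFrames with hypothetical `lat`, `h_mean`, and `h_sigma` columns. With the notebook's GeoDataFrames, `lat` would first have to be derived from `geometry.y`. `pair_profiles` is an illustrative helper, not a SlideRule function.

```python
import numpy as np
import pandas as pd


def pair_profiles(unfiltered, filtered, tolerance=1e-4):
    """Pair each filtered segment with the nearest unfiltered one in latitude.

    tolerance is in degrees of latitude (1e-4 degrees is roughly 11 m).
    Returns the paired rows with an extra "dh" column (unfiltered - filtered).
    """
    left = filtered.sort_values("lat")
    right = unfiltered.sort_values("lat")
    paired = pd.merge_asof(
        left, right, on="lat", direction="nearest",
        tolerance=tolerance, suffixes=("_filt", "_unfilt"),
    )
    # Drop filtered segments that have no unfiltered neighbor within tolerance
    paired = paired.dropna(subset=["h_mean_unfilt"]).copy()
    paired["dh"] = paired["h_mean_unfilt"] - paired["h_mean_filt"]
    return paired


# Synthetic profiles: flat ground at 130 m, with canopy returns biasing
# two of the unfiltered segments upward (made-up values)
lat = np.linspace(64.86, 64.88, 5)
filtered_demo = pd.DataFrame({"lat": lat, "h_mean": 130.0, "h_sigma": 0.1})
unfiltered_demo = pd.DataFrame({
    "lat": lat,
    "h_mean": 130.0 + np.array([0.0, 4.0, 0.0, 6.0, 0.0]),
    "h_sigma": 0.1 + np.array([0.0, 1.5, 0.0, 2.0, 0.0]),
})

paired_demo = pair_profiles(unfiltered_demo, filtered_demo)
print(paired_demo[["lat", "dh"]])
print("Mean h_sigma, unfiltered:", paired_demo["h_sigma_unfilt"].mean())
print("Mean h_sigma, filtered:  ", paired_demo["h_sigma_filt"].mean())
```

Large positive `dh` values flag segments where the unfiltered fit was likely pulled upward by canopy photons.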

Creamer's Field is highly vegetated, so a difference is expected when we are more selective with our photons. Over flatter, less vegetated terrain, such as the Arctic Coastal Plain of Alaska, the differences may not be quite as significant.


As a last step, let's make a nice-looking map of RGT 1356 over Creamer's Field. We will use Holoviews and Geoviews for this.

In [None]:
hover = HoverTool(tooltips=[('Latitude', '@Latitude'),
                             ('Longitude', '@Longitude'),
                             ('h_mean', '@h_mean'),
                             ('rgt', '@rgt')])

lidar_boxes = gpd.read_file('/home/jovyan/icesat2-snowex/jsons-shps/snowex_lidar_swaths.shp')
lidar_boxes_poly = gv.Polygons(lidar_boxes).opts(color='white', 
                                                 alpha=0.5)

ds = gv.Dataset(rgt_pd2[0:-1:100])
points_on_map = gv.Points(rgt_pd2,
                          kdims=['Longitude', 'Latitude'],
                          vdims=['h_mean']).opts(tools=[hover],
                                                 color_index='h_mean',
                                                 colorbar=True,
                                                 clabel='Elevation [m]')

world_map = gts.EsriImagery.opts(width=600, height=570) * gts.StamenLabels.options(level='annotation')

world_map * lidar_boxes_poly * points_on_map

In [None]:
# Save the DataFrame to a CSV file
df2.to_csv(r'is2_atl03sl_%s.csv' %(field_id))
#rgt_pd2.to_csv(r'is2_atl03sl_%s_rgt%s.csv' %(field_id, rgt))

In [None]:
rgt_pd2.head()

## References
Shean et al., (2023). SlideRule: Enabling rapid, scalable, open science for the NASA ICESat-2 mission and beyond. Journal of Open Source Software, 8(81), 4982, https://doi.org/10.21105/joss.04982