# Overview of **SlideRule** Functionality

[SlideRule](https://slideruleearth.io/web/) is a web service that provides on-demand customized data products.  The primary way of accessing SlideRule is through its Python client.

```{admonition} Quick links for the event
* SlideRule Documentation: https://slideruleearth.io/web/
* SlideRule GitHub Repository: https://github.com/ICESat2-SlideRule/sliderule
* SlideRule Python Examples GitHub Repository: https://github.com/ICESat2-SlideRule/sliderule-python
```

This notebook gives an overview of the different functionality SlideRule provides.

```{admonition} Learning Objectives
* How to import and configure the SlideRule Python client
* What core and advanced functionality SlideRule provides
* Where to find documentation on SlideRule's APIs
```

## I. Import and Configure SlideRule

The SlideRule Python client currently consists of eight primary modules.

* __sliderule__ - the core module
* __earthdata__ - functions that access CMR (NASA's Common Metadata Repository), CMR-STAC, and TNM (The National Map, for the 3DEP data hosted by USGS)
* __h5__ - APIs for directly reading HDF5 and NetCDF4 data
* __raster__ - APIs for sampling supported raster datasets
* __icesat2__ - APIs for processing ICESat-2 data
* __gedi__ - APIs for processing GEDI data
* __io__ - functions for reading and writing local files with SlideRule results
* __ipysliderule__ - functions for building interactive Jupyter notebooks that interface to SlideRule

These modules can be imported into your environment like so:

In [1]:
from sliderule import sliderule, earthdata, h5, raster, icesat2, gedi




Once those modules are imported, the next thing you'll likely want to do is configure the client with the settings you want to use.  That is done with a call to `sliderule.init()`.  For detailed documentation on what arguments are supported by the initialization function, check out the [api reference page](https://slideruleearth.io/web/rtd/api_reference/sliderule.html#init).

It is not necessary to call `sliderule.init()` in order to start using the client, since the default settings provide a working system.  Nevertheless, it is a good practice to include a call to this function early in your notebook as a placeholder for when different settings are desired.  For instance, if you want to change the verbosity of the client and enable logging to the console, you can do so as shown below.

In [2]:
sliderule.init(verbose=True)

## II. Core Functionality
- A. [Directly read HDF5 and NetCDF4 files](#h5p)
- B. [Subset ATL03 photon cloud data](#atl03sp)
- C. [Generate customized ATL06 elevations](#atl06p)
- D. [Generate customized ATL06 elevations using ATL08 classifications](#atl08)
- E. [Generate customized ATL08 vegetation metrics](#phoreal)
- F. [Sample rasters at points of interest](#raster)
- G. [Subset GEDI L1B, L2A, L4A](#gedi)

<a id='h5p'></a>
### A. Directly read HDF5 and NetCDF4 files

The **h5** module provides APIs for directly reading HDF5 and NetCDF4 files hosted by NASA in the cloud. [`h5.h5p`](https://slideruleearth.io/web/rtd/api_reference/h5.html#h5p) is the primary method used to directly read data in the cloud.  The [reference page](https://slideruleearth.io/web/rtd/api_reference/h5.html#h5p) for `h5.h5p` provides a description of each of the arguments needed to make the call and the different options available.  


```{tip}
Under the hood, the functions in the **h5** module make HTTP requests to SlideRule servers running in us-west-2; those servers read the requested data from S3 and return the results to the client in an HTTP response.
```

In the example below, the first 100 latitudes and longitudes are read from an ATL06 granule.  The results are returned in a dictionary of numpy arrays, where each key is the name of the dataset.  If instead of reading just the first 100 values, all the values need to be read, then "numrows" can be set to `h5.ALL_ROWS`. 

In [None]:
asset = "icesat2"
resource = "ATL06_20181017222812_02950102_006_02.h5"
datasets = [
    # latitudes
    {"dataset": "/gt1l/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt1r/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt2l/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt2r/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt3l/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt3r/land_ice_segments/latitude", "startrow": 0, "numrows": 100},
    # longitudes
    {"dataset": "/gt1l/land_ice_segments/longitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt1r/land_ice_segments/longitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt2l/land_ice_segments/longitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt2r/land_ice_segments/longitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt3l/land_ice_segments/longitude", "startrow": 0, "numrows": 100},
    {"dataset": "/gt3r/land_ice_segments/longitude", "startrow": 0, "numrows": 100}
]

atl06 = h5.h5p(datasets, resource, asset)

In [None]:
atl06
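
The dataset list in the example above repeats the same descriptor for every beam and variable. As a minimal sketch, the same list can be generated with a comprehension. The helper name `build_datasets` is hypothetical and not part of the SlideRule client; the beam names, group name, and dictionary keys are copied from the example above.

```python
# Beam names and variables copied from the h5p example above.
BEAMS = ["gt1l", "gt1r", "gt2l", "gt2r", "gt3l", "gt3r"]
VARIABLES = ["latitude", "longitude"]

def build_datasets(beams, variables, group="land_ice_segments", startrow=0, numrows=100):
    """Hypothetical helper: one dataset descriptor per (variable, beam) pair."""
    return [
        {"dataset": f"/{beam}/{group}/{variable}", "startrow": startrow, "numrows": numrows}
        for variable in variables
        for beam in beams
    ]

datasets = build_datasets(BEAMS, VARIABLES)
print(len(datasets))
print(datasets[0])
```

The resulting list has the same entries, in the same order, as the hand-written one, so it could be passed to `h5.h5p` in its place.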

<a id='atl03sp'></a>
### B. Subset ATL03 photon cloud data

The [`icesat2.atl03sp`](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl03sp) function makes an ATL03 subsetting request to SlideRule servers and returns a GeoDataFrame of photons.  Documentation for this function can be found in the [API reference](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl03sp).

In the example below, a set of resources is specified via the `poly`, `rgt`, and `cycle` parameters.  Because each granule contains so many photons, it is necessary when making this call to limit the area over which the subsetting request is made, along with the number of granules inside that area. By supplying a GeoJSON file (which is read and processed by the `sliderule.toregion` function into a format usable by SlideRule), the extent of data read in each granule is trimmed.  By supplying a reference ground track (`rgt`), and the cycle number, the number of granules is reduced - in this case to a single granule.

The other parameters in the request are used to specify different aspects of the ATL03 subsetting request.  The `srt` parameter specifies the surface type, which in this case is land. The surface type is used in conjunction with the next parameter, `cnf`, which is the confidence level.  A confidence level of high tells SlideRule to only include photons that are highly likely to be surface reflections off of land.  (As a different example, if the `srt` parameter specified land ice and the confidence level was low, then SlideRule would include all photons that had *at least* a low likelihood of being a reflection off of land ice). Lastly, the `len` and `res` parameters specify the length and resolution of the photon segments being returned.  In this case we are asking for 20m segments of photons every 20m.  The length and step size of the segment do not matter so much if it is only photons being returned, but when other processing parameters are supplied (like the minimum along-track spread of a segment), then they matter more.

Lastly, the call to `icesat2.atl03sp` is made which sends the HTTP request to SlideRule's servers and then waits and accumulates the response from the servers into a GeoDataFrame, with each row representing a single photon.

```{tip}
For the request below there are only ~1K photons returned.  In the actual data, there are roughly 300K high-confidence photons inside the Grand Mesa region in the selected granule.  The reason so few photons are returned is that the length of the segment is set to 20m and the default along-track-spread required for a valid segment is also 20m (because the default segment length is 40m).  As a result, most segments are filtered out by SlideRule as not being valid.  This happens all the time to me - I get back a lot less data than I expected because I inadvertently changed one parameter without changing another parameter related to it.  In this case, the way to get all of the photons would be to either change the along-track-spread to 40m ("ats": 40), or tell SlideRule to return invalid segments ("pass_invalid": True).  I left it this way below because I wanted to highlight this common issue, and also because the smaller dataset is faster to load into a GeoDataFrame.
```
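
The mismatch described in the tip above can be caught before a request is sent. The sketch below is a client-side sanity check and not part of the SlideRule client. The function name `check_spread` is hypothetical, the default values are the ones stated in the tip, and the rule "a spread requirement at least as long as the segment rejects most segments" is an assumption drawn from the tip, not from the SlideRule source.

```python
# Defaults as stated in the tip above (assumptions; confirm against the SlideRule docs).
DEFAULT_LEN = 40.0  # default segment length, in meters
DEFAULT_ATS = 20.0  # default minimum along-track spread, in meters

def check_spread(parms):
    """Hypothetical check: return a warning string, or None if nothing looks wrong."""
    if parms.get("pass_invalid", False):
        return None  # invalid segments are returned anyway, so nothing is dropped
    length = parms.get("len", DEFAULT_LEN)
    spread = parms.get("ats", DEFAULT_ATS)
    if spread >= length:
        return (f"ats ({spread} m) is not smaller than len ({length} m); "
                "most segments will likely be rejected as invalid")
    return None

# Segment length changed without changing the spread: flagged.
print(check_spread({"len": 20.0, "res": 20.0}))
# Spread lowered along with the segment length: not flagged.
print(check_spread({"len": 20.0, "res": 20.0, "ats": 7.0}))
```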


In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')

# Build ATL03 Subsetting Request Parameters
parms = {
    "poly": region["poly"],
    "rgt": 737,
    "cycle": 16,
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_HIGH,
    "len": 20.0,
    "res": 20.0
}

# Make ATL03 Subsetting Request
atl03 = icesat2.atl03sp(parms)

In [None]:
atl03
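
The `rgt` and `cycle` values used to select granules also appear in ICESat-2 granule file names. The sketch below parses the resource name from the `h5.h5p` example, assuming the usual `ATLxx_[yyyymmddhhmmss]_[ttttccss]_[vvv]_[rr].h5` naming convention (reference ground track, cycle, and region packed into the third field). The function name `parse_granule` is hypothetical and not part of the SlideRule client.

```python
import re

# Assumed naming convention: product, timestamp, rgt+cycle+region, version, revision.
GRANULE_RE = re.compile(
    r"^(?P<product>ATL\d{2})_(?P<timestamp>\d{14})_"
    r"(?P<rgt>\d{4})(?P<cycle>\d{2})(?P<region>\d{2})_"
    r"(?P<version>\d{3})_(?P<revision>\d{2})\.h5$"
)

def parse_granule(name):
    """Hypothetical helper: split an ICESat-2 granule name into its fields."""
    match = GRANULE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized granule name: {name}")
    fields = match.groupdict()
    return {
        "product": fields["product"],
        "rgt": int(fields["rgt"]),
        "cycle": int(fields["cycle"]),
        "region": int(fields["region"]),
        "version": fields["version"],
    }

info = parse_granule("ATL06_20181017222812_02950102_006_02.h5")
print(info)
```

Reading the track and cycle off a file name this way can help confirm that a returned granule is the one the `rgt` and `cycle` parameters were meant to select.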

<a id='atl06p'></a>
### C. Generate customized ATL06 elevations

The [`icesat2.atl06p`](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl06p) function makes an on-demand processing request to SlideRule servers to generate customized ATL06 elevations and return them in a GeoDataFrame.  Documentation for this function can be found in the [API reference](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl06p).

In the example below, a set of resources is implicitly specified via the `poly` parameter.  By only supplying a polygon defining the region of interest, SlideRule will determine which granules intersect the region and then process all of them.

```{tip}
Under the hood, the SlideRule Python client uses the supplied polygon to make a call to NASA's CMR system to get a list of ATL03 granules that intersect the region of interest.  The list of granules is passed to the SlideRule servers along with the polygon geometry.  The SlideRule servers then distribute the processing of each granule across all the available servers, and each server is responsible for pulling out the photons inside the region of interest and calculating a set of elevations from them.
```

The other parameters in the request all control different aspects of the ATL03 subsetting and ATL06 algorithm running on the SlideRule servers. Note that the length of the ATL03 segment used to generate an ATL06 elevation has been customized to 20m instead of the 40m in the standard product, and the step size has been similarly customized to 10m instead of 20m.

Lastly, the call to `icesat2.atl06p` makes the processing request by sending an HTTP request to SlideRule's servers and then waiting and accumulating the response from the servers into a GeoDataFrame, with each row representing an elevation calculated from a custom ATL03 segment.

In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')

# Build ATL06 Request Parameters
parms = {
    "poly": region["poly"],
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_HIGH,
    "ats": 7.0,
    "cnt": 10,
    "len": 20.0,
    "res": 10.0,
}

# Make ATL06 Request
atl06 = icesat2.atl06p(parms)

In [None]:
atl06
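
To make the effect of `len` and `res` concrete, the sketch below lists the along-track extents of consecutive segments for the customized settings (20m length, 10m step) and the standard ones (40m length, 20m step). This is a simplified illustration of overlapping windows, not SlideRule's actual segmentation logic; the function name `segment_extents` and the 60m track length are made up for the example.

```python
def segment_extents(track_length, seg_len, step):
    """Return (start, end) extents, in meters, of full-length segments along a track."""
    extents = []
    start = 0.0
    while start + seg_len <= track_length:
        extents.append((start, start + seg_len))
        start += step
    return extents

# Customized settings from the request above: "len": 20.0, "res": 10.0
custom = segment_extents(60.0, seg_len=20.0, step=10.0)
# Standard product settings: 40 m segments every 20 m
standard = segment_extents(60.0, seg_len=40.0, step=20.0)

print(custom)
print(standard)
```

In both cases the step is half the segment length, so each segment overlaps its neighbor by 50%; the customized settings simply produce shorter segments at a finer spacing.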

<a id='atl08'></a>
### D. Generate customized ATL06 elevations using ATL08 classifications

The [`icesat2.atl03sp`](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl03sp) and [`icesat2.atl06p`](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl06p) on-demand processing requests can also include the use of ATL08 data to classify and filter the photons returned and used in the ATL06 elevation calculation.  See the [icesat2 module user guide](https://slideruleearth.io/web/rtd/user_guide/ICESat-2.html#atl08-classification) for further documentation.

In the example below, the request to generate custom ATL06 elevations has been modified to use the ATL08 classifier.  The first modification is to drop the minimum required ATL03 confidence level down to CNF_NOT_CONSIDERED, which is the lowest confidence of a non-TEP photon.  (This effectively removes ATL03 confidence level filtering by including all non-TEP photons regardless of how the ATL03 processing labelled them).  The second modification is to include the `atl08_class` parameter and specify `atl08_ground` in the list of labels that should be used.  This causes the ATL06 elevation to only be calculated from photons labelled as ground by the ATL08 classification system.

In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')

# Build ATL06 Request Parameters
parms = {
    "poly": region["poly"],
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_NOT_CONSIDERED, # effectively remove atl03 filtering
    "ats": 7.0,
    "cnt": 10,
    "len": 20.0,
    "res": 10.0,
    "atl08_class": ["atl08_ground"] # specify ground photons only
}

# Make ATL06 Request
atl06 = icesat2.atl06p(parms)

In [None]:
atl06
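
Because the `atl08_class` labels are plain strings, a typo is easy to miss. The sketch below checks a request's labels against a known set before it is sent. The set of labels is an assumption based on the icesat2 user guide and should be verified there; the function name `unknown_classes` is hypothetical and not part of the SlideRule client.

```python
# Assumed set of ATL08 class labels; verify against the icesat2 user guide.
KNOWN_ATL08_CLASSES = {
    "atl08_noise",
    "atl08_ground",
    "atl08_canopy",
    "atl08_top_of_canopy",
    "atl08_unclassified",
}

def unknown_classes(parms):
    """Hypothetical helper: return any 'atl08_class' labels not in the known set."""
    return [label for label in parms.get("atl08_class", []) if label not in KNOWN_ATL08_CLASSES]

print(unknown_classes({"atl08_class": ["atl08_ground"]}))
print(unknown_classes({"atl08_class": ["atl08_ground", "atl08_grnd"]}))
```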

<a id='phoreal'></a>
### E. Generate customized ATL08 vegetation metrics

The [`icesat2.atl08p`](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl08p) function makes an on-demand processing request to SlideRule servers to generate customized ATL08 vegetation metrics and return them in a GeoDataFrame.  Documentation for this function can be found in the [API reference](https://slideruleearth.io/web/rtd/api_reference/icesat2.html#atl08p).  

In the example below, vegetation metrics are calculated for every 20m segment in the Grand Mesa region that passes the provided criteria (`ats` - along track spread, `cnt` - minimum number of photons in a segment).  The photons are classified using the ATL08 classification method, and in addition to ground photons, canopy and top of canopy photons are also included (which is necessary for the vegetation statistics).

The set of parameters specific to the ATL08 processing are provided under the `phoreal` key.  The name comes from the University of Texas team that developed PhoREAL and collaborated with us to get their algorithms into SlideRule. Documentation on the different parameters related to the vegetation calculations in PhoREAL can be found in the [user guide](https://slideruleearth.io/web/rtd/user_guide/ICESat-2.html#phoreal-parameters).

Lastly, the call to `icesat2.atl08p` makes the processing request by sending an HTTP request to SlideRule's servers and then waiting and accumulating the response from the servers into a GeoDataFrame, with each row representing a set of vegetation metrics for a custom 20m ATL03 segment.

In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')

# Build ATL08 Request Parameters
parms = { 
    "poly": region['poly'],
    "cnf": icesat2.CNF_NOT_CONSIDERED,
    "ats": 5.0,
    "cnt": 5,
    "len": 20.0,
    "res": 10.0,
    "atl08_class": [
        "atl08_ground", 
        "atl08_canopy", 
        "atl08_top_of_canopy"
    ],
    "phoreal": {
        "binsize": 1.0, 
        "geoloc": "center", 
        "use_abs_h": False, 
        "send_waveform": False
    }
}

# Make ATL08 Processing Request
atl08 = icesat2.atl08p(parms)

In [None]:
atl08

<a id='raster'></a>
### F. Sample rasters at points of interest

Many of the SlideRule processing APIs support sampling raster datasets at every point generated by the server-side processing.  For a detailed discussion of this capability see the [GeoRaster page](https://slideruleearth.io/web/rtd/user_guide/GeoRaster.html) and the [sampling parameters](https://slideruleearth.io/web/rtd/user_guide/SlideRule.html#raster-sampling) in the SlideRule documentation.

In the example below, an on-demand ATL06 processing request is made along with parameters that specify that the Harmonized Landsat Sentinel-2 (HLS) raster dataset and the GEDI L4B raster dataset are to be sampled at the location of every calculated ATL06 elevation.  The results are returned in a GeoDataFrame where each row contains both an elevation and the value of the sampled HLS and GEDI L4B rasters.

In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')
catalog = earthdata.stac(short_name="HLS", polygon=region["poly"], time_start="2022-01-01T00:00:00Z", time_end="2022-03-01T00:00:00Z", as_str=True)

# Build Sampling Request Parameters
samples = {
    "landsat": {
        "asset": "landsat-hls",
        "catalog": catalog,
        "closest_time": "2022-01-05T00:00:00Z", 
        "bands": ["NDVI"]
    },
    "gedi": {
        "asset": "gedil4b"
    }   
}

# Build ATL06 Request Parameters
parms = {
    "poly": region["poly"],
    "rgt": 737,
    "cycle": 16,
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_HIGH,
    "len": 20.0,
    "res": 20.0,
    "samples": samples
}

# Make ATL06 with Sampling Request
atl06 = icesat2.atl06p(parms)

In [None]:
atl06
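
The `NDVI` band requested above is a vegetation index computed from reflectance values. As a rough sketch, assuming it follows the standard definition (near-infrared minus red, divided by their sum), the value at a single pixel can be computed as below. The function name `ndvi` and the reflectance values are made up for illustration and do not come from HLS data or the SlideRule client.

```python
import math

def ndvi(nir, red):
    """Standard NDVI definition: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return math.nan  # undefined when both reflectances are zero
    return (nir - red) / total

# Made-up reflectance values for illustration only.
print(ndvi(0.5, 0.1))  # strong near-infrared response gives a positive index
print(ndvi(0.2, 0.2))  # equal reflectances give an index of zero
```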

<a id='gedi'></a>
### G. Subset GEDI L1B, L2A, L4A

The [`gedi`](https://slideruleearth.io/web/rtd/api_reference/gedi.html) module provides a handful of functions to make GEDI subsetting requests to SlideRule.  For documentation on those functions, see the [api reference](https://slideruleearth.io/web/rtd/api_reference/gedi.html).  For documentation on the general GEDI capabilities in SlideRule, see the [user guide](https://slideruleearth.io/web/rtd/user_guide/GEDI.html).

In the example below, a request to subset the GEDI L2A data for the Grand Mesa region is made.  The polygon representing the region is provided in the same way it is for the _icesat2_ processing requests. The `*_flag` settings are filters: data is only returned for footprints whose flags match the given values. The `beam` parameter specifies a single beam number, a list of beam numbers, or all of the beams (`gedi.ALL_BEAMS`).

The call to [`gedi.gedi02ap()`](https://slideruleearth.io/web/rtd/api_reference/gedi.html#gedi02ap) sends the HTTP request to the SlideRule servers and waits for and accumulates the response into a GeoDataFrame where each row represents a returned footprint.

```{tip}
While GEDI L3 and L4 data is all hosted in the cloud, the GEDI L1 and L2 data has not been migrated to the cloud yet and therefore the areas of interest that SlideRule can support are very limited.  To work around this, we have data for a few areas of interest staged in our own S3 bucket.  If you want to use GEDI L1 and/or L2 data for a given area of interest, please get in touch with the SlideRule team and we can work with you to have the data staged in our own bucket until the official products are migrated to the cloud.
```

In [None]:
# Build Region of Interest
region = sliderule.toregion('grandmesa.geojson')

# Build GEDI L2A Request Parameters
parms = {
    "poly": region["poly"],
    "degrade_flag": 0,
    "quality_flag": 1,
    "beam": 0
}

# Make GEDI L2A Request
gedi02a = gedi.gedi02ap(parms)

In [None]:
gedi02a
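
Conceptually, the `*_flag` settings act as equality filters on each footprint. The sketch below mimics that behavior on a few synthetic footprints. It is only an illustration of the idea: SlideRule applies these filters on its servers, and the footprint records and the `shot` field here are made up rather than real GEDI data.

```python
# Synthetic footprints for illustration only; these are not real GEDI data.
footprints = [
    {"shot": 1, "degrade_flag": 0, "quality_flag": 1},
    {"shot": 2, "degrade_flag": 1, "quality_flag": 1},
    {"shot": 3, "degrade_flag": 0, "quality_flag": 0},
    {"shot": 4, "degrade_flag": 0, "quality_flag": 1},
]

# Same flag values as the request above.
filters = {"degrade_flag": 0, "quality_flag": 1}

def matches(footprint, filters):
    """True when every filtered flag on the footprint equals the requested value."""
    return all(footprint.get(flag) == value for flag, value in filters.items())

kept = [fp for fp in footprints if matches(fp, filters)]
print([fp["shot"] for fp in kept])
```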

## III. Advanced Functionality
- Private clusters
- GeoParquet output to S3
- Customized YAPC classification
- Include ancillary data
- Query CMR, CMR-STAC, and TNM
- Directly sample supported raster datasets
- Subset via rasterized area of interest
- K-means clustering