# Using icepyx for GEDI

This notebook aims to explore the use of icepyx, specifically the variables module, for working with GEDI data.

The motivation was Mahsa's 20 Feb 2024 earthaccess presentation of a GEDI notebook, which noted some challenges with managing variables.
This notebook demonstrates that the icepyx variables module could be useful for managing GEDI variables as well.

Detailed tutorial for using icepyx's Variables module: https://icepyx.readthedocs.io/en/latest/example_notebooks/IS2_data_variables.html

The only change made to the underlying icepyx code was in `Variables.__init__`: the portion that extracts the dataset product and version (lines 83-107 on the dev branch) was commented out and replaced with `self._path=path`. icepyx reads the version and product from the ICESat-2 metadata and then validates them against a hard-coded list of ICESat-2 data products, so even if a valid version and product were extracted from a GEDI file, the checks would fail.
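
To illustrate why a check against a hard-coded product list rejects GEDI files, here is a minimal sketch. The function name `validate_product` and the three-item product set are hypothetical stand-ins chosen for illustration; they are not icepyx's actual code or its full product list.

```python
# Hypothetical stand-in for a hard-coded product check; not icepyx's actual code.
ICESAT2_PRODUCTS = {"ATL03", "ATL06", "ATL08"}  # illustrative subset only


def validate_product(product):
    """Return the upper-cased product name, or raise if it is not in the list."""
    if not isinstance(product, str):
        raise TypeError("product must be a string")
    name = product.upper()
    if name not in ICESAT2_PRODUCTS:
        raise ValueError(f"{product!r} is not a recognized ICESat-2 product")
    return name


print(validate_product("atl06"))

try:
    # A GEDI product name can never appear in an ICESat-2-only list.
    validate_product("GEDI04_A")
except ValueError as err:
    print(f"rejected: {err}")
```

Any check of this shape fails for GEDI regardless of how cleanly the product name is extracted, which is why the workaround here skips the extraction step entirely.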

The notebook also includes a few notes on where I (as a user) wanted better earthaccess docs, offered as input to the ongoing docs improvements and restructure (https://github.com/nsidc/earthaccess/tree/update-documentation). My hope is that this notebook can seed a basic tutorial notebook that shows how to use earthaccess to search for a dataset when all one knows is the sensor name.

In [1]:
%load_ext autoreload
import icepyx as ipx
%autoreload 2

In [2]:
import earthaccess

In [4]:
auth = earthaccess.login()

Enter your Earthdata Login username:  icepyx_devteam
Enter your Earthdata password:  ········


### earthaccess Notes/challenges:
- navigating the output from `search_datasets` (`.summary()` works only on a single entry of the list returned by `.search_datasets`, and `keyword="GEDI"` returns 22 datasets)
- terms that are a bit jargony: concept_ID vs shortname, collection vs dataset
- docs need examples for running the various functions, and clarity on which things are classes, methods, kwargs, etc. For instance, on https://earthaccess.readthedocs.io/en/latest/user-reference/collections/collections-query/, the right-hand TOC presents in a single list the `CollectionQuery` class, the `keyword` kwarg (which cannot be passed to `CollectionQuery`), and the `parameters` method (for the `DataCollections` class object).
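
One way to make a long list of search results easier to scan is to call `.summary()` on every entry and keep only a few fields. The sketch below assumes each entry exposes a `.summary()` method returning a dict with the keys shown in this notebook's output; the helper `summarize_datasets` and the `FakeDataset` stand-in (used so the example runs offline) are hypothetical names, not part of earthaccess.

```python
def summarize_datasets(datasets, fields=("short-name", "concept-id", "version")):
    """Collect selected summary fields from each dataset entry as a tuple."""
    rows = []
    for dataset in datasets:
        info = dataset.summary()
        # Missing keys become empty strings so every row has the same length.
        rows.append(tuple(info.get(field, "") for field in fields))
    return rows


class FakeDataset:
    """Offline stand-in for one search result; only mimics .summary()."""

    def __init__(self, info):
        self._info = info

    def summary(self):
        return dict(self._info)


# The first entry copies values from this notebook's output; the second is a
# placeholder that only demonstrates the handling of missing keys.
datasets = [
    FakeDataset({"short-name": "GEDI_L4A_AGB_Density_V2_1_2056",
                 "concept-id": "C2237824918-ORNL_CLOUD",
                 "version": "2.1"}),
    FakeDataset({"short-name": "PLACEHOLDER_SHORT_NAME"}),
]

for row in summarize_datasets(datasets):
    print(" | ".join(row))
```

With a live session, the same helper could be given the list returned by `earthaccess.search_datasets(keyword="GEDI")` instead of the stand-ins.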

In [94]:
# How/why would I use the collection_query class (as a user)? Or is it the underlying object type I don't call directly?
coll = earthaccess.collection_query()
coll

<earthaccess.search.DataCollections at 0x14654f920>

## Identifying a dataset

You can search for individual data products using the `keyword="string"` argument.
This is helpful if you know the name of the sensor or satellite but not the specific product name syntax.

`earthaccess.search_datasets()` will return a list of datasets, where each entry is a dictionary containing CMR metadata...


In [93]:
earthaccess.search_datasets(keyword="GEDI")[0].summary()

Datasets found: 22


{'short-name': 'GEDI_L4A_AGB_Density_V2_1_2056',
 'concept-id': 'C2237824918-ORNL_CLOUD',
 'version': '2.1',
 'file-type': "[{'FormatType': 'Native', 'Fees': '0', 'Format': 'HDF5'}]",
 'get-data': ['https://daac.ornl.gov/daacdata/gedi/GEDI_L4A_AGB_Density_V2_1/'],
 'cloud-info': {'Region': 'us-west-2',
  'S3BucketAndObjectPrefixNames': ['s3://ornl-cumulus-prod-protected/gedi/GEDI_L4A_AGB_Density_V2_1/'],
  'S3CredentialsAPIEndpoint': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',
  'S3CredentialsAPIDocumentationURL': 'https://data.ornldaac.earthdata.nasa.gov/s3credentialsREADME'}}

In [25]:
# from https://github.com/nasa/GEDI-Data-Resources/blob/main/python/tutorials/GEDI_Finder_Tutorial_Python.ipynb
product = 'GEDI_L4A_AGB_Density_V2_1_2056'  # passed as short_name below (not a concept ID); options from the source tutorial include 'GEDI01_B.002', 'GEDI02_A.002', 'GEDI02_B.002'
bbox = (-73.65, -12.64, -47.81, 9.7)  # bounding box as (lower-left longitude, lower-left latitude, upper-right longitude, upper-right latitude)
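
Because the bounding box is a bare tuple, a value in the wrong position is easy to miss. The sketch below is a small sanity check for the (lower-left longitude, lower-left latitude, upper-right longitude, upper-right latitude) order; `check_bbox` is a hypothetical helper, not an earthaccess function. Longitude ordering is deliberately not enforced, since a box crossing the antimeridian can have a western edge greater than its eastern edge.

```python
def check_bbox(bbox):
    """Validate a (ll_lon, ll_lat, ur_lon, ur_lat) box given in decimal degrees."""
    if len(bbox) != 4:
        raise ValueError("bbox needs exactly four values")
    ll_lon, ll_lat, ur_lon, ur_lat = bbox
    if not (-180 <= ll_lon <= 180 and -180 <= ur_lon <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= ll_lat <= 90 and -90 <= ur_lat <= 90):
        raise ValueError("latitudes must be within [-90, 90]")
    if ll_lat > ur_lat:
        raise ValueError("lower-left latitude exceeds upper-right latitude")
    return tuple(float(value) for value in bbox)


# The box used in this notebook passes the check unchanged.
print(check_bbox((-73.65, -12.64, -47.81, 9.7)))
```

This catches out-of-range values and inverted latitudes, but it cannot detect every swap of latitude and longitude, so it complements reading the tuple carefully rather than replacing it.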

In [52]:
results = earthaccess.search_data(
    short_name=product,
    temporal=('2023-03-01', '2023-03-02'),
    bounding_box=bbox
)

Granules found: 5


In [54]:
results[0]

In [56]:
downloaded_files = earthaccess.download(
    results[0],
    local_path='./GEDI/',
)

 Getting 1 granules, approx download size: 0.15 GB


QUEUEING TASKS | :   0%|          | 0/2 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/2 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/2 [00:00<?, ?it/s]

In [58]:
%ls ./GEDI/

GEDI04_A_2023060022721_O23871_01_T01222_02_003_01_V002.h5
GEDI04_A_2023060022721_O23871_01_T01222_02_003_01_V002.h5.sha256


In [66]:
vars = ipx.Variables(path="./GEDI/GEDI04_A_2023060022721_O23871_01_T01222_02_003_01_V002.h5")

In [77]:
vars._product = "GEDI"  # trick icepyx: _product is normally set from the metadata, but that code was commented out

In [67]:
vars.path

'./GEDI/GEDI04_A_2023060022721_O23871_01_T01222_02_003_01_V002.h5'

In [68]:
vars.avail()

['ANCILLARY/model_data',
 'ANCILLARY/pft_lut',
 'ANCILLARY/region_lut',
 'BEAM0000/agbd',
 'BEAM0000/agbd_pi_lower',
 'BEAM0000/agbd_pi_upper',
 'BEAM0000/agbd_prediction/agbd_a1',
 'BEAM0000/agbd_prediction/agbd_a10',
 'BEAM0000/agbd_prediction/agbd_a2',
 'BEAM0000/agbd_prediction/agbd_a3',
 'BEAM0000/agbd_prediction/agbd_a4',
 'BEAM0000/agbd_prediction/agbd_a5',
 'BEAM0000/agbd_prediction/agbd_a6',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a1',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a10',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a2',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a3',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a4',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a5',
 'BEAM0000/agbd_prediction/agbd_pi_lower_a6',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a1',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a10',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a2',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a3',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a4',
 'BEAM0000/agbd_prediction/agbd_pi_upper_a

In [72]:
vars.avail(options=True)

var_list inputs: model_data, pft_lut, region_lut, agbd, agbd_pi_lower, agbd_pi_upper, agbd_a1, agbd_a10, agbd_a2, agbd_a3, agbd_a4, agbd_a5, agbd_a6, agbd_pi_lower_a1, agbd_pi_lower_a10, agbd_pi_lower_a2, agbd_pi_lower_a3, agbd_pi_lower_a4, agbd_pi_lower_a5, agbd_pi_lower_a6, agbd_pi_upper_a1, agbd_pi_upper_a10, agbd_pi_upper_a2, agbd_pi_upper_a3, agbd_pi_upper_a4, agbd_pi_upper_a5, agbd_pi_upper_a6, agbd_se_a1, agbd_se_a10, agbd_se_a2, agbd_se_a3, agbd_se_a4, agbd_se_a5, agbd_se_a6, agbd_t_a1, agbd_t_a10, agbd_t_a2, agbd_t_a3, agbd_t_a4, agbd_t_a5, agbd_t_a6, agbd_t_pi_lower_a1, agbd_t_pi_lower_a10, agbd_t_pi_lower_a2, agbd_t_pi_lower_a3, agbd_t_pi_lower_a4, agbd_t_pi_lower_a5, agbd_t_pi_lower_a6, agbd_t_pi_upper_a1, agbd_t_pi_upper_a10, agbd_t_pi_upper_a2, agbd_t_pi_upper_a3, agbd_t_pi_upper_a4, agbd_t_pi_upper_a5, agbd_t_pi_upper_a6, agbd_t_se_a1, agbd_t_se_a10, agbd_t_se_a2, agbd_t_se_a3, agbd_t_se_a4, agbd_t_se_a5, agbd_t_se_a6, algorithm_run_flag_a1, algorithm_run_flag_a10, algor
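
The names listed under `var_list inputs` match the final component of each path returned by `avail()`, while the leading components are the group names. As a rough sketch of that relationship (not icepyx's implementation), the hypothetical helper `split_paths` below separates the two for a few of the paths shown above.

```python
def split_paths(paths):
    """Split HDF5-style paths into unique variable names and unique group names."""
    var_names = []
    group_names = []
    for path in paths:
        # Everything before the last "/" is a group; the last piece is the variable.
        *groups, name = path.split("/")
        if name not in var_names:
            var_names.append(name)
        for group in groups:
            if group not in group_names:
                group_names.append(group)
    return var_names, group_names


paths = [
    "ANCILLARY/model_data",
    "BEAM0000/agbd",
    "BEAM0000/agbd_prediction/agbd_a1",
]
var_names, group_names = split_paths(paths)
print("variables:", var_names)
print("groups:", group_names)
```

Lists are used instead of sets so the names keep the order in which they first appear in the file.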

In [79]:
vars.append(var_list=["shot_number","beam","lat_lowestmode_a1","lon_lowestmode_a1"])

In [80]:
vars.wanted

{'shot_number': ['BEAM0000/agbd_prediction/shot_number',
  'BEAM0001/agbd_prediction/shot_number',
  'BEAM0010/agbd_prediction/shot_number',
  'BEAM0011/agbd_prediction/shot_number',
  'BEAM0101/agbd_prediction/shot_number',
  'BEAM0110/agbd_prediction/shot_number',
  'BEAM1000/agbd_prediction/shot_number',
  'BEAM1011/agbd_prediction/shot_number'],
 'beam': ['BEAM0000/beam',
  'BEAM0001/beam',
  'BEAM0010/beam',
  'BEAM0011/beam',
  'BEAM0101/beam',
  'BEAM0110/beam',
  'BEAM1000/beam',
  'BEAM1011/beam'],
 'lat_lowestmode_a1': ['BEAM0000/geolocation/lat_lowestmode_a1',
  'BEAM0001/geolocation/lat_lowestmode_a1',
  'BEAM0010/geolocation/lat_lowestmode_a1',
  'BEAM0011/geolocation/lat_lowestmode_a1',
  'BEAM0101/geolocation/lat_lowestmode_a1',
  'BEAM0110/geolocation/lat_lowestmode_a1',
  'BEAM1000/geolocation/lat_lowestmode_a1',
  'BEAM1011/geolocation/lat_lowestmode_a1'],
 'lon_lowestmode_a1': ['BEAM0000/geolocation/lon_lowestmode_a1',
  'BEAM0001/geolocation/lon_lowestmode_a1',
  'B

In [81]:
vars.remove(all=True)

In [84]:
vars.append(var_list=["shot_number","beam","lat_lowestmode_a1","lon_lowestmode_a1"], keyword_list=["BEAM0010"])

In [85]:
vars.wanted

{'shot_number': ['BEAM0010/agbd_prediction/shot_number'],
 'beam': ['BEAM0010/beam'],
 'lat_lowestmode_a1': ['BEAM0010/geolocation/lat_lowestmode_a1'],
 'lon_lowestmode_a1': ['BEAM0010/geolocation/lon_lowestmode_a1']}
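
The effect of adding `keyword_list=["BEAM0010"]` can be approximated in plain Python as a filter on the path components. The sketch below is only an illustration of the idea under that assumption; `build_wanted` is a hypothetical helper, and icepyx's actual matching rules may differ.

```python
def build_wanted(paths, var_list, keyword_list=None):
    """Group paths by variable name, optionally keeping only matching groups."""
    wanted = {}
    for path in paths:
        *groups, name = path.split("/")
        if name not in var_list:
            continue
        # With keywords given, keep a path only if one of its groups matches.
        if keyword_list and not any(key in groups for key in keyword_list):
            continue
        wanted.setdefault(name, []).append(path)
    return wanted


paths = [
    "BEAM0000/beam",
    "BEAM0010/beam",
    "BEAM0000/geolocation/lat_lowestmode_a1",
    "BEAM0010/geolocation/lat_lowestmode_a1",
]

print(build_wanted(paths, ["beam", "lat_lowestmode_a1"]))
print(build_wanted(paths, ["beam", "lat_lowestmode_a1"], keyword_list=["BEAM0010"]))
```

Without keywords every beam's path is kept, mirroring the longer `wanted` dictionary earlier in the notebook; with `BEAM0010` only that beam's paths remain.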