# Direct Access to LPDAAC GEDI Products
Authors: Alex Mandel (Development Seed), Brian Freitag (NASA MSFC)

Description: In this tutorial, we demonstrate how to transform HTTPS links into their corresponding S3 links to retrieve GEDI data hosted by the Land Processes Distributed Active Archive Center (LP DAAC).

## Run This Notebook
To access and run this tutorial within MAAP's Algorithm Development Environment (ADE), please refer to the ["Getting started with the MAAP"](https://docs.maap-project.org/en/latest/getting_started/getting_started.html) section of our documentation.

Disclaimer: We highly recommend running this tutorial within MAAP's ADE, which already includes MAAP-specific packages such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.

## Additional Resources
- [Searching Granules in CMR](../docs/source/technical_tutorials/search/granules.ipynb)
- [Searching Collections in CMR](../docs/source/technical_tutorials/search/collections.ipynb)

## Importing Packages


In [1]:
import os
from maap.maap import MAAP

maap = MAAP(maap_host="api.maap-project.org")

## Searching the Data

We'll start by gathering a sample list of granules from the GEDI L2A collection. The HTTPS links we're after are nested within the granule object.

In [2]:
results = maap.searchGranule(
    concept_id="C1908348134-LPDAAC_ECS",  # GEDI-L2A
    cmr_host="cmr.earthdata.nasa.gov",
    limit=10,
)

# Download URL of GEDI L2A product
print(results[0].getDownloadUrl())

https://e4ftl01.cr.usgs.gov//GEDI_L1_L2/GEDI/GEDI02_A.002/2019.04.18/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5
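Before converting the link, it helps to see how the URL breaks into components. The sketch below uses only the sample URL printed above and plain string operations; the variable names are illustrative and not part of maap-py.

```python
# Sample download URL copied from the output above.
url = (
    "https://e4ftl01.cr.usgs.gov//GEDI_L1_L2/GEDI/GEDI02_A.002/2019.04.18/"
    "GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5"
)

# Splitting on "/" yields empty strings wherever two slashes are adjacent
# (inside "https://" and at the doubled slash after the host name).
components = url.split("/")
for index, component in enumerate(components):
    print(index, repr(component))

# Counting from the end does not depend on how many slashes precede the path.
collection, date_dir, filename = components[-3:]
print(collection, date_dir, filename)
```

The collection directory and the filename are the two pieces the S3 link needs; the date directory is not used.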


## Converting the Paths
We'll create a helper function that converts the HTTPS links to their AWS S3 equivalents.

In [3]:
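def lpdaac_gedi_https_to_s3(url):
    # Expected URL layout: .../<collection>/<date>/<filename>.h5
    collection, _, filename = url.split("/")[-3:]
    # splitext drops only the ".h5" extension; str.strip(".h5") strips characters instead
    granule = os.path.splitext(filename)[0]
    return f"s3://lp-prod-protected/{collection}/{granule}/{filename}"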


# Sample
lpdaac_gedi_https_to_s3(results[0].getDownloadUrl())

's3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5'
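For links from less predictable sources, a stricter variant may be useful. The following is a minimal sketch, assuming the same `<collection>/<date>/<filename>.h5` layout as the sample URL; `https_to_s3_sketch` is a hypothetical name and not part of maap-py. It parses the URL with the standard library and rejects input that is not an HTTPS link to an `.h5` file.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

BUCKET = "lp-prod-protected"  # bucket name taken from the helper above


def https_to_s3_sketch(url):
    """Hypothetical, stricter variant of the conversion helper."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError(f"expected an https URL, got: {url}")

    path = PurePosixPath(parsed.path)
    if path.suffix != ".h5":
        raise ValueError(f"expected a path ending in .h5, got: {parsed.path}")

    # Assumed layout: .../<collection>/<date>/<filename>.h5
    collection = path.parent.parent.name
    # path.stem is the filename without its final extension.
    return f"s3://{BUCKET}/{collection}/{path.stem}/{path.name}"


sample = (
    "https://e4ftl01.cr.usgs.gov//GEDI_L1_L2/GEDI/GEDI02_A.002/2019.04.18/"
    "GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5"
)
print(https_to_s3_sketch(sample))
```

Raising `ValueError` on unexpected input is a design choice for this sketch: it surfaces a malformed link at conversion time rather than later, during the download.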

## Downloading the Data
First, we'll create a directory to store the downloaded files.

In [4]:
# set data directory
dataDir = "./data"

# create the directory if it does not already exist
os.makedirs(dataDir, exist_ok=True)

Next, we overwrite the granule's `_location` attribute with the S3 link so that the `getData` function of maap-py retrieves the file from S3 instead of over HTTPS.

In [5]:
results[0]._location = lpdaac_gedi_https_to_s3(results[0]._location)
results[0].getData(dataDir)

'./data/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5'
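In this tutorial `getData` receives the whole S3 link, but other S3 tooling often expects the bucket and the object key as separate values. The sketch below shows one way to split a link with the standard library; `split_s3_url` is a hypothetical helper, not part of maap-py, and the sample link is the one produced earlier in this tutorial.

```python
from urllib.parse import urlparse


def split_s3_url(s3_url):
    """Hypothetical helper: return (bucket, key) for an s3:// link."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 link: {s3_url}")
    # The bucket is the "host" part; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


s3_link = (
    "s3://lp-prod-protected/GEDI02_A.002/"
    "GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002/"
    "GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5"
)
bucket, key = split_s3_url(s3_link)
print(bucket)
print(key)
```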

We can also convert every result in the list, skipping any whose location already points to S3:

In [6]:
for result in results:
    if not result._location.startswith("s3"):
        result._location = lpdaac_gedi_https_to_s3(result._location)
    print(result._location)
    # result.getData(dataDir)

s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_03_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_03_T03909_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_02_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_02_T03909_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_04_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_04_T03909_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108015253_O01960_01_T03910_02_003_01_V002/GEDI02_A_2019108015253_O01960_01_T03910_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108015253_O01960_02_T03910_02_003_01_V002/GEDI02_A_2019108015253_O01960_02_T03910_02_003_01_V002.h5
s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108015253_O01960_03_T03910_02_003_01_V002/GEDI02_A_201910