# Accessing the MAAP CMR STAC with R

Authors: Harshini Girish (UAH), Sheyenne Kirkland (UAH), Alex Mandel (DevSeed), Henry Rodman (DevSeed)

Date: December 11, 2024

Description: In this notebook, we'll use `rstac` to search for collections and associated items within the [MAAP STAC Catalog](https://stac.maap-project.org/).

## Run This Notebook

To access and run this tutorial within MAAP's Algorithm Development Environment (ADE), please refer to the ["Getting started with the MAAP"](https://docs.maap-project.org/en/latest/getting_started/getting_started.html) section of our documentation.

Disclaimer: We highly recommend running this tutorial within MAAP's ADE, which already includes MAAP-specific packages such as `maap-py`. Running the tutorial outside of the MAAP ADE may lead to errors. Users should work within the "R/Python" workspace.

## Additional Resources

- [How do I find data using R?](https://nasa-openscapes.github.io/earthdata-cloud-cookbook/how-tos/find-data/find-r.html)
  - A resource from NASA Openscapes that shows users how to search for NASA data in R and authenticate with the `earthdatalogin` package. It also shows how to find data stored in NASA STACs (SpatioTemporal Asset Catalogs).
- [rstac: Client Library for SpatioTemporal Asset Catalog](https://cran.r-project.org/web/packages/rstac/index.html)
  - A page with materials for the `rstac` library.
- [Searching the STAC Catalog (MAAP Docs)](https://docs.maap-project.org/en/latest/technical_tutorials/search/searching_the_stac_catalog.html)
  - A notebook in the MAAP Docs that shows users how to search the MAAP STAC using Python.

## Install/Load Packages

Let's install and load the packages necessary for this tutorial.

In [None]:
install.packages("rstac")

library(rstac)

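## Initializing the MAAP STAC Endpoint

Before searching, we'll create an `rstac` query object that points at the MAAP STAC endpoint. Calling `stac()` only builds the query; no request is sent until `get_request()` is called. Printing the object shows the base URL and the fields it carries.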

In [138]:
# Define the MAAP STAC endpoint
stac_endpoint <- stac("https://stac.maap-project.org/")

# Display the STAC endpoint metadata
cat("STAC Endpoint Metadata:\n")
print(stac_endpoint)

STAC Endpoint Metadata:
###rstac_query
- url: https://stac.maap-project.org/
- params:
- field(s): version, base_url, endpoint, params, verb, encode


## Fetching and Displaying STAC Collections

This code requests the list of collections from the STAC (SpatioTemporal Asset Catalog) endpoint and prints the `id` and `title` of each one, so you can pick a collection to query later.

In [139]:
collections <- stac_endpoint |>
    collections() |>
    get_request()
# Ensure collections are retrieved
if (!is.null(collections$collections)) {
    # Extract collection IDs and titles
    collection_info <- lapply(collections$collections, function(x) {
        list(id = x$id, title = x$title)
    })
    # Display the collection information
    for (i in seq_along(collection_info)) {
        cat("Collection ID:", collection_info[[i]]$id, "\n")
        cat("Title:", collection_info[[i]]$title, "\n\n")
    }
} else {
    cat("No collections found or error retrieving collections.\n")
}

Collection ID: Landsat8_SurfaceReflectance 
Title: Landsat 8 Operational Land Imager (OLI) Surface Reflectance Analysis Ready Data (ARD) V1, Peru and Equatorial Western Africa, April 2013-January 2020 

Collection ID: Global_PALSAR2_PALSAR_FNF 
Title: Global 25m Resolution PALSAR-2/PALSAR Forest/Non-Forest Map 

Collection ID: Global_Forest_Change_2000-2017 
Title: Global Forest Change 2000-2017 

Collection ID: AFRISAR_DLR2 
Title: AFRISAR_DLR2 

Collection ID: GlobCover_09 
Title: GlobCover Global Land Cover Product (2009) 

Collection ID: AfriSAR_UAVSAR_KZ 
Title: AfriSAR UAVSAR Vertical Wavenumber (KZ) Generated Using NISAR Tools 

Collection ID: AfriSAR_UAVSAR_Ungeocoded_Covariance 
Title: AfriSAR UAVSAR Ungeocoded Covariance Matrix product Generated Using NISAR Tools 

Collection ID: AfriSAR_UAVSAR_Normalization_Area 
Title: AfriSAR UAVSAR Normalization Area Generated Using NISAR Tools 

Collection ID: AfriSAR_UAVSAR_Geocoded_SLC 
Title: AfriSAR UAVSAR Geocoded SLCs Generated Usi
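
A long printed list like the one above can be hard to scan. As a minimal sketch, the same `id` and `title` fields can be gathered into a data frame instead. The `mock_collections` list below is a hypothetical stand-in for `collections$collections` so the example runs without a network connection, and `field_or_na()` is a helper introduced here, not part of `rstac`.

```r
# Hypothetical stand-in for `collections$collections`.
# The second entry deliberately has no title.
mock_collections <- list(
  list(id = "GlobCover_09", title = "GlobCover Global Land Cover Product (2009)"),
  list(id = "example_collection_without_title"),
  list(id = "Global_Forest_Change_2000-2017", title = "Global Forest Change 2000-2017")
)

# Return a field as a string, or NA when the field is absent.
field_or_na <- function(x, field) {
  value <- x[[field]]
  if (is.null(value)) NA_character_ else as.character(value)
}

# One row per collection.
collection_table <- data.frame(
  id = vapply(mock_collections, field_or_na, character(1), field = "id"),
  title = vapply(mock_collections, field_or_na, character(1), field = "title"),
  stringsAsFactors = FALSE
)

print(collection_table)
```

With real results, `mock_collections` would be replaced by `collections$collections`. Using `vapply()` with an explicit `NA` fallback keeps every column the same length even when a collection lacks a title.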

## Assigning and Selecting a STAC Collection ID
This code picks a single collection from the list retrieved above by its position (here, the 21st entry) and stores its ID for the searches that follow.

In [140]:
# Assign collection ID
if (!is.null(collections$collections)) {
    # Choose a specific collection by position (here, the 21st in the list)
    collection_id <- collections$collections[[21]]$id
    cat("Selected Collection ID:", collection_id, "\n")
} else {
    stop("No collections found.")
}

Selected Collection ID: ESACCI_Biomass_L4_AGB_V4_100m 
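
Selecting by position depends on the order in which the server returns collections, which may change over time. A minimal sketch of looking a collection up by its ID instead is shown below. It assumes the same list-of-lists shape as `collections$collections`; `mock_collections` and `target_id` are names introduced here for illustration.

```r
# Hypothetical stand-in for `collections$collections`.
mock_collections <- list(
  list(id = "GlobCover_09"),
  list(id = "ESACCI_Biomass_L4_AGB_V4_100m"),
  list(id = "AFRISAR_DLR2")
)

target_id <- "ESACCI_Biomass_L4_AGB_V4_100m"

# Collect every ID, then find where the target sits in the list.
all_ids <- vapply(mock_collections, function(x) x$id, character(1))
position <- match(target_id, all_ids)

if (is.na(position)) {
  stop("Collection not found: ", target_id)
}

selected <- mock_collections[[position]]
collection_id <- selected$id
cat("Selected Collection ID:", collection_id, "\n")
```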


## Searching and Retrieving Items from a STAC Collection
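This code searches for items in the selected STAC collection using the `stac_search()` function. The request is wrapped in `tryCatch()`, so a failed query prints the error message and returns `NULL` instead of stopping the notebook. It then prints details such as item IDs, dates, and associated links for the first few items. If no items are found, it prints a message saying so. The response holds one page of results (10 items in the output below).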

In [141]:
# Perform an item search for the selected collection
items <- tryCatch({
    stac_endpoint |>
        stac_search(collections = collection_id) |>
        get_request()
}, error = function(e) {
    cat("Error fetching items:", e$message, "\n")
    NULL
})
print(items)

# Process and display item information
if (!is.null(items) && !is.null(items$features)) {
    cat("Found", length(items$features), "items:\n\n")
    # Display details of the first few items
    for (i in seq_len(min(5, length(items$features)))) {
        item <- items$features[[i]]
        cat("Item ID:", item$id, "\n")
        cat("Date:", item$properties$datetime, "\n")
        cat("Links:", paste(sapply(item$links, function(x) x$href), collapse = ", "), "\n\n")
    }
} else {
    cat("No items found for collection:", collection_id, "\n")
}

# Check and print the number of items retrieved
if (!is.null(items) && !is.null(items$features)) {
    num_items <- length(items$features)
    cat("Number of items retrieved:", num_items, "\n")
} else {
    cat("No items found for collection:", collection_id, "\n")
}

###Items
- features (10 item(s)):
  - S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S50W070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S50W060_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S50W040_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S50E070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S50E060_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S40W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S40W070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S40E170_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
  - S40E160_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
- assets: estimates, standard_deviation
- item's fields:
assets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type
Found 10 items:

Item ID: S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 
Date: 2020-01-01T00:00:00+00:00 
Links: NULL, https://stac.maap-project.org/collections/ESACCI_Biomass_L4_AGB_V4_100m, https://
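
The `NULL` at the start of the `Links:` output above most likely comes from a link entry that has no `href` field: `sapply()` then returns a list, and `paste()` renders the missing value as the text "NULL". A minimal sketch of a `NULL`-safe version follows. The `mock_item` object and its URLs are hypothetical and only mimic the shape of one element of `items$features`.

```r
# Hypothetical item with one link that has no `href`.
mock_item <- list(
  id = "example-item",
  links = list(
    list(rel = "preview"),
    list(rel = "collection", href = "https://example.org/collections/demo"),
    list(rel = "self", href = "https://example.org/collections/demo/items/example-item")
  )
)

# vapply() guarantees a character vector; missing hrefs become NA.
hrefs <- vapply(
  mock_item$links,
  function(link) if (is.null(link$href)) NA_character_ else link$href,
  character(1)
)

# Drop the missing entries before printing.
hrefs <- hrefs[!is.na(hrefs)]
cat("Links:", paste(hrefs, collapse = ", "), "\n")
```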

## Extracting and Displaying Assets from a STAC Item
This code extracts the assets (downloadable data resources) from the first item in the STAC search results.

In [142]:
# Extract the first item's assets

first_item <- items$features[[1]]
assets <- first_item$assets
print(first_item)

# Display the available assets
print(names(assets))  # List of asset types

###Item
- id: S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0
- collection: ESACCI_Biomass_L4_AGB_V4_100m
- bbox:
xmin: -80.00000, ymin: -60.00000, xmax: -70.00000, ymax: -50.00000
- datetime: 2020-01-01T00:00:00+00:00
- assets: estimates, standard_deviation
- item's fields:
assets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type
[1] "estimates"          "standard_deviation"


## Listing and Displaying Asset URLs from a STAC Item
This loop iterates through all available assets in the STAC item and prints each asset's name and its corresponding URL.

In [143]:
for (asset_name in names(assets)) {
    cat("Asset:", asset_name, "\n")
    cat("URL:", assets[[asset_name]]$href, "\n\n")
}

Asset: estimates 
URL: s3://nasa-maap-data-store/file-staging/nasa-map/ESACCI_Biomass_L4_AGB_V4_100m_2020/S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0.tif 

Asset: standard_deviation 
URL: s3://nasa-maap-data-store/file-staging/nasa-map/ESACCI_Biomass_L4_AGB_V4_100m_2020/S50W080_ESACCI-BIOMASS-L4-AGB_SD-MERGED-100m-2020-fv4.0.tif 
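
The asset URLs above use the `s3://` scheme. GDAL-based readers generally expect the `/vsis3/` virtual file system prefix instead, so a small conversion step can be useful before opening a file. The sketch below uses only base R. `s3_to_vsis3()` is a helper introduced here, and reading the file afterwards with a GDAL-based package is an assumption that also depends on having valid AWS credentials in your session.

```r
# Asset URL copied from the output above.
s3_url <- "s3://nasa-maap-data-store/file-staging/nasa-map/ESACCI_Biomass_L4_AGB_V4_100m_2020/S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0.tif"

# Swap the s3:// scheme for GDAL's /vsis3/ prefix (works on vectors too).
s3_to_vsis3 <- function(url) {
  stopifnot(all(startsWith(url, "s3://")))
  sub("^s3://", "/vsis3/", url)
}

vsi_path <- s3_to_vsis3(s3_url)

# Split the URL into bucket and object key, which some S3 clients need.
without_scheme <- sub("^s3://", "", s3_url)
bucket <- sub("/.*$", "", without_scheme)
key <- sub("^[^/]+/", "", without_scheme)

cat("GDAL path:", vsi_path, "\n")
cat("Bucket:", bucket, "\n")
cat("Key:", key, "\n")
```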



## Performing a Focused Search Using the MAAP STAC API

This code performs a search query and retrieves items from the MAAP STAC. The search is configured with the following parameters:

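- Collection (`collections`): Specifies the dataset to search within.
- Temporal Range (`datetime`): Filters items to a date range, written as a `start/end` interval.
- Bounding Box (`bbox`): Spatially filters items to an area of interest, given as `xmin, ymin, xmax, ymax` in degrees of longitude and latitude.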

In [146]:
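# Time window as a start/end interval
datetime <- "2020-01-01T00:00:00Z/2020-01-31T23:59:59Z"   # YYYY-MM-DDTHH:MM:SSZ/YYYY-MM-DDTHH:MM:SSZ
# Area of interest: xmin, ymin, xmax, ymax (degrees)
bbox <- c(-74, -57, -18, -5.8)

# Build the search against the MAAP STAC and send the request
stac_query <- rstac::stac(
    'https://stac.maap-project.org/'
) |>
  rstac::stac_search(
    collections = collection_id,
    bbox = bbox,
    datetime = datetime
  ) |>
  rstac::get_request()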

# stac_query  # uncomment to print the full search result

# Flatten each returned item into one data frame row
results <- lapply(
  stac_query$features, 
  \(x) data.frame(collection = x$collection, id = x$id, datetime = x$properties$datetime, desc = x$assets$estimates$description)
) |> 
  do.call(what = rbind)

results

| collection | id | datetime | desc |
|---|---|---|---|
| `<chr>` | `<chr>` | `<chr>` | `<chr>` |
| ESACCI_Biomass_L4_AGB_V4_100m | S50W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S50W070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S50W060_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S50W040_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S40W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S40W070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S30W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S30W070_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S30W060_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
| ESACCI_Biomass_L4_AGB_V4_100m | S20W080_ESACCI-BIOMASS-L4-AGB-MERGED-100m-2020-fv4.0 | 2020-01-01T00:00:00+00:00 | Cloud Optimized GeoTIFF of AGB estimates |
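
The `datetime` and `bbox` values used in the focused search were typed by hand, which makes it easy to swap coordinates or mistype a timestamp. As a minimal sketch, the two helpers below build the interval string from plain dates and run basic sanity checks on a bounding box before it is sent. `make_interval()` and `check_bbox()` are names introduced here for illustration and are not part of `rstac`.

```r
# Build a "start/end" interval covering whole days in UTC.
make_interval <- function(start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  stopifnot(start_date <= end_date)
  paste0(
    format(start_date, "%Y-%m-%d"), "T00:00:00Z/",
    format(end_date, "%Y-%m-%d"), "T23:59:59Z"
  )
}

# Check that a bbox is xmin, ymin, xmax, ymax within valid degree ranges.
# Longitude order is not enforced, since a box may cross the antimeridian.
check_bbox <- function(bbox) {
  stopifnot(
    is.numeric(bbox), length(bbox) == 4,
    bbox[1] >= -180, bbox[3] <= 180,
    bbox[2] >= -90, bbox[4] <= 90,
    bbox[2] < bbox[4]
  )
  invisible(bbox)
}

datetime <- make_interval("2020-01-01", "2020-01-31")
bbox <- check_bbox(c(-74, -57, -18, -5.8))

cat("datetime:", datetime, "\n")
cat("bbox:", paste(bbox, collapse = ", "), "\n")
```

The resulting `datetime` and `bbox` objects have the same form as the hand-written values and could be passed to `stac_search()` unchanged.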


Additionally, we can create a list of URLs associated with the items from our focused search.

In [147]:
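# S3 URLs of the "estimates" asset for each item
s3_urls <- sapply(stac_query$features, function(x) {x$assets$estimates$href})
s3_urls

# HTTP URLs: takes the fifth entry of each item's links, so it depends on the order the server returns them
http_urls <- sapply(stac_query$features, function(x) {x$links[[5]]$href})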
http_urls
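
Indexing links by position works only as long as every item lists its links in the same order. A minimal sketch of a more defensive approach is to look a link up by its `rel` value instead. The `mock_features` list and its URLs below are hypothetical and only mimic the shape of `stac_query$features`; `link_href()` is a helper introduced here, and which `rel` you need depends on the links the MAAP STAC actually returns.

```r
# Hypothetical items whose links appear in different orders.
mock_features <- list(
  list(id = "item-a", links = list(
    list(rel = "collection", href = "https://example.org/collections/demo"),
    list(rel = "self", href = "https://example.org/collections/demo/items/item-a")
  )),
  list(id = "item-b", links = list(
    list(rel = "self", href = "https://example.org/collections/demo/items/item-b"),
    list(rel = "collection", href = "https://example.org/collections/demo")
  ))
)

# Return the href of the first link with the requested `rel`, or NA if none.
link_href <- function(item, rel) {
  for (link in item$links) {
    if (identical(link$rel, rel) && !is.null(link$href)) {
      return(link$href)
    }
  }
  NA_character_
}

self_urls <- vapply(mock_features, link_href, character(1), rel = "self")
self_urls
```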