# How to: Find and Access ECOSTRESS Data  

## Summary  

This notebook will explore how to find and access [ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS)](https://ecostress.jpl.nasa.gov/) data programmatically using the [`earthaccess`](https://github.com/nsidc/earthaccess) Python library. `earthaccess` enables authentication, searching, downloading, and streaming of data over HTTPS or S3 with minimal coding. `earthaccess` leverages NASA's Common Metadata Repository (CMR), a metadata system that catalogs Earth Science data and associated metadata records, to search for and return data access links that can be used to download or access data programmatically.  

## Requirements  

- NASA [Earthdata Login](https://urs.earthdata.nasa.gov/) account

## Learning Objectives  

- How to search and access ECOSTRESS data using `earthaccess`  

## Exercise

Import the required packages

In [1]:
import os
import earthaccess
import pandas as pd
import geopandas as gp
import xarray as xr
import sys
import json

### Authentication

Log in to your NASA Earthdata account and create a `.netrc` file using the `login` function from the `earthaccess` library. If you do not have an Earthdata account, you can create one [here](https://urs.earthdata.nasa.gov/home).

In [2]:
earthaccess.login(persist=True)

<earthaccess.auth.Auth at 0x7fae36f565c0>
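
With `persist=True`, the credentials are saved to a `.netrc` file so later sessions can log in without a prompt. As a minimal sketch, the standard-library `netrc` module can confirm that an entry for the Earthdata Login host exists. The helper name `has_earthdata_entry` is hypothetical and is not part of `earthaccess`.

```python
import netrc
from pathlib import Path

EDL_HOST = "urs.earthdata.nasa.gov"  # Earthdata Login host


def has_earthdata_entry(netrc_path):
    """Return True if the netrc file at `netrc_path` has an Earthdata Login entry.

    Hypothetical helper for illustration; not an earthaccess function.
    """
    path = Path(netrc_path).expanduser()
    if not path.is_file():
        return False
    try:
        entries = netrc.netrc(str(path))
    except netrc.NetrcParseError:
        # The file exists but could not be parsed as netrc
        return False
    return entries.authenticators(EDL_HOST) is not None
```

Calling `has_earthdata_entry('~/.netrc')` after logging in is one way to check that the credentials were persisted.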

### Searching for Collections

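To see the available ECOSTRESS collections, we can build a collection query that filters by the keyword `ecostress` and the `LPCLOUD` provider, then check how many collections it matches.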

In [42]:
Query = earthaccess.collection_query().keyword('ecostress').provider('LPCLOUD')
print(f'Collections found: {Query.hits()}')

Collections found: 48


In [43]:
collections = Query.get()

In [46]:
collections[0].summary()

{'short-name': 'ECO2LSTE',
 'concept-id': 'C2837150320-LPCLOUD',
 'version': '001',
 'file-type': "[{'FormatType': 'Native', 'Format': 'HDF5'}]",
 'get-data': ['https://search.earthdata.nasa.gov/search?q=C1534729776-LPDAAC_ECS',
  'https://e4ftl01.cr.usgs.gov/ECOSTRESS/ECO2LSTE.001/',
  'https://appeears.earthdatacloud.nasa.gov/',
  'https://earthexplorer.usgs.gov/'],
 'cloud-info': {'Region': 'us-west-2',
  'S3BucketAndObjectPrefixNames': ['s3://lp-prod-protected/ECO2LSTE.001',
   's3://lp-prod-public/ECO2LSTE.001'],
  'S3CredentialsAPIEndpoint': 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials',
  'S3CredentialsAPIDocumentationURL': 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentialsREADME'}}

In [45]:
[product['short-name'] for product in [collection.summary() for collection in collections] if 'T_' in product['short-name']]

['ECO_L2T_LSTE',
 'ECO_L4T_ESI',
 'ECO_L4T_WUE',
 'ECO_L3G_ET_ALEXI',
 'ECO_L3T_JET',
 'ECO_L3T_MET',
 'ECO_L3T_SEB',
 'ECO_L3T_SM',
 'ECO_L2T_RAD',
 'ECO_L2T_STARS',
 'ECO_L1CT_RAD']
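
Note that the `'T_'` substring test above also matches `ECO_L3G_ET_ALEXI`, because `ET_` contains `T_`. The following is a minimal sketch of a stricter filter. It assumes that tiled product short names follow the pattern seen in the list above (`ECO_L`, a level number, an optional letter, then `T_`); that pattern is inferred from these names rather than taken from a specification, and `tiled_short_names` is a hypothetical helper.

```python
import re

# Assumed naming pattern, inferred from names such as ECO_L2T_LSTE and ECO_L1CT_RAD
TILED_PATTERN = re.compile(r"^ECO_L\d+[A-Z]?T_")


def tiled_short_names(short_names):
    """Keep only the short names that match the assumed tiled-product pattern."""
    return [name for name in short_names if TILED_PATTERN.match(name)]


# A few short names copied from the outputs in this notebook
short_names = ['ECO_L2T_LSTE', 'ECO_L4T_ESI', 'ECO_L3G_ET_ALEXI', 'ECO_L1CT_RAD', 'ECO2LSTE']
tiled = tiled_short_names(short_names)
print(tiled)  # ['ECO_L2T_LSTE', 'ECO_L4T_ESI', 'ECO_L1CT_RAD']
```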

We can retrieve metadata for these collections and then extract the short names, which we can use to search for granules.

In [4]:
# Find Collections
collections = Query.fields(['ShortName']).get(10)
# Retrieve Collection Short-names
[product['short-name'] for product in [collection.summary() for collection in collections]]

['ECO2LSTE',
 'ECO4WUE',
 'ECO4ESIPTJPL',
 'ECO3ETALEXI',
 'ECO3ETPTJPL',
 'ECO2CLD',
 'ECO4ESIALEXI',
 'ECO_L2T_LSTE',
 'ECO1BGEO',
 'ECO1BRAD']

If you print an item from the `collections` list, you can explore all of its JSON metadata. Here we narrow the query to the tiled products by searching on the keyword `tiled`.

In [22]:
Query = earthaccess.collection_query().keyword('tiled').provider('LPCLOUD')
print(f'Collections found: {Query.hits()}')

Collections found: 16


In [23]:
collections = Query.get()
len(collections)

16

In [24]:
collections[0]

{
  "meta": {
    "revision-id": 46,
    "deleted": false,
    "format": "application/vnd.nasa.cmr.umm+json",
    "provider-id": "LPCLOUD",
    "has-combine": false,
    "user-id": "jabeck",
    "has-formats": false,
    "s3-links": [
      "s3://lp-prod-protected/ECO_L2T_LSTE.002",
      "s3://lp-prod-public/ECO_L2T_LSTE.002"
    ],
    "has-spatial-subsetting": false,
    "native-id": "ECO_L2T_LSTEV002",
    "has-transforms": false,
    "has-variables": false,
    "concept-id": "C2076090826-LPCLOUD",
    "revision-date": "2024-04-29T19:39:02.582Z",
    "granule-count": 2966807,
    "has-temporal-subsetting": false,
    "concept-type": "collection"
  },
  "umm": {
    "TilingIdentificationSystems": [
      {
        "TilingIdentificationSystemName": "Military Grid Reference System",
        "Coordinate1": {
          "MinimumValue": "1",
          "MaximumValue": "60"
        },
        "Coordinate2": {
          "MinimumValue": "C",
          "MaximumValue": "W"
        }
      }
   

In [25]:
# Retrieve Collection Short-names
[product['short-name'] for product in [collection.summary() for collection in collections]]

['ECO_L2T_LSTE',
 'ECO_L4T_ESI',
 'ECO_L4T_WUE',
 'ECO_L3T_JET',
 'ECO_L3T_MET',
 'ECO_L3T_SEB',
 'ECO_L3T_SM',
 'ECO_L2T_RAD',
 'ECO_L2T_STARS',
 'ECO_L1CT_RAD',
 'MCD12C1',
 'ECO_L3G_JET',
 'ECO_L3G_MET',
 'ECO_L3G_SEB',
 'ECO_L3G_SM',
 'MOD44W']

In [26]:
[product['concept-id'] for product in [collection.summary() for collection in collections]]

['C2076090826-LPCLOUD',
 'C2076104650-LPCLOUD',
 'C2076102081-LPCLOUD',
 'C2076106409-LPCLOUD',
 'C2074877891-LPCLOUD',
 'C2074852168-LPCLOUD',
 'C2074860916-LPCLOUD',
 'C2074842795-LPCLOUD',
 'C2090073749-LPCLOUD',
 'C2595678301-LPCLOUD',
 'C2484078896-LPCLOUD',
 'C2076112011-LPCLOUD',
 'C2074897737-LPCLOUD',
 'C2074855428-LPCLOUD',
 'C2074890845-LPCLOUD',
 'C2565805847-LPCLOUD']
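
Because both lists come from the same `collections` object, they can be combined into a lookup from short name to concept ID. This is a small sketch using plain dictionaries; the three sample entries are copied from the first items of the two outputs above, and the name `concept_ids` is chosen here for illustration.

```python
# Sample summaries shaped like the dictionaries returned by collection.summary()
summaries = [
    {'short-name': 'ECO_L2T_LSTE', 'concept-id': 'C2076090826-LPCLOUD'},
    {'short-name': 'ECO_L4T_ESI', 'concept-id': 'C2076104650-LPCLOUD'},
    {'short-name': 'ECO_L4T_WUE', 'concept-id': 'C2076102081-LPCLOUD'},
]

# Build a short-name -> concept-id lookup
concept_ids = {s['short-name']: s['concept-id'] for s in summaries}
print(concept_ids['ECO_L2T_LSTE'])  # C2076090826-LPCLOUD
```

In the notebook, `summaries` would be `[collection.summary() for collection in collections]`. A concept ID identifies one specific collection version, so it can be a less ambiguous search key than a short name alone.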

### Searching for Granules

A granule can be thought of as a unique spatiotemporal grouping within a collection. To search for granules, we use the `search_data` function from `earthaccess` and provide the arguments for our search. Searches can be narrowed using several criteria, shown in the table below:

|Dataset origin and location|Spatiotemporal parameters|Dataset metadata parameters|
|:---|:---|:---|
|archive_center|bounding_box|concept_id|
|data_center|temporal|entry_title|
|daac|point|keyword|
|provider|polygon|version|
|cloud_hosted|line|short_name|

#### Point Search

In this case, we specify the `short_name`, `version`, `point`, and `temporal` arguments, as well as `count`, which limits the maximum number of results returned. The `point` is given as a (longitude, latitude) pair.

In [14]:
# POINT
results = earthaccess.search_data(
    short_name='ECO_L2T_LSTE',
    version='002',
    point=(-62.1123,-39.89402),
    temporal=('2022-09-03','2022-09-04'),
    count=100
)

Granules found: 4


#### Bounding Box Search

You can also search with a bounding box. To do this, we first open a GeoJSON file containing our region of interest (ROI), then simplify it to a bounding box. The `total_bounds` property returns the bounds of the ROI as (minimum x, minimum y, maximum x, maximum y), which we convert to a Python tuple, the data type expected by the `bounding_box` parameter of the `earthaccess` `search_data` function.

In [15]:
geojson = gp.read_file('../../data/isla_gaviota.geojson')
geojson.geometry

0    POLYGON ((-62.14758 -39.88951, -62.16900 -39.8...
Name: geometry, dtype: geometry

In [16]:
bbox = tuple(list(geojson.total_bounds))
bbox

(-62.20427143422259,
 -39.95230375907932,
 -62.11609022486114,
 -39.87693893067732)
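
The tuple is ordered (west, south, east, north). As a sketch of what `total_bounds` computes for a single polygon, the same values can be derived in plain Python from the exterior ring coordinates of the ROI. The helper `bounds_from_coords` is hypothetical, and the coordinates are those of the `isla_gaviota.geojson` polygon used in this notebook.

```python
def bounds_from_coords(coords):
    """Return (min lon, min lat, max lon, max lat) for a list of (lon, lat) pairs."""
    lons = [lon for lon, lat in coords]
    lats = [lat for lon, lat in coords]
    return (min(lons), min(lats), max(lons), max(lats))


# Exterior ring of the ROI polygon as (longitude, latitude) pairs
ring = [(-62.147583513919045, -39.88950549416461),
        (-62.16899895047814, -39.87693893067732),
        (-62.19419358172446, -39.90641838472922),
        (-62.20427143422259, -39.94071456822524),
        (-62.1318368693898, -39.95230375907932),
        (-62.11609022486114, -39.92091182572591),
        (-62.125538211578245, -39.895787912197314),
        (-62.147583513919045, -39.88950549416461)]

manual_bbox = bounds_from_coords(ring)
print(manual_bbox)
```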

Now we can search for granules using a bounding box.

In [17]:
# Search Example using Bounding Box
results = earthaccess.search_data(
    short_name='ECO_L2T_LSTE',
    version='002',
    bounding_box=bbox,
    temporal=('2022-09-03','2022-09-04'),
    count=100
)

Granules found: 4



#### Polygon Search

A polygon can also be used to search. For a simple polygon without holes we can take the geojson we opened and grab the coordinates of the exterior ring and place them in a list.

In [18]:
polygon = list(geojson.geometry[0].exterior.coords)
polygon

[(-62.147583513919045, -39.88950549416461),
 (-62.16899895047814, -39.87693893067732),
 (-62.19419358172446, -39.90641838472922),
 (-62.20427143422259, -39.94071456822524),
 (-62.1318368693898, -39.95230375907932),
 (-62.11609022486114, -39.92091182572591),
 (-62.125538211578245, -39.895787912197314),
 (-62.147583513919045, -39.88950549416461)]
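
CMR polygon searches generally expect a closed ring (first and last points equal) with vertices in counter-clockwise order; treat that as an assumption to confirm against the CMR documentation. Below is a minimal sketch for checking both properties with the shoelace formula. The helper names are hypothetical and the calculation treats the coordinates as planar, which is adequate for a small polygon like this one.

```python
def is_closed(ring):
    """True if the first and last vertices of the ring are identical."""
    return ring[0] == ring[-1]


def signed_area(ring):
    """Planar shoelace area of a closed ring; positive when counter-clockwise."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def is_counter_clockwise(ring):
    return signed_area(ring) > 0


# Exterior ring printed above, as (longitude, latitude) pairs
roi_ring = [(-62.147583513919045, -39.88950549416461),
            (-62.16899895047814, -39.87693893067732),
            (-62.19419358172446, -39.90641838472922),
            (-62.20427143422259, -39.94071456822524),
            (-62.1318368693898, -39.95230375907932),
            (-62.11609022486114, -39.92091182572591),
            (-62.125538211578245, -39.895787912197314),
            (-62.147583513919045, -39.88950549416461)]

print(is_closed(roi_ring), is_counter_clockwise(roi_ring))  # True True
```

If a ring turns out to be clockwise, reversing the list (`ring[::-1]`) flips its orientation.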

With this list of coordinate pairs we can use the `polygon` parameter for our search. 
> Note that we overwrite the `results` object each time because, for this example, all three types of spatial search return the same results.

In [19]:
# Search Example using a Polygon
results = earthaccess.search_data(
    short_name='ECO_L2T_LSTE',
    version='002',
    polygon=polygon,
    temporal=('2022-09-03','2022-09-04'),
    count=100
)

Granules found: 4


### Working with Search Results

After we've gotten results from our search using `earthaccess` we can view the results in a table and view assets for each granule in the list. 

In [20]:
type(results)

list

In [21]:
type(results[0])

earthaccess.results.DataGranule

In [22]:
results[0]

In [23]:
results[0].keys()

dict_keys(['meta', 'umm', 'size'])

In [24]:
results[0]['meta']

{'concept-type': 'granule',
 'concept-id': 'G2467284459-LPCLOUD',
 'revision-id': 2,
 'native-id': 'ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01',
 'collection-concept-id': 'C2076090826-LPCLOUD',
 'provider-id': 'LPCLOUD',
 'format': 'application/echo10+xml',
 'revision-date': '2024-01-04T17:39:08.136Z'}
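
The `native-id` packs several pieces of information into one string. The sketch below splits it into fields. The orbit, tile, and timestamp can be cross-checked against the `umm` metadata and the MGRS tiling scheme shown in this notebook; the field names `scene`, `build`, and `iteration` are assumptions about the remaining parts, and `parse_tiled_granule_id` is a hypothetical helper that only handles IDs with this nine-part layout.

```python
from datetime import datetime


def parse_tiled_granule_id(granule_id):
    """Split an ID such as ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01.

    Field names for scene, build, and iteration are assumed for illustration.
    """
    parts = granule_id.split('_')
    if len(parts) != 9:
        raise ValueError(f'Unexpected granule ID layout: {granule_id}')
    collection, level, product, orbit, scene, tile, timestamp, build, iteration = parts
    return {
        'collection': collection,
        'product': f'{level}_{product}',
        'orbit': int(orbit),
        'scene': int(scene),
        'tile': tile,
        'acquired': datetime.strptime(timestamp, '%Y%m%dT%H%M%S'),
        'build': build,
        'iteration': iteration,
    }


info = parse_tiled_granule_id('ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01')
print(info['orbit'], info['tile'], info['acquired'])
```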

In [25]:
results[0]['umm']

{'TemporalExtent': {'RangeDateTime': {'BeginningDateTime': '2022-09-03T23:00:24.760Z',
   'EndingDateTime': '2022-09-03T23:05:24.760Z'}},
 'OrbitCalculatedSpatialDomains': [{'BeginOrbitNumber': 23601,
   'EndOrbitNumber': 23601}],
 'GranuleUR': 'ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01',
 'AdditionalAttributes': [{'Name': 'identifier_product_doi',
   'Values': ['10.5067/ECOSTRESS/ECO_L2T_LSTE.002']},
  {'Name': 'identifier_product_doi_authority', 'Values': ['http://doi.org']}],
 'MeasuredParameters': [{'ParameterName': 'L2T_LSTE'}],
 'SpatialExtent': {'HorizontalSpatialDomain': {'Geometry': {'BoundingRectangles': [{'WestBoundingCoordinate': -63.000237,
      'EastBoundingCoordinate': -61.700428,
      'NorthBoundingCoordinate': -39.742661,
      'SouthBoundingCoordinate': -40.738602}]}}},
 'ProviderDates': [{'Date': '2022-09-19T00:42:38.082Z', 'Type': 'Insert'},
  {'Date': '2022-10-25T13:48:57.391Z', 'Type': 'Update'}],
 'CollectionReference': {'ShortName': 'ECO_L2T_LST
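
The `BoundingRectangles` entry gives a quick way to confirm that a granule covers a location of interest. This is a minimal sketch using the rectangle values shown above and the point from the earlier point search. The helper `point_in_bounding_rectangle` is hypothetical and does not handle rectangles that cross the antimeridian.

```python
def point_in_bounding_rectangle(lon, lat, rect):
    """True if (lon, lat) falls inside a UMM-style bounding rectangle dictionary."""
    return (rect['WestBoundingCoordinate'] <= lon <= rect['EastBoundingCoordinate']
            and rect['SouthBoundingCoordinate'] <= lat <= rect['NorthBoundingCoordinate'])


# Values copied from the granule metadata shown above
rect = {'WestBoundingCoordinate': -63.000237,
        'EastBoundingCoordinate': -61.700428,
        'NorthBoundingCoordinate': -39.742661,
        'SouthBoundingCoordinate': -40.738602}

covers_point = point_in_bounding_rectangle(-62.1123, -39.89402, rect)
print(covers_point)  # True
```

In the notebook, the rectangle would come from `results[0]['umm']['SpatialExtent']['HorizontalSpatialDomain']['Geometry']['BoundingRectangles'][0]`.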

In [26]:
#print(json.dumps(results, sort_keys=False, indent=4))

[
    {
        "meta": {
            "concept-type": "granule",
            "concept-id": "G2467284459-LPCLOUD",
            "revision-id": 2,
            "native-id": "ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01",
            "collection-concept-id": "C2076090826-LPCLOUD",
            "provider-id": "LPCLOUD",
            "format": "application/echo10+xml",
            "revision-date": "2024-01-04T17:39:08.136Z"
        },
        "umm": {
            "TemporalExtent": {
                "RangeDateTime": {
                    "BeginningDateTime": "2022-09-03T23:00:24.760Z",
                    "EndingDateTime": "2022-09-03T23:05:24.760Z"
                }
            },
            "OrbitCalculatedSpatialDomains": [
                {
                    "BeginOrbitNumber": 23601,
                    "EndOrbitNumber": 23601
                }
            ],
            "GranuleUR": "ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01",
            "AdditionalAttributes

In [27]:
pd.json_normalize(results)

Unnamed: 0,size,meta.concept-type,meta.concept-id,meta.revision-id,meta.native-id,meta.collection-concept-id,meta.provider-id,meta.format,meta.revision-date,umm.TemporalExtent.RangeDateTime.BeginningDateTime,...,umm.RelatedUrls,umm.DataGranule.DayNightFlag,umm.DataGranule.Identifiers,umm.DataGranule.ProductionDateTime,umm.DataGranule.ArchiveAndDistributionInformation,umm.Platforms,umm.MetadataSpecification.URL,umm.MetadataSpecification.Name,umm.MetadataSpecification.Version,umm.CollectionReference.EntryTitle
0,2.83323,granule,G2467284459-LPCLOUD,2,ECOv002_L2T_LSTE_23601_028_20GNA_20220903T2300...,C2076090826-LPCLOUD,LPCLOUD,application/echo10+xml,2024-01-04T17:39:08.136Z,2022-09-03T23:00:24.760Z,...,[{'URL': 'https://data.lpdaac.earthdatacloud.n...,Night,[{'Identifier': 'ECOv002_L2T_LSTE_23601_028_20...,2022-09-05T09:48:15.926Z,"[{'Name': 'Not provided', 'Size': 2.83323, 'Si...","[{'ShortName': 'ISS', 'Instruments': [{'ShortN...",https://cdn.earthdata.nasa.gov/umm/granule/v1.6.5,UMM-G,1.6.5,
1,11.9879,granule,G2467285012-LPCLOUD,2,ECOv002_L2T_LSTE_23601_029_20GNA_20220903T2301...,C2076090826-LPCLOUD,LPCLOUD,application/echo10+xml,2024-01-04T17:38:57.218Z,2022-09-03T23:01:16.730Z,...,[{'URL': 'https://data.lpdaac.earthdatacloud.n...,Night,[{'Identifier': 'ECOv002_L2T_LSTE_23601_029_20...,2022-09-05T09:48:15.878Z,"[{'Name': 'Not provided', 'Size': 11.9879, 'Si...","[{'ShortName': 'ISS', 'Instruments': [{'ShortN...",https://cdn.earthdata.nasa.gov/umm/granule/v1.6.5,UMM-G,1.6.5,
2,2.83323,granule,G2467277615-LPDAAC_ECS,3,SC:ECO_L2T_LSTE.002:2563171812,C2204557047-LPDAAC_ECS,LPDAAC_ECS,application/echo10+xml,2022-10-26T05:37:24.964Z,2022-09-03T23:00:24.760Z,...,[{'URL': 'https://e4ftl01.cr.usgs.gov//WORKING...,Night,[{'Identifier': 'ECOv002_L2T_LSTE_23601_028_20...,2022-09-05T09:48:15.926Z,"[{'Name': 'Not provided', 'Size': 2.83323, 'Si...","[{'ShortName': 'ISS', 'Instruments': [{'ShortN...",https://cdn.earthdata.nasa.gov/umm/granule/v1.6.5,UMM-G,1.6.5,ECOSTRESS Tiled Land Surface Temperature and E...
3,11.9879,granule,G2467278584-LPDAAC_ECS,2,SC:ECO_L2T_LSTE.002:2563171851,C2204557047-LPDAAC_ECS,LPDAAC_ECS,application/echo10+xml,2022-10-26T05:37:17.567Z,2022-09-03T23:01:16.730Z,...,[{'URL': 'https://e4ftl01.cr.usgs.gov//WORKING...,Night,[{'Identifier': 'ECOv002_L2T_LSTE_23601_029_20...,2022-09-05T09:48:15.878Z,"[{'Name': 'Not provided', 'Size': 11.9879, 'Si...","[{'ShortName': 'ISS', 'Instruments': [{'ShortN...",https://cdn.earthdata.nasa.gov/umm/granule/v1.6.5,UMM-G,1.6.5,ECOSTRESS Tiled Land Surface Temperature and E...
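
The table above lists each scene twice, once under the `LPCLOUD` provider and once under `LPDAAC_ECS`. A minimal sketch for keeping only the granules from one provider is shown below, using plain dictionaries that mirror the `meta` fields in the table. The helper `granules_from_provider` is hypothetical.

```python
def granules_from_provider(granules, provider='LPCLOUD'):
    """Keep only granules whose meta 'provider-id' equals `provider`."""
    return [g for g in granules if g['meta']['provider-id'] == provider]


# Stand-ins for the four results, with values copied from the table above
sample_results = [
    {'meta': {'concept-id': 'G2467284459-LPCLOUD', 'provider-id': 'LPCLOUD'}},
    {'meta': {'concept-id': 'G2467285012-LPCLOUD', 'provider-id': 'LPCLOUD'}},
    {'meta': {'concept-id': 'G2467277615-LPDAAC_ECS', 'provider-id': 'LPDAAC_ECS'}},
    {'meta': {'concept-id': 'G2467278584-LPDAAC_ECS', 'provider-id': 'LPDAAC_ECS'}},
]

cloud_only = granules_from_provider(sample_results)
print([g['meta']['concept-id'] for g in cloud_only])
```

Since each granule in `results` can be indexed like a dictionary (as in `results[0]['meta']`), the same helper should work on `results` directly.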


After we have our results, there are two ways we can work with the data:

1. Download
2. Access in place / Stream the data. 

To download the data we can simply use the `download` function. This retrieves all assets associated with each granule, which is convenient if you plan to work with local copies of every file.

In [27]:
# earthaccess.download(results, '../../data/')

If we want to stream the data or further filter the assets for download, we first create a list of URLs nested by granule using a list comprehension.

In [28]:
emit_results_urls = [granule.data_links() for granule in results]
emit_results_urls

[['https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_water.tif',
  'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_cloud.tif',
  'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_view_zenith.tif',
  'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_height.tif',
  'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601

We can also split these into results for specific assets, or filter out an asset, using the following. In this example, we only want to access or download the land surface temperature (LST) asset.

In [29]:
filtered_asset_links = []
# Pick desired assets by filename suffix (here, the land surface temperature layer)
desired_assets = ['LST.tif'] # Add more suffixes to keep other layers
# Step through each sublist (granule) and filter based on desired assets.
for granule in emit_results_urls:
    for url in granule: 
        asset_name = url.split('/')[-1]
        if any(asset in asset_name for asset in desired_assets):
            filtered_asset_links.append(url)
filtered_asset_links

['https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_LST.tif',
 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_23601_029_20GNA_20220903T230116_0700_01/ECOv002_L2T_LSTE_23601_029_20GNA_20220903T230116_0700_01_LST.tif']
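
The filter above relies on substring matching. As an alternative sketch, the asset name can be extracted from each URL so that assets are compared exactly, which avoids accidental hits when one asset name is a substring of another. This assumes the URL layout seen above, where the parent folder is the granule ID and the file name is the granule ID followed by `_<asset>.tif`; the helper `asset_name` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def asset_name(url):
    """Return the asset part of a granule file URL, e.g. 'LST' or 'view_zenith'."""
    path = PurePosixPath(urlparse(url).path)
    granule_id = path.parent.name   # folder named after the granule
    stem = path.stem                # file name without the .tif extension
    prefix = granule_id + '_'
    # Fall back to the full stem if the assumed layout does not hold
    return stem[len(prefix):] if stem.startswith(prefix) else stem


base = ('https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/'
        'ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01/'
        'ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_')
sample_urls = [base + 'water.tif', base + 'view_zenith.tif', base + 'LST.tif']

lst_only = [url for url in sample_urls if asset_name(url) == 'LST']
print([asset_name(url) for url in sample_urls])  # ['water', 'view_zenith', 'LST']
```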

After we have our filtered list, we can stream the LST asset or download it. To stream, configure a `rasterio` environment for authenticated HTTPS access and then open the URL; to save the file locally, download it instead.

#### Stream Data  

This may take a while to load the dataset.  

In [39]:
import rioxarray as rxr
import rasterio as rio

In [40]:
rio_env = rio.Env(GDAL_DISABLE_READDIR_ON_OPEN='EMPTY_DIR',
                  GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/cookies.txt'),
                  GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/cookies.txt'))
rio_env.__enter__()

<rasterio.env.Env at 0x7efe03b32fe0>
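
Calling `__enter__()` directly keeps the environment open for the rest of the session. As a sketch of an alternative, the same GDAL options can be collected in a dictionary and passed to `rio.Env` inside a `with` block, so the settings apply only while the data is being read. The helper `gdal_cookie_options` is hypothetical; the option names and values are the ones used in the cell above.

```python
import os


def gdal_cookie_options(cookie_file='~/cookies.txt'):
    """Build the GDAL options used above for authenticated HTTPS reads."""
    cookie_path = os.path.expanduser(cookie_file)
    return {
        'GDAL_DISABLE_READDIR_ON_OPEN': 'EMPTY_DIR',
        'GDAL_HTTP_COOKIEFILE': cookie_path,
        'GDAL_HTTP_COOKIEJAR': cookie_path,
    }


options = gdal_cookie_options()

# Usage sketch (requires rasterio and rioxarray, imported above as rio and rxr):
# with rio.Env(**options):
#     eco_lst_ds = rxr.open_rasterio(eco_data_link).squeeze('band', drop=True)
```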

In [41]:
eco_data_link = filtered_asset_links[0]

In [42]:
eco_lst_ds = rxr.open_rasterio(eco_data_link).squeeze('band', drop=True)
eco_lst_ds

#### Download Filtered Data URLS  

To download the filtered list, which only includes URLs for the LST files, we can again use the `download` function from `earthaccess`.  

In [31]:
earthaccess.download(filtered_asset_links, local_path='../../data')

QUEUEING TASKS | :   0%|          | 0/2 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/2 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/2 [00:00<?, ?it/s]

['../../data/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_LST.tif',
 '../../data/ECOv002_L2T_LSTE_23601_029_20GNA_20220903T230116_0700_01_LST.tif']

Read the local copy:

In [43]:
eco_lst_ds = rxr.open_rasterio('../../data/ECOv002_L2T_LSTE_23601_028_20GNA_20220903T230024_0700_01_LST.tif').squeeze('band', drop=True)
eco_lst_ds

## Contact Info:  

Email: LPDAAC@usgs.gov  
Voice: +1-866-573-3222  
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹  
Website: <https://lpdaac.usgs.gov/>  
Date last modified: 07-03-2023  

¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I. 