# Download Landsat or Sentinel-2 data using STAC API

## SpatioTemporal Asset Catalogs (STAC)
The STAC specification is a common language to describe geospatial information. A STAC API provides a search and selection interface to a catalog of items and files. See https://stacindex.org/catalogs#/ for a list of providers using STAC.

While the STAC specification makes searching for and accessing files consistent, how those files should be used and interpreted can still be a challenge, or is at least specific to each data custodian.
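
As a rough illustration of what a catalog describes, the sketch below builds a minimal STAC Item as a plain Python dictionary. The top-level fields follow the STAC Item specification, but the `id`, `href` and `datetime` values are made up for this example and do not refer to a real scene.

```python
# A minimal, hand-written STAC Item (illustrative values only).
item = {
    "type": "Feature",                     # STAC Items are GeoJSON Features
    "stac_version": "1.0.0",
    "id": "example-scene-20200205",        # hypothetical identifier
    "bbox": [149.7, -10.8, 150.8, -10.0],  # west, south, east, north
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [149.7, -10.8], [150.8, -10.8], [150.8, -10.0],
            [149.7, -10.0], [149.7, -10.8],
        ]],
    },
    "properties": {
        "datetime": "2020-02-05T00:05:00Z",  # hypothetical acquisition time
        "platform": "LANDSAT_8",
    },
    "assets": {
        # Each asset points to one file; the key is the asset (band) name
        "red": {
            "href": "https://example.com/example-scene_red.tif",  # hypothetical URL
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        },
    },
    "links": [],
}

# List the asset names and where each file lives
for name, asset in item["assets"].items():
    print(name, asset["href"])
```

The `properties` block is what a STAC API search filters on, and the `assets` block is what a loader reads to find the files for each band.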

This notebook demonstrates how to search, download, visualise and export Landsat and Sentinel-2 satellite imagery from the following sources:
- USGS Landsat Level-1 and Level-2 products on AWS, https://registry.opendata.aws/usgs-landsat/
- Element-84 Sentinel-2 "sen2cor"-corrected surface reflectance on AWS, https://registry.opendata.aws/sentinel-2/

## Open Data Cube

The Open Data Cube (ODC) records product and scene information in a database and provides tools for transforming and aggregating scene data into geospatial Python `xarray` "cubes". In this context, the STAC API can replace some parts of the ODC database while still providing the core geospatial information required for transforming and aggregating the scene data into cubes.

The ODC [odc-stac](https://github.com/opendatacube/odc-stac) and [odc-geo](https://github.com/opendatacube/odc-geo) packages provide the core functionality for reading, transforming and aggregating files. In particular, the odc-stac library takes a list of STAC items as input and reads them into an `xarray` cube compatible with ODC functions.

## More information

This notebook was adapted from https://github.com/opendatacube/odc-stac/tree/develop/notebooks.

In [None]:
# Minimal packages
import os, sys
from pystac_client import Client
from odc.stac import configure_s3_access, stac_load

# ODC packages (optional)
from dea_tools.plotting import display_map

# EASI packages (optional)
repo = f'{os.environ["HOME"]}/eocsi-hackathon-2022'  # No easy way to get repo directory
if repo not in sys.path: sys.path.append(repo)
from tools.notebook_utils import xarray_object_size, initialize_dask, localcluster_dashboard

In [None]:
# Setup

# The `usgs-landsat` bucket is requester-pays, so valid AWS credentials are required for Landsat
configure_s3_access(requester_pays=True)

# Optional: use EASI SE Asia caching-proxy service
os.environ["AWS_HTTPS"] = "NO"
os.environ["GDAL_HTTP_PROXY"] = "easi-caching-proxy.caching-proxy:80"
print(f'Will use caching proxy at: {os.environ.get("GDAL_HTTP_PROXY")}')

# Optional: Dask
cluster, client = initialize_dask(use_gateway=False, workers=(1,8), wait=False)
if cluster:
    display(cluster)
else:
    print(f'Using dask LocalCluster: {localcluster_dashboard(client)}')

## Select an area of interest

In [None]:
# Select Landsat or Sentinel-2

do_landsat = True
do_sentinel = False

In [None]:
# Select a bounding box

# Vietnam - Ha Long
# latitude = (20.8, 20.9)
# longitude = (106.8, 106.9)
# time=('2020-02-01', '2020-02-20')

# Fiji - blows up JHub memory due to antemeridian
# latitude = (-17.1, -16.2)
# longitude = (178.2, 180.0)
# time=('2020-02-01', '2020-02-20')

# PNG Milne Bay
latitude = (-10.8, -10)
longitude = (149.7, 150.8)
time = ('2020-02-01', '2020-02-20')

# west, south, east, north
bbox = [longitude[0], latitude[0], longitude[1], latitude[1]]

# Display bounding box on a map
display_map(longitude, latitude)

## Landsat configuration and settings

In [None]:
if do_landsat:

    # STAC catalog and query
    catalog = Client.open('https://landsatlook.usgs.gov/stac-server/')
    product = 'landsat-c2l2-sr'
    query_cfg = ["platform=LANDSAT_8", "landsat:collection_category=T1"]

    # Search for available items
    query = catalog.search(
        collections=[product], datetime=f'{time[0]}/{time[1]}', bbox=bbox, query=query_cfg
    )
    items = list(query.get_items())
    print(f"Found: {len(items):d} datasets")

    # Rewrite URLs to use S3
    def landsat_patch(uri: str) -> str:
        """Return the S3 version of the URI"""
        return uri.replace('https://landsatlook.usgs.gov/data/', 's3://usgs-landsat/')

    # Change or update STAC information for use by ODC 
    stac2odc_cfg = {
        'landsat-c2l2-sr': {
            # 'aliases': {'red': 'red', 'nir': 'nir08', 'pixel_quality': 'qa_pixel'},
        },
        "*": {"warnings": "ignore"},
    }

    # `stac_load` parameters
    stac_call = {
        'bands': ('red', 'nir', 'pixel_quality'),  # Optional: selected bands
        'chunks': {},                              # If using Dask
        'groupby': "solar_day",                    # Group scenes from the same solar day into one time layer of the cube
        'stac_cfg': stac2odc_cfg,
        'patch_url': landsat_patch,
    }

    # Additional Landsat band specifications
    band_specs = {
        'red': {
            'scale': 0.0000275,
            'offset': -0.2
        },
        'nir': {
            'scale': 0.0000275,
            'offset': -0.2
        },
    }
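
The `groupby: "solar_day"` setting above merges scenes acquired on the same local (solar) day into one time layer. The sketch below shows one way such a solar day can be derived from a UTC timestamp and a longitude, by shifting the time by 4 minutes per degree. The function name `solar_day` and the two timestamps are assumptions made for this example; the exact rule used inside odc-stac may differ.

```python
from datetime import datetime, timedelta, timezone

def solar_day(utc_time, longitude):
    """Approximate local solar date for a UTC time at a given longitude (degrees east)."""
    # 360 degrees of longitude correspond to 24 hours of solar time
    offset = timedelta(seconds=int(longitude * 24 * 3600 / 360))
    return (utc_time + offset).date()

# Two hypothetical adjacent scenes that straddle midnight UTC
scene_a = datetime(2020, 2, 5, 23, 59, 50, tzinfo=timezone.utc)
scene_b = datetime(2020, 2, 6, 0, 0, 14, tzinfo=timezone.utc)
centre_longitude = 150.25  # middle of the Milne Bay bounding box

print(scene_a.date(), scene_b.date())       # UTC dates differ
print(solar_day(scene_a, centre_longitude),
      solar_day(scene_b, centre_longitude))  # solar dates agree
```

Grouping by UTC date would split these two scenes into separate time layers, while grouping by solar day keeps them together as one pass over the area.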

## Sentinel-2 configuration and settings

In [None]:
if do_sentinel:
    
    # STAC catalog and query
    catalog = Client.open('https://earth-search.aws.element84.com/v0')
    product = 'sentinel-s2-l2a-cogs'
    
    # Search for available items
    query = catalog.search(
        collections=[product], datetime=f'{time[0]}/{time[1]}', bbox=bbox,
    )
    items = list(query.get_items())
    print(f"Found: {len(items):d} datasets")
    
    # Rewrite URLs to use S3
    def patch(uri: str) -> str:
        """Return the Sentinel-2 S3 version of the URI"""
        return uri.replace('https://sentinel-cogs.s3.us-west-2.amazonaws.com/', 's3://sentinel-cogs/')
    
    # Change or update STAC information for use by ODC 
    stac2odc_cfg = {
        "sentinel-s2-l2a-cogs": {
            "assets": {
                "*": {"data_type": "uint16", "nodata": 0},
                "SCL": {"data_type": "uint8", "nodata": 0},
                "visual": {"data_type": "uint8", "nodata": 0},
            },
            "aliases": {"red": "B04", "green": "B03", "blue": "B02"},
        },
        "*": {"warnings": "ignore"},
    }
    
    # `stac_load` parameters
    stac_call = {
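        'bands': ('red',),           # Alias for B04 (see `aliases` above), so the visual check can use `xx.red`
        # 'crs': crs,                # `crs` is not defined in this notebook; if omitted, the CRS is taken from the items
        'resolution': 30,
        # 'chunks': {},              # If using Dask
        # 'groupby': "solar_day",    # Group scenes from the same solar day into one time layer
        'stac_cfg': stac2odc_cfg,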
        'patch_url': patch,
    }

In [None]:
# Optional: Explore the structure of a STAC item
# Optional: Select or filter scenes

display(f'Canonical names: {items[0].assets.keys()}')
display(items[0])

## Load the selected items into an `xarray` cube

In [None]:
xx = stac_load(items, **stac_call)

display(xarray_object_size(xx))
display(xx)
display(xx.odc.geobox)

## Visual check

In [None]:
# First look

xx.red.isel(time=0).plot()

In [None]:
## Apply scale and offset
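
The `band_specs` dictionary in the Landsat configuration holds a scale and offset per band, which convert stored integer values to surface reflectance as `value * scale + offset`. The sketch below applies that formula with a small helper. The name `apply_band_specs` and the sample digital numbers are assumptions made for this example. It uses plain numbers so that it runs on its own, and it only relies on `*`, `+` and item access, so it should also accept the loaded cube.

```python
# Scale and offset values copied from the Landsat configuration cell
band_specs = {
    'red': {'scale': 0.0000275, 'offset': -0.2},
    'nir': {'scale': 0.0000275, 'offset': -0.2},
}

def apply_band_specs(data, specs):
    """Return {band: data[band] * scale + offset} for each band present in both inputs."""
    return {
        band: data[band] * spec['scale'] + spec['offset']
        for band, spec in specs.items()
        if band in data
    }

# Hypothetical digital numbers standing in for pixel values
sample = {'red': 10000, 'nir': 20000}
scaled = apply_band_specs(sample, band_specs)
print(scaled)  # red is about 0.075, nir is about 0.35
```

With the loaded cube, `apply_band_specs(xx, band_specs)` should return one scaled `DataArray` per band. Nodata pixels are best masked first, for example with `xx.red.where(xx.red != 0)` if 0 is the nodata value, otherwise they would be converted to the offset value.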

In [None]:
## Add coastline

## Export cube layers to netCDF or GeoTIFF