# Using ODC-STAC with Sentinel-2

This notebook explores Sentinel-2 data on Earth Search and demonstrates how [odc-stac](https://odc-stac.readthedocs.io/) can be used to process it at scale. It relies on the following tools:

 - [Earth Search](https://element84.com/earth-search), a catalog of public data
 - [pystac-client](https://pystac-client.readthedocs.io/), for searching and accessing data
 - [OpenDataCube](https://www.opendatacube.org/) and [odc-stac](https://odc-stac.readthedocs.io/) for loading STAC assets and representing geospatial data as XArrays
 - [XArray](http://xarray.pydata.org/en/stable/), [pandas](https://pandas.pydata.org/) and [geopandas](https://geopandas.org/) for manipulating data
 - [Dask](https://dask.org/) for performing parallel, distributed computing
 - [Folium](https://python-visualization.github.io/folium/index.html) and [hvplot](https://hvplot.holoviz.org/) for visualization

The notebook shows how to find data for an area of interest, explore the resulting metadata, perform calculations such as NDVI, and visualize the results.

# Choose Area of Interest

In [None]:
# AOIs available

from glob import glob
from pprint import pprint

pprint(glob("../aois/*"))

In [None]:
# read the GeoJSON file and create a map

import json
from pathlib import Path

aoi_fname = "../aois/bear-fire.geojson"

aoi = json.loads(Path(aoi_fname).read_text())

# use folium to display vectors
# Several folium basemap tiles are available:
#   - OpenStreetMap
#   - Stamen Terrain
#   - Stamen Toner
#   - Stamen Watercolor
#   - CartoDB positron
#   - CartoDB dark_matter

import folium

map = folium.Map(tiles='OpenStreetMap')

# add vector to map, as transparent polygon
folium.GeoJson(aoi, style_function = lambda x: {'fillColor': '#00000000'}).add_to(map)

# fit the map to the bounds of the data
lons = [x[0] for x in aoi["geometry"]["coordinates"][0]]
lats = [x[1] for x in aoi["geometry"]["coordinates"][0]]
map.fit_bounds([(min(lats), min(lons)), (max(lats), max(lons))])

map
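
The bounds calculation above reads only the outer ring of a single `Polygon` feature. As a minimal sketch, assuming the AOI may also be a `MultiPolygon` or contain holes, the helper below walks the nested coordinates instead. The names `iter_positions` and `folium_bounds` and the example polygon are hypothetical, and geometries crossing the antimeridian are not handled.

```python
def iter_positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinates."""
    # A position is a list whose first element is a number.
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def folium_bounds(geometry):
    """Return [(south, west), (north, east)] for a GeoJSON geometry dict."""
    lons, lats = zip(*iter_positions(geometry["coordinates"]))
    return [(min(lats), min(lons)), (max(lats), max(lons))]


# Hypothetical polygon, used only to exercise the helper.
example = {
    "type": "Polygon",
    "coordinates": [[
        [-121.5, 39.5], [-121.0, 39.5], [-121.0, 39.9],
        [-121.5, 39.9], [-121.5, 39.5],
    ]],
}
print(folium_bounds(example))
```

With the AOI loaded earlier, the result of `folium_bounds(aoi["geometry"])` could be passed directly to `fit_bounds`.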

# Search the API

In [None]:
# Use pystac-client to find data in the STAC API.

from pystac_client import Client
api = Client.open("https://earth-search.aws.element84.com/v1/")

col = 'sentinel-2-l2a'
collection = api.get_collection(col)
collection

In [None]:
# list the assets defined for the Collection as a table
import pandas as pd

pd.DataFrame.from_dict(collection.to_dict()['item_assets'], orient='index')

In [None]:
%%time

# search the API

query = api.search(
    collections=[collection.id],
    intersects=aoi['geometry'],
    datetime="2019-10-01/2021-10-01",
    limit=100,
    query = [
        "eo:cloud_cover<10"
    ]
)
item_collection = query.item_collection()

print(f"Found: {len(item_collection):d} STAC Items")
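
The `query` argument above uses a compact string form. The STAC API Query extension expresses the same filter as a nested dict, and pystac-client also accepts that form. The sketch below shows how the two relate. `to_query_dict` is a hypothetical helper written for illustration, not pystac-client's own parser, and it handles only simple comparisons.

```python
import re

# Operator names as used by the STAC API Query extension.
OPERATORS = {"<=": "lte", ">=": "gte", "!=": "neq", "<": "lt", ">": "gt", "=": "eq"}

# Two-character operators are listed first so "<=" is not read as "<".
PATTERN = re.compile(r"^\s*([\w:.\-]+)\s*(<=|>=|!=|<|>|=)\s*(.+?)\s*$")


def to_query_dict(expressions):
    """Convert strings like 'eo:cloud_cover<10' into a Query-extension style dict."""
    query = {}
    for expr in expressions:
        match = PATTERN.match(expr)
        if match is None:
            raise ValueError(f"cannot parse query expression: {expr!r}")
        prop, op, raw = match.groups()
        try:
            value = float(raw) if "." in raw else int(raw)
        except ValueError:
            value = raw  # keep non-numeric values as strings
        query.setdefault(prop, {})[OPERATORS[op]] = value
    return query


print(to_query_dict(["eo:cloud_cover<10"]))
# {'eo:cloud_cover': {'lt': 10}}
```

Passing `query={"eo:cloud_cover": {"lt": 10}}` to `api.search` should therefore select the same Items as the string form used above.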

In [None]:
%%time

# add the Item footprints to the map and display it
style = {
    'fillColor': '#00000000', # transparent
    'color': '#fc0f03',       # red
    'weight': 1
}

for item in item_collection:
    folium.GeoJson(item.to_dict(), style_function=lambda x: style).add_to(map)

map

In [None]:
%%time
# Load the STAC Items as a lazy, Dask-backed datacube. We pass the ItemCollection
# returned by the search and specify the bands of interest and the chunk size.
# Only pixels within the bounding box of the AOI geometry are loaded
# (`bbox=aoi_df.bounds`), and Items from the same solar day are merged.

from odc.stac import stac_load
import geopandas as gpd

aoi_df = gpd.read_file(aoi_fname)['geometry'][0]

dc = stac_load(item_collection,
               measurements=['red', 'green', 'blue', 'nir'],
               chunks={"x": 1024, "y": 1024},
               bbox=aoi_df.bounds,
               groupby='solar_day',
)
dc
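
The chunk size determines how many Dask tasks the datacube produces. As a rough sketch, the arithmetic below estimates the chunk count and uncompressed size of a cube. `chunk_summary` is a hypothetical helper, the example dimensions are made up, and it assumes one time step per chunk and 2 bytes per pixel (16-bit data). The real values can be read from `dc.sizes` and the band dtypes.

```python
import math


def chunk_summary(ny, nx, n_times, n_bands, chunk_y=1024, chunk_x=1024, bytes_per_pixel=2):
    """Estimate chunk count and uncompressed size for a (time, y, x) cube per band."""
    # Partial chunks at the edges still count as chunks, hence the ceil.
    chunks_per_layer = math.ceil(ny / chunk_y) * math.ceil(nx / chunk_x)
    n_chunks = chunks_per_layer * n_times * n_bands
    total_bytes = ny * nx * n_times * n_bands * bytes_per_pixel
    return {
        "chunks_per_layer": chunks_per_layer,
        "n_chunks": n_chunks,
        "total_mib": total_bytes / 2**20,
    }


# Hypothetical cube: 3000 x 4000 pixels, 50 dates, 4 bands.
summary = chunk_summary(ny=3000, nx=4000, n_times=50, n_bands=4)
print(summary)
```

Smaller chunks give more parallelism but more scheduling overhead, so this kind of estimate is a starting point for tuning rather than a rule.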

In [None]:
%%time

# Create scaled RGB image
#
# Create an RGBA datacube in which `nodata` pixels have `alpha=0`.
# Values between `vmin` and `vmax` are stretched to the displayable range.

# bands are listed in red, green, blue order so the channels are not swapped
vis = dc.odc.to_rgba(vmin=1, vmax=2000, bands=['red', 'green', 'blue'])
vis
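
Conceptually, the `vmin`/`vmax` scaling above is a linear stretch to 8-bit values with a transparency channel for missing data. The sketch below illustrates that idea on single pixels in plain Python. It is not odc's implementation: the function names are hypothetical, `nodata=0` is an assumption, and the library's exact rounding and nodata handling may differ.

```python
def stretch_to_byte(value, vmin=1, vmax=2000):
    """Linearly map value from [vmin, vmax] to [0, 255], clamping values outside."""
    scaled = (value - vmin) / (vmax - vmin)
    scaled = min(max(scaled, 0.0), 1.0)
    return round(scaled * 255)


def to_rgba_pixel(red, green, blue, nodata=0, vmin=1, vmax=2000):
    """Return an (R, G, B, A) tuple; alpha is 0 when any band equals nodata."""
    alpha = 0 if nodata in (red, green, blue) else 255
    rgb = tuple(stretch_to_byte(v, vmin, vmax) for v in (red, green, blue))
    return rgb + (alpha,)


print(to_rgba_pixel(2000, 1, 5000))  # bright values are clamped to 255
print(to_rgba_pixel(0, 0, 0))        # nodata pixel becomes fully transparent
```

Lowering `vmax` brightens the image at the cost of saturating highly reflective surfaces such as clouds or snow.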

In [None]:
%%time

# Calculate NDVI

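# Cast to float before the arithmetic: the bands typically load as unsigned
# integers, so `nir - red` would wrap around wherever red exceeds nir.
nir = dc['nir'].astype('float32')
red = dc['red'].astype('float32')
ndvi = ((nir - red) / (nir + red)).clip(0, 1).rename("ndvi")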
ndvi
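
NDVI is the normalized difference `(nir - red) / (nir + red)`, which ranges from -1 to 1. The scalar sketch below works through the formula and shows why the input dtype matters. It assumes the bands are stored as unsigned 16-bit integers, which is worth confirming with `dc['nir'].dtype`. Both function names are hypothetical.

```python
def ndvi_value(nir, red):
    """Return the normalized difference of two reflectance values as a float."""
    nir, red = float(nir), float(red)
    total = nir + red
    if total == 0:
        return float("nan")  # undefined where both bands are zero
    return (nir - red) / total


def uint16_subtract(a, b):
    """Mimic unsigned 16-bit subtraction, which wraps around modulo 2**16."""
    return (a - b) % 2**16


print(ndvi_value(3000, 1000))       # 0.5 (vegetated pixel)
print(ndvi_value(1000, 3000))       # -0.5 (red exceeds nir)
print(uint16_subtract(1000, 3000))  # 63536 instead of -2000
```

The last line shows the wrap-around that occurs when the subtraction is done in unsigned integers. Note that clipping the result to `[0, 1]` afterwards discards negative NDVI values such as those over water.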

In [None]:
%%time

# start Dask cluster using coiled

import coiled
from dask.distributed import Client

# start dask cluster on coiled.io
cluster = coiled.Cluster(
    n_workers=10,
    software="cng-workshop",
    account="element84-demo-workspace",
    backend_options={"region": "us-west-2"}
)
client = Client(cluster)

print('Dashboard:', client.dashboard_link)
client

In [None]:
%%time

# use Dask to compute

# `persist` starts the computation on the cluster and keeps the results in distributed memory.
# `compute` brings the results back to the local session, for example to display them.

from dask.distributed import wait

ndvi, vis = client.persist([ndvi, vis])
_ = wait([ndvi, vis])

In [None]:
%%time

# display RGB image

vis_ = vis.compute()

import hvplot.xarray  # registers the `.hvplot` accessor on xarray objects

hvplot_kwargs = {
    "frame_width": 800,
    "xaxis": None,
    "yaxis": None,
    "widget_location": "bottom",
    "aspect": len(vis.x)/len(vis.y)
}

vis_.hvplot.rgb('x', 'y', bands='band', groupby='time', **hvplot_kwargs)

In [None]:
# display NDVI image

ndvi_ = ndvi.compute()
ndvi_.hvplot('x', 'y', groupby='time', **hvplot_kwargs)

In [None]:
%%time

# create time series plot of average scene NDVI

ndvi_mean = ndvi.mean(dim=['x', 'y']).compute()
ndvi_mean.hvplot()

In [None]:
# Stopping Dask cluster and cleaning resources

client.close()
cluster.shutdown()
cluster.close()