# The odc-stac

## Why use odc-stac?

The Digital Earth Africa Sandbox is a managed environment based on JupyterLab.
This environment provides users with direct access to an installed and pre-configured Open Data Cube instance containing all of Digital Earth Africa's earth observation data.
The Sandbox also provides users with limited computing resources for interacting with and analyzing Digital Earth Africa's earth observation data. 
To use the Digital Earth Africa Sandbox, you need to open http://sandbox.digitalearth.africa/ in your browser, create an account and log in using your credentials. 
One limitation of the Sandbox is that analyzing a large area, such as an entire country or the whole continent of Africa, can be challenging even with the larger 32 GB environment. Another is that, because the Sandbox is a managed environment, users are limited in how much they can customize it.

This is where `odc-stac` comes in.
Digital Earth Africa stores a range of data products on Amazon Web Services' Simple Storage Service (S3) with free public access.
Digital Earth Africa also provides a SpatioTemporal Asset Catalog (STAC) endpoint at https://explorer.digitalearth.africa/stac for listing or searching the metadata of this archive, such as bounding box (the coordinates of the area of interest), collection, and date and time.
Using this STAC endpoint, `odc-stac` lets you access Digital Earth Africa's earth observation data outside of the Sandbox in the same format, an `xarray.Dataset`, as you would in the Sandbox.
This works because the `odc-stac` Python library loads the STAC items of a STAC catalog into an `xarray.Dataset`. An example of this is provided here: [Access Sentinel 2 Analysis Ready Data from Digital Earth Africa notebook](https://odc-stac.readthedocs.io/en/latest/notebooks/stac-load-S2-deafrica.html).
You can also use `odc-stac` to load other STAC compliant earth observation data as an `xarray.Dataset`.
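
A search against the STAC endpoint is an ordinary HTTP request whose parameters are the metadata fields mentioned above. The following is a minimal sketch using only the Python standard library. It assumes the endpoint exposes the standard STAC API `/search` route and accepts POST requests; the collection name `s2_l2a`, the coordinates, and the dates are illustrative values, not taken from this page.

```python
import json
import urllib.request

STAC_URL = "https://explorer.digitalearth.africa/stac"


def build_search_body(collection, bbox, start, end, limit=10):
    """Build the JSON body of a STAC API search request.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees;
    start and end are ISO 8601 dates bounding the time range.
    """
    return {
        "collections": [collection],
        "bbox": list(bbox),
        "datetime": f"{start}T00:00:00Z/{end}T23:59:59Z",
        "limit": limit,
    }


def search(body, url=STAC_URL + "/search"):
    """POST the search body and return the matching STAC Items (needs network)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["features"]


# Illustrative values: an assumed collection name and a small bounding box.
body = build_search_body("s2_l2a", (-0.3, 5.5, -0.1, 5.7), "2021-09-01", "2021-09-30")
print(json.dumps(body, indent=2))
```

Calling `search(body)` sends the request; each returned feature is a STAC Item whose assets point at files in the public S3 archive.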

Using `odc-stac` means that you can set up your own custom analysis environment, locally or remotely, and still access, interact with, and analyze Digital Earth Africa's earth observation data in the same format as you would in the Sandbox.
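
As a rough sketch of what this looks like in a custom environment, the function below searches the Digital Earth Africa catalog with `pystac-client` and loads the results with `odc-stac`. It is modeled on the Digital Earth Africa notebook in the odc-stac documentation. The collection name `s2_l2a`, the band aliases, the S3 endpoint region, and the output CRS and resolution are assumptions for illustration; both packages and network access are required, so the imports sit inside the function.

```python
# Pixel types and friendlier band names are declared explicitly here,
# following the odc-stac example notebook; treat the values as assumptions.
STAC_CFG = {
    "s2_l2a": {
        "assets": {"*": {"data_type": "uint16", "nodata": 0}},
        "aliases": {"red": "B04", "green": "B03", "blue": "B02"},
    }
}


def search_params(bbox, date_range, collection="s2_l2a"):
    """Keyword arguments for pystac_client's Client.search()."""
    return {"collections": [collection], "bbox": list(bbox), "datetime": date_range}


def load_deafrica(bbox, date_range, bands=("red", "green", "blue")):
    """Search the catalog and load the matching items as an xarray.Dataset."""
    from odc.stac import configure_rio, stac_load
    from pystac_client import Client

    # Read the public bucket without AWS credentials.
    configure_rio(
        cloud_defaults=True,
        aws={"aws_unsigned": True},
        AWS_S3_ENDPOINT="s3.af-south-1.amazonaws.com",
    )

    catalog = Client.open("https://explorer.digitalearth.africa/stac")
    items = list(catalog.search(**search_params(bbox, date_range)).items())

    return stac_load(
        items,
        bands=bands,
        bbox=bbox,
        crs="EPSG:6933",      # assumed equal-area output projection
        resolution=20,        # assumed pixel size in CRS units (metres)
        chunks={},            # return lazy, Dask-backed arrays
        groupby="solar_day",  # merge scenes captured on the same day
        stac_cfg=STAC_CFG,
    )
```

A call such as `load_deafrica((-0.3, 5.5, -0.1, 5.7), "2021-09-01/2021-09-30")` would return an `xarray.Dataset` with `red`, `green`, and `blue` variables; because `chunks={}` is set, pixel data is only read when `.compute()` or `.load()` is called.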

## Getting started with odc-stac

Instructions on how to install `odc-stac` are provided [here](https://odc-stac.readthedocs.io/en/latest/intro.html#installation).
Example notebooks showing how to use `odc-stac` can be found [here](https://odc-stac.readthedocs.io/en/latest/examples.html).

For more on `odc-stac`, see the [odc-stac documentation](https://odc-stac.readthedocs.io/en/latest) and the [odc-stac GitHub repository](https://github.com/opendatacube/odc-stac).

## What is the odc-stac?

`odc-stac` is a set of tools for converting STAC metadata to the Open Data Cube data model.
It allows you to load STAC items into `xarray` Datasets and either process them locally or distribute data loading and computation with [Dask](https://dask.org/).
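
The choice between local and Dask-based processing comes down to whether a `chunks` argument is passed to `odc.stac.load`. The sketch below is one way to wrap that choice; `load_options` and `load_items` are hypothetical helper names, the chunk size is an arbitrary assumption, and `items` is expected to be a list of STAC items from an earlier search.

```python
def load_options(use_dask, chunk_size=2048):
    """Extra keyword arguments for odc.stac.load().

    Without `chunks` the pixels are read into memory straight away;
    with `chunks` the result is backed by lazy Dask arrays.
    """
    if use_dask:
        return {"chunks": {"x": chunk_size, "y": chunk_size}}
    return {}


def load_items(items, bands, use_dask=False):
    """Load STAC items eagerly or lazily (requires the odc-stac package)."""
    from odc.stac import load

    return load(items, bands=bands, **load_options(use_dask))


print(load_options(use_dask=True))
```

With `use_dask=True` the returned dataset holds a task graph rather than pixels, so the work can be spread across a Dask cluster and is only executed on `.compute()`.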

The Open Data Cube (ODC) project is an open source solution for accessing, managing, and analyzing large quantities of Geographic Information System (GIS) data, namely Earth observation (EO) data. It presents a common analytical framework composed of data structures and tools that facilitate the organization and analysis of large gridded data collections.
The ODC data model is based on three core concepts: Product, Dataset, and Measurement.
Datasets are a fundamental part of the ODC project. 
A dataset is *“The smallest aggregation of data independently described, inventoried, and managed”* (Definition of “Granule” from NASA EarthData Unified Metadata Model). 
A dataset can also be described as a container of metadata.
Products are collections of datasets that share the same set of measurements and some subset of metadata.
A measurement describes a single data variable of a Product or Dataset.
For more information on the Open Data Cube project, see the [Open Data Cube website](https://www.opendatacube.org/) and the [Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/).
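
To make the relationship between the three concepts concrete, here is a simplified illustration using plain dataclasses. These are not the classes from the `datacube` package; the field names and example values are assumptions chosen to mirror the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A single data variable, for example one spectral band."""
    name: str
    dtype: str
    nodata: float
    units: str
    aliases: tuple = ()


@dataclass
class Product:
    """A collection of datasets sharing measurements and some metadata."""
    name: str
    measurements: dict  # measurement name -> Measurement
    metadata: dict      # metadata common to every dataset in the product


@dataclass
class Dataset:
    """The smallest independently described unit: metadata plus file locations."""
    id: str
    product: Product
    metadata: dict
    uris: list


red = Measurement(name="B04", dtype="uint16", nodata=0, units="1", aliases=("red",))
product = Product(name="s2_l2a", measurements={"B04": red}, metadata={"platform": "sentinel-2"})
dataset = Dataset(
    id="scene-001",
    product=product,
    metadata={"datetime": "2021-09-16T10:20:30Z"},
    uris=["s3://example-bucket/scene-001/B04.tif"],
)

print(dataset.product.name, list(dataset.product.measurements))
```

Every dataset points back to exactly one product, and the product defines which measurements a dataset of that type can contain.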

The SpatioTemporal Asset Catalog (STAC) specification is a newer, widely used open specification that allows providers of spatiotemporal assets (imagery, SAR, point clouds, data cubes, full motion video, etc.) to expose their data as SpatioTemporal Asset Catalogs, so that new code doesn't need to be written whenever a new dataset or API is released.
This makes their data more easily indexed and discovered.
A spatiotemporal asset is any file that represents information about the earth captured in a certain space and time.
A UML diagram of the STAC model is provided [here](https://github.com/radiantearth/stac-spec/blob/master/STAC-UML.pdf#toolbar=0).

Three component specifications together make up the core STAC specification.
These are the STAC Item, STAC Catalog, and STAC Collection specifications. 
These specifications define related JSON object types connected by link relations. 
The STAC Item object represents a unit of inseparable data and metadata, typically a single scene of data at one place and one time.
The STAC Catalog is a very simple construct that provides a flexible structure to link various STAC Items or other STAC Catalogs together to be crawled or browsed.
A STAC Collection provides additional information about a spatio-temporal collection of data by extending a STAC Catalog directly, layering on additional fields to enable description of things like the spatial and temporal extent of the data, the license, keywords, providers, etc. 
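
The three object types and their link relations can be sketched as plain Python dictionaries that mirror the JSON. The identifiers, file paths, bucket name, extent, and license below are invented for illustration; only the field names follow the STAC specification.

```python
STAC_VERSION = "1.0.0"

catalog = {
    "type": "Catalog",
    "stac_version": STAC_VERSION,
    "id": "example-catalog",
    "description": "Top-level entry point",
    "links": [{"rel": "child", "href": "./example-collection/collection.json"}],
}

# A Collection is a Catalog plus extent, license, and similar descriptive fields.
collection = {
    "type": "Collection",
    "stac_version": STAC_VERSION,
    "id": "example-collection",
    "description": "Scenes sharing the same bands and processing",
    "license": "CC-BY-4.0",
    "extent": {
        "spatial": {"bbox": [[-26.0, -35.0, 64.0, 38.0]]},
        "temporal": {"interval": [["2017-01-01T00:00:00Z", None]]},
    },
    "links": [
        {"rel": "parent", "href": "../catalog.json"},
        {"rel": "item", "href": "./scene-001.json"},
    ],
}

# An Item is a GeoJSON Feature: one scene at one place and one time.
item = {
    "type": "Feature",
    "stac_version": STAC_VERSION,
    "id": "scene-001",
    "collection": "example-collection",
    "bbox": [-0.3, 5.5, -0.1, 5.7],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-0.3, 5.5], [-0.1, 5.5], [-0.1, 5.7], [-0.3, 5.7], [-0.3, 5.5]]],
    },
    "properties": {"datetime": "2021-09-16T10:20:30Z"},
    "assets": {"B04": {"href": "s3://example-bucket/scene-001/B04.tif"}},
    "links": [{"rel": "collection", "href": "./collection.json"}],
}


def linked(stac_object, rel):
    """Return the hrefs of all links with the given relation type."""
    return [link["href"] for link in stac_object["links"] if link["rel"] == rel]


print(linked(catalog, "child"), linked(collection, "item"))
```

A crawler starts at the Catalog, follows `child` links to Collections, and follows `item` links to reach individual Items and their assets.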

The STAC specification can be extended using stac-extensions.
These extensions allow data providers to fully describe the spatial information they wish to expose as SpatioTemporal Asset Catalogs (STAC).
Extensions to the core STAC specification provide additional JSON fields that can be used to better describe spatial data. For example, the Electro-Optical extension specification provides a STAC Item property, `eo:bands`.
This property is used to describe the available spectral bands in a spatiotemporal asset.
The band's `common_name` is the name that is commonly used to refer to that band's spectral properties. 
For more information on the STAC specification, see the [STAC website](https://stacspec.org/) and the [STAC GitHub repository](https://github.com/radiantearth/stac-spec).
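
As a small illustration of how `eo:bands` and `common_name` are used, the sketch below attaches band descriptions to two assets and looks up an asset by its common name. The asset keys, file paths, and the helper `asset_for` are hypothetical, and placing `eo:bands` on each asset is an assumption about how a given catalog is laid out.

```python
assets = {
    "B04": {
        "href": "s3://example-bucket/scene-001/B04.tif",
        "eo:bands": [{"name": "B04", "common_name": "red"}],
    },
    "B08": {
        "href": "s3://example-bucket/scene-001/B08.tif",
        "eo:bands": [{"name": "B08", "common_name": "nir"}],
    },
}


def asset_for(common_name, assets):
    """Return the key of the first asset containing a band with this common name."""
    for key, asset in assets.items():
        for band in asset.get("eo:bands", []):
            if band.get("common_name") == common_name:
                return key
    return None


print(asset_for("nir", assets))
```

Because common names are shared across sensors, the same lookup works on catalogs whose providers use different asset keys for the near-infrared band.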

As the descriptions above show, ODC and STAC use different terminology for otherwise similar concepts.

**Table 1: Comparison between ODC and STAC concepts.**

| STAC       | ODC                    | Description                                      | 
| :--        | :--                    | :--                                              |
| [Collection](https://pystac.readthedocs.io/en/latest/api/pystac.html#pystac.Collection) | [Product](https://opendatacube.readthedocs.io/en/latest/about-core-concepts/products.html) or [DatasetType](https://datacube-core.readthedocs.io/en/latest/api/core-classes/datasetType.html#datacube.model.DatasetType) | Collection of observations across space and time |
| [Item](https://pystac.readthedocs.io/en/latest/api/pystac.html#pystac.Item)      | [Dataset](https://datacube-core.readthedocs.io/en/latest/api/core-classes/dataset.html#datacube.model.Dataset)                | Single observation (specific time and place), multi-channel |
| [Asset](https://pystac.readthedocs.io/en/latest/api/pystac.html#pystac.Asset)      | [Measurement](https://datacube-core.readthedocs.io/en/latest/api/core-classes/measurement.html#datacube.model.Measurement)           | Component of a single observation |
| [Band](https://github.com/stac-extensions/eo#band-object)         | [Measurement](https://datacube-core.readthedocs.io/en/latest/api/core-classes/measurement.html#datacube.model.Measurement)             | Pixel plane within a multi-plane asset |
| [Common Name](https://github.com/stac-extensions/eo#common-band-names)  | Alias                  | Refer to the same band by different names |
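
The correspondence in Table 1 can be sketched as a function that relabels the parts of a STAC Item using ODC vocabulary. This is only an illustration of the terminology, not how `odc-stac` performs the conversion internally; it assumes single-band assets, and the item contents and alias mapping are invented.

```python
def describe_in_odc_terms(item, aliases=None):
    """Relabel a STAC Item (as a dict) using ODC vocabulary.

    aliases maps an alternative name to an asset key, for example {"red": "B04"}.
    """
    aliases = aliases or {}
    return {
        "product": item["collection"],          # STAC Collection -> ODC Product
        "dataset": item["id"],                  # STAC Item -> ODC Dataset
        "measurements": sorted(item["assets"]), # STAC Asset -> ODC Measurement
        "aliases": {                            # STAC Common Name -> ODC Alias
            alias: key for alias, key in aliases.items() if key in item["assets"]
        },
    }


item = {
    "id": "scene-001",
    "collection": "s2_l2a",
    "assets": {"B04": {"href": "B04.tif"}, "B03": {"href": "B03.tif"}},
}

summary = describe_in_odc_terms(item, aliases={"red": "B04", "nir": "B08"})
print(summary)
```

The `nir` alias is dropped from the result because the example item has no `B08` asset to point to.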

**References**

[odc-stac GitHub repository](https://github.com/opendatacube/odc-stac)

[odc-stac Documentation](https://odc-stac.readthedocs.io/en/latest)

[STAC GitHub repository](https://github.com/radiantearth/stac-spec)

[STAC website](https://stacspec.org/)

[STAC Electro-Optical (EO) Extension Specification GitHub repository](https://github.com/stac-extensions/eo#common-band-names)

[Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/)