# Indexing using STAC
This notebook explains how to index datasets available from STAC API endpoints into an ODC datacube.

## Description
The topics covered in this notebook include
* [Prerequisites](#Prerequisites)
* [What is STAC?](#STAC)
* [Indexing the Product Definition](#Indexing-the-Product-Definition)
* [Example](#Example)

**Note:** *The commands are meant to be run in a command line interface (such as a terminal in Linux). In a JupyterHub notebook, you can run them by placing a `!` before the command.*

`! <command>`

## Prerequisites
Indexing from STAC requires the command line tool `stac-to-dc`, which is provided by the Python package [odc_apps_dc_tools](https://github.com/opendatacube/odc-tools/tree/develop/apps/dc_tools). With it, users can index datasets into a datacube.

The Python package `odc-apps-dc-tools` can be installed with the command:

`pip install odc-apps-dc-tools`

***Note: This package has already been installed in the JupyterHub environment you are currently using***

## STAC
The [STAC](https://stacspec.org/) (**SpatioTemporal Asset Catalogs**) specification provides a common language to describe a range of geospatial information, so that it can be indexed and discovered more easily. A 'spatiotemporal asset' is any file that represents information about the Earth captured at a certain place and time.
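
To make the idea of a spatiotemporal asset concrete, the sketch below writes out a minimal STAC Item as a Python dictionary. A STAC Item is a GeoJSON Feature whose `properties` carry the acquisition time and whose `assets` point at the data files. The `id`, coordinates, and `href` used here are made-up placeholders, not a real dataset, and `summarise_item` is a hypothetical helper added only for illustration.

```python
# A hand-written, illustrative STAC Item. All values are placeholders.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene",  # hypothetical identifier
    "bbox": [25.0, 20.0, 26.0, 21.0],  # lon-min, lat-min, lon-max, lat-max
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [25.0, 20.0], [26.0, 20.0], [26.0, 21.0], [25.0, 21.0], [25.0, 20.0],
        ]],
    },
    "properties": {"datetime": "2021-06-01T00:00:00Z"},  # the "when"
    "links": [],
    "assets": {
        # the "what": one entry per file that belongs to this item
        "B04": {"href": "https://example.com/example-scene/B04.tif"},
    },
}


def summarise_item(stac_item):
    """Return the where, when, and asset names of a STAC Item dictionary."""
    return {
        "where": stac_item["bbox"],
        "when": stac_item["properties"]["datetime"],
        "assets": sorted(stac_item["assets"]),
    }


print(summarise_item(item))
```

Tools such as `stac-to-dc` read items of this shape from an API and translate them into datacube dataset records.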

Here are some STAC endpoints that make geospatial data available:
   
| Provider | Endpoint |
| :------- | :------- |
| [Element 84](https://www.element84.com/earth-search/) | https://earth-search.aws.element84.com/v0 |
| [Planetary Computer Data Catalog](https://planetarycomputer.microsoft.com/catalog) | https://planetarycomputer.microsoft.com/api/stac/v1/ |
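
As a rough illustration of how such an endpoint is queried, the sketch below assembles the JSON body for an item search as described by the STAC API specification (a `POST` to `<endpoint>/search` with `bbox`, `datetime`, and `collections` fields). It only builds the request and does not send it. The function names `search_url` and `build_search_body` are hypothetical helpers, and whether a given endpoint accepts this exact request should be checked against that provider's documentation.

```python
import json


def search_url(endpoint):
    """Append the item-search path to a STAC API root URL."""
    return endpoint.rstrip("/") + "/search"


def build_search_body(bbox, datetime_range, collections, limit=10):
    """Build the JSON body for a STAC API item search.

    bbox is (lon_min, lat_min, lon_max, lat_max); datetime_range is a
    single date-time or a 'start/end' interval; collections is a list of
    collection ids.
    """
    return {
        "bbox": list(bbox),
        "datetime": datetime_range,
        "collections": list(collections),
        "limit": limit,
    }


url = search_url("https://earth-search.aws.element84.com/v0/")
body = build_search_body(
    bbox=(25, 20, 35, 30),
    datetime_range="2021-06-01T00:00:00Z/2021-07-01T00:00:00Z",
    collections=["sentinel-s2-l2a-cogs"],
)
print(url)
print(json.dumps(body, indent=2))
```

The same three search terms reappear as the `--bbox`, `--datetime`, and `--collections` options of `stac-to-dc`.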

## Indexing the Product Definition
Before indexing the datasets, the corresponding product definition must be added to the datacube. The syntax of the `datacube` command for this is given below.

`datacube product add <product_definition_source>`

In [None]:
! datacube product add https://raw.githubusercontent.com/digitalearthafrica/config/master/products/esa_s2_l2a.odc-product.yaml

##### The cell below runs the help command on the stac-to-dc app

In [1]:
! stac-to-dc --help

Usage: stac-to-dc [OPTIONS]

  Iterate through STAC items from a STAC API and add them to datacube.

Options:
  --limit INTEGER        Stop indexing after n datasets have been indexed.
  --update-if-exists     If the dataset or product already exists, update it
                         instead of skipping it.

  --allow-unsafe         Allow unsafe changes to a dataset. Take care!
  --catalog-href TEXT    URL to the catalog to search
  --collections TEXT     Comma separated list of collections to search
  --bbox TEXT            Comma separated list of bounding box coords, lon-min,
                         lat-min, lon-max, lat-max

  --datetime TEXT        Dates to search, either one day or an inclusive
                         range, e.g. 2020-01-01 or 2020-01-01/2020-01-02

  --options TEXT         Other search terms, as a # separated list, i.e.,
                         --options=cloud_cover=0,100#sky=green

  --rewrite-assets TEXT  Rewrite asset hrefs, for example, to change from
  

## Example

Syntax to use `stac-to-dc` command:
```
stac-to-dc --bbox='longitude-min,latitude-min,longitude-max,latitude-max' \
           --catalog-href='<STAC API endpoint>' \
           --datetime='YYYY-MM-DD/yyyy-mm-dd' \
           --collections='collection_name'
```

* `--bbox` - specifies the geographical area that you want to index
* `--catalog-href` - specifies the STAC endpoint
* `--datetime` - specifies the dates to search, either a single day or an inclusive range; in a range, the start date `YYYY-MM-DD` must not be later than the end date `yyyy-mm-dd`
* `--collections` - specifies a comma-separated list of collections to search. A STAC Collection is an extension of the STAC Catalog with additional information, such as extents, license, keywords, and providers, that describes the STAC Items within the Collection.

Example:
```
stac-to-dc --bbox='25,20,35,30' \
           --catalog-href='https://earth-search.aws.element84.com/v0/' \
           --datetime='2021-06-01/2021-07-01' \
           --collections='sentinel-s2-l2a-cogs'
```
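
When the search terms come from variables in a notebook, it can help to check them before calling the tool. The following is a minimal sketch, not part of `odc-apps-dc-tools`: `parse_bbox`, `check_datetime`, and `build_command` are hypothetical helpers that validate the bounding box and date range and then assemble the same command as an argument list. It assumes plain `YYYY-MM-DD` dates and a bounding box that does not cross the antimeridian.

```python
import shlex
from datetime import date


def parse_bbox(text):
    """Parse 'lon-min,lat-min,lon-max,lat-max' and check the value ranges."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("bbox needs exactly four comma-separated numbers")
    lon_min, lat_min, lon_max, lat_max = parts
    if not (-180 <= lon_min <= lon_max <= 180):
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not (-90 <= lat_min <= lat_max <= 90):
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
    return parts


def check_datetime(text):
    """Accept one day or an inclusive 'start/end' range of ISO dates."""
    days = [date.fromisoformat(d) for d in text.split("/")]
    if len(days) not in (1, 2):
        raise ValueError("datetime must be one date or 'start/end'")
    if len(days) == 2 and days[0] > days[1]:
        raise ValueError("start date must not be later than end date")
    return text


def build_command(bbox, catalog_href, datetime_range, collections):
    """Validate the inputs and return the stac-to-dc argument list."""
    parse_bbox(bbox)
    check_datetime(datetime_range)
    return [
        "stac-to-dc",
        f"--bbox={bbox}",
        f"--catalog-href={catalog_href}",
        f"--datetime={datetime_range}",
        f"--collections={collections}",
    ]


cmd = build_command(
    bbox="25,20,35,30",
    catalog_href="https://earth-search.aws.element84.com/v0/",
    datetime_range="2021-06-01/2021-07-01",
    collections="sentinel-s2-l2a-cogs",
)
# Print a shell-safe, single-line version of the command.
print(shlex.join(cmd))
```

The returned list could be passed to `subprocess.run(cmd, check=True)` in an environment where `stac-to-dc` is installed; the sketch itself only prints the command.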

**Indexing the datasets using the `stac-to-dc` command available through the [odc_apps_dc_tools](https://github.com/opendatacube/odc-tools/tree/develop/apps/dc_tools)**

In [None]:
! stac-to-dc --bbox='25,20,35,30' --catalog-href='https://earth-search.aws.element84.com/v0/' --datetime='2021-06-01/2021-07-01' --collections='sentinel-s2-l2a-cogs'

## Recommended Next Steps
Loading datasets and plotting satellite images come after indexing. We therefore recommend going through the indexing notebooks to understand the steps involved and the different sources from which data can be indexed. The links below take you to the respective notebooks.

1. [Introduction to ODC Indexing](01_Introduction_to_ODC_Indexing.ipynb)
2. [Indexing Product Definition](02_Indexing_Product_Definition.ipynb)
3. [Indexing from Local File System](03_Indexing_from_Local_File_System.ipynb)
4. [Indexing from Amazon - AWS S3](04_Indexing_from_AWS_S3.ipynb)
5. **Indexing using STAC (This Notebook)**