# Discovering and Accessing Data Products in the Brazil Data Cube


The Image Collections and Data Cube Collections produced in the Brazil Data Cube (BDC) project can be discovered and accessed through a standardized API known as the **S**patio**T**emporal **A**sset **C**atalog (STAC). This Jupyter Notebook shows how to use the [Python Client Library for STAC](https://github.com/brazil-data-cube/stac.py) to query this data catalog and retrieve the data.


The diagram below shows the most important concepts behind the STAC data model:

![STAC 0.8.0 Model](./img/stac-model-0.8.0.png)


The descriptions of the concepts below are adapted from the [STAC Specification](https://github.com/radiantearth/stac-spec):

- **Item**: a `STAC Item` is the atomic unit of metadata in STAC, providing links to the actual `assets` (including thumbnails) that it represents. It is a `GeoJSON Feature` with additional fields for things like time and links to related entities and, most importantly, to the assets. According to the specification, this is the atomic unit that describes the data to be discovered in a `STAC Catalog` or `Collection`.

- **Asset**: a `spatiotemporal asset` is any file that represents information about the Earth captured in a certain space and time.


- **Catalog**: provides a structure to link various `STAC Items` together or even to other `STAC Catalogs` or `Collections`.


- **Collection**: a specialization of the `Catalog` that allows additional information to be attached to a spatio-temporal collection of data.
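
To make these concepts concrete, the sketch below builds a minimal `STAC Item` as a plain Python dictionary. It only illustrates the general shape (a `GeoJSON Feature` carrying a timestamp, links, and assets); every identifier, coordinate, and URL in it is a made-up placeholder, not a record from the BDC catalog.

```python
import json

# Illustrative STAC Item: all identifiers, coordinates and URLs are placeholders.
item = {
    "stac_version": "0.8.0",
    "type": "Feature",  # a STAC Item is a GeoJSON Feature
    "id": "example-item-2019-01",
    "collection": "example-collection",
    "bbox": [-54.0, -13.0, -53.0, -12.0],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-54.0, -13.0], [-53.0, -13.0], [-53.0, -12.0],
            [-54.0, -12.0], [-54.0, -13.0],
        ]],
    },
    # The temporal part of "spatiotemporal".
    "properties": {"datetime": "2019-01-01T00:00:00Z"},
    # Links to related entities, such as the item itself or its collection.
    "links": [
        {"rel": "self", "href": "https://example.org/stac/items/example-item-2019-01"},
    ],
    # Assets are the actual files, keyed by a short name.
    "assets": {
        "red": {"href": "https://example.org/data/example_red.tif"},
        "thumbnail": {"href": "https://example.org/data/example.png"},
    },
}

asset_names = sorted(item["assets"])
print(json.dumps(item, indent=2))
```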

## Python Client API

To run the examples in this Jupyter Notebook you will need to install the [STAC client for Python](https://github.com/brazil-data-cube/stac.py). To install it from the Brazil Data Cube's GitHub repository, you can use `pip` with the following command:

In [None]:
!python -m pip install "stac @ git+https://github.com/brazil-data-cube/stac.py.git@v0.8.1-0#egg=stac"

To access the functionality of the client API, import the `stac` package as follows:

In [None]:
import stac

After that, you can check the installed version of the `stac` package:

In [None]:
stac.__version__

Then, create a `STAC` object attached to the service address:

In [None]:
bdc_stac_service = stac.STAC('http://brazildatacube.dpi.inpe.br/bdc-stac/0.8.0/')

The cell above creates an object named `bdc_stac_service` that allows us to communicate with the given `STAC` service.

## Listing the Available Data Products

The `catalog` attribute allows the client to retrieve the image collections and data cube collections available in the server.

In [None]:
bdc_stac_service.catalog

## Retrieving Information on Image Collections and Data Cube Collections

The `collection` operation returns information about a given image or data cube collection identified by its name. In this example we retrieve information about the data cube collection `C4_64_1M_STK` (**TODO:** explain)...

In [None]:
bdc_stac_service.collection('C4_64_1M_STK')

The response is a JSON document that includes, among other metadata, the collection's extent in the spatial and temporal dimensions (**to be reviewed**).
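
As a rough illustration of reading such an extent, the sketch below works on a hand-written dictionary rather than a real server response. It assumes the extent layout described by STAC 0.8.0 (`extent.spatial.bbox` and `extent.temporal.interval`, each a list of entries); the document returned by the BDC service may be organized differently, and `describe_extent` is a hypothetical helper, not part of the client library.

```python
# Hypothetical collection document: values are placeholders, not BDC metadata.
collection_doc = {
    "id": "example-collection",
    "extent": {
        "spatial": {"bbox": [[-54.0, -13.0, -53.0, -12.0]]},
        "temporal": {
            "interval": [["2019-01-01T00:00:00Z", "2019-12-31T00:00:00Z"]]
        },
    },
}


def describe_extent(doc):
    """Return (bbox, start, end) taken from the first spatial and temporal entry."""
    extent = doc["extent"]
    bbox = extent["spatial"]["bbox"][0]
    start, end = extent["temporal"]["interval"][0]
    return bbox, start, end


bbox, start, end = describe_extent(collection_doc)
print(f"bbox={bbox}, from {start} to {end}")
```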

## Listing the Items of a Collection

In [None]:
collection = bdc_stac_service.collection('C4_64_1M_STK')

items = collection.get_items()

items

## Accessing Asset Records

In [None]:
items.features[0].assets
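
Asset records are keyed by a short name such as a band name. The sketch below shows one way to pick out the bands of interest and detect missing ones before downloading. It uses a hand-written dictionary with placeholder names and URLs, so it is an assumption about the general shape of the records rather than the exact object returned by the client.

```python
# Hypothetical asset records: band names and URLs are placeholders.
assets = {
    "red": {"href": "https://example.org/data/example_red.tif"},
    "green": {"href": "https://example.org/data/example_green.tif"},
    "thumbnail": {"href": "https://example.org/data/example.png"},
}

wanted = ("red", "green", "blue")

# Map each requested band that exists to its URL.
band_urls = {name: assets[name]["href"] for name in wanted if name in assets}

# Report requested bands that the item does not provide.
missing = [name for name in wanted if name not in assets]

print(band_urls)
print("missing:", missing)
```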

## Retrieving Assets

The client library provides a method named `download` that can be used to retrieve a specific asset from the catalog. The following code snippet shows how to retrieve the spectral bands for the first item (`features[0]`) in the data cube collection `C4_64_1M_STK`:

In [None]:
red = items.features[0].assets['red'].download()
green = items.features[0].assets['green'].download()
blue = items.features[0].assets['blue'].download()

If no specific path is given, the files are stored in the application's default path (the directory where the script is running). You can inspect the path of a downloaded file as follows:

In [None]:
red
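
When working with many assets it can help to predict a local file name from the asset URL. The sketch below shows one common convention, taking the last component of the URL path and ignoring any query string. This is an assumption for illustration only and not necessarily how `download` names its files; `local_filename` is a hypothetical helper and the URL is a placeholder.

```python
import posixpath
from urllib.parse import urlparse


def local_filename(href):
    """Return the last path component of a URL, ignoring any query string."""
    return posixpath.basename(urlparse(href).path)


name = local_filename("https://example.org/data/example_red.tif?token=abc")
print(name)
```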

## Using RasterIO

After retrieving an asset (or image) you can use any Python library to process the data. This section shows how to use RasterIO to load the images and operate on them at the pixel level.

In [None]:
import rasterio
import numpy as np

In [None]:
with rasterio.open(red) as src:
    r = src.read(1)
with rasterio.open(green) as src:
    g = src.read(1)
with rasterio.open(blue) as src:
    b = src.read(1)


In [None]:
r.max()

Let's define a simple function called `normalize` to rescale the values of an array to the range 0.0 to 1.0:

In [None]:
def normalize(array):
    """Rescale a NumPy array to the range 0.0 - 1.0."""
    array_min, array_max = array.min(), array.max()
    return (array - array_min) / (array_max - array_min)

Apply the function to each downloaded image:

In [None]:
n_r = normalize(r)
n_g = normalize(g)
n_b = normalize(b)
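
Note that `normalize` divides by `array_max - array_min`, which is zero when every pixel in a band has the same value, so the result would be filled with `NaN`. The variant below is a minimal sketch that guards against this case. The name `normalize_safe` and the choice to map a constant band to zeros are assumptions made for illustration, and the pixel values are made up.

```python
import numpy as np


def normalize_safe(array):
    """Rescale an array to 0.0 - 1.0; a constant array maps to all zeros."""
    array = np.asarray(array, dtype=np.float64)
    array_min = array.min()
    value_range = array.max() - array_min
    if value_range == 0:
        # Avoid 0/0: there is no contrast to stretch in a constant band.
        return np.zeros_like(array)
    return (array - array_min) / value_range


band = np.array([[100, 200], [300, 500]], dtype=np.uint16)  # made-up values
scaled = normalize_safe(band)
flat = normalize_safe(np.full((2, 2), 7))

print(scaled)
print(flat)
```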

## Image Visualization

There are many powerful libraries for data visualization in Python, and [Matplotlib](https://matplotlib.org/) is one of them. This section explores some options for visualizing the three spectral bands downloaded from the data cube collection `C4_64_1M_STK`:

In [None]:
from matplotlib import pyplot

rgb = np.dstack((n_r, n_g, n_b))

pyplot.imshow(rgb)
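
The call to `np.dstack` stacks the three two-dimensional bands along a new third axis, producing an array of shape `(rows, cols, 3)`. This is the layout `imshow` expects for RGB images, with floating-point values between 0 and 1. The sketch below checks this on tiny synthetic bands; the values are made up and stand in for the normalized images.

```python
import numpy as np

rows, cols = 2, 3

# Synthetic bands already scaled to the range 0.0 - 1.0.
red_band = np.linspace(0.0, 1.0, rows * cols).reshape(rows, cols)
green_band = np.zeros((rows, cols))
blue_band = np.ones((rows, cols))

# Stack along a new last axis: one (R, G, B) triple per pixel.
rgb_demo = np.dstack((red_band, green_band, blue_band))

print(rgb_demo.shape)
print(rgb_demo[0, 0])
```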