<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/logo-bdc.png" align="right" width="64"/>

# <span style="color:#336699">Brazil Data Cube Platform: Earth Observation data cubes and satellite image time series analysis</span>
<hr style="border:2px solid #0077b9;">

<br/>

<div style="text-align: center;font-size: 90%;">
    Karine R. Ferreira, Gilberto R. Queiroz, Baggio L. C. Silva, Fabiana Ziotti, Raphael W. Costa, Rennan F. B. Marujo, Gabriel Sansigolo
    <br/><br/>
    Earth Observation and Geoinformatics Division, National Institute for Space Research (INPE)
    <br/>
    Avenida dos Astronautas, 1758, Jardim da Granja, São José dos Campos, SP 12227-010, Brazil
    <br/><br/>
    Last Update: Nov 21, 2024
</div>

<br/>

<div style="text-align: justify;  margin-left: 25%; margin-right: 25%;">
<b>Abstract.</b> This Jupyter Notebook gives an overview of how to use the STAC service to discover and access the data products of the <em>Brazil Data Cube</em>.
</div>

<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/stac/stac.png?raw=true" align="right" width="66"/>

# **S**patio**T**emporal **A**sset **C**atalog (STAC)
<hr style="border:1px solid #0077b9;">

The [**S**patio**T**emporal **A**sset **C**atalog (STAC)](https://stacspec.org/) is a specification created through the collaboration of several organizations to increase the interoperability of satellite image search.

The diagram depicted in the picture contains the most important concepts behind the STAC data model:

<center>
<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/stac/stac-concept.png" width="480" />
<br/>
STAC model.
</center>

The descriptions of the concepts below are adapted from the [STAC Specification](https://github.com/radiantearth/stac-spec):

- **Item**: a `STAC Item` is the atomic unit of metadata in STAC, providing links to the actual `assets` (including thumbnails) that it represents. It is a `GeoJSON Feature` with additional fields for things like time, links to related entities, and, mainly, links to the assets. According to the specification, this is the atomic unit that describes the data to be discovered in a `STAC Catalog` or `Collection`.

- **Asset**: a `spatiotemporal asset` is any file that represents information about the Earth captured at a certain place and time.


- **Catalog**: provides a structure to link various `STAC Items` together or even to other `STAC Catalogs` or `Collections`.


- **Collection**: a specialization of the `Catalog` that carries additional information about a spatiotemporal collection of data.
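
To make these concepts more concrete, the sketch below writes a minimal Item-like structure as a plain Python dictionary. The identifier, coordinates, and asset URL are hypothetical placeholders, not values from the Brazil Data Cube service, and only a subset of the fields defined by the specification is shown.

```python
# A minimal, hand-written STAC Item expressed as a GeoJSON Feature.
# All values (id, collection, geometry, href) are hypothetical placeholders.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-item",
    "collection": "example-collection",
    "bbox": [-52.4, -6.5, -52.3, -6.4],  # [west, south, east, north]
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-52.4, -6.5], [-52.3, -6.5], [-52.3, -6.4], [-52.4, -6.4], [-52.4, -6.5],
        ]],
    },
    "properties": {"datetime": "2020-01-01T00:00:00Z"},
    "links": [],
    "assets": {
        "B02": {
            "href": "https://example.com/B02.tif",
            "type": "image/tiff; application=geotiff",
        },
    },
}

# The assets dictionary maps a key (here a band name) to the file location.
for key, asset in item["assets"].items():
    print(key, "->", asset["href"])
```

The Item carries the spatial footprint (`geometry`, `bbox`) and the time (`properties.datetime`), while each entry under `assets` points to an actual file.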

# STAC Client API
<hr style="border:1px solid #0077b9;">

To run the examples in this Jupyter Notebook, you need to install [pystac-client](https://pystac-client.readthedocs.io/en/latest/). To install it from PyPI using `pip`, run the following command:

In [None]:
#!pip install pystac-client

In [None]:
#!pip install shapely tqdm

To access the functionalities of the client API, import the `pystac_client` package as follows:

In [None]:
import pystac_client

Then, create a `Client` object attached to the Brazil Data Cube's STAC service:

In [None]:
service = pystac_client.Client.open('https://data.inpe.br/bdc/stac/v1/')

# Listing the Available Data Products
<hr style="border:1px solid #0077b9;">

The `get_collections` method of the `Client` object lists the image and data cube collections available from the service:

In [None]:
for collection in service.get_collections():
    print(collection)

<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/stac/stac-catalog.png?raw=true" align="right" width="300"/>

# Retrieving the Metadata of a Collection
<hr style="border:1px solid #0077b9;">

The `get_collection` method returns information about a given image or data cube collection, identified by its name. In this example we retrieve information about the data cube collection `S2-16D-2`:

In [None]:
collection = service.get_collection('S2-16D-2')
collection

<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/stac/stac-item.png?raw=true" align="right" width="300"/>

# Retrieving Items
<hr style="border:1px solid #0077b9;">

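The `get_items` method returns the items of a given collection. Before retrieving items, we define a bounding box for the area of interest and display it on a map: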

In [None]:
import folium

In [None]:
bbox = [-52.3625, -6.43, -52.3575, -6.425]

In [None]:
f = folium.Figure(width=1000, height=300) # Restrict figure size

# Create a folium map centered around the geographic area of interest
folium_map = folium.Map(location=[-6.41, -52.35], zoom_start=13)

folium.Rectangle(
    bounds=[[bbox[1],bbox[0]],[bbox[3],bbox[2]]],
    color="blue",
    weight=2,
    fill=True,
    fill_color="blue",
    fill_opacity=0.2
).add_to(folium_map)

folium_map
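
The index shuffling in the `bounds` argument above comes from two different axis orders: the `bbox` list follows the `[west, south, east, north]` order (longitude first), while folium expects `[latitude, longitude]` pairs. The small helpers below are a hypothetical convenience, not part of folium or pystac-client, that make the conversion explicit.

```python
def bbox_to_folium_bounds(bbox):
    """Convert [west, south, east, north] to [[south, west], [north, east]]."""
    west, south, east, north = bbox
    return [[south, west], [north, east]]


def bbox_center(bbox):
    """Return the [latitude, longitude] center of a [west, south, east, north] box."""
    west, south, east, north = bbox
    return [(south + north) / 2, (west + east) / 2]


print(bbox_to_folium_bounds([-52.3625, -6.43, -52.3575, -6.425]))
print(bbox_center([-58, -12, -57, -11]))
```

The first result can be passed as `bounds` to `folium.Rectangle`, and the second as `location` to `folium.Map`.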

To filter items by a rectangle (`bbox`) or by a date and time (`datetime`) criterion, use `Client.search(**kwargs)`:

In [None]:
item_search = service.search(bbox=bbox,
                             datetime='2020-01-01/2020-12-31',
                             collections=['S2-16D-2'])
item_search
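
The search arguments can also be assembled programmatically. The sketch below uses two hypothetical helpers (they are not part of pystac-client) to build the `start/end` interval string and to run basic sanity checks on a bounding box. It assumes the box does not cross the antimeridian.

```python
from datetime import date


def datetime_interval(start, end):
    """Build a 'start/end' interval string from two dates."""
    if start > end:
        raise ValueError("start must not be after end")
    return f"{start.isoformat()}/{end.isoformat()}"


def validate_bbox(bbox):
    """Check a [west, south, east, north] box given in degrees."""
    if len(bbox) != 4:
        raise ValueError("bbox must have four values")
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return list(bbox)


search_kwargs = {
    "bbox": validate_bbox([-52.3625, -6.43, -52.3575, -6.425]),
    "datetime": datetime_interval(date(2020, 1, 1), date(2020, 12, 31)),
    "collections": ["S2-16D-2"],
}
print(search_kwargs)
```

The resulting dictionary can then be unpacked into the search call, for example `service.search(**search_kwargs)`.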

The method `.search(**kwargs)` returns an `ItemSearch` object, which has handy methods to inspect the matched results. For example, to check the number of items matched, use `.matched()`:

In [None]:
item_search.matched()

To iterate over the matched results, use `.items()` to traverse the items:

In [None]:
for item in item_search.items():
    print(item)

In [None]:
items = list(item_search.items())
items

<img src="https://raw.githubusercontent.com/brazil-data-cube/code-gallery/master/img/stac/stac-asset.png?raw=true" align="right" width="300"/>

# Assets
<hr style="border:1px solid #0077b9;">

The assets, which hold the links to the images, thumbnails, or specific metadata files, can be accessed through the `assets` property of a given item:

In [None]:
assets = item.assets #Last item of the loop
assets

Then, from the assets it is possible to traverse or access individual elements:

The metadata related to the Sentinel-2/MSI blue band is available under the dictionary key `B02`:

In [None]:
blue_asset = assets['B02']
blue_asset

To iterate over the item's assets, use the following pattern:

In [None]:
for asset in assets.values():
    print(asset)

# Retrieving Image Files
<hr style="border:1px solid #0077b9;">

Note that the URL for a given asset can be retrieved by the property `href`:

In [None]:
blue_asset.href

In [None]:
#!pip install rasterio

In [None]:
%matplotlib inline

import numpy as np
import rasterio
from matplotlib import pyplot as plt
from pyproj import Transformer
from pyproj.crs import CRS
from rasterio.windows import bounds, from_bounds, Window

Data cubes generated by the Brazil Data Cube use an Albers Equal Area projection ([see here](https://brazil-data-cube.github.io/specifications/bdc-projection.html)).

Here we define some auxiliary functions used in the rest of this Jupyter Notebook.

- `normalize`: Normalizes image values (for visualization).

- `read_img`: Reads an image, optionally restricted to a window.

- `read_bdcimg_using_window_from_4326`: Reads parts (windows) of a BDC image using coordinates from EPSG 4326.

In [None]:
def normalize(array):
    """Normalizes numpy arrays into scale 0.0 - 1.0"""
    array_min, array_max = array.min(), array.max()
    return ((array - array_min)/(array_max - array_min))

def read_img(uri: str, window: Window = None, masked: bool = True):
    """Read raster window as numpy.ma.masked_array."""
    with rasterio.open(uri) as src:
        return src.read(1, window=window, masked=masked)

def read_bdcimg_using_window_from_4326(uri: str, bbox, transformer):
    """Read raster window as numpy using EPSG:4326 to crop the window."""
    x1, y1, x2, y2 = bbox
    x1_reproj, y1_reproj = transformer.transform(x1, y1)
    x2_reproj, y2_reproj = transformer.transform(x2, y2)
    with rasterio.open(uri) as src:
        window = from_bounds(x1_reproj, y1_reproj, x2_reproj, y2_reproj, src.transform)
        rst = src.read(1, window=window)
        window_transform = src.window_transform(window)
        # window_bounds = bounds(window, src.transform)
    return rst, window_transform
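
The min-max `normalize` function above divides by zero when all pixels share the same value, and a few very bright pixels can wash out the rest of the image. The variant below is a sketch of one common alternative, a percentile stretch. The function name and the default percentiles are assumptions chosen for illustration, and it assumes a plain (unmasked, NaN-free) array.

```python
import numpy as np


def normalize_safe(array, lower=2, upper=98):
    """Stretch values between two percentiles into the range 0.0 - 1.0."""
    data = np.asarray(array, dtype=np.float64)
    lo, hi = np.percentile(data, [lower, upper])
    if hi == lo:
        # Constant image: avoid dividing by zero
        return np.zeros_like(data)
    # Values outside the percentile range are clipped to 0.0 or 1.0
    return np.clip((data - lo) / (hi - lo), 0.0, 1.0)


stretched = normalize_safe(np.arange(101.0))
print(stretched.min(), stretched.max())
```

Because of the clipping, the output always stays within 0.0 and 1.0, which is the range `plt.imshow` expects for floating-point RGB data.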

Now let's suppose we don't want to use the entire image, only a part of it.

So we define a bounding box of the area of interest in order to open and visualize the RGB bands.

In [None]:
window_bbox = [-52.4, -6.5, -52.3, -6.4]
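
To see what reading a window means in pixel terms, the sketch below converts projected bounds into a column/row offset and a size. It is a simplified illustration of the idea behind `rasterio.windows.from_bounds`, not its implementation. It assumes a north-up raster with square pixels and no rotation, and the function name and the numbers are hypothetical.

```python
def window_from_bounds(left, bottom, right, top, origin_x, origin_y, pixel_size):
    """Return (col_off, row_off, width, height) in pixels for a north-up raster.

    origin_x and origin_y are the map coordinates of the top-left corner.
    """
    col_off = (left - origin_x) / pixel_size
    row_off = (origin_y - top) / pixel_size  # rows grow downward from the top edge
    width = (right - left) / pixel_size
    height = (top - bottom) / pixel_size
    return col_off, row_off, width, height


# A raster whose top-left corner is at (0, 1000) with 10-unit pixels
print(window_from_bounds(100, 700, 300, 900, origin_x=0, origin_y=1000, pixel_size=10))
```

The bounds must be in the raster's own coordinate reference system, which is why the geographic coordinates are reprojected before the window is computed.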

In [None]:
# Create the transformer
with rasterio.open(assets['B02'].href) as src:
    crs = src.crs  # Read the CRS, then close the dataset
in_proj = CRS.from_epsg(4326)
out_proj = CRS.from_user_input(crs)
transformer = Transformer.from_crs(in_proj, out_proj, always_xy=True)

In [None]:
b02_image, window_transform = read_bdcimg_using_window_from_4326(items[7].assets['B02'].href, window_bbox, transformer)
b03_image, _ = read_bdcimg_using_window_from_4326(items[7].assets['B03'].href, window_bbox, transformer)
b04_image, _ = read_bdcimg_using_window_from_4326(items[7].assets['B04'].href, window_bbox, transformer)

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))
ax1.imshow(b02_image, cmap='gray')
ax2.imshow(b03_image, cmap='gray')
ax3.imshow(b04_image, cmap='gray')

In [None]:
rgb_normalized_stack = np.dstack((normalize(b04_image), normalize(b03_image), normalize(b02_image)))
plt.imshow(rgb_normalized_stack)

-----

# Use Case Example
<hr style="border:1px solid #0077b9;">

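Now let's work through some examples: opening images from different dates, computing a mean, and computing the Temperature Condition Index (TCI).


Formula: TCI = 100 * (BTmax - BT) / (BTmax - BTmin) (Kogan, 1995), where BT is the brightness temperature of a pixel and BTmax and BTmin are its maximum and minimum over the period analyzed.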
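
As a quick arithmetic check of the formula, the sketch below evaluates it for a single pixel. The function name and the temperature values are made up for illustration.

```python
def tci_value(bt, bt_max, bt_min):
    """Evaluate TCI = 100 * (BTmax - BT) / (BTmax - BTmin) for one pixel."""
    if bt_max == bt_min:
        raise ValueError("BTmax and BTmin must differ")
    return 100 * (bt_max - bt) / (bt_max - bt_min)


# Hypothetical values in kelvin
print(tci_value(300, bt_max=310, bt_min=290))  # halfway between the extremes
print(tci_value(310, bt_max=310, bt_min=290))  # at the maximum
print(tci_value(290, bt_max=310, bt_min=290))  # at the minimum
```

The index is 0 when the pixel is at its historical maximum and 100 when it is at its historical minimum.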



First, let's define the area that we will use to search for data.

In [None]:
bbox = [-58, -12, -57, -11]

f = folium.Figure(width=1000, height=300) # Restrict figure size

# Create a folium map centered around the geographic area of interest
folium_map = folium.Map(location=[-11.5, -57.5], zoom_start=6)

folium.Rectangle(
    bounds=[[bbox[1],bbox[0]],[bbox[3],bbox[2]]],
    color="blue",
    weight=2,
    fill=True,
    fill_color="blue",
    fill_opacity=0.2
).add_to(folium_map)

folium_map

Let's use STAC to search for data from the `mod11a2-6.1` collection:

In [None]:
item_search = service.search(bbox=bbox,
                             datetime='2020-01-01/2023-12-31',
                             collections=['mod11a2-6.1'])
item_search

In [None]:
item_search.matched()

In [None]:
items = list(item_search.items())
items

Let's create a dictionary whose keys are strings holding the day of the year (from 001 to 366) and whose values group the items from the different years that fall on that day.

In [None]:
items_day_of_the_year = {}
for item in items:
    day = item.id.split(".")[1][5:]
    if day not in items_day_of_the_year:
        items_day_of_the_year[day] = []
    items_day_of_the_year[day].append(item)
items_day_of_the_year['001']
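
The grouping above depends on the layout of the item identifier. The sketch below repeats the same logic on a few made-up identifiers so that the slicing is easier to follow. The identifiers are hypothetical and only assume that the second dot-separated field looks like `A` + four-digit year + three-digit day of year.

```python
from collections import defaultdict

# Hypothetical identifiers following the assumed "PRODUCT.AYYYYDDD.TILE.VERSION" layout
item_ids = [
    "MOD11A2.A2020001.h12v10.061",
    "MOD11A2.A2021001.h12v10.061",
    "MOD11A2.A2020009.h12v10.061",
]

ids_by_day = defaultdict(list)
for item_id in item_ids:
    date_field = item_id.split(".")[1]  # e.g. "A2020001"
    day = date_field[5:]                # skip "A" and the year, keep "001"
    ids_by_day[day].append(item_id)

print(dict(ids_by_day))
```

Using `defaultdict(list)` removes the need to check whether a key already exists before appending.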

In [None]:
items_day_of_the_year['001'][0].assets

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th>SDS Name</th>
      <th>Description</th>
      <th>Units</th>
      <th>Data Type</th>
      <th>Fill Value</th>
      <th>No Data Value</th>
      <th>Valid Range</th>
      <th>Scale Factor</th>
      <th>Offset</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>QC_Day</td>
      <td>Daytime LST Quality Indicators</td>
      <td>Bit Field</td>
      <td>8-bit unsigned integer</td>
      <td>N/A</td>
      <td>N/A</td>
      <td>0 to 255</td>
      <td>N/A</td>
      <td>N/A</td>
    </tr>
    <tr>
      <td>Emis_31</td>
      <td>Band 31 emissivity</td>
      <td>N/A</td>
      <td>8-bit unsigned integer</td>
      <td>0</td>
      <td>N/A</td>
      <td>1 to 255</td>
      <td>0.002</td>
      <td>0.49</td>
    </tr>
  </tbody>
</table>

In [None]:
num_images = len(items_day_of_the_year['001'])  # Total number of images to plot
fig, axes = plt.subplots(1, num_images, figsize=(4 * num_images, 4))  # Create the figure

for ax, item in zip(axes, items_day_of_the_year['001']):
    image = read_img(item.assets['Emis_31'].href)
    ax.imshow(image)
    ax.axis('off')  # Hide the axis values for a cleaner plot
    ax.set_title(item.id, fontsize=8, loc='left')
plt.show()

In [None]:
band = 'Emis_31'
Emis_31_sum = None
num_items = len(items_day_of_the_year['001'])

for item in items_day_of_the_year['001']:
    image = read_img(item.assets[band].href)  # Read the image
    if Emis_31_sum is None:
        Emis_31_sum = np.zeros_like(image, dtype=np.float64)  # Initialize with zeros
    Emis_31_sum += image
Emis_31_avg = Emis_31_sum / num_items


plt.figure(figsize=(8, 6))
plt.imshow(Emis_31_avg, cmap='viridis')
plt.colorbar(label=f"Mean of {band}")
plt.title(f"Mean of band {band}")
plt.axis('off')  # Hide the axes for clarity
plt.show()

### WARNING! THIS MEAN ALSO INCLUDES CLOUDY PIXELS! ###
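
One way to keep unwanted pixels out of the mean is to mask them before averaging. The sketch below uses small synthetic arrays and treats the value 0 as invalid, following the fill value listed for `Emis_31` in the table above. How cloudy pixels are actually flagged (for example, through the `QC_Day` band) is not covered here, so the mask is an assumption for illustration.

```python
import numpy as np

# Three synthetic 2x2 "images"; 0 marks an invalid pixel
stack = np.array([
    [[10, 20], [30, 0]],
    [[20, 0], [30, 40]],
    [[30, 40], [0, 0]],
], dtype=np.float64)

masked_stack = np.ma.masked_equal(stack, 0)  # Hide the invalid pixels
mean_valid = masked_stack.mean(axis=0)       # Per-pixel mean over valid values only
valid_count = masked_stack.count(axis=0)     # Number of valid observations per pixel

print(mean_valid)
print(valid_count)
```

Each pixel is divided by its own number of valid observations instead of by the total number of images.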

TCI calculation:

In [None]:
def calculate_tci(bt, bt_max, bt_min):
    """Compute the Temperature Condition Index (TCI) per pixel."""
    bt = bt.astype(float)
    bt_max = bt_max.astype(float)
    bt_min = bt_min.astype(float)

    denominator = bt_max - bt_min
    denominator[denominator == 0] = np.nan  # Use NaN where the denominator is zero

    tci = 100 * (bt_max - bt) / denominator  # TCI equation

    return tci


def obtain_max(items, band):
    """Return the per-pixel maximum of `band` over all items."""
    result = None  # Avoid shadowing the built-in `max`
    for item in items:
        image = read_img(item.assets[band].href)  # Read the image
        if result is None:
            result = image  # Initialize with the first image
            continue
        result = np.maximum(result, image)
    return result


def obtain_min(items, band):
    """Return the per-pixel minimum of `band` over all items."""
    result = None  # Avoid shadowing the built-in `min`
    for item in items:
        image = read_img(item.assets[band].href)  # Read the image
        if result is None:
            result = image  # Initialize with the first image
            continue
        result = np.minimum(result, image)
    return result

In [None]:
bt_max = obtain_max(items_day_of_the_year['001'], band)
bt_min = obtain_min(items_day_of_the_year['001'], band)
my_image = read_img(items_day_of_the_year['001'][0].assets[band].href)
tci = calculate_tci(my_image, bt_max, bt_min)

plt.figure(figsize=(8, 6))
plt.imshow(tci, cmap='viridis')
plt.colorbar(label="TCI")
plt.title("TCI")
plt.axis('off')  # Hide the axes for clarity
plt.show()