# STAC assets generators

When we want to generate STAC metadata from an imagery dataset through EOTDL, we must first generate a STACDataFrame, as seen in this [notebook](20_stac.ipynb). Generating the STACDataFrame requires a parameter called `assets_generator`. In this notebook we look at that parameter in detail.

Uncomment the following line to install eotdl if needed.

In [None]:
# !pip install eotdl

The `assets_generator` parameter defines the strategy used to generate assets from each image. For example, from a Sentinel-2 image we might want to extract all of its bands as assets, extract only the RGB bands, or extract none at all. The following strategies are provided by default, and they can be extended as needed.

- `STACAssetGenerator`: does not extract new assets from the image bands, so a single asset is generated for the image.
- `BandsAssetGenerator`: creates a new file from the original image for each band listed in the `bands` column, then deletes the original file. An asset is added to the STAC item for each band. _Attention: this feature is still under development and may not work._
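
The idea of interchangeable asset strategies can be illustrated with a minimal, self-contained sketch. The classes `SingleAssetStrategy` and `PerBandStrategy`, the `asset_keys` method and the `plan_assets` function are hypothetical names invented for this illustration. They are not part of the eotdl API and only mimic the behaviour described in the list above.

```python
from pathlib import Path


# Hypothetical stand-ins for illustration only; these are NOT the eotdl classes.
class SingleAssetStrategy:
    """Keep the whole image as one asset (the idea behind STACAssetGenerator)."""

    def asset_keys(self, image_path, bands=None):
        # One asset, keyed by the file name without its extension.
        return [Path(image_path).stem]


class PerBandStrategy:
    """Produce one asset per requested band (the idea behind BandsAssetGenerator)."""

    def asset_keys(self, image_path, bands=None):
        # One asset per band; no bands means no assets.
        return list(bands or [])


def plan_assets(image_path, strategy, bands=None):
    # The caller depends only on the shared `asset_keys` method,
    # so new strategies can be added without changing this function.
    return strategy.asset_keys(image_path, bands)


image = "example_data/jaca_dataset/Jaca_1.tif"
single = plan_assets(image, SingleAssetStrategy())
per_band = plan_assets(image, PerBandStrategy(), ("B02", "B03", "B04"))
print(single)
print(per_band)
```

Because the calling code relies only on a common method, adding a further strategy would amount to writing one more class with the same method.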

In [1]:
from eotdl.curation.stac.stac import STACGenerator
from eotdl.curation.stac.assets import STACAssetGenerator, BandsAssetGenerator
from eotdl.curation.stac.parsers import UnestructuredParser
from eotdl.curation.stac.dataframe_labeling import LabeledStrategy

stac_generator = STACGenerator(item_parser=UnestructuredParser, 
                               assets_generator=STACAssetGenerator, 
                               labeling_strategy=LabeledStrategy,
                               image_format='tif'
                               )

In [2]:
df = stac_generator.get_stac_dataframe('example_data/jaca_dataset/')
df.head()

| | image | label | ix | collection | extensions | bands |
|---|---|---|---|---|---|---|
| 0 | example_data/jaca_dataset/Jaca_1.tif | Jaca | 0 | example_data/jaca_dataset/source | | |
| 1 | example_data/jaca_dataset/Jaca_2.tif | Jaca | 0 | example_data/jaca_dataset/source | | |
| 2 | example_data/jaca_dataset/Jaca_3.tif | Jaca | 0 | example_data/jaca_dataset/source | | |
| 3 | example_data/jaca_dataset/Jaca_4.tif | Jaca | 0 | example_data/jaca_dataset/source | | |


A key feature is the `label` column. The label of each image is used to assign parameters such as the STAC extensions that the image's item will have, or the bands to extract with the `BandsAssetGenerator`. Before adding new information, we can list the labels that already exist in the STACDataFrame.

In [3]:
labels = df.label.unique().tolist()
labels

['Jaca']

Using the label we just found, we can now define the image bands. To keep things simple, let's define only the bands `B04`, `B03` and `B02`, which are the red, green and blue (RGB) bands respectively.

To define these parameters for each label, we simply declare a dictionary that maps the label to its bands.

In [4]:
bands = {'Jaca': ('B02', 'B03', 'B04')}

In [6]:
df = stac_generator.get_stac_dataframe('example_data/jaca_dataset/', bands=bands)
df.head()

| | image | label | ix | collection | extensions | bands |
|---|---|---|---|---|---|---|
| 0 | example_data/jaca_dataset/Jaca_1.tif | Jaca | 0 | example_data/jaca_dataset/source | | (B02, B03, B04) |
| 1 | example_data/jaca_dataset/Jaca_2.tif | Jaca | 0 | example_data/jaca_dataset/source | | (B02, B03, B04) |
| 2 | example_data/jaca_dataset/Jaca_3.tif | Jaca | 0 | example_data/jaca_dataset/source | | (B02, B03, B04) |
| 3 | example_data/jaca_dataset/Jaca_4.tif | Jaca | 0 | example_data/jaca_dataset/source | | (B02, B03, B04) |
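
The way the `bands` column above ends up with the same tuple on every row can be pictured as a lookup by label. The sketch below uses plain Python dictionaries instead of a real STACDataFrame, and the `attach_bands` helper is a hypothetical name for illustration. It is an assumption about the general idea, not a description of how eotdl implements it.

```python
# Rows shaped like the STACDataFrame shown above (plain dicts for illustration).
rows = [
    {"image": f"example_data/jaca_dataset/Jaca_{i}.tif", "label": "Jaca"}
    for i in range(1, 5)
]
bands = {"Jaca": ("B02", "B03", "B04")}


def attach_bands(rows, bands_by_label):
    # Hypothetical helper: look up each row's label in the dictionary.
    # Labels missing from the dictionary get None, i.e. no bands defined.
    return [{**row, "bands": bands_by_label.get(row["label"])} for row in rows]


labeled = attach_bands(rows, bands)
print(labeled[0]["bands"])
```

Every image that shares a label receives the same bands, which is why a single dictionary entry is enough for the whole Jaca dataset.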


If we simply generate the STAC metadata with the `generate_stac_metadata` function, the bands defined above have no effect, because we have not declared a STAC `eo` extension that would describe them in the STAC item. If we used the `BandsAssetGenerator` instead, the selected bands would be extracted from the original image, saved as `.tif` files in the image's location, and added to the STAC item as assets. This works with any combination of Sentinel-1 (VV, VH) and Sentinel-2 bands, as long as the bands exist in the original image.
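
To give an idea of what describing bands through the `eo` extension involves, here is a minimal sketch that builds an `eo:bands` list in the style of version 1.x of the STAC Electro-Optical extension, where each band is described by a `name` and an optional `common_name`. The `S2_COMMON_NAMES` mapping and the `eo_bands` helper are hypothetical names written for this example. The sketch does not call eotdl and does not show how eotdl writes these fields.

```python
# Common names for the three Sentinel-2 bands used in this notebook.
S2_COMMON_NAMES = {"B02": "blue", "B03": "green", "B04": "red"}


def eo_bands(band_names):
    # Hypothetical helper: build `eo:bands` entries (EO extension v1.x style).
    return [
        {"name": name, "common_name": S2_COMMON_NAMES[name]}
        for name in band_names
    ]


# Fragment of the properties a STAC item could carry for these bands.
properties = {"eo:bands": eo_bands(("B02", "B03", "B04"))}
print(properties)
```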