In [1]:
%load_ext autoreload
%autoreload 2

import os
os.environ["EOTDL_API_URL"] = "http://localhost:8000/"

# STAC

When you ingest a dataset to the EOTDL, a `catalog.parquet` file is created with the metadata of the dataset. This metadata is STAC-compliant, so it can be used to query the dataset using the STAC API and generate STAC catalogs.

# STAC Catalogs

The following code will ingest a dataset to the EOTDL and create a `catalog.parquet` file with the metadata of the dataset.

In [2]:
from eotdl.datasets import ingest_dataset

path = "example_data/EuroSAT-small"
ingest_dataset(path)

Ingesting directory: example_data/EuroSAT-small


Ingesting files: 100%|██████████| 8/8 [00:00<00:00, 48.91it/s]

PosixPath('example_data/EuroSAT-small/catalog.parquet')

In [3]:
import geopandas as gpd

catalog = f"{path}/catalog.parquet"

gdf = gpd.read_parquet(catalog)
gdf.head()

|   | type | stac_version | stac_extensions | datetime | id | bbox | geometry | assets | links | repository |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature | 1.0.0 | [] | 2025-02-06 12:38:53.927428 | catalog.parquet | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'href': 'http://localhost:8000/data... | [] | eotdl |
| 1 | Feature | 1.0.0 | [] | 2025-02-06 12:38:53.927533 | README.md | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'href': 'http://localhost:8000/data... | [] | eotdl |
| 2 | Feature | 1.0.0 | [] | 2025-02-06 12:38:53.927581 | Forest/Forest_3.tif | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'href': 'http://localhost:8000/data... | [] | eotdl |
| 3 | Feature | 1.0.0 | [] | 2025-02-06 12:38:53.927615 | Forest/Forest_1.tif | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'href': 'http://localhost:8000/data... | [] | eotdl |
| 4 | Feature | 1.0.0 | [] | 2025-02-06 12:38:53.927647 | Forest/Forest_2.tif | {'xmax': 0.0, 'xmin': 0.0, 'ymax': 0.0, 'ymin'... | POLYGON EMPTY | {'asset': {'href': 'http://localhost:8000/data... | [] | eotdl |
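
The columns shown above line up with the fields of a STAC Item, with the item properties flattened into top-level columns. As a rough sketch (not the EOTDL implementation), the hypothetical helper `row_to_item` below rebuilds a STAC Item dictionary from one row represented as a plain dict. It assumes that every column that is not a top-level Item field belongs in `properties`, that naive timestamps are UTC, and that an empty geometry maps to `None`. The `href` value is a placeholder, not the real asset URL.

```python
from datetime import datetime, timezone

# Top-level fields defined by the STAC Item specification
# ("properties" is rebuilt separately below).
ITEM_FIELDS = {
    "type", "stac_version", "stac_extensions", "id",
    "bbox", "geometry", "assets", "links",
}


def row_to_item(row: dict) -> dict:
    """Hypothetical helper: rebuild a STAC Item dict from a flattened row."""
    item = {k: v for k, v in row.items() if k in ITEM_FIELDS}
    # Assumption: every remaining column is an item property.
    props = {k: v for k, v in row.items() if k not in ITEM_FIELDS}

    # STAC expects bbox as a list: [minx, miny, maxx, maxy].
    bbox = row.get("bbox")
    if isinstance(bbox, dict):
        item["bbox"] = [bbox["xmin"], bbox["ymin"], bbox["xmax"], bbox["ymax"]]

    # STAC expects datetime as an RFC 3339 string.
    dt = props.get("datetime")
    if isinstance(dt, datetime):
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        props["datetime"] = dt.isoformat()

    item["properties"] = props
    return item


row = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "stac_extensions": [],
    "datetime": datetime(2025, 2, 6, 12, 38, 53),
    "id": "Forest/Forest_1.tif",
    "bbox": {"xmin": 0.0, "ymin": 0.0, "xmax": 0.0, "ymax": 0.0},
    "geometry": None,  # stands in for POLYGON EMPTY
    "assets": {"asset": {"href": "https://example.com/placeholder.tif"}},
    "links": [],
    "repository": "eotdl",
}

item = row_to_item(row)
print(sorted(item))
```

With a real catalog, each row of `gdf` could be converted to a dict first (for example with `gdf.iloc[i].to_dict()`), although geometry values would then need their own conversion to GeoJSON.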


Since the metadata generated by the EOTDL is STAC-compliant, it can be used to generate STAC catalogs automatically.

In [4]:
from eotdl.curation.stac import create_stac_catalog

items = create_stac_catalog(catalog)

items

100%|██████████| 8/8 [00:00<00:00, 367.64it/s]


[<Item id=catalog.parquet>,
 <Item id=README.md>,
 <Item id=Forest/Forest_3.tif>,
 <Item id=Forest/Forest_1.tif>,
 <Item id=Forest/Forest_2.tif>,
 <Item id=AnnualCrop/AnnualCrop_3.tif>,
 <Item id=AnnualCrop/AnnualCrop_1.tif>,
 <Item id=AnnualCrop/AnnualCrop_2.tif>]
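
Each item id keeps the file's path relative to the dataset root, so the parent folder can be read back from it. The sketch below groups the ids printed above by folder using only the standard library. Treating the folder name as a class label is an assumption about this EuroSAT layout, and `group_by_folder` is a hypothetical helper. With real `pystac` items, the list could be built as `[item.id for item in items]`.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Item ids copied from the output above.
item_ids = [
    "catalog.parquet",
    "README.md",
    "Forest/Forest_3.tif",
    "Forest/Forest_1.tif",
    "Forest/Forest_2.tif",
    "AnnualCrop/AnnualCrop_3.tif",
    "AnnualCrop/AnnualCrop_1.tif",
    "AnnualCrop/AnnualCrop_2.tif",
]


def group_by_folder(ids):
    """Hypothetical helper: group item ids by their parent folder."""
    groups = defaultdict(list)
    for item_id in ids:
        # The parent name is an empty string for files at the dataset root.
        folder = PurePosixPath(item_id).parent.name
        groups[folder or "(root)"].append(item_id)
    return dict(groups)


groups = group_by_folder(item_ids)
for folder, members in groups.items():
    print(folder, len(members))
```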

Optionally, you can create a STAC catalog / collection and link the items to it.

In [5]:
from eotdl.curation.stac import create_stac_catalog
import pystac

stac_catalog = pystac.Catalog(
    id="eotdl-catalog",
    description="EOTDL Catalog",
    title="EOTDL Catalog",
    stac_extensions=[],
    extra_fields={},
)

stac_catalog = create_stac_catalog(catalog, stac_catalog)

stac_catalog

100%|██████████| 8/8 [00:00<00:00, 411.60it/s]




Either way, once the STAC metadata is generated, it can be saved to disk.

In [6]:
stac_catalog.normalize_and_save(
    root_href="data/stac",
    catalog_type=pystac.CatalogType.SELF_CONTAINED,
)

Keep in mind that if the original dataset already has STAC metadata, it will be overwritten.
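
One way to guard against losing existing metadata is to copy it aside before saving. The sketch below is a minimal example using only the standard library. It assumes the metadata at risk lives in a single directory (such as the `data/stac` output folder used above), and `backup_if_exists` is a hypothetical helper, not part of the EOTDL or `pystac`.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_if_exists(target: str) -> Optional[Path]:
    """Hypothetical helper: copy `target` to a timestamped sibling directory.

    Returns the backup path, or None when there is nothing to back up.
    """
    target_path = Path(target)
    if not target_path.exists():
        return None
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = target_path.with_name(f"{target_path.name}.bak-{stamp}")
    # copytree raises FileExistsError if the backup directory already exists.
    shutil.copytree(target_path, backup)
    return backup


# Usage, before calling normalize_and_save:
# backup_if_exists("data/stac")
```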

# STAC API

> TODO