# Discover Data via the STAC API

Datasets hosted at EODC are cataloged according to the [STAC](https://stacspec.org/en) (SpatioTemporal Asset Catalog) specification. The catalog is exposed as a STAC API at [https://stac.eodc.eu/api/v1](https://stac.eodc.eu/api/v1), which lets users discover and search datasets by space, time and other attributes. In the following we demonstrate how to use the STAC API and open-source Python libraries to run search queries against multiple STAC API instances.

## Connecting to the EODC STAC catalogue

In this example we use `pystac-client`, a popular STAC client for Python. The library is already installed in this environment; elsewhere it can be installed manually with `pip install pystac-client`.

In [1]:
try:
    from pystac_client import Client
except ImportError:
    %pip install pystac-client
    from pystac_client import Client

try:
    from IPython.display import Image
except ImportError:
    %pip install IPython
    from IPython.display import Image

try:
    from rich.console import Console
except ImportError:
    %pip install rich
    from rich.console import Console

import rich.table

try:
    import geopandas
except ImportError:
    %pip install geopandas
    import geopandas

In [2]:
eodc_catalog = Client.open(
    "https://stac.eodc.eu/api/v1",
)

eodc_catalog.title


'EODC Data Catalogue'

## Searching for collections

All data in the catalog is stored in so-called collections, which are named, for example, after the satellite mission.

In [3]:
for collection in eodc_catalog.get_collections():
    print(collection)

<CollectionClient id=SENTINEL2_L2A>
<CollectionClient id=SENTINEL2_GRI_L1C>
<CollectionClient id=GFM>
<CollectionClient id=SENTINEL1_HPAR>
<CollectionClient id=DOP_AUT_K_KLAGENFURT>
<CollectionClient id=DOP_AUT_K_OSTTIROL>
<CollectionClient id=DOP_AUT_K_TAMSWEG>
<CollectionClient id=DOP_AUT_K_VILLACH>
<CollectionClient id=DOP_AUT_K_WOLFSBERG>
<CollectionClient id=DOP_AUT_K_ZELL_AM_SEE>
<CollectionClient id=DOP_AUT_K_ZELTWEG>
<CollectionClient id=AUT_DEM>
<CollectionClient id=COP_DEM>
<CollectionClient id=SENTINEL1_SLC>
<CollectionClient id=SENTINEL1_MPLIA>
<CollectionClient id=SENTINEL1_SIG0_20M>
<CollectionClient id=AI4SAR_SIG0>
<CollectionClient id=SENTINEL1_GRD>
<CollectionClient id=SENTINEL2_L1C>
<CollectionClient id=SENTINEL3_SRAL_L2>
<CollectionClient id=SENTINEL1_GRD_COVERAGE>
<CollectionClient id=INTRA_FIELD_CROP_GROWTH_POTENTIAL>
<CollectionClient id=DROUGHT_VULNERABILITY>
<CollectionClient id=SENTINEL2_MFCOVER>
<CollectionClient id=VEGETATION_CHANGE_AUSTRIA>
<CollectionClient

On static as well as dynamic catalogues we can also make use of the `links` attribute, which lets us quickly examine, for instance, the number of available collections.

In [4]:
child_links = eodc_catalog.get_links('child')
print(f"The EODC STAC catalogue currently features {len(child_links)} collections.")

The EODC STAC catalogue currently features 33 collections.
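To illustrate what counting `child` links amounts to, here is a minimal sketch that works on a plain dictionary instead of a live catalog. The `landing_page` document and its `example.com` URLs are hypothetical and heavily shortened, and `count_links_by_rel` is a helper name introduced only for this illustration. In STAC, every link object carries a `rel` field that states its relation type.

```python
from collections import Counter

# Hypothetical, heavily shortened landing-page document (not the EODC response)
landing_page = {
    "type": "Catalog",
    "id": "example-catalog",
    "links": [
        {"rel": "self", "href": "https://example.com/api/v1"},
        {"rel": "child", "href": "https://example.com/api/v1/collections/A"},
        {"rel": "child", "href": "https://example.com/api/v1/collections/B"},
        {"rel": "search", "href": "https://example.com/api/v1/search"},
    ],
}


def count_links_by_rel(catalog_dict):
    """Count the link objects of a STAC catalog document per relation type."""
    return Counter(link["rel"] for link in catalog_dict.get("links", []))


rel_counts = count_links_by_rel(landing_page)
print(f"child links: {rel_counts['child']}")
```

Because a `Counter` returns 0 for unknown keys, asking for a relation type that does not occur in the document is safe.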


Individual collections can be retrieved by their ID.

In [5]:
collection = eodc_catalog.get_collection("SENTINEL2_L1C")
collection

In [6]:
print(f"This collection contains data in the following temporal interval: {collection.extent.temporal.to_dict()}")

This collection contains data in the following temporal interval: {'interval': [['2015-07-04T00:00:00Z', None]]}
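The temporal extent is a list of `[start, end]` pairs of timestamp strings, where `None` marks an open end (the collection is still growing). The following sketch converts such a pair into `datetime` objects using only the standard library. `parse_interval` is a hypothetical helper name, and the `extent` dictionary simply repeats the structure printed above.

```python
from datetime import datetime


def parse_interval(interval):
    """Convert one [start, end] pair of RFC 3339 strings into datetimes.

    A value of None marks an open end and is passed through unchanged.
    """

    def parse(value):
        if value is None:
            return None
        # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
        # so it is replaced by an explicit UTC offset here.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))

    start, end = interval
    return parse(start), parse(end)


# Same structure as the output of collection.extent.temporal.to_dict()
extent = {"interval": [["2015-07-04T00:00:00Z", None]]}
start, end = parse_interval(extent["interval"][0])
print(start.isoformat(), "->", "ongoing" if end is None else end.isoformat())
```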


## STAC Items

Similarly to before, we can use the collection client instance to iterate over the items contained in the collection. For this to work, the server must provide the `/collections/<collection_id>/items` endpoint. Iterating is useful for filtering items manually or extracting information programmatically. The `get_all_items()` method again returns an iterator.

In [7]:
items = collection.get_all_items()

Collect the first 10 items with less than 10% cloud cover.

In [8]:
items10 = []
for item in items:
    if len(items10) == 10:
        break
    # Not every item is guaranteed to carry the eo:cloud_cover property
    cloud_cover = item.properties.get("eo:cloud_cover")
    if cloud_cover is not None and cloud_cover < 10:
        print(f"Append item {item.id} with {cloud_cover:.2f}% cloud cover")
        items10.append(item)

Append item S2B_MSIL1C_20240301T124939_R138_T30WWE_20240301T134634 with 0.73% cloud cover
Append item S2B_MSIL1C_20240301T124939_R138_T30WVD_20240301T134634 with 9.63% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T23ENN_20240301T140054 with 1.00% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T23EMP_20240301T140054 with 5.11% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T23EMN_20240301T140054 with 4.77% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T23ELP_20240301T140054 with 2.23% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T23ELN_20240301T140054 with 0.09% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T22EFU_20240301T140054 with 6.46% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T22EFT_20240301T140054 with 0.00% cloud cover
Append item S2A_MSIL1C_20240301T123851_R066_T22EET_20240301T140054 with 0.00% cloud cover
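The same "take the first N matches from an iterator" pattern can be written with a generator and `itertools.islice`, which stops consuming the iterator as soon as enough items are found. The sketch below runs on hypothetical dictionaries that stand in for STAC items; with real `pystac` items one would read `item.properties` instead of `item["properties"]`. `low_cloud_items` is a helper name chosen for this illustration.

```python
from itertools import islice


def low_cloud_items(items, max_cloud_cover=10.0):
    """Lazily yield items whose eo:cloud_cover is present and below the threshold."""
    for item in items:
        cloud_cover = item["properties"].get("eo:cloud_cover")
        if cloud_cover is not None and cloud_cover < max_cloud_cover:
            yield item


# Hypothetical stand-ins for STAC items, reduced to the fields used here
sample_items = [
    {"id": "item-a", "properties": {"eo:cloud_cover": 0.73}},
    {"id": "item-b", "properties": {"eo:cloud_cover": 54.2}},
    {"id": "item-c", "properties": {}},  # property missing entirely
    {"id": "item-d", "properties": {"eo:cloud_cover": 9.63}},
    {"id": "item-e", "properties": {"eo:cloud_cover": 1.0}},
]

# Stop after the first two matches; "item-e" is never inspected
first_two = list(islice(low_cloud_items(sample_items), 2))
print([item["id"] for item in first_two])
```

When the items come from a paginated API, stopping early this way also avoids requesting pages that are not needed.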


If the item provides a preview image we can look at it.

In [9]:
Image(url=items10[4].assets["thumbnail"].href, width=500)

## Search for items in a collection with filter criteria



First we set the temporal and spatial extent.

There are two options for defining the spatial extent:
1. A polygon in GeoJSON format
2. A bounding box (bbox)

In [10]:
console = Console()

time_range = "2023-05-01/2024-05-01"

# GeoJSON geometries can be created on geojson.io
# Area around the Neusiedler See
area_of_interest = {
    "type": "Polygon",
    "coordinates": [
        [
            [16.685331259653253, 48.001346032803355],
            [16.621884871275512, 47.902601630022275],
            [16.62588718725482, 47.81041047247777],
            [16.664809254423375, 47.774602171781936],
            [16.96808652311867, 47.76771348708101],
            [16.963971948548988, 48.00956486424042],
            [16.685331259653253, 48.001346032803355],
        ]
    ],
}


In [11]:
# Bounding box of Austria as [min_lon, min_lat, max_lon, max_lat];
# uncomment to use it instead of the polygon
# bbox_aut = [9.25, 46.31, 17.46, 49.18]
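The two spatial options are related: a bounding box in the order `[min_lon, min_lat, max_lon, max_lat]` can be derived from a GeoJSON polygon by taking the extremes of its exterior ring. The sketch below does this for the Neusiedler See polygon. `bbox_from_polygon` is a hypothetical helper, and it assumes a simple polygon that does not cross the antimeridian.

```python
def bbox_from_polygon(geometry):
    """Return [min_lon, min_lat, max_lon, max_lat] for a GeoJSON Polygon."""
    if geometry.get("type") != "Polygon":
        raise ValueError("only GeoJSON Polygon geometries are handled in this sketch")
    # The first ring is the exterior ring; holes (if any) lie inside it
    exterior_ring = geometry["coordinates"][0]
    lons = [point[0] for point in exterior_ring]
    lats = [point[1] for point in exterior_ring]
    return [min(lons), min(lats), max(lons), max(lats)]


# Polygon around the Neusiedler See (GeoJSON order: longitude, latitude)
polygon = {
    "type": "Polygon",
    "coordinates": [
        [
            [16.685331259653253, 48.001346032803355],
            [16.621884871275512, 47.902601630022275],
            [16.62588718725482, 47.81041047247777],
            [16.664809254423375, 47.774602171781936],
            [16.96808652311867, 47.76771348708101],
            [16.963971948548988, 48.00956486424042],
            [16.685331259653253, 48.001346032803355],
        ]
    ],
}

bbox = bbox_from_polygon(polygon)
print(bbox)
```

A bounding box is usually a coarser filter than the polygon itself, so a bbox search may return items that the polygon search would not.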

We search for Sentinel-2 data that matches our filter criteria.

In [12]:
search = eodc_catalog.search(
    collections=["SENTINEL2_L1C"],
    intersects=area_of_interest,
    # bbox=bbox_aut,  # alternative to intersects; use only one of the two
    datetime=time_range
)

items_eodc = search.item_collection()
console.print(f"On EODC we found {search.matched()} items for the given search query")
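For readers curious about what such a search looks like at the HTTP level, here is a rough sketch of the request a client could send to the `/search` endpoint defined by the STAC API Item Search specification. It only builds the request with the standard library and does not send it. `build_search_request` is a hypothetical helper, the fully spelled-out timestamps are our own expansion of the shorter date strings used above, and we assume the server accepts `POST` searches.

```python
import json
from urllib.request import Request


def build_search_request(api_url, collections, datetime_range,
                         intersects=None, bbox=None, limit=10):
    """Build (but do not send) a POST request for a STAC API item search."""
    # The Item Search specification allows only one spatial filter per query
    if intersects is not None and bbox is not None:
        raise ValueError("use either intersects or bbox, not both")
    body = {"collections": collections, "datetime": datetime_range, "limit": limit}
    if intersects is not None:
        body["intersects"] = intersects
    if bbox is not None:
        body["bbox"] = bbox
    return Request(
        url=f"{api_url.rstrip('/')}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "https://stac.eodc.eu/api/v1",
    collections=["SENTINEL2_L1C"],
    datetime_range="2023-05-01T00:00:00Z/2024-05-01T23:59:59Z",
    bbox=[9.25, 46.31, 17.46, 49.18],  # bounding box of Austria
)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(payload)
```

Passing the request to `urllib.request.urlopen` would execute the search. In practice `pystac-client` is the more convenient choice because it also follows the pagination links in the response.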

In [14]:
df = geopandas.GeoDataFrame.from_features(items_eodc.to_dict(), crs="epsg:4326")

# Print the first three rows of the dataframe
df.head(3)

Unnamed: 0,geometry,created,datetime,platform,grid:code,proj:epsg,providers,published,deprecated,instruments,...,s2:datatake_type,view:sun_azimuth,mgrs:latitude_band,s2:generation_time,sat:relative_orbit,view:sun_elevation,processing:facility,s2:processing_baseline,s2:degraded_msi_data_percentage,s2:reflectance_conversion_factor
0,"POLYGON ((17.63356 48.72170, 17.60873 48.66828...",2024-03-11T21:32Z,2024-02-29T09:58:39.024000Z,sentinel-2b,MGRS-33UXP,32633,[{'url': 'https://earth.esa.int/web/guest/home...,2024-03-11T21:32:53Z,False,[msi],...,INS-NOBS,163.212324,U,2024-02-29T15:19:37.000000Z,122,32.674225,2BPS,5.1,0.0276,1.020997
1,"POLYGON ((17.22611 47.82988, 17.21228 47.79921...",2024-03-11T21:28Z,2024-02-29T09:58:39.024000Z,sentinel-2b,MGRS-33TXN,32633,[{'url': 'https://earth.esa.int/web/guest/home...,2024-03-11T21:29:04Z,False,[msi],...,INS-NOBS,163.071101,T,2024-02-29T15:19:37.000000Z,122,33.539305,2BPS,5.1,0.0296,1.020997
2,"POLYGON ((16.36028 48.74498, 17.85244 48.71769...",2024-03-08T09:40Z,2024-02-26T09:48:59.024000Z,sentinel-2b,MGRS-33UXP,32633,[{'url': 'https://earth.esa.int/web/guest/home...,2024-03-08T09:40:35Z,False,[msi],...,INS-NOBS,160.507351,U,2024-02-26T10:40:58.000000Z,79,31.024304,2BPS,5.1,0.0214,1.022373


Now we can select the item with the lowest cloud cover. Data providers exposing STAC can make use of a number of STAC extensions. Some collections implement the `eo` (Electro-Optical) extension, whose `eo:cloud_cover` property can be used to sort items by cloudiness.

In [15]:
selected_item = min(items_eodc, key=lambda item: item.properties["eo:cloud_cover"])

selected_item
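Indexing `item.properties["eo:cloud_cover"]` directly raises a `KeyError` for any item that lacks the property. A slightly more defensive variant, sketched here on hypothetical dictionaries rather than real items, ranks such items last by treating a missing value as infinity. `cloud_cover_or_inf` is a name introduced for this illustration.

```python
import math


def cloud_cover_or_inf(item):
    """Sort key: the eo:cloud_cover value, or infinity when the property is absent."""
    value = item["properties"].get("eo:cloud_cover")
    return math.inf if value is None else value


# Hypothetical stand-ins for STAC items, reduced to the fields used here
sample_items = [
    {"id": "item-a", "properties": {"eo:cloud_cover": 12.5}},
    {"id": "item-b", "properties": {}},  # no cloud cover information
    {"id": "item-c", "properties": {"eo:cloud_cover": 0.4}},
]

clearest = min(sample_items, key=cloud_cover_or_inf)
ranked = sorted(sample_items, key=cloud_cover_or_inf)
print(clearest["id"], [item["id"] for item in ranked])
```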

Display the thumbnail of the item.

In [16]:
Image(url=selected_item.assets["thumbnail"].href, width=500)

Each STAC item has one or more assets, each of which links to an actual file. Let's print a table of all assets.

In [17]:
table = rich.table.Table(title="Assets in STAC Item")
table.add_column("Asset Key", style="cyan", no_wrap=True)
table.add_column("Description")
for asset_key, asset in selected_item.assets.items():
    table.add_row(asset_key, asset.title)

console.print(table)
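Besides `title`, STAC assets can carry a media `type` and a list of `roles` (for example `data`, `thumbnail` or `metadata`), which is handy for picking out the files one actually wants to download. The sketch below filters assets by role on a hypothetical item excerpt; the asset keys, URLs and the helper name `assets_by_role` are made up for illustration and do not reflect the EODC item shown above.

```python
def assets_by_role(item_dict, role):
    """Return {asset_key: href} for all assets that list the given role."""
    return {
        key: asset["href"]
        for key, asset in item_dict.get("assets", {}).items()
        if role in asset.get("roles", [])
    }


# Hypothetical item excerpt in the structure of a STAC item document
sample_item = {
    "id": "example-item",
    "assets": {
        "thumbnail": {
            "href": "https://example.com/example-item/preview.jpg",
            "type": "image/jpeg",
            "roles": ["thumbnail"],
        },
        "B04": {
            "href": "https://example.com/example-item/B04.jp2",
            "type": "image/jp2",
            "title": "Red band",
            "roles": ["data"],
        },
        "metadata": {
            "href": "https://example.com/example-item/metadata.xml",
            "type": "application/xml",
            # the optional "roles" field is omitted for this asset
        },
    },
}

data_assets = assets_by_role(sample_item, "data")
print(data_assets)
```

Since `roles` is optional, assets without the field are simply skipped rather than causing an error.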