Meta

In [1]:
import datetime as dt
print(dt.datetime.now())

2026-02-02 12:52:28.293999


# Find a specific dataset
In STAC it is common to search by variable and to query by time and space. However, in some cases you might want to find a specific dataset by its title.
## Scenario 1: you know in which collection the dataset is stored
EDITO STAC has several thematic collections. We know beforehand that the dataset we are looking for is stored in the <b>emodnet-biology</b> collection.
### Step 1: load the collection
The collection that we are looking for is emodnet-biology.

In [2]:
from pystac_client import Client
from pprint import pprint

URL = "https://api.dive.edito.eu/data/collections"
catalog = Client.open(URL)
collection_emodnet_biology = catalog.get_collection("emodnet-biology")

/opt/python/lib/python3.13/site-packages/pystac_client/client.py:191: NoConformsTo: Server does not advertise any conformance classes.
/opt/python/lib/python3.13/site-packages/pystac_client/client.py:410: FallbackToPystac: Falling back to pystac. This might be slow.
  self._warn_about_fallback("COLLECTIONS", "FEATURES")


Inspect the collection

In [3]:
collection_emodnet_biology

## List all items in the collection
As you will see, there are many items in this collection.

In [4]:
emodnet_biology_items = list(collection_emodnet_biology.get_items())
for i, item in enumerate(emodnet_biology_items):
    print(f"item {i} out of {len(emodnet_biology_items)}, {item.properties.get('title', '')}")

item 0 out of 1718, Habitat suitability reef-forming species in the North Sea
item 1 out of 1718, Biodiversity data from rocky intertidal zones, surveyed from Scotland to Morocco in 2022 and 2023
item 2 out of 1718, Counts of seabirds, marine mammals and other megafauna during POLARSTERN cruise ANT-XXVIII/5
item 3 out of 1718, Counts of seabirds, marine mammals and other megafauna during POLARSTERN cruise PS83 (ANT-XXIX/10)
item 4 out of 1718, Counts of seabirds, marine mammals and other megafauna during POLARSTERN cruise ANT-XXIX/1
item 5 out of 1718, Counts of seabirds, marine mammals and other megafauna during POLARSTERN cruise ANT-XXVIII/1
item 6 out of 1718, Modeled distribution map of Sea pens and burrowing megafauna in the North East Atlantic in 2021
item 7 out of 1718, Potential catch biomass summed up along the water column for the 30 main commercial fish species from the Atlantic Ocean
item 8 out of 1718, 3D habitat suitability maps of the 30 main commercial fish species from

### Search for the dataset
The dataset we are looking for is this one: https://zenodo.org/records/13589902

In [5]:
search_title = "Koster"

In [6]:
results = []
for item in emodnet_biology_items:
    # Fall back to an empty string so items without a title do not raise a TypeError
    if search_title in (item.properties.get('title') or ''):
        print(item.properties.get('title'))
        results.append(item)

Koster historical biodiversity assessment
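
The match above is case-sensitive, so searching for "koster" in lowercase would find nothing. A minimal sketch of a more forgiving filter follows. `filter_by_title` is a hypothetical helper name, and the stand-in objects only mimic the `.properties` dictionary of a STAC item so that the example runs without network access.

```python
from types import SimpleNamespace


def filter_by_title(items, needle):
    """Return items whose title contains `needle`, ignoring case.

    Hypothetical helper: works on any object exposing a `.properties` dict.
    """
    needle = needle.casefold()
    matches = []
    for item in items:
        # A missing or None title is treated as an empty string.
        title = item.properties.get("title") or ""
        if needle in title.casefold():
            matches.append(item)
    return matches


# Stand-ins for STAC items; the titles are taken from the listing above.
demo_items = [
    SimpleNamespace(properties={"title": "Koster historical biodiversity assessment"}),
    SimpleNamespace(properties={"title": "Habitat suitability reef-forming species in the North Sea"}),
    SimpleNamespace(properties={}),  # an item without a title
]

matches = filter_by_title(demo_items, "koster")
print([m.properties["title"] for m in matches])
```

With the real data, the same call would presumably be `filter_by_title(emodnet_biology_items, "koster")`.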


In [7]:
koster_item = results[0]
print(koster_item)

<Item id=bdbeb221-7656-52e5-9ade-4b3304db82cd>


### Inspect the Item
Calling help() on the item prints its full documentation.

In [8]:
#help(koster_item)

You can also visit the docs: https://pystac.readthedocs.io/en/latest/api/pystac.html#pystac.Item 
<br>
Or inspect the object's signature:

In [9]:
import inspect
sig = inspect.signature(type(koster_item))
for name, param in sig.parameters.items():
    print(name, param)

id id: 'str'
geometry geometry: 'dict[str, Any] | None'
bbox bbox: 'list[float] | None'
datetime datetime: 'Datetime | None'
properties properties: 'dict[str, Any]'
start_datetime start_datetime: 'Datetime | None' = None
end_datetime end_datetime: 'Datetime | None' = None
stac_extensions stac_extensions: 'list[str] | None' = None
href href: 'str | None' = None
collection collection: 'str | Collection | None' = None
extra_fields extra_fields: 'dict[str, Any] | None' = None
assets assets: 'dict[str, Asset] | None' = None


### View the properties

In [10]:
koster_item.properties

{'productIdentifier': 'koster_historical_biodiversity_assessment',
 'title': 'Koster historical biodiversity assessment',
 'start_datetime': '1997-08-27T00:00:00.000000Z',
 'end_datetime': '2023-10-09T00:00:00.000000Z',
 'license': 'CC-BY-4.0',
 'provider': 'EMODnet Biology',
 'proj:code': 'EPSG:EPSG:4326',
 'collection': 'emodnet-biology',
 'updated': '2025-03-27T13:37:59.583384Z',
 'created': '2025-03-27T10:44:11.884559Z',
 'likes': 0,
 'comments': 0,
 'owner': '226779592474001643',
 'status': 1,
 'centroid': [0, 0],
 'ssys:targets': ['earth']}
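
The `start_datetime` and `end_datetime` properties are plain strings. As a small sketch, they can be parsed into timezone-aware `datetime` objects to compute the period the dataset covers. The format string below is an assumption based on the two values shown above; other items may use a different layout.

```python
import datetime as dt

# Values copied from the properties shown above.
properties = {
    "start_datetime": "1997-08-27T00:00:00.000000Z",
    "end_datetime": "2023-10-09T00:00:00.000000Z",
}

# Timestamp layout assumed from this item; not guaranteed for every item.
STAC_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_utc(value):
    """Parse a 'Z'-suffixed timestamp into a timezone-aware datetime (UTC)."""
    parsed = dt.datetime.strptime(value, STAC_TIME_FORMAT)
    return parsed.replace(tzinfo=dt.timezone.utc)


start = parse_utc(properties["start_datetime"])
end = parse_utc(properties["end_datetime"])
span = end - start
print(start.date(), "to", end.date(), "covers", span.days, "days")
```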

### View the assets

In [11]:
koster_item.assets

{'doi': <Asset href=https://doi.org/10.5281/zenodo.13589902>,
 'wfs': <Asset href=https://geo.vliz.be/geoserver/Dataportal/wfs?service=wfs&version=1.1.0&typeName=eurobis-obisenv_basic&request=GetFeature&outputFormat=text%2Fcsv&viewParams=datasetid%3A8744>,
 'iptdwca': <Asset href=https://ipt.gbif.org.nz/archive.do?r=koster_historical_assessment>,
 'iptresource': <Asset href=https://ipt.gbif.org.nz/resource?r=koster_historical_assessment>,
 'xml': <Asset href=https://emodnet.ec.europa.eu/geonetwork/srv/api/records/6d617269-6e65-696e-666f-000000008744/formatters/xml>,
 'csw': <Asset href=https://emodnet.ec.europa.eu/geonetwork/emodnet/eng/csw?request=GetRecordById&service=CSW&version=2.0.2&elementSetName=full&id=6d617269-6e65-696e-666f-000000008744>}
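
Each asset is a link to the data or its metadata in a different form. As an illustration, the query string of the `wfs` link can be decoded with the standard library to see what the request asks for. The URL is copied from the output above; with a live item it would come from `koster_item.assets["wfs"].href`.

```python
from urllib.parse import parse_qs, urlsplit

# The 'wfs' asset href copied from the output above.
wfs_href = (
    "https://geo.vliz.be/geoserver/Dataportal/wfs?service=wfs&version=1.1.0"
    "&typeName=eurobis-obisenv_basic&request=GetFeature"
    "&outputFormat=text%2Fcsv&viewParams=datasetid%3A8744"
)

# parse_qs decodes percent-escapes and returns a list of values per key;
# every key occurs once here, so keep only the first value.
query = urlsplit(wfs_href).query
params = {key: values[0] for key, values in parse_qs(query).items()}

for key, value in params.items():
    print(f"{key}: {value}")
```

The decoded parameters show that this link is a WFS `GetFeature` request that returns CSV (`outputFormat: text/csv`) filtered on `datasetid:8744`.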

## Scenario 2: you do not know in which collection the dataset is stored
In this case, we will loop over each collection and then over each item in the collection. 
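
The nested loop can be sketched as follows. This is an illustration of the idea only, not the implementation of any particular library: `find_in_collections`, `FakeCollection` and `FakeItem` are hypothetical names, and the fake classes stand in for STAC collections (which expose `id` and `get_items()`) so that the example runs offline.

```python
class FakeItem:
    """Stand-in for a STAC item: only carries a `properties` dict."""

    def __init__(self, title):
        self.properties = {"title": title}


class FakeCollection:
    """Stand-in for a STAC collection: exposes `id` and `get_items()`."""

    def __init__(self, collection_id, titles):
        self.id = collection_id
        self._items = [FakeItem(title) for title in titles]

    def get_items(self):
        return iter(self._items)


def find_in_collections(collections, needle):
    """Scan every item of every collection; return (collection id, item) pairs."""
    needle = needle.casefold()
    hits = []
    for collection in collections:
        for item in collection.get_items():
            # A missing or None title is treated as an empty string.
            title = item.properties.get("title") or ""
            if needle in title.casefold():
                hits.append((collection.id, item))
    return hits


# Demo data: the second collection id and its title are made up for illustration.
demo_collections = [
    FakeCollection("emodnet-biology", [
        "Habitat suitability reef-forming species in the North Sea",
        "Koster historical biodiversity assessment",
    ]),
    FakeCollection("demo-collection", ["An unrelated demo dataset"]),
]

hits = find_in_collections(demo_collections, "koster")
for collection_id, item in hits:
    print(collection_id, "|", item.properties["title"])
```

Because every item of every collection is visited, the cost grows with the size of the whole catalog, so this approach is best kept for cases where the collection is truly unknown.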


In [12]:
collections = list(catalog.get_all_collections())

/opt/python/lib/python3.13/site-packages/pystac_client/client.py:446: FallbackToPystac: Falling back to pystac. This might be slow.
  self._warn_about_fallback("COLLECTIONS", "FEATURES")


In [13]:
print(len(collections))

445


### Experimental library for searching STAC items.

In [14]:
!pip install git+https://github.com/DTO-Bioflow/EDITO_python_tools

Collecting git+https://github.com/DTO-Bioflow/EDITO_python_tools
  Cloning https://github.com/DTO-Bioflow/EDITO_python_tools to /tmp/pip-req-build-i5b3dlt3
  Running command git clone --filter=blob:none --quiet https://github.com/DTO-Bioflow/EDITO_python_tools /tmp/pip-req-build-i5b3dlt3
  Resolved https://github.com/DTO-Bioflow/EDITO_python_tools to commit e8167efb029e6eaf103d78a610fb7065c33678e4
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [15]:
from dtotools.search import search_on_title
results = search_on_title(title="koster", verbose=1)

2026-02-02 12:53:48.257391 | loading all collections...
--------------------------------------------------
2026-02-02 12:54:45.023733 | Collection 1/445: Animal Tracking Datasets
2026-02-02 12:54:45.251469 | scanning 51 items...
--------------------------------------------------
2026-02-02 12:54:45.251666 | Collection 2/445: Age of sea ice (Climate Forecast convention)
2026-02-02 12:54:45.404460 | scanning 36 items...
--------------------------------------------------
2026-02-02 12:54:45.404583 | Collection 3/445: Aggregate quality flag (Climate Forecast convention)
2026-02-02 12:54:45.941717 | scanning 136 items...
--------------------------------------------------
2026-02-02 12:54:45.941978 | Collection 4/445: Air density (Climate Forecast convention)
2026-02-02 12:54:46.395015 | scanning 110 items...
--------------------------------------------------
2026-02-02 12:54:46.395262 | Collection 5/445: Air pressure (Climate Forecast convention)
2026-02-02 12:54:46.613139 | scanning 51 ite

In [16]:
print(results)

[<Item id=bdbeb221-7656-52e5-9ade-4b3304db82cd>]


In [17]:
print(results[0].properties.get("title"))

Koster historical biodiversity assessment


In [33]:
print(results[0].extra_fields.get("url"))

https://radiantearth.github.io/stac-browser/#/external/api.dive.edito.eu/data/collections/emodnet-biology/items/bdbeb221-7656-52e5-9ade-4b3304db82cd
