# Metadata API Introduction

This notebook shows basic use of our metadata API with the Python library pystac-client. It demonstrates how to fetch all collections, fetch a given collection or item, and perform simple searches.

This is a slightly updated copy of a notebook from the GBR DMS: https://github.com/aodn/rimrep-examples/blob/main/Python_based_scripts/stac-metadata.ipynb

# Connecting to metadata API

We first connect to the metadata API by retrieving the root catalog.

To do this, you will need to go to https://dashboard.reefdata.io/ and copy your Authentication token.
This can then be pasted into the password prompt.

In [1]:
from getpass import getpass
import pystac_client

# Metadata STAC API root URL
URL = 'https://stac.reefdata.io/'

# Go to https://dashboard.reefdata.io/, copy your API key and paste it into the password prompt

# Create the client, sending the key as a Bearer token with every request
api = pystac_client.Client.open(
    url=URL,
    headers={
        'Authorization': f"Bearer {getpass()}"
    },
    ignore_conformance=True
)

api.title

 ········




'GBR-DMS Data Catalogue'
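
For non-interactive runs, the token could instead be read from an environment variable. The sketch below assumes a hypothetical variable name, REEFDATA_TOKEN, and two hypothetical helpers, get_token and build_auth_headers; none of these are part of pystac-client. It falls back to the password prompt when the variable is not set.

```python
import os
from getpass import getpass

# Hypothetical variable name; choose whatever suits your environment.
TOKEN_ENV_VAR = "REEFDATA_TOKEN"


def get_token(env=None):
    """Return the token from the environment, or prompt for it if unset."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR)
    if not token:
        token = getpass("Authentication token: ")
    return token.strip()


def build_auth_headers(token):
    """Build the Authorization header in the Bearer form used above."""
    return {"Authorization": f"Bearer {token}"}
```

The result can be passed as the headers argument when opening the client, for example headers=build_auth_headers(get_token()).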

# Fetch all STAC collections

In [2]:
for collection in api.get_collections():
    print(collection)

<CollectionClient id=seltmp>
<CollectionClient id=abs-census>
<CollectionClient id=jcu-tropwater-seagrass-mapping>
<CollectionClient id=jcu-nerp-effectiveness-inshore-monitoring>
<CollectionClient id=bom-reeftemp>
<CollectionClient id=nasa-jpl-mursst>
<CollectionClient id=aims-temp>
<CollectionClient id=mmp>
<CollectionClient id=aims-ltmp-mmp-coralreef>
<CollectionClient id=coral-sea-boundary>
<CollectionClient id=des-coastal-data-system>
<CollectionClient id=imos-nrmn>
<CollectionClient id=amsa-vessel-tracking>
<CollectionClient id=qtmr-vessels>
<CollectionClient id=des-slats>
<CollectionClient id=abs-spatial>
<CollectionClient id=aims-weather>
<CollectionClient id=des-wq>
<CollectionClient id=bom-auswave>
<CollectionClient id=noaa-crw>
<CollectionClient id=ereefs>
<CollectionClient id=gbrmpa-admin-regions>
<CollectionClient id=imos-satellite-remote-sensing>
<CollectionClient id=ga-gbr-hr-depth-model>
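
Each collection object exposes its identifier through the id attribute, so the listing can be narrowed with ordinary Python. A minimal sketch, assuming a hypothetical helper ids_with_prefix; the SimpleNamespace objects only stand in for real collections so that the snippet runs offline.

```python
from types import SimpleNamespace


def ids_with_prefix(collections, prefix):
    """Return the sorted IDs of collections whose ID starts with prefix."""
    return sorted(c.id for c in collections if c.id.startswith(prefix))


# Stand-ins for the objects returned by api.get_collections().
sample = [SimpleNamespace(id=i) for i in ("des-wq", "aims-temp", "des-slats", "mmp")]
print(ids_with_prefix(sample, "des-"))
```

With a live connection, ids_with_prefix(api.get_collections(), "des-") would apply the same filter to the catalogue.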


# Fetch a given collection by ID

In [4]:
collection = api.get_collection('mmp')
collection

# Fetch all items

The get_items method returns an iterator, and pystac-client handles retrieval of additional pages when needed. Note that one request is made for the first ten items; continuing to iterate over the same iterator makes a second request for the next ten.

In [5]:
items = collection.get_items()

# flush stdout so we can see the exact order that things happen
def get_ten_items(items):
    for i, item in enumerate(items):
        print(f"{i}: {item}", flush=True)
        if i == 9:
            return

print('First page', flush=True)
get_ten_items(items)

First page
0: <Item id=aims-mmp-inshore-wq-ctd>
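
Because items is a single iterator, consuming it again continues from where the previous call stopped. A minimal sketch of taking fixed-size batches with itertools.islice; take is a hypothetical helper, and iter(range(25)) stands in for the item iterator so that the snippet runs offline.

```python
from itertools import islice


def take(iterator, n):
    """Consume and return up to n elements from an iterator."""
    return list(islice(iterator, n))


stand_in = iter(range(25))  # stands in for collection.get_items()
first_page = take(stand_in, 10)   # elements 0..9
second_page = take(stand_in, 10)  # elements 10..19, not a repeat of the first
last_page = take(stand_in, 10)    # only 5 elements remain
```

An empty list from take signals that the iterator is exhausted.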


# Fetch a given item


In [7]:
item = collection.get_item('aims-mmp-inshore-wq-ctd')
item

# Inspect an item for assets

In [8]:
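# Inspect assets; get_assets() returns a dict keyed by asset name
item_assets = item.get_assets()
# Use .get() so a missing 'data' asset yields None instead of raising KeyError
data_asset = item_assets.get('data')
if data_asset is not None:
    print(data_asset.to_dict())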

{'href': 's3://gbr-dms-data-public/aims-mmp-inshore-wq-ctd/data.parquet', 'type': 'application/x-parquet', 'title': 'AIMS - MMP Inshore Water Quality Vertical Profiles Of Conductivity Temperature And Depth (CTD)', 'description': 'S3 address of the AIMS - MMP Inshore Water Quality Vertical Profiles Of Conductivity Temperature And Depth (CTD) in GeoParquet format', 'roles': ['data']}
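
The href of the data asset is an S3 URI. Tools that expect the bucket and key separately need it split; a minimal sketch using only the standard library, with split_s3_uri as a hypothetical helper name.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into a (bucket, key) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri(
    "s3://gbr-dms-data-public/aims-mmp-inshore-wq-ctd/data.parquet"
)
```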


# Inspect an item for link to data API

In [9]:
# Inspect link to data API
link = item.get_single_link(rel="describedby")
if link is not None:
    print(link.to_dict())

{'rel': 'describedby', 'href': 'https://pygeoapi.reefdata.io/collections/aims-mmp-inshore-wq-ctd', 'title': 'Link to Data API'}
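
In this example the last path segment of the data API link matches the item ID. Assuming that pattern holds (it is an observation from the output above, not a documented guarantee), the identifier can be recovered from the href; collection_id_from_href is a hypothetical helper.

```python
from urllib.parse import urlparse


def collection_id_from_href(href):
    """Return the last non-empty path segment of a URL."""
    path = urlparse(href).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


data_api_id = collection_id_from_href(
    "https://pygeoapi.reefdata.io/collections/aims-mmp-inshore-wq-ctd"
)
```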


# Search for items by spatial and temporal extent

In [11]:
geom = {
    "type": "Polygon",
    "coordinates": [
        [
            [162, -33],
            [162, -3],
            [136, -3],
            [136, -33],
            [162, -33],
        ]
    ],
}

results = api.search(
    max_items = 15,
    limit = 5,
    intersects = geom,
    datetime = "2024-01-01/2024-05-08",
)

for item in results.items():
    print(item.id)

bom-auswave-3d
bom-auswave-10d
des-storm-tides
aims-weather-observing
aims-ltmp-mmp-coralreef-model
des-wq-gr-buoy3-env-nearrealtime
des-wq-el-buoy1-nearrealtime
des-wq-el-buoy1-env-nearrealtime
des-wq-el-buoy1-current-nearrealtime
des-wq-dempster-nearrealtime
aims-ereefs-agg-hydrodynamic-1km-daily
aims-ereefs-agg-hydrodynamic-4km-daily
imos-srs-aqua-oc-sst
imos-srs-aqua-oc-picop-brewin2012
imos-srs-aqua-oc-npp-oc3-vgpm
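
The polygon used in this search is a rectangle, so it can be generated from its corner coordinates, and the datetime interval can be built from date objects. A minimal sketch; bbox_to_polygon and date_interval are hypothetical helpers, not part of pystac-client.

```python
from datetime import date


def bbox_to_polygon(west, south, east, north):
    """Return a GeoJSON Polygon for a box, with a closed counter-clockwise ring."""
    ring = [
        [east, south],
        [east, north],
        [west, north],
        [west, south],
        [east, south],  # repeat the first point to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def date_interval(start, end):
    """Format two dates as a closed 'start/end' interval string."""
    return f"{start.isoformat()}/{end.isoformat()}"


geom = bbox_to_polygon(136, -33, 162, -3)
interval = date_interval(date(2024, 1, 1), date(2024, 5, 8))
```

These values reproduce the intersects and datetime arguments passed to the search above.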


# Search for items using query

In [13]:
# Search for items using a query to filter by keywords
# Currently the "contains" operator is not supported, so you can only use the "eq" operator to find exact matches

results = api.search(
    max_items = 15,
    limit = 5,
    query={"themes": {"eq": [{'scheme': 'https://wiki.esipfed.org/ISO_19115-3_Codelists#MD_TopicCategoryCode',
                              'concepts': [{'id': 'oceans', 'title': 'Oceans'}]}]}}
)

for item in results.items():
    print(item.id)
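
Since eq requires an exact match, the whole themes structure has to be reproduced in the query. A hypothetical helper, themes_eq_query, keeps that structure in one place; the shape is copied from the query above and is not a pystac-client function.

```python
TOPIC_CATEGORY_SCHEME = (
    "https://wiki.esipfed.org/ISO_19115-3_Codelists#MD_TopicCategoryCode"
)


def themes_eq_query(concept_id, title, scheme=TOPIC_CATEGORY_SCHEME):
    """Build an exact-match query on themes for a single concept."""
    theme = {"scheme": scheme, "concepts": [{"id": concept_id, "title": title}]}
    return {"themes": {"eq": [theme]}}


query = themes_eq_query("oceans", "Oceans")
```

The resulting dictionary can be passed as the query argument of api.search.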