# Install PySTAC Client

Import the PySTAC Client and connect it to the STAC API endpoint.

PySTAC Client returns generator objects that can be iterated over,
which abstracts away listing and pagination.
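As a rough illustration of how a generator can hide pagination, the sketch below follows "next" tokens across pages and yields one feature at a time. The page layout, `fake_fetch`, and `iter_items` are hypothetical stand-ins, not the client's actual internals.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every feature across pages, following "next" tokens lazily."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page["features"]
        token = page.get("next")
        if token is None:
            break


# Hypothetical paged responses standing in for HTTP calls to the API.
FAKE_PAGES = {
    None: {"features": [{"id": "a"}, {"id": "b"}], "next": "p2"},
    "p2": {"features": [{"id": "c"}], "next": None},
}


def fake_fetch(token: Optional[str]) -> dict:
    """Return the fake page for a token; a real client would issue a request."""
    return FAKE_PAGES[token]


ids = [feature["id"] for feature in iter_items(fake_fetch)]
print(ids)
```

The caller only sees a flat stream of items; a page is requested only when the previous one has been consumed.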

In [3]:
from pystac_client.client import Client
from pystac.item import Item
from pystac.collection import Collection
from pystac.link import Link
from pystac_client.asset_search import Asset

## Pre-Condition: connect to STAC endpoint

In [4]:
api = Client.open("http://127.0.0.1:8000")

Get basic information about the API from the Client

In [5]:
print(
    f"Title: {api.title}\n"
    f"Description: {api.description}\n"
    f"Catalog ID: {api.id}\n"
)

Title: test_api_title
Description: test api
Catalog ID: stac-fastapi



## 01: Basic free-text search, across all items in all collections

A collection must be specified. PySTAC supports searching across all collections; however,
stac-fastapi-elasticsearch returns a 500 error when no collection is given. The cause still needs to be investigated.

The search term must be passed in a query argument.

## 02: Same as `01`, with named parameter "q" instead of unnamed parameter

The `q` search is case-insensitive and supports partial matches with a wildcard.

PySTAC uses 'query' instead of 'q'

```py
query="AerChemMIP"
query="aerchemmip"
query="aerchemm*"
```
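To make the matching behaviour concrete, the following sketch mimics case-insensitive wildcard matching locally with the standard library. It only illustrates the semantics described above; how the server actually evaluates the query is not shown in this notebook, and `matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matches(value: str, pattern: str) -> bool:
    """Case-insensitive match where `*` in the pattern stands for any run of characters."""
    return fnmatchcase(value.lower(), pattern.lower())


# The three query forms listed above, checked against one facet value.
for pattern in ("AerChemMIP", "aerchemmip", "aerchemm*"):
    print(pattern, matches("AerChemMIP", pattern))
```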

In [6]:
results = api.search(
    collections="cmip6",
    q="aerchemm*",
    max_items=1,
)
items = results.get_items()
item = next(items)
item.properties['activity_id']

['AerChemMIP']

## 03: Same as `02`: but checking search is case-insensitive

Demonstrated in `02`.

## 04: same as `01`, but specifying doctype

`api.search()` and `api.asset_search()` are item and asset level searches respectively.

In [7]:
items = api.search(collections='cmip6', max_items=1).get_items()

assets = api.asset_search(max_assets=1).get_assets()

asset: Asset = next(assets)
item: Item = next(items)

print(asset)
print(item)

<Asset href=http://esgf-data3.ceda.ac.uk/thredds/fileServer/esg_cmip6/CMIP6/CMIP/CAS/FGOALS-g3/historical/r1i1p1f1/fx/orog/gn/v20201202/orog_fx_FGOALS-g3_historical_r1i1p1f1_gn.nc>
<Item id=c2d94dd296525cc105cfa657b6b559c0>


## 05: like `03` but doctype is set to collection

There is no collection-level search

## 06: like `01`, but checking partial match works with a wild-card

Demonstrated in `02`.

## 07: like `01`, only search within the collections specified.

A collection must be specified to begin with.

Searching multiple collections could not be tested at the time of writing. The code does not show that capability,
but it can likely be supported through Elasticsearch.

## 07b: like `07`, but searching with collection objects and/or collection IDs.

If the ID is unknown, it can be read from the collection object:
`collection.id` returns a string that can be passed to the search.

In [8]:
collections = api.get_collections()
collection: Collection = next(collections)

items = api.search(collections=collection.id, max_items=1).get_items()
item = next(items)

item

<Item id=c2d94dd296525cc105cfa657b6b559c0>
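When code needs to accept either a collection object or a plain ID, a small normalising helper keeps the search call uniform. This is a minimal sketch: `FakeCollection` and `to_collection_id` are hypothetical names, and the only assumption is that a collection object exposes an `id` attribute, as used in the cell above.

```python
from dataclasses import dataclass


@dataclass
class FakeCollection:
    """Stand-in for a collection object; only the `id` attribute matters here."""
    id: str


def to_collection_id(collection_or_id) -> str:
    """Return the ID string whether given an ID or an object with an `id` attribute."""
    if isinstance(collection_or_id, str):
        return collection_or_id
    return collection_or_id.id


print(to_collection_id("cmip6"))
print(to_collection_id(FakeCollection(id="cmip6")))
```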

## 08: faceted search, looking to match a given facet over all items

Faceted search can be done using the filter parameter.

In [9]:
results = api.search(
    collections=collection.id,
    max_items=10,
    filter={
        "eq": [{"property": "activity_id"}, "AerChemMIP"]
    }
).get_items()

item = next(results)
item.properties['activity_id']

['AerChemMIP']

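## 09: faceted search, looking to match multiple facets over all items

Note that the filter below repeats the `eq` key. A Python dict literal keeps only the last value
for a repeated key, so only the `activity_id` condition reaches the server. This is why the item
returned has `institution_id` `AS-RCEC` rather than `MOHC`.

In [10]:
results = api.search(
    collections=collection.id,
    max_items=10,
    filter={
        # Duplicate "eq" keys: Python keeps only the last entry (activity_id).
        "eq": [{"property": "institution_id"}, "MOHC"],
        "eq": [{"property": "activity_id"}, "CFMIP"],
    }
).get_items()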

item = next(results)
item.properties

{'further_info_url': ['https://furtherinfo.es-doc.org/CMIP6.AS-RCEC.TaiESM1.abrupt-0p5xCO2.none.r1i1p1f1'],
 'index_node': ['esgf-index4.ceda.ac.uk'],
 'source_type': ['AER', 'AOGCM', 'BGC'],
 'experiment_id': ['abrupt-0p5xCO2'],
 'variable_id': ['huss'],
 'activity_drs': ['CFMIP'],
 'dataset_id_template_': ['%(mip_era)s.%(activity_drs)s.%(institution_id)s.%(source_id)s.%(experiment_id)s.%(member_id)s.%(table_id)s.%(variable_id)s.%(grid_label)s'],
 'institution_id': ['AS-RCEC'],
 'variable': ['huss'],
 'model_cohort': ['Registered'],
 'dataset_version': ['20210913'],
 'dataset_id': ['CMIP6.CFMIP.AS-RCEC.TaiESM1.abrupt-0p5xCO2.r1i1p1f1.Amon.huss.gn.v20210913|esgf-data3.ceda.ac.uk'],
 'grid_label': ['gn'],
 'file_id': ['CMIP6.CFMIP.AS-RCEC.TaiESM1.abrupt-0p5xCO2.r1i1p1f1.Amon.huss.gn.v20210913.huss_Amon_TaiESM1_abrupt-0p5xCO2_r1i1p1f1_gn_000101-012012.nc|esgf-data3.ceda.ac.uk',
  'CMIP6.CFMIP.AS-RCEC.TaiESM1.abrupt-0p5xCO2.r1i1p1f1.Amon.huss.gn.v20210913.huss_Amon_TaiESM1_abrupt-0p5xCO2_
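A Python dict literal keeps only the last value for a repeated key, so conditions that should all apply need to be nested under one operator. The sketch below shows the collapse and one way to combine conditions, assuming the server's CQL-JSON dialect accepts an `and` operator holding a list of sub-expressions (not confirmed by this notebook). The helpers `eq` and `all_of` are hypothetical.

```python
import json


def eq(prop: str, value: str) -> dict:
    """Build one equality condition in the filter style used in this notebook."""
    return {"eq": [{"property": prop}, value]}


def all_of(*conditions: dict) -> dict:
    """Assumed form: nest the conditions under a single "and" operator."""
    return {"and": list(conditions)}


# Repeating a key in a dict literal silently drops the earlier entry.
collapsed = {
    "eq": [{"property": "institution_id"}, "MOHC"],
    "eq": [{"property": "activity_id"}, "CFMIP"],
}
print(len(collapsed))

# Both conditions survive when each sits in its own dict.
combined = all_of(eq("institution_id", "MOHC"), eq("activity_id", "CFMIP"))
print(json.dumps(combined, indent=2))
```

The resulting dict could then be passed as `filter=combined` to `api.search()`, provided the server accepts `and`.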

## 10: faceted search, looking to match multiple facets within multiple collections

Same as `09`, but either add the `mip_era` property to the filter to select a collection, or
list the collection IDs in the `collections` parameter.

## 11: Global Asset-level search

In [11]:
assets = api.asset_search(
    filter={
        "eq": [{"property": "activity_id"}, "AerChemMIP"]
    }
).get_assets()

asset: Asset = next(assets)
asset

<Asset href=http://esgf-data3.ceda.ac.uk/thredds/fileServer/esg_cmip6/CMIP6/AerChemMIP/BCC/BCC-ESM1/ssp370/r1i1p1f1/Ofx/areacello/gn/v20201021/areacello_Ofx_BCC-ESM1_ssp370_r1i1p1f1_gn.nc>

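## 12: Asset-level search within Collection

The global asset search can be restricted to a collection by filtering on `mip_era` in the properties.
Note that the filter below repeats the `eq` key, so Python keeps only the `activity_id` condition
and the `mip_era` condition is not sent.

In [12]:
assets = api.asset_search(
    filter={
        # Duplicate "eq" keys: only the last entry (activity_id) is kept.
        "eq": [{"property": "mip_era"}, "CMIP6"],
        "eq": [{"property": "activity_id"}, "AerChemMIP"]
    },
    max_assets=1
).get_assets()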
asset = next(assets)
asset.href

'http://esgf-data3.ceda.ac.uk/thredds/fileServer/esg_cmip6/CMIP6/RFMIP/MOHC/UKESM1-0-LL/piClim-aer/r1i1p1f4/AERmonZ/bry/gnz/v20200305/bry_AERmonZ_UKESM1-0-LL_piClim-aer_r1i1p1f4_gnz_185001-189412.nc'

## 13: Asset-level search within Item

Not yet a feature of the asset-search extension.