# Searchable STAC catalog

This notebook shows how to access items, and the hrefs stored in their assets, for given search and open-data parameters. A searchable STAC catalog implements the [STAC API - Item Search](https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0/item-search) conformance class, which provides the ability to search for STAC Item objects across collections.
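
A STAC API advertises its conformance classes in the `conformsTo` list of its landing page. The following is a minimal sketch of how one might check a landing page for the Item Search class. It is not how `xcube-stac` itself decides; the helper `supports_item_search` and both example dictionaries are hypothetical and hand-written rather than fetched from a server.

```python
# Suffix of the Item Search conformance URI. The version segment before it
# differs between STAC API releases, so only the suffix is compared.
ITEM_SEARCH_SUFFIX = "/item-search"


def supports_item_search(landing_page: dict) -> bool:
    """Return True if any advertised conformance URI ends with '/item-search'."""
    conforms_to = landing_page.get("conformsTo", [])
    return any(uri.rstrip("/").endswith(ITEM_SEARCH_SUFFIX) for uri in conforms_to)


# Hypothetical landing page excerpts (assumptions for illustration only).
searchable = {
    "conformsTo": [
        "https://api.stacspec.org/v1.0.0/core",
        "https://api.stacspec.org/v1.0.0/item-search",
    ]
}
static = {"type": "Catalog", "id": "example"}  # no "conformsTo" entry at all

print(supports_item_search(searchable))
print(supports_item_search(static))
```

A catalog without a `conformsTo` list, such as a purely static one, is treated here as not searchable.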

### Setup
To run this notebook, you need to install [`xcube`](https://xcube.readthedocs.io/en/latest/) and the [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) plugin. You can install [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) directly from the Git repository by cloning it, changing into the `xcube-stac` directory, and running the following commands:

```bash
conda env create -f environment.yml
conda activate xcube-stac
pip install .
```

Note that [`xcube`](https://xcube.readthedocs.io/en/latest/) is included in the `environment.yml`.  

Next, we import everything we need:

In [17]:
from xcube.core.store import new_data_store, get_data_store_params_schema
import itertools

First, we get the schema of the store parameters needed to initialize a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework).

In [18]:
store_params = get_data_store_params_schema("stac")
store_params

<xcube.util.jsonschema.JsonObjectSchema at 0x7ee1601fc140>

We set the URL of the [Earth Search](https://element84.com/earth-search/) STAC catalog and initialize a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). Passing `"stac"` as the first argument to `new_data_store` selects the `xcube-stac` plugin.

In [19]:
url = "https://earth-search.aws.element84.com/v1"
store = new_data_store("stac", url=url)

Each data ID points to a [STAC item's JSON](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md) and consists of the segment of the item's URL that follows the catalog's URL. The data IDs can be streamed; the following code shows the first 10 as an example.

In [20]:
data_ids = store.get_data_ids()
list(itertools.islice(data_ids, 10))

['collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070327_20240528T070354_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070302_20240528T070327_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070237_20240528T070302_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070212_20240528T070237_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070147_20240528T070212_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070122_20240528T070147_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070057_20240528T070122_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T070028_20240528T070057_054068_0692F7',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T065729_20240528T065750_054068_0692F6',
 'collections/sentinel-1-grd/items/S1A_IW_GRDH_1SDV_20240528T065704_20240528T065729_054068_0692F6']
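
Because a data ID is the part of the item's URL that follows the catalog's URL, the full item URL can be rebuilt by joining the two. The sketch below illustrates this with the first data ID listed above. The helpers `item_url` and `split_data_id` are hypothetical and not part of `xcube-stac`, and the `collections/<collection>/items/<item>` layout is inferred from the IDs shown in this notebook.

```python
def item_url(catalog_url: str, data_id: str) -> str:
    """Join a catalog URL and a data ID with exactly one slash between them."""
    return f"{catalog_url.rstrip('/')}/{data_id.lstrip('/')}"


def split_data_id(data_id: str) -> tuple[str, str]:
    """Split 'collections/<collection>/items/<item>' into (collection, item)."""
    parts = data_id.split("/")
    if len(parts) != 4 or parts[0] != "collections" or parts[2] != "items":
        raise ValueError(f"unexpected data ID layout: {data_id!r}")
    return parts[1], parts[3]


catalog_url = "https://earth-search.aws.element84.com/v1"
data_id = (
    "collections/sentinel-1-grd/items/"
    "S1A_IW_GRDH_1SDV_20240528T070327_20240528T070354_054068_0692F7"
)

print(item_url(catalog_url, data_id))
print(split_data_id(data_id))
```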

In the next step, we can search for items using search parameters. The following code shows which search parameters are available.

In [21]:
search_params = store.get_search_params_schema()
search_params

<xcube.util.jsonschema.JsonObjectSchema at 0x7ee16024b380>

Now, let's search for Sentinel-2 L2A data in the region of Lake Constance for the period from March 4 to March 6, 2020.

In [22]:
descriptors = list(store.search_data(
    collections=["sentinel-2-l2a"],
    bbox=[9, 47, 10, 48],
    time_range=["2020-03-04", "2020-03-06"]
))
[d.to_dict() for d in descriptors]

[{'data_id': 'collections/sentinel-2-l2a/items/S2A_32TMT_20200305_0_L2A',
  'data_type': 'dataset',
  'bbox': [7.662878883910047,
   46.85818510451771,
   9.130456971519783,
   47.85361872923358],
  'time_range': ['2020-03-05', None]},
 {'data_id': 'collections/sentinel-2-l2a/items/S2A_32TNT_20200305_0_L2A',
  'data_type': 'dataset',
  'bbox': [8.999746010269408,
   46.86543294418706,
   9.69101980741396,
   47.85369284462768],
  'time_range': ['2020-03-05', None]},
 {'data_id': 'collections/sentinel-2-l2a/items/S2A_32UMU_20200305_0_L2A',
  'data_type': 'dataset',
  'bbox': [7.639190643894996,
   47.75741268155721,
   9.132768981529203,
   48.752927528985374],
  'time_range': ['2020-03-05', None]},
 {'data_id': 'collections/sentinel-2-l2a/items/S2A_32UNU_20200305_0_L2A',
  'data_type': 'dataset',
  'bbox': [8.999741508947045,
   47.76332693643577,
   10.103302080212728,
   48.75300400802843],
  'time_range': ['2020-03-05', None]}]
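
The returned bounding boxes extend beyond the requested `bbox`. This is expected, since a STAC `bbox` search selects items whose footprint intersects the box rather than items fully contained in it. The following sketch checks this for the four results above, with coordinates rounded to three decimals. The helper `bboxes_intersect` is hypothetical; it compares bounding boxes only (not the exact item geometries) and ignores boxes that cross the antimeridian.

```python
def bboxes_intersect(a, b):
    """Return True if two [west, south, east, north] boxes overlap or touch."""
    a_west, a_south, a_east, a_north = a
    b_west, b_south, b_east, b_north = b
    return (
        a_west <= b_east and b_west <= a_east
        and a_south <= b_north and b_south <= a_north
    )


search_bbox = [9, 47, 10, 48]

# Bounding boxes copied from the search result above, rounded to 3 decimals.
item_bboxes = {
    "S2A_32TMT_20200305_0_L2A": [7.663, 46.858, 9.130, 47.854],
    "S2A_32TNT_20200305_0_L2A": [9.000, 46.865, 9.691, 47.854],
    "S2A_32UMU_20200305_0_L2A": [7.639, 47.757, 9.133, 48.753],
    "S2A_32UNU_20200305_0_L2A": [9.000, 47.763, 10.103, 48.753],
}

for name, bbox in item_bboxes.items():
    print(name, bboxes_intersect(search_bbox, bbox))
```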

In the next step, we can open the data for each data ID. Note that this is not fully implemented yet: so far, we can only access the assets, which provide the hrefs to the data resources. The following code shows which parameters are available for opening the data.

In [23]:
open_params = store.get_open_data_params_schema()
open_params

<xcube.util.jsonschema.JsonObjectSchema at 0x7ee1601fbe00>

We select band B04 (red) and, for each item, get the corresponding asset and its href pointing to the data resource by running the following code.

In [24]:
asset_collection = []
for descriptor in descriptors:
    assets = store.open_data(descriptor.data_id, asset_names=["red"])
    assert len(assets) == 1
    asset_collection.append(assets[0])
[asset.href for asset in asset_collection]

['https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/T/MT/2020/3/S2A_32TMT_20200305_0_L2A/B04.tif',
 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/T/NT/2020/3/S2A_32TNT_20200305_0_L2A/B04.tif',
 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/U/MU/2020/3/S2A_32UMU_20200305_0_L2A/B04.tif',
 'https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/U/NU/2020/3/S2A_32UNU_20200305_0_L2A/B04.tif']
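
Each href ends with the item ID followed by the band file name. As a small sketch, the hrefs can be split into their parts using only the standard library, for example to key them by item. The helper `describe_href` is hypothetical, and the path layout it relies on is inferred from the four hrefs above rather than from any documented guarantee.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_href(href: str) -> dict:
    """Split an asset href into host, item ID (parent folder), and file name."""
    parsed = urlparse(href)
    path = PurePosixPath(parsed.path)
    return {"host": parsed.netloc, "item": path.parent.name, "file": path.name}


# Two of the hrefs returned above.
hrefs = [
    "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/T/MT/2020/3/S2A_32TMT_20200305_0_L2A/B04.tif",
    "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/32/U/NU/2020/3/S2A_32UNU_20200305_0_L2A/B04.tif",
]

# Map each item ID to the href of its red band.
href_by_item = {describe_href(h)["item"]: h for h in hrefs}

for item, href in href_by_item.items():
    print(item, describe_href(href)["file"])
```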

This notebook will be continued once the data access is implemented.