# Non-searchable STAC catalog

This notebook shows how to access items in a non-searchable STAC catalog, that is, a catalog that does not implement the [STAC API - Item Search](https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0/item-search) conformance class. To search such a catalog, the data store has to crawl through the whole catalog and match each item's properties against the search parameters. This process is therefore slow, especially for large catalogs.
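
The crawl-and-match idea can be illustrated with a minimal, self-contained sketch. It is not the `xcube-stac` implementation: the nested dictionary `toy_catalog`, its keys, and the helpers `crawl_items` and `matches` are hypothetical stand-ins, and a real crawl would fetch every catalog, collection, and item JSON over HTTP, which is where the time goes.

```python
from datetime import date


def crawl_items(catalog):
    """Yield every item found by walking the catalog tree depth-first."""
    yield from catalog.get("items", [])
    for child in catalog.get("children", []):
        yield from crawl_items(child)


def matches(item, collection, start, end):
    """Return True if the item is in the collection and overlaps [start, end]."""
    return (
        item["collection"] == collection
        and item["start"] <= end
        and item["end"] >= start
    )


# Hypothetical in-memory catalog; the structure is an assumption for illustration.
toy_catalog = {
    "children": [
        {
            "items": [
                {"id": "a_2000", "collection": "a",
                 "start": date(2000, 1, 1), "end": date(2000, 3, 31)},
                {"id": "a_2001", "collection": "a",
                 "start": date(2001, 1, 1), "end": date(2001, 3, 31)},
            ]
        },
        {
            "items": [
                {"id": "b_2000", "collection": "b",
                 "start": date(2000, 1, 1), "end": date(2000, 3, 31)},
            ]
        },
    ]
}

# Every item is visited, even though only one of them matches the query.
hits = [
    item["id"]
    for item in crawl_items(toy_catalog)
    if matches(item, "a", date(2000, 2, 1), date(2000, 2, 28))
]
print(hits)
```

Because every item has to be visited before it can be accepted or rejected, the cost of a search grows with the size of the catalog rather than with the number of hits.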

### Setup
In order to run this notebook, you need to install [`xcube`](https://xcube.readthedocs.io/en/latest/) and the [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) plugin. You can install [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) directly from the Git repository by cloning it, changing into the `xcube-stac` directory, and following the steps below:

```bash
conda env create -f environment.yml
conda activate xcube-stac
pip install .
```

Note that [`xcube`](https://xcube.readthedocs.io/en/latest/) is included in the `environment.yml`.  

First, we import everything we need:

In [1]:
from xcube.core.store import new_data_store, get_data_store_params_schema
import itertools

First, we get the store parameters needed to initialize a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). 

We set the URL of the [EcoDataCube.eu](https://stac.ecodatacube.eu/) STAC catalog and initialize a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). The `xcube-stac` plugin is selected by passing `"stac"` as the first argument to the `new_data_store` function.

In [2]:
url = "https://s3.eu-central-1.wasabisys.com/stac/odse/catalog.json"
store = new_data_store("stac", url=url)

/home/konstantin/micromamba/envs/xcube-stac/lib/python3.12/site-packages/pystac_client/client.py:190: NoConformsTo: Server does not advertise any conformance classes.


Each data ID points to a [STAC item's JSON](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md) file and is given by the segment of the item's URL that follows the catalog's URL. The data IDs can be streamed with the following code, which shows the first 10 data IDs as an example.

In [3]:
data_ids = store.get_data_ids()
list(itertools.islice(data_ids, 10))

['lcv_land.mask_eumap/lcv_land.mask_eumap_2014.01.01..2016.12.31/lcv_land.mask_eumap_2014.01.01..2016.12.31.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.06.25..2000.09.12/lcv_blue_landsat.glad.ard_2000.06.25..2000.09.12.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.09.13..2000.12.01/lcv_blue_landsat.glad.ard_2000.09.13..2000.12.01.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.12.02..2001.03.20/lcv_blue_landsat.glad.ard_2000.12.02..2001.03.20.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.03.21..2001.06.24/lcv_blue_landsat.glad.ard_2001.03.21..2001.06.24.json',
 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.06.25..2001.09.12/lcv_blue_l
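
The relation between a data ID and an item URL can be sketched as follows. This is an illustration under an assumption, not the plugin's code: it assumes that the item JSON files live below the directory that contains `catalog.json`, so that the "catalog's URL" used as prefix is the catalog URL without its file name. The helper names `catalog_base`, `item_url_from_data_id`, and `data_id_from_item_url` are hypothetical.

```python
def catalog_base(catalog_url):
    """Return the catalog URL without its file name, with a trailing slash."""
    return catalog_url.rsplit("/", 1)[0] + "/"


def item_url_from_data_id(catalog_url, data_id):
    """Join the catalog base and a data ID (assumed layout, see lead-in)."""
    return catalog_base(catalog_url) + data_id


def data_id_from_item_url(catalog_url, item_url):
    """Strip the catalog base from an item URL to recover the data ID."""
    base = catalog_base(catalog_url)
    if not item_url.startswith(base):
        raise ValueError("item URL is not located below the catalog URL")
    return item_url[len(base):]


catalog_url = "https://s3.eu-central-1.wasabisys.com/stac/odse/catalog.json"
data_id = (
    "lcv_land.mask_eumap/"
    "lcv_land.mask_eumap_2014.01.01..2016.12.31/"
    "lcv_land.mask_eumap_2014.01.01..2016.12.31.json"
)

# Round trip: data ID -> assumed item URL -> data ID.
item_url = item_url_from_data_id(catalog_url, data_id)
print(item_url)
print(data_id_from_item_url(catalog_url, item_url) == data_id)
```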

In the next step, we can search for items using search parameters. The following code shows which search parameters are available.

In [4]:
search_params = store.get_search_params_schema()
search_params

<xcube.util.jsonschema.JsonObjectSchema at 0x71bc9529e120>

Now, let's search for Landsat Thematic Mapper data for the European region during the first quarter of 2000.

In [5]:
descriptors = list(
    store.search_data(
        collections=["lcv_blue_landsat.glad.ard"],
        bbox=[-10, 40, 40, 70],
        time_range=["2000-01-01", "2000-04-01"],
    )
)
[d.to_dict() for d in descriptors]

[{'data_id': 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20.json',
  'data_type': 'dataset',
  'bbox': [-23.550818268711048,
   24.399543432891665,
   63.352379098951936,
   77.69295185585888],
  'time_range': ['1999-12-02', '2000-03-20']},
 {'data_id': 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24.json',
  'data_type': 'dataset',
  'bbox': [-23.550818268711048,
   24.399543432891665,
   63.352379098951936,
   77.69295185585888],
  'time_range': ['2000-03-21', '2000-06-24']}]
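
Both items are returned because their time ranges and bounding boxes overlap the query, even though neither lies entirely inside it. The following sketch reproduces that check with the values from the output above and one further item from the data ID listing. The helpers `intervals_overlap` and `bboxes_intersect` are hypothetical, use plain overlap tests, and ignore bounding boxes that cross the antimeridian; the actual matching rules of the data store may differ.

```python
from datetime import date


def intervals_overlap(a_start, a_end, b_start, b_end):
    """Return True if the closed intervals [a_start, a_end] and [b_start, b_end] overlap."""
    return a_start <= b_end and b_start <= a_end


def bboxes_intersect(a, b):
    """Return True if two [west, south, east, north] boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Search parameters from the query above.
query_time = (date(2000, 1, 1), date(2000, 4, 1))
query_bbox = [-10, 40, 40, 70]

# Bounding box reported for both returned items.
item_bbox = [
    -23.550818268711048,
    24.399543432891665,
    63.352379098951936,
    77.69295185585888,
]

# Time ranges taken from the item names in the data ID listing.
item_times = {
    "1999.12.02..2000.03.20": (date(1999, 12, 2), date(2000, 3, 20)),
    "2000.03.21..2000.06.24": (date(2000, 3, 21), date(2000, 6, 24)),
    "2000.06.25..2000.09.12": (date(2000, 6, 25), date(2000, 9, 12)),
}

selected = [
    name
    for name, (start, end) in item_times.items()
    if intervals_overlap(start, end, *query_time)
    and bboxes_intersect(item_bbox, query_bbox)
]
print(selected)
```

Under these assumptions, the first two items are selected and the third, which starts after the query's end date, is not.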

In the next step, we can open the data for each data ID. The following code shows which parameters are available for opening the data.

In [6]:
open_params = store.get_open_data_params_schema()
open_params

<xcube.util.jsonschema.JsonObjectSchema at 0x71bc941e6ba0>

Now, we lazily load the data for the given data IDs, selecting all available assets that can be opened by the data store.

In [7]:
for descriptor in descriptors:
    ds = store.open_data(descriptor.data_id)
    print("-" * 100)
    print(ds)

----------------------------------------------------------------------------------------------------
<xarray.Dataset> Size: 114GB
Dimensions:      (x: 188000, y: 151000)
Coordinates:
  * x            (x) float64 2MB 9e+05 9e+05 9.001e+05 ... 6.54e+06 6.54e+06
  * y            (y) float64 1MB 5.46e+06 5.46e+06 ... 9.301e+05 9.3e+05
    spatial_ref  int64 8B 0
Data variables:
    blue_p50     (y, x) uint8 28GB dask.array<chunksize=(512, 512), meta=np.ndarray>
    blue_p25     (y, x) uint8 28GB dask.array<chunksize=(512, 512), meta=np.ndarray>
    blue_p75     (y, x) uint8 28GB dask.array<chunksize=(512, 512), meta=np.ndarray>
    qa_f         (y, x) uint8 28GB dask.array<chunksize=(512, 512), meta=np.ndarray>
----------------------------------------------------------------------------------------------------
<xarray.Dataset> Size: 114GB
Dimensions:      (x: 188000, y: 151000)
Coordinates:
  * x            (x) float64 2MB 9e+05 9e+05 9.001e+05 ... 6.54e+06 6.54e+06
  * y            (y) fl