# Searching the EOPF Sentinel Zarr Samples Service STAC API

### Introduction

In this tutorial, we will learn how to access the Sentinel-1, Sentinel-2 and Sentinel-3 `.zarr` collections available in the [EOPF Sentinel Zarr Sample Service STAC Catalog](https://stac.browser.user.eopf.eodc.eu/?.language=en). <br>
Its STAC API provides a structured way to search and access the EOPF data from the Python programming language.

### What we will learn
- 🔍 How to **programmatically browse** available collections inside the EOPF STAC API
- 📊 Understanding **collection metadata** in user-friendly terms
- 🎯 **Searching for specific data** with the help of the `pystac` and `pystac_client` libraries

### Prerequisites

For this tutorial, we will use the `pystac` and `pystac_client` libraries, which make it easier to send requests to a STAC API and search through its contents.
Check out the [pystac documentation](https://pystac.readthedocs.io/en/stable/) and the [pystac_client documentation](https://pystac-client.readthedocs.io/en/latest/api.html) for additional resources.

<hr>

#### Import libraries

In [1]:
import requests
from typing import List, Optional, cast
from pystac import Collection, MediaType
from pystac_client import Client, CollectionClient
from datetime import datetime

#### Helper functions

##### `list_found_elements`
Since our searches return several elements, we define a function that collects the `id` of each item and the `id` of its collection in two lists, so that we can retrieve them later.

In [2]:
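def list_found_elements(search_result):
    """Return two lists: the item ids and the collection ids of a search result."""
    item_ids = []        # avoids shadowing the built-in `id`
    collection_ids = []
    for item in search_result.items(): # iterates over the items found in the catalogue
        item_ids.append(item.id)
        collection_ids.append(item.collection_id)
    return item_ids, collection_ids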
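
To see what the helper returns without contacting the API, it can be exercised on a small stand-in object. The sketch below is only illustrative: `FakeSearch` and its two items are hypothetical and are not part of `pystac_client`; they merely mimic the `.items()` method and the `id` / `collection_id` attributes that the helper relies on.

```python
from types import SimpleNamespace


def list_found_elements(search_result):
    """Return two lists: the item ids and the collection ids of a search result."""
    item_ids = []
    collection_ids = []
    for item in search_result.items():
        item_ids.append(item.id)
        collection_ids.append(item.collection_id)
    return item_ids, collection_ids


class FakeSearch:
    """Hypothetical stand-in for a search result (not a pystac_client class)."""

    def __init__(self, items):
        self._items = items

    def items(self):
        # Mimics the iterator of items that a real search result provides.
        return iter(self._items)


fake_search = FakeSearch([
    SimpleNamespace(id="item-a", collection_id="sentinel-2-l2a"),
    SimpleNamespace(id="item-b", collection_id="sentinel-2-l2a"),
])

item_ids, collection_ids = list_found_elements(fake_search)
print(item_ids)
print(collection_ids)
```

The first list holds the item ids and the second the matching collection ids, which is why the tutorial reads the number of results from `len(result[0])`.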

<hr>

## Establish the connection

Our first step is to create our connection to interact with the EOPF STAC Catalog.<br>
This involves defining the starting point for the data we wish to retrieve.<br>

The API's base URL is available through the 🔗**Source** ([click here](https://stac.core.eopf.eodc.eu/)), which can be found in the **API & URL** tab of the [EOPF Sentinel Zarr Sample Service STAC Catalog](https://stac.browser.user.eopf.eodc.eu/?.language=en).

![EOPF API url for connection](img/api_connection.png)

With the `Client.open()` function, we can open the root of the catalog by passing it the URL of the API.

In [3]:
max_description_length = 100

eopf_stac_api_root_endpoint = "https://stac.core.eopf.eodc.eu/" #root starting point
eopf_catalog = Client.open(url=eopf_stac_api_root_endpoint) # calls the selected url

Verifying the catalog we have just accessed:

In [4]:
print(
    "Selected Catalog: {id}: {description}".format(
        id=eopf_catalog.id,
        description=eopf_catalog.description
    )
)

Selected Catalog: eopf-sample-service-stac-api: STAC catalog of the EOPF Sentinel Zarr Samples Service
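
Once connected, the available collections can be listed in a compact form. The following is a minimal sketch rather than part of the original notebook: `summarise_collections` is a hypothetical helper that only assumes each collection exposes `id` and `description` attributes, and it is demonstrated on a stand-in object so that it runs without a network connection.

```python
import textwrap
from types import SimpleNamespace


def summarise_collections(collections, max_length=100):
    """Return one 'id: description' line per collection, shortening long descriptions."""
    lines = []
    for collection in collections:
        description = textwrap.shorten(
            collection.description or "", width=max_length, placeholder="..."
        )
        lines.append(f"{collection.id}: {description}")
    return lines


# Stand-in collection (hypothetical) so the sketch runs offline.
demo_collections = [SimpleNamespace(id="demo-collection", description="word " * 50)]
summary = summarise_collections(demo_collections, max_length=40)
print(summary[0])
```

Against the live catalog, the same helper could be called as `summarise_collections(eopf_catalog.get_collections(), max_description_length)`, reusing the `max_description_length` value defined in the connection cell.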


<!-- It is important to remember that the Sentinel Zarr Sample Service STAC **is still under development** and recieves constant updates and additions to the Collections.<br>
To ensure we have access to available resources, we include some verification of the availability of data inside the catalogue. <br>
This proactive step helps us understand what data is currently accessible. -->

> **Note:** <br>
> To follow known issues and further considerations, check the [EOPF Sentinel Zarr Samples Service](https://zarr.eopf.copernicus.eu/) updates and its [GitHub Issues](https://github.com/EOPF-Sample-Service/eopf-stac/issues).

## Searching inside the EOPF STAC API

The `.search()` method of our client (the catalog we opened above) accepts a series of parameters that let us filter the data down to the items matching the criteria we are interested in.


### Collection

To search within an individual collection, we pass its `id` to the `collections` parameter of the search.<br>
This lets us restrict the results to exactly the data we need.<br>
We will focus on the Sentinel-2 L2A collection for this tutorial. In the EOPF STAC Catalog, the corresponding `id` is `sentinel-2-l2a`.

In [5]:
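sentinel2 = eopf_catalog.search(collections='sentinel-2-l2a', max_items=1) # the collection we are interested in

# a search object is always truthy, so we check that it returns at least one item
if next(sentinel2.items(), None) is not None:
    print('Selected Collection Exists') # to validate the search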

Selected Collection Exists


We are able to retrieve certain metadata that will allow us to find more information about the selected collection, such as keywords, the ID and useful links for resources.

In [6]:
S2l2a_coll = eopf_catalog.get_collection('sentinel-2-l2a')
print('Keywords:        ',S2l2a_coll.keywords)
print('Collection ID:   ',S2l2a_coll.id)
print('Available Links: ',S2l2a_coll.links)

Keywords:         ['Copernicus', 'Sentinel', 'EU', 'ESA', 'Satellite', 'Global', 'Imagery', 'Reflectance']
Collection ID:    sentinel-2-l2a
Available Links:  [<Link rel=items target=https://stac.core.eopf.eodc.eu/collections/sentinel-2-l2a/items>, <Link rel=parent target=https://stac.core.eopf.eodc.eu/>, <Link rel=root target=<Client id=eopf-sample-service-stac-api>>, <Link rel=self target=https://stac.core.eopf.eodc.eu/collections/sentinel-2-l2a>, <Link rel=license target=https://sentinel.esa.int/documents/247904/690755/Sentinel_Data_Legal_Notice>, <Link rel=cite-as target=https://doi.org/10.5270/S2_-znk9xsj>, <Link rel=http://www.opengis.net/def/rel/ogc/1.0/queryables target=https://stac.core.eopf.eodc.eu/collections/sentinel-2-l2a/queryables>]


### Temporal Extent

STAC also allows us to filter data by any period we are interested in, as long as data is available for it.<br>
The `datetime` parameter of `eopf_catalog.search()` restricts the results to imagery captured within a particular period, and it can be combined with the `collections` parameter in the same search.<br>
As an example, we will define an interval that spans from May 1, 2020, to May 31, 2023.<br>

In [7]:
time_frame = eopf_catalog.search(  #searching the catalog
    collections='sentinel-2-l2a',
    datetime="2020-05-01T00:00:00Z/2023-05-31T23:59:59.999999Z")  # the interval we are interested in, separated by '/'

time_items=list_found_elements(time_frame) #we apply our constructed function

#Results
print("Search Results:")
print('Total Items Found for Sentinel-2 L-2A between May 1, 2020, and May 31, 2023:  ',len(time_items[0]))


Search Results:
Total Items Found for Sentinel-2 L-2A between May 1, 2020, and May 31, 2023:   196
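
The interval string can also be built from Python `datetime` objects instead of being typed by hand. This is a small sketch under the assumption that both datetimes are timezone-aware; `to_interval` is a hypothetical helper name and not part of `pystac_client`.

```python
from datetime import datetime, timezone


def to_interval(start, end):
    """Format two timezone-aware datetimes as a 'start/end' interval in UTC."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    if start_utc > end_utc:
        raise ValueError("start must not be after end")
    return f"{start_utc.strftime(fmt)}/{end_utc.strftime(fmt)}"


interval = to_interval(
    datetime(2020, 5, 1, tzinfo=timezone.utc),
    datetime(2023, 5, 31, 23, 59, 59, tzinfo=timezone.utc),
)
print(interval)
```

The resulting string can be passed to the `datetime` parameter of the search in the same way as the hand-written interval above.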


### Spatial Extent
To narrow down our data search, we can define a specific area of interest.<br>
We can do this by providing a bounding box (`bbox`), given as the longitude and latitude of its south-west (bottom-left) corner followed by those of its north-east (top-right) corner. It is similar to drawing the extent in the interactive map of the EOPF browser interface.<br>

As an example, we can focus our search on the outskirts of Innsbruck, Austria.

In [8]:
bbox_search = eopf_catalog.search(  #searching the catalog
    collections='sentinel-2-l2a',
    bbox=(
        11.124756, 47.311058, # south-west corner (min longitude, min latitude)
        11.459839, 47.463624  # north-east corner (max longitude, max latitude)
        )
)

innsbruck_sets=list_found_elements(bbox_search) #we apply our constructed function that stores internal information

#Results
print("Search Result:")
print('Total Items Found:  ',len(innsbruck_sets[0]))

Search Result:
Total Items Found:   38
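
Because the order of the four `bbox` values is easy to mix up, a small check before searching can help. The sketch below is an illustration only: `make_bbox` is a hypothetical helper that assumes longitude/latitude coordinates in degrees and returns them in the (west, south, east, north) order used above.

```python
def make_bbox(west, south, east, north):
    """Return (west, south, east, north) after basic sanity checks on a lon/lat box."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitudes must be within [-90, 90]")
    if south > north:
        raise ValueError("south must not exceed north")
    # west > east is left unchecked on purpose: such a box can describe
    # an area that crosses the antimeridian.
    return (west, south, east, north)


innsbruck_bbox = make_bbox(11.124756, 47.311058, 11.459839, 47.463624)
print(innsbruck_bbox)
```

The returned tuple can be passed directly as the `bbox` argument of the search. Note that the check cannot detect swapped longitude and latitude when both values happen to be valid in either role, as is the case for Innsbruck.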


At the time this tutorial was last updated, the search returned 38 `zarr`-encoded items that intersect the defined area of interest.<br>

This gives us a clear picture of the data density and variety available for our Area of Interest (AOI).

#### Collection + Temporal + Spatial extents
A common workflow in EO analysis is to retrieve datasets within an AOI and a time frame. `pystac_client` allows us to combine the `collections`, `bbox` and `datetime` arguments in a single search for a more precise data retrieval.

Searching over Innsbruck in the **Sentinel-2 Level-2A** collection, with the time frame now extended to May 31, 2025:

In [9]:
innsbruck_s2 = eopf_catalog.search( 
    collections= 'sentinel-2-l2a', # interest Collection,
    bbox=(11.124756, 47.311058, # AOI extent
          11.459839,47.463624),
    datetime='2020-05-01T00:00:00Z/2025-05-31T23:59:59.999999Z' # interest period
)

combined_ins =list_found_elements(innsbruck_s2)

print("Search Results:")
print('Total Items Found for Sentinel-2 L-2A over Innsbruck:  ',len(combined_ins[0]))

Search Results:
Total Items Found for Sentinel-2 L-2A over Innsbruck:   27


We can try the same workflow for a completely different location. <br>

Let's define a new AOI along the coast near Rostock, Germany, and search the **Sentinel-3 SLSTR-L2** collection:

In [10]:
rostock_s3 = eopf_catalog.search(
    bbox=(11.766357,53.994566, # AOI extent
          12.332153,54.265086),
    collections= ['sentinel-3-slstr-l2-lst'], # interest Collection
    datetime='2020-05-01T00:00:00Z/2025-05-31T23:59:59.999999Z' # interest period
)

combined_ros=list_found_elements(rostock_s3)

print("Search Results:")
print('Total Items Found for Sentinel-3 SLSTR-L2 over Rostock Coast:  ',len(combined_ros[0]))

Search Results:
Total Items Found for Sentinel-3 SLSTR-L2 over Rostock Coast:   13
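
Since the two searches above differ only in their arguments, the parameters can be kept in plain dictionaries and reused. This is a sketch of one possible arrangement, not the notebook's own approach; `build_search_params` is a hypothetical helper, and the collection ids, bounding boxes and period are the ones used in the cells above.

```python
def build_search_params(collection_id, bbox, start, end):
    """Assemble keyword arguments for a search: one collection, one bbox, one period."""
    if len(bbox) != 4:
        raise ValueError("bbox must contain four values: west, south, east, north")
    return {
        "collections": [collection_id],
        "bbox": tuple(bbox),
        "datetime": f"{start}/{end}",
    }


period = ("2020-05-01T00:00:00Z", "2025-05-31T23:59:59.999999Z")

searches = {
    "Innsbruck": build_search_params(
        "sentinel-2-l2a", (11.124756, 47.311058, 11.459839, 47.463624), *period
    ),
    "Rostock coast": build_search_params(
        "sentinel-3-slstr-l2-lst", (11.766357, 53.994566, 12.332153, 54.265086), *period
    ),
}

for name, params in searches.items():
    print(name, params)
```

Each dictionary can then be unpacked into the search call, for example `eopf_catalog.search(**searches["Innsbruck"])`, and the result passed to `list_found_elements` as before.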


### Location in Catalogue

So far, we have searched the STAC catalog and browsed the general metadata of the collections. To **download** or open the actual `zarr` items, we need the URL of their storage location in the cloud.<br>
This location is stored in the assets of each item, which we can retrieve with the `.get_assets()` method. Its `media_type` argument lets us keep only the assets of the type we are interested in, in our case `MediaType.ZARR`.

We will focus on the first of the 27 items available over Innsbruck:

In [11]:
assets_loc = []  # one dictionary of .zarr assets per item found over Innsbruck
for item_id in combined_ins[0]:  # we loop over every item id in the Innsbruck results
    assets_loc.append(S2l2a_coll  # the Sentinel-2 L2A collection
                      .get_item(item_id)  # we fetch the item by its id
                      .get_assets(media_type=MediaType.ZARR))  # we keep only the .zarr assets

first_item = assets_loc[0]  # the assets of the first item in our list

print("Search Results:")
print('URL for accessing', combined_ins[0][0], 'item:  ', first_item['product'])  # the complete .zarr product of the first item


Search Results:
URL for accessing S2B_MSIL2A_20250530T101559_N0511_R065_T32TPT_20250530T130924 item:   <Asset href=https://objects.eodc.eu:443/e05ab01a9d56408d82ac32d69a5aae2a:202505-s02msil2a/30/products/cpm_v256/S2B_MSIL2A_20250530T101559_N0511_R065_T32TPT_20250530T130924.zarr>
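
The printed value is an asset object rather than a plain string. In `pystac`, the URL itself is held in the asset's `href` attribute. The sketch below illustrates collecting those URLs into a dictionary; `asset_hrefs` is a hypothetical helper, and the stand-in asset simply reuses the URL printed above so that the example runs offline.

```python
from types import SimpleNamespace


def asset_hrefs(assets):
    """Map each asset key to the plain URL string stored in its `href` attribute."""
    return {key: asset.href for key, asset in assets.items()}


# Stand-in for the dictionary returned by `.get_assets()` (hypothetical object).
demo_assets = {
    "product": SimpleNamespace(
        href=(
            "https://objects.eodc.eu:443/e05ab01a9d56408d82ac32d69a5aae2a:202505-s02msil2a"
            "/30/products/cpm_v256/"
            "S2B_MSIL2A_20250530T101559_N0511_R065_T32TPT_20250530T130924.zarr"
        )
    )
}

hrefs = asset_hrefs(demo_assets)
print(hrefs["product"])
```

With the real data, `asset_hrefs(first_item)["product"]` would give the URL of the complete `.zarr` product as a string, ready to be handed to another library.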


This URL is the same one we would find by opening the item in the [EOPF Sentinel Zarr Sample Service STAC Catalog](https://stac.browser.user.eopf.eodc.eu/?.language=en) browser.
This is the strength of the STAC API: it allows us to retrieve the same information programmatically, for many items at once.

### Item Metadata

Finally, to get an overview of the available assets inside the selected item, we can list them directly from the information we have stored:

In [12]:
print('Available Assets: ', list(first_item.keys()))

Available Assets:  ['SR_10m', 'SR_20m', 'SR_60m', 'AOT_10m', 'B01_20m', 'B02_10m', 'B03_10m', 'B04_10m', 'B05_20m', 'B06_20m', 'B07_20m', 'B08_10m', 'B09_60m', 'B11_20m', 'B12_20m', 'B8A_20m', 'SCL_20m', 'TCI_10m', 'WVP_10m', 'product']


## 💪 Now it is your turn

These exercises will help you master the STAC API and understand how to find the data you need.



#### 1. Explore Your Own Area of Interest

1. Go to [http://bboxfinder.com/](http://bboxfinder.com/) and select an area of interest (your hometown, a research site, etc.)
2. Copy the bounding box coordinates of your area of interest
3. Change the provided code to search for data over your area of interest

#### 2. Temporal Analysis

1. Compare data availability across different years for the **Sentinel-2 L-2A Collection**.
2. Search for 2022
3. Search for 2024

#### 3. Explore the SAR Mission and combine multiple criteria

1. Do the same for another mission, for example the **Sentinel-1 Level-1 GRD**. For this one, the collection `id` is `sentinel-1-l1-grd`
2. How many assets are available for the year 2024?



<hr>

## Conclusion

This tutorial has provided a clear and practical introduction to exploring the [EOPF Sentinel Zarr Sample Service STAC API](https://stac.browser.user.eopf.eodc.eu/?.language=en).<br>
We explored how to connect to the EOPF STAC API, navigate its structure, and filter data by spatial and temporal criteria through Python. <br>

By leveraging the `pystac` and `pystac_client` libraries, we have the tools to efficiently search for and access the vast amounts of Earth Observation data available through this powerful catalog. <br>
This understanding forms a solid foundation for further analysis and application of Sentinel data in your projects.

<hr>

## What's next?

In the following tutorial, we will explore how to retrieve an item of interest, based on several parameters, and load it with `xarray`.<br>

This will allow us to work seamlessly with the multi-dimensional array data stored inside `zarr`, opening a new workflow for the analysis and visualisation of EOPF data from the Copernicus Sentinel-1, Sentinel-2 and Sentinel-3 missions.