In [1]:
# part of the Python standard library
import os
import datetime 

#installed with conda-forge
import numpy
import xarray as xr
import matplotlib.pyplot as plt
import folium
import geopandas as gpd
from shapely.geometry import Point

#installed with pip 
import pystac
import pystac_client
import xarray_eopf
import rasterio

# **Same visuals, but different on the inside**

The new Earth Observation Processing Framework (**EOPF**) Zarr format is changing the way Sentinel products are delivered. The existing .SAFE format is being replaced by Zarr, but don't be scared: the Sentinel data is the same, only the storage format and the delivery are different. Let's take a look...

## **STAC catalog**

The STAC catalog, when you can access the data, has the same appearance and organization schema!

| ![Image 1](img/old.png) | ![Image 2](img/zarr.png) |
|---------------------------------|------------------------|
| CDSE - STAC API                 | EOPF Sentinel Zarr Samples Service STAC API |


But if you look carefully, there is a small but significant difference: the EOPF Zarr service has Sentinel-1 SLC data, which didn't exist in the old CDSE STAC catalogue. The main reason is that the new Zarr format, being cloud-native, handles heavier data such as Sentinel-1 SLC much more simply (*SLC contains not only the backscatter information but also the phase information, which is essential for InSAR, but we'll talk about that later*).

**Access the data using ID** - The way data is presented is the same, even though the way it is stored is not. This is great news! If you want to access a specific product, it still follows the same logic as it did for the CDSE STAC catalog.

Let's follow this example on how to access Sentinel-1 Level-1 GRD data to see how the data is stored on the EOPF Sentinel Zarr Samples Service.

### 1. Source ID for the STAC

<img src="img/source_id_catalogue.png" width="500"/>

In [2]:
stacID = "eopf-sample-service-stac-api"
stacMetadata = "https://stac.core.eopf.eodc.eu/"

### 2. Source ID for Sentinel-1 Level-1 GRD

<img src="img/source_id_grd.png" width="500"/>

In [3]:
grdID = "sentinel-1-l1-grd"
grdMetadata = "https://stac.core.eopf.eodc.eu/collections/sentinel-1-l1-grd"

### 3. Source ID for the specific product

<img src="img/source_id_product.png" width="500"/>

In [4]:
productID = "S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD"
productMetadata = "https://stac.core.eopf.eodc.eu/collections/sentinel-1-l1-grd/items/S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD"
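
The three metadata URLs above follow a regular pattern: the catalog root, then `collections/<collection ID>`, then `items/<product ID>`. Below is a minimal sketch of that pattern, assuming the service keeps this layout for other collections and products as well. `build_stac_urls` is a hypothetical helper written for this notebook, not a function from `pystac` or `pystac_client`.

```python
def build_stac_urls(root: str, collection_id: str, item_id: str) -> dict:
    """Return catalog, collection and item URLs (hypothetical helper)."""
    base = root.rstrip("/")  # avoid a double slash when joining
    collection_url = f"{base}/collections/{collection_id}"
    return {
        "catalog": base + "/",
        "collection": collection_url,
        "item": f"{collection_url}/items/{item_id}",
    }


urls = build_stac_urls(
    "https://stac.core.eopf.eodc.eu/",
    "sentinel-1-l1-grd",
    "S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD",
)
print(urls["collection"])
print(urls["item"])
```

The function only builds strings; it does not check that the collection or the product actually exists on the service.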

## **But Attention** 🚨

Two things should be noted:

**1-** If you look for the same product in the old catalog, you'll find basically the same product but with a different identification name. Are they the same product? Why do they have different names?

**2-** When checking the source ID for the GRD product, we can see that it's not valid. What does that mean?

Let's dive into these problems, one by one!


#### **1. ID names of products**
Let's analyse the ID names of Sentinel products!

Each Sentinel product name follows a similar structure, with identifiers that distinguish one product from another. Each product can be seen as a "photo" taken by the satellite and, for each photo, the information is stored under this product ID.

In [5]:
print("Product ID =", productID)

Product ID = S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD


And, because this product belongs to a specific group of products (Sentinel-1 Level-1 GRD, but it could be Sentinel-1 Level-1 SLC or Sentinel-2 Level-2A), it also has a collection ID.

In [6]:
print("Collection ID =", grdID)

Collection ID = sentinel-1-l1-grd


In the end, all the products from all the collections are stored on the same STAC catalog.

In [7]:
print("Catalog ID =", stacID)

Catalog ID = eopf-sample-service-stac-api


------

Now let's break down the product name from the EOPF Sentinel Zarr Samples Service:

1- ```S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD``` 

and compare it with the same product from the old STAC catalog: 

2- ```S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_E620_COG```

Product from EOPF Sentinel Zarr Samples Service: `S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD`:

| Part                 | Value               | Meaning                                                                |
|----------------------|---------------------|------------------------------------------------------------------------|
| **S1A**              | S1A                 | **Satellite**: Sentinel-1A                                             |
| **IW**               | IW                  | **Acquisition Mode**: Interferometric Wide Swath                       |
| **GRDH**             | GRDH                | **Product Type**: Ground Range Detected (GRD), High resolution         |
| **1SSH**             | 1SSH                | **Product Level and Polarisation**: Level-1, Standard class, Single polarisation (HH) |
| **20250708T124813**  | 08-07-2025 12:48:13 | **Start Time** (UTC): acquisition start time                           |
| **20250708T124838**  | 08-07-2025 12:48:38 | **Stop Time** (UTC): acquisition end time                              |
| **059992**           | 059992              | **Absolute Orbit Number**: Sequential number representing how many complete orbits the satellite has made since launch. It increases by 1 for each orbit (approx. every 98 minutes)                          |
| **0773EF**           | 0773EF              | **Mission Data Take ID**: It increases each time the sensor is turned on, kind of identifying a "recording session". It is useful to identify if two products were taken on the same continuous acquisition                                                |
| **6FFD**             | 6FFD                | **Unique Identifier / Product ID**: It is unique for each product, similar to an ID number|
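
The breakdown in the table can also be done programmatically. Below is a minimal sketch, assuming the name always has the nine underscore-separated parts shown above, optionally followed by a suffix. `parse_s1_name` and the field names in `FIELDS` are hypothetical, chosen for this notebook only.

```python
from datetime import datetime

# Field names are this notebook's own labels, in the order of the table.
FIELDS = ["satellite", "mode", "product_type", "level_class_pol",
          "start", "stop", "orbit", "datatake", "unique_id"]


def parse_s1_name(name: str) -> dict:
    """Split a Sentinel-1 product name into the fields from the table."""
    parts = name.split("_")
    if len(parts) < len(FIELDS):
        raise ValueError(f"expected at least {len(FIELDS)} parts, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    # Anything after the unique identifier is kept as a suffix.
    info["suffix"] = "_".join(parts[len(FIELDS):])
    # Start and stop times use the compact YYYYMMDDTHHMMSS layout.
    for key in ("start", "stop"):
        info[key] = datetime.strptime(info[key], "%Y%m%dT%H%M%S")
    return info


info = parse_s1_name(
    "S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD"
)
print(info["satellite"], info["level_class_pol"], info["unique_id"])
print("Acquisition length:", info["stop"] - info["start"])
```

Converting the two timestamps to `datetime` objects makes it easy to compute the acquisition length, which is 25 seconds for this product.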

-------

Now, let's compare the two products!

`S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD` vs `S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_E620_COG`


| Field                              | EOPF Sentinel Zarr Samples Service | Old STAC catalog    | Is it the same? |
|------------------------------------|------------------------------------|---------------------|-----------------|
| **Satellite**                      | S1A                                | S1A                 | ✅ |
| **Acquisition Mode**               | IW                                 | IW                  | ✅ |
| **Product Type**                   | GRDH                               | GRDH                | ✅ |
| **Product Level and Polarisation** | 1SSH                               | 1SSH                | ✅ |
| **Start Time**                     | 08-07-2025 12:48:13                | 08-07-2025 12:48:13 | ✅ |
| **Stop Time**                      | 08-07-2025 12:48:38                | 08-07-2025 12:48:38 | ✅ |
| **Absolute Orbit Number**          | 059992                             | 059992              | ✅ |
| **Mission Data Take ID**           | 0773EF                             | 0773EF              | ✅ |
| **Unique Identifier**              | 6FFD                               | E620                | ❌ |
| **Extra Notation**                 | -                                  | COG                 | ❌ |

What does this mean? In fact, these are exactly the same product; only the unique identifier and the extra notation differ. This is because:
- The 6FFD unique identifier is the original ESA-generated product ID, created when the scene was first processed, while the E620 identifier was generated when the file was reprocessed and converted into a COG (Cloud Optimized GeoTIFF).
- This leads us to the extra notation found in the old STAC catalog. The COG suffix indicates that the product was repackaged for cloud/web use as a COG. We don't face this situation with the new Zarr format.
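
The field-by-field comparison can be reproduced in a few lines. This is a minimal sketch, assuming both names share the nine-part layout and that matching on everything except the unique identifier is a reasonable test for "same acquisition". `compare_names` and `same_acquisition` are hypothetical helpers.

```python
LABELS = ["Satellite", "Acquisition Mode", "Product Type",
          "Product Level and Polarisation", "Start Time", "Stop Time",
          "Absolute Orbit Number", "Mission Data Take ID", "Unique Identifier"]


def compare_names(name_a: str, name_b: str) -> dict:
    """Map each field label to True when both names agree on that field."""
    parts_a, parts_b = name_a.split("_"), name_b.split("_")
    if min(len(parts_a), len(parts_b)) < len(LABELS):
        raise ValueError("both names need at least nine underscore-separated parts")
    # zip stops after the nine labelled fields, so a trailing suffix is ignored.
    return {label: a == b for label, a, b in zip(LABELS, parts_a, parts_b)}


def same_acquisition(name_a: str, name_b: str) -> bool:
    """True when every field except the unique identifier matches."""
    result = compare_names(name_a, name_b)
    return all(ok for label, ok in result.items() if label != "Unique Identifier")


zarr_name = "S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD"
cog_name = "S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_E620_COG"

differing = [label for label, ok in compare_names(zarr_name, cog_name).items() if not ok]
print("Fields that differ:", differing)
print("Same acquisition:", same_acquisition(zarr_name, cog_name))
```

For these two names the only labelled field that differs is the unique identifier, so the check treats them as the same acquisition.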

#### **2. GRD product not valid**
(The reason has not been identified yet.)

(this one: ```S1A_IW_GRDH_1SSH_20250708T124813_20250708T124838_059992_0773EF_6FFD```)

## **Accessing the product from EOPF Sentinel Zarr Samples**
Starting from the beginning, let's access a specific product. This product, `S1A_IW_GRDH_1SDV_20250629T234043_20250629T234108_059868_076FA4_A8F4`, was the chosen one!

The new Zarr format stores Sentinel data in four different groups. It is important to understand this structure because later, when we need to access the data, we'll need to know where to find it.

The EOPF Zarr Sentinel structure contains four main groups:
- **Attributes**: STAC-format metadata of the imagery, such as chunking information (how the data is divided into chunks) and product-specific metadata (like acquisition time or sensor specifics);
- **Measurements**: Main retrieved variables, such as backscatter (for Sentinel-1 GRD), phase information (for Sentinel-1 SLC) or the individual band reflectances (for Sentinel-2);
- **Conditions**: Geometric angles, meteorological and instrumental data, or any other information concerning the acquisition conditions;
- **Quality**: Quality information and masks concerning the measurements.

-----

To start accessing and exploring the data we'll need the following libraries:

In [10]:
import os
import xarray as xr

We'll use a function from the ``xarray`` library called ``open_datatree()``. This function opens hierarchical datasets, which is a fancy way to say that our dataset is structured in folders (in our case called groups) and subfolders (or subgroups). So, in conclusion, using ``open_datatree()`` allows us to **load the entire structure of groups and subgroups into accessible objects**, where we can navigate, inspect and extract data from any level.

On the other hand, if we prefer to open just one group at a time (instead of unfolding the whole dataset tree), we can use the ``open_dataset()`` function and proceed with the same inspection.

In [28]:
url = "https://objects.eodc.eu/e05ab01a9d56408d82ac32d69a5aae2a:202506-s01siwgrh/29/products/cpm_v256/S1A_IW_GRDH_1SDV_20250629T234043_20250629T234108_059868_076FA4_A8F4.zarr"

In [33]:
sample = xr.open_datatree(
    url,
    engine="eopf-zarr",  # backend provided by the xarray_eopf package
    op_mode="native",    # keep the native EOPF structure (no analysis mode)
    chunks={},           # use the store's default chunking and load lazily
)
#print(sample)

In [30]:
def print_gen_structure(node, indent=""):
    print(f"{indent}{node.name}")  # print this node's name at the current indentation
    for child_name, child_node in node.children.items():  # loop over the direct children
        print_gen_structure(child_node, indent + "  ")  # recurse, indenting one level deeper

In [31]:
print_gen_structure(sample.root)

None
  S01SIWGRD_20250629T234043_0025_A342_A8F4_076FA4_VH
    conditions
      antenna_pattern
      attitude
      azimuth_fm_rate
      coordinate_conversion
      doppler_centroid
      gcp
      orbit
      reference_replica
      replica
      terrain_height
    measurements
    quality
      calibration
      noise
  S01SIWGRD_20250629T234043_0025_A342_A8F4_076FA4_VV
    conditions
      antenna_pattern
      attitude
      azimuth_fm_rate
      coordinate_conversion
      doppler_centroid
      gcp
      orbit
      reference_replica
      replica
      terrain_height
    measurements
    quality
      calibration
      noise
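
The printed tree is handy for reading, but a list of group paths is often easier to reuse in code. Below is a minimal sketch of the same recursion that returns paths instead of printing them. It assumes only that each node has a `children` mapping, as the nodes used above do. `collect_paths` and `make_node` are hypothetical helpers, and the demo tree uses made-up group names so the example runs offline.

```python
from types import SimpleNamespace


def collect_paths(node, prefix=""):
    """Return the slash-separated path of every group below `node`."""
    paths = []
    for child_name, child_node in node.children.items():
        path = f"{prefix}/{child_name}"
        paths.append(path)
        paths.extend(collect_paths(child_node, path))  # descend one level
    return paths


def make_node(**children):
    """Build a stand-in node that only carries a `children` mapping."""
    return SimpleNamespace(children=children)


# Stand-in tree with made-up names; not the real product structure.
demo = make_node(
    product_VH=make_node(
        conditions=make_node(orbit=make_node()),
        measurements=make_node(),
    ),
)
print(collect_paths(demo))
```

With the real tree, `collect_paths(sample)` should return paths of the same shape as the listing above, which can then be used to pick a single group.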


In [9]:
import requests
from typing import List, Optional, cast
from pystac import Collection, MediaType
from pystac_client import Client, CollectionClient
from datetime import datetime

**It's not possible to download the files**