# Deep dive into the Zarr format: Inside `Sentinel_1_SLC.zarr`

## Introduction
This tutorial introduces the structure of a `zarr` sample for **Sentinel 1 SLC** (Single Look Complex) radar data. We will demonstrate how to visualise the `.zarr` encoding structure, explore embedded information, and retrieve metadata for further processing.

### Prerequisites
A sample dataset for this tutorial can be obtained from the [EOPF available Samples](https://common.s3.sbg.perf.cloud.ovh.net/product.html). To explore a different dataset, update the code at the places marked in the comments.

For local **Sentinel 1 SLC** data exploration, download a product whose name follows the pattern `S01SIWSLC_....zarr` and place it in the same directory as this example.

> **Note:** <br>
> Further sample descriptions will be included in subsequent notebook updates.<br>
> To look into the `.zarr` product naming, visit [the EOPF product types and file naming rules](https://cpm.pages.eopf.copernicus.eu/eopf-cpm/main/PSFD/3-product-types-naming-rules.html).<br>
> Product names give some context about the type of product we are working with.

To manage the required libraries, it is recommended to work within a dedicated, stable environment. To ensure package compatibility and avoid conflicts, the following virtual environment setup is suggested. Note that `os` is part of the Python standard library and does not need to be installed, while `dask` is included because opening the store with `chunks={}` relies on it.

For Conda:

`conda create --name zarr_explore python=3.11 xarray zarr dask numpy jupyter`

For pip (for Windows):

`python -m venv .zarr_explore`<br>
`.zarr_explore\Scripts\activate.bat`<br>
`pip install xarray zarr dask numpy jupyter`

### Setting up the environment
The `xarray` library facilitates the handling of labeled multi-dimensional arrays, enabling more efficient processing. This library will be explored in detail in [Chapter 3](). <br>
Check out their [documentation](https://docs.xarray.dev/en/stable/) for additional resources.

We then import the specific dependencies.

In [1]:
import os
import xarray as xr

To get an overview of the groups stored inside the `zarr`, the function below walks the tree recursively and prints only the name of each node, indented by depth. <br>
This gives a general overview of the stored elements without the detailed default `xarray` representation.

In [2]:
def print_gen_structure(node, indent=""):
    print(f"{indent}{node.name}")  # print the name of the current node
    for child_name, child_node in node.children.items():  # loop over the direct children
        print_gen_structure(child_node, indent + "  ")  # recurse, indenting one level deeper
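
As a small variation on the function above, the same recursion can return the group paths as a list instead of printing them, which is convenient for filtering. The sketch below is not part of the original notebook: `list_group_paths` and `make_node` are hypothetical helpers, and the `SimpleNamespace` objects are stand-ins that only mimic the `.name` and `.children` attributes of a `DataTree` node, so the example runs without the sample file. The group names are illustrative.

```python
from types import SimpleNamespace


def list_group_paths(node, prefix=""):
    """Return the paths of all nodes below `node`, depth first."""
    paths = []
    for child_name, child_node in node.children.items():
        path = f"{prefix}/{child_name}"
        paths.append(path)
        # Recurse so that nested groups are listed right after their parent
        paths.extend(list_group_paths(child_node, path))
    return paths


def make_node(name, **children):
    # Stand-in exposing only the two attributes the recursion relies on
    return SimpleNamespace(name=name, children=children)


toy_tree = make_node(
    None,
    conditions=make_node("conditions", antenna_pattern=make_node("antenna_pattern")),
    measurements=make_node("measurements"),
)

print(list_group_paths(toy_tree))
```

Once the sample has been opened, `list_group_paths(s1_zarr_sample)` should behave the same way, since the function only needs `.children`. The `groups` property of `DataTree` offers a comparable built-in listing.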

The `xarray` function `open_datatree()` opens and decodes a `DataTree` from a file or file-like object (in this case, the stored `.zarr`), creating a tree node for each group within the file.

In [3]:
# Open the Zarr store with xarray as a DataTree
s1_zarr_sample = xr.open_datatree(
    'S01SIWSLC_20231201T170634_0027_A117_S27C_VH_IW1_249411.zarr',  # Substitute with the downloaded sample of your interest
    engine="zarr",  # storage format
    chunks={},  # load lazily, keeping the chunking defined in the store
)
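
If you are unsure which product name to substitute in the cell above, a short helper can list the candidate stores in the working directory. This is a minimal sketch, not part of the original notebook: `find_local_samples` is a hypothetical helper, and it assumes the sample was unpacked as a directory whose name starts with `S01SIWSLC_` and ends with `.zarr`.

```python
from pathlib import Path


def find_local_samples(directory=".", prefix="S01SIWSLC_"):
    """Return the sorted names of Zarr stores in `directory` that start with `prefix`."""
    return sorted(
        p.name
        for p in Path(directory).glob(f"{prefix}*.zarr")
        if p.is_dir()  # assumption: the store is an unpacked directory, not a single file
    )


candidates = find_local_samples()
if candidates:
    print("Stores found:", candidates)
else:
    print("No matching .zarr store in the current directory.")
```

Any name returned by the helper can then be passed as the first argument of `xr.open_datatree()`.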

The following output lists the names of the groups contained inside the attributes, conditions, measurements, and quality main `.zarr` groups.

In [4]:
print('Zarr Sentinel 1 SLC')
print_gen_structure(s1_zarr_sample.root) 
print("-" * 30)

To have a finer visualisation of the `zarr` element, `xarray` also allows us to access a representation of the entire data content within the `.zarr` object. This visualisation displays each group defined inside the `.zarr` file and its respective arrays, including detailed information such as general metadata, dimensions, chunking geometry, and chunk size.

In [5]:
# Print the detailed structure of the whole DataTree.
# Uncomment these lines if the full print() of the data set is of interest.
# print("Dataset Structure:")
# print(s1_zarr_sample)
# print("-" * 30)

If we want to extract specific information from a group, `xarray`'s labels allow us to retrieve it by group path. <br>
<br>
Let's say we want to visualise only the `elevation_angle` stored in this asset.<br>
According to the structure printed above, it is located inside the antenna information, in the `conditions/antenna_pattern` group. Indexing the tree with this path retrieves the group's contents. <br>
We can visualise it:

In [6]:
# Retrieving the satellite's relevant antenna conditions:
print(s1_zarr_sample['conditions/antenna_pattern'])

Note that to explore the groups and definitions inside the `zarr` interactively, we can leave out the `print()` statement. <br>
This enables the `xarray.DataTree` **drop-down** interface, which lets us interactively explore group-related metadata and information. <br>
We can visualise each contained `array` and its `dtype`.

In [7]:
# Retrieving the same group in an interactive xarray.DataTree:
s1_zarr_sample['conditions/antenna_pattern']

Inside this element, we can also visualise the main data for the Sentinel 1 mission: the SLC, stored inside the group `measurements`.
If we look further inside it, we will find the chunks containing the arrays with the complex-valued radar measurements.

In [8]:
# Retrieving the SLC data inside .zarr:
s1_zarr_sample['/measurements']

Additionally, through `s1_zarr_sample.attrs` we can access both the `stac_discovery` and `other_metadata` entries. <br>
<br>
For the properties inside `stac_discovery` for example:

In [9]:
# STAC metadata style:
s1_zarr_sample.attrs['stac_discovery']['properties']

And, inside `other_metadata`, the raw data analysis (selected here to keep the printout digestible):

In [10]:
# Complementing metadata:
s1_zarr_sample.attrs['other_metadata']['raw_data_analysis']
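
Direct indexing as in the two cells above raises a `KeyError` when a key is absent, which may happen with other products. A small, defensive way to walk the nested metadata is sketched below. It is not part of the original notebook: `get_nested` is a hypothetical helper, and `toy_attrs` is a placeholder standing in for `s1_zarr_sample.attrs`, with invented values.

```python
import json
from collections.abc import Mapping


def get_nested(mapping, *keys, default=None):
    """Walk nested mappings key by key, returning `default` if any key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


# Placeholder standing in for `s1_zarr_sample.attrs`; the values are invented
toy_attrs = {
    "stac_discovery": {"properties": {"platform": "example-platform"}},
    "other_metadata": {},
}

properties = get_nested(toy_attrs, "stac_discovery", "properties", default={})
print(json.dumps(properties, indent=2))  # indented JSON is easier to read

# A missing branch yields the default instead of raising KeyError
print(get_nested(toy_attrs, "other_metadata", "raw_data_analysis", default="not available"))
```

The same calls should work on the real attributes by passing `s1_zarr_sample.attrs` in place of `toy_attrs`, assuming the metadata is stored as plain nested dictionaries.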

## Conclusion
This tutorial provides an initial understanding of the `zarr` structure for a Sentinel 1 SLC radar sample. <br>
<br>
By using the `xarray` library, one can effectively navigate and inspect the different components within the `zarr` format, including its metadata and array organisation.<br>
This foundation will help in understanding the subsequent data analysis and processing workflows covered in our series.

For a deeper description of the metadata structure, follow the [metadata structure]() tutorial.