<a href="https://jupyterhub.user.eopf.eodc.eu/hub/login?next=%2Fhub%2Fspawn%3Fnext%3D%252Fhub%252Fuser-redirect%252Fgit-pull%253Frepo%253Dhttps%253A%252F%252Fgithub.com%252Feopf-toolkit%252Feopf-101%2526branch%253Dmain%2526urlpath%253Dlab%252Ftree%252Feopf-101%252F06_eopf_zarr_in_action%252F64_flood_mapping_valencia.ipynb%23fancy-forms-config=%7B%22profile%22%3A%22choose-your-environment%22%2C%22image%22%3A%22unlisted_choice%22%2C%22image%3Aunlisted_choice%22%3A%224zm3809f.c1.de1.container-registry.ovh.net%2Feopf-toolkit-python%2Feopf-toolkit-python%3Alatest%22%2C%22autoStart%22%3A%22true%22%7D" target="_blank">
  <button style="background-color:#0072ce; color:white; padding:0.6em 1.2em; font-size:1rem; border:none; border-radius:6px; margin-top:1em;">
    🚀 Launch this notebook in JupyterLab
  </button>
</a>

### Introduction

**Sentinel-1 GRD** data is particularly valuable for detecting water and flooded areas. Synthetic Aperture Radar (SAR) can capture images day and night and in any weather, which is especially important for flooding events, where cloudy and rainy conditions can persist for weeks. This makes it far more reliable than optical sensors during storms.

With its frequent revisits, wide coverage, and free high-resolution data, **Sentinel-1** enables the rapid mapping of flood extents, as will be demonstrated in this workflow. **VV** polarization is preferred for flood mapping due to its sensitivity to water surfaces, which typically appear darker in the images compared to land surfaces.

#### The Flooding Event

On October 29, 2024, the city of Valencia (Spain) was hit by catastrophic flooding caused by intense storms, which caused more than 230 deaths and billions in damages. This disaster was part of Europe's worst flood year in over a decade, with hundreds of thousands affected continent-wide. Such events highlight the urgent need for reliable flood monitoring to support **emergency response**, damage assessment and long-term resilience planning.

With respect to this event, we will demonstrate how to use **Sentinel-1 GRD** data to map flood extents. We will use 14 **Sentinel-1 GRD** images from the **IW** swath, covering the city and metropolitan area of Valencia from October 7, 2024 to March 24, 2025. This includes 2 images captured before, 1 immediately after the heavy rains, and 11 images taken after the flooding event, until the water levels got back to normal:
- October 7, 2024 (before)
- October 19, 2024 (before)
- October 31, 2024 (right after the event)
- November 12, 2024 (after)
- November 24, 2024 (after)
- December 6, 2024 (after)
- December 18, 2024 (after)
- December 30, 2024 (after)
- January 11, 2025 (after)
- January 23, 2025 (after)
- February 4, 2025 (after)
- February 16, 2025 (after)
- March 12, 2025 (after)
- March 24, 2025 (after)

#### What we will learn

- 🌊 How to create a workflow to map flood events.
- ⚒️ Basic SAR processing tools.
- 📊 How to create a data cube to perform time series analysis.

<hr>

#### Import libraries

In [None]:
import xarray as xr 
import xarray_sentinel 
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

import dask                             # these last two libraries are imported to open the datasets faster
from dask.distributed import Client     # and in the end take advantage of the optimized .zarr format


<hr>

## Data pre-processing

To search and load the data needed for the analysis, we will follow the processes we presented in [Sentinel-1 GRD structure tutorial](/02_about_eopf_zarr/22_zarr_structure_S1GRD.ipynb) and [S1 basic operations tutorial](/02_about_eopf_zarr/23_S1_basic_operations.ipynb).

Once we have defined the Sentinel-1 GRD items of interest, we can see that they contain both **VH** and **VV** polarizations.<br>
For flood mapping, **VV** polarization is the better choice, as open water stands out more clearly in VV than in VH.

### Loading the datatree

The list below shows the names of the products we will use for the flood mapping and time series analysis.<br>
As we have seen in previous chapters, these names already contain valuable information that can be used to search for specific products within the [EOPF STAC catalogue]().
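
As a small illustration of the information carried by a product name, the sketch below splits one of the names used in this notebook into its underscore-separated fields, following the standard Sentinel-1 naming convention. The helper `parse_s1_name` and its dictionary keys are our own, chosen only for this example.

```python
from datetime import datetime

def parse_s1_name(name):
    """Split a Sentinel-1 product name into its underscore-separated fields."""
    (mission, mode, product, level_class_pol,
     start, stop, orbit, datatake, unique_id) = name.split("_")
    return {
        "mission": mission,                   # e.g. S1A
        "mode": mode,                         # acquisition mode, e.g. IW
        "product_type": product[:3],          # e.g. GRD
        "resolution": product[3],             # e.g. H (high)
        "polarisation": level_class_pol[2:],  # e.g. DV (dual VV+VH)
        "start": datetime.strptime(start, "%Y%m%dT%H%M%S"),
        "stop": datetime.strptime(stop, "%Y%m%dT%H%M%S"),
        "absolute_orbit": int(orbit),
        "datatake_id": datatake,
        "unique_id": unique_id,
    }

info = parse_s1_name(
    "S1A_IW_GRDH_1SDV_20241007T180256_20241007T180321_056000_06D943_D46B"
)
print(info["start"].date(), info["mode"], info["polarisation"], info["absolute_orbit"])
```

The start date and the polarisation code are the fields most useful when filtering a catalogue search for a given event.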

In [None]:
scenes = ["S1A_IW_GRDH_1SDV_20241007T180256_20241007T180321_056000_06D943_D46B", 
          "S1A_IW_GRDH_1SDV_20241019T180256_20241019T180321_056175_06E02E_2D52", 
          "S1A_IW_GRDH_1SDV_20241031T180256_20241031T180321_056350_06E71E_479F", 
          "S1A_IW_GRDH_1SDV_20241112T180255_20241112T180320_056525_06EE16_DC29", 
          "S1A_IW_GRDH_1SDV_20241124T180254_20241124T180319_056700_06F516_BA27", 
          "S1A_IW_GRDH_1SDV_20241206T180253_20241206T180318_056875_06FBFD_25AD", 
          "S1A_IW_GRDH_1SDV_20241218T180252_20241218T180317_057050_0702F2_0BC2", 
          "S1A_IW_GRDH_1SDV_20241230T180251_20241230T180316_057225_0709DD_15AC", 
          "S1A_IW_GRDH_1SDV_20250111T180250_20250111T180315_057400_0710C7_ADBB", 
          "S1A_IW_GRDH_1SDV_20250123T180249_20250123T180314_057575_0717B9_A784", 
          "S1A_IW_GRDH_1SDV_20250204T180249_20250204T180314_057750_071EA2_4373", 
          "S1A_IW_GRDH_1SDV_20250216T180248_20250216T180313_057925_0725AE_8AC7", 
          "S1A_IW_GRDH_1SDV_20250312T180248_20250312T180313_058275_0733E6_4F5B", 
          "S1A_IW_GRDH_1SDV_20250324T180248_20250324T180313_058450_073AD0_04B7", 
          ]

zarr_paths = []
for scene in scenes:
    zarr_paths.append(f"https://objects.eodc.eu/e05ab01a9d56408d82ac32d69a5aae2a:notebook-data/tutorial_data/cpm_v260/{scene}.zarr")

Next, we load all `zarr` datasets as `xarray.DataTree` objects. At this point **we are not reading** the entire dataset from the store; instead, we create a set of references to the data, which lets us access it efficiently later in the analysis.

In [None]:
client = Client()  # Set up local cluster on your laptop
client

@dask.delayed
def load_datatree_delayed(path):
    return xr.open_datatree(path, consolidated=True, chunks="auto")

# Create delayed objects
delayed_datatrees = [load_datatree_delayed(path) for path in zarr_paths]
# Compute in parallel
datatrees = dask.compute(*delayed_datatrees)

Each element of `datatrees` is a datatree corresponding to one of the Sentinel-1 GRD scenes in the list above.

In [None]:
# Check the type of the first element of datatrees
type(datatrees[0]) 

### Defining variables

In [None]:
# Number of scenes we are working with for the time series analysis
DATASET_NUMBER = len(datatrees) 

If we display one of the datatrees, we can see how it is organized into groups and subgroups (as explained in this [section](./22_zarr_structure_S1GRD.ipynb)). From this structure, we take the constant `ID` numbers (positions in the `groups` tuple) used to open specific groups and variables, such as:
- Measurements group = 7: to open this group for the `VV` polarization of the first scene in our list, we use `datatrees[0][datatrees[0].groups[7]]`
- Calibration group = 33: to open this group for the `VV` polarization of the first scene in our list, we use `datatrees[0][datatrees[0].groups[33]]`

Over the course of this notebook these `IDs` will be used to call variables and compute some other functions.
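
Hard-coded positions only work while every product exposes its groups in the same order. As a hedged alternative, the sketch below looks a group up by its path instead. The helper `find_group_index` and the shortened group paths are hypothetical stand-ins for `datatrees[0].groups`, not the real group names.

```python
def find_group_index(groups, polarisation, suffix):
    """Return the index of the first group path for `polarisation` ending in `suffix`."""
    for index, path in enumerate(groups):
        if polarisation in path and path.endswith(suffix):
            return index
    raise KeyError(f"No group ending in {suffix!r} for {polarisation}")

# Hypothetical, shortened group paths standing in for `datatrees[0].groups`
example_groups = (
    "/",
    "/PRODUCT_VH",
    "/PRODUCT_VH/measurements",
    "/PRODUCT_VV",
    "/PRODUCT_VV/conditions",
    "/PRODUCT_VV/conditions/gcp",
    "/PRODUCT_VV/measurements",
    "/PRODUCT_VV/quality/calibration",
)

measurements_id = find_group_index(example_groups, "VV", "/measurements")
calibration_id = find_group_index(example_groups, "VV", "/calibration")
print(measurements_id, calibration_id)
```

A lookup like this fails loudly with a `KeyError` if a group is missing, rather than silently opening the wrong group.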

In [None]:
# Opening the measurements group from the datatree
datatrees[0][datatrees[0].groups[7]]

In [None]:
# Some other important constant ID numbers 
MEASUREMENTS_GROUP_ID = 7
GCP_GROUP_ID = 28
CALIBRATION_GROUP_ID = 33

We now define the threshold that will be used for the flood mapping analysis. This value is not fixed: it can be calibrated and adjusted to better fit different regions or flood events.<br>

In SAR imagery, open water surfaces typically appear very dark because they reflect the radar signal away from the sensor. This results in low backscatter values. In our case, pixels with a backscatter lower than approximately -15 dB are likely to correspond to water.

In [None]:
WATER_THRESHOLD_DB = -15
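
To make the threshold concrete, here is a minimal sketch with made-up linear backscatter values, ordered from dark to bright. It converts them to decibels with `10 * log10` and flags everything below the threshold as likely water. On the linear scale, -15 dB corresponds to roughly 0.032.

```python
import numpy as np

WATER_THRESHOLD_DB = -15

# Made-up linear backscatter values, from very dark to bright
sigma0_linear = np.array([0.005, 0.02, 0.1, 0.8])

# Convert linear power to decibels and apply the threshold
sigma0_db = 10 * np.log10(sigma0_linear)
water_mask = sigma0_db < WATER_THRESHOLD_DB

print(np.round(sigma0_db, 1))
print(water_mask)
```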

It is interesting to study the flood event over a specific point within the area of interest.<br>
Therefore, we are storing the coordinates of an anchor point inside the area which is not usually covered by water. After the heavy rain, it became flooded for a few weeks.

In [None]:
TARGET_LAT = 39.28
TARGET_LONG = -0.30

## Extracting information from the `.zarr`

As explained in the [S1 basic operations tutorial](23_S1_basic_operations.ipynb), we will perform the following operations on all the selected data:

- Slicing the data to meet our area of interest and decimate it
- Assigning latitude and longitude coordinates to the dataset
- Computing the backscatter

### Slicing and decimating GRD variable

To begin with, we collect the `measurements` group of every `.zarr` item in a list.

In [None]:
measurements = []
# Looping to populate the measurements list with only the measurements groups of each dataset on the datatree list
for i in range(DATASET_NUMBER):
    measurements.append(datatrees[i][datatrees[i].groups[MEASUREMENTS_GROUP_ID]].to_dataset())

We continue by slicing and decimating `grd`'s data for our area of interest around Valencia.

We'll use Ground Control Points (GCPs) to perform geographically-based slicing. This ensures that pixels from different products represent the same geographical areas, making them suitable for time series analysis.

First, let's define a bounding box around our target area in Valencia:

In [None]:
# Define bounding box around Valencia (expanded from target coordinates)
bbox = [
    TARGET_LONG - 0.15,  # min longitude
    TARGET_LAT - 0.10,   # min latitude  
    TARGET_LONG + 0.15,  # max longitude
    TARGET_LAT + 0.10    # max latitude
]

print(f"Bounding box: {bbox}")
print(f"Longitude range: {bbox[0]:.2f} to {bbox[2]:.2f}")
print(f"Latitude range: {bbox[1]:.2f} to {bbox[3]:.2f}")
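
The idea behind GCP-based slicing can be sketched on a synthetic GCP grid: mask the GCPs that fall inside the bounding box, then pad the resulting index range by a few GCPs so that the window fully covers the box. The grid values and the padding of 2 are assumptions for illustration only. This variant pads around the minimum and maximum index of each dimension, which differs slightly from the `build_slice` helper defined in this notebook.

```python
import numpy as np

# Synthetic 8 x 8 GCP grid (rows ~ azimuth, columns ~ ground range)
gcp_lat = np.repeat(np.linspace(40.0, 38.6, 8)[:, None], 8, axis=1)
gcp_lon = np.repeat(np.linspace(-1.2, 0.9, 8)[None, :], 8, axis=0)

bbox = [-0.45, 39.18, -0.15, 39.38]  # min_lon, min_lat, max_lon, max_lat

# GCPs that fall strictly inside the bounding box
inside = (
    (gcp_lon > bbox[0]) & (gcp_lon < bbox[2])
    & (gcp_lat > bbox[1]) & (gcp_lat < bbox[3])
)
rows, cols = np.where(inside)

# Pad the index range by `offset` GCPs per side, clamped to the grid
offset = 2
row_slice = slice(max(int(rows.min()) - offset, 0),
                  min(int(rows.max()) + offset, inside.shape[0] - 1) + 1)
col_slice = slice(max(int(cols.min()) - offset, 0),
                  min(int(cols.max()) + offset, inside.shape[1] - 1) + 1)

window_lat = gcp_lat[row_slice, col_slice]
print(rows, cols, window_lat.shape)
```

Because the GCP grid is coarse, only one point lands inside this small box, which is why the padding matters.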

In [None]:
# Plotting the first decimated GRD product from our list, corresponding to the whole scene
measurements[0].grd.isel(
        azimuth_time=slice(None, None, 20),
        ground_range=slice(None, None, 20)).plot(vmax=300)
plt.show()

print("Azimuth time has", measurements[0].grd.shape[0], "values.")
print("Ground range has", measurements[0].grd.shape[1], "values.")

Now we'll implement helper functions for GCP-based spatial slicing and geocoding:

In [None]:
def build_slice(shape, idx, offset=2):
    """
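    Builds one slice per dimension around a reference index, clamped within the shape bounds.

    The reference index is the smallest value in idx[0] (first dimension) and
    the largest value in idx[1] (second dimension).

    Parameters
    ----------
    shape : tuple of int
        The dimensions of the array (e.g., (height, width)).
    idx : tuple of array-like
        Row and column indices, e.g. the output of np.where on a 2D mask.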
    offset : int, optional
        The number of elements to include on each side of the index (default is 2).

    Returns
    -------
    list of slice
        A list containing two slice objects for each dimension.
    """
    i0 = int(min(idx[0]))
    i1 = int(max(idx[1]))

    def clamp_slice(i, dim_size):
        start = max(0, i - offset)
        end = min(dim_size - 1, i + offset)
        return slice(start, end + 1)

    return [clamp_slice(i0, shape[0]), clamp_slice(i1, shape[1])]


def create_regular_grid(min_x, max_x, min_y, max_y, spatialres):
    """
    Create a regular coordinate grid given bounding box limits.

    Parameters
    ----------
    min_x, max_x : float
        Minimum and maximum X coordinates (e.g., longitude or projected X).
    min_y, max_y : float
        Minimum and maximum Y coordinates (e.g., latitude or projected Y).
    spatialres : float
        Desired spatial resolution (in same units as x/y).

    Returns
    -------
    grid_x_regular : ndarray
        2D array of regularly spaced X coordinates.
    grid_y_regular : ndarray
        2D array of regularly spaced Y coordinates.
    """
    # Ensure positive dimensions and consistent spacing
    width = int(np.ceil((max_x - min_x) / spatialres))
    height = int(np.ceil((max_y - min_y) / spatialres))

    # Compute grid centers (half-pixel offset)
    half_pixel = spatialres / 2.0
    x_regular = np.linspace(
        min_x + half_pixel, max_x - half_pixel, width, dtype=np.float32
    )
    y_regular = np.linspace(
        min_y + half_pixel, max_y - half_pixel, height, dtype=np.float32
    )

    grid_x_regular, grid_y_regular = np.meshgrid(x_regular, y_regular)

    return grid_x_regular, grid_y_regular


def geocode_grd(sigma_0, grid_x_regular, grid_y_regular):
    """
    Geocode GRD data to a regular grid using nearest neighbor interpolation.
    """
    from scipy.interpolate import griddata
    
    grid_lat = sigma_0.latitude.values
    grid_lon = sigma_0.longitude.values

    # Set the border values to zero to avoid border artifacts with nearest interpolator
    sigma_0_copy = sigma_0.copy()
    sigma_0_copy.data[[0, -1], :] = 0
    sigma_0_copy.data[:, [0, -1]] = 0

    interpolated_values_grid = griddata(
        (grid_lon.flatten(), grid_lat.flatten()),
        sigma_0_copy.values.flatten(),
        (grid_x_regular, grid_y_regular),
        method="nearest",
    )

    ds = xr.Dataset(
        coords=dict(
            time=(["time"], [sigma_0.time.values]),
            y=(["y"], grid_y_regular[:, 0]),
            x=(["x"], grid_x_regular[0, :]),
        )
    )
    ds["grd"] = (("time", "y", "x"), np.expand_dims(interpolated_values_grid, 0))
    ds = ds.where(ds != 0)

    return ds
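
To show what nearest-neighbour geocoding does, the sketch below resamples four irregularly placed samples onto a 2 x 2 regular grid using NumPy only. It is a brute-force stand-in for `scipy.interpolate.griddata(..., method="nearest")` and is meant for tiny arrays, since it builds the full distance matrix. The function name and sample values are made up for this example.

```python
import numpy as np

def nearest_neighbour_regrid(src_x, src_y, src_values, grid_x, grid_y):
    """Assign to every target cell the value of the closest source sample."""
    src = np.column_stack([src_x.ravel(), src_y.ravel()])    # (n_src, 2)
    tgt = np.column_stack([grid_x.ravel(), grid_y.ravel()])  # (n_tgt, 2)
    # Squared distance between every target cell and every source sample
    d2 = ((tgt[:, None, :] - src[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return src_values.ravel()[nearest].reshape(grid_x.shape)

# Four irregularly placed samples and a 2 x 2 target grid
src_x = np.array([0.1, 0.9, 0.2, 0.8])
src_y = np.array([0.1, 0.2, 0.9, 0.8])
src_values = np.array([10.0, 20.0, 30.0, 40.0])
grid_x, grid_y = np.meshgrid([0.25, 0.75], [0.25, 0.75])

regridded = nearest_neighbour_regrid(src_x, src_y, src_values, grid_x, grid_y)
print(regridded)
```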

In [None]:
# Now perform GCP-based spatial slicing for each product
grd = []
gcps_list = []

for i in range(DATASET_NUMBER):
    print(f"Processing dataset {i+1}/{DATASET_NUMBER}...")
    
    # Access GRD and GCP data
    grd_group = measurements[i].grd
    
    # Get GCPs from the corresponding datatree
    # Find the measurements group name dynamically
    group_name = datatrees[i].groups[MEASUREMENTS_GROUP_ID]
    gcps = datatrees[i][group_name].conditions.gcp.to_dataset()[["latitude", "longitude"]]
    
    # Create mask based on bounding box
    mask = (
        (gcps.latitude < bbox[3])    # lat < max_lat
        & (gcps.latitude > bbox[1])  # lat > min_lat
        & (gcps.longitude < bbox[2]) # lon < max_lon
        & (gcps.longitude > bbox[0]) # lon > min_lon
    )
    
    # Find indices where mask is True
    idx = np.where(mask == 1)
    
    if len(idx[0]) == 0:
        print(f"Warning: No GCPs found within bounding box for dataset {i}")
        continue
    
    # Build slices around the found indices
    azimuth_time_slice, ground_range_slice = build_slice(mask.shape, idx)
    
    # Crop GCPs to the area of interest
    gcps_crop = gcps.isel(
        dict(azimuth_time=azimuth_time_slice, ground_range=ground_range_slice)
    )
    
    # Get min/max coordinates for final slicing
    azimuth_time_min = gcps_crop.azimuth_time.min().values
    azimuth_time_max = gcps_crop.azimuth_time.max().values
    ground_range_min = gcps_crop.ground_range.min().values
    ground_range_max = gcps_crop.ground_range.max().values
    
    # Crop and decimate GRD data
    grd_crop = grd_group.sel(
        azimuth_time=slice(azimuth_time_min, azimuth_time_max),
        ground_range=slice(ground_range_min, ground_range_max)
    ).isel(
        azimuth_time=slice(None, None, 10),  # Decimate by factor of 10
        ground_range=slice(None, None, 10)
    )
    
    # Interpolate GCPs to match decimated GRD data
    gcps_crop_interp = gcps_crop.interp_like(grd_crop)
    
    # Assign coordinates to GRD data
    grd_crop = grd_crop.assign_coords(
        {"latitude": gcps_crop_interp.latitude, "longitude": gcps_crop_interp.longitude}
    )
    
    # Apply final mask to ensure we only keep data within our AOI
    final_mask = (
        (gcps_crop_interp.latitude < bbox[3])
        & (gcps_crop_interp.latitude > bbox[1])
        & (gcps_crop_interp.longitude < bbox[2])
        & (gcps_crop_interp.longitude > bbox[0])
    )
    grd_crop = grd_crop.where(final_mask.compute(), drop=True)
    
    grd.append(grd_crop)
    gcps_list.append(gcps_crop_interp)
    
    print(f"Dataset {i+1} cropped to shape: {grd_crop.shape}")

print(f"\nProcessed {len(grd)} datasets successfully.")

In [None]:
# Check shape of processed data
if len(grd) > 1:
    print(f"GRD data shape: {grd[1].shape}")
    
    # Plotting the second sliced and decimated GRD product from our list with coordinates
    grd[1].plot(x="longitude", y="latitude", vmax=300)
    plt.title("GCP-based sliced GRD product for Valencia AOI")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    
    # Add target point
    plt.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target Point")
    plt.legend()
    plt.show()
else:
    print("No data to display")

### Creating Regular Grid for Time Series Analysis

Now that we have geographically aligned data, let's create a regular grid to enable proper time series analysis. This step geocodes all products to the same regular grid, ensuring that each pixel represents the same geographical area across all time steps.

First, let's determine the common geographic bounds across all datasets and create a regular grid:

In [None]:
# Extract min/max values for longitude and latitude to build a common regular grid
if len(grd) > 0:
    # Collect all latitude and longitude values from all datasets
    all_lats = []
    all_lons = []
    
    for i in range(len(grd)):
        if 'latitude' in grd[i].coords and 'longitude' in grd[i].coords:
            lats = grd[i].latitude.values
            lons = grd[i].longitude.values
            
            # Remove NaN values
            valid_mask = ~(np.isnan(lats) | np.isnan(lons))
            all_lats.extend(lats[valid_mask].flatten())
            all_lons.extend(lons[valid_mask].flatten())
    
    # Calculate bounds
    min_lon = np.min(all_lons)
    max_lon = np.max(all_lons)
    min_lat = np.min(all_lats)
    max_lat = np.max(all_lats)
    
    print(f"Geographic bounds:")
    print(f"Longitude: {min_lon:.4f} to {max_lon:.4f}")
    print(f"Latitude: {min_lat:.4f} to {max_lat:.4f}")
    
    # Create regular grid with 0.001 degree resolution (~110m)
    spatial_resolution = 0.001
    grid_x_regular, grid_y_regular = create_regular_grid(
        min_lon, max_lon, min_lat, max_lat, spatial_resolution
    )
    
    print(f"Regular grid shape: {grid_x_regular.shape}")
else:
    print("No data available for grid creation")
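
The pixel-centre logic of the regular grid is easier to see in one dimension. The sketch below uses made-up extents: when the extent is an exact multiple of the resolution, the centres are spaced by exactly that resolution. Otherwise the cell count is rounded up and the actual spacing becomes somewhat smaller than requested.

```python
import numpy as np

min_x, resolution = 0.0, 0.5

# Extent is an exact multiple of the resolution: 4 cells, spacing 0.5
max_x = 2.0
n_cells = int(np.ceil((max_x - min_x) / resolution))
centres = np.linspace(min_x + resolution / 2, max_x - resolution / 2, n_cells)

# Extent is not a multiple: the cell count is rounded up to 5
max_x_uneven = 2.1
n_uneven = int(np.ceil((max_x_uneven - min_x) / resolution))
centres_uneven = np.linspace(
    min_x + resolution / 2, max_x_uneven - resolution / 2, n_uneven
)
spacing_uneven = centres_uneven[1] - centres_uneven[0]

print(n_cells, centres)
print(n_uneven, round(float(spacing_uneven), 3))
```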

In [None]:
# Geocode all GRD datasets to the common regular grid
grd_geocoded = []

if len(grd) > 0 and 'grid_x_regular' in locals():
    for i in range(len(grd)):
        print(f"Geocoding dataset {i+1}/{len(grd)}...")
        
        # Add time coordinate if not present
        if 'time' not in grd[i].coords:
            # Extract time from the original datatree
            group_name = datatrees[i].groups[MEASUREMENTS_GROUP_ID]
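            # If no start_time attribute is found, fall back to the acquisition
            # date encoded in the product name (e.g. 20241007)
            time_val = datatrees[i][group_name].attrs.get('start_time', 
                      datatrees[i].attrs.get('start_time', scenes[i].split('_')[4][:8]))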
            grd[i] = grd[i].assign_coords(time=pd.to_datetime(time_val))
        
        # Geocode to regular grid
        geocoded_ds = geocode_grd(grd[i], grid_x_regular, grid_y_regular)
        grd_geocoded.append(geocoded_ds)
        
        print(f"Dataset {i+1} geocoded to regular grid shape: {geocoded_ds.grd.shape}")

    print(f"\nGeocoded {len(grd_geocoded)} datasets to regular grid.")
else:
    print("Skipping geocoding - no data or grid available")
    grd_geocoded = []

In [None]:
# Plot comparison between original and geocoded data
if len(grd) > 1 and len(grd_geocoded) > 1:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    # Original data (irregular grid)
    grd[1].plot(x="longitude", y="latitude", vmax=300, ax=ax1)
    ax1.set_title("Original GRD (irregular grid)")
    ax1.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    ax1.legend()
    
    # Geocoded data (regular grid)
    grd_geocoded[1].grd.squeeze().plot(x="x", y="y", vmax=300, ax=ax2)
    ax2.set_title("Geocoded GRD (regular grid)")
    ax2.set_xlabel("Longitude")
    ax2.set_ylabel("Latitude")
    ax2.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    ax2.legend()
    
    plt.tight_layout()
    plt.show()
    
    # Print information about the geocoded grid
    print(f"Geocoded data info:")
    print(f"Shape: {grd_geocoded[1].grd.shape}")
    print(f"X (longitude) range: {grd_geocoded[1].x.min().values:.4f} to {grd_geocoded[1].x.max().values:.4f}")
    print(f"Y (latitude) range: {grd_geocoded[1].y.min().values:.4f} to {grd_geocoded[1].y.max().values:.4f}")
elif len(grd) > 0:
    # Fallback to just showing original data
    grd[0].plot(x="longitude", y="latitude", vmax=300)
    plt.title("GRD product with latitude and longitude coordinates")
    plt.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    plt.legend()
    plt.show()
else:
    print("No data to display")

### Computing backscatter

Next, we compute the backscatter intensity for both the original sliced data and the geocoded data.

For the original sliced data, we access the calibration values and interpolate them to match our decimated GRD data before calibrating. The geocoded data is aligned on a regular grid, which makes it suitable for time series analysis and for comparison across acquisition dates.
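
As a reminder of what the calibration step computes, Sentinel-1 radiometric calibration divides the squared digital number by the squared calibration look-up value, and the result is then expressed in decibels. The sketch below reproduces this with NumPy on made-up numbers. To our understanding it mirrors what `xarray_sentinel.calibrate_intensity` does with `as_db=True`, but it is not that function.

```python
import numpy as np

def calibrate_to_db(digital_number, calibration_lut):
    """Radiometric calibration: intensity = |DN|^2 / LUT^2, expressed in dB."""
    # Cast to float before squaring to avoid integer overflow
    dn = np.abs(digital_number).astype(np.float64)
    intensity = dn ** 2 / calibration_lut ** 2
    return 10 * np.log10(intensity)

# Made-up digital numbers and a constant calibration look-up value
dn = np.array([100, 50, 5], dtype=np.uint16)
lut = np.full(3, 500.0)

backscatter_db = calibrate_to_db(dn, lut)
print(np.round(backscatter_db, 2))
```

Taking `10 * log10` of the digital numbers alone, without the look-up table, would give values on a different scale that cannot be compared with a dB threshold.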

In [None]:
# Compute backscatter for original sliced data (if available)
intensity = []
calibration = []

if len(grd) > 0:
    # Get calibration data for each dataset and interpolate to match decimated GRD data
    for i in range(len(grd)):
        print(f"Processing calibration for dataset {i+1}/{len(grd)}...")
        
        # Get calibration data
        cal_data = datatrees[i][datatrees[i].groups[CALIBRATION_GROUP_ID]].to_dataset()
        
        # Interpolate calibration to match the decimated GRD data
        cal_interp = cal_data.interp_like(grd[i])
        calibration.append(cal_interp)
        
        # Compute backscatter intensity
        intensity_data = xarray_sentinel.calibrate_intensity(
            grd[i], 
            cal_interp.beta_nought, 
            as_db=True
        )
        intensity.append(intensity_data)
        
        print(f"Computed intensity for dataset {i+1}, shape: {intensity_data.shape}")
else:
    print("No GRD data available for backscatter computation")

In [None]:
# Compute backscatter for geocoded data
intensity_geocoded = []

if len(grd_geocoded) > 0:
    print("\nComputing backscatter for geocoded data...")
    
    for i in range(len(grd_geocoded)):
        # Geocode the calibrated backscatter (dB) computed on the original grid.
        # Converting the geocoded digital numbers with 10*log10 alone would skip
        # the radiometric calibration and would not be comparable to the dB threshold.
        sigma0_db = intensity[i].assign_coords(time=grd[i].time)
        
        # geocode_grd stores its result in the variable "grd"; rename it
        intensity_ds = geocode_grd(sigma0_db, grid_x_regular, grid_y_regular)
        intensity_ds = intensity_ds.rename({'grd': 'intensity'})
        
        intensity_geocoded.append(intensity_ds)
        
        print(f"Computed geocoded intensity for dataset {i+1}")
    
    print(f"Processed {len(intensity_geocoded)} geocoded intensity datasets.")
else:
    print("No geocoded data available for intensity computation")

In [None]:
# Plot comparison of backscatter intensity results
if len(intensity) > 1 and len(intensity_geocoded) > 1:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    # Original backscatter (irregular grid)
    intensity[1].plot(x="longitude", y="latitude", vmin=-25, vmax=5, ax=ax1)
    ax1.set_title("Backscatter Intensity (irregular grid)")
    ax1.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    ax1.legend()
    
    # Geocoded backscatter (regular grid)
    intensity_geocoded[1].intensity.squeeze().plot(x="x", y="y", vmin=-25, vmax=5, ax=ax2)
    ax2.set_title("Backscatter Intensity (regular grid)")
    ax2.set_xlabel("Longitude")
    ax2.set_ylabel("Latitude")
    ax2.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    ax2.legend()
    
    plt.tight_layout()
    plt.show()
elif len(intensity) > 1:
    # Fallback to original data only
    intensity[1].plot(x="longitude", y="latitude", vmin=-25, vmax=5)
    plt.title("Computed backscatter intensity")
    plt.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    plt.legend()
    plt.show()
else:
    print("Insufficient data for plotting backscatter intensity")

## Creating the data cube

We now have the data on both the irregular and the regular grid. The regular grid (geocoded) data is particularly valuable for time series analysis because all pixels are aligned and represent the same geographical areas across acquisition dates. We can therefore build data cubes from both versions, with the geocoded cube providing properly aligned pixels for accurate time series analysis.

First, let's extract acquisition dates for the time dimension:

In [None]:
# Extract acquisition dates from the datatrees
acquisition_dates = []

for i in range(min(len(datatrees), len(intensity))):
    # Try to get the start time from datatree attributes
    group_name = datatrees[i].groups[MEASUREMENTS_GROUP_ID] if i < len(datatrees) else None
    
    if group_name and hasattr(datatrees[i][group_name], 'attrs'):
        start_time = datatrees[i][group_name].attrs.get('start_time')
        if start_time:
            acquisition_dates.append(pd.to_datetime(start_time).date())
        else:
            # Fallback to azimuth_time if available
            if len(intensity) > i and 'azimuth_time' in intensity[i].coords:
                date_val = intensity[i].azimuth_time.values[1].astype('datetime64[D]')
                acquisition_dates.append(pd.to_datetime(date_val).date())
            else:
                # Fall back to the start date encoded in the product name
                acquisition_dates.append(pd.to_datetime(scenes[i].split('_')[4][:8]).date())
    else:
        # Fall back to the start date encoded in the product name
        acquisition_dates.append(pd.to_datetime(scenes[i].split('_')[4][:8]).date())

print(f"Acquisition dates: {acquisition_dates}")

### Creating Data Cubes

We'll create two different data cubes:

1. **Regular Grid Data Cube (Geocoded)**: This uses our geocoded data where all pixels are properly aligned on the same regular geographic grid. Each pixel represents the same area on the ground across all time steps - this is the **recommended approach** for time series analysis.

2. **Irregular Grid Data Cube (Coordinate Aligned)**: This uses the original coordinate alignment method for comparison, but note that pixels may not represent exactly the same ground locations across dates.
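
Conceptually, a data cube is a stack of co-registered 2D maps along a time axis. The NumPy sketch below uses made-up 2 x 2 backscatter maps to show the stacking and how a single pixel's time series is read from the cube. This only works meaningfully when all maps share the same grid.

```python
import numpy as np

dates = np.array(["2024-10-19", "2024-10-31", "2024-11-12"], dtype="datetime64[D]")

# Made-up 2 x 2 backscatter maps in dB, one per date, all on the same grid
maps_db = [
    np.array([[-8.0, -9.0], [-10.0, -7.0]]),
    np.array([[-8.5, -19.0], [-21.0, -7.5]]),
    np.array([[-8.2, -16.0], [-11.0, -7.2]]),
]

cube = np.stack(maps_db, axis=0)  # shape (time, y, x)
pixel_series = cube[:, 1, 0]      # time series of the pixel at row 1, column 0
darkest_date = dates[pixel_series.argmin()]

print(cube.shape, pixel_series, darkest_date)
```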

#### Regular Grid Data Cube (Recommended)

In [None]:
# Create regular grid data cube from geocoded intensity data
if len(intensity_geocoded) > 0:
    print("Creating regular grid data cube...")
    
    # Prepare geocoded intensity datasets for stacking
    geocoded_datasets = []
    valid_dates = []
    
    for i, ds in enumerate(intensity_geocoded):
        if i < len(acquisition_dates):
            # Add time coordinate
            ds_with_time = ds.assign_coords(time=pd.to_datetime(acquisition_dates[i]))
            geocoded_datasets.append(ds_with_time)
            valid_dates.append(acquisition_dates[i])
    
    # Stack into a data cube along time dimension
    if len(geocoded_datasets) > 0:
        intensity_datacube_regular = xr.concat(geocoded_datasets, dim='time')
        print(f"Regular grid data cube shape: {intensity_datacube_regular.intensity.shape}")
        print(f"Coordinates: {list(intensity_datacube_regular.coords.keys())}")
    else:
        intensity_datacube_regular = None
        print("No geocoded datasets available for regular grid data cube")
else:
    intensity_datacube_regular = None
    print("No geocoded data available for regular grid data cube")

#### Irregular Grid Data Cube (For Comparison)

In [None]:
# Create irregular grid data cube using coordinate alignment (original method)
if len(intensity) > 0:
    print("\nCreating irregular grid data cube for comparison...")
    
    # Use coordinate alignment approach
    reference_coords = intensity[0].coords
    datasets_aligned = []
    
    for ds in intensity:
        ds_no_coords = ds.reset_coords(drop=True)
        datasets_aligned.append(ds_no_coords.assign_coords(reference_coords))
    
    # Create time data array
    time_data = [pd.to_datetime(date) for date in acquisition_dates[:len(datasets_aligned)]]
    
    # Stack into data cube
    intensity_data_cube = xr.concat(datasets_aligned, dim=xr.DataArray(time_data, dims="time"))
    
    print(f"Irregular grid data cube shape: {intensity_data_cube.shape}")
    print("⚠️  Note: Pixels may not represent the same ground locations across dates")
else:
    intensity_data_cube = None
    print("No intensity data available for irregular grid data cube")

# Display the data cubes
if intensity_datacube_regular is not None:
    print("\n=== Regular Grid Data Cube (Recommended) ===")
    display(intensity_datacube_regular)

if intensity_data_cube is not None:
    print("\n=== Irregular Grid Data Cube (For Comparison) ===")
    display(intensity_data_cube)

## Flood mapping and time series analysis

The last step is to perform the time series and flood mapping analysis.

### Visualization of Time Series Data

Now we can visualize our time series data. We'll compare both the regular grid (geocoded) and irregular grid data cubes to demonstrate the difference.

The regular grid data ensures that each pixel represents the same geographical area across all time steps, making it much more suitable for accurate time series analysis and flood detection.

#### Regular Grid Time Series (Recommended)

Water appears as darker pixels in SAR imagery, typically with backscatter values lower than **-15 dB**, so flooded areas should show up as dark patches in the panels.

The regular grid provides a more reliable basis for flood mapping because:
- Each pixel represents the same geographical area across all dates
- Changes in backscatter values accurately reflect changes in surface conditions
- Time series analysis is geographically meaningful

In [None]:
# Plot regular grid (geocoded) time series
if intensity_datacube_regular is not None:
    n_times = len(intensity_datacube_regular.time)
    cols = 4
    rows = int(np.ceil(n_times / cols))
    
    fig, axes = plt.subplots(rows, cols, figsize=(4*cols, 4*rows))
    if rows == 1 and cols == 1:
        axes = [axes]
    else:
        axes = axes.flatten()
    
    print(f"Plotting {n_times} time steps from regular grid data cube")
    
    for i in range(n_times):
        ax = axes[i]
        
        # Plot the geocoded intensity data
        intensity_datacube_regular.intensity.isel(time=i).plot(
            x="x", y="y",
            vmin=-25, vmax=5,
            ax=ax,
            add_colorbar=False
        )
        
        ax.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=20, label="Target Point")
        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        
        # Add date to title
        time_val = intensity_datacube_regular.time.values[i]
        ax.set_title(f"Regular Grid - {pd.to_datetime(time_val).strftime('%Y-%m-%d')}")
        ax.legend()
    
    # Hide unused subplots
    for j in range(n_times, len(axes)):
        axes[j].axis('off')
    
    plt.suptitle("Geocoded Time Series (Regular Grid) - Each pixel represents the same ground area", fontsize=14)
    plt.tight_layout()
    plt.show()
else:
    print("No regular grid data available for visualization")

#### Irregular Grid Time Series (For Comparison)

Below is the original approach for comparison. Note that pixels may not represent exactly the same ground locations across different dates.

In [None]:
# Plot irregular grid (coordinate aligned) time series for comparison
if intensity_data_cube is not None:
    n_times = len(intensity_data_cube.time)
    cols = 4
    rows = int(np.ceil(n_times / cols))
    
    fig, axes = plt.subplots(rows, cols, figsize=(4*cols, 4*rows))
    if rows == 1 and cols == 1:
        axes = [axes]
    else:
        axes = axes.flatten()
    
    for i in range(n_times):
        ax = axes[i]
        intensity_data_cube[i].plot(
            x="longitude", y="latitude",
            vmin=-25, vmax=5,
            ax=ax,
            add_colorbar=False
        )
        ax.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=20, label="Target Point")
        
        # Add date to title
        time_val = intensity_data_cube.time.values[i]
        ax.set_title(f"Irregular Grid - {pd.to_datetime(time_val).strftime('%Y-%m-%d')}")
        ax.legend()
    
    # Hide unused subplots
    for j in range(n_times, len(axes)):
        axes[j].axis('off')
    
    plt.suptitle("⚠️ Coordinate Aligned Time Series (Irregular Grid) - Pixels may not align geographically", fontsize=14)
    plt.tight_layout()
    plt.show()
else:
    print("No irregular grid data available for visualization")

### Flood Detection Using Water Threshold

Water detection in SAR imagery is based on the principle that water surfaces appear very dark due to low backscatter. According to [literature](https://www.researchgate.net/figure/VV-and-VH-threshold-statistics-1-obtained-via-graphical-interpretation-and-2_tbl4_360412209) and [other sources](https://mbonnema.github.io/GoogleEarthEngine/07-SAR-Water-Classification/), water typically has backscatter values lower than **-15 dB**.

We'll demonstrate flood detection using this threshold on both data cubes to show the advantages of the geocoded approach.
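
As a minimal, self-contained sketch of the thresholding idea, the snippet below works on a handful of synthetic values rather than the Valencia data. It converts linear backscatter to decibels with `10 * log10` and flags pixels at or below the threshold. The array `sigma0_linear` and the helper `water_fraction` are hypothetical names introduced only for this illustration, and the conversion step assumes the input is linear power (the notebook's own cubes are treated as already being in dB).

```python
import numpy as np

WATER_THRESHOLD_DB = -15  # same threshold value as used in this section

# Hypothetical linear backscatter values (not real Sentinel-1 data);
# NaN marks a pixel without valid data.
sigma0_linear = np.array([
    [0.200, 0.010, 0.005],
    [0.100, 0.020, np.nan],
])

# Convert linear power to decibels: dB = 10 * log10(linear)
sigma0_db = 10 * np.log10(sigma0_linear)

# Water mask: True where backscatter is at or below the threshold.
# Comparisons with NaN evaluate to False, so invalid pixels are never flagged.
water_mask = sigma0_db <= WATER_THRESHOLD_DB


def water_fraction(mask, values_db):
    """Share of valid (non-NaN) pixels flagged as water, in percent."""
    valid = np.isfinite(values_db)
    return 100.0 * mask[valid].sum() / valid.sum()


print(np.round(sigma0_db, 1))
print(water_mask)
print(f"{water_fraction(water_mask, sigma0_db):.1f}% water coverage")
```

Dividing by the number of valid pixels, rather than by the full array size, keeps no-data areas from diluting the water percentage.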

In [None]:
# Define water threshold
WATER_THRESHOLD_DB = -15

# Create flood maps using regular grid (geocoded) data
if intensity_datacube_regular is not None:
    print("Creating flood maps from regular grid (geocoded) data...")
    
    n_times = len(intensity_datacube_regular.time)
    cols = 4
    rows = int(np.ceil(n_times / cols))
    
    fig, axes = plt.subplots(rows, cols, figsize=(4*cols, 4*rows))
    if rows == 1 and cols == 1:
        axes = [axes]
    else:
        axes = axes.flatten()
    
    for i in range(n_times):
        ax = axes[i]
        
        # Create water mask (True where intensity <= threshold)
        water_mask = (intensity_datacube_regular.intensity.isel(time=i) <= WATER_THRESHOLD_DB)
        
        # Plot the water mask
        water_mask.plot(
            x="x", y="y",
            ax=ax,
            add_colorbar=False,
            cmap='RdYlBu'  # Water (True) in blue, land (False) in red
        )
        
        ax.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target Point")
        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        
        # Add date to title
        time_val = intensity_datacube_regular.time.values[i]
        ax.set_title(f"Regular Grid Flood Map - {pd.to_datetime(time_val).strftime('%Y-%m-%d')}")
        ax.legend()
    
    # Hide unused subplots
    for j in range(n_times, len(axes)):
        axes[j].axis('off')
    
    plt.suptitle(f"Geocoded Flood Maps (Threshold: {WATER_THRESHOLD_DB} dB) - Blue = Water, Yellow/Red = Land", fontsize=14)
    plt.tight_layout()
    plt.show()
    
    # Calculate flood statistics
    print("\nFlood extent statistics (Regular Grid):")
    for i in range(n_times):
        water_pixels = (intensity_datacube_regular.intensity.isel(time=i) <= WATER_THRESHOLD_DB).sum().values
        total_pixels = intensity_datacube_regular.intensity.isel(time=i).count().values
        flood_percentage = (water_pixels / total_pixels) * 100
        time_val = pd.to_datetime(intensity_datacube_regular.time.values[i])
        print(f"  {time_val.strftime('%Y-%m-%d')}: {flood_percentage:.1f}% water coverage ({water_pixels} of {total_pixels} pixels)")
        
else:
    print("No regular grid data available for flood mapping")

#### Change Detection with Regular Grid (Geocoded) Data

In [None]:
# Perform change detection using regular grid (geocoded) data
if intensity_datacube_regular is not None and len(intensity_datacube_regular.time) >= 2:
    print("Performing change detection with geocoded data...")
    
    # Select before and after images (adjust indices as needed based on your data)
    before_idx = 0  # First image (before flood)
    after_idx = 1   # Second image (after flood)
    
    if len(intensity_datacube_regular.time) > 2:
        after_idx = len(intensity_datacube_regular.time) // 2  # Middle image
    
    before_img = intensity_datacube_regular.intensity.isel(time=before_idx)
    after_img = intensity_datacube_regular.intensity.isel(time=after_idx)
    
    # Calculate difference (after - before)
    change_map = after_img - before_img
    
    # Create comparison plot
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    
    # Before image
    before_img.plot(x="x", y="y", ax=axes[0,0], vmin=-25, vmax=5, cmap='viridis')
    axes[0,0].set_title(f"Before Flood - {pd.to_datetime(intensity_datacube_regular.time.values[before_idx]).strftime('%Y-%m-%d')}")
    axes[0,0].scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    axes[0,0].legend()
    
    # After image
    after_img.plot(x="x", y="y", ax=axes[0,1], vmin=-25, vmax=5, cmap='viridis')
    axes[0,1].set_title(f"After Flood - {pd.to_datetime(intensity_datacube_regular.time.values[after_idx]).strftime('%Y-%m-%d')}")
    axes[0,1].scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    axes[0,1].legend()
    
    # Change detection map
    change_map.plot(x="x", y="y", ax=axes[1,0], cmap='RdBu_r', vmin=-15, vmax=15)
    axes[1,0].set_title("Change Detection (After - Before)")
    axes[1,0].scatter(TARGET_LONG, TARGET_LAT, color="yellow", marker="o", s=50, label="Target")
    axes[1,0].legend()
    
    # Flood areas (areas with significant decrease in backscatter)
    flood_areas = change_map < -10  # Areas that became much darker (flooded)
    flood_areas.plot(x="x", y="y", ax=axes[1,1], cmap='Blues')
    axes[1,1].set_title("Detected Flood Areas (Change < -10 dB)")
    axes[1,1].scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    axes[1,1].legend()
    
    plt.suptitle("Flood Change Detection - Regular Grid (Geocoded) Data", fontsize=16)
    plt.tight_layout()
    plt.show()
    
    # Statistics
    total_pixels = change_map.count().values
    flooded_pixels = (change_map < -10).sum().values
    flood_percentage = (flooded_pixels / total_pixels) * 100
    
    print(f"\nFlood Change Detection Results:")
    print(f"Total pixels analyzed: {total_pixels}")
    print(f"Pixels showing flood signature (>10dB decrease): {flooded_pixels}")
    print(f"Estimated flood coverage: {flood_percentage:.1f}%")
    
else:
    print("Insufficient regular grid data for change detection")

#### Irregular Grid Flood Maps (For Comparison)

In [None]:
# Create flood maps using irregular grid (coordinate aligned) data for comparison
if intensity_data_cube is not None:
    print("\nCreating flood maps from irregular grid (coordinate aligned) data...")
    
    n_times = len(intensity_data_cube.time)
    cols = 4
    rows = int(np.ceil(n_times / cols))
    
    fig, axes = plt.subplots(rows, cols, figsize=(4*cols, 4*rows))
    if rows == 1 and cols == 1:
        axes = [axes]
    else:
        axes = axes.flatten()
    
    for i in range(n_times):
        ax = axes[i]
        
        # Create water mask
        water_mask = (intensity_data_cube[i] <= WATER_THRESHOLD_DB)
        
        water_mask.plot(
            x="longitude", y="latitude",
            ax=ax,
            add_colorbar=False,
            cmap='RdYlBu'  # Water (True) in blue, land (False) in red
        )
        
        ax.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=20, label="Target Point")
        
        # Add date to title
        time_val = intensity_data_cube.time.values[i]
        ax.set_title(f"Irregular Grid Flood Map - {pd.to_datetime(time_val).strftime('%Y-%m-%d')}")
        ax.legend()
    
    # Hide unused subplots
    for j in range(n_times, len(axes)):
        axes[j].axis('off')
    
    plt.suptitle(f"⚠️ Coordinate Aligned Flood Maps (Threshold: {WATER_THRESHOLD_DB} dB) - Pixels may not align geographically", fontsize=14)
    plt.tight_layout()
    plt.show()
else:
    print("No irregular grid data available for flood mapping")

### Flood Change Detection

Now we'll demonstrate change detection by comparing images from before and after the flood event. This analysis is much more accurate with geocoded data because we can be confident that we're comparing the exact same geographical locations across time.
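
The differencing idea can be sketched on two small synthetic images, assuming both are already in dB and co-located pixel by pixel. The arrays `before_db` and `after_db` are made-up values for illustration, and the -10 dB cut-off mirrors the one used in the change detection cells of this notebook.

```python
import numpy as np

CHANGE_THRESHOLD_DB = -10  # same cut-off as used in the change detection cells

# Hypothetical co-located backscatter images in dB (2 x 3 pixels)
before_db = np.array([[-8.0, -9.0, -18.0],
                      [-7.0, -6.0, np.nan]])
after_db = np.array([[-20.0, -9.5, -19.0],
                     [-18.5, -6.0, -21.0]])

# Change map (after - before): negative values mean the surface became darker
change_db = after_db - before_db

# Flag pixels whose backscatter dropped by more than 10 dB.
# NaN differences compare as False and are therefore left unflagged.
flooded = change_db < CHANGE_THRESHOLD_DB

# Report the flagged share relative to pixels valid in both images
valid = np.isfinite(change_db)
flooded_share = 100.0 * flooded[valid].sum() / valid.sum()

print(change_db)
print(flooded)
print(f"{flooded_share:.1f}% of valid pixels show a >10 dB decrease")
```

In this toy case the top-right pixel is dark in both images (about -18 dB and -19 dB), so it is not flagged: differencing highlights newly darkened surfaces, whereas a fixed threshold would also mark water that was already there.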

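With properly geocoded data, we can perform accurate change detection by comparing the same geographical areas before and after the flood event. This provides reliable flood mapping because each pixel represents the exact same ground location across different acquisition dates.

#### Change Detection with Irregular Grid Data (For Comparison)

For comparison, the cell below applies the same differencing to the coordinate-aligned (irregular grid) cube, where pixels may not represent exactly the same ground locations across dates.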

In [None]:
# Perform change detection using irregular grid data for comparison
if intensity_data_cube is not None and len(intensity_data_cube.time) >= 2:
    print("\nPerforming change detection with irregular grid data...")
    
    # Calculate difference between second and third datasets
    dif = (intensity_data_cube[1] - intensity_data_cube[2])
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    # Change detection map
    dif.plot(x="longitude", y="latitude", vmin=-10, vmax=20, ax=ax1, cmap='RdBu_r')
    ax1.set_title("⚠️ Irregular Grid Change Detection")
    ax1.scatter(TARGET_LONG, TARGET_LAT, color="yellow", marker="o", s=50, label="Target")
    ax1.legend()
    
    # Flood areas: dif is (second - third dataset), so a drop in backscatter
    # after the flood shows up as a large positive value
    flood_areas_irregular = dif > 10
    flood_areas_irregular.plot(x="longitude", y="latitude", ax=ax2, cmap='Blues')
    ax2.set_title("Detected Flood Areas (Decrease > 10 dB)")
    ax2.scatter(TARGET_LONG, TARGET_LAT, color="red", marker="o", s=50, label="Target")
    ax2.legend()
    
    plt.suptitle("⚠️ Change Detection - Irregular Grid (Coordinate Aligned)", fontsize=14)
    plt.tight_layout()
    plt.show()
else:
    print("Insufficient irregular grid data for change detection")

### Time Series Analysis at Target Location

Now we'll demonstrate time series analysis by extracting backscatter values at our target location. This comparison will highlight the advantages of using geocoded data for temporal analysis.

#### Time Series from Regular Grid (Geocoded) Data - Recommended Approach

With geocoded data, we can extract time series values at precise geographic coordinates. This provides accurate temporal analysis because each measurement corresponds to exactly the same ground location.
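
As a small sketch of this kind of lookup, the snippet below builds a synthetic cube with `time`, `y` and `x` dimensions and pulls out the series at the grid cell nearest to a point using xarray's label-based `sel(..., method="nearest")`. The cube contents, dates and target coordinates are invented for illustration and are not the notebook's `intensity_datacube_regular`, `TARGET_LONG` or `TARGET_LAT`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical geocoded cube: 3 dates on a small lon/lat grid (values in dB)
times = pd.to_datetime(["2024-10-19", "2024-10-31", "2024-11-12"])
x = np.array([-0.40, -0.39, -0.38])   # longitude
y = np.array([39.28, 39.29, 39.30])   # latitude
values = np.full((3, 3, 3), -8.0)
values[1, 1, 2] = -19.0               # one pixel turns dark on the second date

cube = xr.DataArray(
    values,
    dims=("time", "y", "x"),
    coords={"time": times, "y": y, "x": x},
    name="intensity",
)

# Illustrative target point, deliberately slightly off the grid
target_lon, target_lat = -0.381, 39.291

# Nearest-neighbour selection by coordinate value; on a regular grid this
# picks the same cell as a manual argmin over |x - lon| and |y - lat|
series = cube.sel(x=target_lon, y=target_lat, method="nearest")

print(float(series.x), float(series.y))
print(series.values)
```

Because the grid is shared by all dates, a single lookup returns one value per time step for the same ground cell.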

#### Comparison of Time Series Results

In [None]:
# Plot comparison of time series from both approaches
fig, axes = plt.subplots(2, 1, figsize=(12, 10))

# Regular grid (geocoded) time series
if target_point_regular is not None:
    ax1 = axes[0]
    target_point_regular.plot(ax=ax1, marker='o', label='Geocoded time series', color='blue')
    
    # Add flood threshold line
    x_vals = target_point_regular.time.values
    ax1.axhline(y=WATER_THRESHOLD_DB, color='red', linestyle='--', label=f'Flood threshold ({WATER_THRESHOLD_DB} dB)')
    
    # Add trend line
    if len(target_point_regular) > 3:
        x_num = np.arange(len(target_point_regular))
        y_vals = target_point_regular.values
        z = np.polyfit(x_num, y_vals, min(3, len(y_vals)-1))
        p = np.poly1d(z)
        ax1.plot(x_vals, p(x_num), 'g--', alpha=0.7, label='Trend line')
    
    ax1.set_title('✅ Regular Grid (Geocoded) Time Series - Reliable Geographic Alignment')
    ax1.set_ylabel('Backscatter Intensity (dB)')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
else:
    axes[0].text(0.5, 0.5, 'No regular grid data available', 
                ha='center', va='center', transform=axes[0].transAxes)
    axes[0].set_title('Regular Grid Time Series (Not Available)')

# Irregular grid time series
if target_point_irregular is not None:
    ax2 = axes[1]
    target_point_irregular.plot(ax=ax2, marker='s', label='Coordinate aligned time series', color='orange')
    
    # Add flood threshold line
    x_vals = target_point_irregular.time.values
    ax2.axhline(y=WATER_THRESHOLD_DB, color='red', linestyle='--', label=f'Flood threshold ({WATER_THRESHOLD_DB} dB)')
    
    # Add trend line
    if len(target_point_irregular) > 3:
        x_num = np.arange(len(target_point_irregular))
        y_vals = target_point_irregular.values
        z = np.polyfit(x_num, y_vals, min(3, len(y_vals)-1))
        p = np.poly1d(z)
        ax2.plot(x_vals, p(x_num), 'g--', alpha=0.7, label='Trend line')
    
    ax2.set_title('⚠️ Irregular Grid (Coordinate Aligned) Time Series - Potential Geographic Misalignment')
    ax2.set_ylabel('Backscatter Intensity (dB)')
    ax2.set_xlabel('Time')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
else:
    axes[1].text(0.5, 0.5, 'No irregular grid data available', 
                ha='center', va='center', transform=axes[1].transAxes)
    axes[1].set_title('Irregular Grid Time Series (Not Available)')

plt.tight_layout()
plt.show()

# Print summary statistics
print("\n=== Time Series Analysis Summary ===")

if target_point_regular is not None:
    regular_below_threshold = (target_point_regular < WATER_THRESHOLD_DB).sum().values
    regular_total = len(target_point_regular)
    print(f"Regular Grid (Geocoded):")
    print(f"  - Total time points: {regular_total}")
    print(f"  - Points below flood threshold: {regular_below_threshold}")
    print(f"  - Flood detection rate: {(regular_below_threshold/regular_total)*100:.1f}%")
    print(f"  - Mean backscatter: {target_point_regular.mean().values:.2f} dB")
    print(f"  - Min backscatter: {target_point_regular.min().values:.2f} dB")
    print(f"  - Max backscatter: {target_point_regular.max().values:.2f} dB")

if target_point_irregular is not None:
    irregular_below_threshold = (target_point_irregular < WATER_THRESHOLD_DB).sum().values
    irregular_total = len(target_point_irregular)
    print(f"\nIrregular Grid (Coordinate Aligned):")
    print(f"  - Total time points: {irregular_total}")
    print(f"  - Points below flood threshold: {irregular_below_threshold}")
    print(f"  - Flood detection rate: {(irregular_below_threshold/irregular_total)*100:.1f}%")
    print(f"  - Mean backscatter: {target_point_irregular.mean().values:.2f} dB")
    print(f"  - Min backscatter: {target_point_irregular.min().values:.2f} dB")
    print(f"  - Max backscatter: {target_point_irregular.max().values:.2f} dB")

print("\n=== Key Advantages of Geocoded Approach ===")
print("✅ Each pixel represents exactly the same ground location across all dates")
print("✅ Time series analysis is geographically meaningful and accurate")
print("✅ Suitable for operational flood monitoring and change detection")
print("✅ Can be easily combined with other geocoded datasets")
print("✅ Enables quantitative analysis of flood extent and duration")

## Conclusion

This tutorial has demonstrated a significant improvement to Sentinel-1 flood mapping by implementing proper GCP-based slicing and geocoding. The key improvements include:

### 🎯 **Main Improvements Made:**

1. **GCP-Based Spatial Slicing**: Instead of using hardcoded index slicing, we now use Ground Control Points (GCPs) to find the appropriate geographical area, ensuring that all products cover the same ground locations.

2. **Proper Geocoding**: Data is geocoded to a regular grid where each pixel represents exactly the same geographical area across all time steps.

3. **Aligned Time Series**: The geocoded approach enables accurate time series analysis where temporal changes reflect real changes in surface conditions.

### 🚀 **Benefits for Multi-Product Analysis:**

- **Geographic Consistency**: Each pixel represents the same area on the ground across all dates
- **Accurate Change Detection**: Differences between images reflect real changes, not geometric misalignment
- **Reliable Time Series**: Temporal analysis is geographically meaningful
- **Operational Ready**: Suitable for automated flood monitoring systems
- **Interoperability**: Can be easily combined with other geocoded datasets

### 📊 **Technical Implementation:**

The improved workflow follows these steps:
1. Define bounding box around area of interest
2. Use GCPs to find appropriate slicing indices
3. Crop and decimate data based on geographic bounds
4. Interpolate GCPs to match processed data
5. Geocode all products to a common regular grid
6. Create aligned time series for accurate analysis
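
Steps 1 and 2 can be sketched as follows. This is a simplified stand-in under stated assumptions, not the helper used earlier in the notebook: the function name `gcp_slices`, the flat GCP arrays and the synthetic, axis-aligned geolocation are all hypothetical, and a real Sentinel-1 GCP grid is rotated relative to north, so a production version would need some padding around the selected range.

```python
import numpy as np


def gcp_slices(gcp_line, gcp_pixel, gcp_lat, gcp_lon, bbox):
    """Return (line_slice, pixel_slice) spanning the GCPs inside bbox.

    bbox is (min_lon, min_lat, max_lon, max_lat). All GCP inputs are
    1-D arrays of equal length holding image indices and coordinates.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    inside = ((gcp_lon >= min_lon) & (gcp_lon <= max_lon) &
              (gcp_lat >= min_lat) & (gcp_lat <= max_lat))
    if not inside.any():
        raise ValueError("No GCP falls inside the bounding box")
    lines, pixels = gcp_line[inside], gcp_pixel[inside]
    # Slice ends are exclusive, so add 1 to keep the last matching index
    return (slice(int(lines.min()), int(lines.max()) + 1),
            slice(int(pixels.min()), int(pixels.max()) + 1))


# Hypothetical GCP grid: 4 x 5 points, every 100 lines and 200 pixels
line_idx, pixel_idx = np.meshgrid(np.arange(0, 400, 100),
                                  np.arange(0, 1000, 200), indexing="ij")
# Synthetic geolocation: latitude falls with line, longitude rises with pixel
lat = 39.6 - 0.001 * line_idx
lon = -0.6 + 0.0005 * pixel_idx

bbox = (-0.45, 39.35, -0.25, 39.55)  # illustrative bounding box
rows, cols = gcp_slices(line_idx.ravel(), pixel_idx.ravel(),
                        lat.ravel(), lon.ravel(), bbox)
print(rows, cols)
```

The returned slices can then be used for positional indexing along the image's line and pixel dimensions, which is the cropping in step 3.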

This approach extends the notebook from a single-product demonstration to a multi-product flood monitoring workflow.

In [None]:
# Extract time series from regular grid (geocoded) data
if intensity_datacube_regular is not None:
    print("Extracting time series from geocoded data...")
    
    # Find nearest pixel to target coordinates using geocoded coordinates
    # Use the x,y coordinates (longitude, latitude) from the regular grid
    x_coords = intensity_datacube_regular.x.values
    y_coords = intensity_datacube_regular.y.values
    
    # Find closest x and y indices
    x_idx = np.argmin(np.abs(x_coords - TARGET_LONG))
    y_idx = np.argmin(np.abs(y_coords - TARGET_LAT))
    
    # Extract time series at the target location
    target_point_regular = intensity_datacube_regular.intensity.isel(x=x_idx, y=y_idx)
    
    print(f"Extracting data at coordinates: Lon={x_coords[x_idx]:.4f}, Lat={y_coords[y_idx]:.4f}")
    print(f"Target coordinates: Lon={TARGET_LONG:.4f}, Lat={TARGET_LAT:.4f}")
    print(f"Distance from target: {np.abs(x_coords[x_idx] - TARGET_LONG):.4f}° lon, {np.abs(y_coords[y_idx] - TARGET_LAT):.4f}° lat")
    
else:
    target_point_regular = None
    print("No regular grid data available for time series extraction")

#### Time Series from Irregular Grid Data (For Comparison)

In [None]:
# Extract time series from irregular grid data for comparison
if intensity_data_cube is not None:
    print("\nExtracting time series from irregular grid data...")
    
    # Find how far each pixel's latitude and longitude is from the target point
    abs_error = np.abs(intensity_data_cube.latitude - TARGET_LAT) + np.abs(intensity_data_cube.longitude - TARGET_LONG)  
    
    # Get the indexes of the closest point
    i, j = np.unravel_index(np.argmin(abs_error.values), abs_error.shape)
    
    # Slice the data cube to get only the pixel that corresponds to the target point
    target_point_irregular = intensity_data_cube.isel(ground_range=j, azimuth_time=i)
    
    print(f"Closest pixel found at azimuth_time index {i}, ground_range index {j}")
else:
    target_point_irregular = None
    print("No irregular grid data available for time series extraction")

<hr>

## Challenges

While using the optimised `.zarr` format saves a lot of time and makes creating workflows relatively simple and achievable, there are still a few challenges to handle and to keep in mind:

- Sentinel-1 GRD Data Availability: For **Sentinel-1 GRD**, most of the datasets are not yet available on the STAC catalogue. This makes searching and data handling harder because, in the end, only a few products are correctly converted.

- Backscatter Computation Libraries: There are only a few working Python libraries that handle backscatter computation. When considering the `.zarr` format, the list becomes even smaller. `xarray_sentinel` is a very good library that handles intensity backscatter computation with `.zarr`.

- Terrain Correction: With the available libraries, it is very difficult to perform geometric and radiometric terrain correction. The existing tools that support the `.zarr` format are not yet fully operational and do not accept the format as it is.

- Image Coregistration: As discussed previously, the `.zarr` format is well suited to handling multiple datasets simultaneously and, thus, to time series analysis. The problem is that there is no library or package that performs proper **coregistration** of Sentinel images, especially with the `.zarr` format. In this tutorial, we used a simplified coordinate assignment approach rather than true coregistration, which works for our specific use case but has limitations. Proper coregistration remains a significant open challenge, since it is an important step in most production SAR workflows.

<hr>

## Conclusion

The `.zarr` format is particularly well suited for hazard analysis because it enables multiple datasets to be combined into a single structure, either as a data cube or as a list of datatrees. This makes it ideal for rapid, multi-temporal, and multi-spatial monitoring. Unlike the `.SAFE` format, which required downloading entire products, `.zarr` only loads the specific groups needed, while the rest is accessed on the fly. As a result, both data handling and subsequent operations are much faster and more efficient.

Although the ecosystem for `.zarr` is still evolving, there are already promising developments. In the past, `.SAFE` products could be fully processed in applications like SNAP, but similar completeness has not yet been reached for `.zarr`. Nevertheless, libraries such as `xarray_sentinel` are beginning to cover essential SAR operations. This potential is illustrated in the Valencia flood case study, where Sentinel-1 backscatter sensitivity to water enabled clear mapping of flood extent and duration. The same workflow can be adapted to other flood events by adjusting the relevant thresholds and parameters to match local conditions.

## What's next?

This online resource is under active development, so stay tuned for regular updates 🛰️.