Replace `the_notebook.ipynb` in the `href` of the next cell with the name of this notebook file:

<a href="https://jupyterhub.user.eopf.eodc.eu/hub/user-redirect/git-pull?repo=https://github.com/eopf-toolkit/eopf-101&branch=main&urlpath=lab/tree/eopf-101/the_notebook.ipynb" target="_blank">
  <button style="background-color:#0072ce; color:white; padding:0.6em 1.2em; font-size:1rem; border:none; border-radius:6px; margin-top:1em;">
    🚀 Launch this notebook in JupyterLab
  </button>
</a>

https://jupyterhub.user.eopf.eodc.eu/hub/user-redirect/git-pull?repo=https://github.com/atsiokanos/eopf-101&branch=6_a_general_overview_of_the_topic_gww&urlpath=lab/tree/eopf-101/65_gww.ipynb

### Introduction

Water reservoirs are essential for water supply, energy production, and irrigation. However, population growth, economic expansion, and climate change are increasing pressure on these resources, affecting water availability and raising the risk of droughts and floods. Reliable monitoring of reservoirs is critical to ensure sustainable water management and water security.  

[Global Water Watch (GWW)](https://www.globalwaterwatch.earth/) is a platform developed by **Deltares** and supported by Google.org, the Water, Peace, and Security Partnership, and the European Space Agency (ESA). It provides near-real-time, globally accessible information on reservoirs using Earth Observation data, helping stakeholders monitor changes in water extent and manage resources more effectively. A detailed description of GWW methods is available in [this publication](https://www.nature.com/articles/s41598-022-17074-6).  

In this notebook, we **implement parts of the GWW algorithms** to estimate water extent for a single reservoir: **Mita Hills in Zambia**. Although the original GWW algorithm uses **Landsat 7 & 8** as well as **Sentinel-2**, we focus only on **Sentinel-2 imagery** here for simplicity.  



### What we will learn


- 💧 Compute the Modified Normalized Difference Water Index (MNDWI) and understand its role in highlighting water bodies.  
- 🧩 Apply parts of the GWW algorithm to generate water masks and extract the largest connected water body.  
- 📈 Estimate and analyze reservoir water extent over time using Earth observation imagery.


### Prerequisites

Describe the most relevant packages used in the tutorial.
Include [linked]() references or call-out notes.<br>
List any resources or references to previous tutorials, and anything else the learner should be aware of before starting this chapter.

<hr>

#### Import libraries

In [None]:
# Core & utilities
import math
import numpy as np
import pandas as pd
from datetime import datetime

# Parallel computing and large data handling
import dask                                  
from dask.distributed import Client          
import xarray as xr                        

# Data access
from pystac_client import Client as StacClient   # Query EO datasets via STAC API
from pystac import MediaType                     # Identify asset media types (e.g., ZARR)

# Coordinate reference systems and projection
from pyproj import CRS, Transformer              # Convert AOI from geographic to projected (UTM)

import rioxarray                                # Read/write rasters with CRS metadata
from rasterio.enums import Resampling           # Resampling method for reprojection

# Image processing for water detection
from skimage.filters import threshold_otsu      # Thresholding (for MNDWI classification)
from skimage.feature import canny               # Edge detection
from skimage.morphology import dilation         # Edge dilation
from skimage.measure import label               # Connected components labeling

# Visualization
import matplotlib.pyplot as plt                



#### Helper functions

##### `function_name`

In [None]:
def function_name_1(kwargs):
    ...

# when developing a longer function, try to explain what each line is doing

If you are using fewer than three functions, you can list them individually. When using a `utils.py`, give an overview of the functions you will be using, plus a [link]() to it for further inspection.

<hr>

## Section 1 - Data Preparation

### 1.1 Define AOI, Time Range and Collection

### 1.2 Retrieve Sentinel-2 Data from EOPF STAC Catalog

### 1.3 Load and Preprocess Data

## Section 2 - Reservoir Water Extent Estimation

### 2.1 - Compute MNDWI

We start by computing the Modified Normalized Difference Water Index (MNDWI) using the green and SWIR bands. The MNDWI enhances the presence of water by producing high positive values for water pixels and lower or negative values for land and vegetation. This gives us a first indication of where water is located, but simple thresholding at this stage can be unreliable, especially near shorelines or in areas with shallow water or vegetation.
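As a minimal sketch of this step, the index can be written as `MNDWI = (Green - SWIR) / (Green + SWIR)`. The helper name `compute_mndwi` and the toy reflectance values below are assumptions for illustration only. The sketch also assumes both bands are already on the same pixel grid (for Sentinel-2, the green and SWIR bands are commonly B03 and B11, which have different native resolutions).

```python
import numpy as np

def compute_mndwi(green, swir):
    """Return (green - swir) / (green + swir), with NaN where the sum is zero."""
    green = np.asarray(green, dtype="float64")
    swir = np.asarray(swir, dtype="float64")
    total = green + swir
    out = np.full(total.shape, np.nan)
    # Only divide where the denominator is non-zero; other pixels stay NaN
    np.divide(green - swir, total, out=out, where=total != 0)
    return out

# Hypothetical reflectances: water-like, land-like, neutral, and no-data pixels
green = np.array([[0.08, 0.10], [0.06, 0.00]])
swir = np.array([[0.02, 0.30], [0.06, 0.00]])
mndwi = compute_mndwi(green, swir)
print(mndwi)
```

The water-like pixel (high green, low SWIR) gets a clearly positive value, while the land-like pixel becomes negative.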

### 2.2 - Integrate Water Occurrence Data (JRC)

### 2.3 - Generate water extents (GWW Algorithm)

To detect water more reliably, we combine two pieces of information: the MNDWI index and the Water Occurrence (WO) dataset.

First, we identify edges in the MNDWI image, which correspond to sharp transitions between water and land. Using only these edge pixels, we apply Otsu thresholding to determine a robust cutoff value that separates water from land. This avoids biases from large uniform land areas and ensures the threshold is focused on transition zones.
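The edge-based thresholding described above could be sketched as follows. This is an illustration under stated assumptions, not the reference GWW implementation: the function name `edge_otsu_threshold`, the `sigma` and `buffer_px` defaults, and the synthetic lake are all hypothetical. Real MNDWI rasters may also contain NaN values that need masking first.

```python
import numpy as np
from skimage.feature import canny
from skimage.filters import threshold_otsu
from skimage.morphology import dilation

def edge_otsu_threshold(mndwi, sigma=1.0, buffer_px=1):
    """Otsu threshold computed only from pixels close to MNDWI edges."""
    edges = canny(mndwi, sigma=sigma)                      # boolean edge map
    size = 2 * buffer_px + 1
    footprint = np.ones((size, size), dtype=bool)
    edge_zone = dilation(edges, footprint).astype(bool)    # widen edges slightly
    if not edge_zone.any():
        # No edges found: fall back to a threshold over the whole image
        return threshold_otsu(mndwi), edge_zone
    return threshold_otsu(mndwi[edge_zone]), edge_zone

# Synthetic scene: a round lake (0.6) on land (-0.4) with a short shoreline ramp
yy, xx = np.mgrid[0:60, 0:60]
dist = np.hypot(yy - 30, xx - 30)
mndwi = np.clip((18.0 - dist) / 4.0, -0.4, 0.6)

threshold, edge_zone = edge_otsu_threshold(mndwi)
water_mask = mndwi > threshold
print(float(threshold), int(water_mask.sum()))
```

Because only shoreline pixels feed the histogram, the large uniform land area does not pull the threshold towards land values.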

Next, we use the WO dataset to fill in water areas that MNDWI might miss, such as shallow or turbid zones. The median of the WO values is computed only on the edge pixels, making the filling threshold context-aware. This filling also helps recover water in pixels obscured by clouds or shadows in the current image. Water is added in areas that are classified as non-water by MNDWI but exceed the WO threshold.

Finally, the water mask from MNDWI and the filled water mask from WO are combined to produce the total water mask.
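The filling and combination steps can be sketched like this, following the description above rather than the official GWW code. The function name `fill_with_occurrence` and the small arrays are assumptions for illustration. Water occurrence is taken here as a percentage (0 to 100) on the same grid as the MNDWI mask.

```python
import numpy as np

def fill_with_occurrence(mndwi_water, edge_zone, occurrence):
    """Return (total_mask, fill_mask, wo_threshold) for boolean/float inputs."""
    # Context-aware threshold: median occurrence on edge pixels only
    wo_threshold = np.nanmedian(occurrence[edge_zone])
    # Add pixels that MNDWI missed but that are water often enough historically
    fill_mask = ~mndwi_water & (occurrence > wo_threshold)
    total_mask = mndwi_water | fill_mask
    return total_mask, fill_mask, wo_threshold

# Hypothetical 4x4 scene
occurrence = np.array([[90, 90, 90, 10],
                       [90, 80, 60, 5],
                       [90, 70, 40, 0],
                       [20, 10, 5, 0]], dtype=float)
mndwi_water = np.array([[1, 1, 0, 0],
                        [1, 1, 0, 0],
                        [1, 0, 0, 0],
                        [0, 0, 0, 0]], dtype=bool)
edge_zone = np.array([[0, 1, 1, 0],
                      [0, 1, 1, 0],
                      [1, 1, 0, 0],
                      [1, 0, 0, 0]], dtype=bool)

total_mask, fill_mask, wo_threshold = fill_with_occurrence(
    mndwi_water, edge_zone, occurrence
)
print(wo_threshold, int(fill_mask.sum()), int(total_mask.sum()))
```

In this toy scene the pixel at row 0, column 2 is missed by MNDWI (for example because of a cloud) but has a high historical occurrence, so it is added to the total mask.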

### 2.4 - Extract Largest Connected Component
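
One possible way to keep only the largest connected water body is connected-component labelling with `skimage.measure.label`. The helper name `largest_component`, the choice of 8-connectivity, and the toy mask are assumptions for this sketch.

```python
import numpy as np
from skimage.measure import label

def largest_component(mask, connectivity=2):
    """Keep only the largest connected group of True pixels."""
    labels = label(mask, connectivity=connectivity)  # 0 is background
    if labels.max() == 0:
        return np.zeros(mask.shape, dtype=bool)      # no water at all
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0                                     # ignore background
    return labels == sizes.argmax()

# Hypothetical mask: one 5-pixel body, one 2-pixel body, one isolated pixel
water = np.array([[1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 1],
                  [1, 0, 0, 0, 1],
                  [0, 0, 0, 0, 0],
                  [0, 1, 0, 0, 0]], dtype=bool)
reservoir = largest_component(water)
print(int(reservoir.sum()))
```

Small detached patches, such as ponds or misclassified shadows, are dropped so that only the main reservoir body remains.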

### 2.5 - Compute Reservoir Area
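
A minimal sketch of the area calculation, assuming the mask is in a projected CRS with metre units (such as UTM) and square pixels. The function name `water_area_km2` and the 20 m pixel size in the example are assumptions for illustration.

```python
import numpy as np

def water_area_km2(mask, pixel_size_m):
    """Area of True pixels in square kilometres for square pixels."""
    n_pixels = np.count_nonzero(mask)
    return n_pixels * pixel_size_m ** 2 / 1e6

# Hypothetical reservoir mask: 50 x 50 water pixels at 20 m resolution
mask = np.zeros((100, 100), dtype=bool)
mask[10:60, 20:70] = True
area = water_area_km2(mask, pixel_size_m=20.0)
print(area)
```

Repeating this for every acquisition date gives a water extent time series for the reservoir.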

<hr>

## 💪 Now it is your turn

--> *This section contains an engaging part with some exercises or tasks for the learner.*

*Some ideas:*
* *Ask the learner to repeat the workflow with a different dataset*
* *Ask the learner to modify the area of interest.*
* *Ask the learner to reflect and test their level of understanding / comprehension*

Example: 

The following exercises will help you master the STAC API and understand how to find the data you need.

### Task 1: Explore Your Own Area of Interest
* Go to http://bboxfinder.com/ and select an area of interest (AOI) (e.g. your hometown, a research site, etc.)
* Copy the bounding box coordinates of your area of interest
* Change the provided code above to search for data over your AOI

### Task 2: Temporal Analysis
* Compare data availability across different years for the Sentinel-2 L2A collection.
* Search for items in year 2022
* Repeat the search for year 2024

### Task 3: Explore the SAR Mission and combine multiple criteria
* Repeat the search for a different collection, for example Sentinel-1 Level-1 GRD (collection ID `sentinel-1-l1-grd`)
* How many assets are available for the year 2024?



## Conclusion

--> *This section summarises the objectives of the notebook, what a learner learned and what possible conclusions / results were obtained.*

## What's next?

--> *In one or two sentences, describe what awaits the learner in the next chapter and include a link to the next chapter.*