# Measuring the water level of Theewaterskloof Dam in South Africa

**A Jupyter notebook on how to detect the water level of water bodies around the world, using Sentinel-2 multi-spectral and multi-temporal imagery**

This notebook serves as an example of how to bring satellite data from space into the hands of people living on Earth and analyze it in order to draw conclusions that affect all of Earth's citizens. Specifically, it demonstrates how to run a water detection algorithm and extract the surface water level of a single reservoir over a given time interval.

Hopefully, this example notebook raises awareness of environmental problems and helps, at least a little, to make the world a better place.

## Notebook outline

The outline of this notebook is the following:
1. Defining geometries of [Theewaterskloof Dam, South Africa](https://en.wikipedia.org/wiki/Theewaterskloof_Dam)
2. Preparing and executing the full workflow for water detection
   1. Downloading Sentinel-2 data using [SentinelHub](https://www.sentinel-hub.com/) services
   2. Cloud detection using the [s2cloudless](https://github.com/sentinel-hub/sentinel2-cloud-detector) cloud detector
   3. Water detection
3. Visualizing the waterbodies and the water level over a period of time

## Requirements

- `eo-learn` (https://github.com/sentinel-hub/eo-learn)

In order to run the example, you will also need a Sentinel Hub account. If you do not have one yet, you can create a free trial account on the [Sentinel Hub webpage](https://services.sentinel-hub.com/oauth/subscription). If you are a researcher, you can even apply for a free non-commercial account on the [ESA OSEO page](https://earth.esa.int/aos/OSEO).

Details on how to set up your Sentinel Hub configuration can be found [here](introduction.ipynb).

## Notebook configuration

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

## Imports

### eo-learn imports

In [None]:
from eolearn.core import EOTask, EOPatch, LinearWorkflow, FeatureType

from eolearn.io import SentinelHubInputTask

from eolearn.mask import AddValidDataMaskTask

# filtering of scenes
from eolearn.features import SimpleFilterTask, NormalizedDifferenceIndexTask

# burning the vectorised polygon to raster
from eolearn.geometry import VectorToRasterTask

### Other imports 

In [None]:
import os

# The golden standard: numpy and matplotlib
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable

# For manipulating geo-spatial vector dataset (polygons of nominal water extent)
import geopandas as gpd

# Image manipulations
# Our water detector is going to be based on a simple threshold 
# of Normalised Difference Water Index (NDWI) grayscale image
from skimage.filters import threshold_otsu

# Loading polygon of nominal water extent
import shapely.wkt
from shapely.geometry import Polygon

# sentinelhub-py package
from sentinelhub import BBox, CRS, DataCollection

## Water level extraction EOWorkflow

Our basic logic of the example workflow is:

1. Download all available Sentinel-2 satellite imagery of Theewaterskloof Dam for the time interval of interest
    * We want to calculate NDWI and also have a true color visualization of the area. We need the following bands: 
        * B02, B03, B04 for `TRUE_COLOR` for visualisations
        * B03, B08 for NDWI calculation 
        * CLM (provided by Sentinel Hub) for cloud masking
2. Clouds often obscure the view of the ground. To determine the water level of the dam correctly, all cloudy images need to be filtered out. We will use the cloud masks provided by Sentinel Hub, which avoids time-consuming cloud detection on the local machine.
3. Apply adaptive thresholding to `NDWI` grayscale images
4. Extract water level from a comparison of measured water extent with the nominal one

Each step in the workflow overview above is accomplished by adding an `EOTask` to the `EOWorkflow`.

### Load the Polygon of nominal water extent and define a `BBox`

The `BBox` defines an area of interest and will be used to create an EOPatch.

In [None]:
# The polygon of the dam is written in wkt format and WGS84 coordinate reference system
DATA_PATH = os.path.join('..', 'data', 'theewaterskloof_dam_nominal.wkt')
with open(DATA_PATH, 'r') as f:
    dam_wkt = f.read()

dam_nominal = shapely.wkt.loads(dam_wkt)

# We add a bit of buffer to the BBox so it nicely contains all polygons 
dam_bbox = BBox(dam_nominal.bounds, crs=CRS.WGS84).buffer(0.2)

# Display
dam_bbox.geometry - dam_nominal

### Step 1: Initialize (and implement workflow-specific) EOTasks

#### Create an EOPatch and add all EO features (satellite imagery data)

In [None]:
download_task = SentinelHubInputTask(
    data_collection=DataCollection.SENTINEL2_L1C, 
    bands_feature=(FeatureType.DATA, 'BANDS'),
    resolution=20, 
    maxcc=0.5, 
    bands=['B02', 'B03', 'B04', 'B08'], 
    additional_data=[(FeatureType.MASK, 'dataMask', 'IS_DATA'), (FeatureType.MASK, 'CLM')]
)

calculate_ndwi = NormalizedDifferenceIndexTask((FeatureType.DATA, 'BANDS'), (FeatureType.DATA, 'NDWI'), (1, 3))
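
To make the band indices concrete: with `bands=['B02', 'B03', 'B04', 'B08']`, the indices `(1, 3)` select B03 (green) and B08 (near infrared), so the index is (B03 - B08) / (B03 + B08). Below is a minimal NumPy sketch of that formula on made-up reflectance values. The helper `ndwi_from_bands` and the sample pixels are illustrative assumptions, not part of eo-learn.

```python
import numpy as np

def ndwi_from_bands(bands, green_idx=1, nir_idx=3):
    """Hypothetical helper: normalized difference (green - nir) / (green + nir)."""
    green = bands[..., green_idx].astype(float)
    nir = bands[..., nir_idx].astype(float)
    total = green + nir
    # Leave NaN where the denominator is zero instead of dividing by zero
    return np.divide(green - nir, total, out=np.full_like(total, np.nan), where=total != 0)

# Two made-up pixels in band order B02, B03, B04, B08
pixels = np.array([
    [0.05, 0.08, 0.04, 0.02],  # water-like: green brighter than near infrared
    [0.06, 0.10, 0.08, 0.30],  # vegetation-like: near infrared much brighter
])
ndwi = ndwi_from_bands(pixels)
print(ndwi)
```

Water-like pixels end up with a positive index and vegetation-like pixels with a negative one, which is what makes a single threshold on NDWI workable.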

#### Burn in the nominal water extent

The `VectorToRasterTask` expects the vectorised dataset as a geopandas `GeoDataFrame`.

In [None]:
dam_gdf = gpd.GeoDataFrame(crs=CRS.WGS84.pyproj_crs(), geometry=[dam_nominal])

In [None]:
dam_gdf.plot();

In [None]:
add_nominal_water = VectorToRasterTask(
    dam_gdf, (FeatureType.MASK_TIMELESS, 'NOMINAL_WATER'), values=1, 
    raster_shape=(FeatureType.MASK, 'IS_DATA'), raster_dtype=np.uint8
)

#### The cloud mask is already provided by Sentinel Hub; we use it to calculate the valid data mask

Define a `VALID_DATA` layer: a pixel has to contain data and be classified as clear sky by the cloud detector (`CLM` equals 0).

In [None]:
def calculate_valid_data_mask(eopatch):
    # Use the builtin bool: the np.bool alias was removed in recent NumPy releases
    is_data = eopatch.mask['IS_DATA'].astype(bool)
    not_cloud = ~eopatch.mask['CLM'].astype(bool)
    return is_data & not_cloud

add_valid_mask = AddValidDataMaskTask(predicate=calculate_valid_data_mask)
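
As a small illustration of the predicate's logic, the sketch below applies the same combination to hand-made masks. The arrays are made-up values in the `(time, height, width, 1)` layout used for `MASK` features, and only NumPy is assumed.

```python
import numpy as np

# One frame of 2x2 pixels with one channel: made-up IS_DATA and CLM masks
is_data = np.array([[[[1], [1]], [[1], [0]]]], dtype=np.uint8)
clm = np.array([[[[0], [1]], [[0], [0]]]], dtype=np.uint8)

# A pixel is valid only if it holds data and is not flagged as cloud
valid_data = is_data.astype(bool) & ~clm.astype(bool)
print(valid_data[0, ..., 0])
```

The cloudy pixel in the top row and the no-data pixel in the bottom row are both marked invalid.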

Calculate the fraction of invalid (cloudy or no-data) pixels per frame and store it as the `SCALAR` feature `COVERAGE`

In [None]:
def calculate_coverage(array):
    return 1.0 - np.count_nonzero(array) / np.size(array)

class AddValidDataCoverageTask(EOTask):
    
    def execute(self, eopatch):
        
        valid_data = eopatch[FeatureType.MASK, 'VALID_DATA']
        time, height, width, channels = valid_data.shape
        
        coverage = np.apply_along_axis(calculate_coverage, 1,
                                       valid_data.reshape((time, height * width * channels)))
        
        eopatch[FeatureType.SCALAR, 'COVERAGE'] = coverage[:, np.newaxis]
        return eopatch
    
add_coverage = AddValidDataCoverageTask()
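
The same per-frame statistic can also be computed without `np.apply_along_axis`. A minimal sketch, assuming a boolean `VALID_DATA`-like array of shape `(time, height, width, 1)`; the example data is made up.

```python
import numpy as np

# Two frames of 2x2 pixels: the first fully valid, the second with one invalid pixel
valid_data = np.ones((2, 2, 2, 1), dtype=bool)
valid_data[1, 0, 0, 0] = False

# Fraction of invalid pixels per frame, averaged over the height, width and channel axes
coverage = 1.0 - valid_data.mean(axis=(1, 2, 3))
print(coverage)
```

Note that a value of 0 means a completely clear frame, so lower `COVERAGE` is better.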

Filter out scenes that are too cloudy. We drop all observations with a cloud coverage of more than 5%.


In [None]:
cloud_coverage_threshold = 0.05 

class ValidDataCoveragePredicate:
    
    def __init__(self, threshold):
        self.threshold = threshold
        
    def __call__(self, array):
        return calculate_coverage(array) < self.threshold
    
remove_cloudy_scenes = SimpleFilterTask((FeatureType.MASK, 'VALID_DATA'),
                                        ValidDataCoveragePredicate(cloud_coverage_threshold))
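
Conceptually, the filter evaluates the predicate on every frame and keeps only the frames (and their timestamps) for which it returns `True`. The sketch below mimics that selection with plain NumPy on made-up data. It does not call `SimpleFilterTask` itself, and all variable names are illustrative.

```python
import numpy as np

threshold = 0.05

# Three made-up frames of 10x10 pixels with 0%, 3% and 40% invalid pixels
valid_data = np.ones((3, 10, 10, 1), dtype=bool)
valid_data[1, 0, :3] = False   # 3 invalid pixels out of 100
valid_data[2, :4] = False      # 4 full rows, i.e. 40 invalid pixels out of 100

dates = np.array(['2019-01-01', '2019-01-06', '2019-01-11'])

# Same statistic as calculate_coverage: fraction of invalid pixels per frame
invalid_fraction = np.array([1.0 - np.count_nonzero(frame) / frame.size for frame in valid_data])

# Keep only the frames below the threshold, together with their dates
keep = invalid_fraction < threshold
kept_frames = valid_data[keep]
kept_dates = dates[keep]
print(kept_dates)
```

Only the first two frames survive; the third, with 40% invalid pixels, is dropped.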

#### Apply Water Detection



In [None]:
class WaterDetectionTask(EOTask):
    
    @staticmethod
    def detect_water(ndwi):
        """ Very simple water detector based on Otsu thresholding method of NDWI.
        """
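        # Fallback threshold: NDWI cannot exceed 1, so a constant image yields no water
        otsu_thr = 1.0
        if len(np.unique(ndwi)) > 1:
            # Treat missing values as non-water before computing the threshold
            ndwi[np.isnan(ndwi)] = -1
            otsu_thr = threshold_otsu(ndwi)

        # Pixels with NDWI above the threshold are classified as water
        return ndwi > otsu_thr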

    def execute(self, eopatch):
        water_masks = np.asarray([self.detect_water(ndwi[...,0]) for ndwi in eopatch.data['NDWI']])
        
        # we're only interested in the water within the dam borders
        water_masks = water_masks[...,np.newaxis] * eopatch.mask_timeless['NOMINAL_WATER']
        
        water_levels = np.asarray([np.count_nonzero(mask)/np.count_nonzero(eopatch.mask_timeless['NOMINAL_WATER']) 
                                   for mask in water_masks])
        
        eopatch[FeatureType.MASK, 'WATER_MASK'] = water_masks
        eopatch[FeatureType.SCALAR, 'WATER_LEVEL'] = water_levels[...,np.newaxis]
        
        return eopatch
    
water_detection = WaterDetectionTask()
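
To illustrate what the task does for a single frame, here is a minimal, self-contained sketch on synthetic NDWI values. It uses a simplified NumPy re-implementation of Otsu's method (the hypothetical helper `otsu_threshold`) so that it runs without scikit-image. The notebook itself relies on `skimage.filters.threshold_otsu`, whose result may differ slightly. The NDWI values and the nominal mask are made up.

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Simplified Otsu: pick the histogram split that maximizes between-class variance."""
    counts, edges = np.histogram(values, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    counts = counts.astype(float)

    # Class weights and class means for every possible split of the histogram
    weight_low = np.cumsum(counts)
    weight_high = np.cumsum(counts[::-1])[::-1]
    mean_low = np.cumsum(counts * centers) / np.maximum(weight_low, 1e-12)
    mean_high = (np.cumsum((counts * centers)[::-1]) / np.maximum(weight_high[::-1], 1e-12))[::-1]

    between_var = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2
    return centers[np.argmax(between_var)]

# Synthetic NDWI image: 600 land pixels (negative NDWI) followed by 400 water pixels
ndwi = np.concatenate([np.linspace(-0.3, -0.5, 600), np.linspace(0.4, 0.6, 400)]).reshape(20, 50)

# Made-up nominal extent covering the last 500 pixels (100 land + 400 water)
nominal = np.zeros(ndwi.size, dtype=np.uint8)
nominal[500:] = 1
nominal = nominal.reshape(ndwi.shape)

threshold = otsu_threshold(ndwi)

# Water is counted only inside the nominal extent, as in WaterDetectionTask
water_mask = (ndwi > threshold) & (nominal == 1)
water_level = np.count_nonzero(water_mask) / np.count_nonzero(nominal)
print(threshold, water_level)
```

In this synthetic case the detected water covers 80% of the nominal extent. Note that `WATER_LEVEL` is therefore a surface-area fraction relative to the nominal extent rather than a water depth.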

### Step 2: Define the EOWorkflow

In [None]:
workflow = LinearWorkflow(
    download_task,
    calculate_ndwi,
    add_nominal_water,
    add_valid_mask,
    add_coverage,
    remove_cloudy_scenes,
    water_detection
)

### Step 3: Run the workflow

Process all Sentinel-2 acquisitions from the beginning of 2017 until the beginning of June 2020.

In [None]:
time_interval = ['2017-01-01','2020-06-01']

In [None]:
result = workflow.execute({
    download_task: {
        'bbox': dam_bbox,
        'time_interval': time_interval
    },
})

In [None]:
patch = list(result.values())[-1]

Print content of eopatch at the end of the workflow execution

## Plot results

In [None]:
from skimage.filters import sobel
from skimage.morphology import disk
from skimage.morphology import erosion, dilation, opening, closing, white_tophat

In [None]:
def plot_rgb_w_water(eopatch, idx):
    ratio = np.abs(eopatch.bbox.max_x - eopatch.bbox.min_x) / np.abs(eopatch.bbox.max_y - eopatch.bbox.min_y)
    fig, ax = plt.subplots(figsize=(ratio * 10, 10))
    
    ax.imshow(2.5*eopatch.data['BANDS'][..., [2, 1, 0]][idx])
    
    observed = closing(eopatch.mask['WATER_MASK'][idx,...,0], disk(1))
    nominal = sobel(eopatch.mask_timeless['NOMINAL_WATER'][...,0])
    observed = sobel(observed)
    nominal = np.ma.masked_where(nominal == False, nominal)
    observed = np.ma.masked_where(observed == False, observed)
    
    ax.imshow(nominal, cmap=plt.cm.Reds)
    ax.imshow(observed, cmap=plt.cm.Blues)
    ax.axis('off')

In [None]:
plot_rgb_w_water(patch, 0)

In [None]:
plot_rgb_w_water(patch, -1)

In [None]:
def plot_water_levels(eopatch, max_coverage=1.0):
    fig, ax = plt.subplots(figsize=(20, 7))

    dates = np.asarray(eopatch.timestamp)
    ax.plot(dates[eopatch.scalar['COVERAGE'][...,0] < max_coverage],
            eopatch.scalar['WATER_LEVEL'][eopatch.scalar['COVERAGE'][...,0] < max_coverage],
            'bo-', alpha=0.7)
    ax.plot(dates[eopatch.scalar['COVERAGE'][...,0] < max_coverage],
            eopatch.scalar['COVERAGE'][eopatch.scalar['COVERAGE'][...,0] < max_coverage],
            '--', color='gray', alpha=0.7)
    ax.set_ylim(0.0, 1.1)
    ax.set_xlabel('Date')
    ax.set_ylabel('Water level')
    ax.set_title('Theewaterskloof Dam Water Levels')
    ax.grid(axis='y')
    return ax

In [None]:
ax = plot_water_levels(patch, 1.0)