# Water Observations using Sentinel-2

## Background
Geoscience Australia's Water Observations from Space (WOfS) classifier is a decision tree that classifies each pixel of an individual multispectral Landsat observation as water or not water.

## Description
In this example, we apply the WOfS classifier to Sentinel-2 to map surface water at 10 m spatial resolution and generate monthly water extents.

To minimize dependencies, only the `classify` function from the `wofs` package is used.

### Packages and functions
Import Python packages that are used for the analysis.

In [None]:
# only importing required packages

from datacube import Datacube
from odc.geo.geom import point

from wofs.classifier import classify
from odc.algo import mask_cleanup

In [None]:
from collections import Counter

def mostcommon_crs(datasets):
    """Return the CRS that occurs most often among the given datasets."""
    crs_counts = Counter(dataset.metadata_doc["crs"] for dataset in datasets)
    return crs_counts.most_common(1)[0][0]
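
A study area near a UTM zone boundary can return datasets in more than one CRS, which is why the helper above picks the most frequent one. The sketch below exercises the same logic on stand-in objects. The `fake_datasets` list and its EPSG codes are hypothetical and only mimic the `metadata_doc["crs"]` lookup used above.

```python
from collections import Counter
from types import SimpleNamespace


def mostcommon_crs(datasets):
    crs_counts = Counter(dataset.metadata_doc["crs"] for dataset in datasets)
    return crs_counts.most_common(1)[0][0]


# Hypothetical stand-ins for datacube datasets (not real query results)
fake_datasets = [
    SimpleNamespace(metadata_doc={"crs": "EPSG:32750"}),
    SimpleNamespace(metadata_doc={"crs": "EPSG:32751"}),
    SimpleNamespace(metadata_doc={"crs": "EPSG:32751"}),
]

most_common = mostcommon_crs(fake_datasets)
print(most_common)
```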

## Select a study site

Keep the area small when analysing a long time series.

In [None]:
start_date = "2025-01"
end_date = "2025-12"

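coords = -4.02, 120.02  # Lake Tempe (latitude, longitude)

# point() expects x (longitude) first, then y (latitude)
aoi_point = point(coords[1], coords[0], crs="EPSG:4326")
# the buffer distance is in degrees because the point is in EPSG:4326
area = aoi_point.buffer(0.04).boundingbox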

area.explore()
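
To get a feel for how small the study area is, the buffer in degrees can be converted to an approximate ground distance. This is a rough sketch that assumes a spherical Earth with about 111.32 km per degree of latitude. The function name `bbox_size_km` is hypothetical and not part of any package used here.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def bbox_size_km(lat_deg, buffer_deg):
    """Approximate width and height of a box buffered around a point."""
    height_km = 2 * buffer_deg * KM_PER_DEGREE
    # degrees of longitude shrink with the cosine of the latitude
    width_km = height_km * math.cos(math.radians(lat_deg))
    return width_km, height_km


width_km, height_km = bbox_size_km(-4.02, 0.04)
# number of 10 m pixels across the box, rounded to the nearest pixel
pixels_across = round(width_km * 1000 / 10)
print(f"{width_km:.1f} km x {height_km:.1f} km, about {pixels_across} pixels wide")
```

At this size each time step is well under a million pixels, which keeps a year of observations manageable.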

## Load satellite data

Load spectral bands and data quality measurements relevant for WOfS.

In [None]:
dc = Datacube()

In [None]:
# bands must be supplied to the classifier in this order
wofs_bands = ["blue", "green", "red", "nir08", "swir16", "swir22"]

In [None]:
# define sensor and resolution
product = ["s2_l2a"]
resolution = 10
qa_band = "scl"

In [None]:
# Find datasets
datasets = dc.find_datasets(
    product=product,
    time=(start_date, end_date),
    longitude=(area.left, area.right),
    latitude=(area.bottom, area.top),
    # skip scenes with more than 60% cloud cover
    cloud_cover=(0, 60),
)

crs = mostcommon_crs(datasets)

print(f"Found {len(datasets)} datasets")
print(f"Most common CRS is {crs}")

data = dc.load(
    datasets=datasets,
    longitude=(area.left, area.right),
    latitude=(area.bottom, area.top),
    resolution=resolution,
    output_crs=crs,
    measurements=wofs_bands + [qa_band],
    group_by="solar_day",
    dask_chunks={"time": 1, "x": 1000, "y": 1000},
    resampling={
        "*": "bilinear",
        qa_band: "nearest",
    },
    driver="rio",
)

In [None]:
# data = data.compute()  # uncomment to load the data into memory
data

## Water classification

The WOfS classifier takes a 3D `DataArray` as input, with the bands in the order listed in `wofs_bands` and values from 0 to 10,000.
No scaling or offset is required for Sentinel-2 collection 0 data.
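
The input layout can be illustrated with plain NumPy arrays standing in for the xarray `DataArray`. This is only a sketch: the tile values are random placeholders, and only the band list comes from this notebook. It shows the bands being stacked in the required order regardless of how they are stored, followed by a value-range check.

```python
import numpy as np

# Band order taken from this notebook
wofs_bands = ["blue", "green", "red", "nir08", "swir16", "swir22"]

# Hypothetical 2x2 reflectance tiles, deliberately stored out of order
rng = np.random.default_rng(0)
bands = {
    name: rng.integers(1, 10_001, size=(2, 2), dtype=np.uint16)
    for name in ["swir22", "red", "blue", "nir08", "green", "swir16"]
}

# Stack along a new leading axis following wofs_bands, not the dict order
stacked = np.stack([bands[name] for name in wofs_bands], axis=0)

# Check that every value lies in the expected 0 to 10,000 range
in_range = bool(((stacked >= 0) & (stacked <= 10_000)).all())
print(stacked.shape, in_range)
```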

In [None]:
# no scaling; classify every input pixel
# note: the wofs classifier returns uint8 with 0 for dry, 128 for water
wofs_all = (
    data[wofs_bands]
    .to_array()
    .groupby("time")
    .map(lambda a: classify(a.squeeze("time", drop=True)))
).compute()

In [None]:
# plot selected maps
wofs_all.isel(time=[0, 1, 2]).plot.imshow(col="time");

## Masking

The water classification maps need to be filtered to keep only valid and reliable observations.
We will consider cloud, cloud shadow and nodata as invalid observations.

In [None]:
# Mask Sentinel-2 data using the scene classification (SCL) band
# 3 is cloud shadow, 8 is medium probability cloud, 9 is high probability cloud
cloud_mask = data[qa_band].isin([3, 8, 9])
# optional cloud buffering, commented out for faster operation
# cloud_mask = mask_cleanup(cloud_mask, (("dilation", 10), ("erosion", 5)))
# a zero in any band marks the pixel as nodata
invalid = (data[wofs_bands] == 0).to_array(dim="band").any(dim="band")
mask = (cloud_mask | invalid).compute()
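
The same masking logic can be followed by hand on a tiny NumPy example. The 2x2 tiles below are made-up values, and only two bands are used to keep it short. SCL values 4 and 6 stand in for clear pixels here.

```python
import numpy as np

# Hypothetical 2x2 SCL tile: 3 = cloud shadow, 9 = high probability cloud
scl = np.array([[4, 3],
                [9, 6]])

# Two hypothetical reflectance bands; one pixel has a zero (nodata) value
band_a = np.array([[500, 400],
                   [300, 0]])
band_b = np.array([[600, 450],
                   [350, 200]])

# Flag cloud shadow and cloud classes
cloud_mask = np.isin(scl, [3, 8, 9])
# Flag pixels where any band is zero
invalid = (np.stack([band_a, band_b]) == 0).any(axis=0)

# A pixel is masked if either test flags it
mask = cloud_mask | invalid
print(mask)
```

Only the top-left pixel stays unmasked: it is neither cloud-affected nor zero in any band.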

In [None]:
#mask.plot.imshow(col='time');

## Monthly extent

Since Sentinel-2 provides frequent coverage, we will use it to monitor monthly changes in the water extent.

In [None]:
# using max() to capture any water detection within the month;
# masked observations become NaN and are skipped
wofs_monthly_extent = wofs_all.where(~mask).resample(time="MS").max(skipna=True)

# turn output into binary map (pixels with no clear observation become False)
wofs_monthly_extent = wofs_monthly_extent == 128
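
The effect of taking the monthly maximum over clear observations can be sketched with NumPy. Note that this example swaps in a different but equivalent formulation: instead of NaN masking, masked observations are replaced with 0 (dry) before taking the maximum. All values are hypothetical.

```python
import numpy as np

# Hypothetical month: three observations of a 1x3 strip of pixels
wofs = np.array([[[128, 0, 128]],
                 [[0, 0, 128]],
                 [[128, 0, 128]]], dtype=np.uint8)

# True marks cloud, cloud shadow or nodata
mask = np.array([[[False, False, True]],
                 [[True, False, True]],
                 [[False, False, True]]])

# Replace masked observations with 0 so they cannot contribute water
clear_only = np.where(mask, 0, wofs)

# A pixel is water for the month if any clear observation detected water
monthly_extent = clear_only.max(axis=0) == 128
print(monthly_extent)
```

The third pixel is water in every observation but is always masked, so it is reported as not water for the month.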

In [None]:
wofs_monthly_extent.plot.imshow(col='time');

In [None]:
# plot monthly extent trend (number of water pixels per month)
wofs_monthly_extent.sum(["y", "x"]).plot();

> Clouds prevented a clear view in some months, so the extents mapped above represent the minimum water presence in those months.
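
The trend plot counts water pixels, which can be converted to an area for easier interpretation. A minimal sketch, assuming square pixels at the 10 m resolution used in this notebook; the helper name `pixel_count_to_km2` and the pixel count are hypothetical.

```python
def pixel_count_to_km2(n_pixels, resolution_m=10):
    """Convert a count of square pixels to an area in square kilometres."""
    pixel_area_m2 = resolution_m ** 2
    return n_pixels * pixel_area_m2 / 1_000_000


# Hypothetical count of water pixels for one month
area_km2 = pixel_count_to_km2(250_000)
print(area_km2)
```

In the notebook, the same conversion could be applied directly to the summed array, for example `wofs_monthly_extent.sum(["y", "x"]) * resolution**2 / 1e6`.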

## Further exploration

* Are there other areas where this analysis could be applied?

* What additional types of analysis could help support your use case?

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 