# Introduction to DEA Water Observations <img align="right" src="../Supplementary_data/dea_logo.jpg">

## Background
It is important to know where water is normally present in a landscape, where water is rarely observed, and where inundation has occasionally occurred.

These observations tell us where flooding has occurred in the past, and allow us to understand wetlands, water connectivity and surface-groundwater relationships. This can lead to more effective emergency management and risk assessment.

This notebook introduces the principal [Digital Earth Australia (DEA) Water Observations product](https://cmi.ga.gov.au/data-products/dea/613/dea-water-observations-landsat) (previously known as Water Observations from Space). It provides the individual water observations per satellite image that are subsequently used in the wider DEA Water Observations suite and related water bodies products.

This product shows where surface water was observed by the Landsat satellites on any particular day since mid-1986. These daily data layers are termed Water Observations (WOs).

### What this product offers
DEA Water Observations provides surface water observations derived from Landsat satellite imagery for all of Australia from 1986 to present.

The WOs show the extent of water in the corresponding Landsat scene, along with the degree to which the scene was obscured by cloud or shadow, and where sensor problems made parts of the scene unobservable.

### Applications
The DEA Water Observations (WOs) are used to determine the area of surface water present in the corresponding satellite scene, and can be used for several water monitoring applications. Uses of the individual WOs include:

* flood extent
* amount of water in water bodies, major rivers and the coastal zone.

Because the WOs are separate from the derived statistics in the associated DEA Water Observations statistical products, they are most useful for analyses of surface water extent at particular times, rather than over long-term time series.

### Publications
* Mueller, N., Lewis, A., Roberts, D., Ring, S., Melrose, R., Sixsmith, J., Lymburner, L., McIntyre, A., Tan, P., Curnow, S., & Ip, A. (2016). [Water observations from space: Mapping surface water from 25 years of Landsat imagery across Australia](https://doi.org/10.1016/j.rse.2015.11.003). Remote Sensing of Environment, 174, 341–352. 

> **Note:** For technical information about DEA Water Observations, visit the official [Geoscience Australia DEA Water Observations product description](https://cmi.ga.gov.au/data-products/dea/613/dea-water-observations-landsat).

### Load packages

In [None]:
import sys
import datacube
from datacube.utils import masking
import matplotlib.pyplot as plt

sys.path.append('./Scripts')
from dea_plotting import plot_wo
from dea_datahandling import wofs_fuser

## Available products and measurements

### List products available in Digital Earth Australia

In [None]:
dc = datacube.Datacube(app='DEA_Water_Observations')

# List DEA Water Observations products available in DEA
dc_products = dc.list_products()
display_columns = ['name', 'default_crs', 'default_resolution']

display(dc_products[dc_products.name == 'ga_ls_wo_3'][display_columns].set_index('name'))

### List measurements

In [None]:
dc_measurements = dc.list_measurements()
dc_measurements.loc['ga_ls_wo_3']

## Loading data
Now that we know what products and measurements are available for the products, we can load WOs data from Digital Earth Australia for an example location.

As WOs are created scene-by-scene and some scenes overlap, it is important when loading data to `group_by` solar day, and ensure that the data between scenes is combined correctly by using the WOs `fuse_func`.
This will merge observations taken on the same day, and ensure that important data is not lost when overlapping datasets are combined.

In [None]:
# Set up a region to load data
query = {
    'y': (-31.18, -31.12),
    'x': (116.84, 116.90),
    'time': ('1996-09-01', '1996-12-30'),
}

# Load DEA Water Observations data from the datacube
wo = dc.load(product='ga_ls_wo_3',
             output_crs='EPSG:3577',
             resolution=(-30, 30),
             group_by='solar_day',
             fuse_func=wofs_fuser,
             **query)
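To illustrate what fusing overlapping scenes involves, here is a minimal pure-Python sketch. It is not the `wofs_fuser` code itself: `fuse_pixels` is a hypothetical helper, and it assumes that the "no data" flag has the decimal value 1, that a pixel with no data in one scene should be filled from the other scene, and that two valid pixels are merged by keeping every flag set in either.

```python
NODATA = 1  # assumed decimal value of the "no data" bit


def fuse_pixels(first, second):
    """Combine two lists of WOs pixel values observed on the same solar day."""
    fused = []
    for a, b in zip(first, second):
        if a & NODATA:
            # First scene has no data here: take the second scene's value
            fused.append(b)
        elif b & NODATA:
            # Second scene has no data here: keep the first scene's value
            fused.append(a)
        else:
            # Both scenes are valid: keep every flag set in either scene
            fused.append(a | b)
    return fused


scene_a = [1, 128, 0, 64]
scene_b = [128, 1, 0, 128]
print(fuse_pixels(scene_a, scene_b))  # [128, 128, 0, 192]
```

Without a fuse function of this kind, the first dataset encountered would simply win wherever two scenes overlap, and valid observations from the second scene could be discarded.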

We can now view the data that we loaded.
The measurements listed under `Data variables` should match the measurements displayed in the previous [List measurements](#List-measurements) step.

In [None]:
wo

### Plotting data
We can plot DEA Water Observations using the `plot_wo` function. We can see that our study area includes one large and several small waterbodies which are changing in size over time. We can also see that some observations (e.g. first and fourth panels) contain clouds and cloud shadow.


In [None]:
plot_wo(wo.water, col='time', size=4, col_wrap=4)

## Understanding WOs bit flags
WOs data are stored as a bit field. This is a binary number, where each digit of the number is independently set based on the presence (1) or absence (0) of a particular attribute (water, cloud, cloud shadow etc). In this way, a single decimal value for each pixel can provide information about a variety of features of that pixel.

Below is a breakdown of which bits represent which features, along with the decimal value associated with that bit being set to true.

| Attribute | Bit / position   | Decimal value |
|------|------|----|
| No data | 0:   `0-------` or `1-------` | 1|
| Non contiguous | 1:   `-0------` or `-1------` | 2 |
| Sea | 2:   `--0-----` or `--1-----` | 4 |
| Terrain or low solar angle | 3:   `---0----` or `---1----` | 8 |
| High slope | 4:   `----0---` or `----1---` | 16 |
| Cloud shadow | 5:   `-----0--` or `-----1--` | 32 |
| Cloud | 6:   `------0-` or `------1-` | 64 |
| Water | 7:   `-------0` or `-------1` | 128 |

Any combination of flags can be added together to create a unique decimal value.
For example, a value of 136 indicates that water (128) AND terrain shadow / low solar angle (8) were observed for the pixel (i.e. 128 + 8 = 136).
A value of 144 would indicate water (128) AND high slope (16) were observed (128 + 16 = 144).
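The arithmetic above can be reversed with bitwise operations. The sketch below uses only the decimal values from the table; `WO_FLAGS` and `decode_wo_value` are hypothetical names introduced for illustration and are not part of `datacube`.

```python
# Decimal value of each bit, as listed in the table above
WO_FLAGS = {
    1: 'no data',
    2: 'non contiguous',
    4: 'sea',
    8: 'terrain or low solar angle',
    16: 'high slope',
    32: 'cloud shadow',
    64: 'cloud',
    128: 'water',
}


def decode_wo_value(value):
    """Return the names of all flags that are set in a WOs pixel value."""
    return [name for bit, name in WO_FLAGS.items() if value & bit]


print(decode_wo_value(136))  # ['terrain or low solar angle', 'water']
print(decode_wo_value(144))  # ['high slope', 'water']
print(decode_wo_value(0))    # []
```

A value of 0 means that no flag is set, i.e. the pixel was observed clearly and contained no water.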

This flag information is available inside the loaded data and can be visualised as below:

In [None]:
# Display details of available flags
flags = masking.describe_variable_flags(wo)
flags['bits'] = flags['bits'].astype(str)
flags.sort_values(by='bits')

### Masking using WOs bit flags
We can convert the WOs bit field into a binary array containing True and False values. 
This allows us to use the WOs data as a mask that can be applied to other datasets.

The `make_mask` function allows us to create a mask using the flag labels (e.g. "wet" or "dry") rather than the binary numbers we used above. For example, we can easily identify pixels that were wet in each image (i.e. yellow) by passing the flag `wet=True`:

In [None]:
# Identify pixels that were observed as wet
wo_wet = masking.make_mask(wo, wet=True)

# Plot output mask
wo_wet.water.plot(col='time', size=4, col_wrap=4)

#### Exercise W1.1: Can you compute another mask for pixels categorised as 'cloud' and plot the pixels that are either 'water' or 'cloud'? Can you then modify the last line so that it plots the pixels categorised as both 'water' and 'cloud'?

In [None]:
import xarray as xr

# Create the mask (replace the ? to complete the exercise)
wo_mask = masking.make_mask(wo, wet=True, ?)

# Combine and plot
wo_mask.water.plot(col='time', size=4, col_wrap=4)

> **Note:** For more technical information about the DEA Water Observations bit flags, refer to the Details tab of the official [Geoscience Australia DEA Water Observations product description](https://cmi.ga.gov.au/data-products/dea/613/dea-water-observations-landsat#details).

## Example application: mapping inundation frequency and tracking changes in surface water over time
The following section will demonstrate a simple analysis workflow based on DEA Water Observations. 
In this example, we will process our loaded WOs data so that we can map inundation frequency across our study area, and consistently track changes in surface water area over time.


### Identifying clear pixels

In the previous example, we used the `wet=True` bit flag to identify pixels that contained water. However, using wet pixels on their own can lead to misleading results. For example, the fourth image above gives the false impression that our waterbody reduced significantly in size, when in reality part of the waterbody was obscured by cloud and cloud shadow. 

To deal with this, we first need to remove cloud, cloud shadow and other sources of invalid data from our dataset. We can do this by identifying "clear" pixels, i.e. those that were observed as either wet *or* dry by the DEA Water Observations algorithm. The resulting images will show clear pixels as yellow, and unclear pixels as purple:

In [None]:
# Identify clear pixels that were either wet or dry
wo_dry = masking.make_mask(wo, dry=True)
wo_clear = wo_wet + wo_dry

# Plot clear pixels over time
wo_clear.water.plot(col='time', size=4, col_wrap=4)

We can also achieve a similar result by combining multiple bit flags.
When chaining flags together, they will be combined in a logical AND fashion:

In [None]:
# Identify clear pixels that were not cloud, shadow or nodata
wo_clear = masking.make_mask(wo, cloud_shadow=False, cloud=False, nodata=False)

# Plot clear pixels over time
wo_clear.water.plot(col='time', size=4, col_wrap=4)
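To make the logical AND concrete, the sketch below tests each of the three flags separately on a single pixel value and requires all three tests to pass. It relies only on the decimal values from the bit flag table; `is_clear` is a hypothetical helper, not a `datacube` function.

```python
# Decimal values of the three flags tested, from the bit flag table
NODATA, CLOUD_SHADOW, CLOUD = 1, 32, 64


def is_clear(value):
    """True if none of the cloud shadow, cloud or no data bits are set."""
    no_shadow = (value & CLOUD_SHADOW) == 0
    no_cloud = (value & CLOUD) == 0
    has_data = (value & NODATA) == 0
    # All three conditions must hold (logical AND)
    return no_shadow and no_cloud and has_data


values = [0, 128, 192, 32, 1, 136]
print([is_clear(v) for v in values])
# [True, True, False, False, False, True]
```

Note that a value of 136 (water plus terrain or low solar angle) passes this test, because the terrain flag is not one of the three flags being checked. The `wo_clear` mask computed above applies the same kind of test to every pixel in the loaded data.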

We can then use this as a mask to remove unclear pixels from our data. When we plot this, we can see that these pixels have now been set to `NaN` (i.e. white areas below):

In [None]:
# Apply clear data as a mask to remove unclear pixels
wo_masked = wo_wet.where(wo_clear).water

# Plot the masked data
wo_masked.plot(col='time', size=5, col_wrap=4)

### Inundation frequency
Now that we have correctly masked clouds, shadow and other invalid data, we can calculate the frequency that each pixel in our study area was observed as wet.
We can do this by taking the mean wetness of each pixel through time. 

Dark colours indicate pixels that were wet almost 100% of our time period, while light blue pixels only occasionally contained water.

In [None]:
# Calculate mean wetness through time
wo_freq = wo_masked.mean(dim=['time'])

# Plot with dark blue = high frequency
wo_freq.plot(size=5, cmap='Blues')
plt.title('Inundation frequency (% wet observations)')
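Because unclear pixels were set to `NaN` before taking the mean, the frequency for each pixel is relative to the number of clear observations rather than the total number of images. The pure-Python sketch below illustrates this for individual pixels, assuming the mean ignores `NaN` values (as xarray does by default for floating-point data). `wet_frequency` and the sample values are hypothetical.

```python
import math


def wet_frequency(series):
    """Fraction of valid (non-NaN) observations in which a pixel was wet."""
    valid = [v for v in series if not math.isnan(v)]
    if not valid:
        # Pixel was never clearly observed
        return math.nan
    return sum(valid) / len(valid)


nan = math.nan
pixel_a = [1.0, 1.0, nan, 1.0]  # wet in every clear observation
pixel_b = [0.0, 1.0, nan, 0.0]  # wet in one of three clear observations
pixel_c = [nan, nan, nan, nan]  # never clearly observed

print(wet_frequency(pixel_a))  # 1.0
print(wet_frequency(pixel_b))  # 0.3333333333333333
print(wet_frequency(pixel_c))  # nan
```

If the masked observations were instead counted as dry, `pixel_a` would score 0.75 rather than 1.0, understating how often it holds water.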

#### Exercise W1.2: The previous plot shows the percentage of observations that are wet for each pixel. The pixels are boolean values represented by the numbers `1` for `True` and `0` for `False`. Can you find an alternative way of plotting the total number of wet observations for every pixel instead?

### Surface water over time
Similarly, we can track the percentage of our study area that contained water over time to inspect trends and changes in surface water.

To do this, we can take the mean surface water percent across each observation in our dataset:

In [None]:
# Calculate percent surface water over time
wo_time = wo_masked.mean(dim=['x', 'y'])

# Plot as a line graph
wo_time.plot(size=5)
plt.title('Surface water over time (%)')

### Dropping poorly observed scenes
In the line chart above, we can see that surface water has smoothly reduced over time, except for a small deviation at our fourth observation. This occurred because cloud and shadow obscured part of that image, slightly distorting our calculation of how full the waterbody was at that moment in time.

To compare surface water consistently, we can restrict our analysis to observations where less than (for example) 20% of pixels contained cloud, shadow or other invalid pixels. This allows us to obtain a more reliable view of how surface water has changed at our location:

In [None]:
# Calculate the percent of nodata pixels in each observation
percent_nodata = wo_masked.isnull().mean(dim=['x', 'y'])

# Use this to filter to observations with less than 20% nodata
wo_masked = wo_masked.sel(time=percent_nodata < 0.2)

# Plot the filtered data, which now contains only three observations
wo_masked.mean(dim=['x', 'y']).plot(size=5)
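The filtering logic can be sketched without xarray as follows. The scene names and pixel values are made up for illustration, and `fraction_invalid` is a hypothetical helper; only the 20% threshold comes from the step above.

```python
import math


def fraction_invalid(scene):
    """Fraction of pixels in a scene that are NaN (i.e. masked out)."""
    return sum(math.isnan(v) for v in scene) / len(scene)


nan = math.nan
# Hypothetical scenes: 1.0 = wet, 0.0 = dry, NaN = cloud, shadow or no data
scenes = {
    'scene_1': [1.0, 0.0, 0.0, 1.0, 0.0],
    'scene_2': [1.0, nan, nan, 1.0, 0.0],
    'scene_3': [1.0, 0.0, 0.0, 0.0, 0.0],
}

# Keep only scenes where less than 20% of pixels are invalid
kept = {name: px for name, px in scenes.items() if fraction_invalid(px) < 0.2}
print(list(kept))  # ['scene_1', 'scene_3']
```

Here `scene_2` is dropped because 40% of its pixels are invalid. The threshold is a trade-off: a stricter value gives more comparable observations but leaves fewer of them.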

#### Disclaimer: The original notebook for this tutorial has been taken from the Sandbox Beginners_guide folder. Refer to the original notebook for its conditions of use and for updated versions of the code.