<link rel="stylesheet" href="https://use.typekit.net/dvn1law.css">
<style>        
@font-face {
font-family:"futura-pt-bold";
src:url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/l?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("woff2"),url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/d?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("woff"),url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/a?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("opentype");
font-display:auto;font-style:normal;font-weight:700;font-stretch:normal;
}
</style>
<div style="display: flex; margin: 0px; padding-top: 1.5rem; padding-bottom: 1.5rem; font-family: futura-pt, 'Tahoma', 'Segoe UI', Geneva, Verdana, sans-serif;">
    <span style="margin-right: 15px; padding-right: 2rem; background-color: #3b6d48;"></span>
    <div style="margin-bottom: auto; margin-top: auto; margin-right: auto; padding-right: 15px;">
        <div style="margin: 0; padding-top: 0.2rem; padding-bottom: 3.3rem; letter-spacing: 0.15rem; color: #a6ce37; font-weight: bold; font-size: 3rem; font: futura-pt-bold"> CEOS Analytics Lab</div>
        <div style="margin: 0; color: #469ab9; font-weight: bold; font-size: 1.5rem;">Welcome to the CEOS Analytics Lab!</div>
        <div style="margin: 0; padding-bottom: 0.2rem; color: #474c38; font-size: 1.25rem;"><span>TBD</span><span>| </span><span style="color: #3b6d48; font-weight: bold;">Water Observations from Space (WOFS)</span></div>
        <hr style="border: 1px solid #474c38;">
    </div>
    <div style="margin-top: auto; margin-bottom: auto; margin-left: auto; padding-left: 15px;">
        <div><img style="vertical-align: middle; padding: 0.5rem; width: 300px; height: auto;" src="https://ceos.org/document_management/Communications/CEOS-Logos/CEOS_logo_colour_no_text-small.png" /></div>
    </div>
</div>

This notebook demonstrates the Australian Water Observations from Space (WOFS) algorithm, which classifies each Landsat pixel as water or not water. It is an improvement over the Landsat QA water flag and the NDWI index for identifying water. For more information, see:

http://www.ga.gov.au/scientific-topics/hazards/flood/wofs
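
For comparison, the NDWI mentioned above is a simple band ratio. The sketch below computes the McFeeters form, (green - nir) / (green + nir), on made-up reflectance values. The `ndwi` helper, the sample numbers, and the threshold of 0 are illustrative assumptions; they are not part of this notebook's workflow or of the WOFS algorithm.

```python
import numpy as np

def ndwi(green, nir):
    """McFeeters NDWI: (green - nir) / (green + nir). Hypothetical helper."""
    green = np.asarray(green, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    total = green + nir
    # Avoid division by zero: pixels with no signal become NaN.
    safe_total = np.where(total == 0, np.nan, total)
    return (green - nir) / safe_total

# Made-up surface reflectance values for three pixels: water, vegetation, empty.
green = [0.06, 0.08, 0.0]
nir = [0.02, 0.40, 0.0]
index = ndwi(green, nir)

# Positive values suggest water; a threshold of 0 is a common convention, not a rule.
is_water = index > 0
```

A single threshold on one band ratio like this is what WOFS aims to improve on.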

## <span id="import">Import Dependencies and Connect to the Data Cube [&#9652;](#top)</span>

In [None]:
import datacube
dc = datacube.Datacube(app='Water_Observations_from_Space')

import sys, os
os.environ['USE_PYGEOS'] = '0'

import datetime
import matplotlib.pyplot as plt
import numpy as np  
import xarray as xr
import pandas as pd

from dea_tools.plotting import rgb, display_map
from dea_tools.bandindices import calculate_indices

### EASI tools
sys.path.append(os.path.expanduser('../scripts'))
from ceos_utils.data_cube_utilities.clean_mask import landsat_clean_mask_invalid, landsat_qa_clean_mask
from easi_tools import EasiDefaults
from easi_tools import notebook_utils
easi = EasiDefaults() # Get the default parameters for this system

In [3]:
from datacube.utils import masking
help(masking)

Help on module datacube.utils.masking in datacube.utils:

NAME
    datacube.utils.masking - Tools for masking data based on a bit-mask variable with attached definition.

DESCRIPTION
    The main functions are `make_mask(variable)` `describe_flags(variable)`

FUNCTIONS
    create_mask_value(bits_def, **flags)
    
    describe_flags_def(flags_def)
    
    describe_variable_flags(variable, with_pandas=True)
        Returns either a Pandas Dataframe (with_pandas=True - default) or a string
        (with_pandas=False) describing the available flags for a masking variable
        
        Interprets the `flags_definition` attribute on the provided variable and
        returns a Pandas Dataframe or string like::
        
            Bits are listed from the MSB (bit 13) to the LSB (bit 0)
            Bit     Value   Flag Name            Description
            13      0       cloud_shadow_fmask   Cloud Shadow (Fmask)
            12      0       cloud_shadow_acca    Cloud Shadow (ACCA)
    

In [None]:
cluster, client = notebook_utils.initialize_dask(use_gateway=False)
display(cluster if cluster else client)
print(notebook_utils.localcluster_dashboard(client, server=easi.hub))

In [None]:
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client)

## <span id="plat_prod">Choose Platforms and Products [&#9652;](#top)</span>

In [None]:
# Define the Product
product = "landsat8_c2l2_sr"

## <span id="define_extents">Define the Extents of the Analysis [&#9652;](#top)</span>

In [None]:
# Select an analysis region (Latitude-Longitude) 
# Select a time period within the extents of the dataset (Year-Month-Day)

# Mombasa, Kenya
# latitude = (-4.05, -3.95) 
# longitude = (39.60, 39.68) 

# latitude=easi.latitude
# longitude=easi.longitude
latitude = (36.28, 36.48)
longitude = (-114.43, -114.325)

# Define Time Range
# Landsat-8 time range: 07-Apr-2013 to current
time_extents = ('2021-01-01', '2021-12-31')

In [None]:
# The code below renders a map that can be used to view the analysis region.
display_map(longitude,latitude)

## <span id="load_data">Load and Clean Data from the Data Cube [&#9652;](#top)</span>
After loading, inspect the resulting xarray Dataset. Its dimensions give the number of pixels along `y` and `x` (the data are reprojected to EPSG:6933, so the spatial dimensions are not named latitude and longitude) and the number of time slices (`time`) in the time series.

In [None]:
measurements = ['red', 'green', 'blue', 'nir', 'swir1', 'swir2', 'pixel_qa']
data_names = measurements.copy()
data_names.remove('pixel_qa')

In [None]:
landsat_dataset = dc.load(latitude = latitude,
                          longitude = longitude,
                          time = time_extents,
                          product = product,
                          output_crs = 'epsg:6933',
                          resolution = (-30,30),
                          measurements = measurements,
                          group_by = 'solar_day',
                          dask_chunks = {'time':1}) 

In [None]:
landsat_dataset

In [None]:
clear_mask = masking.make_mask(landsat_dataset['pixel_qa'], clear='clear')
water_mask = masking.make_mask(landsat_dataset['pixel_qa'], water='water')
cloud_mask = masking.make_mask(landsat_dataset['pixel_qa'], cloud='not_high_confidence', cloud_shadow='not_high_confidence')
clean_mask = (clear_mask | water_mask) & cloud_mask
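
A QA band packs several yes/no flags into the bits of one integer per pixel, and `make_mask` tests those bits by name. The sketch below shows the general idea with plain NumPy. The bit positions (`CLEAR_BIT`, `WATER_BIT`, `CLOUD_BIT`) and the sample values are invented for illustration and do not reflect the actual `pixel_qa` layout of any product; `masking.describe_variable_flags` reports the real one.

```python
import numpy as np

# Hypothetical bit positions, chosen only for this example.
CLEAR_BIT = 1
WATER_BIT = 2
CLOUD_BIT = 5

def bit_is_set(qa, bit):
    """Return a boolean array that is True where the given bit is 1."""
    return ((np.asarray(qa) >> bit) & 1) == 1

# Four made-up QA values: clear, water, clear but cloudy, nothing set.
qa = np.array([0b000010, 0b000100, 0b100010, 0b000000], dtype=np.uint16)

clear = bit_is_set(qa, CLEAR_BIT)
water = bit_is_set(qa, WATER_BIT)
no_cloud = ~bit_is_set(qa, CLOUD_BIT)

# Same combination as in the cell above: (clear OR water) AND not cloudy.
clean = (clear | water) & no_cloud
```

Only the first two pixels pass: the third is flagged as cloud, and the fourth is neither clear nor water.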

## <span id="time_series_water">Time Series Water Detection Analysis [&#9652;](#top)</span>
This section produces a time series of Australian Water Observations from Space (WOFS) results. The output shows the percentage of valid observations in which each pixel is classified as water over the entire time series. BLUE = frequent water, RED = infrequent water.

In [None]:
# WOFS was written for Landsat Collection 1, so the Collection 2 data must be rescaled to look like Collection 1.
# Adapted from https://github.com/GeoscienceAustralia/wofs/blob/master/wofs/virtualproduct.py
def scale_usgs_collection2(data):
    """Rescale Collection 2 data to Collection 1 units (values taken from the Fractional Cover scaling)."""
    attrs = data.attrs
    # Dataset.map is the current name for Dataset.apply; it runs the function on every data variable.
    data = data.map(scale_and_clip_dataarray, keep_attrs=False,
                    scale_factor=0.275, add_offset=-2000,
                    clip_range=None, valid_range=(0, 10000))
    data.attrs = attrs
    return data

def scale_and_clip_dataarray(dataarray: xr.DataArray, *, scale_factor=1, add_offset=0, clip_range=None,
                             valid_range=None, new_nodata=-999, new_dtype='int16'):
    orig_attrs = dataarray.attrs
    nodata = dataarray.attrs['nodata']

    mask = dataarray.data == nodata

    # add another mask here for if data > 10000 then also make that nodata
    dataarray = dataarray * scale_factor + add_offset

    if clip_range is not None:
        dataarray = dataarray.clip(*clip_range)

    dataarray = dataarray.astype(new_dtype)

    dataarray.data[mask] = new_nodata
    if valid_range is not None:
        valid_min, valid_max = valid_range
        dataarray = dataarray.where(dataarray >= valid_min, new_nodata)
        dataarray = dataarray.where(dataarray <= valid_max, new_nodata)
    dataarray.attrs = orig_attrs
    dataarray.attrs['nodata'] = new_nodata

    return dataarray

landsat_dataset_scaled = scale_usgs_collection2(landsat_dataset)
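
The constants 0.275 and -2000 follow from the published USGS Collection 2 surface reflectance scaling (reflectance = DN * 0.0000275 - 0.2) combined with the Collection 1 convention of storing reflectance multiplied by 10000. The arithmetic check below uses a made-up digital number; the two helper functions are hypothetical and exist only to compare the two-step and folded forms.

```python
# USGS Collection 2 Level-2 surface reflectance scaling.
C2_SCALE = 0.0000275
C2_OFFSET = -0.2
# Collection 1 stored reflectance multiplied by 10000.
C1_MULTIPLIER = 10000

def c2_dn_to_c1_units(dn):
    """Convert a Collection 2 digital number to Collection 1 style units in two steps."""
    reflectance = dn * C2_SCALE + C2_OFFSET
    return reflectance * C1_MULTIPLIER

def c2_dn_to_c1_units_direct(dn):
    """Same conversion using the folded constants from the notebook."""
    return dn * 0.275 - 2000

dn = 10000  # made-up example value
two_step = c2_dn_to_c1_units(dn)
direct = c2_dn_to_c1_units_direct(dn)
print(two_step, direct)
```

Both forms give approximately 750, which corresponds to a reflectance of 0.075.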

In [None]:
from ceos_utils.data_cube_utilities.dc_water_classifier import wofs_classify
ts_water_classification = wofs_classify(landsat_dataset_scaled, clean_mask=clean_mask.values,
                                        no_data=0, x_coord='x', y_coord='y')

In [None]:
# Apply nan to no_data values
ts_water_classification = ts_water_classification.where(ts_water_classification != -9999).astype(np.float16)

# Time series aggregation that ignores nan values.    
water_classification_percentages = (ts_water_classification.mean(dim = ['time']) * 100).wofs.rename('water_classification_percentages')
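
The aggregation above can be checked on a tiny made-up stack. This sketch uses plain NumPy rather than xarray and assumes the same encoding: 1 for water, 0 for not water, and NaN for no valid observation. The array values are invented for illustration.

```python
import numpy as np

# Made-up classification stack with shape (time, y, x):
# 1 = water, 0 = not water, NaN = no valid observation.
stack = np.array([
    [[1.0, 0.0], [np.nan, 0.0]],
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 1.0], [0.0, np.nan]],
    [[1.0, 0.0], [np.nan, 0.0]],
])

# Mean over the time axis, skipping NaN, expressed as a percentage.
percent_water = np.nanmean(stack, axis=0) * 100
print(percent_water)
```

The lower-left pixel has only two valid observations, one of them water, so it scores 50 percent rather than 25: missing observations are excluded from the denominator.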

In [None]:
# Get the color scheme and render NaNs (no data) in black.
# Work on a copy so the globally registered colormap is not modified.
jet_r = plt.get_cmap('jet_r').copy()
jet_r.set_bad(color='black', alpha=1)

In [None]:
img_scale = water_classification_percentages.shape[0]/water_classification_percentages.shape[1]

In [None]:
# Plot the WOFS time series product.
# RED areas have experienced little or no water over the time series.
# BLUE areas have experienced frequent or constant water over the time series.
figsize=6
water_classification_percentages.plot(cmap = jet_r, figsize=(figsize,figsize*img_scale), vmin=0, vmax=100)
plt.title("Percent of Samples Classified as Water")
plt.axis('off')
plt.show()