<a href="https://colab.research.google.com/github/SERVIR/flood_mapping_intercomparison/blob/main/hydrafloods/training_materials/oct_2021_hf_training/notebooks/remote_sensing_water_day1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to remote sensing of surface water using HYDRAFloods

In this notebook we will look at the basic functionality of HYDRAFloods and how to create surface water maps from different sensors. Lastly, we will explore more advanced water mapping techniques. Full documentation and additional examples for the HYDRAFloods Python package can be found at: https://servir-mekong.github.io/hydra-floods/

## Setup
Before running the notebook, please mount your Google Drive to the notebook. We will use Google Drive to securely store Earth Engine credentials for use in other notebooks. This lets us skip authenticating every time, which saves time throughout the training.

In [None]:
# mount the google drive so that we can save credentials
from google.colab import drive
drive.mount('/content/drive')

Now we will install the `hydrafloods` package for surface water mapping and `geemap` for interactively viewing results from Earth Engine.

You will get an error stating "*You must restart the runtime in order to use newly installed versions.*" This can be ignored.

In [None]:
# install the packages needed
!pip install hydrafloods geemap

In [None]:
%pylab inline

In [None]:
import ee
import datetime
import hydrafloods as hf
import geemap.eefolium as geemap
import geemap.colormaps as cm

Check the HYDRAFloods package version. It should be "2021.10.11".

In [None]:
hf.__version__

In [None]:
# initiate authentication workflow
# it will ask to authenticate if no credentials are available
# will also initialize ee session
_ = geemap.Map()

## Exploring HYDRAFloods Datasets
The start of any process is to acquire data. Here HYDRAFloods is used to connect to Earth Engine collections and apply the spatio-temporal filters we are interested in with a minimal amount of code.

In [None]:
region = hf.country_bbox("Guatemala")
start_time = "2019-01-01"
end_time = "2019-07-01"

# get a Landsat 8 collection
lc8 = hf.Landsat8(region,start_time,end_time)

In [None]:
print(lc8)

In [None]:
lc8.n_images

In [None]:
lc8.dates

In [None]:
lc8.collection

`hydrafloods` has specialized dataset classes that extend the hydrafloods Dataset class and cover common image collections used in surface water mapping (see the list [here](https://servir-mekong.github.io/hydra-floods/using-datasets/#specialized-datasets)). These specialized datasets include a custom `qa()` method, based on quality assessment bands, that gets called on initialization to mask poor quality pixels, as well as custom methods that make harmonization easy.

To demonstrate, we will pull the imagery from the Landsat 8 dataset with and without the QA process and compare.

In [None]:
lc8_noqa = hf.Landsat8(region,start_time,end_time,use_qa=False)
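
To give a feel for what QA masking does, here is a minimal pure-Python sketch of masking pixels from a bit-packed quality band. The bit positions (`CLOUD_BIT`, `SHADOW_BIT`) and the helper names are assumptions made for illustration; they are not taken from the `hydrafloods` `qa()` implementation, and the real bit layout depends on the sensor and collection.

```python
# Hypothetical bit positions in a bit-packed QA value (illustrative only).
CLOUD_BIT = 3
SHADOW_BIT = 4


def is_clear(qa_value, bad_bits=(CLOUD_BIT, SHADOW_BIT)):
    """Return True when none of the 'bad' bits are set in the QA value."""
    return all((qa_value >> bit) & 1 == 0 for bit in bad_bits)


def apply_qa(values, qa_values, nodata=None):
    """Replace pixel values flagged by the QA band with a no-data marker."""
    return [v if is_clear(q) else nodata for v, q in zip(values, qa_values)]


# three pixels: clear, cloud flagged, shadow flagged
reflectance = [1200, 5400, 300]
qa_band = [0, 1 << CLOUD_BIT, 1 << SHADOW_BIT]
masked = apply_qa(reflectance, qa_band)
print(masked)
```

Only the first pixel survives the mask, which mirrors the difference you should see between the QA and No QA layers on the map.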

In [None]:
first_qa = lc8.collection.first()
first_noqa = lc8_noqa.collection.first()

In [None]:
first_qa.bandNames().getInfo()

In [None]:
optical_vis = {
    "min":50,
    "max":5500,
    "bands":"swir2,nir,green",
    "gamma":1.5,
}


In [None]:
Map = geemap.Map(center=(15.5754, -89.8297), zoom=8)

Map.addLayer(first_qa,optical_vis, 'Landsat 8 (QA)')
Map.addLayer(first_noqa,optical_vis, 'Landsat 8 (No QA)')


Map.addLayerControl()
Map

Again, the quality masking is turned on by default for *all* data and helps with quickly accessing and processing data.

One last note, the optical sensor bands are automatically renamed to a common scheme so that they can be used together easily.
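
As a rough illustration of why a common naming scheme helps, the sketch below renames sensor-specific band IDs to shared names so the same code can address either sensor. The lookup tables use the standard Landsat 8 and Sentinel 2 band IDs; the `harmonize` helper is hypothetical and the actual mapping inside `hydrafloods` may differ.

```python
# Sensor band IDs mapped to shared names (illustrative tables, not the hydrafloods source).
LANDSAT8_TO_COMMON = {
    "B2": "blue", "B3": "green", "B4": "red",
    "B5": "nir", "B6": "swir1", "B7": "swir2",
}
SENTINEL2_TO_COMMON = {
    "B2": "blue", "B3": "green", "B4": "red",
    "B8": "nir", "B11": "swir1", "B12": "swir2",
}


def harmonize(pixel, mapping):
    """Rename the bands of one pixel (a dict of band ID -> value) to common names."""
    return {mapping[band]: value for band, value in pixel.items() if band in mapping}


# hypothetical reflectance values for one pixel from each sensor
l8_pixel = harmonize({"B3": 0.06, "B6": 0.01}, LANDSAT8_TO_COMMON)
s2_pixel = harmonize({"B3": 0.07, "B11": 0.02}, SENTINEL2_TO_COMMON)

# both pixels can now be addressed with the same band names
print(sorted(l8_pixel), sorted(s2_pixel))
```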

## Optical surface water mapping
Optical data has a long history of being used for surface water mapping and is used for long-term studies (e.g. [Pekel et al., 2016](https://www.nature.com/articles/nature20584), [Donchyts et al., 2016](https://www.nature.com/articles/nclimate3111)). With optical imagery, there are a few steps usually completed for accurate surface water maps:

1. Acquire data
2. Calibration/georegistration
3. Atmospheric correction
4. Cloud/shadow masking
5. Terrain Correction (optional)
6. Calculate water index
7. Map water

Earth Engine has taken care of steps 1-3 for some optical datasets, so we can directly access analysis-ready surface reflectance data. The subsequent steps are fairly general, and there are multiple paths we can take to achieve the goal of surface water maps.

Here we are going to get a Landsat 8 dataset again for Guatemala and create monthly water maps.

In [None]:
region = hf.country_bbox("Guatemala")
start_time = "2019-01-01"
end_time = "2019-06-01"

# get a Landsat 8 collection
lc8 = hf.Landsat8(region,start_time,end_time)

To classify water within optical imagery, it is common practice to calculate a water index. There are many water indices and selection of an index should be based on your use case; here is a paper describing and comparing common indices: https://doi.org/10.3390/w9040256

For this case we will use the modified normalized difference water index (MNDWI; [Xu, 2006](https://doi.org/10.1080/01431160600589179)). In `hydrafloods` the water index functions are named after their common abbreviations, so we can simply call it.

In [None]:
# calculate water index
# here we calculate the modified normalized water index
water_index = lc8.apply_func(hf.mndwi)
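
For reference, MNDWI is the normalized difference between the green and shortwave infrared (SWIR1) bands. The sketch below computes it for single pixels in plain Python; the reflectance values are made up for illustration and the function is a stand-in, not the `hf.mndwi` implementation.

```python
def mndwi(green, swir1):
    """Modified normalized difference water index for a single pixel."""
    denom = green + swir1
    if denom == 0:
        # avoid dividing by zero for empty pixels
        return 0.0
    return (green - swir1) / denom


# hypothetical surface reflectance values as (green, swir1)
pixels = {"open water": (0.06, 0.01), "vegetation": (0.05, 0.20)}
for name, (green, swir1) in pixels.items():
    print(name, round(mndwi(green, swir1), 3))
```

Water absorbs strongly in the shortwave infrared, so water pixels tend toward positive values while vegetation and soil tend toward negative values.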

In [None]:
first_img = lc8.collection.first()
first_mndwi = water_index.collection.first()

In [None]:
wi_vis ={
    "bands":"mndwi",
    "min": -0.5,
    "max":0.5,
    "palette": cm.palettes.Blues
}

In [None]:
Map = geemap.Map(center=(15.5754, -89.8297), zoom=8)

Map.addLayer(first_img,optical_vis, 'Landsat 8')
Map.addLayer(first_mndwi,wi_vis, 'Landsat 8 MNDWI')


Map.addLayerControl()
Map

In [None]:
water = water_index.apply_func(hf.edge_otsu, initial_threshold=0.0, edge_buffer=300, scale=150, invert=True,thresh_no_data=0.0)
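
The name `edge_otsu` refers to Otsu's thresholding method, which picks the threshold that best separates a histogram into two classes. Below is a minimal sketch of plain Otsu thresholding on a list of index values. It leaves out the edge sampling and buffering that the call above configures, and it should not be read as the `hydrafloods` implementation; the sample values are invented.

```python
def otsu_threshold(values, n_bins=64):
    """Return the threshold that maximizes between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / n_bins

    # build the histogram and the center value of each bin
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]

    total = len(values)
    total_sum = sum(c * m for c, m in zip(counts, centers))
    best_threshold, best_score = lo, -1.0
    w0, sum0 = 0, 0.0
    for i in range(n_bins - 1):
        w0 += counts[i]
        sum0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        # between-class variance (up to a constant factor)
        score = w0 * w1 * (sum0 / w0 - (total_sum - sum0) / w1) ** 2
        if score > best_score:
            best_score, best_threshold = score, lo + (i + 1) * width
    return best_threshold


# hypothetical MNDWI samples: land clusters near -0.4, water near 0.5
samples = [-0.45, -0.40, -0.35, -0.42, -0.38, 0.45, 0.50, 0.55, 0.52]
threshold = otsu_threshold(samples)
water = [v > threshold for v in samples]
```

For MNDWI, values above the threshold are treated as water in this sketch.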

In [None]:
first_water = water.collection.first()

In [None]:
Map = geemap.Map(center=(15.5754, -89.8297), zoom=8)

Map.addLayer(first_img,optical_vis, 'Landsat 8')
Map.addLayer(first_mndwi,wi_vis, 'Landsat 8 MNDWI')
Map.addLayer(first_water.selfMask(),{"min":0,"max":1,"palette":cm.palettes.Blues}, 'Landsat 8 Water')

Map.addLayerControl()
Map

In [None]:
monthly_mosaics = lc8.aggregate_time(
    dates=[f"2019-{i:02d}-01" for i in range(1,7)], # define times for 1st of every month in collection
    period_unit="month", # specify that the aggregation should be 1 month
    reducer=ee.Reducer.median() # reduce to the median observation per pixel
)

In [None]:
monthly_water = water.aggregate_time(
    dates=[f"2019-{i:02d}-01" for i in range(1,7)], # define times for 1st of every month in collection
    period_unit="month", # specify that the aggregation should be 1 month
    reducer=ee.Reducer.mode() # reduce to the most common (mode) observation per pixel
)
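
To make the aggregation arguments concrete, the sketch below shows the list of dates produced by the comprehension and what a per-pixel mode over each month looks like. The observations are invented for a single pixel and the `monthly_mode` helper is hypothetical; it only mirrors the idea behind `aggregate_time`, not its implementation.

```python
from collections import defaultdict
from statistics import mode

# the same start dates passed to aggregate_time above
dates = [f"2019-{i:02d}-01" for i in range(1, 7)]
print(dates)

# hypothetical water (1) / not-water (0) labels for one pixel
observations = [
    ("2019-01-03", 1), ("2019-01-19", 1), ("2019-01-27", 0),
    ("2019-02-04", 0), ("2019-02-20", 0),
]


def monthly_mode(obs):
    """Group observations by year-month and keep the most common label."""
    by_month = defaultdict(list)
    for date, label in obs:
        by_month[date[:7]].append(label)
    return {month: mode(labels) for month, labels in sorted(by_month.items())}


monthly = monthly_mode(observations)
print(monthly)
```

The mode suits categorical water maps because it keeps the label seen most often, whereas the median used for the reflectance mosaics suits continuous values.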

In [None]:
monthly_water.dates

In [None]:
Map = geemap.Map(center=(15.5754, -89.8297), zoom=8)

Map.addLayer(monthly_mosaics.collection.first(), optical_vis, 'Jan. Landsat 8 Mosaic')
Map.addLayer(monthly_water.collection.first().selfMask(),{"min":0,"max":1,"palette":cm.palettes.Blues}, 'Jan. Landsat 8 Water')

Map.addLayerControl()
Map

## SAR surface water mapping
Synthetic Aperture Radar (SAR) data is often used to map surface water because of its ability to observe the land surface even in the presence of clouds. As with optical imagery, there are a few steps for accurate surface water maps from SAR imagery, including:

1. Acquire data
2. Calibration/georegistration
3. Terrain Correction
4. Speckle Filter
5. Map water

Earth Engine has taken care of steps 1-2, so we can directly access analysis-ready data. The subsequent steps are fairly general, and there are multiple algorithms that can achieve each step to create high quality water maps from SAR. We will explore one workflow implemented with the hydrafloods package.

In [None]:
# define a location geometry
region = hf.country_bbox("Guatemala")

# define time period
start_time = "2020-11-04"
end_time = "2020-11-05"

# get the Sentinel 1 collection as a hydrafloods Dataset
s1 = hf.Sentinel1(region,start_time,end_time)

In [None]:
# inspect the dataset object
s1

In [None]:
# print how many images we have for our specified time and location
s1.n_images

In [None]:
# get the imagery acquisition times
s1.dates

In [None]:
merit = ee.Image("MERIT/Hydro/v1_0_1")

# extract out the DEM and HAND bands
dem = merit.select("elv").unmask(0)
hand = merit.select("hnd").unmask(0)

In [None]:
# apply a (pseudo-) terrain flattening algorithm to S1 data
s1_flat = s1.apply_func(hf.slope_correction, elevation = dem, buffer = 50)

In [None]:
# apply a speckle filter algorithm to S1 data
s1_filtered = s1_flat.apply_func(hf.gamma_map)

# aggregate SAR observations to 30x30 m pixels
s1_aggregated = s1_filtered.apply_func(lambda x: x.focal_mean(1.5).reproject(ee.Projection("EPSG:4326").atScale(30)))

In [None]:
# view the results of SAR preprocessing
Map = geemap.Map(center=(16.0029, -90.5109), zoom=12)

Map.addLayer(s1.collection.median(),{"bands": "VV", "min":-25, "max": 0}, 'Sentinel 1')
Map.addLayer(s1_flat.collection.median(),{"bands": "VV", "min":-25, "max": 0}, 'Sentinel 1 (terrain flattened)')
Map.addLayer(s1_aggregated.collection.median(),{"bands": "VV", "min":-25, "max": 0}, 'Sentinel 1 (speckle filtered)')

Map.addLayerControl()
Map

In [None]:
# apply a water thresholding algorithm to the collection
# method from Markert et al., 2020 (https://doi.org/10.3390/rs12152469)
water = s1_filtered.apply_func(hf.edge_otsu,initial_threshold=-14,band="VV",edge_buffer=300,scale=180)

In [None]:
water_hand = water.collection.mode().And(hand.lt(20))
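
The HAND (height above nearest drainage) mask above keeps water pixels only where the terrain sits less than 20 m above the nearest drainage, which removes false detections on high ground. Here is a small pure-Python sketch of the same logical AND on toy grids; the grid values and the `mask_with_hand` helper are made up for illustration.

```python
def mask_with_hand(water, hand, max_hand=20):
    """Keep water pixels (1) only where HAND is below max_hand (metres)."""
    return [
        [int(w == 1 and h < max_hand) for w, h in zip(water_row, hand_row)]
        for water_row, hand_row in zip(water, hand)
    ]


# hypothetical 2x3 water map and HAND values in metres
water_map = [[1, 1, 0],
             [0, 1, 1]]
hand_map = [[2, 35, 1],
            [50, 5, 19]]

masked_water = mask_with_hand(water_map, hand_map)
print(masked_water)
```

The pixel classified as water at 35 m above drainage is dropped, while the low-lying water pixels are kept.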

In [None]:
# view the results of SAR water mapping
Map = geemap.Map(center=(16.0029, -90.5109), zoom=12)

Map.addLayer(s1.collection.median(),{"bands": "VV", "min":-25, "max": 0}, 'Sentinel 1')
Map.addLayer(water.collection.mode(),{"min":0,"max":1,"palette":cm.palettes.Blues}, "Sentinel 1 (water)")
Map.addLayer(water_hand,{"min":0,"max":1,"palette":cm.palettes.Blues}, "Sentinel 1 (water hand masked)")


Map.addLayerControl()
Map

In [None]:
hf.export_image(water.collection.mode(), region, scale=30, crs='EPSG:4326', pyramiding={".default":"mode"}, export_type='toDrive')


## Advanced water mapping

What we have covered up to this point has been relatively straightforward water mapping using adaptive thresholding techniques. These are efficient and produce adequate results; however, they are not always the most accurate or appropriate. Here we will cover more advanced water mapping techniques.


### ML based water mapping

Some studies have successfully employed machine learning workflows for surface water mapping such as [Huang et al., 2018](https://doi.org/10.3390/rs10050797). There are ML based workflows implemented in `hydrafloods` to make the processing easier for users. In this case we will implement methods from [Cordeiro et al., 2021](https://doi.org/10.1016/j.rse.2020.112209) but for Sentinel 1 imagery.

First we will create a stack of images with multiple bands that we will use as input features for the ML model.

In [None]:
# use add_indices to calculate multiple indices at once efficiently
s1_multiband = s1_filtered.apply_func(hf.add_indices,indices=["vv_vh_ratio","ndpi"])

In [None]:
# reduce the multiple images into one as an input
input_img = s1_multiband.collection.mosaic()

Here we apply the `multidim_semisupervised` algorithm. It takes the input bands and attempts to find *k* classes that describe a sample of pixels. The *k* classes are then ordered by a ranking band (set with `rank_band`). The class with the lowest centroid value in the ranking band (i.e. `ranking='min'`) is considered water, and all other classes are considered not-water. Finally, a generalization model is applied to find the probability that each pixel fits within the designated water class.

In [None]:
# apply an advanced water mapping algorithm to automatically calculate water probability
water_proba = hf.multidim_semisupervised(
    first_qa,
    bands = ["green","red","nir","swir2"],
    rank_band="swir2",
    ranking='min',
    # region=region,
    n_samples=2500,
    seed=7,
    scale=120,
)
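
The clustering and ranking idea described above can be sketched in plain Python. This is a toy k-means under stated assumptions: the samples are invented two-band values, the centroids start at evenly spaced samples along the ranking band, `k` must be at least 2, and the final generalization (probability) step is left out. It is not the `hf.multidim_semisupervised` implementation.

```python
def nearest(sample, centroids):
    """Index of the centroid closest to the sample (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(sample, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(samples, k, rank_idx=0, n_iter=10):
    """Tiny k-means (k >= 2) with deterministic starting centroids."""
    ordered = sorted(samples, key=lambda s: s[rank_idx])
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for s in samples:
            clusters[nearest(s, centroids)].append(s)
        # recompute each centroid as the mean of its members
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids


# hypothetical samples with two bands; band 0 acts as the ranking band
samples = [(-20.0, -27.0), (-21.0, -28.0), (-19.0, -26.0),
           (-8.0, -15.0), (-9.0, -14.0), (-7.0, -16.0)]

centroids = kmeans(samples, k=2)
# ranking='min': the class with the lowest centroid in the ranking band is water
water_class = min(range(len(centroids)), key=lambda i: centroids[i][0])
is_water = [nearest(s, centroids) == water_class for s in samples]
print(is_water)
```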

In [None]:
# view the results of SAR water mapping
Map = geemap.Map(center=(16.0029, -90.5109), zoom=12)

Map.addLayer(s1.collection.median(),{"bands": "VV", "min":-25, "max": 0}, 'Sentinel 1')
Map.addLayer(water_proba,{"min":0,"max":1,"palette":cm.palettes.inferno}, "Sentinel 1 (water proba)")


Map.addLayerControl()
Map

### Deep learning for water mapping

---

🚨 WARNING! 🚨 This section requires you to be part of the "*servir-ee-tf*" group to use the following deep learning model.

---

Going one step further, some studies have successfully employed deep learning for satellite image classification (e.g. [Hughes & Kennedy, 2019](https://doi.org/10.3390/rs11212591)). In the referenced paper, the authors trained a fully convolutional neural network (FCNN) to predict quality classes in Landsat imagery. We have implemented a similar network to be used with Earth Engine where the model will predict Cloud, Shadow, Snow, Water, Clear, and No Data classes. We can then extract the water class and use this as a water map.

This model is specifically for Landsat imagery. Deep learning models are sensitive to data inputs so use with caution for other sensors. Methods for this particular model are described in the following presentation: https://docs.google.com/presentation/d/1LOVJGxa_7bXfq2QCfrXSAvmE_C6rdSGDi4peamKNjxU/edit?usp=sharing

Here we will connect to the hosted model from Earth Engine. We define the cloud project, model name, model version, and a few more parameters then call `ee.Model.fromAiPlatformPredictor()`.

In [None]:
# set some ee.Model parameters
PROJECT = 'ee-demos'
MODEL_NAME = 'kel_cloud_model'
VERSION_NAME = 'eeified_lsqa_vgg19unet_weighted'
INSHAPES = ee.Dictionary({"qa":[6]})

# load the trained model and use it for prediction
model = ee.Model.fromAiPlatformPredictor(
    projectName= PROJECT,
    modelName= MODEL_NAME,
    version= VERSION_NAME,
    inputTileSize= [144,144],
    inputOverlapSize= [8,8],
    inputShapes= INSHAPES,
    proj= ee.Projection('EPSG:4326').atScale(30),
    fixInputProj= True,
    outputBands= {'qa': {
        'type': ee.PixelType.float(),
        'dimensions': 1
      }
    }
)

Now that we have our model and imagery, we can apply the prediction. Note that we use arrays to transfer data. These arrays need to be formatted in the exact way that is required for the model predictions. In this case it is height, width, bands.

In [None]:
# apply prediction to image
# rescale data to 0-1 and convert to Float Array for prediction
predictions = model.predictImage(
  first_noqa.multiply(0.0001).toFloat().toArray()
)

The output, `predictions`, is also an image formatted as an array where we can do some fun stuff.

For starters, we can get the probability of each individual pixel belonging to a class:

In [None]:
# flatten predictions to class probabilities
pred_probs = (
    predictions
    .arrayFlatten([['cloud','shadow','snow','water','land','nodata']])
    .toFloat()
)

We can also calculate which class has the highest probability and convert to a classified image:

In [None]:
# flatten predictions to highest probability class
pred_classes = (
    predictions
    .arrayArgmax()
    .arrayFlatten([['qa']])
)

# mask land pixels so we are left with the interesting classes
pred_classes = pred_classes.updateMask(pred_classes.neq(4))
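
The argmax and masking steps above can be illustrated on a few pixels in plain Python. The class order matches the names used in `arrayFlatten`; the probability values are invented for illustration.

```python
CLASS_NAMES = ['cloud', 'shadow', 'snow', 'water', 'land', 'nodata']
LAND_INDEX = CLASS_NAMES.index('land')


def classify(probs):
    """Index of the class with the highest probability for one pixel."""
    return max(range(len(probs)), key=lambda i: probs[i])


# hypothetical class probabilities for three pixels
pixel_probs = [
    [0.05, 0.05, 0.00, 0.80, 0.10, 0.00],  # mostly water
    [0.10, 0.00, 0.00, 0.10, 0.80, 0.00],  # mostly land
    [0.70, 0.20, 0.00, 0.05, 0.05, 0.00],  # mostly cloud
]

classes = [classify(p) for p in pixel_probs]
# mask land pixels (None) so only the other classes remain
masked_classes = [c if c != LAND_INDEX else None for c in classes]
print([CLASS_NAMES[c] for c in classes], masked_classes)
```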

In [None]:
prob_vis = {"bands":"water","min":0,"max":1}

class_vis = {
    "min":0,
    "max":5,
    "palette":'#ecf0f1,#7f8c8d,#00FFFF,#0000FF,#27ae60,#000000'
}

In [None]:
Map = geemap.Map(center=(15.7962, -87.7849), zoom=12)

Map.addLayer(first_noqa,optical_vis, "Landsat 8 Image")
Map.addLayer(pred_probs,prob_vis,"Water Probability")
Map.addLayer(pred_classes,class_vis,"QA classes")

Map.addLayerControl()
Map

Deep learning could easily fill an entire week-long training by itself; this section only serves as an example of how we can use deep learning for surface water mapping.

An example notebook for sampling, building/training, and deploying your own deep learning model can be found here: https://github.com/gee-community/ee-tensorflow-notebooks/blob/master/landsat_qa_cnn/lc8_ee_qa_unet.ipynb