<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"> <b>Exploring SAR Time Series Data for Flood Monitoring</b></font>

<br>
<font size="4"> <b> Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, <a href="http://earthbigdata.com/" target="_blank">Earth Big Data, LLC</a> </b> <br>
<img style="padding:7px;" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" /></font>

<font size="3">This notebook introduces you to the time series signatures associated with flooding. The data analysis is done in the framework of *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks combine executable Python code with text written in Markdown, including LaTeX-style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis. <br>

<b>This notebook covers the following data analysis concepts:</b>

- How to load time series stacks into Jupyter Notebooks and how to explore image content using basic functions such as mean value calculation and histogram analysis.
- How to extract time series information for individual pixels of an image.
- Typical time series signatures over flooded areas.
</font>


</font>

<hr>
<font face="Calibri" size="5" color='rgba(200,0,0,0.2)'> <b>Important Notes about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>
<br><br>
</font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:

<ol type="1">
    <li> <b><a href="https://pandas.pydata.org/" target="_blank">Pandas</a></b> is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as the time-series functionality. </li>
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li>
    <li> <b><a href="https://matplotlib.org/index.html" target="_blank">Matplotlib</a></b> is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib. </li>
</ol>
</font>

In [None]:
# Check Python version:
import sys
pn = sys.version_info[0]

import os # for chdir, getcwd, path.basename, path.exists

import pandas as pd # for DatetimeIndex
from osgeo import gdal # for GetRasterBand, Open, ReadAsArray
import numpy as np #for log10, mean, percentile, power
import matplotlib.pylab as plb # for add_patch, add_subplot, figure, hist, imshow, set_title, xaxis,_label, text 
import matplotlib.pyplot as plt # for add_subplot, axis, figure, imshow, legend, plot, set_axis_off, set_data,
                                # set_title, set_xlabel, set_ylabel, set_ylim, subplots, title, twinx
import matplotlib.patches as patches  # for Rectangle
import matplotlib.animation as an # for FuncAnimation
from matplotlib import rc 

from asf_notebook import path_exists
from asf_notebook import asf_unzip
from asf_notebook import new_directory

from IPython.display import HTML
plt.rcParams.update({'font.size': 12})

import rasterio #pip install rasterio (version >=1.0.8, requires GDAL >=2.3.1)
from rasterio.windows import Window
import pyproj
import copy
from datetime import datetime
from glob import glob
import subprocess, sys
import ipywidgets as widgets #pip install ipywidgets (included with Jupyter)
from ipywidgets import Layout, VBox, Label, Checkbox, GridBox
if pn == 2:
    import cStringIO #needed for the image checkboxes
elif pn == 3:
    import io
    import base64
import plotly #pip install plotly (version >= 3.0)
import plotly.graph_objs as go
from ipywidgets import interactive, HBox, VBox
from mpldatacursor import datacursor
# For exporting:
from PIL import Image

<font face="Calibri" size="3"><b>Set up matplotlib plotting inside the notebook:</b></font>

In [None]:
%matplotlib notebook

<hr>
<font face="Calibri">

<font size="5"> <b> 1. Load Data Stack</b> </font> <img src="https://www.elcomercio.com/files/article_main/uploads/2017/03/29/58dc6b8cd9665.jpeg" width="350" style="padding:5px;" align="right" /> 

<font size="3"> This notebook will be using a Sentinel-1 data stack (VV only) north of Guayaquil. 

Ecuador and other countries in western South America experienced widespread <b>flooding</b> during the 2016-2017 winter (see picture by Bolívar Velasco/EL COMERCIO). <a href='https://www.eluniverso.com/noticias/2017/03/01/nota/6068059/arboles-se-caen-medio-torrencial-lluvia-ayer'>Guayaquil in Guayas</a> was among the affected regions, as precipitation in March 2017 was well above average. The increased precipitation was associated with a Coastal Niño event.
<br><br>
We will study the extent and progression of the flooding in the year 2016-2017 just north of Guayaquil.
</font></font>



<font face="Calibri" size="3">Before we get started, let's first <b>create a working directory for this analysis and change into it:</b> </font>

In [None]:
name="flood"
path = "/home/jovyan/notebooks/SAR_Training/English/Hazards/" + name
new_directory(path)
os.chdir(path)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3">We will <b>retrieve the relevant data</b> from an <a href="https://aws.amazon.com/" target="_blank">Amazon Web Services (AWS)</a> cloud storage bucket <b>using the following command</b>:</font>

In [None]:
time_series_path = 's3://asf-jupyter-data/flood.tar.gz'
time_series = os.path.basename(time_series_path)
!aws s3 cp $time_series_path $time_series

<font face="Calibri" size="3"> Now, let's <b>extract the archive (overwriting previous extractions):</b> </font>

In [None]:
!tar -xvzf {name}.tar.gz

<br>
<font face="Calibri" size="5"> <b> 2. Define Data Directory and Path to VRT </b> </font> 
<br><br>
<font face="Calibri" size="3"><b>Create a variable containing the VRT filename and the image acquisition dates:</b></font>

In [None]:
polarization = 'VV'
date_file = f"dates{name}_{polarization}.csv"
image_file = f"stack{name}_{polarization}.vrt"

<font face="Calibri" size="3"><b>Create a Pandas time index and print the dates:</b></font>

In [None]:
time_index = pd.DatetimeIndex(open(date_file).read().split(','))

for jacqdate, acqdate in enumerate(time_index):
    print('{:4d} {}'.format(jacqdate, acqdate.date()),end=' ')
    if (jacqdate % 5 == 4): print()

<br>
<hr>
<font face="Calibri" size="5"> <b> 3. Data exploration with an animation </b> </font> 


<font face="Calibri" size="3"><b>Read the data</b></font>

In [None]:
img = gdal.Open(image_file)
band = img.GetRasterBand(1)
raster0 = band.ReadAsArray()
band_number = 0 # Needed for updates
rasterstack = img.ReadAsArray()


<font face="Calibri" size="3">Before analyzing the data, decide whether to use <b>linear or logarithmic scaling</b></font>

In [None]:
use_dB = True

def convert(raster, use_dB=use_dB):
    # some Python trickery: 
    # if you call the convert function later, you can set the keyword 
    # argument use_dB to True or False
    # if you do not provide a keyword argument, the value that you set
    # above (when defining the function) is used
    if use_dB:
        return 10 * np.log10(raster)
    else:
        return raster
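
<font face="Calibri" size="3">To see what the logarithmic scaling does, here is a small sketch on made-up values. It assumes the input is in linear power scale, as the factor of 10 in <i>convert</i> implies; the array <i>linear</i> is hypothetical:</font>

```python
import numpy as np

# Hypothetical backscatter values in linear power scale
linear = np.array([1.0, 0.1, 0.01])

# Linear power -> decibels
in_db = 10 * np.log10(linear)

# Decibels -> linear power (the inverse operation)
back = np.power(10.0, in_db / 10.0)

print(in_db)
print(back)
```

<font face="Calibri" size="3">Each factor of 10 in linear power corresponds to a step of 10 dB, which is why dark (low-backscatter) surfaces such as open water are easier to separate visually in the dB scale.</font>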


<font face="Calibri" size="3"> Let's create an <b>animation</b> to get an idea of where and when flooding might have occurred.</font>

In [None]:
%%capture 
figani = plt.figure(figsize=(10, 5))
axani = figani.subplots()
axani.axis('off')

rasterstack_ = convert(rasterstack)

imani = axani.imshow(rasterstack_[0,...], cmap='gray', vmin=np.nanpercentile(rasterstack_, 1), 
               vmax=np.nanpercentile(rasterstack_, 99))
axani.set_title("{}".format(time_index[0].date()))

def animate(i):
    axani.set_title("{}".format(time_index[i].date()))
    imani.set_data(rasterstack_[i,...])

# Interval is given in milliseconds
ani = an.FuncAnimation(figani, animate, frames=rasterstack_.shape[0], interval=300)
rc('animation', embed_limit=40971520.0)  # Raise the embed limit so that the entire animation can be shown

<font face="Calibri" size="3"><b>Render</b></font>

In [None]:
HTML(ani.to_jshtml())

<br>
<div class="alert alert-success">
<font face="Calibri" size="5"> <b> <font color='rgba(200,0,0,0.2)'> <u>EXERCISE</u>:  </font> Backscatter dynamics</b> </font>

<font face="Calibri" size="3"> What is the most striking change in February 2017? Can you explain why the backscatter changes the way it does during that period?
</font>
</div>


<hr>
<br>
<font face="Calibri" size="5"> <b> 4. Create Minimum Image to Identify Inundated Areas </b> </font>
<br><br>
<font face="Calibri" size="3"> As flooding is often associated with very low backscatter, we first compute the minimum backscatter for each pixel to get a first impression of areas that could have been flooded at any time during the observation period. </font>


<font face="Calibri" size="3"> The following line <b>calculates the minimum backscatter per pixel</b> across the time series: </font>

In [None]:
temporal_min = np.nanmin(convert(rasterstack), axis=0)
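
<font face="Calibri" size="3">The same reduction can be checked on a tiny synthetic stack. This is only a sketch: <i>demo_stack</i> is a made-up array with the assumed layout (time, rows, columns), and the call to <i>np.nanargmin</i> is an addition that reports in which scene the minimum occurred:</font>

```python
import numpy as np

# Hypothetical stack of 3 acquisitions, each 2 x 2 pixels (linear scale)
demo_stack = np.array([
    [[0.20, 0.30], [0.25, np.nan]],
    [[0.02, 0.28], [0.24, 0.31]],
    [[0.18, 0.29], [0.03, 0.30]],
])

# Per-pixel minimum over time; NaN values are ignored
demo_min = np.nanmin(demo_stack, axis=0)

# Index of the acquisition in which each pixel was darkest
demo_when = np.nanargmin(demo_stack, axis=0)

print(demo_min)
print(demo_when)
```

<font face="Calibri" size="3">Pixels with a very low minimum that occurs in only a few scenes are candidates for temporary inundation, whereas pixels that are dark in every scene are more likely permanent water.</font>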

<br>
<font face="Calibri" size="4"> <b> 4.2 Visualize the Minimum Image with Cursor Information Included </b> </font>

<font face="Calibri" size="3"> We will now visualize the minimum image so that moving the mouse over the image displays the line/sample image coordinates. This will help us create time-series information for the most interesting image locations. 
    
To do so, we <b>first create some helper functions:</b>
</font> 

In [None]:
def gray_plot(image, vmin=None, vmax=None, fig=None, return_ax=False):
    '''Plots an image in grayscale.
    
    Parameters:
    - image: 2D array of raster values
    - vmin: Minimum value for colormap (default: 1st percentile)
    - vmax: Maximum value for colormap (default: 99th percentile)
    - fig: Figure to draw into (a new figure is created if None)
    - return_ax: Option to return plot axis

    '''
    if vmin is None:
        vmin = np.nanpercentile(image, 1)
    if vmax is None:
        vmax = np.nanpercentile(image, 99)
    if fig is None:
        fig = plt.figure()
    
    ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
    ax.imshow(image, cmap=plt.cm.gist_gray, vmin=vmin, vmax=vmax)

    if return_ax:
        return ax
    
###############################################################################

def big_fig(x=20, y=10):
    '''Initializes a large figure.
    
    Parameters:
    - x, y: X and Y figure dimensions
    '''
    return(plt.figure(figsize=(x, y)))

<font face="Calibri" size="3"> Now we are ready to plot the minimum image. Please note the <b>cursor location information on the bottom right of the figure.</b> Use this plot to identify points of interest for which you want to analyze radar brightness over time: </font>

In [None]:
# Large plot of the multi-temporal minimum of VV values to inspect pixel values
fig_xsize = 8
fig_ysize = 8
fig = big_fig(fig_xsize, fig_ysize)
gray_plot(temporal_min, fig=fig)

<br>
<font face="Calibri" size="5"> <b> 5. Plot SAR Brightness Time Series at Point Locations </b> </font>

<font face="Calibri" size="4"> <b> 5.1 SAR Brightness Time Series at Point Locations </b> </font>

<font face="Calibri" size="3"> We will pick a pixel location identified in the SAR image above and plot the time series for this identified point. By focusing on image locations that were inundated, we should see the changes in the radar cross section related to the flood event.
    
First, for processing of the imagery in this notebook, we open an image handle and retrieve projection and georeferencing information. We also define a function for mapping image pixels to a geographic projection:</font> 

In [None]:
img_handle = gdal.Open(image_file)
geotrans = img_handle.GetGeoTransform()
proj = img_handle.GetProjection()
xsize = img_handle.RasterXSize
ysize = img_handle.RasterYSize
bands = img_handle.RasterCount
projlatlon = pyproj.Proj('+init=EPSG:4326') # WGS84
projstring = proj.split('[')[-1][:-2].split(',')[-1][1:-1]
projimg = pyproj.Proj(f'+init=EPSG:{projstring}')

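def geolocation(x, y=None, latlon=True):
    # Accept either geolocation((x, y)) or geolocation(x, y)
    if y is None:
        x, y = x
    # Map the pixel location to image coordinates using the geotransform
    ref_x = geotrans[0] + x * geotrans[1]
    ref_y = geotrans[3] + y * geotrans[5]
    if latlon:
        # Transform from the image projection to WGS84
        ref_y, ref_x = pyproj.transform(projimg, projlatlon, ref_x, ref_y)
    return (ref_x, ref_y)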
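
<font face="Calibri" size="3">The pixel-to-map step can be illustrated without GDAL. The sketch below uses a made-up six-element geotransform in GDAL's ordering; <i>demo_geotrans</i> and <i>pixel_to_map</i> are hypothetical names, and the rotation terms (zero for north-up images) are included for completeness:</font>

```python
# Hypothetical GDAL-style geotransform:
# (x of upper-left corner, pixel width, row rotation,
#  y of upper-left corner, column rotation, pixel height)
demo_geotrans = (600000.0, 30.0, 0.0, 9800000.0, 0.0, -30.0)

def pixel_to_map(col, row, gt):
    """Return map coordinates of the upper-left corner of pixel (col, row)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

demo_x, demo_y = pixel_to_map(670, 740, demo_geotrans)
print(demo_x, demo_y)
```

<font face="Calibri" size="3">The pixel height is negative because row numbers increase downward while northing increases upward.</font>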

<font face="Calibri" size="3"> Now, let's <b>pick a rectangle around a center pixel defined in variable <i>sarloc</i></b>...</font>

In [None]:
sarloc=(1120, 180)
sarloc=(670, 740) #Interesting Site

extent = (5, 5) # choose a 5 by 5 rectangle
latlon = True # if False: return utm coordinates

refsarloc = geolocation(sarloc, latlon=latlon)
projsymbol = '°' if latlon else 'm'

<font face="Calibri" size="3">... and <b>extract the time series</b> for this small area around the selected center pixel in a memory-efficient way (needed for larger stacks):</font> 

In [None]:
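bs_aggregated = []
# Upper-left corner of a window centered on sarloc
xoff = sarloc[0] - extent[0] // 2
yoff = sarloc[1] - extent[1] // 2
for band in range(bands):
    # Read only the small window from each band instead of the full image
    rs = img_handle.GetRasterBand(band+1).ReadAsArray(xoff, yoff, extent[0], extent[1])
    # Average in linear scale first, then convert
    rs_mean = convert(np.nanmean(rs))
    bs_aggregated.append(rs_mean)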

fig, ax = plt.subplots(1, 1, figsize=(8, 4))
labeldB = 'dB' if use_dB else 'linear'
ax.plot(time_index, bs_aggregated, color='k', marker='o', markersize=3)
ax.set_xlabel('Date')
# Raw f-string so that \gamma is not treated as an escape sequence
ax.set_ylabel(rf'Sentinel-1 $\gamma^0$ [{labeldB}]')

plt.grid()
_ = fig.suptitle(f'Location: {refsarloc[0]:.3f}{projsymbol} {refsarloc[1]:.3f}{projsymbol}')
# fig.tight_layout() 
figname = "RCSTimeSeries-" + f'{refsarloc[0]:.3f}{projsymbol} {refsarloc[1]:.3f}{projsymbol}' + '.png'
plt.savefig(figname, dpi=300, transparent=True)
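
<font face="Calibri" size="3">The windowed averaging can also be tried on a synthetic stack held in memory. This is a sketch under stated assumptions: <i>demo_stack</i> is random made-up data with layout (time, rows, columns), the window is centered on the chosen pixel, and one scene is darkened artificially to mimic open water:</font>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stack: 4 acquisitions of 20 x 20 pixels in linear scale
demo_stack = rng.uniform(0.05, 0.3, size=(4, 20, 20))
demo_stack[2, 8:13, 8:13] = 0.01  # simulate low backscatter in the third scene

col, row = 10, 10        # center pixel (sample, line)
width, height = 5, 5     # window size

# Upper-left corner of the window centered on (col, row)
c0, r0 = col - width // 2, row - height // 2
window = demo_stack[:, r0:r0 + height, c0:c0 + width]

# Average in linear scale per acquisition, then convert to dB
series_db = 10 * np.log10(np.nanmean(window, axis=(1, 2)))
print(series_db)
```

<font face="Calibri" size="3">The third value drops to -20 dB while the others stay well above it, which is the kind of dip to look for in the real time series.</font>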

<br>
<div class="alert alert-success">
<font face="Calibri" size="5"> <b> <font color='rgba(200,0,0,0.2)'> <u>EXERCISE</u>:  </font> Explore Time Series at Different Point Locations </b> </font>

<font face="Calibri" size="3"> Can you interpret and attribute the changes at various locations? Apart from the flooding, what other patterns do you observe?
</font>
</div>
<br>
<hr>

<font face="Calibri" size="2"> <i>SAR Training Materials - Version 1.1 - October 2019 </i>
</font>