<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"><b>Exploring SAR Data and SAR Time Series Analysis using Jupyter Notebooks</b></font>

<br>
<font size="4"><b> Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, <a href="http://earthbigdata.com/" target="_blank">Earth Big Data, LLC</a> </b> <br>
<img style="padding:7px;" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" /></font>

<font size="3"> This notebook introduces the analysis of deep multi-temporal SAR image data stacks in the framework of *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks combine executable Python code with text in Markdown formatting, including LaTeX-style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series processing steps. They therefore provide an excellent basis for collaborative and repeatable data analysis. <br>

<b>We introduce the following data analysis concepts:</b>

- How to load SAR data into Jupyter Notebooks and create a time series stack 
- How to create a time series of your subset data.
- How to explore the time-series information in SAR data stacks for environmental analysis.
</font>
</font>

<hr>
<font face="Calibri" size="5" color='rgba(200,0,0,0.2)'> <b>Important Note about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:
<ol type="1">
    <li> <b><a href="https://pandas.pydata.org/" target="_blank">Pandas</a></b> is a Python library that provides high-level data structures and a wide variety of tools for analysis. A key strength of this package is its ability to express rather complex data operations in one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as time-series functionality. </li>
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li>
    <li> <b><a href="https://matplotlib.org/index.html" target="_blank">Matplotlib</a></b> is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib. </li>
<li><b><a href="https://www.scipy.org/about.html" target="_blank">SciPy</a></b> is a library that provides functions for numerical integration, interpolation, optimization, linear algebra and statistics. </li>
</ol>

</font>

<font face="Calibri" size="3"> Our first step is to <b>import them:</b> </font>

In [None]:
import os # for chdir, getcwd, path.exists
from math import ceil

import pandas as pd # for DatetimeIndex
from osgeo import gdal # for Info
import numpy as np # for copy, isnan, log10, ma.masked_where, max, mean, min, percentile, power, unique, var, where 
import matplotlib.pyplot as plt
from matplotlib import animation
from matplotlib import rc
from scipy.signal import savgol_filter

from IPython.display import HTML

from asf_notebook import new_directory

<font face="Calibri" size="3"><b>Set up matplotlib plotting</b> inside the notebook:</font>

In [None]:
%matplotlib inline 

<hr>
<font face="Calibri">
<font size="5"> <b> 1. Analyzing a data stack: La Amazonía</b> </font>
<img src='NotebookAddons/map.png' align='right' width=230><br><br>

<font size="4"> <b> 1.1. Overview</b> </font><br><br>

<font face="Calibri" size="3"> We will study a small subset of a Sentinel-1 stack along the Napo River, 30 km east of Coca.
<br><br>
The stack consists of 76 VV (co-pol) and 76 VH (cross-pol) images acquired from July 2017 to July 2019.
<br><br>
</font>
<img src='NotebookAddons/tree.png' align='center' width=400>
<font face="Calibri" size="1" align='left'><div style="text-align: left">Picture by Josselyn Encarnacion.</div></font>

</font>

<font face="Calibri">

<font size="4"> <b> 1.2. Background</b> </font><br><br>

<font face="Calibri" size="3"> The low-lying area is covered by a mosaic of tropical rainforest and cleared areas, including plantations.  
<img src='NotebookAddons/lagoagrio.png' align='center' width=400>
<br>
The region receives more than 3000 mm of rainfall per year. Although there are clear seasonal variations, precipitation is elevated throughout the year (see the observations from Lago Agrio/Nueva Loja, taken from the WMO website). There are no strong seasonal variations in temperature. 
<br><br>
</font>


<hr>
<font face="Calibri">

<font size="5"> <b> 2. Load the data </b> </font>

<font face="Calibri" size="3">First, let us <b>create a separate directory and move into it</b> 

In [None]:
name = 'tropical'
analysis_dir = name
new_directory(analysis_dir)
os.chdir(analysis_dir)

<font face="Calibri" size="3"><b>Download the prepared stack</b> 

In [None]:
s3_path = 's3://asf-jupyter-data/tropical.tar.gz'
time_series_path = os.path.basename(s3_path)
!aws s3 cp $s3_path $time_series_path

<font face="Calibri" size="3"><b>Extract all files from the archive</b> 

In [None]:
!tar -xvzf tropical.tar.gz

<br>
<font face="Calibri" size="4"> <b> 2.1 Define Data Directory and Path to VRT </b> </font> 
<br><br>
<font face="Calibri" size="3">A VRT file is a virtual image that groups together multiple images as bands. In our case, the bands correspond to acquisition times.<br><br>
    
<b>Create a variable containing the VRT filename for each polarization:</b></font>

In [None]:
polarizations = ['VV', 'VH']
imagefile = {pol: f'stacktropical_{pol}.vrt' for pol in polarizations}

<font face="Calibri" size="3"><b>Create an index of datetime64 data with Pandas:</b></font>

In [None]:
# Get some indices for plotting
datefile = {pol: f'dates{name}_{pol}.csv' for pol in polarizations}
tindex = {pol: pd.DatetimeIndex(open(datefile[pol]).read().split(',')) for pol in polarizations}
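
<font face="Calibri" size="3">As a minimal sketch of what this cell does, the snippet below builds a DatetimeIndex from a comma-separated string instead of a file. The dates are made up for illustration, and the ISO date format is an assumption, since the layout of the CSV files is not shown here.</font>

```python
import pandas as pd

# Hypothetical stand-in for the contents of a dates CSV file
date_string = "2017-07-01,2017-07-13,2017-07-25\n"

# strip() drops the trailing newline before splitting on commas
dates = pd.DatetimeIndex(date_string.strip().split(','))

print(len(dates))                  # number of acquisitions
print(dates[0].date())             # first acquisition date
print((dates[1] - dates[0]).days)  # days between the first two acquisitions
```

<font face="Calibri" size="3">Each entry of the index is a timestamp, so date arithmetic such as the difference between two acquisitions works directly.</font>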

<font face="Calibri" size="3"><b>Print the bands and dates for all images in the virtual raster table (VRT):</b></font>

In [None]:
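print(f"Bands and dates for {imagefile[polarizations[0]]}")
# Band numbers start at 1, so enumerate from 1
for j, date in enumerate(tindex[polarizations[0]], start=1):
    print("{:4d} {}".format(j, date.date()), end=' ')
    if j % 5 == 0:  # start a new line after every fifth entry
        print()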

<hr>
<br>
<font face="Calibri" size="4"> <b> 2.2 Open Your Data Stack</b> </font> 

<font face="Calibri" size="3"> We will store the <b>opened VRT</b> of each polarization in a dictionary data structure</font>

In [None]:
img = {pol: gdal.Open(imagefile[pol]) for pol in polarizations}

<font face="Calibri" size="3"><b>Print the bands (time instances), pixels, and lines:</b></font>

In [None]:
polarization = 'VV' # we will focus on VV for now
print(f"{polarization}: Number of  bands: {img[polarization].RasterCount}")
print(f"{polarization}: Number of pixels: {img[polarization].RasterXSize}")
print(f"{polarization}: Number of  lines: {img[polarization].RasterYSize}")

<br>
<font face="Calibri" size="4"> <b>2.3 Reading Data from an Image Band </b> </font> 

<font face="Calibri" size="3"> <b>To access any band in the image</b>, use GDAL's *GetRasterBand(x)* function. Replace the band_num value with the number of the band you wish to access.</font>

In [None]:
band_num = 4 # starts at 1
print(f'Accessing band {band_num}, acquired on {tindex[polarization][band_num - 1].date()}') # index starts at zero
band = img[polarization].GetRasterBand(band_num)

<font face="Calibri" size="3"> Once a band is selected, several functions associated with the band are available for further processing, e.g., <i>band.ReadAsArray(xoff=0,yoff=0,xsize=None,ysize=None)</i>

<b>Let's read the entire raster layer for the band:</b> </font>

In [None]:
raster = band.ReadAsArray()
print(f'This is a two-dimensional array of size {raster.shape}')

<font face="Calibri" size="4"> <b> 2.4 Extracting Subsets from a Larger Image Frame </b> </font>

<font face="Calibri" size="3"> Because of the potentially large data volume when dealing with time series data stacks, it may be prudent to read only a subset of data. 

Using GDAL's <i>ReadAsArray()</i> function, subsets can be requested by defining pixel offsets and subset size:

**img.ReadAsArray(xoff=0, yoff=0, xsize=None, ysize=None)**

- <i>xoff, yoff</i> are the offsets from the upper left corner in pixel/line coordinates. 
- <i>xsize, ysize</i> specify the size of the subset in x-direction (left to right) and y-direction (top to bottom).

For example, we can <b>read only a subset of 50x50 pixels with an offset of 5 pixels and 20 lines:</b> </font>

In [None]:
raster_sub = band.ReadAsArray(5, 20, 50, 50)

<font face="Calibri" size="3"> The result is a two-dimensional NumPy array in the data type in which the data were stored. **We can inspect these data in Python by typing the array name in a code cell**: </font>

In [None]:
raster_sub
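
<font face="Calibri" size="3">To illustrate what the offsets and sizes mean, the sketch below cuts the same window out of an in-memory array with NumPy slicing. The array is synthetic (each value encodes its own line and pixel position) and only stands in for a raster band; it is not part of the data stack.</font>

```python
import numpy as np

# Synthetic stand-in for a full raster band: 100 lines x 80 pixels.
# Each value encodes its own position as line * 1000 + pixel.
n_lines, n_pixels = 100, 80
full = np.arange(n_lines)[:, None] * 1000 + np.arange(n_pixels)[None, :]

xoff, yoff, xsize, ysize = 5, 20, 50, 50

# NumPy indexes lines (y) first and pixels (x) second, so the x/y order
# is swapped relative to the ReadAsArray(xoff, yoff, xsize, ysize) arguments.
subset = full[yoff:yoff + ysize, xoff:xoff + xsize]

print(subset.shape)   # (50, 50)
print(subset[0, 0])   # 20005, i.e. line 20, pixel 5
```

<font face="Calibri" size="3">Reading the subset directly with <i>ReadAsArray()</i> avoids loading the full band into memory first, which is the point of the offsets when the stack is large.</font>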

<hr>
<font face="Calibri">

<font size="5"> <b> 3. Visualize single images </b> </font>
<br><br>

<font face="Calibri" size="4"> <b> 3.1. Write a Plotting Function</b> </font>

<font face="Calibri" size="3"> Matplotlib's plotting functions allow for powerful options to display imagery. We are following some standard approaches for setting up figures.
First we are looking at a **raster band** and its associated **histogram**. </font>
<br><br>
<font face="Calibri" size="3"> Our function, *show_image_histogram()*, takes several parameters:
    
- raster = a two-dimensional NumPy array 
- tindex = a pandas DatetimeIndex holding the acquisition dates
- band_nbr = the band number that corresponds to the raster 
- vmin = minimum value to display (defaults to the 1st percentile of the raster)
- vmax = maximum value to display (defaults to the 99th percentile of the raster)
- output_filename = name of the output file, if saving the plot
<br><br>
It then calls a function called *plot_image_histogram()* that does the actual plotting.
</font>

In [None]:
def plot_image_histogram(axs, raster, tindex, band_nbr, vmin=None, vmax=None, polarization='Band'):
    # plot image
    vmin = np.percentile(raster, 1) if vmin is None else vmin
    vmax = np.percentile(raster, 99) if vmax is None else vmax
    axs[0].imshow(raster, cmap='gray', vmin=vmin, vmax=vmax)
    axs[0].set_title('Image {} {} {}'.format(polarization, band_nbr, tindex[band_nbr-1].date()))
    
    #plot histogram
    h = axs[1].hist(raster.flatten(), bins=200, range=(vmin, vmax))
    axs[1].xaxis.set_label_text('Intensity ($\\gamma^0$)')
    axs[1].set_title('Histogram {} {} {}'.format(polarization, band_nbr, tindex[band_nbr-1].date()))

def show_image_histogram(raster, tindex, band_nbr, vmin=None, vmax=None, output_filename=None):  
    fig, axs = plt.subplots(nrows=1, ncols=2)
    fig.set_size_inches((14,7), forward=True)
    plt.subplots_adjust(wspace=0.3)
    plt.rcParams.update({'font.size': 14})
    plot_image_histogram(axs, raster, tindex, band_nbr, vmin=vmin, vmax=vmax)
    if output_filename:
        plt.savefig(output_filename, dpi=300, transparent=True)
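
<font face="Calibri" size="3">When no display limits are passed, the plotting function falls back to the 1st and 99th percentiles of the raster. The sketch below shows the effect of that choice on a synthetic, skewed array; the exponential distribution and the variable names are assumptions made for illustration only.</font>

```python
import numpy as np

# Synthetic, skewed "image" standing in for a power-scale raster
rng = np.random.default_rng(seed=0)
raster_demo = rng.exponential(scale=0.2, size=(200, 200))

# Display range from the 1st and 99th percentiles
vmin_demo, vmax_demo = np.percentile(raster_demo, (1, 99))

# Pixels inside the range keep their contrast; the rest saturate
inside = (raster_demo >= vmin_demo) & (raster_demo <= vmax_demo)
print(f"display range: {vmin_demo:.3f} to {vmax_demo:.3f}")
print(f"fraction of pixels inside the range: {inside.mean():.3f}")
```

<font face="Calibri" size="3">Roughly 98% of the pixels fall inside the range, so a few very bright or very dark outliers no longer dominate the gray scale.</font>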

<font face="Calibri" size="4"><b>3.2 Visualize VV image:</b></font>


In [None]:
polarization = 'VV' # you can change this to VH

band_num = 50 # feel free to change it

raster = img[polarization].GetRasterBand(band_num).ReadAsArray()  
print(f'Band {band_num} is from {tindex[polarization][band_num - 1]}')
show_image_histogram(raster, tindex[polarization], band_num, vmin=0.1, vmax=0.4)

<font face="Calibri" size="4"><b>3.3 Compare VV and VH:</b></font>
<font face="Calibri" size="3">
<br><br>
The backscatter values in one channel are much lower than in the other. Which one? Do you have to change vmin and vmax to display it properly?
</font>

In [None]:
fig, axs = plt.subplots(nrows=len(polarizations), ncols=2)
fig.set_size_inches((12,12), forward=True)
plt.subplots_adjust(wspace=0.3, hspace=0.3)
vmin = 0.0
vmax = 0.4
for jpol, polarization in enumerate(polarizations):
    raster = img[polarization].GetRasterBand(band_num).ReadAsArray()
    plot_image_histogram(
        axs[jpol, :], raster, tindex[polarization], band_num,
        vmin=vmin, vmax=vmax, polarization=polarization)

<font face="Calibri" size="4"><b>3.4 Compare scaling normalizations:</b><br><br></font>
<font face="Calibri" size="3">
The data at hand are radiometrically terrain corrected images, which are often expressed as terrain flattened $\gamma^0$ backscattering coefficients. For forest and land cover monitoring applications, $\gamma^0$ is the preferred metric.
    
There are two common ways of scaling the $\gamma^0$ data. <br><br>

So far, we have looked at the **power scale**, the natural scale in which the intensity is measured. For most mathematical operations such as speckle filtering, this is the appropriate scale. However, its large dynamic range is sometimes an issue for statistical analyses and visualization purposes.<br>

The **dB scale** is a logarithmic scale:<br>
     $\gamma^0_{dB} = 10 \log_{10} (\gamma^0)$<br>
The dynamic range is greatly reduced: a doubling of $\gamma^0$ corresponds to an additive increase of about 3 dB in $\gamma^0_{dB}$. The distribution tends to become less skewed. 
</font>
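
<font face="Calibri" size="3">The conversion between the two scales can be checked with a few lines of plain Python. This is a minimal sketch with made-up values; the helper names <i>power_to_db</i> and <i>db_to_power</i> are hypothetical and are not used elsewhere in this notebook.</font>

```python
import math

def power_to_db(gamma0):
    """Convert a power-scale backscatter value to the dB scale."""
    return 10.0 * math.log10(gamma0)

def db_to_power(gamma0_db):
    """Convert a dB-scale backscatter value back to the power scale."""
    return 10.0 ** (gamma0_db / 10.0)

gamma0 = 0.1  # made-up power-scale value

# Doubling the power adds a constant offset of about 3 dB
increase_db = power_to_db(2 * gamma0) - power_to_db(gamma0)
print(f"doubling adds {increase_db:.4f} dB")

# Converting to dB and back recovers the original value
print(f"round trip: {db_to_power(power_to_db(gamma0)):.4f}")
```

<font face="Calibri" size="3">The offset is the same for any starting value because $10 \log_{10}(2\gamma^0) = 10 \log_{10}(2) + 10 \log_{10}(\gamma^0)$.</font>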

In [None]:
raster = img['VV'].GetRasterBand(band_num).ReadAsArray() # gamma 0, power scale
rasterdB = 10*np.log10(raster)

<font face="Calibri" size="3"><b>Let us look at images in the power and dB scale</b>
</font> 

In [None]:
fig, axs = plt.subplots(nrows=1, ncols=2)
fig.set_size_inches((14,7), forward=True)
plt.subplots_adjust(wspace=0.3)
vmin = np.percentile(raster, 2)
vmax = np.percentile(raster, 98)
axs[0].imshow(raster, cmap='gray', vmin=vmin, vmax=vmax)
axs[0].set_title('Power scale')
axs[1].imshow(rasterdB, cmap='gray', vmin=10*np.log10(vmin), vmax=10*np.log10(vmax))
_ = axs[1].set_title('dB scale')

<font face="Calibri" size="3"><b>Let us look at histograms.</b><br><br>
How does the shape of the data distribution compare?
</font> 

In [None]:
fig, axs = plt.subplots(nrows=1, ncols=2)
fig.set_size_inches((14,5), forward=True)
plt.subplots_adjust(wspace=0.3)
labels = ['power', 'dB']
for jr, r in enumerate([raster, rasterdB]):
    rvalid = r[np.isfinite(r)]
    axs[jr].hist(rvalid.flatten(), range=np.percentile(rvalid, (0.5, 99.9)), bins=100)
    axs[jr].axvline(np.mean(rvalid),color='k',label='Mean')
    axs[jr].axvline(np.mean(rvalid)-np.std(rvalid),color='gray',label=r'1 $\sigma$')
    axs[jr].axvline(np.mean(rvalid)+np.std(rvalid),color='gray')
    axs[jr].set_title(labels[jr])
    axs[jr].legend()  # show the labels assigned to the mean and sigma lines

<font face="Calibri" size="5"><b>4. Time series</b></font>

<font face="Calibri" size="4"><b>4.1. Animation</b></font>

<font face="Calibri" size="3"><b>Let us choose the polarization</b></font>

In [None]:
polarization = 'VV'
band = img[polarization].GetRasterBand(1)
raster0 = band.ReadAsArray() # Needed for initialization
band_number = 0 # Needed for initialization
rasterstack = img[polarization].ReadAsArray()

<font face="Calibri" size="3"><b>Create a directory in which to store our plots and animations:</b></font> 

In [None]:
product_path = "plots_and_animations"
new_directory(product_path)
print(f"Current working directory: {os.getcwd()}")

In [None]:
%%capture 
fig = plt.figure(figsize=(10, 5))
ax = fig.subplots()
ax.axis('off')
rasterstackdB = 10 * np.log10(rasterstack)

im = ax.imshow(rasterstackdB[0,...], cmap='gray', vmin=np.nanpercentile(rasterstackdB, 3), 
               vmax=np.nanpercentile(rasterstackdB, 97))
ax.set_title("{}".format(tindex[polarization][0].date()))

def animate(i):
    ax.set_title("{}".format(tindex[polarization][i].date()))
    im.set_data(rasterstackdB[i,...])

# Interval is given in milliseconds
ani = animation.FuncAnimation(fig, animate, frames=rasterstackdB.shape[0], interval=200)

<font face="Calibri" size="3"><b>Configure matplotlib's RC settings for the animation:</b></font> 

In [None]:
rc('animation', embed_limit=40971520.0)  # embed_limit is given in MB; raise it above the default so the entire animation can be embedded

<font face="Calibri" size="3"><b>Create a javascript animation of the time-series running inline in the notebook:</b></font> 

In [None]:
HTML(ani.to_jshtml())

<font face="Calibri" size="3"><b>Delete the dummy png</b> that was saved to the current working directory while generating the javascript animation in the last code cell.</font> 

In [None]:
try:
    os.remove('None0000000.png')
except FileNotFoundError:
    pass

<font face="Calibri" size="3"><b>Save the animation (animation.gif):</b> </font> 

In [None]:
ani.save(f"{product_path}/animation.gif", writer='pillow', fps=2)

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.2 Plot the Time Series of Means Calculated Across the Image </b> </font>

<font face="Calibri" size="3"> To create the time series of means, we will go through the following steps:
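1. Ensure that you use the data in **power scale** ($\gamma^0_{pwr}$) for your mean calculations, because averaging dB values would yield the geometric rather than the arithmetic mean of the backscatter.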
2. Compute means.
3. Convert the resulting mean values into dB scale for visualization.
4. Plot time series of means. </font> 
<br><br>
<font face="Calibri" size="3"> <b>Compute the means:</b> </font>

In [None]:
rs_means_pwr = np.nanmean(rasterstack, axis=(1, 2))

<font face="Calibri" size="3"><b>Convert resulting mean value time-series to dB scale for visualization:</b></font>

In [None]:
rs_means_dB = 10.*np.log10(rs_means_pwr)
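
<font face="Calibri" size="3">The order of averaging and dB conversion matters. The sketch below compares both orders for three made-up power-scale values; the numbers are illustrative and are not taken from the data stack.</font>

```python
import math

# Made-up power-scale backscatter samples spanning two orders of magnitude
values_pwr = [0.01, 0.1, 1.0]

# Recommended order: average in power scale, then convert to dB
mean_pwr = sum(values_pwr) / len(values_pwr)
db_of_mean = 10.0 * math.log10(mean_pwr)

# Other order: convert each sample to dB, then average
mean_of_db = sum(10.0 * math.log10(v) for v in values_pwr) / len(values_pwr)

print(f"dB of the power-scale mean: {db_of_mean:.2f} dB")
print(f"mean of the dB values:      {mean_of_db:.2f} dB")
```

<font face="Calibri" size="3">Averaging the dB values gives a lower result (about -10 dB versus about -4.3 dB here), because it corresponds to the geometric mean of the power values.</font>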

<font face="Calibri" size="3"><b>Plot and save the time series of means (RCSoverTime.png):</b>
<br><br>
How does the temporal variability relate to the meteorological variability?
</font>

In [None]:
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(16, 4))
ax1 = fig.subplots()
window_length = len(rs_means_dB)-1
if window_length % 2 == 0:
    window_length -= 1
polyorder = ceil(window_length*0.1)
yhat = savgol_filter(rs_means_dB, window_length, polyorder) 
ax1.plot(tindex[polarization], yhat, color='red', marker='o', markerfacecolor='white', linewidth=3, markersize=6)
ax1.plot(tindex[polarization], rs_means_dB, color='gray', linewidth=0.5)
plt.grid()
ax1.set_xlabel('Date')
ax1.set_ylabel(r'$\overline{\gamma^0}$ [dB]')
plt.savefig(f"{product_path}/RCSoverTime.png", dpi=72, transparent=True)

<br>
<font face="Calibri" size="4"> <b> 4.3 Create Two-Panel Animation with Global Mean </b> </font>

<font face="Calibri" size="3"> We use a few Matplotlib functions to <b>create a side-by-side animation of the dB-scaled imagery and the respective global means.</b> </font> 

In [None]:
%%capture 
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 4), gridspec_kw={'width_ratios':[1, 3]})

vmin = np.nanpercentile(rasterstackdB, 1)  # ignore NaNs, as in the first animation
vmax = np.nanpercentile(rasterstackdB, 99)
im = ax1.imshow(rasterstackdB[0, ...], cmap='gray', vmin=vmin, vmax=vmax)
ax1.set_title("{}".format(tindex[polarization][0].date()))
ax1.set_axis_off()

ax2.axis([tindex[polarization][0].date(), tindex[polarization][-1].date(), rs_means_dB.min(), rs_means_dB.max()])
ax2.set_ylabel(r'$\overline{\gamma^0}$ [dB]')
ax2.set_xlabel('Date')
l, = ax2.plot([], [])

def animate(i):
    ax1.set_title("{}".format(tindex[polarization][i].date()))
    im.set_data(rasterstackdB[i,...])
    ax2.set_title("{}".format(tindex[polarization][i].date()))
    l.set_data(tindex[polarization][0:(i+1)], rs_means_dB[0:(i+1)])

# Interval is given in milliseconds
ani = animation.FuncAnimation(fig, animate, frames=rasterstackdB.shape[0], interval=100)

In [None]:
HTML(ani.to_jshtml())

<font face="Calibri" size="3"><b>Save the two-panel animation of the imagery and the global means (animation_histogram.gif):</b></font>

In [None]:
ani.save(f"{product_path}/animation_histogram.gif", writer='pillow', fps=2)

<font face="Calibri" size="2"> <i>SAR Training Materials - Version 1.1 - Sep 2019 </i>
</font>