<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"> <b>Exploring SAR Data and SAR Time Series Analysis with Supplied Data</b></font>

<br>
<font size="4"> <b> Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, <a href="http://earthbigdata.com/" target="_blank">Earth Big Data, LLC</a> </b> <br>
<img src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" /></font>

<font size="3">This notebook introduces you to the analysis of deep multi-temporal SAR image data stacks in the framework of *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks combine executable Python code with text in Markdown formatting, including LaTeX-style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or new time series processing steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis. <br>

<b>This notebook covers the following data analysis concepts:</b>

- How to load time series stacks into Jupyter Notebooks and how to explore image content using basic functions such as mean value calculation and histogram analysis.
- How to apply calibration constants to convert initial digital number (DN) data into calibrated radar cross section information.
- How to subset images and create time series information of calibrated SAR amplitude values.
- How to explore the time-series information in SAR data stacks for environmental analysis.
</font>


</font>

<hr>
<font face="Calibri" size="5" color="red"> <b>Important Notes about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>
<br><br>
<ol type="1">
  <li><font color='rgba(200,0,0,0.2)'> <b> Save your notebook with all of its content</b></font> by selecting <i> File / Save and Checkpoint </i> </li>
  <li><font color='rgba(200,0,0,0.2)'> <b>To export in Notebook format</b></font>, click on <i>File / Download as / Notebook (.ipynb)</i>  <font color='gray'>--- Downloading your file may take a bit as the Notebook will be about 100MB in size</font></li>
  <li><font color='rgba(200,0,0,0.2)'> <b>To export in PDF format</b></font>, click on <i>File / Download as / PDF via LaTeX (.pdf) </i></li>
</ol>

</font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:

<ol type="1">
    <li> <b><a href="https://pandas.pydata.org/" target="_blank">Pandas</a></b> is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as the time-series functionality. </li>
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li>
    <li> <b><a href="https://matplotlib.org/index.html" target="_blank">Matplotlib</a></b> is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib. </li>
</ol>
</font>

In [None]:
import os # for chdir, getcwd, path.basename, path.exists

import pandas as pd # for DatetimeIndex
from osgeo import gdal # for GetRasterBand, Open, ReadAsArray
import numpy as np # for log10, mean, percentile, power
import matplotlib.pylab as plb # for add_patch, add_subplot, figure, hist, imshow, set_title, xaxis.set_label_text, text
import matplotlib.pyplot as plt # for add_subplot, axis, figure, imshow, legend, plot, set_axis_off, set_data,
                                # set_title, set_xlabel, set_ylabel, set_ylim, subplots, title, twinx
import matplotlib.patches as patches  # for Rectangle
import matplotlib.animation as an # for FuncAnimation
from matplotlib import rc 

from asf_notebook import path_exists
from asf_notebook import asf_unzip
from asf_notebook import new_directory

from IPython.display import HTML

<font face="Calibri" size="3"><b>Set up matplotlib plotting inside the notebook:</b></font>

In [None]:
%matplotlib inline

<hr>
<font face="Calibri">

<font size="5"> <b> 1. Load Data Stack</b> </font> <img src="NotebookAddons/Nepalclimate.jpeg" width="400" align="right" /> 

<font size="3"> This notebook will be using a 70-image deep L-band SAR data stack over Nepal for a first experience with time series processing. The L-band data were acquired by the ALOS PALSAR sensor and are available to us through the services of the <a href="https://www.asf.alaska.edu/" target="_blank">Alaska Satellite Facility</a>. 

Nepal is an interesting site for this analysis due to the significant seasonality of precipitation that is characteristic for this region. Nepal is said to have five seasons: spring, summer, monsoon, autumn and winter. Precipitation is low in the winter (November - March) and peaks dramatically in the summer, with top rain rates in July, August, and September (see figure to the right). As SAR is sensitive to changes in soil moisture, these weather patterns have a noticeable impact on the Radar Cross Section ($\sigma$) time series information. 

We will analyze the variation of $\sigma$ values over time and will interpret them in the context of rainfall rates in the imaged area. 
</font></font>
<br><br>
<font face="Calibri" size="3">Before we get started, let's first <b>create a working directory for this analysis and change into it:</b> </font>

In [None]:
path = "/home/jovyan/notebooks/SAR_Training/English/data_time_series_example"
new_directory(path)
os.chdir(path)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3">We will <b>retrieve the relevant data</b> from an <a href="https://aws.amazon.com/" target="_blank">Amazon Web Services (AWS)</a> cloud storage bucket <b>using the following command</b>:</font>

In [None]:
s3_path = 's3://asf-jupyter-data/time_series.zip'
time_series_path = os.path.basename(s3_path)
!aws s3 cp $s3_path $time_series_path

<font face="Calibri" size="3"> Now, let's <b>unzip the file (overwriting previous extractions) and clean up after ourselves:</b> </font>

In [None]:
if path_exists(time_series_path):
    asf_unzip(os.getcwd(), time_series_path)
    os.remove(time_series_path)

<font face="Calibri" size="3"> The following lines set path variables needed for data processing. This step is not necessary but it saves a lot of extra typing later.<b> Define variables for the main data directory as well as for the files containing data and image information:</b></font>

In [None]:
datadirectory = f"{path}/time_series/S32644X696260Y3052060sS1-EBD"
datefile = 'S32644X696260Y3052060sS1_D_vv_0092_mtfil.dates'
imagefile = 'S32644X696260Y3052060sS1_D_vv_0092_mtfil.vrt'
imagefile_cross = 'S32644X696260Y3052060sS1_D_vh_0092_mtfil.vrt'

<br>
<hr>
<font face="Calibri" size="5"> <b> 2. Switch to the Data Directory: </b></font>


<font size="3"> We now <b>move to the data directory:</b></font>

In [None]:
if path_exists(datadirectory):
    os.chdir(datadirectory)
print(f"current directory: {os.getcwd()}")

In [None]:
#!ls *.vrt  # Uncomment this line to see a list of the files

<br>
<hr>
<font face="Calibri" size="5"> <b> 3. Assess Image Acquisition Dates </b> </font> 

<font face="Calibri" size="3"> Before we start analyzing the available image data, we want to examine the content of our data stack. <b>First, we read the image acquisition dates for all files in the time series and create a *pandas* date index.</b> </font>

In [None]:
if path_exists(datefile):
    with open(datefile, 'r') as f:
        dates = f.readlines()
        tindex = pd.DatetimeIndex(dates)
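
<font face="Calibri" size="3"> The date parsing above can be tried out without the data set. The following is a minimal sketch on made-up input: the <i>YYYYMMDD</i> date format and the variable names <i>lines</i>, <i>clean</i>, and <i>example_index</i> are assumptions for illustration and are not taken from the actual <i>.dates</i> file. It strips line endings and skips empty lines before building the index, and it pairs each date with its 1-based band number. </font>

```python
import pandas as pd

# Hypothetical file content: one acquisition date per line (YYYYMMDD format is assumed)
lines = ["20150401\n", "20150413\n", "\n", "20150425\n"]

# Strip whitespace and drop empty lines before parsing
clean = [ln.strip() for ln in lines if ln.strip()]
example_index = pd.DatetimeIndex(pd.to_datetime(clean, format="%Y%m%d"))

# Band numbers are 1-based, while the index itself is 0-based
for band_number, timestamp in enumerate(example_index, start=1):
    print(f"{band_number:4d} {timestamp.date()}")
```

<font face="Calibri" size="3"> Passing an explicit format avoids ambiguous date parsing; adjust it to whatever format the date file really uses. </font>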

<font face="Calibri" size="3"> From the date index, we <b>make and print a lookup table for band numbers and dates:</b> </font>

In [None]:
if path_exists(imagefile):
    j = 1
    print('Bands and dates for', imagefile)
    for i in tindex:
        print("{:4d} {}".format(j, i.date()),end=' ')
        j += 1
        if j%5 == 1: print()

<br>
<hr>
<font face="Calibri" size="5"> <b> 4. Explore the Available Image Data </b> </font> 

<font face="Calibri" size="3"> We <b>open an image file using the gdal.Open() function.</b> This returns a variable (img) that can be used for further interactions with the file: </font>

In [None]:
if path_exists(imagefile):
    img = gdal.Open(imagefile)

<font face="Calibri" size="3"> To <b>explore the image (number of bands, pixels, lines),</b> you can use several functions associated with the image object (img) created in the last code cell: </font>

In [None]:
print(img.RasterCount) # Number of Bands
print(img.RasterXSize) # Number of Pixels
print(img.RasterYSize) # Number of Lines

<br>
<font face="Calibri" size="4"> <b> 4.1 Reading Data from an Image Band </b> </font> 

<font face="Calibri" size="3"> <b>To access any band in the image</b>, use GDAL's *GetRasterBand(x)* function. Replace the band_num value with the number of the band you wish to access.</font>

In [None]:
band_num = 70 
band = img.GetRasterBand(band_num)

<font face="Calibri" size="3"> Once a band is selected, several functions associated with the band are available for further processing, e.g., <i>band.ReadAsArray(xoff=0,yoff=0,xsize=None,ysize=None)</i>

<b>Let's read the entire raster layer for the band:</b> </font>

In [None]:
raster = band.ReadAsArray()

<br>
<font face="Calibri" size="4"> <b> 4.2 Extracting Subsets from a Larger Image Frame </b> </font>

<font face="Calibri" size="3"> Because of the potentially large data volume when dealing with time series data stacks, it may be prudent to read only a subset of data. 

Using GDAL's <i>ReadAsArray()</i> function, subsets can be requested by defining pixel offsets and subset size:

**img.ReadAsArray(xoff=0, yoff=0, xsize=None, ysize=None)**

- <i>xoff, yoff</i> are the offsets from the upper left corner in pixel/line coordinates. 
- <i>xsize, ysize</i> specify the size of the subset in x-direction (left to right) and y-direction (top to bottom).

For example, we can <b>read only a subset of 50x50 pixels with an offset of 5 pixels and 20 lines:</b> </font>

In [None]:
raster_sub = band.ReadAsArray(5, 20, 50, 50)

<font face="Calibri" size="3"> The result is a two-dimensional numpy array in the data type the data were stored in. **We can inspect these data in Python by typing the array name in a code cell**: </font>

In [None]:
raster_sub
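
<font face="Calibri" size="3"> The offset and size arguments described above correspond to ordinary NumPy slicing once a band is in memory. The sketch below uses a synthetic array in place of a real raster band; <i>read_subset()</i> is a hypothetical helper written for illustration and is not part of GDAL. Note that the array is indexed as [line, pixel], so the y values come first. </font>

```python
import numpy as np

# Synthetic 100-line by 80-pixel image standing in for a raster band
image = np.arange(100 * 80).reshape(100, 80)

def read_subset(array, xoff, yoff, xsize, ysize):
    """Return the window starting at pixel xoff, line yoff (illustrative helper)."""
    # Rows are lines (y direction), columns are pixels (x direction)
    return array[yoff:yoff + ysize, xoff:xoff + xsize]

window = read_subset(image, 5, 20, 50, 50)
print(window.shape)
print(window[0, 0] == image[20, 5])
```

<font face="Calibri" size="3"> Reading the window directly from the file remains preferable for large stacks because only the requested pixels are loaded. </font>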

<br>
<font face="Calibri" size="4"> <b> 4.3 Displaying Bands in the Time Series of SAR Data </b> </font>

<font face="Calibri" size="3"> From the lookup table we know that bands 20 and 27 in the Nepal data stack are from mid February and late August. **Let's take a look at these images**. </font>

In [None]:
raster_1 = img.GetRasterBand(20).ReadAsArray()
raster_2 = img.GetRasterBand(27).ReadAsArray()

<font face="Calibri" size="3"> <b> <i>4.3.1 Write a Plotting Function</i></b> </font>

<font face="Calibri" size="3"> Matplotlib's plotting functions allow for powerful options to display imagery. We are following some standard approaches for setting up figures.
First, we look at a **raster band** and its associated **histogram**. </font>
<br><br>
<font face="Calibri" size="3"> Our function, *show_image_histogram()*, takes several parameters:
    
- raster = a two-dimensional numpy array 
- tindex = a pandas index array for dates
- band_nbr = the band number that corresponds to the raster 
- vmin = minimum value to display 
- vmax = maximum value to display
- output_filename = name of output file, if saving the plot

Preconditions: matplotlib.pylab must be imported as plb and matplotlib.pyplot must be imported as plt. 
<br><br>
Note: By default, data will be linearly stretched between vmin and vmax.
<br><br>
<b>We won't use this function in this notebook but it is a useful utility method, which can be copied and pasted for use in other analyses</b>
</font>

In [None]:
def show_image_histogram(raster, tindex, band_nbr, vmin=None, vmax=None, output_filename=None):
    assert 'plb' in globals(), 'Error: matplotlib.pylab must be imported as "plb"'
    assert 'plt' in globals(), 'Error: matplotlib.pyplot must be imported as "plt"'  
    
    fig = plb.figure(figsize=(16, 8))
    ax1 = fig.add_subplot(121)
    ax2 = fig.add_subplot(122)
    
    # compute the display range first so that the image and the axis label agree
    vmin = np.percentile(raster, 2) if vmin is None else vmin
    vmax = np.percentile(raster, 98) if vmax is None else vmax
    
    # plot image
    ax1.imshow(raster, cmap='gray', vmin=vmin, vmax=vmax)
    ax1.set_title('Image Band {} {}'.format(band_nbr, tindex[band_nbr-1].date()))
    ax1.xaxis.set_label_text('Linear stretch Min={} Max={}'.format(vmin, vmax))
    
    #plot histogram
    h = ax2.hist(raster.flatten(), bins=200, range=(0, 10000))
    ax2.xaxis.set_label_text('Amplitude (Uncalibrated DN Values)')
    ax2.set_title('Histogram Band {} {}'.format(band_nbr, tindex[band_nbr-1].date()))
    
    if output_filename:
        plt.savefig(output_filename, dpi=300, transparent=True)

<font face="Calibri" size="3">We won't be calling our new function elsewhere in this notebook,<b> so test it now:</b></font>

In [None]:
show_image_histogram(raster_1, tindex, 20, vmin=2000, vmax=10000)
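
<font face="Calibri" size="3"> The linear stretch between vmin and vmax mentioned above can also be written out explicitly. The following sketch is an illustration only: <i>linear_stretch()</i> is a hypothetical helper, the 2nd and 98th percentiles are assumed default limits, and the input is random data rather than SAR imagery. Matplotlib's <i>imshow()</i> applies an equivalent scaling internally when vmin and vmax are given. </font>

```python
import numpy as np

def linear_stretch(data, lower_pct=2, upper_pct=98):
    """Scale data linearly to [0, 1] between two percentiles (illustrative helper)."""
    vmin, vmax = np.percentile(data, [lower_pct, upper_pct])
    if vmax == vmin:
        # A constant image has no contrast to stretch
        return np.zeros_like(data, dtype=float)
    scaled = (data.astype(float) - vmin) / (vmax - vmin)
    # Values outside the percentile range saturate at 0 or 1
    return np.clip(scaled, 0.0, 1.0)

rng = np.random.default_rng(0)
dn = rng.integers(0, 10000, size=(50, 50))  # synthetic DN values
stretched = linear_stretch(dn)
print(stretched.min(), stretched.max())
```

<font face="Calibri" size="3"> Clipping at percentiles rather than at the absolute minimum and maximum keeps a few very bright scatterers from compressing the rest of the image into dark tones. </font>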

<br>
<hr>
<font face="Calibri" size="5"> <b> 5. SAR Time Series Visualization, Animation, and Analysis </b> </font> 

<font face="Calibri" size="3"> This section introduces you to the handling and analysis of SAR time series stacks. A focus will be put on time series visualization, which allows us to inspect time series in more depth. Note that HTML animations are not exported into the PDF file, but will display interactively. </font>

<br>
<font face="Calibri" size="4"> <b> 5.1 Reading the SAR Time Series Subset </b> </font>

<font face="Calibri" size="3"> Let's read an image subset (offset 400, 400 /  size 600, 600) of the entire time series data stack. The data are linearly scaled amplitudes represented as unsigned 16-bit integers.

We use the GDAL *ReadAsArray(xoff,yoff,xsize,ysize)* function where *xoff* is the offset in pixels from upper left; *yoff* is the offset in lines from upper left; *xsize* is the number of pixels and *ysize* is the number of lines of the subset.

If *ReadAsArray()* is called without any parameters, the entire image data stack is read. 

Let's first <b>define a subset and make sure it is in the right geographic location</b>. </font> 

In [None]:
# Get the first raster band of the open image
band = img.GetRasterBand(1)

# Define the subset
subset = (400, 400, 600, 600)

<font face="Calibri" size="3"> We <b>plot one band together with the outline of the subset to verify its location, and then extract this subset from all slices of the data stack</b>. </font>

In [None]:
# Plot one band together with the outline of the selected subset to verify its geographic location.
raster = band.ReadAsArray()
vmin = np.percentile(raster.flatten(), 5)
vmax = np.percentile(raster.flatten(), 95)
fig = plb.figure(figsize=(10, 10))
ax = fig.add_subplot(111)
ax.imshow(raster, cmap='gray', vmin=vmin, vmax=vmax)
# plot the subset as rectangle
_ = ax.add_patch(patches.Rectangle((subset[0], subset[1]), subset[2], subset[3], fill=False, edgecolor='red'))

In [None]:
raster0 = band.ReadAsArray(*subset)
bandnbr = 0 # Needed for updates
rasterstack = img.ReadAsArray(*subset)

<font face="Calibri" size="3"><b>Close img, as it is no longer needed in the notebook:</b></font> 

In [None]:
img = None

<br>
<font face="Calibri" size="4"> <b> 5.2 Calibration and Data Conversion between dB and Power Scales </b> </font>

<font face="Calibri" size="3"> Focused SAR image data natively come in uncalibrated digital numbers (DN) and need to be calibrated to correspond to proper radar cross section information. 

Calibration coefficients for SAR data are often defined in the decibel (dB) scale due to the high dynamic range of the imaging system. For the L-band ALOS PALSAR data at hand, the conversion from uncalibrated DN values to calibrated radar cross section values in dB scale is performed by applying a standard **calibration factor of -83 dB**. 
<br> <br>
$\gamma^0_{dB} = 20 \cdot \log_{10}(DN) - 83$

The data at hand are radiometrically terrain corrected images, which are often expressed as terrain flattened $\gamma^0$ backscattering coefficients. For forest and land cover monitoring applications $\gamma^0$ is the preferred metric.

Let's <b>apply the calibration constant to our data and convert it to *dB* scale</b>: </font> 

In [None]:
caldB = 20*np.log10(rasterstack) - 83
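
<font face="Calibri" size="3"> Pixels with a DN of zero produce negative infinity in the logarithm, along with a NumPy warning. The sketch below shows one way to handle this, assuming that a zero DN marks missing data, which is an assumption about the data and not something stated by the data provider. <i>dn_to_db()</i> is a hypothetical helper that applies the same formula as above and sets such pixels to NaN. </font>

```python
import numpy as np

def dn_to_db(dn, cal_factor_db=-83.0):
    """Convert DN amplitudes to dB; zero DN becomes NaN (assumed to mean no data)."""
    dn = np.asarray(dn, dtype=float)
    out = np.full(dn.shape, np.nan)
    valid = dn > 0
    # Same formula as above: 20 * log10(DN) plus the calibration factor
    out[valid] = 20.0 * np.log10(dn[valid]) + cal_factor_db
    return out

example_dn = np.array([[0, 1000], [10000, 100]], dtype=np.uint16)
example_db = dn_to_db(example_dn)
print(example_db)
```

<font face="Calibri" size="3"> If NaN values are introduced this way, later statistics need NaN-aware functions such as <i>np.nanmean()</i> and <i>np.nanpercentile()</i>. </font>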

<font face="Calibri" size="3"> While **dB**-scaled images are often "visually pleasing", they are often not a good basis for mathematical operations on data. For instance, when we compute the mean of observations, it makes a difference whether we do that in power or dB scale. Since dB scale is a logarithmic scale, we cannot simply average data in that scale. 
    
Please note that the **correct scale** in which operations need to be performed **is the power scale.** This is critical, e.g. when speckle filters are applied, spatial operations like block averaging are performed, or time series are analyzed.

To **convert from dB to power**, apply: $\gamma^0_{pwr} = 10^{\frac{\gamma^0_{dB}}{10}}$ </font>

In [None]:
calPwr = np.power(10., caldB/10.)
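
<font face="Calibri" size="3"> To see why the scale matters when averaging, consider the small synthetic stack below. The values and variable names are made up for illustration. The first band contains two pixels of -20 dB and -10 dB: averaging in power scale and converting back gives about -12.6 dB, whereas averaging the dB values directly gives -15 dB. The second band is constant, so both approaches agree. </font>

```python
import numpy as np

# Synthetic stack with shape (bands, lines, pixels): two bands of 1 x 2 pixels, in dB
stack_db = np.array([[[-20.0, -10.0]],
                     [[-15.0, -15.0]]])

# Convert to power, average over the spatial axes, then convert the means back to dB
stack_pwr = np.power(10.0, stack_db / 10.0)
means_pwr_db = 10.0 * np.log10(np.mean(stack_pwr, axis=(1, 2)))

# Averaging directly in dB corresponds to a geometric mean of the power values
means_db = np.mean(stack_db, axis=(1, 2))

print(means_pwr_db)
print(means_db)
```

<font face="Calibri" size="3"> The dB average is never larger than the power average, and the gap grows with the spread of values inside the averaging window. </font>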

<br>
<font face="Calibri" size="4"> <b> 5.3 Create a Time Series Animation </b> </font>
<br><br>
<font face="Calibri" size="3">First, <b>create a directory in which to store our plots and move into it:</b></font>

In [None]:
os.chdir(path)
product_path = 'plots_and_animations'
new_directory(product_path)
if path_exists(product_path) and os.getcwd() != f"{path}/{product_path}":
    os.chdir(product_path)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3"> Now we are ready to <b>create a time series animation</b> from the calibrated SAR data. </font> 

In [None]:
%%capture 
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(111)
ax.axis('off')
vmin = np.percentile(caldB.flatten(), 5)
vmax = np.percentile(caldB.flatten(), 95)
r0dB = 20*np.log10(raster0) - 83
im = ax.imshow(r0dB,cmap='gray', vmin=vmin, vmax=vmax)
ax.set_title("{}".format(tindex[0].date()))

def animate(i):
    ax.set_title("{}".format(tindex[i].date()))
    im.set_data(caldB[i])

# Interval is given in milliseconds
ani = an.FuncAnimation(fig, animate, frames=rasterstack.shape[0], interval=400)

<font face="Calibri" size="3"><b>Configure matplotlib's RC settings for the animation:</b></font> 

In [None]:
rc('animation', embed_limit=40971520.0)  # embed_limit is given in MB; raise it so the entire animation can be embedded

<font face="Calibri" size="3"><b>Create a javascript animation of the time-series running inline in the notebook:</b></font> 

In [None]:
HTML(ani.to_jshtml())

<font face="Calibri" size="3"><b>Save the animation (animation.gif):</b></font> 

In [None]:
ani.save('animation.gif', writer='pillow', fps=2)

<br>
<font face="Calibri" size="4"> <b> 5.4 Plot the Time Series of Means Calculated Across the Subset </b> </font>

<font face="Calibri" size="3"> To create the time series of means, we will go through the following steps:
1. Compute means using the data in **power scale** ($\gamma^0_{pwr}$).
2. Convert the resulting mean values into dB scale for visualization.
3. Plot the time series of means. </font> 

<font face="Calibri" size="3"><b>Compute the means:</b></font> 

In [None]:
rs_means_pwr = np.mean(calPwr,axis=(1, 2))

<font face="Calibri" size="3"><b>Convert the resulting mean value time-series to dB scale for visualization and check that we got one mean per acquisition date:</b></font> 

In [None]:
rs_means_dB = 10.*np.log10(rs_means_pwr)
rs_means_pwr.shape

<font face="Calibri" size="3"><b>Plot and save the time series of means (time_series_means.png):</b></font> 

In [None]:
# Plot the time series of means in power scale (left axis) and dB scale (right axis)
fig = plt.figure(figsize=(16, 4))
ax1 = fig.add_subplot(111)
ax1.plot(tindex, rs_means_pwr)
ax1.set_xlabel('Date')
ax1.set_ylabel(r'$\overline{\gamma^0}$ [power]')


ax2 = ax1.twinx()
ax2.plot(tindex, rs_means_dB, color='red')
ax2.set_ylabel(r'$\overline{\gamma^0}$ [dB]')
fig.legend(['power', 'dB'], loc=1)
plt.title(r'Time series profile of average band backscatter $\gamma^0$ ')
plt.savefig('time_series_means.png', dpi=72, transparent=True)

<br>
<font face="Calibri" size="4"> <b> 5.5 Create Two-Panel Figure with Animated Global Mean $\mu_{\gamma^0_{dB}}$ </b> </font>

<font face="Calibri" size="3"> We use a few Matplotlib functions to <b>create a side-by-side animation of the SAR imagery (displayed here as uncalibrated amplitudes) and the respective global means $\mu_{\gamma^0_{dB}}$.</b> </font> 

In [None]:
%%capture 
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 4), gridspec_kw={'width_ratios':[1, 3]})

vmin = np.percentile(rasterstack.flatten(), 5)
vmax = np.percentile(rasterstack.flatten(), 95)
im = ax1.imshow(raster0, cmap='gray', vmin=vmin, vmax=vmax)
ax1.set_title("{}".format(tindex[0].date()))
ax1.set_axis_off()

ax2.axis([tindex[0], tindex[-1], rs_means_dB.min(), rs_means_dB.max()])
ax2.set_ylabel(r'$\overline{\gamma^0}$ [dB]')
ax2.set_xlabel('Date')
ax2.set_ylim((-10, -5))
l, = ax2.plot([], [])

def animate(i):
    ax1.set_title("{}".format(tindex[i].date()))
    im.set_data(rasterstack[i])
    ax2.set_title("{}".format(tindex[i].date()))
    l.set_data(tindex[:(i+1)], rs_means_dB[:(i+1)])

# Interval is given in milliseconds
ani = an.FuncAnimation(fig, animate, frames=rasterstack.shape[0], interval=400)

<font face="Calibri" size="3"><b>Create a javascript animation of the time-series running inline in the notebook:</b></font> 

In [None]:
HTML(ani.to_jshtml())

<font face="Calibri" size="3"><b>Save the two-panel animation of imagery and time series of means (animation_histogram.gif):</b></font>

In [None]:
ani.save('animation_histogram.gif', writer='pillow', fps=2)