<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"> <b>Exploring SAR Time Series Data over Ecosystems and Deforestation Sites</b></font>

<br>
<font size="4"> <b> Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, <a href="http://earthbigdata.com/" target="_blank">Earth Big Data, LLC</a> </b> <br>
<img style="padding:7px;" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" /></font>

<font size="3">This notebook introduces you to the time series signatures of forested sites and sites affected by deforestation. The data analysis is done in the framework of *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks consist of text written in markdown formatting, including LaTeX-style mathematical equations, combined with executable Python code. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis. <br>

<b>This notebook covers the following data analysis concepts:</b>

- How to load time series stacks into Jupyter Notebooks and how to explore image content using basic functions such as mean value calculation and histogram analysis.
- How to extract time series information for individual pixels of an image.
- Typical time series signatures over forests and deforestation sites.
</font>


</font>

<hr>
<font face="Calibri" size="5" color='rgba(200,0,0,0.2)'> <b>Important Notes about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>
<br><br>
</font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:

<ol type="1">
    <li> <b><a href="https://pandas.pydata.org/" target="_blank">Pandas</a></b> is a Python library that provides high-level data structures and a wide variety of tools for analysis. A key strength of this package is its ability to express rather complex data operations in one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as time-series functionality. </li>
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li>
    <li> <b><a href="https://matplotlib.org/index.html" target="_blank">Matplotlib</a></b> is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to graphs in non-Cartesian coordinates. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib. </li>
</ol>
</font>

In [None]:
import os # for chdir, getcwd, path.basename, path.exists
from math import ceil
import zipfile

import pandas as pd # for DatetimeIndex
import numpy as np # for log10, mean, percentile, power
from osgeo import gdal # for GetRasterBand, Open, ReadAsArray

import matplotlib.patches as patches  # for Rectangle
import matplotlib.pyplot as plt # for add_subplot, axis, figure, imshow, legend, plot, set_axis_off, set_data,
                                # set_title, set_xlabel, set_ylabel, set_ylim, subplots, title, twinx
plt.rcParams.update({'font.size': 12})

<font face="Calibri" size="3"><b>Set up matplotlib plotting inside the notebook:</b></font>

In [None]:
%matplotlib notebook

<hr>
<font face="Calibri">

<font size="5"> <b> 1. Load Data Stack</b> </font> <img src="NotebookAddons/Deforest-MadreDeDios.jpg" width="350" style="padding:5px;" align="right" /> 

<font size="3"> This notebook will be using a 78-image deep dual-polarization C-band SAR data stack over Madre de Dios in Peru to analyze time series signatures of vegetation covers, water bodies, and areas affected by deforestation. The C-band data were acquired by ESA's Sentinel-1 SAR sensor constellation and are available to you through the services of the <a href="https://www.asf.alaska.edu/" target="_blank">Alaska Satellite Facility</a>. 

The site in question is interesting as it has experienced extensive logging over the last 10 years (see image to the right; <a href="https://blog.globalforestwatch.org/" target="_blank">Monitoring of the Andean Amazon Project</a>). Since the 1980s, people have been clearing forests in this area for farming, cattle ranching, logging, and (recently) gold mining. Creating RGB color composites is an easy way to visualize ongoing changes in the landscape.
</font></font>
<br><br>
<font face="Calibri" size="3">Before we get started, let's first <b>create a working directory for this analysis and change into it:</b> </font>

In [None]:
path = "/home/jovyan/data_Ex2-4_S1-MadreDeDios"
if not os.path.exists(path):
    os.mkdir(path)
os.chdir(path)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3">We will <b>retrieve the relevant data</b> from an <a href="https://aws.amazon.com/" target="_blank">Amazon Web Services (AWS)</a> cloud storage bucket <b>using the following command</b>:</font></font>

In [None]:
time_series_path = 's3://asf-jupyter-data/MadreDeDios.zip'
time_series = os.path.basename(time_series_path)
!aws --no-sign-request --region us-east-1 s3 cp $time_series_path $time_series

<font face="Calibri" size="3"> Now, let's <b>unzip the file (overwriting previous extractions) and clean up after ourselves:</b> </font>

In [None]:
if os.path.exists(time_series):
    try:
        zipfile.ZipFile(time_series).extractall(os.getcwd())
    except zipfile.BadZipFile:
        print("Zipfile Error.")
    os.remove(time_series)
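
<font face="Calibri" size="3">The cell above deletes the archive even when extraction fails. As a sketch of a more defensive variant, the following opens the archive with a context manager and removes it only after a successful extraction, so a corrupt download stays available for inspection. The function name <i>extract_and_remove</i> and the demo file names are illustrative assumptions, not part of the original notebook.</font>

```python
import os
import tempfile
import zipfile


def extract_and_remove(archive_path, target_dir):
    """Extract archive_path into target_dir; delete the archive only on success."""
    try:
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(target_dir)
    except zipfile.BadZipFile:
        print(f"{archive_path} is not a valid zip file; leaving it in place.")
        return False
    os.remove(archive_path)
    return True


# Demo on a throwaway archive (hypothetical file names)
with tempfile.TemporaryDirectory() as workdir:
    demo_zip = os.path.join(workdir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("tiffs/example.txt", "placeholder")
    ok = extract_and_remove(demo_zip, workdir)
    extracted = os.path.exists(os.path.join(workdir, "tiffs", "example.txt"))
    archive_left = os.path.exists(demo_zip)
    print(ok, extracted, archive_left)
```

<font face="Calibri" size="3">In the notebook itself, a call such as <i>extract_and_remove(time_series, os.getcwd())</i> would play the role of the cell above.</font>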

<br>
<font face="Calibri" size="5"> <b> 2. Define Data Directory and Path to VRT </b> </font> 
<br><br>
<font face="Calibri" size="3"><b>Build a VRT (virtual raster) stack for each polarization and store the VRT filenames in variables:</b></font>

In [None]:
!gdalbuildvrt -separate raster_stack.vrt tiffs/*_VV.tiff
image_file_VV = "raster_stack.vrt"
!gdalbuildvrt -separate raster_stack_VH.vrt tiffs/*_VH.tiff
image_file_VH = "raster_stack_VH.vrt"

<font face="Calibri" size="3"><b>Create a DatetimeIndex (datetime64 data) of the image acquisition dates with Pandas:</b></font>

In [None]:
!ls tiffs/*_VV.tiff | sort | cut -c 7-21 > raster_stack_VV.dates
datefile_VV = 'raster_stack_VV.dates'
dates_VV = open(datefile_VV).readlines()
tindex_VV = pd.DatetimeIndex(dates_VV)

!ls tiffs/*_VH.tiff | sort | cut -c 7-21 > raster_stack_VH.dates
datefile_VH = 'raster_stack_VH.dates'
dates_VH = open(datefile_VH).readlines()
tindex_VH = pd.DatetimeIndex(dates_VH)
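
<font face="Calibri" size="3">The shell pipeline above cuts characters 7 to 21 out of each file path and hands the resulting strings to Pandas. The sketch below reproduces that step in plain Python. It assumes that those 15 characters form a <i>YYYYMMDDTHHMMSS</i> timestamp; the file names are hypothetical, and the real stack's naming may differ.</font>

```python
from datetime import datetime

# Hypothetical file names; the real stack's naming may differ
filenames = [
    "tiffs/20170312T101530_VV.tiff",
    "tiffs/20170105T101529_VV.tiff",
    "tiffs/20170206T101528_VV.tiff",
]


def acquisition_time(path):
    """Parse characters 7-21 of the path (1-based, like `cut -c 7-21`)."""
    stamp = path[6:21]
    return datetime.strptime(stamp, "%Y%m%dT%H%M%S")


# Sort the names first so that the dates follow the band order of the VRT
# (assuming the shell glob passed the files to gdalbuildvrt in sorted order)
acquisition_times = [acquisition_time(name) for name in sorted(filenames)]
for band, when in enumerate(acquisition_times, start=1):
    print(band, when.isoformat())
```

<font face="Calibri" size="3">A list of datetime objects like <i>acquisition_times</i> can be passed to <i>pd.DatetimeIndex</i> in the same way as the date strings used above.</font>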

<br>
<hr>
<font face="Calibri" size="5"> <b> 3. Assess Image Acquisition Dates </b> </font> 

<font face="Calibri" size="3"> Before we start analyzing the available image data, we want to examine the content of our data stack. From the date index, we <b>make and print a lookup table for band numbers and dates:</b></font>

In [None]:
stindex=[]
for i in [datefile_VV,datefile_VH]:
    sdates=open(i).readlines()
    stindex.append(pd.DatetimeIndex(sdates))
    j=1
    print('\nBands and dates for',os.path.splitext(i)[0]) # splitext drops the '.dates' suffix; str.strip() removes characters, not a suffix
    for k in stindex[-1]:
        print("{:4d} {}".format(j, k.date()),end=' ')
        j+=1
        if j%5==1: print()

<hr>
<br>
<font face="Calibri" size="5"> <b> 4. Create Minimum Image to Identify Likely Areas of Deforestation </b> </font>

<font face="Calibri" size="4"> <b> 4.1 Load Time Series Stack </b> </font>

<font face="Calibri" size="3"> <b>First, we load the raster stack into memory. The minimum backscatter in the time series is calculated further below:</b>
</font> 

In [None]:
img = gdal.Open(image_file_VV)
band = img.GetRasterBand(1)
raster0 = band.ReadAsArray()
band_number = 0 # Needed for updates
rasterstack_VV = img.ReadAsArray()

<font face="Calibri" size="3"> To <b>explore the image (number of bands, pixels, lines),</b> you can use several functions associated with the image object (img) created in the last code cell: </font>

In [None]:
print(img.RasterCount) # Number of Bands
print(img.RasterXSize) # Number of Pixels
print(img.RasterYSize) # Number of Lines

<font face="Calibri" size="3"> The following line <b>calculates the minimum backscatter per pixel</b> across the time series: </font>

In [None]:
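# Per-pixel minimum along the time axis (axis 0). Despite its name, db_mean holds a minimum, not a mean.
db_mean = np.min(rasterstack_VV, axis=0)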
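
<font face="Calibri" size="3">To illustrate why a minimum image is useful here, the following sketch builds a tiny synthetic stack in which one pixel loses backscatter partway through the series. All values are made up, and the array layout (dates, lines, pixels) mirrors what <i>ReadAsArray()</i> returns for a multi-band dataset. The drop is fully visible in the temporal minimum but diluted in the temporal mean.</font>

```python
import numpy as np

# Synthetic stack: 4 acquisition dates, 2 lines, 3 pixels (made-up power values)
stack = np.full((4, 2, 3), 0.20)
stack[2:, 0, 1] = 0.02  # one pixel drops in brightness from the third date onward

temporal_min = stack.min(axis=0)    # shape (2, 3): lowest value each pixel ever reached
temporal_mean = stack.mean(axis=0)  # the drop is averaged with the earlier, brighter dates

print(temporal_min)
print(temporal_mean)
```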

<br>
<font face="Calibri" size="4"> <b> 4.2 Visualize the Minimum Image and Select a Coordinate for a Time Series</b> </font>

<font face="Calibri" size="3"> <b>Write a class to create an interactive plot from which we can select interesting image locations for a time series.</b></font>

In [None]:
class pixelPicker:
    def __init__(self, image, width, height):
        self.x = None
        self.y = None
        self.fig = plt.figure(figsize=(width, height))
        self.ax = self.fig.add_subplot(111, visible=False)
        self.rect = patches.Rectangle(
            (0.0, 0.0), width, height, 
            fill=False, clip_on=False, visible=False
        )
       
        self.rect_patch = self.ax.add_patch(self.rect)
        self.cid = self.rect_patch.figure.canvas.mpl_connect('button_press_event', 
                                                             self)
        self.image = image
        self.plot = self.gray_plot(self.image, fig=self.fig, return_ax=True)
        self.plot.set_title('Select a Point of Interest')
        
        
    def gray_plot(self, image, vmin=None, vmax=None, fig=None, return_ax=False):
        '''
        Plots an image in grayscale.
        Parameters:
        - image: 2D array of raster values
        - vmin: Minimum value for colormap
        - vmax: Maximum value for colormap
        - return_ax: Option to return plot axis
        '''
        if vmin is None:
            vmin = np.nanpercentile(self.image, 1)
        if vmax is None:
            vmax = np.nanpercentile(self.image, 99)
        #if fig is None:
        #   my_fig = plt.figure() 
        ax = fig.add_axes([0.1,0.1,0.8,0.8])
        ax.imshow(image, cmap=plt.cm.gist_gray, vmin=vmin, vmax=vmax)
        if return_ax:
            return(ax)
        
    
    def __call__(self, event):
        print('click', event)
        self.x = event.xdata
        self.y = event.ydata
        for pnt in self.plot.get_lines():
            pnt.remove()
        plt.plot(self.x, self.y, 'ro')

<font face="Calibri" size="3"> Now we are ready to plot the minimum image. <b>Click a point of interest for which you want to analyze radar brightness over time</b>: </font>

In [None]:
# Large plot of the multi-temporal minimum of VV values to inspect pixel values
fig_xsize = 7.5
fig_ysize = 7.5
my_plot = pixelPicker(db_mean, fig_xsize, fig_ysize)

<font face="Calibri" size="3"><b>Save the selected coordinates</b>: </font>

In [None]:
sarloc = (ceil(my_plot.x), ceil(my_plot.y))
print(sarloc)

<br>
<font face="Calibri" size="5"> <b> 5. Plot SAR Brightness Time Series at Point Locations </b> </font>

<font face="Calibri" size="4"> <b> 5.1 SAR Brightness Time Series at Point Locations </b> </font>

<font face="Calibri" size="3"> We will pick a pixel location identified in the SAR image above and plot the time series for this identified point. By focusing on image locations undergoing deforestation, we should see the changes in the radar cross section related to the deforestation event.
    
First, for processing of the imagery in this notebook we <b>generate a list of image handles and retrieve projection and georeferencing information.</b></font> 

In [None]:
imagelist=[image_file_VV, image_file_VH]
geotrans=[]
proj=[]
img_handle=[]
xsize=[]
ysize=[]
bands=[]
for i in imagelist:
    img_handle.append(gdal.Open(i))
    geotrans.append(img_handle[-1].GetGeoTransform())
    proj.append(img_handle[-1].GetProjection())
    xsize.append(img_handle[-1].RasterXSize)
    ysize.append(img_handle[-1].RasterYSize)
    bands.append(img_handle[-1].RasterCount)
# for i in proj:
#     print(i)
# for i in geotrans:
#     print(i)
# for i in zip(['C-VV','C-VH','NDVI','B3','B4','B5'],bands,ysize,xsize):
#     print(i)

<font face="Calibri" size="3"> Now, let's <b>pick a 5x5 image area around a center pixel defined in variable <i>sarloc</i></b>...</font>

In [None]:
ref_x=geotrans[0][0]+sarloc[0]*geotrans[0][1]
ref_y=geotrans[0][3]+sarloc[1]*geotrans[0][5]
print('UTM Coordinates      ',ref_x, ref_y)
print('SAR pixel/line       ',sarloc[0], sarloc[1])
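# ReadAsArray takes the upper-left corner of the window, so shift by 2 pixels
# to center the 5x5 window on sarloc (clamped to stay inside the image)
x_off = min(max(sarloc[0] - 2, 0), xsize[0] - 5)
y_off = min(max(sarloc[1] - 2, 0), ysize[0] - 5)
subset_sentinel=(x_off, y_off, 5, 5)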
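
<font face="Calibri" size="3">The two lines computing <i>ref_x</i> and <i>ref_y</i> apply the GDAL geotransform for a north-up image, where the rotation terms are zero. As a sketch, the general six-parameter form looks as follows. The function name <i>pixel_to_map</i> and the example geotransform are hypothetical and not taken from the Madre de Dios stack.</font>

```python
def pixel_to_map(geotransform, pixel, line):
    """Map coordinates of the upper-left corner of (pixel, line) for a GDAL geotransform."""
    origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height = geotransform
    x = origin_x + pixel * pixel_width + line * row_rotation
    y = origin_y + pixel * col_rotation + line * pixel_height
    return x, y


# Hypothetical north-up geotransform: 30 m pixels, origin at the upper-left image corner
example_geotransform = (400000.0, 30.0, 0.0, 8700000.0, 0.0, -30.0)
ref_x_demo, ref_y_demo = pixel_to_map(example_geotransform, 100, 50)
print(ref_x_demo, ref_y_demo)
```

<font face="Calibri" size="3">The pixel height is negative because line numbers increase downward while northing increases upward.</font>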

<font face="Calibri" size="3">... and <b>extract the time series</b> for this small area around the selected center pixel:</font> 

In [None]:
s_ts=[]
for idx in (0, 1):
    means=[]
    for i in range(bands[idx]):
        rs=img_handle[idx].GetRasterBand(i+1).ReadAsArray(*subset_sentinel)
        rs_means_pwr = np.mean(rs)
        rs_means_dB = 10.*np.log10(rs_means_pwr)
        means.append(rs_means_dB)
    s_ts.append(pd.Series(means,index=stindex[idx]))
        
means = []
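
<font face="Calibri" size="3">Note that the cell above averages the window in the power domain first and converts to dB afterwards. The order matters: averaging values that are already in dB corresponds to a geometric mean of the powers and gives a lower number when the window contains bright outliers. The sketch below shows the difference on made-up values.</font>

```python
import math

# Made-up 5-sample window in power scale: four similar pixels and one bright outlier
window_power = [0.10, 0.10, 0.10, 0.10, 1.00]


def to_db(power):
    """Convert a power value to decibels."""
    return 10.0 * math.log10(power)


# Order used in the notebook: average in power, then convert to dB
mean_then_db = to_db(sum(window_power) / len(window_power))

# Alternative order: convert each sample to dB, then average
db_then_mean = sum(to_db(p) for p in window_power) / len(window_power)

print(round(mean_then_db, 2), round(db_then_mean, 2))
```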

<font face="Calibri" size="3"><b>Plot the extracted time series</b> for VV and VH polarizations:</font> 

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(8, 4))

s_ts[0].plot(ax=ax, color='red', label='C-VV')#,xlim=(min(min(stindex),min(stindex[0])),
                                             #     max(max(stindex),max(stindex[0]))))
s_ts[1].plot(ax=ax, color='blue', label='C-VH')
ax.set_xlabel('Date')
ax.set_ylabel(r'Sentinel-1 $\gamma^o$ [dB]')  # raw string avoids the invalid '\g' escape

ax.set_title('Sentinel-1 Backscatter')
plt.grid()
_ = ax.legend(loc='best')
_ = fig.suptitle('Time Series Profiles of Sentinel-1 SAR Backscatter')
figname = f"RCSTimeSeries-{ref_x:.0f}_{ref_y:.0f}.png"
plt.savefig(figname, dpi=300, transparent=True)

<br>
<div class="alert alert-success">
<font face="Calibri" size="5"> <b> <font color='rgba(200,0,0,0.2)'> <u>EXERCISE</u>:  </font> Explore Time Series at Different Point Locations </b> </font>

<font face="Calibri" size="3"> Explore this data set some more by picking different point coordinates. Use the minimum image from Section 4.2 to identify interesting areas and explore their radar brightness history. Discuss with your colleagues what you find.
</font>
</div>
<br>
<hr>

<font face="Calibri" size="2"> <i>Exercise3B-ExploreSARTimeSeriesDeforestation-Copy1.ipynb - Version 1.2.1 - Sep 2020 </i>
    <br>
        <i>Version Changes:</i>
    <br>
    <i>- remove unnecessary reliance on asf_notebook, use os functions</i>
        <br>
    <i>- add "--no-sign-request --region us-east-1" to aws s3 cp from public bucket</i>
    
</font>
</font>