<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="7"> <b> GEOS 657: Microwave Remote Sensing</b> </font>

<font size="5"> <b>Lab 8: Change Detection in <font color='rgba(200,0,0,0.2)'>Your Own</font> SAR Amplitude Time Series Stack </b> </font>

<br>
<font size="4"> <b> Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, <a href="http://earthbigdata.com/" target="_blank">Earth Big Data, LLC</a> </b> <br>
<img style="padding: 7px" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right"/>
</font>

<font size="3"> This Lab is part of the UAF course <a href="https://radar.community.uaf.edu/" target="_blank">GEOS 657: Microwave Remote Sensing</a>. It introduces you to methods of change detection in deep multi-temporal SAR image data stacks. 
    
<font color='rgba(200,0,0,0.2)'> <b>Note:</b> This version of Lab 8 is modified to allow for change detection analysis on your own data stack created within ASF HyP3</font> 
<br><br>

<b>In this chapter we introduce the following data analysis concepts:</b>

- How to use your own HyP3-generated data stack in a change detection effort
- The concepts of time series slicing by month, year, and date.
- The concepts and workflow of Cumulative Sum-based change point detection.
- The identification of change dates for each identified change point.
</font>

<font size="4"> <font color='rgba(200,0,0,0.2)'> <b>THIS NOTEBOOK INCLUDES NO HOMEWORK ASSIGNMENTS.</b></font> 

<font size="3">Contact me at fjmeyer@alaska.edu should you run into any problems.
</font>

</font>

<hr>
<font face="Calibri" size="5" color="red"> <b>Important Note about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:
<ol type="1">
    <li> <b><a href="https://pandas.pydata.org/" target="_blank">Pandas</a></b> is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as the time-series functionality. </li>
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li>
    <li> <b><a href="https://matplotlib.org/index.html" target="_blank">Matplotlib</a></b> is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib. </li>
    <li> The <b><a href="https://www.pydoc.io/pypi/asf-hyp3-1.1.1/index.html" target="_blank">asf-hyp3 API</a></b> provides useful functions and scripts for accessing and processing SAR data via the Alaska Satellite Facility's Hybrid Pluggable Processing Pipeline, or HyP3 (pronounced "hype"). </li>
<li><b><a href="https://www.scipy.org/about.html" target="_blank">SciPy</a></b> is a library that provides functions for numerical integration, interpolation, optimization, linear algebra and statistics. </li>
</ol>

</font>
<br>
<font face="Calibri" size="3"><b>Our first step is to import them:</b> </font>

In [None]:
!pip install pyproj # this can be removed when pyproj is installed on jupyterHub

In [None]:
import time
import os
import glob
import json # for loads
import datetime

import pandas as pd
from osgeo import gdal  # the bare "import gdal" form is not available in newer GDAL releases
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.pylab as plb  # pylab re-exports the pyplot functions used below
import matplotlib.patches as patches
from matplotlib import animation
from matplotlib import rc
from pyproj import Proj, transform

from IPython.display import HTML

from asf_notebook import earthdata_hyp3_login
from asf_notebook import new_directory
from asf_notebook import download_hyp3_products
from asf_notebook import path_exists
from asf_notebook import remove_nan_filled_tifs
from asf_notebook import select_RTC_polarization

<font face="Calibri" size="3"><b>Set up matplotlib plotting</b> inside the notebook:</font>

In [None]:
%matplotlib inline 

<hr>
<font face="Calibri">

<font size="5"> <b> 1. Load Your Own Data Stack Into the Notebook </b> </font> 

<font size="3"> This lab assumes that you've created your own data stack over your personal area of interest using the <a href="https://www.asf.alaska.edu/" target="_blank">Alaska Satellite Facility's</a> value-added product system <a href="http://hyp3.asf.alaska.edu/" target="_blank">HyP3</a>. ASF uses HyP3 to prototype value-added products and to provide them to users in order to collect feedback. 

This lab expects <a href="https://media.asf.alaska.edu/uploads/RTC/rtc_atbd_v1.2_final.pdf" target="_blank">Radiometric Terrain Corrected</a> (RTC) image products as input, so be sure to select an RTC process when creating the subscription for your input data within HyP3. Use a single orbit geometry **(choose ascending or descending, not both)** to keep geometric differences between images low. 

We will retrieve HyP3 data via the HyP3 API. As both HyP3 and the Notebook environment sit in the <a href="https://aws.amazon.com/" target="_blank">Amazon Web Services (AWS)</a> cloud, data transfer is quick and cost effective.</font> 
</font>

<hr>
<font face="Calibri" size="3"> To download data from ASF, you need to provide your <a href="https://www.asf.alaska.edu/get-data/get-started/free-earthdata-account/" target="_blank">NASA Earth Data</a> username to the system. Set up an EarthData account if you do not yet have one. <font color='rgba(200,0,0,0.2)'><b>Note that EarthData's End User License Agreement (EULA) applies when accessing the HyP3 API from this notebook. If you have not acknowledged the EULA in EarthData, you will need to navigate to <a href="https://earthdata.nasa.gov/" target="_blank">EarthData's home page</a> and complete that process.</b></font>
<br><br>
<b>Login to Earthdata:</b> </font> 

In [None]:
api = earthdata_hyp3_login()

<hr>
<font face="Calibri" size="3"> Before we download anything, <b>create a working directory for this analysis and change into it:</b> </font>

In [None]:
base_path = "/home/jovyan/notebooks/ASF/GEOS_657_Labs/lab_8_own_data"
new_directory(base_path)
os.chdir(base_path)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3"><b>Create a folder in which to download your RTC products.</b> </font>

In [None]:
new_directory("rtc_products")
products_path = f"{base_path}/rtc_products"

<font face="Calibri" size="3"><b>Set a date range of products to download:</b> </font>

In [None]:
date_range = [datetime.date(2017, 1, 1), datetime.date(2017, 6, 30)] #enter your date range here
direction = 'A' # enter a flight direction here (A or D)
flight_path = 114 # enter a flight path

########## NOTE: Currently filtering by path and flight_direction does not work for InSAR products #########

# uncomment code below to download all products
#date_range = [None, None]
#direction = None
#flight_path = None

<font face="Calibri" size="3"><b>Download the products associated with an existing RTC subscription.</b> </font>

In [None]:
subscription_id = download_hyp3_products(
    api, products_path, start_date=date_range[0], end_date=date_range[1], flight_direction=direction, path=flight_path)

<hr>
<font face="Calibri" size="3"><b>Determine the subscription's process type</b>, which we need in order to determine the file paths to the tiffs.</font>

In [None]:
subscription_info = api.get_subscription(subscription_id)
process_type = subscription_info['process_id']

<font face="Calibri" size="3"><b>Create a variable called <i>paths</i>, that holds the paths to the tiffs</b>, which varies based on process type:</font>

In [None]:
rtc_path = "rtc_products"
paths = select_RTC_polarization(process_type, rtc_path)
print(f"paths: {paths}")
polarization = paths.split('.')[0][-2:]
print(polarization)

<font face="Calibri" size="3"><b>Write a function to collect the product acquisition dates:</b></font>

In [None]:
def get_dates(paths):
    dates = []
    pths = glob.glob(paths)
    for p in pths:
        date = p.split("_")[5][0:8]
        dates.append(date)
    dates.sort()
    return dates

<font face="Calibri" size="3"><b>Call get_dates() to collect the product acquisition dates:</b></font>

In [None]:
dates = get_dates(paths)
print(dates)
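
<font face="Calibri" size="3">The index-based split in get_dates() depends on how many underscores occur in the full path. As a hedged alternative, the sketch below pulls the date out of each path with a regular expression. The helper name <i>extract_date</i> and the example paths are hypothetical; the only assumption is that the acquisition time appears as YYYYMMDDTHHMMSS somewhere in the file name.</font>

```python
import re

# Hypothetical helper: find the YYYYMMDD part of the first
# YYYYMMDDTHHMMSS timestamp that occurs anywhere in a path.
_DATE_PATTERN = re.compile(r"(\d{8})T\d{6}")

def extract_date(path):
    match = _DATE_PATTERN.search(path)
    return match.group(1) if match else None

# Made-up example paths, for illustration only
example_paths = [
    "rtc_products/productA/S1A_IW_RT30_20170117T120000_G_gpn_VV.tif",
    "rtc_products/productB/S1A_IW_RT30_20170105T120000_G_gpn_VV.tif",
]
example_dates = sorted(extract_date(p) for p in example_paths)
print(example_dates)
```

<font face="Calibri" size="3">Because the pattern searches the whole string, the result does not change if the directory names contain additional underscores.</font>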

<hr>
<font face="Calibri" size="3"> You may notice duplicates in your acquisition dates. As HyP3 processes SAR data on a frame-by-frame basis, duplicates may occur if your area of interest is covered by two consecutive image frames. In this case, two separate images are generated that must be merged before time series processing can begin.
<br><br>
<b>Write functions to collect and print the paths of the tiffs:</b></font>

In [None]:
def get_tiff_paths(paths):
    tiff_paths = !ls $paths | sort -t_ -k5,5
    return tiff_paths

def print_tiff_paths(tiff_paths):
    print("Tiff paths:")
    for p in tiff_paths:
        print(f"{p}\n")

<font face="Calibri" size="3"><b>Collect and print the paths of the tiffs:</b></font>

In [None]:
tiff_paths = get_tiff_paths(paths)
#print_tiff_paths(tiff_paths)

<hr>
<font face="Calibri" size="4"> <b>1.2 Fix multiple UTM Zone-related issues</b> <br>
<br>
<font face="Calibri" size="3">Fix multiple UTM Zone-related issues should they exist in your data set. If multiple UTM zones are found, the following code cells will identify the predominant UTM zone and reproject the rest into that zone. This step must be completed prior to merging frames or performing any analysis.</font>
<br><br>
<font face="Calibri" size="3"><b>Use gdal.Info to determine the UTM definition types and zones in each product:</b></font>

In [None]:
utm_zones = []
utm_types = []
print('Checking UTM Zones in the data stack ...\n')
for k in range(0, len(tiff_paths)):
    info = (gdal.Info(tiff_paths[k], options = ['-json']))
    info = (json.loads(info))['coordinateSystem']['wkt']
    zone = info[(len(info)-8):(len(info)-3)]
    utm_zones.append(zone)
    typ = info[(len(info)-15):(len(info)-11)]
    utm_types.append(typ)
print(f"UTM Zones:\n {utm_zones}\n")
print(f"UTM Types:\n {utm_types}")
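
<font face="Calibri" size="3">The slicing above works when the WKT string ends with a five-digit authority code, as in <i>AUTHORITY["EPSG","32645"]]</i>. A slightly more defensive sketch, assuming a WKT1-style string, reads the last AUTHORITY entry with a regular expression instead of fixed character offsets. The function name <i>authority_from_wkt</i> and the abbreviated WKT string are made up for illustration; WKT2 output uses a different syntax and is not handled here.</font>

```python
import re

def authority_from_wkt(wkt):
    """Return (authority, code) from the last AUTHORITY entry of a WKT1 string."""
    entries = re.findall(r'AUTHORITY\["([^"]+)","([^"]+)"\]', wkt)
    return entries[-1] if entries else None

# Abbreviated, hand-written WKT1 string (illustration only)
example_wkt = (
    'PROJCS["WGS 84 / UTM zone 45N",'
    'GEOGCS["WGS 84",AUTHORITY["EPSG","4326"]],'
    'UNIT["metre",1,AUTHORITY["EPSG","9001"]],'
    'AUTHORITY["EPSG","32645"]]'
)
print(authority_from_wkt(example_wkt))
```

<font face="Calibri" size="3">In WKT1 the final AUTHORITY entry belongs to the outermost coordinate system, which is why the sketch takes the last match.</font>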

<font face="Calibri" size="3"><b>Identify the most commonly used UTM Zone in the data:</b></font>

In [None]:
utm_unique, counts = np.unique(utm_zones, return_counts=True)
a = np.where(counts == np.max(counts))
predominate_utm = utm_unique[a][0]
print(f"Predominant UTM Zone: {predominate_utm}")

<font face="Calibri" size="3"><b>Reproject tiffs with errant UTMs to the predominant UTM:</b></font>

In [None]:
reproject_indicies = [i for i, j in enumerate(utm_zones) if j != predominate_utm]  # indices in utm_zones that need to be reprojected
print('--------------------------------------------')
print(f'Reprojecting {len(reproject_indicies)} files')
print('--------------------------------------------')
for k in reproject_indicies:
    temppath = tiff_paths[k].strip()
    _, product_name, tiff_name = temppath.split('/')
    cmd = f"gdalwarp -overwrite rtc_products/{product_name}/{tiff_name} rtc_products/{product_name}/r{tiff_name} -s_srs {utm_types[k]}:{utm_zones[k]} -t_srs EPSG:{predominate_utm}"
    #print(f"Calling the command: {cmd}")
    !{cmd}
    rm_command = f"rm {tiff_paths[k].strip()}"
    #print(f"Calling the command: {rm_command}")
    !{rm_command}

In [None]:
tiff_paths = get_tiff_paths(paths)

<hr>
<font face="Calibri" size="4"> <b>1.3 Merge multiple frames from the same date.</b></font>
<br><br>
<font face="Calibri" size="3"><b>Create a set containing each represented date:</b></font>

In [None]:
unique_dates = set(dates)
print(unique_dates)

<font face="Calibri" size="3"><b>Determine which dates have multiple frames. Create a dictionary with each date as a key linked to a value set as an empty string:</b></font>

In [None]:
dup_dates = {}
for date in unique_dates:
    count = 0
    for d in dates:
        if date == d:
            count +=1
    if count > 1:
        dup_dates.update({date : ""})
print(dup_dates)

<font face="Calibri" size="3"><b>Update the values in dup_dates with the string paths to all the tiffs for each date:</b></font>

In [None]:
for pth in tiff_paths:
    date = pth.split('/')[2].split('_')[3][:8]
    if date in dup_dates:
        dup_dates[date] = f"{dup_dates[date]} {pth}"
print(dup_dates)
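
<font face="Calibri" size="3">The two cells above first count dates and then concatenate paths into one string per date. The same grouping can be sketched in a single pass with a dictionary of lists. The function name <i>group_paths_by_date</i> and the example paths are hypothetical; the sketch assumes you already have (date, path) pairs.</font>

```python
from collections import defaultdict

def group_paths_by_date(dated_paths):
    """Map each date to its list of paths, keeping only dates with several frames."""
    grouped = defaultdict(list)
    for date, path in dated_paths:
        grouped[date].append(path)
    return {date: paths for date, paths in grouped.items() if len(paths) > 1}

# Made-up (date, path) pairs, for illustration only
example_pairs = [
    ("20170105", "rtc_products/a/frame1.tif"),
    ("20170105", "rtc_products/b/frame2.tif"),
    ("20170117", "rtc_products/c/frame1.tif"),
]
duplicates = group_paths_by_date(example_pairs)
print(duplicates)
```

<font face="Calibri" size="3">Keeping the paths in a list avoids having to split a space-separated string again later, which matters if a path ever contains a space.</font>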

<font face="Calibri" size="3"><b>Merge all the frames for each date.</b></font>

In [None]:
for dup_date in dup_dates:
    output = f"{dup_dates[dup_date].split('/')[0]}/{dup_dates[dup_date].split('/')[1]}/new{dup_dates[dup_date].split('/')[2].split(' ')[0]}"
    gdal_command = f"gdal_merge.py -o {output} {dup_dates[dup_date]}"
    print(f"\n\nCalling the command: {gdal_command}\n")
    !{gdal_command}
    for pth in dup_dates[dup_date].split(' '):
        if pth and path_exists(pth):
            os.remove(pth)
            print(f"Deleting: {pth}")

<hr>
<font face="Calibri" size="3"> <b>Verify that all duplicate dates were resolved:</b> </font>

In [None]:
dates = get_dates(paths)
if len(dates) == len(set(dates)):
    print(f"Duplicate dates resolved.")
else:
    print(f"Duplicate dates still present!")
print(dates)

<font face="Calibri" size="3"><b>Update the paths of the tiffs:</b></font>

In [None]:
tiff_paths = get_tiff_paths(paths)
#print_tiff_paths(tiff_paths)

<hr>
<font face="Calibri">

<font size="5"> <b> 2. Create Subset and Stack Up Your Data </b> </font> 

<font size="3"> Now you are ready to work with your data. The next cells allow you to select an area of interest (AOI; via bounding-box corner coordinates) for your data analysis. Once selected, the AOI is extracted and a data stack is formed.

<b>As a first step, we extract your AOI from the full frames:</b>
</font> 
</font>

In [None]:
# Using Google Maps, get the rough bounding box for the subset
# Enter your corner coordinates below
upper_left_x = 89.668263
lower_right_x = 90.015444
upper_left_y = 25.169304
lower_right_y = 24.910358
print(f"upper left x coord: {upper_left_x}\nupper left y coord: {upper_left_y}\nlower right x coord: {lower_right_x}\nlower right y coord: {lower_right_y}")

<font face="Calibri" size="3"> <b>Convert the EPSG:4326 coords from Google Maps to the predominant EPSG in the data stack:</b> </font> 

In [None]:
in_proj = Proj(init="epsg:4326")
out_proj = Proj(init=f"epsg:{predominate_utm}")
coords = [[None, None], [None, None]]
coords[0][0], coords[0][1] = transform(in_proj, out_proj, upper_left_x, upper_left_y)
coords[1][0], coords[1][1] = transform(in_proj, out_proj, lower_right_x, lower_right_y)
print(f"{coords[0][0]}, {coords[0][1]}")
print(f"{coords[1][0]}, {coords[1][1]}")

<font face="Calibri" size="3"><b>Snap the bounding box to the 30 m pixel grid so that its edges do not cut through pixels</b>. Otherwise the subset would have to be resampled, which may slightly skew the data:</font> 

In [None]:
for corner in range (0, len(coords)):
    for c in range (0, len(coords[corner])):
        coords[corner][c] = round(coords[corner][c]/30) * 30
print(f"{coords[0][0]}, {coords[0][1]}")
print(f"{coords[1][0]}, {coords[1][1]}")
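
<font face="Calibri" size="3">The rounding step can be written as a small function and checked on made-up numbers. The sketch below assumes a pixel spacing of 30 m and a raster grid whose origin falls on a multiple of that spacing; the name <i>snap_to_grid</i> and the example coordinates are hypothetical.</font>

```python
def snap_to_grid(value, pixel_size=30):
    """Round a map coordinate to the nearest multiple of pixel_size."""
    return round(value / pixel_size) * pixel_size

# Made-up UTM corner coordinates: [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]]
example_corners = [[754321.7, 2785012.2], [789455.1, 2756388.9]]
snapped = [[snap_to_grid(c) for c in corner] for corner in example_corners]
print(snapped)
```

<font face="Calibri" size="3">Each coordinate moves by at most half a pixel (15 m here), so the AOI changes only marginally.</font>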

<font size="3"> <b>Update the list of paths to the tiffs:</b> </font> 

In [None]:
tiff_paths = get_tiff_paths(paths)
#print_tiff_paths(tiff_paths)

<font size="3"><b>Subset the tiffs and move them from the individual product directories into their own directory, /tiffs:</b></font> 

In [None]:
!mkdir -p tiffs
if path_exists('tiffs'):
    product_number = 1
    for tiff_path in tiff_paths:
        _, granule_name, tiff_name = tiff_path.split('/')
        g1, g2, g3, date, g4, g5, g6 = tiff_name.split('_')
        print(f"\nProduct #{product_number}:")
        product_number += 1
        gdal_command = f"gdal_translate -projwin {coords[0][0]} {coords[0][1]} {coords[1][0]} {coords[1][1]} -projwin_srs 'EPSG:{predominate_utm}' -co \"COMPRESS=DEFLATE\" -a_nodata 0 {tiff_path} tiffs/{date}_{polarization}.tiff"
        print(f"Calling the command: {gdal_command}")
        !{gdal_command}

<font size="3"><b>Grab the updated paths of the images:</b></font>

In [None]:
tp = f"{base_path}/tiffs"
tiff_paths = get_tiff_paths(tp)
print_tiff_paths(tiff_paths)

<font size="3"><b>Delete any subset tifs that are filled with NaNs and contain no data.</b></font>

In [None]:
t_path = f"{os.getcwd()}/tiffs/"
tiff_paths = get_tiff_paths(t_path)
remove_nan_filled_tifs(t_path, tiff_paths)

<font size="3"><b>Update the list of dates and tiff_paths after removing NaN filled images:</b></font>

In [None]:
dates = []
pth = glob.glob(f"{t_path}/*.tiff")
pth.sort()
for p in pth:
    date = p[len(p)-23:len(p)-15]
    dates.append(date)
    print(date)

tiff_paths = get_tiff_paths(tp)
# print_tiff_paths(tiff_paths) # uncomment to print tiff paths

<hr>
<font face="Calibri" size="3"> Now we stack up the data by creating a virtual raster table with links to all subset data files: </font>
<br><br>
<font size="3"><b>Create the virtual raster table for the subset GeoTiffs:</b></font>

In [None]:
!gdalbuildvrt -separate raster_stack.vrt tiffs/*.tiff

<hr>
<font face="Calibri">

<font size="5"> <b> 3. Now You Can Work With Your Data </b> </font> 

<font size="3"> Now you are ready to perform time series change detection on your data stack.
</font> 
</font>

<br>
<font face="Calibri" size="4"> <b> 3.1 Define Data Directory and Path to VRT </b> </font> 
<br><br>
<font face="Calibri" size="3"><b>Create a variable containing the VRT filename:</b></font>

In [None]:
image_file = "raster_stack.vrt"

<font face="Calibri" size="3"><b>Create a DatetimeIndex (datetime64 data) from the acquisition dates with Pandas:</b></font>

In [None]:
# Get some indices for plotting
time_index = pd.DatetimeIndex(dates)

<font face="Calibri" size="3"><b>Print the bands and dates for all images in the virtual raster table (VRT):</b></font>

In [None]:
j = 1
print(f"Bands and dates for {image_file}")
for i in time_index:
    print("{:4d} {}".format(j, i.date()), end=' ')
    j += 1
    if j%5 == 1: print()

<hr>
<br>
<font face="Calibri" size="4"> <b> 3.2 Open Your Data Stack and Visualize Some Layers </b> </font> 

<font face="Calibri" size="3"> We will <b>open your VRT</b> and visualize some layers using Matplotlib. </font>

In [None]:
img = gdal.Open(image_file)

<font face="Calibri" size="3"><b>Print the bands, pixels, and lines:</b></font>

In [None]:
print(f"Number of  bands: {img.RasterCount}")
print(f"Number of pixels: {img.RasterXSize}")
print(f"Number of  lines: {img.RasterYSize}")

<font face="Calibri" size="3"><b>Read in raster data for the first two bands:</b></font>

In [None]:
raster_1 = img.GetRasterBand(1).ReadAsArray() # change the number passed to GetRasterBand() to 
where_are_NaNs = np.isnan(raster_1)           # read rasters from different bands
raster_1[where_are_NaNs] = 0

raster_2 = img.GetRasterBand(2).ReadAsArray() #must pass a valid band number to GetRasterBand()
where_are_NaNs = np.isnan(raster_2)
raster_2[where_are_NaNs] = 0

<font face="Calibri" size="3"><b>Plot images and histograms for bands 1 and 2:</b></font>

In [None]:
# Setup the pyplot plots
fig = plb.figure(figsize=(18,10)) # Initialize figure with a size
ax1 = fig.add_subplot(221)  # 221 determines: 2 rows, 2 columns, first plot
ax2 = fig.add_subplot(222)  # 222 determines: 2 rows, 2 columns, second plot
ax3 = fig.add_subplot(223)  # 223 determines: 2 rows, 2 columns, third plot
ax4 = fig.add_subplot(224)  # 224 determines: 2 rows, 2 columns, fourth plot

# Plot the band 1 image
band_number = 1
ax1.imshow(raster_1,cmap='gray', vmin=0, vmax=0.2) #,vmin=2000,vmax=10000)
ax1.set_title('Image Band {} {}'.format(band_number, time_index[band_number-1].date()))

# Flatten the band 1 image into a 1 dimensional vector and plot the histogram:
h = ax2.hist(raster_1.flatten(), bins=200, range=(0, 0.3))
ax2.xaxis.set_label_text('Backscatter (linear scale)')
ax2.set_title('Histogram Band {} {}'.format(band_number, time_index[band_number-1].date()))

# Plot the band 2 image
band_number = 2
ax3.imshow(raster_2,cmap='gray', vmin=0, vmax=0.2) #,vmin=2000,vmax=10000)
ax3.set_title('Image Band {} {}'.format(band_number, time_index[band_number-1].date()))

# Flatten the band 2 image into a 1 dimensional vector and plot the histogram:
h = ax4.hist(raster_2.flatten(),bins=200,range=(0,0.3))
ax4.xaxis.set_label_text('Backscatter (linear scale)')
ax4.set_title('Histogram Band {} {}'.format(band_number, time_index[band_number-1].date()))

<hr>
<br>
<font face="Calibri" size="4"> <b> 3.3 Create a Time Series Animation </b> </font>

<font face="Calibri" size="3"><b>Create a directory in which to store our plots and animations:</b></font> 

In [None]:
output_path = 'plots_animations'
new_directory(output_path)

<font face="Calibri" size="3"> Now we are ready to <b>create a time series animation</b> from the calibrated SAR data. </font> 

In [None]:
band = img.GetRasterBand(1)
raster0 = band.ReadAsArray()
band_number = 0 # Needed for updates
raster_stack = img.ReadAsArray()

<font face="Calibri" size="3"><b>Close img, as it is no longer needed in the notebook:</b></font> 

In [None]:
img = None

In [None]:
raster_stack_masked = np.ma.masked_where(raster_stack==0, raster_stack)

In [None]:
%%capture
fig = plt.figure(figsize=(14, 8))
ax = fig.add_subplot(111)
ax.axis('off')
vmin = np.percentile(raster_stack.flatten(), 5)
vmax = np.percentile(raster_stack.flatten(), 95)

r0dB = 20 * np.log10(raster0) - 83

im = ax.imshow(raster0, cmap='gray', vmin=vmin, vmax=vmax)
ax.set_title("{}".format(time_index[0].date()))

def animate(i):
    ax.set_title("{}".format(time_index[i].date()))
    im.set_data(raster_stack[i])

# Interval is given in milliseconds
ani = animation.FuncAnimation(fig, animate, frames=raster_stack.shape[0], interval=400)

<font face="Calibri" size="3"><b>Configure matplotlib's RC settings for the animation:</b></font> 

In [None]:
rc('animation', embed_limit=40971520.0)  # Raise the embed limit so that the entire animation can be displayed

<font face="Calibri" size="3"><b>Create a javascript animation of the time-series running inline in the notebook:</b></font> 

In [None]:
HTML(ani.to_jshtml())

<font face="Calibri" size="3"><b>Delete the dummy png</b> that was saved to the current working directory while generating the javascript animation in the last code cell.</font> 

In [None]:
try:
    os.remove('None0000000.png')
except FileNotFoundError:
    pass

<font face="Calibri" size="3"><b>Save the animation as a gif:</b> </font> 

In [None]:
ani.save(f"{output_path}/animation.gif", writer='pillow', fps=2)

<br>
<hr>
<font face="Calibri" size="5"> <b> 4. Cumulative Sum-based Change Detection Across an Entire Image</b> </font> 

<font face="Calibri" size="3"> Using numpy arrays we can apply the concept of **cumulative sum change detection** analysis effectively on the entire image stack. We take advantage of array slicing and axis-based computing in numpy. **Axis 0 is the time domain** in our raster stacks.
    
<hr>
<font size="4"><b>4.1 Create our time series stack</b></font> 
<br><br>
<font size="3"><b>Calculate the dB scale:</b></font> 

In [None]:
db = 10.*np.log10(raster_stack_masked)
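
<font face="Calibri" size="3">The factor of 10 used above is the appropriate one for pixel values in power scale. For amplitude-scale data the equivalent conversion uses a factor of 20, because power is the square of amplitude. Which scale your products use depends on the options chosen in HyP3, so the following sketch is an illustration on a made-up value and not a statement about your data. The function names are hypothetical.</font>

```python
import math

def power_to_db(power):
    """Convert a power-scale backscatter value to decibels."""
    return 10.0 * math.log10(power)

def amplitude_to_db(amplitude):
    """Convert an amplitude-scale backscatter value to decibels."""
    return 20.0 * math.log10(amplitude)

power = 0.05                  # made-up power-scale value
amplitude = math.sqrt(power)  # the same pixel expressed in amplitude scale
print(power_to_db(power), amplitude_to_db(amplitude))
```

<font face="Calibri" size="3">Both calls describe the same pixel and therefore give the same dB value, up to floating-point rounding.</font>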

<font face="Calibri" size="3">Sometimes it makes sense to <b>extract a reduced time span</b> from the full time series to reduce the number of different change objects in a scene. In the following, we extract a shorter time span:
</font>

In [None]:
# Change these dates to fit your time span
start_date = '2017-01-01'
end_date = '2017-12-31'

date_index_subset = np.where((time_index>start_date) & (time_index<end_date))
db_subset = np.squeeze(db[date_index_subset, :, :])
time_index_subset = time_index[date_index_subset]
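
<font face="Calibri" size="3">The idea of slicing along the time axis can be tried on a toy stack. The sketch below uses plain NumPy datetime64 arrays instead of the Pandas index used in this notebook, and all dates and values are made up. It shows slicing by a date range and by month.</font>

```python
import numpy as np

# Made-up acquisition dates and a tiny stack of shape (time, rows, cols)
example_dates = np.array(
    ["2016-12-20", "2017-01-13", "2017-02-06", "2017-03-02"], dtype="datetime64[D]"
)
example_stack = np.arange(4 * 2 * 3, dtype=float).reshape(4, 2, 3)

# Slicing by date range: boolean mask along axis 0 (endpoints excluded, as above)
in_span = (example_dates > np.datetime64("2017-01-01")) & (
    example_dates < np.datetime64("2017-02-28")
)
stack_subset = example_stack[in_span]
dates_subset = example_dates[in_span]

# Slicing by month: months since 1970-01 modulo 12 gives the calendar month
months = example_dates.astype("datetime64[M]").astype(int) % 12 + 1
february = example_stack[months == 2]
print(stack_subset.shape, dates_subset, february.shape)
```

<font face="Calibri" size="3">A one-dimensional boolean mask applied to the first axis keeps the result three-dimensional, so no squeeze is needed in this variant.</font>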

In [None]:
db_subset.shape

In [None]:
plt.figure(figsize=(12, 8))
band_number = 0
vmin = np.percentile(db_subset[band_number], 5)
vmax = np.percentile(db_subset[band_number], 95)
plt.title('Band  {} {}'.format(band_number+1, time_index_subset[band_number].date()))
plt.imshow(db_subset[0], cmap='gray', vmin=vmin, vmax=vmax)
_ = plt.colorbar()

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.2 Calculate Mean Across Time Series to Prepare for Calculation of Cumulative Sum $S$:</b> </font> 
<br><br>
<font face="Calibri" size="3"><b>Write a function to convert our plots into GeoTiffs:</b></font> 

In [None]:
def geotiff_from_plot(source_image, out_filename, extent, cmap=None, vmin=None, vmax=None, interpolation=None, dpi=300):
    assert "." not in out_filename, 'Error: Do not include the file extension in out_filename'
    assert type(extent) == list and len(extent) == 2 and len(extent[0]) == 2 and len(
        extent[1]) == 2, 'Error: extent must be a list in the form [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]]'
    
    plt.figure()
    plt.axis('off')
    plt.imshow(source_image, cmap=cmap, vmin=vmin, vmax=vmax, interpolation=interpolation)
    temp = f"{out_filename}_temp.png"
    plt.savefig(temp, dpi=dpi, transparent='true', bbox_inches='tight', pad_inches=0)

    cmd = f"gdal_translate -of Gtiff -a_ullr {extent[0][0]} {extent[0][1]} {extent[1][0]} {extent[1][1]} -a_srs EPSG:{predominate_utm} {temp} {out_filename}.tiff"
    !{cmd}
    try:
        os.remove(temp)
    except FileNotFoundError:
        pass

<font face="Calibri" size="3"><b>Plot the time-series mean and save as a png (time_series_mean.png):</b></font> 

In [None]:
db_mean = np.mean(db_subset, axis=0)
plt.figure(figsize=(12, 8))
plt.imshow(db_mean, cmap='gray')
_ = plt.colorbar()
plt.savefig(f"{output_path}/time_series_mean.png", dpi=300, transparent='true')

<font face="Calibri" size="3"><b>Save the time-series mean as a GeoTiff (time_series_mean.tiff):</b></font> 

In [None]:
%%capture
geotiff_from_plot(db_mean, f"{output_path}/time_series_mean", coords, cmap='gray')

<font face="Calibri" size="3"><b>Calculate the residuals and plot residuals[0]. Save it as a png (residuals.png):</b></font> 

In [None]:
residuals = db_subset - db_mean

plt.figure(figsize=(12, 8))
plt.imshow(residuals[0])
plt.title('Residuals for Band  {} {}'.format(band_number+1, time_index_subset[band_number].date()))
_ = plt.colorbar()
plt.savefig(f"{output_path}/residuals.png", dpi=300, transparent='true')

<font face="Calibri" size="3"><b>Save the residuals[0] as a GeoTiff (residuals.tiff):</b></font> 

In [None]:
%%capture
geotiff_from_plot(residuals[0], f"{output_path}/residuals", coords)

<br>
<hr>
<font face="Calibri" size="4"><b> 4.3 Calculate Cumulative Sum $S$ as well as Change Magnitude $S_{diff}$:</b></font> 
<br><br>
<font face="Calibri" size="3"><b>Plot Smin, Smax, and the change magnitude and save a png of the plots (Smin_Smax_Sdiff.png):</b></font> 

In [None]:
summation = np.cumsum(residuals, axis=0)
summation_max = np.max(summation, axis=0)
summation_min = np.min(summation, axis=0)
change_mag = summation_max - summation_min
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
vmin = np.percentile(summation_min.flatten(), 3)
vmax = np.percentile(summation_max.flatten(), 97)
max_plot = ax[0].imshow(summation_max, vmin=vmin, vmax=vmax)
ax[0].set_title('$S_{max}$')
ax[1].imshow(summation_min, vmin=vmin, vmax=vmax)
ax[1].set_title('$S_{min}$')
ax[2].imshow(change_mag, vmin=vmin, vmax=vmax)
ax[2].set_title('Change Magnitude')
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
_ = fig.colorbar(max_plot, cax=cbar_ax)
plt.savefig(f"{output_path}/Smin_Smax_Sdiff.png", dpi=300, transparent='true')
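
<font face="Calibri" size="3">To see what the image-wide computation does for one pixel, the sketch below runs the same steps on a made-up dB time series with a step change after the fourth sample. The values and variable names are illustrative only.</font>

```python
import numpy as np

# Made-up backscatter series (dB) for one pixel; the level drops after sample 4
series_db = np.array([-8.0, -8.2, -7.9, -8.1, -12.0, -12.2, -11.9, -12.1])

residuals_px = series_db - series_db.mean()  # residuals about the temporal mean
s = np.cumsum(residuals_px)                  # cumulative sum S
s_max, s_min = s.max(), s.min()
s_diff = s_max - s_min                       # change magnitude S_diff
peak_index = int(np.argmax(np.abs(s)))       # sample where |S| peaks

print(np.round(s, 2), round(s_diff, 2), peak_index)
```

<font face="Calibri" size="3">The cumulative sum rises while the samples lie above the mean and falls once they drop below it, so its extreme value sits at the last sample before the step. It returns to zero at the end because the residuals sum to zero.</font>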

<font face="Calibri" size="3"><b>Save Smax as a GeoTiff (Smax.tiff):</b></font> 

In [None]:
%%capture
geotiff_from_plot(summation_max, f"{output_path}/Smax", coords, vmin=vmin, vmax=vmax)

<font face="Calibri" size="3"><b>Save Smin as a GeoTiff (Smin.tiff):</b></font> 

In [None]:
%%capture
geotiff_from_plot(summation_min, f"{output_path}/Smin", coords, vmin=vmin, vmax=vmax)

<font face="Calibri" size="3"><b>Save the change magnitude as a GeoTiff (Sdiff.tiff):</b></font> 

In [None]:
%%capture
geotiff_from_plot(change_mag, f"{output_path}/Sdiff", coords, vmin=vmin, vmax=vmax)

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.4 Mask $S_{diff}$ With an a-priori Threshold To Identify Change Candidates:</b> </font>

<font face="Calibri" size="3">To identify change candidate pixels, we can threshold $S_{diff}$, which reduces the computational cost of the bootstrapping. For land cover change, we would not expect more than 5-10% change pixels in a landscape. So, if the test region is reasonably large, setting a threshold for expected change to 10% is appropriate. In our example, we'll start out with a very conservative threshold of 50%.
<br><br>
<b>Plot and save the histogram and CDF for the change magnitude (change_mag_histogram_CDF.png):</b></font>

In [None]:
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(14, 6)) # Initialize figure with a size
ax1 = fig.add_subplot(121)  # 121 determines: 1 row, 2 columns, first plot
ax2 = fig.add_subplot(122)  # 122 determines: 1 row, 2 columns, second plot
# First plot: Histogram
# IMPORTANT: To get a histogram, we first need to *flatten* 
# the two-dimensional image into a one-dimensional vector.
histogram = ax1.hist(change_mag.flatten(), bins=200, range=(0, np.max(change_mag)))
ax1.xaxis.set_label_text('Change Magnitude')
ax1.set_title('Change Magnitude Histogram')
ax1.grid()
# Second plot: Cumulative distribution function (CDF)
n, bins, _ = ax2.hist(change_mag.flatten(), bins=200, range=(0, np.max(change_mag)), cumulative=True, density=True, histtype='step', label='Empirical')
ax2.xaxis.set_label_text('Change Magnitude')
ax2.set_title('Change Magnitude CDF')
ax2.grid()
plt.savefig(f"{output_path}/change_mag_histogram_CDF.png", dpi=72)

In [None]:
precentile = 0.5
out_indicies = np.where(n>precentile)
threshold_index = np.min(out_indicies)
threshold = bins[threshold_index]
print('At the {}% percentile, the threshold value is {:2.2f}'.format(precentile*100, threshold))
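
<font face="Calibri" size="3">The threshold above is read from the binned CDF. As a cross-check, the same quantile can be estimated directly from the values. The sketch below compares both on synthetic numbers drawn from a gamma distribution, which is an arbitrary stand-in and not a model of your data.</font>

```python
import numpy as np

rng = np.random.default_rng(0)
example_change_mag = rng.gamma(shape=2.0, scale=3.0, size=10_000)  # synthetic values

fraction = 0.5  # plays the role of the 50% setting used above

# Histogram-based threshold: left edge of the first bin whose CDF exceeds fraction
counts, edges = np.histogram(example_change_mag, bins=200,
                             range=(0, example_change_mag.max()))
cdf = np.cumsum(counts) / counts.sum()
first_bin = int(np.argmax(cdf > fraction))
hist_threshold = edges[first_bin]

# Direct estimate of the same quantile
direct_threshold = np.percentile(example_change_mag, fraction * 100)
bin_width = edges[1] - edges[0]
print(hist_threshold, direct_threshold, bin_width)
```

<font face="Calibri" size="3">The two estimates agree to within one histogram bin width, so the binned approach loses little precision when 200 bins are used.</font>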

<font face="Calibri" size="3">Using this threshold, we can <b>visualize our change candidate areas and save them as a png (change_candidate.png):</b></font>

In [None]:
change_mag_mask = change_mag < threshold
plt.figure(figsize=(12, 8))
plt.title('Change Candidate Areas (black)')
_ = plt.imshow(change_mag_mask, cmap='gray')
plt.savefig(f"{output_path}/change_candidate.png", dpi=300, transparent='true')

<font face="Calibri" size="3"><b>Save the change candidate areas as a GeoTiff (change_candidate.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(change_mag_mask, f"{output_path}/change_candidate", coords, cmap='gray')

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.5 Bootstrapping to Prepare for Change Point Selection:</b> </font>

<font face="Calibri" size="3">We can now perform bootstrapping over the candidate pixels. The workflow is as follows:
<ul>
    <li>Filter our residuals to the change candidate pixels</li>
    <li>Perform bootstrapping over candidate pixels</li>
</ul>
For efficient computing, we permute the index of the time axis.
</font>

In [None]:
residuals_mask = np.broadcast_to(change_mag_mask , residuals.shape)
residuals_masked = np.ma.array(residuals, mask=residuals_mask)
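
<font face="Calibri" size="3">The masking step above can be illustrated on a toy stack. This is a minimal sketch with made-up values (<i>stack</i> and <i>mask_2d</i> are hypothetical names): a 2-D mask is broadcast along the time axis, and in the <i>numpy.ma</i> convention a mask value of True means the pixel is <b>excluded</b> from later computations.
</font>

```python
import numpy as np

# Toy stack: 3 time steps of 2 x 2 pixels (arbitrary values)
stack = np.arange(12, dtype=float).reshape(3, 2, 2)

# 2-D mask: True marks pixels to exclude (numpy.ma convention)
mask_2d = np.array([[True, False],
                    [False, True]])

# Repeat the 2-D mask along the time axis without copying memory
mask_3d = np.broadcast_to(mask_2d, stack.shape)
stack_masked = np.ma.array(stack, mask=mask_3d)

# Number of valid (unmasked) time steps per pixel
valid_counts = stack_masked.count(axis=0)
print(valid_counts)
```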

<font face="Calibri" size="3">On the masked time series stack of residuals, we can re-compute the cumulative sums:
</font>

In [None]:
summation_masked = np.ma.cumsum(residuals_masked, axis=0)

<font face="Calibri" size="3"><b>Plot the masked Smax, Smin, and change magnitude. Save them as a png (masked_Smax_Smin_Sdiff.png):</b>
</font>

In [None]:
summation_masked_max = np.ma.max(summation_masked, axis=0)
summation_masked_min = np.ma.min(summation_masked, axis=0)
change_mag_masked = summation_masked_max - summation_masked_min
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
vmin = summation_masked_min.min()
vmax = summation_masked_max.max()
masked_sum_max_plot = ax[0].imshow(summation_masked_max, vmin=vmin, vmax=vmax)
ax[0].set_title('Masked $S_{max}$')
ax[1].imshow(summation_masked_min, vmin=vmin, vmax=vmax)
ax[1].set_title('Masked $S_{min}$')
ax[2].imshow(change_mag_masked, vmin=vmin, vmax=vmax)
ax[2].set_title('Masked Change Magnitude')
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
_ = fig.colorbar(masked_sum_max_plot, cax=cbar_ax)
plt.savefig(f"{output_path}/masked_Smax_Smin_Sdiff.png", dpi=300, transparent=True)
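
<font face="Calibri" size="3">To build intuition for $S_{max}$, $S_{min}$, and the change magnitude, the sketch below applies the same cumulative sum to a single hypothetical pixel. It assumes that the residuals are the time series minus its temporal mean; the backscatter values are invented for illustration only.
</font>

```python
import numpy as np

def change_magnitude(series):
    """Return Smax - Smin of the cumulative sum of mean-removed residuals."""
    residuals_demo = series - series.mean()
    s_curve = np.cumsum(residuals_demo)
    return s_curve.max() - s_curve.min()

# Hypothetical pixel with a 4 dB drop after the fifth acquisition
series_step = np.array([-8.0] * 5 + [-12.0] * 5)
# Hypothetical pixel with no change at all
series_flat = np.full(10, -10.0)

s_diff_step = change_magnitude(series_step)
s_diff_flat = change_magnitude(series_flat)
print(f"Sdiff with step change: {s_diff_step}, without change: {s_diff_flat}")
```

<font face="Calibri" size="3">The step change lets the residuals accumulate in one direction before they reverse, which produces a large $S_{max} - S_{min}$; a stable series stays near zero.
</font>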

<font face="Calibri" size="3"><b>Save the masked Smax as a GeoTiff (masked_Smax.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(summation_masked_max, f"{output_path}/masked_Smax", coords, vmin=vmin, vmax=vmax)

<font face="Calibri" size="3"><b>Save the masked Smin as a GeoTiff (masked_Smin.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(summation_masked_min, f"{output_path}/masked_Smin", coords, vmin=vmin, vmax=vmax)

<font face="Calibri" size="3"><b>Save the masked change magnitude as a GeoTiff (masked_Sdiff.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(change_mag_masked, f"{output_path}/masked_Sdiff", coords, vmin=vmin, vmax=vmax)

<font face="Calibri" size="3">Now let's perform <b>bootstrapping</b>:
</font>

In [None]:
random_index = np.random.permutation(residuals_masked.shape[0])
residuals_random = residuals_masked[random_index,:,:]

In [None]:
n_bootstraps = 100  # number of bootstrap iterations

# to keep track of the maximum Sdiff of the bootstrapped sample:
change_mag_random_max = np.ma.copy(change_mag_masked) 
change_mag_random_max[~change_mag_random_max.mask]=0
# to compute the Sdiff sums of the bootstrapped sample:
change_mag_random_sum = np.ma.copy(change_mag_masked) 
change_mag_random_sum[~change_mag_random_sum.mask]=0
# to count how often the observed Sdiff exceeds the bootstrapped Sdiff:
n_change_mag_gt_change_mag_random = np.ma.copy(change_mag_masked) 
n_change_mag_gt_change_mag_random[~n_change_mag_gt_change_mag_random.mask]=0
print("Running Bootstrapping for %d iterations ..." % (n_bootstraps))
for i in range(n_bootstraps):
    # For efficiency, we shuffle the time axis index and use that 
    # to randomize the masked array
    random_index = np.random.permutation(residuals_masked.shape[0])
    # Randomize the time step of the residuals
    residuals_random = residuals_masked[random_index,:,:]  
    summation_random = np.ma.cumsum(residuals_random, axis=0)
    summation_random_max = np.ma.max(summation_random, axis=0)
    summation_random_min = np.ma.min(summation_random, axis=0)
    change_mag_random = summation_random_max - summation_random_min
    change_mag_random_sum += change_mag_random
    change_mag_random_max[np.ma.greater(change_mag_random, change_mag_random_max)] = \
    change_mag_random[np.ma.greater(change_mag_random, change_mag_random_max)]
    n_change_mag_gt_change_mag_random[np.ma.greater(change_mag_masked, change_mag_random)] += 1
    if ((i+1)/n_bootstraps*100)%10 == 0:
        print("\r%4.1f%% completed" % ((i+1)/n_bootstraps*100), end='\r', flush=True)
print("Bootstrapping Complete")
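
<font face="Calibri" size="3">The bootstrapping loop above can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not the lab's implementation: the time series is synthetic, the residuals are assumed to be the series minus its mean, and the helper <i>s_diff</i> is a hypothetical name. Note that the lab applies one permutation of the time axis to all pixels per iteration, whereas this sketch handles only one pixel.
</font>

```python
import numpy as np

rng = np.random.default_rng(42)

def s_diff(residuals):
    """Change magnitude: range of the cumulative sum of the residuals."""
    s_curve = np.cumsum(residuals)
    return s_curve.max() - s_curve.min()

# Synthetic pixel: 30 samples around -8 dB followed by 30 samples around -12 dB
series = np.concatenate([rng.normal(-8, 0.5, 30), rng.normal(-12, 0.5, 30)])
residuals_demo = series - series.mean()
s_diff_obs = s_diff(residuals_demo)

# Shuffle the time order repeatedly and recompute the change magnitude
n_boot = 200
s_diff_boot = np.array([s_diff(rng.permutation(residuals_demo))
                        for _ in range(n_boot)])

# Fraction of shuffles whose change magnitude is smaller than the observed one
confidence_level_demo = np.mean(s_diff_obs > s_diff_boot)
# One minus the ratio of mean shuffled to observed change magnitude
significance_demo = 1.0 - s_diff_boot.mean() / s_diff_obs
print(f"CL = {confidence_level_demo:.2f}, significance = {significance_demo:.2f}")
```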

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.6 Extract Confidence Metrics and Select Final Change Points:</b> </font>

<font face="Calibri" size="3">We first <b>compute for all pixels the confidence level $CL$, the change point significance metric $CP_{significance}$ and the product of the two as our confidence metric for identified change points. Plot the results and save them as a png (confidenceLevel_CPSignificance.png):</b></font>

In [None]:
confidence_level = n_change_mag_gt_change_mag_random / n_bootstraps
change_point_significance = 1. - (change_mag_random_sum / n_bootstraps) / change_mag
# Plot
fig, ax = plt.subplots(1, 3, figsize=(16, 4))
a = ax[0].imshow(confidence_level*100)
fig.colorbar(a, ax=ax[0])
ax[0].set_title('Confidence Level %')
a = ax[1].imshow(change_point_significance)
fig.colorbar(a, ax=ax[1])
ax[1].set_title('Significance')
a = ax[2].imshow(confidence_level*change_point_significance)
fig.colorbar(a, ax=ax[2])
_ = ax[2].set_title('CL x S')
plt.savefig(f"{output_path}/confidenceLevel_CPSignificance.png", dpi=300, transparent=True)
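
<font face="Calibri" size="3">Written out, the two metrics computed in the cell above are as follows. The symbols are notation introduced here for illustration: $S_{diff}$ is the observed change magnitude of a pixel, $S_{diff}^{(b)}$ is the change magnitude of the $b$-th bootstrap sample, and $N$ is <i>n_bootstraps</i>.

$$CL = \frac{\#\{\, b : S_{diff} > S_{diff}^{(b)} \,\}}{N}, \qquad CP_{significance} = 1 - \frac{\frac{1}{N}\sum_{b=1}^{N} S_{diff}^{(b)}}{S_{diff}}$$

$CL$ is the fraction of bootstrap samples whose change magnitude is smaller than the observed one, and $CP_{significance}$ approaches 1 when the observed change magnitude is much larger than the average bootstrapped change magnitude.
</font>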

<font face="Calibri" size="3"><b>Save the confidence level as a GeoTiff (confidence_level.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(confidence_level*100, f"{output_path}/confidence_level", coords)

<font face="Calibri" size="3"><b>Save the change point significance as a GeoTiff (cp_significance.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(change_point_significance, f"{output_path}/cp_significance", coords)

<font face="Calibri" size="3"><b>Save the product of the confidence level and the change point significance as a GeoTiff (confidenceLevel_x_CPSignificance.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(confidence_level*change_point_significance, f"{output_path}/confidenceLevel_x_CPSignificance", coords)

<font face="Calibri" size="3">Now we can <b>set a change point threshold</b> to identify most likely change pixels in our map of change candidates:
</font>

In [None]:
change_point_threshold = 0.01

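<font face="Calibri" size="3"><b>Plot the detected change pixels based on the change_point_threshold and save the plot as a png (detected_change_pixels.png).</b> Note that the plotted boolean array is True where $CL \times CP_{significance}$ falls below the threshold, so among the candidate pixels the detected change pixels are the ones where it is False.</font>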

In [None]:
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(1, 1, 1)
plt.title('Detected Change Pixels based on Threshold %2.2f' % (change_point_threshold))
a = ax.imshow(confidence_level*change_point_significance < change_point_threshold, cmap='cool')
plt.savefig(f"{output_path}/detected_change_pixels.png", dpi=300, transparent=True)

<font face="Calibri" size="3"><b>Save the detected_change_pixels as a GeoTiff (detected_change_pixels.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(confidence_level*change_point_significance < change_point_threshold, f"{output_path}/detected_change_pixels", coords, cmap='cool')

<br>
<hr>
<font face="Calibri" size="4"> <b> 4.7 Derive Timing of Change for Each Change Pixel:</b> </font>

<font face="Calibri" size="3">Our last step in the identification of the change points is to extract the timing of the change. We will produce a raster layer that shows, for each change pixel, the band number of the date at which the change was detected. We will make use of the numpy indexing scheme. First, we create a combined mask from the first threshold (the change candidate mask) and the change points identified after the bootstrapping. For this we use the numpy "mask_or" operation.
</font>

In [None]:
# make a mask of our change points from the new threshold and the previous mask
change_point_mask = np.ma.mask_or(confidence_level*change_point_significance < change_point_threshold, confidence_level.mask)
# Broadcast the mask to the shape of the masked S curves
change_point_mask2 = np.broadcast_to(change_point_mask, summation_masked.shape)
# Make a numpy masked array with this mask
change_point_raster = np.ma.array(summation_masked.data, mask=change_point_mask2)
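
<font face="Calibri" size="3">The mask combination above can be tried on a small example. This is a sketch with invented numbers (<i>metric</i> and <i>threshold_demo</i> are hypothetical names): a pixel ends up masked if it was already masked as a non-candidate <b>or</b> if its confidence metric falls below the threshold.
</font>

```python
import numpy as np

# Toy confidence metric; the lower-left pixel was already masked as a non-candidate
metric = np.ma.array([[0.9, 0.005],
                      [0.3, 0.002]],
                     mask=[[False, False],
                           [True, False]])
threshold_demo = 0.01

# True where the metric is below the threshold (these pixels get excluded)
below = metric < threshold_demo

# Logical OR of the new threshold mask and the existing candidate mask
combined = np.ma.mask_or(below, metric.mask)
print(combined)
```

<font face="Calibri" size="3">Only the upper-left pixel remains unmasked in this toy case, because it is both a candidate and at or above the threshold.
</font>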

<font face="Calibri" size="3">To retrieve the dates of the change points we find the band indices in the time series along the time axis where the maximum of the cumulative sums was located. Numpy offers the "argmax" function for this purpose.
</font>

In [None]:
change_point_index = np.ma.argmax(change_point_raster, axis=0)
change_indices = list(np.unique(change_point_index))
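# Index 0 is what argmax returns for fully masked pixels, so drop it if present
if 0 in change_indices:
    change_indices.remove(0)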
print(change_indices)
# Look up the dates from the indices to get the change dates
all_dates = time_index_subset
change_dates = [str(all_dates[x].date()) for x in change_indices]
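
<font face="Calibri" size="3">The index-to-date lookup can be sketched on a tiny example. All values below are invented, and the sketch uses NumPy <i>datetime64</i> values as a stand-in for the lab's <i>time_index_subset</i>; it only illustrates how <i>argmax</i> along the time axis yields one band index per pixel, which is then translated into a date.
</font>

```python
import numpy as np

# Hypothetical acquisition dates, 12 days apart
dates_demo = np.datetime64("2020-01-01") + 12 * np.arange(6).astype("timedelta64[D]")

# Cumulative sum curves for two pixels, shape (time, y, x) = (6, 1, 2)
s_demo = np.array([[[1.0, 0.5]],
                   [[2.0, 1.0]],
                   [[3.0, 1.5]],
                   [[2.0, 2.0]],
                   [[1.0, 1.0]],
                   [[0.0, 0.0]]])

# Band index of the maximum of each pixel's cumulative sum curve
index_demo = np.argmax(s_demo, axis=0)

# Unique non-zero indices and their dates
indices_demo = [int(i) for i in np.unique(index_demo) if i != 0]
change_dates_demo = [str(dates_demo[i]) for i in indices_demo]
print(indices_demo, change_dates_demo)
```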

<font face="Calibri" size="3">Lastly, we <b>plot the change dates by showing the $CP_{index}$ raster and label the change dates. Save the plot as a png (change_dates.png):</b></font>

In [None]:
ticks = change_indices
ticklabels = change_dates

cmap = plt.get_cmap('tab20', ticks[-1])
fig, ax = plt.subplots(figsize=(12, 12))
cax = ax.imshow(change_point_index, interpolation='nearest', cmap=cmap)
# fig.subplots_adjust(right=0.8)
# cbar_ax = fig.add_axes([0.85, 0.15, 0.05, 0.7])
# fig.colorbar(p,cax=cbar_ax)

ax.set_title('Dates of Change')
# cbar = fig.colorbar(cax,ticks=ticks)
cbar = fig.colorbar(cax, ticks=ticks, orientation='horizontal')
_ = cbar.ax.set_xticklabels(ticklabels, size=10, rotation=45, ha='right')
plt.savefig(f"{output_path}/change_dates.png", dpi=300, transparent=True)

<font face="Calibri" size="3"><b>Save the change dates as a GeoTiff (change_dates.tiff):</b>
</font>

In [None]:
%%capture
geotiff_from_plot(change_point_index, f"{output_path}/change_dates", coords, cmap=cmap, interpolation='nearest', dpi=600)

<font face="Calibri" size="2"> <i>GEOS 657 Microwave Remote Sensing - Version 1.0 - April 2019 </i>
</font>