![OpenSARlab notebook banner](NotebookAddons/blackboard-banner.png)

# Exploring SAR Data and SAR Time Series Analysis using Jupyter Notebooks

### Franz J Meyer; University of Alaska Fairbanks, Josef Kellndorfer; [Earth Big Data, LLC](http://earthbigdata.com/), Alex Lewandowski; Alaska Satellite Facility

<img src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" />

This notebook introduces the analysis of deep multi-temporal SAR image data stacks within *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks combine executable Python code with Markdown-formatted text, including LaTeX-style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series steps. They therefore provide an excellent basis for collaborative and repeatable data analysis.

**We introduce the following data analysis concepts:**

- How to load your **Zarr-store** SAR data into Jupyter Notebooks and create a time series stack 
- How to apply calibration constants to convert initial digital number (DN) data into calibrated radar cross section information.
- How to subset images and create a time series of your subset data.
- How to explore the time-series information in SAR data stacks for environmental analysis.

---
**Important Note about JupyterHub**

Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.

In [1]:
import url_widget as url_w
notebookUrl = url_w.URLWidget()
display(notebookUrl)

URLWidget()

In [2]:
from IPython.display import Markdown
from IPython.display import display

notebookUrl = notebookUrl.value
user = !echo $JUPYTERHUB_USER
env = !echo $CONDA_PREFIX
if env[0] == '':
    env[0] = 'Python 3 (base)'
if env[0] != '/home/jovyan/.local/envs/rtc_analysis':
    display(Markdown(f'<text style=color:red><strong>WARNING:</strong></text>'))
    display(Markdown(f'<text style=color:red>This notebook should be run using the "rtc_analysis" conda environment.</text>'))
    display(Markdown(f'<text style=color:red>It is currently using the "{env[0].split("/")[-1]}" environment.</text>'))
    display(Markdown(f'<text style=color:red>Select the "rtc_analysis" from the "Change Kernel" submenu of the "Kernel" menu.</text>'))
    display(Markdown(f'<text style=color:red>If the "rtc_analysis" environment is not present, use <a href="{notebookUrl.split("/user")[0]}/user/{user[0]}/notebooks/conda_environments/Create_OSL_Conda_Environments.ipynb"> Create_OSL_Conda_Environments.ipynb </a> to create it.</text>'))
    display(Markdown(f'<text style=color:red>Note that you must restart your server after creating a new environment before it is usable by notebooks.</text>'))

---
## 0. Importing Relevant Python Packages

In this notebook we will use the following scientific libraries:

- [Pandas](https://pandas.pydata.org/) is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as the time-series functionality.
- [GDAL](https://www.gdal.org/) is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.
- [NumPy](http://www.numpy.org/) is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. 
- [Matplotlib](https://matplotlib.org/index.html) is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib.
- [SciPy](https://www.scipy.org/about.html) is a library that provides functions for numerical integration, interpolation, optimization, linear algebra and statistics.

**Our first step is to import them:**

In [3]:
# %%capture

# TODO Add s3fs, xarray, and zarr to rtc_analysis_env.yml
try:
    import s3fs
except ImportError:
    !mamba install -c conda-forge -q s3fs --yes
    import s3fs
try:
    import xarray as xr
except ImportError:
    !mamba install -c conda-forge -q xarray --yes
    import xarray as xr
try:
    import zarr
except ImportError:
    # After installing zarr, you have to refresh your browser window 
    # and restart the notebook kernel before this notebook will run
    !mamba install -c conda-forge -q zarr --yes
    import zarr

from pathlib import Path
import json # for loads
import math # for ceil

import dask
import pandas as pd # for DatetimeIndex
import pyproj
from osgeo import gdal # for Info
import numpy as np 
import scipy.signal

%matplotlib inline
import matplotlib.pylab as plb # for figure, grid, rcParams, savefig
import matplotlib.pyplot as plt
from matplotlib import animation
from matplotlib import rc

from ipyfilechooser import FileChooser

from IPython.display import HTML

import opensarlab_lib as asfn
# asfn.jupytertheme_matplotlib_format()

---
## 1. Load Your Prepared Data Stack Into the Notebook

This notebook assumes that you've prepared your own data stack of **RTC image products** over your personal area of interest. This can be done using the **Prepare_Data_Stack_Hyp3** and **Subset_Data_Stack notebooks**.
    
This notebook expects [Radiometric Terrain Corrected](https://media.asf.alaska.edu/uploads/RTC/rtc_atbd_v1.2_final.pdf) (RTC) image products as input, so be sure to select an RTC process when creating the subscription for your input data within HyP3. Prefer a **unique orbit geometry** (ascending or descending) to keep geometric differences between images low. 

**Select the Polarization of the data to use for the time series:**

In [4]:
print('Select a polarity for the time-series data:')
pol_select = asfn.select_parameter(['VV', 'VH'], '')
display(pol_select)

Select a polarity for the time-series data:


RadioButtons(layout=Layout(min_width='800px'), options=('VV', 'VH'), value='VV')

**Open a Zarr Store containing `VV` and/or `VH` RTC Data**

TODO: Replace hard-coded S3 path with an input field for S3 path to any public Zarr Store

In [5]:
# Create S3FileSystem object and map it to the S3 bucket and directory holding the zarr-stores
s3_path = "s3://asf-jupyter-data-west/zarr_test/bangladesh"
s3 = s3fs.S3FileSystem(anon=True)
store = s3fs.S3Map(root=s3_path, s3=s3, check=False)

pol = pol_select.value

# Create a list of groups (zarr-speak for the directories holding the zarr-stores)
groups = [g['name'].split('/')[-1] for g in s3.listdir(s3_path) if g['StorageClass'] == 'DIRECTORY']
if len(groups) < 1:
    print("Found no zarr-stores at provided S3 path")
groups = [g for g in groups if pol in g or pol.lower() in g]
if len(groups) < 1:
    print(f"Found no zarr stores in the {pol} polarity")
    
# Open each zarr-store and assign it to a variable named for its group
for g in groups:
    exec(f"{g} = xr.open_zarr(store=store, consolidated=True, group='{g}')")

print("Zarr Store Groups:\n")
for g in groups:
    print(g)
    
datasets = [globals()[g] for g in groups]

Zarr Store Groups:

S1A_IW_20170402T235525_DVP_RTC30_G_gpuned_9C40_VV
S1A_IW_20170414T235525_DVP_RTC30_G_gpuned_C9FE_VV
S1A_IW_20170426T235525_DVP_RTC30_G_gpuned_6648_VV
S1A_IW_20170508T235526_DVP_RTC30_G_gpuned_38D4_VV
S1A_IW_20170520T235527_DVP_RTC30_G_gpuned_C174_VV
S1A_IW_20170601T235528_DVP_RTC30_G_gpuned_9A6A_VV
S1A_IW_20170731T235531_DVP_RTC30_G_gpuned_8285_VV
S1A_IW_20170812T235532_DVP_RTC30_G_gpuned_E6B5_VV
S1A_IW_20170824T235532_DVP_RTC30_G_gpuned_60E1_VV
S1A_IW_20170905T235533_DVP_RTC30_G_gpuned_7E34_VV
S1A_IW_20170917T235533_DVP_RTC30_G_gpuned_E674_VV
S1A_IW_20170929T235533_DVP_RTC30_G_gpuned_9E2E_VV


In [6]:
S1A_IW_20170601T235528_DVP_RTC30_G_gpuned_9A6A_VV

| | Array | Chunk |
|---|---|---|
| Bytes | 287.22 MiB | 1.12 MiB |
| Shape | (7783, 9674) | (487, 605) |
| Count | 257 Tasks | 256 Chunks |
| Type | float32 | numpy.ndarray |


**Select or create a directory in which to store your output files:**

In [None]:
import shutil  # for rmtree
from IPython.display import clear_output

while True:
    print(f"Current working directory: {Path.cwd()}")
    data_dir = Path(input(f"\nPlease enter the name of a directory in which to store output from this analysis."))
    if data_dir == Path('.'):
        continue
    if data_dir.is_dir():
        contents = data_dir.glob('*')
        if len(list(contents)) > 0:
            choice = asfn.handle_old_data(data_dir)
            if choice == 1:
                if data_dir.exists():
                    shutil.rmtree(data_dir)
                data_dir.mkdir()
                break
            elif choice == 2:
                break
            else:
                clear_output()
                continue
        else:
            break
    else:
        data_dir.mkdir()
        break

In [None]:
product_path = Path.cwd()/data_dir
print(product_path)

**Create a list of data acquisition dates parsed from the group names (dataset directories in the Zarr store):**

In [None]:
dates = [asfn.date_from_product_name(g).split('T')[0] for g in groups]
print(dates)
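
For readers without `opensarlab_lib` at hand, the date extraction can be sketched with the standard library alone. This is only an illustration, assuming the acquisition timestamp is the third underscore-separated field of the group name, as in the names listed earlier; `acquisition_datetime` is a hypothetical helper and not part of `opensarlab_lib`.

```python
from datetime import datetime

def acquisition_datetime(product_name: str) -> datetime:
    """Hypothetical helper: parse the YYYYMMDDTHHMMSS field of a group name."""
    # Assumes names like S1A_IW_20170402T235525_DVP_RTC30_G_gpuned_9C40_VV,
    # where the timestamp is the third underscore-separated field.
    timestamp = product_name.split('_')[2]
    return datetime.strptime(timestamp, '%Y%m%dT%H%M%S')

example_name = 'S1A_IW_20170402T235525_DVP_RTC30_G_gpuned_9C40_VV'
example_dt = acquisition_datetime(example_name)
# Date part only, in the same YYYYMMDD form used for the `dates` list
print(example_dt.strftime('%Y%m%d'))
```

Parsing into a `datetime` object rather than keeping a string makes later sorting and plotting along a time axis straightforward.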

**Gather the upper-left and lower-right corner coordinates of the data stack:**

In [None]:
# TODO add to opensarlab-lib

def get_corners_xr_ds(ds: xr.core.dataset.Dataset):
    x = ds.coords['x']
    y = ds.coords['y']
    return {
        'ul': [int(x[0].compute()), int(y[0].compute())],
        'lr': [int(x[len(x)-1].compute()), int(y[len(y)-1].compute())]
    }

In [None]:
coords = get_corners_xr_ds(datasets[0])
print(coords)

**Grab the stack's UTM zone.** Note that any UTM zone conflicts should already have been handled in the Prepare_Data_Stack_Hyp3 notebook.

In [None]:
utm = datasets[0].epsg
print(f"UTM Zone: {utm}")

**Select an Area of Interest**

Note, the amount of data you can analyze in this notebook is limited by the amount of available memory, which is why we must subset full-granule sized data sets.
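
To get a feel for the memory limit, the uncompressed size of a stack can be estimated from its shape. The sketch below uses the scene shape (7783 x 9674) and `float32` type reported for the example dataset earlier in this notebook, together with its 12 acquisition dates; `stack_size_mib` is a hypothetical helper introduced only for this illustration.

```python
def stack_size_mib(n_dates: int, n_lines: int, n_pixels: int,
                   bytes_per_value: int = 4) -> float:
    """Uncompressed in-memory size of a stack in MiB (float32 by default)."""
    return n_dates * n_lines * n_pixels * bytes_per_value / 2**20

one_scene = stack_size_mib(1, 7783, 9674)
full_stack = stack_size_mib(12, 7783, 9674)
print(f"one scene: {one_scene:.2f} MiB, twelve scenes: {full_stack / 1024:.2f} GiB")
```

A single full scene is already close to 290 MiB, so loading every date at full extent can exhaust a small server quickly.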

In [None]:
# TODO update AOI_Selector in opensarlab-lib

from typing import Optional, Union

import matplotlib
import matplotlib.pyplot as plt
from matplotlib.widgets import RectangleSelector
import matplotlib.patches as patches

class AOI_Selector:
    """
    Creates an interactive matplotlib plot allowing users
    to select an area-of-interest with a bounding box
    """

    def __init__(self, image: np.ndarray,
                 fig_xsize: Optional[Union[float, int]] = None, fig_ysize: Optional[Union[float, int]] = None,
                 cmap: Optional[matplotlib.colors.LinearSegmentedColormap] = plt.cm.gist_gray,
                 vmin: Optional[Union[float, int]] = None, vmax: Optional[Union[float, int]] = None
                 ):
        self.image = image
        self.x1 = None
        self.y1 = None
        self.x2 = None
        self.y2 = None
        # Compare against None so that an explicit value of 0 is respected
        if vmin is None:
            self.vmin = np.percentile(self.image.flatten(), 5)
        else:
            self.vmin = vmin
        if vmax is None:
            self.vmax = np.percentile(self.image.flatten(), 95)
        else:
            self.vmax = vmax
        if fig_xsize and fig_ysize:
            self.fig, self.current_ax = plt.subplots(figsize=(fig_xsize, fig_ysize))
        else:
            self.fig, self.current_ax = plt.subplots()
        self.fig.suptitle('Area-Of-Interest Selector', fontsize=16)
        self.current_ax.imshow(self.image, cmap=cmap, vmin=self.vmin, vmax=self.vmax)

        def toggle_selector(event: matplotlib.backend_bases.Event):
            """
            Takes: a key press event
            Toggles the selector off when the 'q' key is pressed.
            Toggles the selector on when the 'a' key is pressed.
            """
            if event.key in ['Q', 'q'] and toggle_selector.RS.active:
                toggle_selector.RS.set_active(False)
            if event.key in ['A', 'a'] and not toggle_selector.RS.active:
                toggle_selector.RS.set_active(True)

        toggle_selector.RS = RectangleSelector(self.current_ax, self.line_select_callback,
                                               useblit=True,
                                               button=[1, 3],  # don't use middle button
                                               minspanx=0, minspany=0,
                                               spancoords='pixels',
                                               props=dict(facecolor='red', edgecolor='yellow',
                                                              alpha=0.3, fill=True),
                                               interactive=True)
        plt.connect('key_press_event', toggle_selector)

    def line_select_callback(self,
                             eclick: matplotlib.backend_bases.Event,
                             erelease: matplotlib.backend_bases.Event):
        """
        Takes: An eclick and erelease event
        Sets self.x1, self.x2, self.y1, and self.y2 from selection corner coordinates
        """

        self.x1, self.y1 = eclick.xdata, eclick.ydata
        self.x2, self.y2 = erelease.xdata, erelease.ydata

In [None]:
%matplotlib widget
aoi = AOI_Selector(datasets[0].backscatter.data, fig_xsize=10, fig_ysize=10)

In [None]:
geotrans = (coords['ul'][0], datasets[0].x_spacing, 0.0, coords['ul'][1], 0.0, datasets[0].y_spacing)
projlatlon = pyproj.Proj('EPSG:4326') # WGS84
projimg = pyproj.Proj(f'EPSG:{utm}')

In [None]:
# TODO: rename geolocation_from_plot_coord() and add to opensarlab-lib

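def geolocation(x, y, geotrans, latlon=True):
    # Convert plot (pixel) coordinates to map coordinates with the GDAL-style geotransform
    ref_x = geotrans[0]+x*geotrans[1]
    ref_y = geotrans[3]+y*geotrans[5]
    if latlon:
        # pyproj.transform() is deprecated; with always_xy=True the Transformer returns (lon, lat)
        transformer = pyproj.Transformer.from_proj(projimg, projlatlon, always_xy=True)
        ref_x, ref_y = transformer.transform(ref_x, ref_y)
    return [ref_x, ref_y]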
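
The pixel-to-map step above is plain arithmetic on the six-element geotransform. As a minimal worked example, assuming a north-up image (no rotation terms) and a negative y spacing, the conversion can be sketched as follows; the corner coordinates and the 30 m spacing are hypothetical values, and `pixel_to_map` is not part of any library.

```python
def pixel_to_map(col: float, row: float, geotrans: tuple) -> tuple:
    """Apply a north-up GDAL-style geotransform to a pixel position."""
    map_x = geotrans[0] + col * geotrans[1]  # origin x + column * pixel width
    map_y = geotrans[3] + row * geotrans[5]  # origin y + row * pixel height (negative)
    return map_x, map_y

# Hypothetical values: upper-left corner at (500000, 2700000) m, 30 m pixels
example_geotrans = (500000.0, 30.0, 0.0, 2700000.0, 0.0, -30.0)
example_xy = pixel_to_map(100, 200, example_geotrans)
print(example_xy)
```

Moving 100 columns to the right adds 3000 m to the easting, and moving 200 rows down subtracts 6000 m from the northing.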

In [None]:
try:
    aoi_coords = {
        'ul': geolocation(aoi.x1, aoi.y1, geotrans, latlon=False), 
        'lr': geolocation(aoi.x2, aoi.y2, geotrans, latlon=False)
    }
    print(f"aoi_coords in EPSG {utm}: {aoi_coords}")
except TypeError:
    print('TypeError')
    display(Markdown(f'<text style=color:red>This error occurs if an AOI was not selected.</text>'))
    display(Markdown(f'<text style=color:red>Note that the square tool icon in the AOI selector menu is <b>NOT</b> the selection tool. It is the zoom tool.</text>'))
    display(Markdown(f'<text style=color:red>Read the tips above the AOI selector carefully.</text>'))

---
**Create an `xarray.Dataset` of your AOI that stacks all of the datasets along a new `date` dimension:**

In [None]:
stack = datasets[0]
stack = stack.drop_vars('backscatter') # remove 2d array, to be replaced with 3d stack
del stack.attrs['product_name'] # remove since dataset will hold data from multiple products 
stack = stack.assign_coords(date=dates)
stack.date.attrs['axis'] = "D" 
stack.date.attrs['units'] = "date in format YYYYMMDD"
stack.date.attrs['calendar'] = "proleptic_gregorian"
stack.date.attrs['long_name'] = "Date"

# 1st stab at subsetting by coords
stack = stack.sel(x=slice(aoi_coords['ul'][0], aoi_coords['lr'][0]), 
                  y=slice(aoi_coords['ul'][1], aoi_coords['lr'][1]))
subset = [d.sel(x=slice(aoi_coords['ul'][0], aoi_coords['lr'][0]), y=slice(aoi_coords['ul'][1], aoi_coords['lr'][1])) for d in datasets]

# there may be a cleaner way to do this with xarray.Dataset.to_stacked_array
xarr3d = xr.concat([d.backscatter for d in subset], dim=stack.date)
stack['backscatter'] = xarr3d

**Take a look at the stacked `backscatter` DataArray**

---
## 3. Now You Can Work With Your Data

Now you are ready to perform time series analysis on your data stack.

---
### 3.1 Open Your Data Stack and Visualize Some Layers

**Print the bands, pixels, and lines:**

In [None]:
print(f"Number of  bands: {len(stack.backscatter)}")
print(f"Number of pixels: {len(stack.backscatter[0].coords['x'])}")
print(f"Number of  lines: {len(stack.backscatter[0].coords['y'])}")

**View the stacked Dataset**

Notice that:
- the `backscatter` 3D DataArray contains x, y, and date dimensions
- the original attributes are still present (except for product_name)

In [None]:
stack

In [None]:
stack.backscatter

**Plot images and histograms for bands 1 and 2:**

Note: Depending on the histograms plotted by this cell, you may wish to adjust vmax when calling imshow() on ax1 and ax3. Increase the vmax value if the histogram cuts off much of the end of the peak, making your image too bright to see features well. Decrease vmax if the histogram extends much beyond the end of the peak, which will make your image appear dark.
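
The effect of the percentile-based display limits can be tried on synthetic data first. The following is a minimal sketch, assuming a heavy-tailed image drawn from an exponential distribution as a stand-in for backscatter in power scale; the random data and array size are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, heavy-tailed stand-in for a backscatter image in power scale
image = rng.exponential(scale=0.1, size=(200, 200))

# Display limits at the 5th and 95th percentiles
vmin, vmax = np.percentile(image, [5, 95])
# imshow(vmin=..., vmax=...) saturates values outside this range, like a clip
stretched = np.clip(image, vmin, vmax)
print(f"vmin={vmin:.4f}, vmax={vmax:.4f}, image max={image.max():.4f}")
```

Because a few very bright pixels dominate the raw maximum, limiting the display range to the central 90% of values leaves far more contrast for the bulk of the image.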

In [None]:
# TODO: Add to opensarlab-lib

import datetime
def datetime_from_hyp3_dt_str(date_str):
    return datetime.datetime(int(date_str[0:4]),int(date_str[4:6]),int(date_str[6:8]),
                    int(date_str[9:11]),int(date_str[11:13]),int(date_str[13:15]),0)

def date_from_hyp3_dt_str(date_str):
    return datetime.date(int(date_str[0:4]),int(date_str[4:6]),int(date_str[6:8]))

In [None]:
# Setup the pyplot plots
%matplotlib inline
fig = plb.figure(figsize=(18,10)) # Initialize figure with a size
ax1 = fig.add_subplot(221)  # 221 determines: 2 rows, 2 columns, first plot
ax2 = fig.add_subplot(222)  # 222 determines: 2 rows, 2 columns, second plot
ax3 = fig.add_subplot(223)  # 223 determines: 2 rows, 2 columns, third plot
ax4 = fig.add_subplot(224)  # 224 determines: 2 rows, 2 columns, fourth plot

# Plot the band 1 image
band_number = 1
raster = stack.backscatter[band_number-1].data
date = date_from_hyp3_dt_str(str(int(stack.date[band_number-1])))
vmin = np.percentile(raster.flatten().compute(), 5)
vmax = np.percentile(raster.flatten().compute(), 95)
ax1.imshow(raster, cmap='gray', vmin=vmin, vmax=vmax) # see note above regarding vmax adjustments
ax1.set_title(f'Image Band {band_number} {date}')

# Flatten the band 1 image into a 1 dimensional vector and plot the histogram:
h = ax2.hist(raster.flatten().compute(), bins=200, range=(vmin,vmax))
ax2.xaxis.set_label_text('Amplitude? (Uncalibrated DN Values)')
ax2.set_title(f'Histogram Band {band_number} {date}')

# Plot the band 2 image
band_number = 2
raster = stack.backscatter[band_number-1].data
date = date_from_hyp3_dt_str(str(int(stack.date[band_number-1])))
vmin = np.percentile(raster.flatten().compute(), 5)
vmax = np.percentile(raster.flatten().compute(), 95)
ax3.imshow(raster, cmap='gray', vmin=vmin, vmax=vmax) # see note above regarding vmax adjustments
ax3.set_title(f'Image Band {band_number} {date}')

# Flatten the band 2 image into a 1 dimensional vector and plot the histogram:
h = ax4.hist(raster.flatten().compute(), bins=200, range=(vmin,vmax))
ax4.xaxis.set_label_text('Amplitude? (Uncalibrated DN Values)')
ax4.set_title(f'Histogram Band {band_number} {date}')

---
### 3.3 Calibration and Data Conversion between dB and Power Scales

**Note that if your data were generated by HyP3, this step is not necessary!** HyP3 performs the full data calibration and provides you with calibrated data in power scale.

If your data are from a different source, however, calibration may be necessary to ensure that image gray values correspond to proper radar cross section information.

Calibration coefficients for SAR data are often defined in the decibel (dB) scale due to the high dynamic range of the imaging system. For L-band ALOS PALSAR data, for example, the conversion from uncalibrated DN values to calibrated radar cross section values in dB scale is performed by applying a standard **calibration factor of -83 dB**.

$\gamma^0_{dB} = 20 \cdot \log_{10}(DN) - 83$

The data at hand are radiometrically terrain corrected images, which are often expressed as terrain-flattened $\gamma^0$ backscattering coefficients. For forest and land cover monitoring applications, $\gamma^0$ is the preferred metric.

**To apply the calibration constant for your data and export in *dB* scale, uncomment the following code cell:**

In [None]:
 # caldB=20*np.log10(stack.backscatter)-83
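
As a small worked example of the calibration formula, the sketch below applies the -83 dB factor to a few hypothetical DN values. It assumes that a DN of zero marks no-data and maps it to NaN to avoid taking the logarithm of zero; `dn_to_db` is a hypothetical helper, not a function used elsewhere in this notebook.

```python
import numpy as np

CAL_FACTOR_DB = -83.0  # calibration factor quoted above for ALOS PALSAR

def dn_to_db(dn):
    """gamma0 in dB from digital numbers; zero DN (assumed no-data) becomes NaN."""
    dn = np.asarray(dn, dtype=float)
    safe_dn = np.where(dn > 0, dn, np.nan)  # avoid log10(0)
    return 20.0 * np.log10(safe_dn) + CAL_FACTOR_DB

example_dn = np.array([0.0, 1000.0, 5000.0])  # hypothetical DN values
example_db = dn_to_db(example_dn)
print(example_db)
```

A DN of 1000 gives 20 * 3 - 83 = -23 dB, which is a plausible backscatter level for vegetated terrain.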

While **dB**-scaled images are often "visually pleasing", they are often not a good basis for mathematical operations on data. For instance, when we compute the mean of observations, it makes a difference whether we do that in power or dB scale. Since dB scale is a logarithmic scale, we cannot simply average data in that scale. 
    
Please note that the **correct scale** in which operations need to be performed **is the power scale.** This is critical, e.g. when speckle filters are applied, spatial operations like block averaging are performed, or time series are analyzed.

To **convert from dB to power**, apply: $\gamma^0_{pwr} = 10^{\frac{\gamma^0_{dB}}{10}}$

In [None]:
# calPwr=np.power(10.,caldB/10.)
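
The difference between averaging in dB and averaging in power is easy to see on two numbers. The sketch below uses hypothetical sample values of -20 dB and 0 dB; `db_to_power` and `power_to_db` are helper names introduced only for this illustration.

```python
import numpy as np

def db_to_power(db):
    """Convert dB values to power scale."""
    return np.power(10.0, np.asarray(db, dtype=float) / 10.0)

def power_to_db(pwr):
    """Convert power-scale values to dB."""
    return 10.0 * np.log10(np.asarray(pwr, dtype=float))

samples_db = np.array([-20.0, 0.0])  # hypothetical gamma0 values in dB
mean_of_db = samples_db.mean()                               # naive average in dB
mean_in_power = power_to_db(db_to_power(samples_db).mean())  # average in power, then to dB
print(mean_of_db, round(float(mean_in_power), 2))
```

Averaging the dB values directly gives -10 dB, while averaging in power scale and converting back gives roughly -2.97 dB, a difference of about 7 dB for the same two samples.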

---
### 3.4 Create a Time Series Animation

Now we are ready to create a time series animation from the calibrated SAR data.

**First, create a raster from band 0 and a raster stack from all the images:**

**Create a masked raster stack:**

In [None]:
rs2 = stack.backscatter.where(stack.backscatter != 0).to_masked_array()

**Generate a matplotlib time-series animation:**

In [None]:
%%capture 
fig = plt.figure(figsize=(14, 8))
ax = fig.subplots()
ax.axis('off')
# Stretch the display between the 5th and 95th percentiles of the first image
first_image = stack.backscatter[0].data.flatten().compute()
vmin = np.percentile(first_image, 5)
vmax = np.percentile(first_image, 95)
date = date_from_hyp3_dt_str(str(int(stack.date[0])))
im = ax.imshow(stack.backscatter[0], cmap='gray', vmin=vmin, vmax=vmax)
ax.set_title(f"{date}")

def animate(i):
    date = date_from_hyp3_dt_str(str(int(stack.date[i])))
    ax.set_title(f"{date}")
    im.set_data(stack.backscatter[i])

# Interval is given in milliseconds
ani = animation.FuncAnimation(fig, animate, frames=stack.backscatter.shape[0], interval=400)

**Configure matplotlib's RC settings for the animation:**

In [None]:
rc('animation', embed_limit=40971520.0)  # Raise the embed limit (given in MB) so the entire animation can be embedded

**Create a javascript animation of the time-series running inline in the notebook:**

In [None]:
HTML(ani.to_jshtml())

**Delete the dummy png** that was saved to the current working directory while generating the javascript animation in the last code cell.

In [None]:
try:
    # Build the full path first, then unlink it; the png is written to the current working directory
    (Path.cwd()/'None0000000.png').unlink()
except FileNotFoundError:
    pass

In [None]:

# nc_path = product_path/"stack.nc4"
# s3_path = "s3://asf-jupyter-data-west/zarr_test/bangladesh_vh_stack"

# s3 = s3fs.S3FileSystem(anon=True)


# ds = xr.open_dataset(nc_path)
# store = s3fs.S3Map(root=s3_path, s3=s3, check=False)
# compressor = zarr.Blosc(cname='zstd', clevel=3)
# encoding = {vname: {'compressor': compressor} for vname in ds.data_vars}
# try:
#     ds.to_zarr(store=store, encoding=encoding, consolidated=True, group=nc_path.stem)
# except:
#     # TODO do something to handle existing S3 objects with same key
#     raise

**Save the animation (animation.gif):**

In [None]:
ani.save(f"{product_path}/animation.gif", writer='pillow', fps=2)

---
### 3.5 Plot the Time Series of Means Calculated Across the Subset

To create the time series of means, we will go through the following steps:
1. Ensure that you use the data in **power scale** ($\gamma^0_{pwr}$) for your mean calculations.
1. Compute the means.
1. Convert the resulting mean values into dB scale for visualization.
1. Plot the time series of means.

**Compute the means:**

In [None]:
rs_means_pwr = np.mean(rs2, axis=(1, 2))
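
Why the masked array matters can be checked on a toy stack. The sketch below assumes, as the masking step earlier does, that a value of 0 marks no-data; the 2 x 2 x 2 array is hypothetical.

```python
import numpy as np

# Hypothetical stack: 2 dates, 2 lines, 2 pixels; 0 marks no-data
toy_stack = np.array([[[0.0, 0.2], [0.4, 0.6]],
                      [[0.1, 0.1], [0.0, 0.0]]])
masked = np.ma.masked_equal(toy_stack, 0)

plain_means = toy_stack.mean(axis=(1, 2))  # zeros drag the mean down
masked_means = masked.mean(axis=(1, 2))    # masked zeros are ignored
print(plain_means, masked_means)
```

For the first date the plain mean is 0.3 while the masked mean is 0.4, so leaving no-data pixels in the average would bias the time series toward lower backscatter.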

**Convert resulting mean value time-series to dB scale for visualization:**

In [None]:
rs_means_dB = 10.*np.log10(rs_means_pwr)

**Plot and save the time series of means (RCSoverTime.png):**

In [None]:
try:
    plt.rcParams.update({'font.size': 14})
    fig = plt.figure(figsize=(16, 4))
    ax1 = fig.subplots()
    window_length = len(rs_means_pwr)-1
    if window_length % 2 == 0:
        window_length -= 1
    polyorder = math.ceil(window_length*0.1)
    yhat = scipy.signal.savgol_filter(rs_means_pwr, window_length, polyorder) 
    ax1.plot([date_from_hyp3_dt_str(d) for d in dates], yhat, color='red', marker='o', markerfacecolor='white', linewidth=3, markersize=6)
    ax1.plot([date_from_hyp3_dt_str(d) for d in dates], rs_means_pwr, color='gray', linewidth=0.5)
    plt.grid()
    ax1.set_xlabel('Date')
    ax1.set_ylabel(r'$\overline{\gamma^0}$ [power]')  # raw string keeps the LaTeX backslashes intact
    plt.savefig(f"{product_path}/RCSoverTime.png", dpi=72, transparent=True)
except ValueError as e:
    print(f"Error: polyorder: {polyorder} >= window_length: {window_length}")
    raise
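
The Savitzky-Golay filter requires an odd window length and a polynomial order smaller than the window, which is what the `ValueError` handler above guards against. The parameter choice used in the cell can be sketched on its own with the standard library; `savgol_params` is a hypothetical helper, and the explicit length check is an addition for illustration.

```python
import math

def savgol_params(n_samples: int) -> tuple:
    """Hypothetical helper mirroring the window/polyorder choice used above."""
    window_length = n_samples - 1
    if window_length % 2 == 0:
        window_length -= 1  # keep the window length odd, as in the cell above
    polyorder = math.ceil(window_length * 0.1)
    if window_length < 1 or polyorder >= window_length:
        raise ValueError(f"time series of {n_samples} samples is too short to smooth")
    return window_length, polyorder

print(savgol_params(12))  # twelve acquisition dates, as in the example stack
```

With twelve dates this yields a window of 11 samples and a second-order polynomial; with only three dates the window collapses to 1 and the check raises before the filter is called.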

---
### 3.6 Calculate Coefficient of Variance

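The coefficient of variance describes how much the $\sigma^0$ or $\gamma^0$ measurements in a pixel vary over time. Hence, the coefficient of variance can indicate different vegetation cover and soil moisture regimes in your area. In the cells below, it is computed as the per-pixel variance over time divided by the mean of all nonzero backscatter values in the stack.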
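
For comparison, the textbook coefficient of variation is the standard deviation divided by the mean, evaluated per pixel along the time axis. The sketch below shows that definition on a toy array; it is not the computation used in the cells of this section, and it assumes that a value of 0 marks no-data. The function name and the sample values are hypothetical.

```python
import numpy as np

def coefficient_of_variation(stack_pwr: np.ndarray) -> np.ndarray:
    """Per-pixel std / mean along the time axis (axis 0), ignoring zeros."""
    data = np.where(stack_pwr != 0, stack_pwr, np.nan)  # treat 0 as no-data
    return np.nanstd(data, axis=0) / np.nanmean(data, axis=0)

# Hypothetical stack: 3 dates, 1 line, 2 pixels (one stable, one variable)
toy = np.array([[[0.10, 0.05]],
                [[0.10, 0.10]],
                [[0.10, 0.15]]])
cv = coefficient_of_variation(toy)
print(cv)
```

The stable pixel has a coefficient of variation near zero, while the pixel that swings between 0.05 and 0.15 reaches about 0.41. Because the ratio is dimensionless, it can be compared between bright and dark targets.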

**Write a function to convert our plots into GeoTiffs:**

In [None]:
def geotiff_from_plot(source_image, out_filename, extent, utm, cmap=None, vmin=None, vmax=None, interpolation=None, dpi=300):
    assert "." not in out_filename, 'Error: Do not include the file extension in out_filename'
    assert type(extent) == list and len(extent) == 2 and len(extent[0]) == 2 and len(
        extent[1]) == 2, 'Error: extent must be a list in the form [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]]'
    
    plt.figure()
    plt.axis('off')
    plt.imshow(source_image, cmap=cmap, vmin=vmin, vmax=vmax, interpolation=interpolation)
    temp = f"{out_filename}_temp.png"
    plt.savefig(temp, dpi=dpi, transparent='true', bbox_inches='tight', pad_inches=0)

    cmd = f"gdal_translate -of Gtiff -a_ullr {extent[0][0]} {extent[0][1]} {extent[1][0]} {extent[1][1]} -a_srs EPSG:{utm} {temp} {out_filename}.tiff"
    !{cmd}
    try:
        Path(temp).unlink()
    except FileNotFoundError:
        pass

**Plot the Coefficient of Variance Map and save it as a png (Coeffvar.png):**

In [None]:
test = np.var(stack.backscatter, 0)
mtest = stack.backscatter.where(stack.backscatter != 0).mean()
coeffvar = test/(mtest+0.001)

plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(13, 10))
ax = fig.subplots()
ax.axis('off')
vmin = np.percentile(coeffvar.data.flatten(), 5)
vmax = np.percentile(coeffvar.data.flatten(), 95)
ax.set_title('Coefficient of Variance Map')
im = ax.imshow(coeffvar, cmap='jet', vmin=vmin, vmax=vmax)
fig.colorbar(im, ax=ax)
plt.savefig(f"{product_path}/Coeffvar.png", dpi=300, transparent=True)

**Save the coefficient of variance map as a GeoTiff (Coeffvar.tiff):**

In [None]:
%%capture
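# Georeference with the corners of the AOI subset rather than the full-scene corners
subset_corners = get_corners_xr_ds(stack)
geotiff_from_plot(coeffvar, f"{product_path}/Coeffvar", [subset_corners['ul'], subset_corners['lr']], utm, cmap='jet', vmin=vmin, vmax=vmax)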

---
### 3.7 Threshold Coefficient of Variance Map

This is an example of how to threshold the derived coefficient of variance map. This can be useful, e.g., to detect areas of active agriculture.
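
The idea of keeping only the most variable pixels can be sketched with a quantile threshold. The cells in this section read the threshold from a binned histogram CDF instead, so the sketch is a close relative rather than an exact equivalent; the 10 x 10 map and the 0.80 quantile are illustrative values.

```python
import numpy as np

# Hypothetical map with values 0..99
values = np.arange(100, dtype=float).reshape(10, 10)

# Threshold at the 80th percentile of the map
thresh = np.quantile(values, 0.80)
# Keep pixels at or above the threshold and set the rest to 0
thresholded = np.where(values >= thresh, values, 0.0)
n_kept = np.count_nonzero(thresholded)
print(thresh, n_kept)
```

Here the top fifth of the pixels survives the threshold and everything else is set to zero, which is the pattern used to highlight strongly changing areas.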

**Plot and save the coefficient of variance histogram and CDF (thresh_coeff_var_histogram.png):**

In [None]:
plt.rcParams.update({'font.size': 14})
fig = plt.figure(figsize=(14, 6)) # Initialize figure with a size
ax1 = fig.add_subplot(121)  # 121 determines: 1 row, 2 columns, first plot
ax2 = fig.add_subplot(122)  # 122 determines: 1 row, 2 columns, second plot
# First plot: Histogram
# IMPORTANT: To get a histogram, we first need to *flatten* 
# the two-dimensional image into a one-dimensional vector.
flat_coeffvar = xr.DataArray(coeffvar.data.flatten())
h = ax1.hist(flat_coeffvar, bins=200, range=(0, 0.03))
ax1.xaxis.set_label_text('Coefficient of Variation')
ax1.set_title('Coeffvar Histogram')
ax1.grid()
# Second plot: Cumulative distribution function (CDF)
# The third return value is discarded so it does not shadow matplotlib.patches
n, bins, _ = ax2.hist(flat_coeffvar, bins=200, range=(0, 0.03), cumulative=True, density=True, histtype='step', label='Empirical')
ax2.xaxis.set_label_text('Coefficient of Variation')
ax2.set_title('Coeffvar CDF')
ax2.grid()
plt.savefig(f"{product_path}/thresh_coeff_var_histogram.png", dpi=72, transparent=True)

**Plot the Threshold Coefficient of Variance Map and save it as a png (Coeffvarthresh.png):**

In [None]:
plt.rcParams.update({'font.size': 14})
outind = np.where(n > 0.80)
threshind = np.min(outind)
thresh = bins[threshind]
coeffvarthresh = np.copy(coeffvar)
coeffvarthresh[coeffvarthresh < thresh] = 0
coeffvarthresh[coeffvarthresh > 0.1] = 0
fig = plt.figure(figsize=(13, 10))
ax = fig.subplots()
ax.axis('off')
vmin = np.percentile(flat_coeffvar, 5)
vmax = np.percentile(flat_coeffvar, 95)
ax.set_title(r'Thresholded Coeffvar Map [$\alpha=95%$]')
im = ax.imshow(coeffvarthresh, cmap='jet', vmin=vmin, vmax=vmax)
bar = fig.colorbar(im, ax=ax)
plt.savefig(f"{product_path}/Coeffvarthresh.png", dpi=300, transparent='true')

**Save the Threshold Coefficient of Variance Map as a GeoTiff (Coeffvarthresh.tiff):**

In [None]:
%%capture
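# Georeference with the corners of the AOI subset rather than the full-scene corners
subset_corners = get_corners_xr_ds(stack)
geotiff_from_plot(coeffvarthresh, f"{product_path}/Coeffvarthresh", [subset_corners['ul'], subset_corners['lr']], utm, cmap='jet', vmin=vmin, vmax=vmax)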

*Time_Series_From_Zarr_Stack.ipynb - Version 0.1.0 - February 2022*