# The Process

### Loading Data from the Datacube
- Pick area
- Browse inventory of available products for Sentinel-1
- Load Sentinel-1 data over the area

### Pre-process Imagery
- Visualize VV and VH bands
- Apply filter to remove noise
- Visualize filtered VV and VH bands

### Classify
- Design classifier based on interpretation of histograms
- Apply classifier
- Visualize:
  - yearly classification frequencies
  - yearly classification variance

### Change Detection
- Difference between two classifications, `t1` and `t2`
- Explore autocorrelation to separate noise from change

# The Area

In [None]:
latitude= (-11.287611, -11.085876)
longitude = (130.324262, 130.452652)

In [None]:
from utils.display import display_map
display_map(latitude = latitude, longitude = longitude)

# Provisioning Data

### Connect to the Datacube System

In [None]:
app_name = "Sentinel 1 Water Classifier"

In [None]:
import datacube
dc = datacube.Datacube(app = app_name)

### List Available Products

In [None]:
dc.list_products()

### Specify Product Preferences

In [None]:
product_preferences = dict(product = "s1_gamma0_geotif_scene",
                     output_crs = "EPSG:4326",
                     resolution = (0.00013557119,0.00013557119))

### Specify Geographic Extent

In [None]:
area_preferences = dict(latitude = latitude,
                  longitude = longitude) 

### Load Data

In [None]:
dataset = dc.load(**product_preferences, **area_preferences)

In [None]:
dataset

### Visualize VH bands

In [None]:
import matplotlib.pyplot as plt

In [None]:
%matplotlib inline
import numpy as np
np.log(dataset.vh).plot(cmap = "Blues", col='time',col_wrap=5)

> **Yearly Composite VH**

In [None]:
%matplotlib inline
import numpy as np
plt.figure(figsize = (15,12))
np.log(dataset.vh.mean(dim = "time")).plot(cmap = "Blues")

### Visualize VV bands  

In [None]:
%matplotlib inline
import numpy as np
np.log(dataset.vv).plot(cmap = "Blues", col='time',col_wrap=5)

In [None]:
%matplotlib inline
import numpy as np
plt.figure(figsize = (15,12))
np.log(dataset.vv.mean(dim = "time")).plot(cmap = "Blues")

# Image Filtering

### Speckle Filtering using Lee Filter

In [None]:
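# Adapted from https://stackoverflow.com/questions/39785970/speckle-lee-filter-in-python

# The `scipy.ndimage.filters` and `scipy.ndimage.measurements` namespaces are deprecated;
# both functions are available directly from `scipy.ndimage`.
from scipy.ndimage import uniform_filter, variance

def lee_filter(da, size):
    """Apply a Lee speckle filter to a 2D DataArray using a `size` x `size` window."""
    img = da.values
    # Local mean and local variance inside the moving window
    img_mean = uniform_filter(img, (size, size))
    img_sqr_mean = uniform_filter(img**2, (size, size))
    img_variance = img_sqr_mean - img_mean**2

    # Variance of the whole image, used as the noise estimate
    overall_variance = variance(img)

    # Weights near 0 smooth homogeneous areas; weights near 1 preserve edges
    img_weights = img_variance / (img_variance + overall_variance)
    img_output = img_mean + img_weights * (img - img_mean)
    return img_output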
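
As a quick sanity check, the same weighting scheme can be tried on a synthetic image. The sketch below mirrors the filter's arithmetic on a plain NumPy array; `lee_filter_2d`, the two-level scene, and the gamma-distributed speckle are illustrative assumptions and not part of the notebook's data.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter_2d(img, size):
    """Lee filter on a 2D NumPy array (hypothetical helper for this demo)."""
    img_mean = uniform_filter(img, size)
    img_sqr_mean = uniform_filter(img**2, size)
    img_variance = img_sqr_mean - img_mean**2
    overall_variance = img.var()
    weights = img_variance / (img_variance + overall_variance)
    return img_mean + weights * (img - img_mean)

rng = np.random.default_rng(0)
# Hypothetical scene: dark "water" on the left half, brighter "land" on the right
clean = np.where(np.arange(64) < 32, 0.005, 0.05) * np.ones((64, 64))
# Multiplicative speckle with mean 1 (assumed gamma-distributed for illustration)
noisy = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)
filtered = lee_filter_2d(noisy, size=7)

# Compare the spread inside the homogeneous water region, away from the edge
print(noisy[:, :24].std(), filtered[:, :24].std())
```

Inside the homogeneous region the filtered values should vary far less than the noisy ones, because the local variance is small relative to the overall variance and the output falls back to the local mean.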

In [None]:
dataset_zero_filled = dataset.where(~dataset.isnull(), 0)

In [None]:
dataset["filtered_vv"] = dataset_zero_filled.vv.groupby('time').apply(lee_filter, size=7)
dataset["filtered_vh"] = dataset_zero_filled.vh.groupby('time').apply(lee_filter, size=7)

### VH Filtered

In [None]:
np.log10(dataset.filtered_vh).plot(cmap = "Blues", col='time',col_wrap=5)

In [None]:
plt.figure(figsize = (15,12))
np.log10(dataset.filtered_vh.mean(dim = "time")).plot(cmap = "Blues")

### VV Filtered

In [None]:
np.log10(dataset.filtered_vv).plot(cmap = "Blues", col='time',col_wrap=5)

In [None]:
plt.figure(figsize = (15,12))
np.log10(dataset.filtered_vv.mean(dim = "time")).plot(cmap = "Blues")

### Observing VV and VH histograms

In [None]:
fig = plt.figure(figsize = (15,3))
_ = np.log10(dataset.filtered_vv).plot.hist(bins = 1000, label = "VV filtered")
_ = np.log10(dataset.vv).plot.hist(bins = 1000, label = "VV", alpha = .5)
plt.legend()
plt.title("Comparison of filtered VV bands to original") 

In [None]:
fig = plt.figure(figsize = (15,3))
_ = np.log10(dataset.filtered_vh).plot.hist(bins = 1000, label = "VH filtered")
_ = np.log10(dataset.vh).plot.hist(bins = 1000, label = "VH", alpha = .5)
plt.legend()
plt.title("Comparison of filtered VH bands to original") 


# Designing a threshold based water classifier

A 2D visualization of the imagery alone suggests a stark contrast between `land` and `water` values.    
The histogram of the filtered S1 data shows a clear bimodal distribution in the `filtered_vh` band.   

In this section, a classifier is built from a static threshold on the base-10 logarithm of the `filtered_vh` values.  

$$ threshold = -2.0 $$

In [None]:
threshold = -2.0

The classifier separates data into two classes, data above, and data below the threshold. An assumption is made that values of both segments correspond to the same `water` and `not water` distinctions we make visually.  


<br>  

$$  water(Dataset) = \left\{
     \begin{array}{lr}
       True & :   \log_{10}(Dataset_{VH}) < threshold\\
       False & :   \log_{10}(Dataset_{VH}) \ge threshold
     \end{array}
   \right. $$  

<br>
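
The thresholding rule can be tried on a few hand-picked values. This is a minimal sketch: the `vh` values are hypothetical linear-scale backscatter values, and the comparison mirrors the `np.log10(filtered) < threshold` test used by the classifier code in this notebook.

```python
import numpy as np

threshold = -2.0

# Hypothetical filtered VH values in linear units (not dB)
vh = np.array([0.001, 0.004, 0.03, 0.2])

# A pixel is labelled water when its log10 value falls below the threshold
water = np.log10(vh) < threshold
print(water.tolist())

# Equivalent test in linear units: compare against 10 ** threshold
water_linear = vh < 10 ** threshold
```

Working in log space keeps the threshold on the same axis as the histograms, while the linear form shows that the rule is a plain cutoff at `10 ** threshold`.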


### Visualize threshold

In [None]:
fig = plt.figure(figsize = (15,3))
plt.axvline(x=threshold, label='Threshold at {}'.format(threshold), color = "red")
_ = np.log10(dataset.filtered_vh).plot.hist(bins = 1000, label = "VH filtered")
_ = np.log10(dataset.vh).plot.hist(bins = 1000, label = "VH", alpha = .5)
plt.legend()
plt.title("Histogram Comparison of filtered VH bands to original") 

In [None]:
fig, ax = plt.subplots(figsize = (15,3))
_ = np.log10(dataset.filtered_vh).plot.hist(bins = 1000, label = "VH filtered")
ax.axvspan(xmin=threshold,xmax = -.5, alpha=0.25, color='red', label = "Not Water")
ax.axvspan(xmin=-3.5,xmax = threshold, alpha=0.25, color='green', label = "Water")
plt.legend()
plt.title("Water and not-water ranges on the filtered VH histogram") 

# Coding the classifier

In [None]:
import numpy as np
import xarray as xr 

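def s1_water_classifier(ds:xr.Dataset, threshold = -2) -> xr.Dataset:
    """Flag pixels whose log10 of Lee-filtered `vh` falls below `threshold` as water."""
    assert "vh" in ds.data_vars, "This classifier is expecting a variable named `vh` expressed in DN, not dB values"
    # Speckle-filter each time slice independently
    filtered = ds.vh.groupby('time').apply(lee_filter, size=7)
    # True where the log-scaled backscatter is below the threshold
    water_data_array = np.log10(filtered) < threshold
    return water_data_array.to_dataset(name = "s1_wofs")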

# Running the classifier

In [None]:
dataset["s1_wofs"] = s1_water_classifier(dataset).s1_wofs

# Validation

### Water Classification Frequency

In [None]:
plt.figure(figsize = (15,12))
dataset.s1_wofs.mean(dim = "time").plot(cmap = "jet_r")

> #### Interpretation and Ideas: 

- Classifications are fairly consistent inland and off the coasts.  
- The coastline is not consistently classified as water.
- Check the variance.

### Water Classification Standard Deviation

In [None]:
plt.figure(figsize = (15,12))
dataset.s1_wofs.std(dim = "time").plot(cmap = "jet")

> #### Interpretation and Ideas: 

- Variance can capture long-term trends such as coastal erosion or degradation, but it may also capture noise.  
  Take, for example, an alternating sequence of classifications $ts_1 = [0,1,0,1,...,0,1,0,1]$ and the sequence $ts_2 = [0,0,0,0,...,1,1,1,1]$.    
  If both sequences contain the same number of 0s and 1s, then $var(ts_1) = var(ts_2)$, even though the former reflects frequent alternation in the state of water while the latter reflects a single lasting transition. 
  
- The coastline is not always consistently water.
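
The point about variance can be checked numerically. The sketch below builds two hypothetical 20-step sequences with equal numbers of 0s and 1s and compares their variance with the mean of their lag-1 products, which is the quantity the `rtk` function in this notebook computes for `k = 1`.

```python
import numpy as np

# Hypothetical classification time series, 20 steps each
ts1 = np.tile([0, 1], 10)    # alternating: 0,1,0,1,...
ts2 = np.repeat([0, 1], 10)  # single transition: 0,...,0,1,...,1

def lag1_product_mean(ts):
    """Mean of ts[i] * ts[i - 1] over all consecutive pairs."""
    return np.mean(ts[1:] * ts[:-1])

print(ts1.var(), ts2.var())                              # identical variances
print(lag1_product_mean(ts1), lag1_product_mean(ts2))    # different lag-1 values
```

Both variances equal 0.25, but the alternating series never has two consecutive water observations, while the transition series does for 9 of its 19 consecutive pairs. That difference is what lets a lag-based measure separate flicker from a lasting change.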

# Detecting Coastal Change 

### Simple Differencing Approach

In [None]:
t1 = 0
t2 = 26

In [None]:
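# Cast the boolean masks to integers first: NumPy does not support `-` on boolean arrays
wofs = dataset.s1_wofs.astype("int8")
change = wofs.isel(time = t1) - wofs.isel(time = t2)
# Keep only pixels whose class differs (1: water at t1 only, -1: water at t2 only)
change = change.where(change != 0) 
dataset["change"] = change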
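
The sign convention of the difference can be checked on a toy example. The two 2x2 masks below are hypothetical; they are cast to integers before subtracting because NumPy does not define `-` for boolean arrays.

```python
import numpy as np

# Hypothetical water masks at two dates (True = water)
water_t1 = np.array([[True, True],
                     [False, False]])
water_t2 = np.array([[True, False],
                     [True, False]])

# 1: water at t1 only, -1: water at t2 only, 0: unchanged
diff = water_t1.astype(np.int8) - water_t2.astype(np.int8)
print(diff.tolist())

# Mask out unchanged pixels, as done with `.where(change != 0)`
changed_only = np.where(diff != 0, diff.astype(float), np.nan)
```

With `t1` as the earlier date, a value of 1 marks water that disappeared and -1 marks water that appeared.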

In [None]:
plt.figure(figsize = (15,12))
dataset.filtered_vh.mean(dim = "time").plot(cmap = "Blues")
dataset.change.plot(cmap = "jet", levels = 2) 

# Auto Correlation

In [None]:
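def rtk(ts:np.ndarray, k = 1):
    """Mean of the lag-`k` products ts[i] * ts[i - k] (unnormalized autocorrelation); requires k >= 1."""
    a = np.append(np.array(ts).copy(),
                  np.zeros(k))
    
    b = np.append(np.zeros(k),
                  np.array(ts).copy())
    
    # After padding, a[i] * b[i] == ts[i] * ts[i - k] for i in k..len(ts) - 1
    auto = (a * b)[k:-k]
    return np.mean(auto)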

In [None]:
auto_correlation = np.apply_along_axis(rtk,0,dataset.s1_wofs)
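
`np.apply_along_axis` calls the Python function once per pixel, which can be slow on large scenes. As a sketch of an alternative, the same lag-product mean can be computed for all pixels at once with array slicing. The function names and the random `(time, y, x)` stack below are hypothetical, and the time axis is assumed to come first.

```python
import numpy as np

def lag_product_mean(ts, k=1):
    """Per-series reference: mean of ts[i] * ts[i - k]."""
    ts = np.asarray(ts, dtype=float)
    return np.mean(ts[k:] * ts[:-k])

def lag_product_mean_stack(stack, k=1):
    """Same quantity for every pixel of a (time, y, x) stack, without a Python loop."""
    stack = np.asarray(stack, dtype=float)
    return np.mean(stack[k:] * stack[:-k], axis=0)

rng = np.random.default_rng(1)
# Hypothetical boolean water masks: 12 dates over a 4 x 5 grid
stack = rng.random((12, 4, 5)) < 0.5

looped = np.apply_along_axis(lag_product_mean, 0, stack)
vectorized = lag_product_mean_stack(stack)
print(np.allclose(looped, vectorized))
```

Both routes give the same per-pixel values; the sliced version only changes how the work is dispatched.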

In [None]:
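# `dims` takes dimension names; the coordinate values belong in `coords`
auto_correlation_ds = xr.DataArray(auto_correlation,
                                   coords = dict((k, dataset[k].values) for k in ('latitude', 'longitude')),
                                   dims = ('latitude', 'longitude'))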

In [None]:
freq = dataset.s1_wofs.mean(dim = "time")
varying_pixels = np.logical_and(freq != 0, freq != 1) 

In [None]:
fig = plt.figure(figsize = (15,3))
_ = auto_correlation_ds.where(varying_pixels).plot.hist(bins = 256)
plt.title("Histogram of autocorrelation") 


In [None]:
plt.figure(figsize = (15,12))
dataset.filtered_vh.mean(dim = "time").plot(cmap = "Blues")
dataset.change.where(auto_correlation_ds > 0.8).plot(cmap = "jet", levels = 2)