# <u>Pixel drill using interactive widgets</u> <img align="right" src="../resources/csiro_easi_logo.png">

**Contents**

  - [Overview](#Overview)
  - [Notebook setup](#Notebook-setup)
  - [Loading up the Sentinel-2 data](#Loading-up-the-Sentinel-2-data)
    - [DataCube query](#DataCube-query)
    - [Load and pre-process the data](#Load-and-pre-process-the-data)
  - [Interactive widget plots](#Interactive-widget-plots)
    - [Selecting the pixel drill location](#Selecting-the-pixel-drill-location)
    - [Pixel drill plot](#Pixel-drill-plot)
    - [Plot the selected image](#Plot-the-selected-image)

# Overview

This notebook demonstrates the use of interactive widgets in a Jupyter notebook. We will load up a few time slices of Sentinel-2 data over the Australian Capital Territory (ACT) region in Australia, and display a widget allowing the user to interactively select a pixel of interest over the selected area. The time series of remote sensing values for that pixel is then plotted, allowing the user to select a given time index to use for the final display of the Sentinel-2 image acquired at that specific date.

This notebook is adapted from a [Digital Earth Australia](https://github.com/GeoscienceAustralia/dea-notebooks) example by Claire Krause.

**A note regarding compatibility**

As stated above, this notebook makes use of a specific dataset of remote sensing data, and is applied over a given region of interest. The code below will thus not run properly if the EASI deployment used to run this notebook does not also contain a similar dataset.

Users can nonetheless investigate the outputs provided in this demonstration notebook, and also modify certain variables in the code below to allow the notebook to run with a different EASI deployment (e.g. different remote sensing data, region of interest, etc.)

# Notebook setup

In a typical notebook, we would use the built-in "magic" command `%matplotlib inline` to set up the inline `matplotlib` backend in the notebook. Because we will here make use of interactive widgets, we need to invoke a different command to use the `widget` GUI backend, as done below:

In [None]:
%matplotlib widget

Then in the cell below, we import the usual Python modules and functions needed in the rest of this notebook. Subsequently, we open a connection to the EASI datacube and start a local Dask cluster.

In [None]:
### System
import os, sys

### Datacube 
from datacube import Datacube
from datacube.utils import masking
from odc.algo import enum_to_bool

### Data tools
import matplotlib.dates
import numpy as np

### Plotting
import ipywidgets as widgets
import matplotlib.pyplot as plt

### EASI tools
sys.path.append(os.path.expanduser('../scripts'))
os.environ['USE_PYGEOS'] = '0'
import notebook_utils
from app_utils import display_map
from easi_tools import EasiDefaults
easi = EasiDefaults()

In [None]:
dc = Datacube(app='pixel_drill')

In [None]:
cluster, client = notebook_utils.initialize_dask(use_gateway=False)
display(client)
display(cluster)

In [None]:
# Show the Dask dashboard - click on this to watch how the calculations are progressing
notebook_utils.localcluster_dashboard(client, server=easi.hub)


# Loading up the Sentinel-2 data

## DataCube query

In this notebook, we're interested in a dataset of Sentinel-2 imagery (product labelled `s2a_ard_granule` on the current EASI deployment) centred over the Australian Capital Territory (ACT). We set up the corresponding spatial extent in the query dictionary below, together with a (small) temporal window.

We will use the Sentinel-2 data for false-colour plots of the SWIR, NIR and green bands (in the corresponding RGB channels), so we add these bands to the list of measurements to load. We also load the scene classification band `SCL`, which is used further below to mask pixels affected by data quality issues (saturation, clouds, etc.).

In addition, the projection of the data does not matter for the plots further below. To avoid unnecessary (and time-consuming) re-projection, we load the Sentinel-2 dataset in its native projection. In the code below, we use the function `mostcommon_crs()` to work out the dataset's native coordinate reference system (CRS) and set the query parameter `output_crs` accordingly.
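The helper `notebook_utils.mostcommon_crs()` is specific to the EASI notebooks and its implementation is not shown here. As a rough sketch of the idea only, assuming the CRS of each matching dataset is available as a string, the most frequent one could be picked with a counter. The function name `most_common_crs_sketch` and the sample EPSG codes are hypothetical.

```python
from collections import Counter

def most_common_crs_sketch(crs_list):
    """Return the CRS string that occurs most often (hypothetical helper)."""
    if not crs_list:
        raise ValueError("No datasets matched the query")
    # most_common(1) returns a list holding one (value, count) pair
    return Counter(crs_list).most_common(1)[0][0]

# Hypothetical CRS strings for datasets that straddle two UTM zones
sample_crs = ["EPSG:32755", "EPSG:32755", "EPSG:32756", "EPSG:32755"]
native_crs_sketch = most_common_crs_sketch(sample_crs)
print(f"Most common CRS: {native_crs_sketch}")
```

Loading in the most common CRS means that most scenes need no re-projection at all.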

In [None]:
# This configuration is read from the defaults for this system. 
# Examples are provided in a commented line to show how to set these manually.

study_area_lat = easi.latitude
# study_area_lat = (39.2, 39.3)

study_area_lon = easi.longitude
# study_area_lon = (-76.7, -76.6)

product = easi.product('sentinel-2')
# product = 'landsat8_c2l2_sr'

set_time = easi.time
# set_time = ('2020-08-01', '2020-12-01')

# set_crs = easi.crs('sentinel-2')
# set_crs = 'EPSG:32618'

set_resolution = easi.resolution('sentinel-2')
# set_resolution = (-30, 30)

In [None]:
query = { 'product': product,
          'lat': study_area_lat,    # latitude range
          'lon': study_area_lon,    # longitude range
          'measurements': ['green', 'nir_1', 'swir_2', 'SCL'],
          'time': set_time,
          'group_by': 'solar_day'   # merge datasets acquired on the same solar day into one time slice
        }

### Dataset's native projection
native_crs = notebook_utils.mostcommon_crs(dc, query)
print(f"The dataset's native CRS is: {native_crs}")

query.update({ 'output_crs': native_crs,    # EPSG code
               'resolution': set_resolution, # target resolution
               'dask_chunks': {'time': 1}
             })  

## Load and pre-process the data

In [None]:
data = dc.load(**query)
data

Because the query sets `dask_chunks`, `dc.load` returns a lazy, `Dask`-backed `Xarray` dataset: the pixel values are only read when they are needed, for example for plotting. As usual, we can check the size the data object would occupy once fully loaded into memory. Our dataset is here relatively small, but a very large dataset could lead to technical and/or computational issues in the code further below in this notebook.

In [None]:
print(f"The size of 'data' in memory is {data.nbytes/(1024**2):.2f} MB.")
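The value reported by `nbytes` is the number of array elements multiplied by the bytes per element, summed over the bands. A back-of-envelope version of that calculation is sketched below. The dimensions and the 16-bit data type are assumptions chosen for illustration; the real values depend on the query and the product.

```python
# Hypothetical dimensions; the real ones are reported by data.sizes
n_time, n_y, n_x = 10, 1000, 1000
n_bands = 4            # green, nir_1, swir_2, SCL
bytes_per_value = 2    # assumption: every band is stored as 16-bit integers

size_bytes = n_time * n_y * n_x * n_bands * bytes_per_value
size_mb = size_bytes / (1024 ** 2)
print(f"Estimated size: {size_mb:.2f} MB")
```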

Next, we can do some basic data cleaning / re-formatting to allow for a useful display of the plots further below:

In [None]:
# "dark area pixels" have been included in the mask below just to avoid extra pixels being removed
good_pixel_flags = {'water', 'vegetation', 'bare soils', 'unclassified','dark area pixels'} # pixels to retain 
good_pixel_mask = enum_to_bool(data['SCL'], good_pixel_flags)
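`enum_to_bool` looks up the flag names in the product's metadata and returns a boolean mask. A minimal NumPy sketch of the same idea follows, assuming the standard Sentinel-2 scene classification codes (2 dark area, 4 vegetation, 5 bare soils, 6 water, 7 unclassified). The codes used by a given product should be checked against its own metadata.

```python
import numpy as np

# Assumed SCL class codes for the categories we want to keep
scl_codes = {
    "dark area pixels": 2,
    "vegetation": 4,
    "bare soils": 5,
    "water": 6,
    "unclassified": 7,
}

# Tiny hypothetical SCL tile: 3 = cloud shadow, 8 and 9 = cloud
scl = np.array([[4, 9, 6],
                [3, 5, 8]])

# True where the pixel belongs to one of the retained classes
good_mask = np.isin(scl, list(scl_codes.values()))
print(good_mask)
```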

In [None]:
# Find the most cloud-free scene to use in the first image
percentage_clear = good_pixel_mask.mean(dim=['x','y'])
best_scene = percentage_clear.argmax(dim='time').values
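The selection of the clearest scene can be illustrated on a small array. This is a sketch only: it uses a hypothetical boolean mask with dimensions (time, y, x) and plain NumPy rather than `Xarray`.

```python
import numpy as np

# Hypothetical mask: 3 time slices of 2x2 pixels, True = good pixel
mask = np.array([
    [[True, False], [False, False]],   # 1 of 4 pixels clear
    [[True, True],  [True, False]],    # 3 of 4 pixels clear
    [[True, True],  [False, False]],   # 2 of 4 pixels clear
])

# The mean of a boolean array over the spatial axes is the clear fraction
fraction_clear = mask.mean(axis=(1, 2))

# Index of the time slice with the largest clear fraction
best = int(fraction_clear.argmax())
print(fraction_clear, best)
```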

In [None]:
### Set all 'nodata' pixels to 'NaN'
data = masking.mask_invalid_data(data)

### Filter out bad pixels (clouds, etc)
data = data[['green','nir_1','swir_2']].where(good_pixel_mask)

### Set pixels to 'NaN' outside valid range for Sentinel-2 data (GA S2 products)
data = data.where((data>=0) & (data<=10000))

# Interactive widget plots

## Selecting the pixel drill location

In the next cell, we take the clearest time slice of the Sentinel-2 time series (`best_scene`, identified above), and plot it with a widget allowing the user to click on the plot and select a desired location for the pixel drill.

First, we define the callback function `onclick` that specifies what happens when the plot is clicked &ndash; this function here updates the text of the widget `wid` and sets the (global) parameters `pixelx` and `pixely` for the pixel drill location.

Subsequently, the plot is displayed and `mpl_connect` is used to bind the `onclick` callback function to the mouse click event.

In [None]:
### Callback function
def onclick(event):
    global pixelx, pixely
    ax = event.inaxes
    if ax is None:
        return  # Ignore clicks outside the image axes
    pixelx, pixely = int(event.xdata), int(event.ydata)
    wid.value = wid_heading + f'<p>Pixel selected: x = {pixelx}, y = {pixely}</p>'
    for collection in list(ax.collections):
        collection.remove()  # Remove any existing points
    for text in list(ax.texts):
        text.remove()  # Remove any existing text
    ax.scatter(pixelx, pixely, color='yellow', edgecolors='black')  # Plot new point
    ax.text(x=pixelx+200, y=pixely+200, s=f"{pixelx}, {pixely}", color='black', bbox=dict(facecolor='yellow', alpha=0.7))  # Add text

### Display the clearest time slice (identified above as 'best_scene')
image_array = data[['swir_2', 'nir_1', 'green']].isel(time=best_scene).to_array()
image_array.plot.imshow(robust=True, figsize=(7,7))

### Widget handling
fig = plt.gcf()
wid_heading = "<h3>Click to select a pixel of interest</h3>"
wid_result = "<p>Pixel selected:</p>"
wid = widgets.HTML(wid_heading+wid_result)
cid = fig.canvas.mpl_connect('button_press_event', onclick)
display(wid)
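The event mechanism can also be tried without a browser. The sketch below is not part of the original workflow: it uses the non-interactive Agg canvas and builds a `MouseEvent` by hand to simulate a click, which is an assumption made purely for demonstration. The names `on_click_sketch` and `clicks` are hypothetical.

```python
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.backend_bases import MouseEvent

# Off-screen figure with a single axes spanning 0..100 in x and y
fig = Figure(figsize=(4, 4))
canvas = FigureCanvasAgg(fig)
ax = fig.add_subplot(111)
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
canvas.draw()

clicks = []

def on_click_sketch(event):
    """Record the clicked data coordinates, ignoring clicks outside the axes."""
    if event.inaxes is None:
        return
    clicks.append((round(event.xdata), round(event.ydata)))

cid = canvas.mpl_connect('button_press_event', on_click_sketch)

# Simulate a click at data coordinates (30, 70): convert to display pixels first
x_disp, y_disp = ax.transData.transform((30, 70))
inside = MouseEvent('button_press_event', canvas, x_disp, y_disp, button=1)
canvas.callbacks.process('button_press_event', inside)

# Simulate a click in the figure margin, which the callback should ignore
outside = MouseEvent('button_press_event', canvas, 0, 0, button=1)
canvas.callbacks.process('button_press_event', outside)

print(clicks)
```

In the notebook itself the events come from real mouse clicks delivered by the `widget` backend, so none of the simulation code is needed there.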

## Pixel drill plot

Once the pixel drill location is selected, the next cell plots the complete timeseries of data for that pixel. 

For the sake of this demonstration, we here simply plot the `green` spectral band values &ndash; this workflow could easily be updated to display some band arithmetic index, such as NDVI for instance.
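As an illustration of such band arithmetic, NDVI is computed as (NIR - red) / (NIR + red). The sketch below uses made-up reflectance values in NumPy arrays. Note that a red band is not loaded in this notebook, so it would first have to be added to the `measurements` list, and its band name on a given deployment is an assumption to verify.

```python
import numpy as np

# Hypothetical surface reflectance values for three observations
nir = np.array([3000.0, 2500.0, 400.0])
red = np.array([500.0, 2500.0, 100.0])

# Normalised difference vegetation index, in the range [-1, 1]
ndvi = (nir - red) / (nir + red)
print(ndvi)
```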

As in the previous example, we make use of the `mpl_connect` function to link the (new) callback function `onclick2` to the mouse-click event on the new plot. Here, the callback function sets the global variable `selected_date` whenever the user clicks on a point on the plot.
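On a date axis, Matplotlib reports the clicked x position as a floating-point number of days since its epoch, which `matplotlib.dates.num2date` converts back to a datetime. The standard-library sketch below mimics that conversion, assuming the default epoch of 1970-01-01 used by recent Matplotlib versions. The function `num_to_date_sketch` and the clicked value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def num_to_date_sketch(num):
    """Convert days since 1970-01-01 (assumed epoch) to a calendar date."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return (epoch + timedelta(days=num)).date()

clicked_x = 18500.75   # hypothetical value of event.xdata
selected = str(num_to_date_sketch(clicked_x))
print(selected)
```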

<div class="alert alert-danger">
    <strong>IMPORTANT:</strong> The next diagram does not currently show which point you have selected. This is a work in progress and will be added. When you click on the plot, the Date of interest will be shown in text above the plot.
</div>

In [None]:
### Extract the pixel drill data
pix_drill_data = data.green.sel(y=pixely, x=pixelx, method='nearest')

### Plot the pixel drill time series
fig = plt.figure(figsize=(6,6))
plt.plot(data.time, pix_drill_data, 'ro-')
plt.xlabel('date'); plt.title('pixel drill data')
plt.ylabel('surface reflectance (green band)')

### Callback function
def onclick2(event):
    global selected_date
    if event.xdata is None:
        return  # Ignore clicks outside the plot axes
    selected_date = str( matplotlib.dates.num2date(event.xdata).date() )   # convert the clicked x position (a Matplotlib date number) to a date string
    wid2.value = f'Date of interest : {selected_date}'

### Widget handling
wid2 = widgets.HTML("Click on a point to select the date of interest")
cid = fig.canvas.mpl_connect('button_press_event', onclick2)
display(wid2)

Depending on the selected pixel drill location, you might see in the above plot a mix of high `green` values (approaching 10,000) as well as lower values (perhaps around 1,000). Pixels flagged as cloud in the `SCL` band were masked out earlier and appear as gaps in the time series, but such masks are not perfect: if you click and select various dates, you may find that any remaining high values correspond to images with residual cloud (high reflectance) over the pixel drill location. The lower values likely represent scenes where the sensor provides a clear (cloud-free) view of the selected pixel, and so represent the `green` values of the actual ground.

## Plot the selected image

Finally, we can simply extract the selected time slice from the Sentinel-2 time series and plot it.

<div class="alert alert-info">
    <strong>NOTE:</strong> In the following plot, areas of cloud or missing data will appear as areas in black.
</div>

In [None]:
### Convert selected date to datetime format
time_slice = np.datetime64(selected_date, 'ns')

### Use time stamp to extract and display the selected image
image_array = data[['swir_2', 'nir_1', 'green']].sel(time=time_slice, method='nearest').to_array()
image_array.plot.imshow(robust=True, figsize=(8,8));
ax = plt.gca()
ax.set_facecolor("black")
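The clicked date rarely matches an acquisition time stamp exactly, which is why `method='nearest'` is used in the selection above. The idea behind a nearest-neighbour lookup can be sketched in NumPy as follows; the time stamps are hypothetical.

```python
import numpy as np

# Hypothetical acquisition times of three scenes
times = np.array(
    ["2020-08-03T00:05:00", "2020-08-13T00:05:00", "2020-08-23T00:05:00"],
    dtype="datetime64[ns]",
)

# Date picked by the user, converted the same way as in the notebook
target = np.datetime64("2020-08-14", "ns")

# Index of the time stamp with the smallest absolute difference to the target
nearest_index = int(np.abs(times - target).argmin())
print(times[nearest_index])
```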

In [None]:
### End notebook