# SWAMpy Datacube - Great Barrier Reef Example

This example demonstrates how to use SWAMpy with the Datacube from an IPython Notebook. The key steps in the workflow are:

1. Query Datacube
2. Visualize search results to identify scenes to process
3. Load configuration files and model parameters to be used in the SWAMpy calculation
4. Run SWAMpy algorithm on selected datasets
5. Visualize results through time and across parameters
6. Time series analysis

SWAMpy can be a **CPU-intensive operation** (e.g. a 350x350 pixel area takes ~20 minutes with 8 CPUs), so keep this in mind when defining an area to process.
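
As a rough planning aid, the timing quoted above can be turned into a back-of-envelope estimate. The sketch below assumes that runtime scales linearly with pixel count and inversely with the number of cores; this is a simplifying assumption, not a measured property of SWAMpy, and `estimate_runtime_minutes` is a hypothetical helper that is not part of the package.

```python
def estimate_runtime_minutes(nx, ny, ncores,
                             ref_pixels=350 * 350, ref_minutes=20.0, ref_cores=8):
    """Rough per-scene runtime, scaled from the reference timing quoted above.

    Assumes linear scaling with pixel count and inverse scaling with cores;
    both are simplifying assumptions, not measured behaviour.
    """
    if nx <= 0 or ny <= 0 or ncores <= 0:
        raise ValueError("nx, ny and ncores must be positive")
    return ref_minutes * (nx * ny) / ref_pixels * ref_cores / ncores


# Example: the reference case itself, and a smaller region on 4 cores
print(estimate_runtime_minutes(350, 350, 8))
print(round(estimate_runtime_minutes(220, 200, 4), 1))
```

Treat the result as an order-of-magnitude guide only; actual runtimes depend on the optimizer settings and the data.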

## Step A - One-off setup of the SWAMpy code base and example configuration files
If you have previously run these cells on your host then there is no need to perform these steps again.<br>
**Note: Requires a kernel restart (from the menu Kernel>Restart) for the newly installed modules to be recognized**

In [None]:
!export PIP_IGNORE_INSTALLED=0; pip3 install --user swampy-spatial-datacube

In [None]:
!git clone https://bitbucket.csiro.au/scm/datacube/swampy-data-cfg-tiles.git

__Now restart the kernel (from the menu Kernel>Restart) so that the newly installed modules are recognized__
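
After the restart, a quick check can confirm that the package is visible to the kernel. This is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper, and the module name is taken from the imports used later in this notebook.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Module name as imported in the later steps of this notebook
print('swampy_spatial_datacube importable:', is_importable('swampy_spatial_datacube'))
```

If this prints `False` after installation, the kernel most likely has not been restarted or the user site directory is not on the Python path.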

## Step 0 - Display Options

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 9999

In [None]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}

In [None]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

## Step 1 - Setup input/output directories

If swampy-data-cfg-tiles was cloned to a directory other than /home/jovyan/sample-notebooks, then `datadir` will need to be updated.
Additionally, the paths in the config files that point to the sensor and SIOPs databases will need to be updated.

In [None]:
import os
datadir = '/home/jovyan/sample-notebooks/swampy-data-cfg-tiles'
idir = datadir + '/cfg/gbr/'
odir = datadir + '/results/gbr/'
os.makedirs(odir, exist_ok=True)
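
Before moving on, it can help to confirm that the configuration directory contains the files loaded in Step 6. The sketch below is an optional pre-flight check; `missing_config_files` is a hypothetical helper, and the file names are the ones referenced in Step 6 of this notebook.

```python
import os

# File names as referenced in Step 6 of this notebook
EXPECTED_CFG_FILES = [
    'inputs_options.yaml',
    'output_options.yaml',
    'siops_datasets_sam_par.yaml',
    'sensor.yaml',
    'model_params.yaml',
    'optimization_parameters.yaml',
]


def missing_config_files(cfg_dir, names=EXPECTED_CFG_FILES):
    """Return the expected file names that are not present in cfg_dir."""
    return [n for n in names if not os.path.isfile(os.path.join(cfg_dir, n))]


# Usage with the idir defined above (left commented so the sketch runs anywhere):
# missing = missing_config_files(idir)
# if missing:
#     print('Missing config files:', missing)
```

An empty list means every expected file was found; it does not verify that the paths inside those files are correct.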

## Step 2 - List products in Datacube

In [None]:
import datacube

dc = datacube.Datacube()
print(dc.list_products())

In [None]:
df = dc.list_measurements(with_pandas=True)
print(df)

## Step 3 - Query Datacube

In [None]:
# Two regions of interest: the first is about 220x200 pixels, the second about 350x300 pixels
# The first region takes ~10 minutes per date using the NCI VDI (8 cores)
lat = [(-18.940833,-18.986388), (-18.8333,-18.96472)]
lon = [(147.6825,147.72472), (147.68916, 147.81167)]

roi_idx = 0

query = {
    'product': 'ls8_usgs_sr_albers',
    'time': ('2000-01-01', '2016-11-02'),
    'lat': lat[roi_idx],  
    'lon': lon[roi_idx],
    'group_by': 'solar_day'
}

refl_xr = dc.load(measurements=['blue','green','red','nir'], **query, use_threads=True )
pqa = dc.load(measurements=['pixel_qa'], **query,  use_threads=True)
print(refl_xr.dims)
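
The pixel counts quoted in the comments can be sanity-checked from the latitude/longitude bounds alone. This is a rough sketch, assuming a spherical approximation of about 111.32 km per degree and a pixel size that you supply. The 25 m used in the example is an assumption, not a value taken from the product definition, and `approx_roi_pixels` is a hypothetical helper. The estimate ignores the Albers reprojection, so expect it to differ somewhat from the dimensions of the loaded array.

```python
import math


def approx_roi_pixels(lat_range, lon_range, pixel_size_m):
    """Approximate (rows, cols) of a lat/lon box for a given pixel size in metres."""
    m_per_deg = 111_320.0                      # approximate metres per degree of latitude
    mean_lat = math.radians(sum(lat_range) / 2.0)
    height_m = abs(lat_range[0] - lat_range[1]) * m_per_deg
    # A degree of longitude shrinks with the cosine of the latitude
    width_m = abs(lon_range[0] - lon_range[1]) * m_per_deg * math.cos(mean_lat)
    return round(height_m / pixel_size_m), round(width_m / pixel_size_m)


# First region of interest from the query above; 25 m pixel size is assumed
rows, cols = approx_roi_pixels((-18.940833, -18.986388), (147.6825, 147.72472), 25.0)
print(rows, cols)
```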

## Step 4 - List the available dates

In [None]:
print('%i dates of imagery found over the search area' % len(refl_xr.coords['time']))
print('Dates of imagery are: ', refl_xr.coords['time'])

## Step 5 - Visualize the datasets returned in search results

In [None]:
%matplotlib inline
from swampy_spatial_datacube.swampy_datacube_plotting import do_plot

nimage = 3   # number of images plotted per row
img_idx_list = list(range(len(refl_xr.time)))
img_groups = [ img_idx_list[i:i+nimage] for i in range(0, len(refl_xr.time), nimage) ]
for igp in img_groups:
    do_plot(refl_xr.isel(time=igp),ncols=nimage, fsize=(10,5), dpi=80)

## Step 6 - Load configuration files and model parameters used for SWAMpy calculation
**Remember to modify the configuration files so that they point to your local installation.**

In [None]:
import multiprocessing as mp
from swampy_spatial_datacube.swampy_spatial_datacube import load_swampy_config_files

input_file=idir+'inputs_options.yaml'
output_file=idir+'output_options.yaml'
siops_datasets=idir+'siops_datasets_sam_par.yaml'
sensor_data=idir+'sensor.yaml'
model_params=idir+'model_params.yaml'
optz_params=idir+'optimization_parameters.yaml'

print('Loading config files...')
cfg = load_swampy_config_files(input_options=input_file,
                                output_options=output_file,
                                siops_datasets=siops_datasets,
                                sensor_data=sensor_data,
                                model_params=model_params,
                                optz_params=optz_params)
                               
ncores = mp.cpu_count()
cfg['input_options']['mask_module_file'] = None
cfg['output_options']['odir'] = odir
cfg['input_options']['mask_pq_func'] = None 
cfg['input_options']['mask_refl_func'] = None
cfg['optz_params']['pool'] = ncores

print('Loading of files complete')

### Review model parameters for run

In [None]:
cfg['model_params']['envmeta']['q_factor'] = 4.0

In [None]:
import pprint
pprint.pprint(cfg)
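
Because the configuration is a plain nested dictionary, a mistyped key in an assignment silently creates a new entry instead of overriding the intended one. The sketch below shows one way to guard against that; `set_existing` is a hypothetical helper, and `demo_cfg` is a stand-in dictionary with a placeholder value, not the real configuration.

```python
def set_existing(cfg, keys, value):
    """Set cfg[k0][k1]...[kn] = value, raising KeyError if any key is absent."""
    node = cfg
    for k in keys[:-1]:
        node = node[k]              # KeyError if an intermediate key is missing
    if keys[-1] not in node:
        raise KeyError(keys[-1])    # refuse to create a new key by accident
    node[keys[-1]] = value


# Stand-in dictionary mirroring the nesting used above (the value is a placeholder)
demo_cfg = {'model_params': {'envmeta': {'q_factor': 1.0}}}
set_existing(demo_cfg, ['model_params', 'envmeta', 'q_factor'], 4.0)
print(demo_cfg)
```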

## Step 7 - Run the SWAMpy calculation on selected scenes
**Select scenes by setting the `image_select` variable.** Use the index number counted left to right, then top to bottom, starting from zero, as laid out in the Step 5 plots. Note that SWAMpy can be
CPU intensive, so select a small area to process initially (e.g. 350 x 350 pixels). Processing status is displayed in standard output.

In [None]:
from swampy_spatial_datacube.swampy_spatial_datacube import swampy_spatial_datacube_jpnb
# Select image to process
print('Doing calculation')
# select scenes here
image_select = [0,1,2]                                            
swampy_spatial_datacube_jpnb(refl_xr.isel(time=image_select),
                             cfg['input_options'], 
                             cfg['output_options'],
                             cfg['siops_datasets'], 
                             cfg['sensor_data'], 
                             cfg['model_params'],
                             cfg['optz_params'],
                             pq_xr=None)
print('Finished calculation')
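
The scene indices used in `image_select` follow the grid layout produced in Step 5. The sketch below illustrates that numbering with plain Python; `group_indices` and `scene_index` are hypothetical helpers, and the scene count of 7 is chosen only for illustration.

```python
def group_indices(n_scenes, per_row):
    """Split scene indices 0..n_scenes-1 into rows of at most per_row items."""
    idx = list(range(n_scenes))
    return [idx[i:i + per_row] for i in range(0, n_scenes, per_row)]


def scene_index(row, col, per_row):
    """Index of the scene at (row, col) in the plot grid, both zero-based."""
    return row * per_row + col


# Example with 7 scenes plotted 3 per row
print(group_indices(7, 3))
print(scene_index(1, 2, 3))
```

For example, the third image in the second row of a three-column layout corresponds to index 5.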

## Step 8 - Review SWAMpy outputs for a single date

In [None]:
import xarray as xr
from swampy_spatial_datacube.swampy_datacube_plotting import do_plot_swampy_single_result
# Load results file
# modify to reflect location of local output directory
filename=odir+'Landsat8_GBR_2013-08-05_SWAMPYV2.nc' 
#filename=odir+'Landsat8_GBR_2013-06-18_SWAMPYV2.nc'
dt = xr.open_dataset(filename)
do_plot_swampy_single_result(dt.isel(time=[0]))
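
When choosing which results file to open, the date can be read back from the file name. The sketch below assumes the naming pattern seen in the two example file names above (`..._YYYY-MM-DD_SWAMPYV2.nc`); that pattern is inferred from those examples rather than documented, and `date_from_output_name` is a hypothetical helper.

```python
import re
from datetime import date

# Pattern inferred from the example file names above; treat it as an assumption
_NAME_RE = re.compile(r'_(\d{4})-(\d{2})-(\d{2})_SWAMPYV2\.nc$')


def date_from_output_name(filename):
    """Return the date encoded in an output file name, or None if it does not match."""
    m = _NAME_RE.search(filename)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups())
    return date(year, month, day)


print(date_from_output_name('Landsat8_GBR_2013-08-05_SWAMPYV2.nc'))
```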

## Step 9 - Load multiple output datasets

In [None]:
dt = xr.open_mfdataset(odir+'*V2.nc')
print(dt.dims)

## Step 10 - Plot key parameters through time
### True colour

In [None]:
%matplotlib inline
from  swampy_spatial_datacube.swampy_datacube_plotting import do_plot
image_select = [0,1,2]

do_plot(refl_xr.isel(time=image_select),ncols=3, fsize=(10,10), dpi=80)

### Coloured Dissolved Organic Matter (CDOM)

In [None]:
dt.cdom.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Non-Algal Particulates (NAP)

In [None]:
dt.nap.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Depth

In [None]:
dt.depth.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Chlorophyll

In [None]:
dt.chl.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Substrate Fraction - Sand

In [None]:
dt.sub1_frac_norm.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Substrate Fraction - Acropora

In [None]:
dt.sub2_frac_norm.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

### Substrate Fraction - Turf Algae

In [None]:
dt.sub3_frac_norm.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

# Step 11 - Time series analysis

## Setup output directory

In [None]:
import pprint
# Set up the output directory for the time-series results
odirB = odir + 'time-series/'
os.makedirs(odirB, exist_ok=True)
# List any files already present in the output directory
files = sorted(os.listdir(odirB))
pprint.pprint(files)

## Get location of interest

In [None]:
xmid = refl_xr.x[len(refl_xr.x)//2].values
ymid = refl_xr.y[len(refl_xr.y)//2].values
print(xmid, ymid)

In [None]:
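# Note: despite its name, ncores is used below as the side length (in pixels)
# of the square window passed to get_profile
ncores = 8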

## Get profile

In [None]:
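import numpy as np

def get_profile(refl_xr, xgeo, ygeo, buf):
    """Extract a window of up to buf x buf pixels around the point (xgeo, ygeo)."""
    # Index of the pixel nearest to the requested coordinates
    xidx = np.argmin(abs(refl_xr.x.values - xgeo))
    yidx = np.argmin(abs(refl_xr.y.values - ygeo))
    # Clamp the window start so that the window stays inside the array
    ia = max(min(xidx - buf//2, len(refl_xr.x) - buf), 0)
    xbuf = list(range(ia, min(ia + buf, len(refl_xr.x))))
    ia = max(min(yidx - buf//2, len(refl_xr.y) - buf), 0)
    ybuf = list(range(ia, min(ia + buf, len(refl_xr.y))))

    return refl_xr.isel(x=xbuf, y=ybuf).copy(), xbuf, ybuf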
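
The window arithmetic used by `get_profile` can be tried out in isolation, without xarray. The sketch below is a stand-alone illustration along a single axis; `window_indices` is a hypothetical helper, and it shifts the window back inside the axis when the centre is close to either edge.

```python
def window_indices(center, buf, size):
    """Indices of a window of up to buf pixels around center, kept within [0, size)."""
    start = center - buf // 2
    start = max(min(start, size - buf), 0)   # shift the window back inside the axis
    stop = min(start + buf, size)            # truncate if the axis is shorter than buf
    return list(range(start, stop))


print(window_indices(50, 8, 100))   # interior point
print(window_indices(2, 8, 100))    # near the lower edge
print(window_indices(98, 8, 100))   # near the upper edge
```

Applying the same clamping to both axes guarantees that every returned index is valid for `isel`.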

In [None]:
prefl_xr, xbuf, ybuf = get_profile(refl_xr, xmid, ymid, ncores) 
ppqa_xr = pqa.isel(x=xbuf,y=ybuf).copy()


## Setup Cloud Masking

In [None]:
from datacube.storage import masking
import pprint

# Show the flag definitions for the pixel quality band
pprint.pprint(masking.get_flags_def(pqa))

In [None]:
def mask_cloud(pq):
    # b0 - fill, b3 - cloud shadow, b5 - cloud
    good_data = (pq & 41) == 0    
    return good_data.pixel_qa.values
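
The constant 41 in `mask_cloud` is the sum of the three flagged bits (2^0 + 2^3 + 2^5). The sketch below builds the same mask from bit positions and applies it to a few plain integers; `bits_to_mask` and `is_good` are hypothetical helpers, and the sample values are illustrative bit patterns rather than values read from real imagery.

```python
def bits_to_mask(bits):
    """Combine bit positions into a single integer mask."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


BAD_BITS = (0, 3, 5)                 # fill, cloud shadow, cloud (as in the comment above)
BAD_MASK = bits_to_mask(BAD_BITS)


def is_good(pixel_qa_value, mask=BAD_MASK):
    """True when none of the masked bits are set."""
    return (pixel_qa_value & mask) == 0


print(BAD_MASK)
# Illustrative values: only bit 1 set, only bit 3 set, only bit 5 set
for v in (0b000010, 0b001000, 0b100000):
    print(v, is_good(v))
```

Writing the mask in terms of bit positions makes it easier to add or remove a flag without recomputing the constant by hand.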

In [None]:
# The full-extent arrays are no longer needed; only the profile subsets are used below
del refl_xr
del pqa

## Run SWAMpy on profile data

In [None]:
from swampy_spatial_datacube.swampy_spatial_datacube import swampy_spatial_datacube_jpnb

# change run options
cfg['output_options']['odir'] = odirB
cfg['model_params']['free_params']['p_min']['chl'] = 0.0099

# setup cloud masking
cfg['input_options']['mask_pq_func'] = mask_cloud
#cfg['input_options']['mask_refl_func'] = func2

print('Doing calculation')
# run on the profile window extracted above
swampy_spatial_datacube_jpnb(prefl_xr,
                             cfg['input_options'], 
                             cfg['output_options'],
                             cfg['siops_datasets'], 
                             cfg['sensor_data'], 
                             cfg['model_params'],
                             cfg['optz_params'],
                             pq_xr=ppqa_xr)
print('Finished calculation')


## Load output datasets

In [None]:
import xarray as xr

dt = xr.open_mfdataset(odirB+'*V2.nc')
print(dt.dims)

## Visualize results

In [None]:
dt.cdom.plot(x='x',y='y',col='time', col_wrap=3, cmap='rainbow')

In [None]:
dt.cdom.isel(x=0,y=0).plot(aspect=2, size=5)

In [None]:
dt.depth.isel(x=0,y=0).plot(aspect=2, size=5)

In [None]:
dt.nap.isel(x=0,y=0).plot(aspect=2, size=5)

In [None]:
dt.chl.isel(x=0,y=0).plot(aspect=2, size=5)