### General advice (delete this cell before submitting for review)

> * When choosing a location for your analysis, **select an area that has data on both the NCI and DEA Sandbox** to allow your code to be run on both environments. 
For example, you can check this for Landsat using the [DEA Explorer](https://explorer.sandbox.dea.ga.gov.au/ga_ls5t_ard_3/1990) (use the drop-down menu to view all products).
As of September 2019, the `DEA Sandbox` has a single year of continental Landsat data for 2015-16, and the full 1987-onward time-series for three locations (Perth WA, Brisbane QLD, and western NSW).
> * When adding **Products used**, embed the hyperlink to that specific product on the DEA Explorer using the `[product_name](product url)` syntax.
> * When writing in Markdown cells, start each sentence on a **new line**.
This makes it easy to see changes through git commits.
> * Use Australian English in markdown cells and code comments.
> * Check the [known issues](https://github.com/GeoscienceAustralia/dea-docs/wiki/Known-issues) for formatting regarding the conversion of notebooks to DEA docs using Sphinx.
Things to be aware of:
    * Sphinx is highly sensitive to bulleted lists:
        * Ensure that there is an empty line between any preceding text and the list
        * Only use the `*` bullet (`-` is not recognised)
        * Sublists must be indented by 4 spaces
    * Two kinds of formatting cannot be used simultaneously:
        * Hyperlinked code: \[\`code_format\`](hyperlink) fails
        * Bolded code: \*\*\`code_format\`\*\* fails
    * Headers must appear in hierarchical order (`#`, `##`, `###`, `####`) and there can only be one title (`#`).
> * Use the [PEP8 standard](https://www.python.org/dev/peps/pep-0008/) for code. To make sure all code in the notebook is consistent, you can use the `jupyterlab_code_formatter` tool: select each code cell, then click `Edit` and then one of the `Apply X Formatter` options (`YAPF` or `Black` are recommended). This will reformat the code in the cell to a consistent style.
> * For additional guidance, refer to the style conventions and layouts in approved `develop` branch notebooks. 
Examples include
    * [Frequently_used_code/Using_load_ard.ipynb](./Frequently_used_code/Using_load_ard.ipynb)
    * [Real_world_examples/Coastal_erosion.ipynb](./Real_world_examples/Coastal_erosion.ipynb)
    * [Scripts/dea_datahandling.py](./Scripts/dea_datahandling.py)
> * The DEA Image placed in the title cell will display as long as the notebook is contained in one of the standard directories.
It does not work in the highest level directory (hence why it doesn't display in the original template notebook).
> * In the final notebook cell, include a set of relevant tags which are used to build the DEA User Guide's [Tag Index](https://docs.dea.ga.gov.au/genindex.html). 
Use all lower-case (unless the tag is an acronym), separate words with spaces (unless it is the name of an imported module), and [re-use existing tags](https://github.com/GeoscienceAustralia/dea-notebooks/wiki/List-of-tags).
Ensure the tags cell below is in `Raw` format, rather than `Markdown` or `Code`.


# Exporting Landsat Collection 3 vegetation-related indices <img align="right" src="../Supplementary_data/dea_logo.jpg">

* **Compatibility:** Notebook currently compatible with the `NCI`|`DEA Sandbox` environment only
* **Products used:** 
[ga_ls5t_ard_3](https://explorer.sandbox.dea.ga.gov.au/ga_ls5t_ard_3),
[ga_ls7e_ard_3](https://explorer.sandbox.dea.ga.gov.au/ga_ls7e_ard_3),
[ga_ls8c_ard_3](https://explorer.sandbox.dea.ga.gov.au/ga_ls8c_ard_3)
* **Special requirements:** An _optional_ description of any special requirements, e.g. If running on the [NCI](https://nci.org.au/), ensure that `module load otps` is run prior to launching this notebook
* **Prerequisites:** An _optional_ list of any notebooks that should be run or content that should be understood prior to launching this notebook


## Background
An *optional* overview of the scientific, economic or environmental management issue or challenge being addressed by Digital Earth Australia. 
For `Beginners_Guide` or `Frequently_Used_Code` notebooks, this may include information about why the particular technique or approach is useful or required. 
If you need to cite a scientific paper or link to a website, use a persistent DOI link if possible and link in-text (e.g. [Dhu et al. 2017](https://doi.org/10.1080/20964471.2017.1402490)).

## Description
A _compulsory_ description of the notebook, including a brief overview of how Digital Earth Australia helps to address the problem set out above.
It can be good to include a run-down of the tools/methods that will be demonstrated in the notebook:

1. First we do this
2. Then we do this
3. Finally we do this

***

## Getting started

Provide any particular instructions that the user might need, e.g. To run this analysis, run all the cells in the notebook, starting with the "Load packages" cell. 

### Load packages
Import Python packages that are used for the analysis.

Use standard import commands; some are shown below. 
Begin with any `iPython` magic commands, followed by standard Python packages, then any additional functionality you need from the `Scripts` directory.

In [2]:
%matplotlib inline

import sys
import warnings
import matplotlib.pyplot as plt
import calendar
import numpy as np
import xarray as xr

import dask
from dask.utils import parse_bytes
from dask.distributed import Client, LocalCluster

import datacube
from datacube.storage import masking
from datacube.helpers import write_geotiff
from datacube.utils.rio import configure_s3_access
from datacube.utils.dask import start_local_dask

from psutil import virtual_memory, cpu_count

# Load custom DEA notebook functions
sys.path.append('../dea-notebooks/Scripts')
import dea_datahandling
import dea_plotting
import DEADataHandling
from dea_bandindices import calculate_indices

### Connect to the datacube

Connect to the datacube so we can access DEA data.
The `app` parameter is a unique name for the analysis which is based on the notebook file name.

In [3]:
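try:
    # Prefer the Collection 3 sample environment where it is configured
    dc_landsat3 = datacube.Datacube(app='VegProductsExport_Coll3', env='c3-samples')
except Exception:
    # Otherwise fall back to the default datacube environment
    dc_landsat3 = datacube.Datacube(app='VegProductsExport_Coll3')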

### Analysis parameters

* `dry_months`: Specific months of the year that correspond with dry season/low rainfall conditions. Values are zero-based and range from 0 (January) to 11 (December).


In [4]:
dry_months = [5,6,7]
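
As a quick sanity check, zero-based month values can be translated into month names with the standard-library `calendar` module.
This is only a sketch: the `+ 1` offset assumes that 0 corresponds to January, and `dry_month_names` is an illustrative name rather than a variable used elsewhere in the notebook.

```python
import calendar

# Zero-based dry season months, as defined in the analysis parameters
dry_months = [5, 6, 7]

# calendar.month_name is indexed from 1 (January) to 12 (December),
# so add 1 to each zero-based value before looking up its name
dry_month_names = [calendar.month_name[m + 1] for m in dry_months]

print(dry_month_names)
```

The same offset matters when comparing against xarray's `time.month` accessor, which also counts months from 1 to 12.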

### Define spatial and temporal query

If running this notebook locally, use the smaller spatial extent and subset of the time series. 

If running on Gadi, the temporal and spatial extent can be increased.  

> **Note:** Landsat imagery is available from 1987 onwards. 

## Spatial loop to divide up the data processing and reduce memory usage

In [5]:
# Create a series of coordinates with a fixed increment of 0.05 deg
coords_lon = np.arange(132.07, 135.46, 0.05)
coords_lat = np.arange(-20.31, -22.11, -0.05)
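
The slicing used to cut the study area into tiles can be checked on a small example before running the full loop.
The sketch below is illustrative only: `tile_bounds` and the `example_lon`/`example_lat` arrays are hypothetical names and values, and it assumes each tile is bounded by two consecutive edge coordinates, which is what a slice of the form `coords[i:(i+2)]` returns.

```python
import numpy as np


def tile_bounds(coords, i):
    """Return the (min, max) edges of tile i, i.e. the pair coords[i], coords[i + 1]."""
    pair = coords[i:(i + 2)]
    return float(pair.min()), float(pair.max())


# Illustrative edge coordinates only; the notebook builds its own with np.arange
example_lon = np.array([132.00, 132.05, 132.10, 132.15])
example_lat = np.array([-20.00, -20.05, -20.10])  # descending, like coords_lat

# Four edges give three tiles, three edges give two tiles
lon_tiles = [tile_bounds(example_lon, i) for i in range(example_lon.size - 1)]
lat_tiles = [tile_bounds(example_lat, j) for j in range(example_lat.size - 1)]

print(lon_tiles)
print(lat_tiles)
```

Taking the minimum and maximum of each pair keeps the bounds in ascending order even when the coordinate array is descending, as it is for latitude.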

In [None]:
for i in range(coords_lon.size-2):
#for i in range(2):
    for j in range(coords_lat.size-2):
#    for j in range(2):
        query_3 = {'lon': coords_lon[i:(i+2)],
          'lat': coords_lat[j:(j+2)],             # full study area
          'time':('2000-01', '2018-12'),       # subset of time-series
          #'time':('1987-01', '2018-12'),       # full time-series
          'output_crs': 'EPSG:28352',
          'resolution': (30, 30),
          'group_by': 'solar_day'
          }   
 
        # Load Landsat data from Collection 3 using load_ard.
        # mask_dtype = np.float16 helps to keep the memory down, however, the data will need to be converted back to float32 later.
        ds = dea_datahandling.load_ard(dc=dc_landsat3,
          mask_dtype = np.float16,
          products=['ga_ls5t_ard_3', 'ga_ls7e_ard_3', 'ga_ls8c_ard_3'], 
          measurements=['nbart_red','nbart_nir','nbart_green',
                                      'nbart_blue','nbart_swir_1','nbart_swir_2'],
          mask_contiguity='nbart_contiguity', min_gooddata=0.90,
          **query_3)
    
        #set variable for path to save files
        savefilepath = '/g/data/zk34/ljg547/Outputs/'

        # Set project naming convention. Start and end dates are reformatted to remove '-'.
        Proj = 'SSC_WD_'

        # Take the YYYY-MM-DD part of the first and last timestamps and drop the hyphens
        ds_startDate = str(ds.isel(time=0).time.values)[0:10].replace('-', '')
        ds_endDate = str(ds.isel(time=-1).time.values)[0:10].replace('-', '')
        
        # Calculate NDWI. NDWI is added to ds as a new band as shown in the below display
        calculate_indices(ds,index = 'NDWI', collection = 'ga_ls_3',
                normalise = True, deep_copy = False)
        
        # Create a new NDWI DataArray 
        ndwi = ds.NDWI

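        # Select the NDWI time-steps that fall within the dry season months for later monthly averaging.
        # dry_months is zero-based (0-11) whereas xarray's 'time.month' is 1-12, so add 1 before comparing
        mean_ndwi_dry = ndwi[ndwi['time.month'].isin([m + 1 for m in dry_months])]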
        
        # Calculate mean NDWI for each month in the dry period (through time and for each pixel)
        mean_ndwi_dry = mean_ndwi_dry.groupby('time.month').mean(dim = 'time')

        # Calculate the mean for the dry period based on the mean monthly values for each month in the dry period
        mean_ndwi_dry = mean_ndwi_dry.mean(dim = 'month')
        
        # Convert from float16 to float32
        arr = mean_ndwi_dry.astype(dtype='float32')

        # Convert from DataArray to Dataset
        arr = arr.to_dataset(name='mean_NDWI_dry')

        # Copy the attributes (including the CRS) from the original Dataset
        arr.attrs = ds.attrs
        
        # Generating naming convention for dry season files based on Project area (Proj), specified dry season and time series start and end dates. 
        fname = str(savefilepath + Proj + 'meanNDWI_DrySeason_' +
                      ds_startDate + '_' + ds_endDate + '_' + 
                      "Lon" + str(i) + "Lat" + str(j) + '.tif')
        
        # Writing data to file
        write_geotiff(dataset=arr, filename=fname)
        
        # Create an associated metadata file ('w' opens it for writing); the with block closes it afterwards
        meta_fname = (savefilepath + Proj + 'meanNDWI_DrySeason' +
                      str(dry_months[0]+1) + 'to' + str(dry_months[-1]+1) +
                      '_' + ds_startDate + '_' + ds_endDate + '.txt')

        with open(meta_fname, 'w') as f:
            f.write("NDWI of dry period (" + str(dry_months[0]+1) + "-" + str(dry_months[-1]+1) + " month)" +
                  " from " + ds_startDate + "-" + ds_endDate + "." + "\n" +
                  "NDWI_dry_mean is the mean value of NDWI over the dry months." + "\n" +
                  "Coordinates are longitude: " + str(round(coords_lon[i],2)) + ' to ' +
                  str(round(coords_lon[i+2],2)) + "; latitude: " + str(round(coords_lat[j],2)) + " to " +
                  str(round(coords_lat[j+2],2)) + "." + "\n" + "Data with >10% cloud was discarded." + "\n" +
                  "This product was derived from NDWI_Export.ipynb"
                )
        
        print(i,j)
        
    i=i+1
    
print("--end--")

Loading ga_ls5t_ard_3 data
    Filtering to 118 out of 130 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 56 out of 379 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 91 out of 127 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 265 observations 
0 0
Loading ga_ls5t_ard_3 data
    Filtering to 118 out of 130 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 55 out of 379 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 91 out of 127 observations
    Applying pixel 

  return np.nanmean(a, axis=axis, dtype=dtype)


6 30
Loading ga_ls5t_ard_3 data
    Filtering to 123 out of 261 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 55 out of 764 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 252 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 275 observations 
6 31
Loading ga_ls5t_ard_3 data
    Filtering to 123 out of 260 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 56 out of 758 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 96 out of 252 observations
    Applying 

  return np.nanmean(a, axis=axis, dtype=dtype)


7 31
Loading ga_ls5t_ard_3 data
    Filtering to 109 out of 240 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 55 out of 604 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 94 out of 167 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 258 observations 
7 32
Loading ga_ls5t_ard_3 data
    Filtering to 107 out of 226 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 55 out of 576 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 133 observations
    Applying 

  return np.nanmean(a, axis=axis, dtype=dtype)


8 29
Loading ga_ls5t_ard_3 data
    Filtering to 102 out of 196 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 53 out of 537 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 95 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 250 observations 
8 30
Loading ga_ls5t_ard_3 data
    Filtering to 104 out of 171 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 54 out of 490 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 128 observations
    Applying 

  return np.nanmean(a, axis=axis, dtype=dtype)


9 29
Loading ga_ls5t_ard_3 data
    Filtering to 104 out of 143 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 51 out of 392 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 252 observations 
9 30
Loading ga_ls5t_ard_3 data
    Filtering to 107 out of 140 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 55 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 95 out of 128 observations
    Applying 

  return np.nanmean(a, axis=axis, dtype=dtype)


10 29
Loading ga_ls5t_ard_3 data
    Filtering to 103 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 52 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 92 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 247 observations 


  return np.nanmean(a, axis=axis, dtype=dtype)


10 30
Loading ga_ls5t_ard_3 data
    Filtering to 104 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 53 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 94 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 251 observations 
10 31
Loading ga_ls5t_ard_3 data
    Filtering to 106 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 52 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 95 out of 128 observations
    Applyin

  return np.nanmean(a, axis=axis, dtype=dtype)


10 33
Loading ga_ls5t_ard_3 data
    Filtering to 107 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 51 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 255 observations 
10 34
Loading ga_ls5t_ard_3 data
    Filtering to 232 out of 262 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 114 out of 759 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 174 out of 248 observations
    Apply

  return np.nanmean(a, axis=axis, dtype=dtype)


11 30
Loading ga_ls5t_ard_3 data
    Filtering to 102 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 53 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 128 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Combining and sorting data
    Returning 252 observations 
11 31
Loading ga_ls5t_ard_3 data
    Filtering to 106 out of 134 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls7e_ard_3 data
    Filtering to 52 out of 388 observations
    Applying pixel quality/cloud mask
    Applying invalid data mask
    Applying contiguity mask
Loading ga_ls8c_ard_3 data
    Filtering to 97 out of 128 observations
    Applyin

### Setting up automated file naming and location
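
Output file names combine a project prefix, the start and end dates of the loaded time series, and the tile indices.
A minimal sketch of that naming scheme is shown below, assuming timestamps are available as `numpy.datetime64` values. The helper names `compact_date` and `tile_filename`, the `/tmp/outputs/` folder and the example dates are all hypothetical.

```python
import numpy as np


def compact_date(timestamp):
    """Format a numpy datetime64 as YYYYMMDD (no hyphens)."""
    return np.datetime_as_string(timestamp, unit='D').replace('-', '')


def tile_filename(folder, project, start, end, i, j):
    """Build an output path from the project prefix, date range and tile indices."""
    return (f"{folder}{project}meanNDWI_DrySeason_"
            f"{compact_date(start)}_{compact_date(end)}_Lon{i}Lat{j}.tif")


# Hypothetical values for illustration
start = np.datetime64('2000-01-05T00:12:31')
end = np.datetime64('2018-12-25T00:40:02')
fname = tile_filename('/tmp/outputs/', 'SSC_WD_', start, end, 0, 3)

print(fname)
```

Formatting the whole date and then removing the hyphens keeps both digits of the month and day, which avoids errors for dates such as the 25th of December.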

## Normalised Difference Wetness Index (NDWI)
Add text about NDWI
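
For reference, a normalised difference index of this kind can be computed directly from two bands.
The sketch below assumes the green/near-infrared formulation, `(green - nir) / (green + nir)`; the exact formula applied by `calculate_indices` for `index='NDWI'` should be confirmed against `dea_bandindices.py`.
The function name and reflectance values are illustrative only.

```python
import numpy as np


def normalised_difference(band_a, band_b):
    """Return (band_a - band_b) / (band_a + band_b), with NaN where the sum is zero."""
    band_a = np.asarray(band_a, dtype='float32')
    band_b = np.asarray(band_b, dtype='float32')
    total = band_a + band_b
    # Suppress divide-by-zero warnings; those pixels are set to NaN below
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(total == 0, np.nan, (band_a - band_b) / total)


# Illustrative surface reflectance values (not real data)
green = np.array([0.10, 0.30, 0.00])
nir = np.array([0.30, 0.10, 0.00])

ndwi = normalised_difference(green, nir)
print(ndwi)
```

Values always fall between -1 and 1, which makes them comparable between scenes with different overall brightness.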

### NDWI statistics for the defined dry season

Here we use the previously defined `dry_months` parameter to look at the mean and standard deviation of NDWI during dry conditions over multiple years.

Looking at the same period of time over multiple years reduces the noise and highlights longer-term trends in vegetation wetness under dry conditions.

Vegetation with access to groundwater is hypothesised to have a higher mean and lower standard deviation of NDWI than vegetation that depends on less reliable water sources (i.e. rain-fed vegetation).

`mean_ndwi_dry` can be viewed after each step to understand how that step compresses the time series data.
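
The two-stage averaging can be illustrated on a synthetic single-pixel series.
The sketch below uses plain NumPy rather than xarray to stay self-contained: it takes a mean for each calendar month, then a mean across those months, and also computes a standard deviation over the same dry season observations.
All values and variable names are made up for illustration, and the dry season is assumed here to be June to August.

```python
import numpy as np

# Synthetic single-pixel NDWI series: one observation per month for two years
times = np.arange('2000-01', '2002-01', dtype='datetime64[M]')
values = np.linspace(-0.4, 0.2, times.size)

# Calendar month (1-12) of each observation
# (datetime64[M] values cast to integers count months since 1970-01)
months = times.astype(int) % 12 + 1

# Calendar months treated as the dry season in this example
dry_calendar_months = [6, 7, 8]
is_dry = np.isin(months, dry_calendar_months)

# Stage 1: mean of each dry month across all years
monthly_means = np.array([values[months == m].mean() for m in dry_calendar_months])

# Stage 2: mean of the monthly means
dry_mean = monthly_means.mean()

# Spread of all dry season observations
dry_std = values[is_dry].std()

print(monthly_means, dry_mean, dry_std)
```

Averaging per month first gives each month equal weight, even when cloud masking leaves some months with fewer observations than others.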

### Exporting data
To export a simple single-band, single time-slice GeoTIFF with the `write_geotiff` function from `datacube.helpers`, the xarray DataArrays above need to be converted to xarray Datasets. 

We do this by using the xarray method `.to_dataset`. If you don't do this, the `write_geotiff` function will return an error. 

We also need to reassign the coordinate reference system before the `write_geotiff` function will work. This is done by setting the `.attrs` attribute. We take the CRS from the original imported data (`ds`). 

Each file will be exported as a GeoTIFF and saved in the directory defined by `savefilepath`. It can be downloaded from this location to the GA network using FileZilla.
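
The conversion steps can be tried on a small stand-in array before exporting real data.
This is a sketch under stated assumptions: the CRS is assumed to be stored under a `crs` key in `ds.attrs` and is represented by a plain string, the 2 x 2 array is synthetic, and the call to `write_geotiff` is left out because it needs a working datacube installation.

```python
import numpy as np
import xarray as xr

# Stand-in for the loaded data: a Dataset whose attrs carry the CRS
# (a plain string is used here purely for illustration)
ds = xr.Dataset(
    {'NDWI': (('y', 'x'), np.zeros((2, 2), dtype='float16'))},
    coords={'y': [0.0, -30.0], 'x': [0.0, 30.0]},
    attrs={'crs': 'EPSG:28352'},
)

# Stand-in for the dry season mean DataArray
mean_ndwi_dry = ds.NDWI

# Convert to float32, wrap the DataArray in a named Dataset,
# then copy the attributes (including the CRS) from the original data
arr = mean_ndwi_dry.astype('float32').to_dataset(name='mean_NDWI_dry')
arr.attrs = ds.attrs

print(arr)
```

Printing `arr` before exporting is a quick way to confirm that the variable name, data type and attributes are as expected.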

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** If you need assistance, please post a question on the [Open Data Cube Slack channel](http://slack.opendatacube.org/) or on the [GIS Stack Exchange](https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the `open-data-cube` tag (you can view previously asked questions [here](https://gis.stackexchange.com/questions/tagged/open-data-cube)).
If you would like to report an issue with this notebook, you can file one on [GitHub](https://github.com/GeoscienceAustralia/dea-notebooks).

**Last modified:** October 2019

**Compatible datacube version:** 

In [None]:
print(datacube.__version__)

## Tags
Browse all available tags on the DEA User Guide's [Tags Index](https://docs.dea.ga.gov.au/genindex.html)