# Create an Australian coastline using ITEM

**What does this notebook do?** This notebook uses the [Intertidal Extents Model (ITEM) v2](http://pid.geoscience.gov.au/dataset/ga/113842) dataset to generate a high tide coastline for all of Australia. This coastline is derived from Landsat-based data, and is therefore useful for masking out the land/ocean in Landsat data.

**Requirements:** Specific running requirements have been documented within the code and markdown below. 

**Please note that the Australia-wide coastline analysis was run on an interactive megamem node on raijin, as there is not enough memory to run it here (or on a normal raijin node).**

` qsub -I -l walltime=24:00:00,mem=3TB,ncpus=32 -P r78 -q megamem -l wd`

**Date:** July 2019

**Author:** Claire Krause, Robbi Bishop-Taylor

In [None]:
%pylab notebook

import sys

from skimage import measure
import xarray as xr

# Make the local helper scripts importable
sys.path.append('../10_Scripts')
import SpatialTools

## Preprocessing the ITEM data

The coastline layer was generated from the [Intertidal Extents Model (ITEM) v2](http://pid.geoscience.gov.au/dataset/ga/113842). The data were accessed via [THREDDS](http://dap.nci.org.au/thredds/remoteCatalogService?catalog=http://dapds00.nci.org.au/thredds/catalogs/fk4/item_2_0.xml), using the link provided in the data record.

The data are stored in three formats: geotiff, netcdf and shapefile. The shapefile represents the coastal compartments used for the ITEM analysis, and so does not contain the actual ITEM data needed for this analysis.

We used the geotiff files to generate a virtual raster, to allow us to visualise and interrogate the data. The commands below were run from a terminal window on the VDI, with the dea environment loaded.

```
cd /g/data/fk4/datacube/002/ITEM/ITEM_2_0/

gdalbuildvrt -srcnodata "-6666" /g/data/r78/cek156/ITEM.vrt geotiff/ITEM_REL_*.tif
```

## Selecting relevant data in QGIS

To create a high tide coastline, we selected pixels with a value of 9 in the ITEM dataset, which represent pixels ['exposed at highest 80-100% of the observed tidal range (land)'](https://d28rz98at9flks.cloudfront.net/113842/ITEM_Product_Description.pdf).

The virtual raster was opened in QGIS, and the raster calculator was used to create a new geotiff file where `ITEM == 9`. This boolean geotiff was written out as `ITEM9.tif`.
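The raster calculator expression amounts to an element-wise comparison. As a minimal numpy sketch on a made-up 3 x 3 array (the values and the `item` / `item9` names are illustrative assumptions, not taken from the ITEM files), the boolean layer could be built like this. Note that in this sketch the -6666 no-data pixels simply become 0; the QGIS raster calculator may handle no-data differently.

```python
import numpy as np

# Toy stand-in for the ITEM raster: classes 0-9, with -6666 as no-data
item = np.array([[-6666, 0, 1],
                 [5, 9, 9],
                 [9, 9, -6666]], dtype=np.int16)

# Equivalent of the raster calculator expression `ITEM == 9`,
# stored as 0/1 integers so it can be written to a geotiff
item9 = (item == 9).astype(np.uint8)

print(item9)
```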

## Label raster pixels by 'blobs'

The scikit-image Python library was used to perform the raster analysis. The `skimage.measure.label` function labels connected regions of an array with a unique identifier. For example, see the image below. The image on the left shows a random collection of white blobs on a black background. The `skimage.measure.label` function labels each of those blobs separately, giving all connected pixels within a blob the same identifier (see middle panel). This allows you to then select a single blob (here the largest), using the unique identifier.

![skimage.measure.label function example](skimageMeasureLabelBlobs.PNG)
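The same behaviour can be seen on a small array. The sketch below uses a made-up 5 x 5 grid (the `toy` array is an assumption for illustration only) in which a cross of 1-valued pixels separates four patches of 0-valued pixels. Passing `background=1` tells `measure.label` to treat the 1-valued pixels as background (label 0) and to give each connected patch of remaining pixels its own ID.

```python
import numpy as np
from skimage import measure

# Toy raster: a cross of 1s splits the 0s into four separate patches
toy = np.array([[0, 0, 1, 0, 0],
                [0, 0, 1, 0, 0],
                [1, 1, 1, 1, 1],
                [0, 0, 1, 0, 0],
                [0, 0, 1, 0, 0]], dtype=np.uint8)

# Pixels equal to `background` get label 0; every other connected
# region gets a unique positive integer
labels = measure.label(toy, background=1)

print(labels)
print('Number of labelled regions:', labels.max())
```

IDs reflect the order in which regions are encountered, not their size, so the ID of a region says nothing about how large it is.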

In [None]:
# Open the ITEM9 raster
ITEM9 = xr.open_rasterio('/g/data/r78/cek156/ShapeFiles/ITEMv2Coastline/ITEM9.tif')

In [None]:
# Compute a unique ID for each discrete region of connected pixels
blobs_labels = measure.label(ITEM9, background=1)

Once each 'blob' has a unique ID, you can count the number of pixels assigned to each blob to return the largest one. 

```
# Restrict to labels > 0 to exclude background pixels
ids, counts = np.unique(blobs_labels[blobs_labels > 0], return_counts=True)
largest_region_id = ids[np.argmax(counts)]
largest_region = blobs_labels == largest_region_id
```

In our workflow, the ocean was not the largest blob, but the third largest. (In ITEM, the no data area in the center of Australia made up the largest blob, with the land portion of each coastal compartment the second largest.) This was determined by exporting the `blobs_labels` variable to geotiff and identifying the ocean blob in QGIS.

```
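# Drop the single band dimension so the array is 2D
blobs_labels = blobs_labels.squeeze()
transform, projection = SpatialTools.geotransform(ITEM9, (ITEM9.x, ITEM9.y), epsg=3577)
# Write out the full label array (not a single blob) for inspection in QGIS
SpatialTools.array_to_geotiff('/g/data/r78/cek156/ShapeFiles/ITEMCoastlineAll.tif', blobs_labels, 
                              geo_transform = transform, 
                              projection = projection, 
                              nodata_val = -999)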
```
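As a complement to visual inspection in QGIS, the blobs can also be ranked by pixel count, which makes it easy to pull out the second or third largest rather than only the largest. The sketch below is illustrative only: `toy_labels` is a hand-made stand-in for `blobs_labels`, and the names `ranked_ids`, `ranked_counts` and `third_largest` are assumptions, not part of the original workflow.

```python
import numpy as np

# Hand-made label array standing in for `blobs_labels` (0 = background)
toy_labels = np.array([[1, 0, 2, 2, 2],
                       [0, 0, 2, 2, 2],
                       [3, 3, 0, 2, 2],
                       [3, 3, 0, 0, 0],
                       [3, 0, 4, 4, 4]])

# Count pixels per label, ignoring the background
ids, counts = np.unique(toy_labels[toy_labels > 0], return_counts=True)

# Sort labels from largest to smallest blob
order = np.argsort(counts)[::-1]
ranked_ids = ids[order]
ranked_counts = counts[order]

# Boolean mask of the third largest blob (index 2 in the ranking)
third_largest = toy_labels == ranked_ids[2]

for blob_id, n_pixels in zip(ranked_ids, ranked_counts):
    print(f'blob {blob_id}: {n_pixels} pixels')
```

Ranking by size only narrows down the candidates; confirming which blob is actually the ocean still needs a visual check.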

## Select just the ocean 'blob' and write out to geotiff

In [None]:
# The ocean blob was given an ID of 2
Coastline = blobs_labels == 2

In [None]:
# Grab the transform and projection from the ITEM9 dataset we read in earlier.
# This is a shortcut for writing out to geotiff: since the spatial extent and resolution
# haven't changed, we can just apply the spatial information from ITEM9 to the new boolean array
transform, projection = SpatialTools.geotransform(ITEM9, (ITEM9.x, ITEM9.y), epsg=3577)
SpatialTools.array_to_geotiff('/g/data/r78/cek156/ShapeFiles/ITEMCoastlineID2.tif', Coastline, 
                              geo_transform = transform, 
                              projection = projection, 
                              nodata_val = -999)
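For readers unfamiliar with geotransforms, the sketch below shows how a GDAL-style six-element geotransform can be derived from pixel-centre coordinates such as `ITEM9.x` and `ITEM9.y`. This is not the `SpatialTools.geotransform` implementation, whose internals are not shown here; `geotransform_from_coords` is a hypothetical helper, it assumes a regular north-up grid, and the 25-unit spacing is made up for illustration.

```python
import numpy as np

def geotransform_from_coords(x, y):
    """Hypothetical helper: build a GDAL-style geotransform from 1D
    arrays of pixel-centre coordinates on a regular, north-up grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    x_res = x[1] - x[0]  # positive when x increases west to east
    y_res = y[1] - y[0]  # negative when y decreases north to south

    # Shift from the centre of the first pixel to its outer corner
    x_origin = x[0] - x_res / 2
    y_origin = y[0] - y_res / 2

    # (origin x, pixel width, row rotation, origin y, column rotation, pixel height)
    return (x_origin, x_res, 0.0, y_origin, 0.0, y_res)

# Made-up pixel-centre coordinates with a spacing of 25 units
x = np.array([12.5, 37.5, 62.5])
y = np.array([-12.5, -37.5, -62.5])

gt = geotransform_from_coords(x, y)
print(gt)
```

Because the new boolean array shares its grid with `ITEM9`, the same geotransform applies to both.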

## Optimise the final geotiff

The Australia-wide coastal raster is ~92 GB in size, which is far too large to use easily. We use `gdal_translate` to optimise the geotiff and compress it to a much more reasonable ~100 MB.

In [None]:
!gdal_translate \
   -co COMPRESS=DEFLATE \
   -co ZLEVEL=9 \
   -co PREDICTOR=1 \
   -co TILED=YES \
   -co BLOCKXSIZE=1024 \
   -co BLOCKYSIZE=1024 \
   /g/data/r78/cek156/ShapeFiles/ITEMCoastlineID2.tif /g/data/r78/cek156/ShapeFiles/ITEMCoastlineID2Optimised.tif
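The large size reduction is plausible because a boolean raster consists mostly of long runs of identical values, which DEFLATE handles very well. The sketch below illustrates this with Python's `zlib` module (which implements DEFLATE) on a made-up mask. It is only an illustration of the principle: GDAL compresses each tile separately, so the ratio printed here is not comparable to the file sizes quoted above.

```python
import zlib
import numpy as np

# Made-up boolean mask: a single square 'blob' on an empty background
mask = np.zeros((2000, 2000), dtype=np.uint8)
mask[500:1500, 500:1500] = 1

raw = mask.tobytes()
packed = zlib.compress(raw, 9)  # level 9, matching ZLEVEL=9

ratio = len(raw) / len(packed)
print(f'raw: {len(raw)} bytes, compressed: {len(packed)} bytes, ratio: {ratio:.0f}x')

# Compression is lossless: the original mask is recovered exactly
restored = np.frombuffer(zlib.decompress(packed), dtype=np.uint8).reshape(mask.shape)
```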

## Final product!

![Australian high tide coastline](HighTideCoastline.JPG)

Some of the known limitations of ITEM, related to data quality and quantity, carry over into this coastline dataset. The most notable is an error in the coastline to the south of WA.

![Poor data quality issues in WA coastline](WACoastlineNoise.JPG)