# Introduction to Digital Earth Australia

**Notebook currently compatible with the `NCI`|`DEA Sandbox` environment only**

## Background
Digital Earth Australia (DEA) is a digital platform that catalogues large amounts of Earth Observation data covering continental Australia.
It is implemented using the open source software collection of the Open Data Cube (ODC) which has an ever growing list of users, implementations and contributors.

The ODC and DEA platforms are designed to:
* Catalogue large amounts of Earth Observation data
* Provide a Python based API for high performance querying and data access
* Make it easy for scientists and other users to perform exploratory data analysis
* Allow scalable continent scale processing of the stored data
* Track the provenance of all the contained data to allow for quality control and updates

The DEA program catalogues data from a range of satellite sensors and has adopted processes and terminology that users should be aware of to enable efficient querying and use of the datasets stored within.
This notebook introduces these important concepts and forms the basis of understanding for the remainder of the notebooks in this Beginners Guide.
Resources to further explore these concepts are recommended at the end of the notebook.

## Prerequisites
Users of this notebook should have a basic understanding of the use and format of the Jupyter Notebook.

To review these basics, see [Introduction_to_Jupyter](Introduction_to_Jupyter.ipynb)

## Description
This introduction to the DEA will briefly introduce the ODC and review the dominant types of data catalogued in the DEA platform, as well as reviewing important terminology for referring to measurements within product datasets.

Topics include
* a brief introduction to the ODC
* a review of the satellite sensors whose data contributes to the DEA
* an introduction to Surface Reflectance measurements: NBAR, NBART and OA 
* important terminology:
  * band naming conventions
  * the coordinate reference scheme

## Open Data Cube
The ODC provides an integrated gridded data analysis environment for decades of analysis ready earth observation satellite and related data from multiple satellite and other acquisition systems.
It is a collection of software based around the [datacube-core](https://github.com/opendatacube/datacube-core) open source Python library that enables:
* Large-scale workflows on HPC
* Exploratory Data Analysis
* Cloud-based Services
* Standalone Applications

There are a number of existing implementations of the ODC, including DEA.

More information can be found in the [Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/index.html)

## Digital Earth Australia
### Satellite datasets 
Digital Earth Australia catalogues data from a range of satellite sensors. 
The earliest datasets of optical satellite imagery in DEA date from 1986.
The DEA includes data from:
* Landsat 5 TM, operational between March 1984 and January 2013
* Landsat 7 ETM+, operational since April 1999
* Landsat 8 OLI, operational since February 2013
* Sentinel 2A MSI, operational since June 2015
* Sentinel 2B MSI, operational since March 2017

Landsat missions are jointly operated by the United States Geological Survey (USGS) and National Aeronautics and Space Administration (NASA).
Sentinel missions are operated by the European Space Agency (ESA).
One major difference between the two programs is the spatial resolution.
Landsat pixels each represent a 30 m × 30 m area (900 $m^{2}$) of the land surface, while Sentinel 2 pixels represent a 10 m × 10 m area (100 $m^{2}$) in the sensor's highest-resolution bands.
Each pixel represents the smallest spatial area for which a spectrum is detected at the satellite.
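
To make the resolution difference concrete, the short sketch below treats the quoted pixel sizes as side lengths in metres (30 m for Landsat, 10 m for Sentinel 2) and compares the ground area each pixel covers. The constant and function names are illustrative only and are not part of any DEA or ODC API.

```python
# Pixel side lengths in metres, taken from the text above
# (assumption: the Sentinel 2 figure refers to its 10 m bands).
LANDSAT_PIXEL_M = 30
SENTINEL2_PIXEL_M = 10


def pixel_area_m2(side_m):
    """Ground area covered by one square pixel, in square metres."""
    return side_m * side_m


landsat_area = pixel_area_m2(LANDSAT_PIXEL_M)      # 900
sentinel2_area = pixel_area_m2(SENTINEL2_PIXEL_M)  # 100

# Number of Sentinel 2 pixels needed to cover one Landsat pixel
pixels_per_landsat_pixel = landsat_area // sentinel2_area

print(landsat_area, sentinel2_area, pixels_per_landsat_pixel)
```

Under these assumptions, one Landsat pixel covers the same ground area as nine Sentinel 2 pixels.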

All of the listed datasets in DEA are captured by multispectral satellites.
This means that the satellites sample a small number of discrete sections of the electromagnetic spectrum.
Furthermore, the datasets generated by each of these sensors (satellites) are subtly different in other ways. 
Figure 1 shows the recent Landsat satellites and Sentinel sensors and compares the way they sample the electromagnetic spectrum (Wavelength axis).
Note that Landsat 5 TM contained 7 bands which measured the same regions as bands 1 to 7 on LS7 ETM+.
Each sampled spectral region is called a *band*.
It is important to note that the numbering of the bands relative to the detected wavelengths is inconsistent between sensors (Figure 1).
For example: in the green region of the electromagnetic spectrum (around 560 nm), LS5 TM and LS7 ETM+ detect a wide green region called band 2. 
LS8 OLI detects a slightly narrower region and calls it band 3.
Sentinel 2 MSI (A and B) detects a narrow green region but also calls it band 3.
So not only is it important to understand that the band naming changes between sensors but so too does the spectral region that is sampled.

![Image](https://prd-wret.s3-us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/styles/full_width/public/thumbnails/image/dmidS2LS7Comparison.png)
> Figure 1 [[source]](https://directory.eoportal.org/web/eoportal/satellite-missions/l/landsat-9) is overlaid upon the percent transmission of each wavelength returned to the atmosphere from the Earth relative to the amount of incoming solar radiation. 
The bands that are detected by each of the satellites are shown in the numbered boxes and the width of each box is relative to the spectral range that band detects.
The y-axis has no bearing on the comparison of the satellite sensors.


DEA band naming nomenclature takes sensor band naming and variability in the detected spectral region into account.
However, before we discuss the DEA band naming conventions in more detail, it is important to understand the DEA surface reflectance measurements (NBAR and NBART), as these form an important part of the DEA band naming scheme in collection 3 of the DEA Landsat data catalogue.

## Analysis ready data
Digital Earth Australia produces analysis-ready datasets (ARD) for each of the sensors listed [above](#Satellite-datasets).
Per sensor, the ARD dataset is offered as a product which contains a number of explorable [measurements](link to products and measurements notebook).
The ARD [standard](http://ceos.org/ard/) for satellite data requires that an ARD dataset comprise:
* **geometric correction** - Establishing ground position, taking into account terrain (orthorectification) and ground control points and assessing absolute position accuracy. Geometric calibration allows products to be used with other spatial data, and in particular to be ‘stacked as time-series’. Adjustments for ground variability typically use a Digital Elevation Model (DEM).
* **surface reflectance correction** - Adjustments for sensor/instrument gains, biases, offsets and adjustments for sensor viewing angle with respect to the pixel position on the surface
* **observation attributes** - Dataset and pixel descriptive information such as quality flags which allow users to make informed decisions about the suitability of the products for a particular use. For example clouds, cloud shadows, missing data, saturation and accuracy assessments
* **metadata** - Dataset and pixel descriptive information including: satellite, instrument, acquisition date and time, spatial boundaries, pixel locations, mode, processing details, spectral or frequency response and grid projection.

## Surface Reflectance
Surface reflectance provides standardised optical surface reflectance datasets using robust physical models to correct for variations in image radiance values due to atmospheric properties, and sun and sensor geometry. The resulting stack of surface reflectance grids is consistent over space and time, which is instrumental in identifying and quantifying environmental change. Surface reflectance is based on radiance data from the Landsat and Sentinel sensors. Image radiance itself is a composite of surface reflectance, atmospheric condition, the interaction between surface land cover, solar radiation and sensor view angle, and land surface orientation relative to the imaging sensor.
  
There are many approaches to satellite surface reflectance correction and DEA opts to use two called NBAR and NBART.
**Users will choose which of these measurements to load when querying the DEA datacube and so it is important to understand their major similarities and differences.**

### NBAR
Radiance measurements from EO sensors do not directly quantify the surface reflectance of the Earth. Such measurements are modified by variations in atmospheric properties, sun position, sensor view angle, surface slope and surface aspect.
To obtain consistent and comparable measures of Earth surface reflectance from EO, these variations need to be reduced or removed from the radiance measurements (Li et al., 2010).
This is especially important when comparing imagery acquired in different seasons and geographic regions.

Surface reflectance measurements are created using a physics-based, coupled BRDF and atmospheric correction model that can be applied to both flat and inclined surfaces (Li et al., 2012).
The resulting surface reflectance values are comparable both within individual images and between images acquired at different times and/or with different sensors.

NBAR stands for *Nadir-corrected BRDF Adjusted Reflectance*, where BRDF stands for *Bidirectional Reflectance Distribution Function*.
The approach involves atmospheric correction to compute surface-leaving radiance, and bi-directional reflectance modelling to remove the effects of topography and angular variation in reflectance.

### NBART
Terrain affects optical satellite images through both irradiance and bidirectional reflectance distribution function (BRDF) effects.
Slopes facing the sun receive enhanced solar irradiance and appear brighter compared to those facing away from the sun.
For anisotropic surfaces, the radiance received at the satellite sensor from a sloping surface is also affected by surface BRDF which varies with combinations of surface landcover types, sun, and satellite geometry (sun and sensor view, and their relative azimuth angle) as well as topographic geometry (primarily slope and aspect angles).
Consequently, to obtain comparable surface reflectance from satellite images covering mountainous areas, it is necessary to process the images to reduce or remove the topographic effect so that the images can be used for different purposes on the same spectral base.
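
The irradiance effect described above can be illustrated with the standard geometric relation for the solar incidence angle on a tilted surface. The sketch below is not the DEA NBART algorithm, which also models BRDF and atmospheric terms. It only shows why a slope facing the sun receives more direct irradiance than one facing away. The function name and the scene angles are hypothetical.

```python
import math


def cos_incidence(solar_zenith_deg, solar_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the angle between the sun direction and the surface normal."""
    zenith = math.radians(solar_zenith_deg)
    slope = math.radians(slope_deg)
    relative_azimuth = math.radians(solar_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(relative_azimuth))


# Hypothetical scene: sun 45 degrees from vertical, shining from the north (azimuth 0).
flat = cos_incidence(45, 0, slope_deg=0, aspect_deg=0)
sun_facing = cos_incidence(45, 0, slope_deg=20, aspect_deg=0)      # slope faces north
facing_away = cos_incidence(45, 0, slope_deg=20, aspect_deg=180)   # slope faces south

print(f"flat: {flat:.3f}, facing sun: {sun_facing:.3f}, facing away: {facing_away:.3f}")
```

Direct irradiance on a surface scales with this cosine, so the sun-facing slope appears brighter than flat ground and the opposite slope appears darker, even when the land cover is identical.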

A Digital Surface Model (DSM) with a resolution appropriate to that of the satellite image is needed for the best results. A 1 arc-second SRTM DSM is used for NBART processing.
  
NBART has the same features as NBAR but includes the *terrain illumination* reflectance correction.

### Observation Attributes
The *Observation Attributes (OA)* are a suite of measurements that are included in the DEA ARD datasets.
They are an assessment of each image pixel to determine if it is an unobscured, unsaturated observation of the Earth surface and also whether the pixel is represented in each spectral band. 
The OA product allows users to produce masks which can be used to exclude pixels which do not meet their quality criteria from analysis.
The capacity to automatically exclude such pixels is essential for multi-temporal analysis techniques that make use of every quality assured pixel within a time series of observations.
The most common use of OA is for cloud masking, where users can choose to threshold the amount of cloud that their analysis will tolerate.
A demonstration of how to use cloud masking can be found in the [masking data](../Frequently_used_code/Masking_data.ipynb) notebook.
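
As a rough illustration of the idea, the sketch below applies a quality mask and a cloud threshold to a small made-up grid using plain Python lists. The category codes, grid values and function names are all hypothetical. Real OA measurement names and flag values should be taken from the product metadata, and the masking notebook linked above shows the actual workflow.

```python
# Hypothetical per-pixel quality categories; the real OA measurement names
# and flag values should be checked against the product metadata.
NULL, CLEAR, CLOUD, SHADOW, SNOW, WATER = range(6)

# A made-up 3 x 4 quality grid and a matching grid of reflectance values
quality = [
    [CLEAR, CLEAR, CLOUD, CLOUD],
    [CLEAR, WATER, SHADOW, CLOUD],
    [NULL, CLEAR, CLEAR, CLEAR],
]
reflectance = [
    [0.12, 0.15, 0.80, 0.82],
    [0.11, 0.03, 0.05, 0.79],
    [0.00, 0.14, 0.13, 0.16],
]

GOOD = {CLEAR, WATER}  # categories this example analysis accepts


def cloud_fraction(quality_grid):
    """Fraction of non-null pixels flagged as cloud or cloud shadow."""
    flags = [f for row in quality_grid for f in row if f != NULL]
    cloudy = sum(1 for f in flags if f in (CLOUD, SHADOW))
    return cloudy / len(flags)


def apply_mask(values, quality_grid, good=GOOD):
    """Replace pixels that fail the quality test with None."""
    return [
        [v if f in good else None for v, f in zip(v_row, f_row)]
        for v_row, f_row in zip(values, quality_grid)
    ]


MAX_CLOUD = 0.5  # tolerate scenes that are up to 50% cloud or shadow
fraction = cloud_fraction(quality)
if fraction <= MAX_CLOUD:
    masked = apply_mask(reflectance, quality)
else:
    masked = None  # scene rejected as too cloudy

print(f"cloud fraction: {fraction:.2f}")
```

The two steps are independent: the threshold decides whether an observation is used at all, while the mask removes individual pixels from the observations that are kept.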

The OA suite of measurements includes the following pixel-based observation attributes:
* Null pixels
* Clear pixels
* Cloud pixels
* Cloud shadow pixels
* Snow pixels
* Water pixels
* Spectrally contiguous pixels
* Terrain shaded pixels

Also included is a range of pixel-based attributes related to the satellite, solar and sensing geometries:
* Solar zenith
* Solar azimuth
* Satellite view
* Incident angle
* Exiting angle
* Azimuthal incident
* Azimuthal exiting
* Relative azimuth
* Timedelta


## Data format
### DEA band naming conventions
To account for the multiple sensors whose data are catalogued in the DEA datacube, and their unique [differences](#Satellite-datasets), DEA has developed its own band naming convention to maximise the comparability of data between satellite sensors.

DEA band names reference the surface reflectance correction applied (NBAR or NBART) followed by the spectral region detected by the satellite. 
This removes all reference to the sensor band numbering scheme and assumes that users understand that the spectral region described by the DEA band name is only approximately the same between sensors, not identical.
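
The naming pattern can be sketched as a pair of small helpers that join and split the correction prefix and the spectral region. The region names are those listed in Table 1. The helper names are hypothetical and are not part of the `datacube` library.

```python
CORRECTIONS = ("nbar", "nbart")
SPECTRAL_REGIONS = (
    "coastal_aerosol", "blue", "green", "red", "nir", "swir1", "swir2",
)


def measurement_name(correction, region):
    """Build a DEA-style measurement name such as 'nbart_green'."""
    if correction not in CORRECTIONS:
        raise ValueError(f"unknown correction: {correction!r}")
    if region not in SPECTRAL_REGIONS:
        raise ValueError(f"unknown spectral region: {region!r}")
    return f"{correction}_{region}"


def split_measurement_name(name):
    """Split 'nbart_coastal_aerosol' into ('nbart', 'coastal_aerosol')."""
    correction, _, region = name.partition("_")
    return correction, region


# Measurement names for a true-colour (red, green, blue) composite
rgb = [measurement_name("nbart", region) for region in ("red", "green", "blue")]
print(rgb)
```

A list like `rgb` is the kind of value accepted by the `measurements` argument of the Open Data Cube `load` call. The product to pair it with depends on the sensor and is not shown here.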

Table 1 summarises the DEA band naming terminology and lists the band number on each sensor that corresponds to each DEA band name.

|Description|DEA measurement name (NBAR)|DEA measurement name (NBART)|LS5 TM|LS7 ETM+|LS8 OLI|Sen2 MSI|
|----|----|----|----|----|----|----|
|Coastal aerosol|nbar_coastal_aerosol|nbart_coastal_aerosol|||1|1|
|Blue|nbar_blue|nbart_blue|1|1|2|2|
|Green|nbar_green|nbart_green|2|2|3|3|
|Red|nbar_red|nbart_red|3|3|4|4|
|Nir (Near infra-red)|nbar_nir|nbart_nir|4|4|5|8, 8a|
|Swir1 (Short wave infra-red 1)|nbar_swir1|nbart_swir1|5|5|6|11|
|Swir2 (Short wave infra-red 2)|nbar_swir2|nbart_swir2|7|7|7|12|
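
For readers who want to move between DEA measurement names and the sensor band numbers, Table 1 can be expressed as a simple lookup. This is only a sketch: the dictionary is transcribed from the table above, and the function name and sensor labels are illustrative rather than part of any DEA API.

```python
# Column order matches Table 1
SENSORS = ("LS5 TM", "LS7 ETM+", "LS8 OLI", "Sen2 MSI")

# Sensor band labels per spectral region, transcribed from Table 1.
# An empty tuple marks a region the sensor does not sample.
BANDS_BY_REGION = {
    "coastal_aerosol": ((), (), ("1",), ("1",)),
    "blue": (("1",), ("1",), ("2",), ("2",)),
    "green": (("2",), ("2",), ("3",), ("3",)),
    "red": (("3",), ("3",), ("4",), ("4",)),
    "nir": (("4",), ("4",), ("5",), ("8", "8a")),
    "swir1": (("5",), ("5",), ("6",), ("11",)),
    "swir2": (("7",), ("7",), ("7",), ("12",)),
}


def sensor_band(measurement, sensor):
    """Return the sensor band label(s) behind a DEA measurement name."""
    # Drop the 'nbar' or 'nbart' prefix to recover the spectral region
    _, _, region = measurement.partition("_")
    return BANDS_BY_REGION[region][SENSORS.index(sensor)]


print(sensor_band("nbart_green", "LS7 ETM+"))
print(sensor_band("nbart_green", "LS8 OLI"))
print(sensor_band("nbart_nir", "Sen2 MSI"))
```

Band labels are kept as strings because some sensor bands, such as Sentinel 2 band 8a, are not plain integers.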

### DEA data projection and holdings
In keeping with the practices of the Landsat and Sentinel satellite programs, all DEA data holdings are projected using the Universal Transverse Mercator (UTM) system.
This aligns with the World Geodetic System 84 (WGS84) datum and all data queries default to this coordinate reference system unless specified otherwise.
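
UTM divides the globe into 6-degree-wide longitude zones, so the zone that applies depends on where a dataset sits. The sketch below shows the standard zone arithmetic and the matching WGS84 / UTM EPSG code. It ignores the special-case zones around Norway and Svalbard, the function names are illustrative, and the coordinates are approximate values chosen for the example.

```python
import math


def utm_zone(longitude_deg):
    """UTM zone number (1 to 60) for a longitude in degrees east."""
    return int(math.floor((longitude_deg + 180) / 6)) % 60 + 1


def wgs84_utm_epsg(longitude_deg, latitude_deg):
    """EPSG code of the WGS84 / UTM zone containing a point."""
    # 327xx codes cover the southern hemisphere, 326xx the northern
    base = 32700 if latitude_deg < 0 else 32600
    return base + utm_zone(longitude_deg)


# Approximate coordinates for Canberra and Perth (illustrative values)
print(wgs84_utm_epsg(149.13, -35.28))
print(wgs84_utm_epsg(115.86, -31.95))
```

Because Australia spans several UTM zones, datasets from different parts of the continent will generally carry different EPSG codes.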

Also by default, the spatial extent of the DEA data holdings is approximately the Australian coastal shelf. 
The actual extent varies based on the sensor. 
The current extents can be viewed using the interactive DEA data [explorer](http://explorer.sandbox.dea.ga.gov.au/ga_ls8c_ard_3).

## Recommended next steps
For more detailed information on the concepts introduced in this notebook, please see the [DEA User Guide](https://docs.dea.ga.gov.au/index.html#) and [Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/).
For more information on the development of the DEA platform, please see [Dhu et al. 2017](https://doi.org/10.1080/20964471.2017.1402490).

To continue with the beginners guide, work through the following notebooks in order:
- [Introduction to Products and Measurements](link to notebook)
- [Introduction to Querying](link to notebook)
- [Introduction to Plotting](link to notebook)
- [Run a basic analysis](link to notebook)
- [Other training materials](link to notebook or folder)

Once you have worked through the beginners guide, you can join advanced users by exploring:
- [Masking data](../Frequently_used_code/Masking_data.ipynb)
- [DEA datasets](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/DEA_datasets)
- [Frequently used code](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Frequently_used_code)
- [Real world examples](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Real_world_examples)

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** If you need assistance, please post a question on the [Open Data Cube Slack channel](http://slack.opendatacube.org/) or on the [GIS Stack Exchange](https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the `open-data-cube` tag (you can view previously asked questions [here](https://gis.stackexchange.com/questions/tagged/open-data-cube)).
If you would like to report an issue with this notebook, you can file one on [GitHub](https://github.com/GeoscienceAustralia/dea-notebooks).

**Last modified:** September 2019

**Compatible `datacube` version:** 

In [None]:
import datacube
print(datacube.__version__)

## Tags
Browse all available tags on the DEA User Guide's [Tags Index](https://docs.dea.ga.gov.au/genindex.html)