# Introduction to Digital Earth Australia <img align="right" src="../Supplementary_data/dea_logo.jpg">

* **Compatibility**: Notebook currently compatible with both the `NCI` and `DEA Sandbox` environments
* **Prerequisites**:  Users of this notebook should have a basic understanding of:
    * How to run a [Jupyter notebook](01_Jupyter_notebooks.ipynb)

## Background
[Digital Earth Australia](https://www.ga.gov.au/dea) (DEA) is a digital platform that catalogues large amounts of Earth observation data covering continental Australia.
It is underpinned by the [Open Data Cube](https://www.opendatacube.org/) (ODC), an open-source software package that has an ever-growing number of users, contributors and implementations.

The ODC and DEA platforms are designed to:

* Catalogue large amounts of Earth observation data
* Provide a Python-based API for high performance querying and data access
* Make it easy for users to perform exploratory data analysis
* Allow scalable continent-scale processing of the stored data
* Track the provenance of data to allow for quality control and updates

The DEA program catalogues data from a range of satellite sensors and has adopted processes and terminology that users should be aware of to enable efficient querying and use of the datasets stored within.
This notebook introduces these important concepts and forms the basis of understanding for the remainder of the notebooks in this Beginners Guide.
Resources to further explore these concepts are recommended at the end of the notebook.

## Prerequisites
Users of this notebook should have a basic understanding of the use and format of a Jupyter Notebook.

To review these basics, see the [Introduction to Jupyter Notebooks](01_Jupyter_notebooks.ipynb).

## Description
This introduction to DEA will briefly introduce the ODC and review the types of data catalogued in the DEA platform.
It will also cover commonly-used terminology for measurements within product datasets.

Topics include

* a brief introduction to the ODC
* a review of the satellite sensors that provide data to DEA
* an introduction to analysis ready data and the processes used to produce it
* important terminology:
    * band naming conventions
    * the coordinate reference system
    
***

## Open Data Cube
The ODC is an open-source software package for organising and analysing large quantities of Earth observation data.
At its core, the Open Data Cube consists of a database where data is stored, along with commands to load, view and analyse that data.
This functionality is delivered by the [datacube-core](https://github.com/opendatacube/datacube-core) open-source Python library.
The library is designed to enable and support:

* Large-scale workflows on high performance computing infrastructures
* Exploratory data analysis
* Cloud-based services
* Standalone applications

There are a number of existing implementations of the ODC, including DEA.
More information can be found in the [Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/index.html).

## Satellite datasets in DEA
Digital Earth Australia catalogues data from a range of satellite sensors. 
The earliest datasets of optical satellite imagery in DEA date from 1986.
DEA includes data from

* [Landsat 5 TM](https://www.usgs.gov/land-resources/nli/landsat/landsat-5?qt-science_support_page_related_con=0#qt-science_support_page_related_con) (LS5 TM), operational between March 1984 and January 2013
* [Landsat 7 ETM+](https://www.usgs.gov/land-resources/nli/landsat/landsat-7?qt-science_support_page_related_con=0#qt-science_support_page_related_con) (LS7 ETM+), operational since April 1999
* [Landsat 8 OLI](https://www.usgs.gov/land-resources/nli/landsat/landsat-8?qt-science_support_page_related_con=0#qt-science_support_page_related_con) (LS8 OLI), operational since February 2013
* [Sentinel 2A MSI](https://sentinel.esa.int/web/sentinel/missions/sentinel-2) (S2A MSI), operational since June 2015
* [Sentinel 2B MSI](https://sentinel.esa.int/web/sentinel/missions/sentinel-2) (S2B MSI), operational since March 2017

Landsat missions are jointly operated by the United States Geological Survey (USGS) and National Aeronautics and Space Administration (NASA).
Sentinel missions are operated by the European Space Agency (ESA).
One major difference between the two programs is the spatial resolution: each Landsat pixel represents an area of 30 m × 30 m (900 $m^{2}$) on the ground, while each Sentinel-2 pixel represents 10 m × 10 m (100 $m^{2}$), although some spectral bands have lower resolution.
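
As a quick check of what these resolutions mean on the ground, the sketch below computes the area covered by a single square pixel and the number of 10 m pixels that fit inside one 30 m pixel. The function name is illustrative only, and the pixel sizes are the nominal values quoted above.

```python
def pixel_area_m2(pixel_size_m):
    """Ground area (in square metres) of a square pixel with the given side length."""
    return pixel_size_m * pixel_size_m


landsat_area = pixel_area_m2(30)    # nominal Landsat pixel size
sentinel2_area = pixel_area_m2(10)  # nominal Sentinel-2 pixel size

# How many Sentinel-2 pixels cover the footprint of one Landsat pixel
pixels_per_landsat_pixel = landsat_area / sentinel2_area

print(landsat_area, sentinel2_area, pixels_per_landsat_pixel)
```

In other words, a threefold difference in pixel size corresponds to a ninefold difference in the number of pixels needed to cover the same area.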

### Spectral Bands
All of the listed datasets in DEA are captured by multispectral satellites.
This means that the satellites measure light from the Earth in discrete sections of the electromagnetic spectrum, known as *spectral bands*. 
Figure 1 shows the spectral bands for recent Landsat and Sentinel-2 sensors, allowing a direct comparison of how each sensor samples the overall electromagnetic spectrum.
LS5 TM is not displayed in this image; for reference, it measured light in seven bands that covered the same regions as bands 1 to 7 on LS7 ETM+.

![Image](https://prd-wret.s3-us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/styles/full_width/public/thumbnails/image/dmidS2LS7Comparison.png)
> Figure 1 [[source]](https://directory.eoportal.org/web/eoportal/satellite-missions/l/landsat-9): The bands that are detected by each of the satellites are shown in the numbered boxes and the width of each box is relative to the spectral range that band detects.
The bands are overlaid on the percentage transmission of each wavelength returned to the atmosphere from the Earth relative to the amount of incoming solar radiation. 
The y-axis has no bearing on the comparison of the satellite sensors.

Figure 1 highlights that the numbering of the bands relative to the detected wavelengths is inconsistent between sensors.
As an example, in the green region of the electromagnetic spectrum (around 560 nm), LS5 TM and LS7 ETM+ detect a wide green region called band 2, whereas LS8 OLI detects a slightly narrower region and calls it band 3.
Finally, Sentinel-2 MSI (A and B) detects a narrow green region but also calls it band 3.
Consequently, when working with different sensors, it is important to understand the differences in their bands, and any impact this could have on an analysis.
To promote awareness of these differences, DEA band naming is based on both the spectral band name and sample region.
The naming convention will be covered in more detail toward the end of this notebook.

## Analysis ready data

Digital Earth Australia produces analysis ready data (ARD) for each of the sensors listed above.
The [ARD standard](http://ceos.org/ard/) for satellite data requires that data have undergone a number of processing steps, along with the creation of additional attributes for the data.
The requirements are:

* **Geometric correction:** This includes establishing ground position, accounting for terrain (orthorectification) and ground control points, and assessing absolute position accuracy. 
Geometric calibration means that datasets from different sensors can be used together, and that sequential observations can be used to track meaningful change over time.
Adjustments for ground variability typically use a Digital Elevation Model (DEM).
* **Surface reflectance correction:** This includes adjustments for sensor/instrument gains, biases, offsets and adjustments for sensor viewing angle with respect to the pixel position on the surface.
* **Observation attributes:** Dataset and pixel descriptive information such as quality flags, which allow users to make informed decisions about the suitability of the products for their use. For example, clouds, cloud shadows, missing data, saturation and accuracy assessments are common pixel attributes.
* **Metadata:** Dataset and pixel descriptive information including the satellite, instrument, acquisition date and time, spatial boundaries, pixel locations, mode, processing details, spectral or frequency response and grid projection.

### Surface Reflectance

Optical sensors, such as those on the Landsat and Sentinel-2 satellites, measure light that has come from the sun and been reflected by the Earth's surface.
The sensor measures the intensity of light in each of its spectral bands (known as radiance).
The intensity of the light is dependent on the angle of the sun relative to the ground, the angle of the sensor relative to the ground, and how the light interacts with the Earth's atmosphere on its way to the sensor. 
Frequently, the quantity of interest is not the radiance itself, but rather how much light was reflected at the ground level.
This is known as bottom-of-atmosphere surface reflectance and it can be calculated by using robust physical models to correct the observed radiance values for the atmosphere, angle of the sun, and sensor geometry.

There are many approaches to satellite surface reflectance correction and DEA opts to use two: NBAR and NBART.
**Users will choose which of these measurements to load when querying the DEA datacube and so it is important to understand their major similarities and differences.**

#### NBAR
NBAR stands for *Nadir-corrected BRDF Adjusted Reflectance*, where BRDF stands for *Bidirectional Reflectance Distribution Function*.
The approach involves atmospheric correction to compute bottom-of-atmosphere radiance, and bi-directional reflectance modelling to remove the effects of topography and angular variation in reflectance.

#### NBAR-T
NBAR-T has the same features as NBAR but adds a *terrain illumination* reflectance correction.
Terrain affects optical satellite images in a number of ways; for example, slopes facing the sun receive more sunlight and appear brighter compared to those facing away from the sun.
To obtain comparable surface reflectance from satellite images covering mountainous areas, it is therefore necessary to process the images to reduce or remove the topographic effect.
This correction is performed with a Digital Surface Model (DSM) that has the same resolution as the satellite data being corrected.
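
To make the effect of slope orientation concrete, the sketch below evaluates the standard geometric illumination term: the cosine of the angle between the sun direction and the surface normal. This is only an illustration of why sun-facing slopes appear brighter. It is not DEA's NBAR-T algorithm, and the function name and example angles are assumptions chosen for the example.

```python
import math


def illumination_cosine(solar_zenith_deg, solar_azimuth_deg, slope_deg, aspect_deg):
    """Cosine of the angle between the sun direction and the surface normal.

    Angles are in degrees. Slope is measured from horizontal and aspect is
    the compass direction the slope faces.
    """
    zenith = math.radians(solar_zenith_deg)
    slope = math.radians(slope_deg)
    relative_azimuth = math.radians(solar_azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(slope)
            + math.sin(zenith) * math.sin(slope) * math.cos(relative_azimuth))


# Sun 45 degrees from vertical, shining from the north (azimuth 0)
flat = illumination_cosine(45, 0, slope_deg=0, aspect_deg=0)
facing_sun = illumination_cosine(45, 0, slope_deg=20, aspect_deg=0)
facing_away = illumination_cosine(45, 0, slope_deg=20, aspect_deg=180)

print(facing_sun > flat > facing_away)
```

A larger value means the surface receives more direct sunlight per unit area, so without a terrain correction the same land cover would look brighter on the sun-facing slope than on the opposite one.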

### Observation Attributes

The *Observation Attributes (OA)* are a suite of measurements included in DEA's analysis ready datasets.
They are an assessment of each image pixel to determine if it is an unobscured, unsaturated observation of the Earth's surface, along with whether the pixel is represented in each spectral band. 
The OA product allows users to exclude pixels that do not meet the quality criteria for their analysis.
The capacity to automatically exclude such pixels is essential for analysing any change over time, since poor-quality pixels can significantly alter the perceived change over time.
The most common use of OA is for cloud masking, where users can choose to remove images that have too much cloud, or ignore the clouds within each satellite image.
A demonstration of how to use cloud masking can be found in the [masking data](../Frequently_used_code/Masking_data.ipynb) notebook.
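
The effect of poor-quality pixels on a time series can be sketched with plain Python lists. The category codes and reflectance values below are hypothetical and chosen only for illustration; the real encoding should be taken from the product metadata.

```python
# Hypothetical category codes for this sketch only
NODATA, CLEAR, CLOUD, SHADOW = 0, 1, 2, 3

# One pixel observed on five dates: a reflectance value and its quality category
reflectance = [0.21, 0.83, 0.20, 0.05, 0.22]
quality = [CLEAR, CLOUD, CLEAR, SHADOW, CLEAR]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Keep only the observations flagged as clear
clear_values = [r for r, q in zip(reflectance, quality) if q == CLEAR]

print(f"mean of all observations:   {mean(reflectance):.3f}")
print(f"mean of clear observations: {mean(clear_values):.3f}")
```

The bright cloud and dark shadow observations shift the unmasked mean away from the value suggested by the clear observations, which is why masking matters for change analysis.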

The OA suite of measurements includes the following pixel-based observation attributes:

* Null pixels
* Clear pixels
* Cloud pixels
* Cloud shadow pixels
* Snow pixels
* Water pixels
* Spectrally contiguous pixels
* Terrain shaded pixels

Also included is a range of pixel-based attributes related to the satellite, solar and sensing geometries:

* Solar zenith
* Solar azimuth
* Satellite view
* Incident angle
* Exiting angle
* Azimuthal incident
* Azimuthal exiting
* Relative azimuth
* Timedelta


## Data format

### DEA band naming conventions

To account for the various available datasets, DEA has developed its own band naming convention to help distinguish datasets that come from the different sensors.

The band names consist of the applied surface reflectance correction (NBAR or NBAR-T) and the spectral region detected by the satellite.
This removes all reference to the sensor band numbering scheme (Figure 1) and assumes that users understand that the spectral region described by the DEA band name is only approximately the same between sensors, not identical.

Table 1 summarises the DEA band naming terminology for NBAR and NBAR-T, coupled with the corresponding band names for the available sensors.

|Spectral Region|DEA measurement name (NBAR)|DEA measurement name (NBAR-T)|LS5 TM|LS7 ETM+|LS8 OLI|S2A,B MSI|
|----|----|----|----|----|----|----|
|Coastal aerosol|nbar_coastal_aerosol|nbart_coastal_aerosol|||1|1|
|Blue|nbar_blue|nbart_blue|1|1|2|2|
|Green|nbar_green|nbart_green|2|2|3|3|
|Red|nbar_red|nbart_red|3|3|4|4|
|Nir (Near infra-red)|nbar_nir|nbart_nir|4|4|5|8, 8a|
|Swir1 (Short wave infra-red 1)|nbar_swir1|nbart_swir1|5|5|6|11|
|Swir2 (Short wave infra-red 2)|nbar_swir2|nbart_swir2|7|7|7|12|

![](s2a_2016-04-26_-37.08_145.08_-37.22_145.27_20m_nbar.gif)
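
The naming convention in Table 1 can be expressed as a small lookup. The sketch below transcribes the table into a dictionary and builds measurement names from a correction prefix and a spectral region. The helper names (`dea_measurement_name`, `sensor_band`) are hypothetical and are not part of any DEA or ODC library.

```python
# Sensor columns in the same order as Table 1
SENSORS = ("LS5 TM", "LS7 ETM+", "LS8 OLI", "S2A,B MSI")

# Sensor band numbers per spectral region, transcribed from Table 1.
# None means the sensor has no band in that region.
BAND_NUMBERS = {
    "coastal_aerosol": (None, None, "1", "1"),
    "blue": ("1", "1", "2", "2"),
    "green": ("2", "2", "3", "3"),
    "red": ("3", "3", "4", "4"),
    "nir": ("4", "4", "5", "8, 8a"),
    "swir1": ("5", "5", "6", "11"),
    "swir2": ("7", "7", "7", "12"),
}


def dea_measurement_name(correction, region):
    """Build a measurement name such as 'nbart_green' from its two parts."""
    if correction not in ("nbar", "nbart"):
        raise ValueError(f"unknown correction: {correction}")
    if region not in BAND_NUMBERS:
        raise ValueError(f"unknown spectral region: {region}")
    return f"{correction}_{region}"


def sensor_band(region, sensor):
    """Look up the sensor's own band number for a spectral region."""
    return BAND_NUMBERS[region][SENSORS.index(sensor)]


# The same DEA name maps to different band numbers on different sensors
print(dea_measurement_name("nbart", "green"))
print(sensor_band("green", "LS7 ETM+"), sensor_band("green", "LS8 OLI"))

# A list of names for a true-colour composite
true_colour = [dea_measurement_name("nbart", r) for r in ("red", "green", "blue")]
print(true_colour)
```

A list like `true_colour` is the kind of value that would typically be supplied as the `measurements` argument when loading data from the datacube, which is covered in the later notebooks of this guide.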

### DEA data projection and holdings
Keeping with the practices of the Landsat and Sentinel satellite programs, all DEA datasets are projected using the Universal Transverse Mercator (UTM) system.
This aligns with the World Geodetic System 84 (WGS84) datum and all data queries default to this coordinate reference system unless specified otherwise.
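
UTM divides the globe into 6-degree-wide longitude zones, so the zone that applies depends on where a dataset is located. As a rough sketch, the standard zone number and the matching WGS84 UTM EPSG code can be computed from a longitude as shown below. The function names are illustrative, the example longitude is approximately that of Canberra, and the special-case zones around Norway and Svalbard are ignored because they are not relevant to Australia.

```python
import math


def utm_zone(longitude_deg):
    """Standard 6-degree UTM zone number (1 to 60) for a longitude in degrees east."""
    return int(math.floor((longitude_deg + 180) / 6)) % 60 + 1


def wgs84_utm_epsg(longitude_deg, southern_hemisphere=True):
    """EPSG code of the WGS84 UTM zone: 327xx in the south, 326xx in the north."""
    base = 32700 if southern_hemisphere else 32600
    return base + utm_zone(longitude_deg)


canberra_longitude = 149.13  # approximate, degrees east
print(utm_zone(canberra_longitude), wgs84_utm_epsg(canberra_longitude))
```

Because continental Australia spans several UTM zones, datasets from different parts of the country may be stored in different zones, which is worth keeping in mind when combining data across a wide area.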

Also by default, the spatial extent of the DEA data holdings is approximately the Australian coastal shelf. 
The actual extent varies based on the sensor. 
The current extents can be viewed using the interactive DEA data [explorer](http://explorer.sandbox.dea.ga.gov.au/ga_ls8c_ard_3).

## Recommended next steps
For more detailed information on the concepts introduced in this notebook, please see the [DEA User Guide](https://docs.dea.ga.gov.au/index.html#) and [Open Data Cube Manual](https://datacube-core.readthedocs.io/en/latest/).
For more information on the development of the DEA platform, please see [Dhu et al. 2017](https://doi.org/10.1080/20964471.2017.1402490).

To continue with the beginners guide, work through the notebooks in the following order:

1. [Jupyter Notebooks](01_Jupyter_notebooks.ipynb)
2. **Digital Earth Australia (this notebook)**
3. [Products and Measurements](03_Products_and_measurements.ipynb)
4. [Loading data](04_Loading_data.ipynb)
5. [Plotting](05_Plotting.ipynb)
6. [Basic analysis](06_Basic_analysis.ipynb)

Once you have worked through the beginners guide, you can join advanced users by exploring:

* [Masking data](../Frequently_used_code/Masking_data.ipynb), a demonstration of how to use cloud masking
* [DEA datasets](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/DEA_datasets)
* [Frequently used code](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Frequently_used_code)
* [Real world examples](https://github.com/GeoscienceAustralia/dea-notebooks/tree/develop/Real_world_examples)

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** If you need assistance, please post a question on the [Open Data Cube Slack channel](http://slack.opendatacube.org/) or on the [GIS Stack Exchange](https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the `open-data-cube` tag (you can view previously asked questions [here](https://gis.stackexchange.com/questions/tagged/open-data-cube)).
If you would like to report an issue with this notebook, you can file one on [GitHub](https://github.com/GeoscienceAustralia/dea-notebooks).

**Last modified:** September 2019

**Compatible `datacube` version:** 

In [None]:
import datacube

print(datacube.__version__)

## Tags
Browse all available tags on the DEA User Guide's [Tags Index](https://docs.dea.ga.gov.au/genindex.html).