## Introduction to querying

**Notebook currently compatible with the `NCI`|`DEA Sandbox` environment only**

### General advice (delete this cell before submitting for review)

- When choosing a location for your analysis, **select an area that has data on both the `NCI` and `DEA Sandbox`** to allow your code to be run on both environments. 
For example, you can check this for Landsat using the [DEA Explorer](https://explorer.sandbox.dea.ga.gov.au/ga_ls5t_ard_3/1990) (use the drop-down menu to view all products). 
As of September 2019, the `DEA Sandbox` has a single year of continental Landsat data for 2015-16, and the full 1987-onward time-series for three locations (Perth WA, Brisbane QLD, and western NSW).
- When writing in Markdown cells, start each sentence on a **new line**.
This makes it easy to see changes through git commits.
- Use Australian English in markdown cells and code comments.
- Use the [PEP8 standard](https://www.python.org/dev/peps/pep-0008/) for code. To make sure all code in the notebook is consistent, you can use the `jupyterlab_code_formatter` tool: select each code cell, then click `Edit` and then one of the `Apply X Formatter` options (`YAPF` or `Black` are recommended). This will reformat the code in the cell to a consistent style.
- In the final notebook cell, include a set of relevant tags which are used to build the DEA User Guide's [Tag Index](https://docs.dea.ga.gov.au/genindex.html). 
Use all lower-case, separate words with spaces, and where possible re-use existing tags.
Ensure the tags cell below is in `Raw` format, rather than `Markdown` or `Code`.


### Description
Every DEA analysis begins with a query to the datacube that specifies the what, where and when of the data request.
A query returns an xarray Dataset containing the contents of your request.

This notebook introduces how to load data from the datacube by constructing a query and passing it to the *load* function.

### Technical details
* **Products used:** `ls7_nbart_geomedian_annual`
* **Analyses used:** NDWI water index, geomedian compositing, pixel drill
* **Special requirements:** An _optional_ description of any special requirements, e.g. If running on the [NCI](https://nci.org.au/), ensure that `module load otps` is run prior to launching this notebook

### Prerequisites
Users of this notebook should have a basic understanding of how to run a [Jupyter notebook](future link to Intro_to_Jupyter) and understand the basic structure of the [satellite datasets](future link to Intro_to_DEA) that are held within the DEA.

## Getting started
To run this introduction to querying, run all the cells in the notebook, starting with the "Load packages" cell. 

### Load packages
Use standard import commands; some are shown below. 
Begin with any `IPython` magic commands, followed by standard Python packages, then any additional functionality you need from the `Scripts` directory.

In [1]:
# %matplotlib inline

# import datacube
# import matplotlib.pyplot as plt
# import numpy as np
# import pandas as pd
# import sys
# import xarray as xr

# sys.path.append("../Scripts")

### Connect to the datacube
Give your datacube app a unique name that is consistent with the purpose of the notebook.

In [22]:
import datacube

# Temporary solution to account for Collection 3 data being in a different
# database on the NCI: try the 'c3-samples' environment first, then fall
# back to the default environment if that connection fails
try:
    dc = datacube.Datacube(app='Introduction_to_querying', env='c3-samples')
except Exception:
    dc = datacube.Datacube(app='Introduction_to_querying')


## Loading data

Loading data from the datacube uses the *load* function.

The function takes several arguments:

* *product*: A specific product to load.
* *x*: Defines the spatial region in the *x* dimension.
* *y*: Defines the spatial region in the *y* dimension.
* *time*: Defines the temporal extent.

**Note**: DEA products are discussed in more detail in Introduction_to_products(future link to Intro_to_products)

Let's run a query to load all datasets within the Landsat 7 NBART annual geomedian product for Moreton Bay in QLD.
The *load* function requires at least the following criteria:

* product: ls7_nbart_geomedian_annual
* location: x=(153.3, 153.4), y=(-27.5, -27.6)
* time period: 2015-01-01 to 2016-01-01

Run the following cell to load all matching datasets.

In [37]:
data = dc.load(product='ls7_nbart_geomedian_annual', 
               x=(153.3, 153.4), y=(-27.5, -27.6),
               time=('2015-01-01', '2016-01-01'))
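
The same request can also be held in a dictionary, which makes it easier to reuse or adjust before loading.
The sketch below is one possible way to do this, not the method prescribed by this notebook: the helper name `build_query` and its parameter names are hypothetical and introduced only for illustration.
The final `dc.load(**query)` line is left commented out because it needs a live datacube connection.

```python
# Hypothetical helper (not part of datacube): collect the query in a dictionary
def build_query(product, lon_range, lat_range, time_range):
    """Return keyword arguments describing the what, where and when of a request."""
    return {
        "product": product,  # what
        "x": lon_range,      # where (longitude, degrees)
        "y": lat_range,      # where (latitude, degrees)
        "time": time_range,  # when
    }


query = build_query(
    product="ls7_nbart_geomedian_annual",
    lon_range=(153.3, 153.4),
    lat_range=(-27.5, -27.6),
    time_range=("2015-01-01", "2016-01-01"),
)
print(query)

# With a datacube connection `dc` (see "Connect to the datacube"), the
# dictionary can be unpacked into keyword arguments:
# data = dc.load(**query)
```

Keeping the query in one object means a single value, such as the time range, can be changed without retyping the rest of the request.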

In [38]:
print(data)

<xarray.Dataset>
Dimensions:  (time: 2, x: 461, y: 508)
Coordinates:
  * time     (time) datetime64[ns] 2015-01-01 2016-01-01
  * y        (y) float64 -3.156e+06 -3.156e+06 ... -3.168e+06 -3.168e+06
  * x        (x) float64 2.067e+06 2.067e+06 2.067e+06 ... 2.079e+06 2.079e+06
Data variables:
    blue     (time, y, x) int16 519 496 480 499 503 506 ... 366 316 287 289 300
    green    (time, y, x) int16 563 555 545 558 552 553 ... 565 487 456 415 460
    red      (time, y, x) int16 308 306 312 314 307 308 ... 490 419 400 365 390
    nir      (time, y, x) int16 207 183 183 189 187 ... 2866 2650 2505 2440 2538
    swir1    (time, y, x) int16 89 88 88 99 112 117 ... 1752 1368 1127 1120 1229
    swir2    (time, y, x) int16 75 98 87 82 91 94 96 ... 894 898 657 573 495 553
Attributes:
    crs:      EPSG:3577


The variable *data* now holds an xarray Dataset containing all matching datasets.

*Dimensions* 
* identifies the number of time steps returned in the search, along with the number of pixels in the *x* and *y* directions. 
In this case, there are 2 time steps that fit the criteria of our query.

*Coordinates* 
* *time* identifies the date attributed to each returned dataset
* *x* and *y* are the coordinates for the pixels within the spatial bounds of your query

*Data variables*
* For every date (time) returned by the query, the spectral response for each pixel (y, x) is returned as an array for each band.
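
The structure described above can be explored without a datacube connection by building a small synthetic xarray Dataset.
This is only a sketch under stated assumptions: the `example` dataset, its random values, the approximate coordinate ranges and the choice of two bands are made up for illustration, and only the dimension names and sizes are taken from the printed output.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the loaded data: same dimension names and sizes as
# the printed output, but filled with made-up values
times = np.array(["2015-01-01", "2016-01-01"], dtype="datetime64[ns]")
y = np.linspace(-3.156e6, -3.168e6, 508)  # approximate, for illustration
x = np.linspace(2.067e6, 2.079e6, 461)    # approximate, for illustration
rng = np.random.default_rng(0)

example = xr.Dataset(
    {
        band: (
            ("time", "y", "x"),
            rng.integers(0, 3000, size=(2, 508, 461)).astype("int16"),
        )
        for band in ["red", "nir"]
    },
    coords={"time": times, "y": y, "x": x},
    attrs={"crs": "EPSG:3577"},
)

# Dimensions: number of steps along time, y and x
print(dict(example.sizes))

# Data variables: one (time, y, x) array per band
print(list(example.data_vars))

# One band at the first time step is a 2D (y, x) image
red_first = example.red.isel(time=0)
print(red_first.shape)

# One pixel across all time steps is a 1D time series
nir_pixel = example.nir.isel(y=0, x=0)
print(nir_pixel.shape)
```

The same `isel` calls should work on the *data* variable returned by the query, since it shares these dimension names.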

*Attributes*
* *crs* identifies the coordinate reference system of the returned data, here `EPSG:3577` (GDA94 / Australian Albers).
* By default, the *x* and *y* arguments accept queries in the WGS84 geographic coordinate system (longitude and latitude), identified by the EPSG code *4326*, which is the same system used within Google Earth.
* Users can also query via the native coordinate system that the product is stored in by supplying the *crs* argument, to get the same result. For example:

        data = dc.load(product='ls7_nbart_geomedian_annual',
                       x=(2069309.60, 2077063.44), y=(-3155814.15, -3168499.20),
                       crs='EPSG:3577',
                       time=('2015-01-01', '2016-01-01'))
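
To see how a longitude and latitude box relates to coordinates in `EPSG:3577`, the corners of the box can be reprojected.
The sketch below assumes the separate `pyproj` library is installed; it is not otherwise used in this notebook, and the variable names are illustrative.
It prints the extent of the reprojected corners, which is not guaranteed to match the bounds used in any particular native-coordinate query.

```python
from pyproj import Transformer

# always_xy=True keeps the (longitude, latitude) -> (x, y) axis order
to_albers = Transformer.from_crs("EPSG:4326", "EPSG:3577", always_xy=True)

lon_range = (153.3, 153.4)
lat_range = (-27.5, -27.6)

# Reproject the four corners of the longitude/latitude box
corners = [(lon, lat) for lon in lon_range for lat in lat_range]
projected = [to_albers.transform(lon, lat) for lon, lat in corners]

xs = [px for px, _ in projected]
ys = [py for _, py in projected]
print("x range (m):", min(xs), max(xs))
print("y range (m):", min(ys), max(ys))
```

Because a rectangle in longitude and latitude is not a rectangle in a projected system, the reprojected corners give only an approximate bounding box.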

### Analysis parameters

An *optional* section to inform the user of any parameters they'll need to configure to run the notebook:
* `param_name_1`: Simple description (e.g. `example_value`). Advice about appropriate values to choose for this parameter.
* `param_name_2`: Simple description (e.g. `example_value`). Advice about appropriate values to choose for this parameter.


In [3]:
param_name_1 = "example_value"
param_name_2 = "example_value"

## Heading 1
Use headings to break up key steps/stages of the notebook.

Use markdown text for detailed, descriptive text explaining what the code below does and why it is needed.

In [4]:
# Use code comments for low-level documentation of code
a = 1

### Subheading 1
Use subheadings to break up steps within a single section.

In [5]:
# Use code comments for low-level documentation of code
b = 2

## Heading 2
Use markdown text for detailed, descriptive text explaining what the code below does and why it is needed.

In [6]:
# Use code comments for low-level documentation of code
c = 3

### To continue

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** If you need assistance, please post a question on the [Open Data Cube Slack channel](http://slack.opendatacube.org/) or on the [GIS Stack Exchange](https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the `open-data-cube` tag (you can view previously asked questions [here](https://gis.stackexchange.com/questions/tagged/open-data-cube)).
If you would like to report an issue with this notebook, you can file one on [GitHub](https://github.com/GeoscienceAustralia/dea-notebooks).

**Last modified:** September 2019

**Compatible `datacube` version:** 

In [7]:
print(datacube.__version__)

1.7+43.gc873f3ea


## Tags
Browse all available tags on the DEA User Guide's [Tags Index](https://docs.dea.ga.gov.au/genindex.html)