# Do it yourself <img align="right" src="images/dea_logo.jpg">

* **Products used:** 
[s2a_nrt_granule](https://explorer.sandbox.dea.ga.gov.au/s2a_ard_granule)
* **Prerequisites:** Users of this notebook should have a basic understanding of:
    * How to run a [Jupyter notebook](01_Jupyter_notebooks.ipynb)

## Background
In this notebook, you'll explore how to load and visualise data using a set of simple Open Data Cube commands.

Before you get started, save this notebook to somewhere outside the `Tutorials` folder, so it won't get overwritten when you next log in.
The [JupyterLab documentation](https://jupyterlab.readthedocs.io/en/stable/user/files.html) contains tips on how to work with files.

As you work through this notebook, you may want to know more about how to load different kinds of data.
For more advice and examples, view the [products and measurements](./Reference_products_and_measurements.ipynb) and the [loading data](./Reference_loading_data.ipynb) reference notebooks.

## Description
As you work through this notebook you will:
1. Pick a study area anywhere in Australia.
1. Explore available data products for your study area.
1. Set up a datacube load command to load data for your study area.
1. Plot data that has been loaded, exploring plotting of different timesteps.
1. Export data to an image format to view on your local computer.

Let's get started.

## Pick a study area

First, pick a study site in Australia using Google Maps.
Visit [maps.google.com](https://www.google.com/maps/@-28.6035447,135.9291226,4.93z) and click on the map at your chosen site.
The latitude and longitude coordinates of the point you clicked will appear at the bottom of the map, as shown in the image below.
Copy these coordinates and paste them into the code cell below.

![google maps coordinates](images/google_maps.jpg "Google Maps Coordinates")

Alternatively, you can copy and paste one of the example coordinate pairs:

**Dead Dog Creek, Queensland**\
`coordinates = [-14.642744, 144.899747]`

**Giles Creek, Northern Territory**\
`coordinates = [-23.765165, 134.724024]`

**Lake Disappointment, Western Australia**\
`coordinates = [-23.481127, 122.817712]`

*Note that if you change the study site coordinates, you need to run or re-run each cell below to run the analysis on the new location.*

In [None]:
# Supply the latitude and longitude coordinates for your study site here
# Format them in the same manner as the examples above
coordinates = 

The `coordinates` variable gives the central latitude and longitude for the study.
The next cell allows you to specify how large you want your study area to be by creating a bounding box around your central coordinates.

Change the size of the bounding box by changing the `box_size` parameter.
For example, `box_size=0.05` will add 0.05 degrees either side of your central latitude and longitude, resulting in a box that is 0.1 degrees wide and 0.1 degrees high.
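
To make the arithmetic concrete, here is a small worked sketch using the Dead Dog Creek example coordinates and `box_size = 0.05`. The values are only illustrative; the `width` and `height` variables are added here purely to check the result and are not part of the notebook.

```python
# Illustrative values only: Dead Dog Creek, as [latitude, longitude]
coordinates = [-14.642744, 144.899747]
box_size = 0.05  # degrees added either side of the centre point

# Latitude is the y coordinate, longitude is the x coordinate
centre_y, centre_x = coordinates

# Subtract and add box_size to get the (min, max) extent in each direction
bounding_box_x = (centre_x - box_size, centre_x + box_size)
bounding_box_y = (centre_y - box_size, centre_y + box_size)

# The box is twice box_size along each side
width = bounding_box_x[1] - bounding_box_x[0]
height = bounding_box_y[1] - bounding_box_y[0]
print(bounding_box_x, bounding_box_y)
print(round(width, 6), round(height, 6))
```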

In [None]:
# Set the box size
# You can increase the box_size to load more data. More data means longer load, though!
box_size = 

# For simplicity, store the centre X and Y coords
centre_x = coordinates[1]  # longitude
centre_y = coordinates[0]  # latitude

# Convert them to a bounding box by subtracting and adding the box_size
bounding_box_x = (centre_x - box_size, centre_x + box_size)
bounding_box_y = (centre_y - box_size, centre_y + box_size)

## Loading a data cube

This next cell performs the load.
First, we set up a datacube object, `dc`, which has all the functions of the Open Data Cube library, and then we use the `dc.load()` function to load data.
You can see the parameters that we've set below, but you can change them, for example, adding or removing measurements based on [product metadata](ODC_and_DEA_Metadata.ipynb).
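
Because the load below requests 10 m pixels in `epsg:3577` (a projection measured in metres), the size of your bounding box largely determines how much data is loaded. The following is a rough back-of-the-envelope sketch, not part of the Open Data Cube API: the function name `estimate_pixels` is hypothetical, it assumes one degree of latitude spans roughly 111 km, and it ignores any distortion from reprojection.

```python
import math

# Assumption: approximate length of one degree of latitude, in metres
METRES_PER_DEGREE = 111_320


def estimate_pixels(centre_lat, box_size, pixel_size=10):
    """Roughly estimate (rows, columns) for a box of +/- box_size degrees."""
    side_deg = 2 * box_size
    height_m = side_deg * METRES_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude
    width_m = side_deg * METRES_PER_DEGREE * math.cos(math.radians(centre_lat))
    return round(height_m / pixel_size), round(width_m / pixel_size)


rows, cols = estimate_pixels(centre_lat=-14.642744, box_size=0.05)
print(f"Roughly {rows} x {cols} pixels per band, per time step")
```

Doubling `box_size` roughly quadruples the number of pixels, which is why larger study areas take noticeably longer to load.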

In [None]:
# Import necessary python packages
%matplotlib inline
import datacube
import warnings
warnings.filterwarnings('ignore')  # suppress warnings

# Set up the datacube object
dc = datacube.Datacube(app='do-it-yourself')

# Load the data
# Please be patient: loading can take some time, depending on the size of your study area
dataset = dc.load(
    product='s2a_nrt_granule',
    x=bounding_box_x,
    y=bounding_box_y,
    resolution=(-10, 10),
    output_crs='epsg:3577',
    measurements=(
        'nbar_red',
        'nbar_green',
        'nbar_blue',
        'nbar_nir_1'
    )
)

Following the load step, printing the `dataset` object will give you insight into all of the data that was loaded.
Do this by running the next cell.

There's a lot of information to unpack, which is represented by the following aspects of the data:
- `Dimensions`: the names of data dimensions, frequently `time`, `x` and `y`, and number of entries in each
- `Coordinates`: the coordinate values for each point in the data cube
- `Data variables`: the observations loaded, frequently different spectral bands from a satellite
- `Attributes`: any useful information for the data, such as the `crs` (coordinate reference system)
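
If you would like to see these four aspects without waiting for a real load, the sketch below builds a tiny stand-in dataset directly with xarray. Everything in it is made up for illustration (the `toy` name, the constant pixel values, the dates and the coordinate values); it only mimics the general shape of what `dc.load()` returns.

```python
import numpy as np
import xarray as xr

# Made-up coordinates: 3 time steps on a 4 x 5 pixel grid
times = np.array(["2020-01-01", "2020-01-06", "2020-01-11"], dtype="datetime64[ns]")
y = np.array([30.0, 20.0, 10.0, 0.0])
x = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
shape = (len(times), len(y), len(x))

# Two data variables filled with constant placeholder values
toy = xr.Dataset(
    data_vars={
        "nbar_red": (("time", "y", "x"), np.full(shape, 500, dtype="int16")),
        "nbar_nir_1": (("time", "y", "x"), np.full(shape, 3000, dtype="int16")),
    },
    coords={"time": times, "y": y, "x": x},
    attrs={"crs": "epsg:3577"},
)

n_times = toy.sizes["time"]        # Dimensions: number of time steps
band_names = list(toy.data_vars)   # Data variables: the loaded measurements
crs = toy.attrs["crs"]             # Attributes: e.g. the coordinate reference system
print(toy)
```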

In [None]:
print(dataset)

## Plotting data

The next step uses Matplotlib to plot some data. We use a helper function called `rgb`, which converts the measurements we loaded from the datacube into the format Matplotlib expects. Later, we show a simpler way to plot a single-band image.

There are several parameters you can experiment with:

- `time_step=n`\
This sets the time step you want to view.
`n` can be any whole number from `0` to one less than the number of time steps you loaded.
The number of time steps loaded is given in the print-out of the data, under the `Dimensions` heading.
As an example, if under `Dimensions:` you see `time: 6`, then there are 6 time steps, and `time_step` can be any whole number from `0` to `5`.

- `bands=[red_channel, green_channel, blue_channel]`\
This sets the measurements that you want to use to make the image.
Any measurements can be mapped to the three channels, and different combinations highlight different features.
Two common combinations are
    - True colour: `bands = ['nbar_red', 'nbar_green', 'nbar_blue']`
    - False colour: `bands = ['nbar_nir_1', 'nbar_red', 'nbar_green']`
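
The two parameters can be sketched in plain Python as follows. The names `BAND_COMBINATIONS` and `check_time_step` are hypothetical helpers invented for this illustration, and the value of 6 time steps is taken from the example above rather than from real data.

```python
# The two common band combinations described above
BAND_COMBINATIONS = {
    "true_colour": ["nbar_red", "nbar_green", "nbar_blue"],
    "false_colour": ["nbar_nir_1", "nbar_red", "nbar_green"],
}


def check_time_step(time_step, n_time_steps):
    """Return time_step if it is a valid zero-based index, otherwise raise."""
    if not 0 <= time_step < n_time_steps:
        raise IndexError(
            f"time_step must be between 0 and {n_time_steps - 1}, got {time_step}"
        )
    return time_step


n_time_steps = 6  # e.g. the print-out shows `time: 6`
time_step = check_time_step(5, n_time_steps)  # the last valid index
bands = BAND_COMBINATIONS["false_colour"]
print(time_step, bands)
```

Counting starts at zero, so with 6 time steps a value of `6` would be out of range.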

In [None]:
import matplotlib.pyplot as plt
from utils.dea_plotting import rgb

# Set the time step to view
time_step = 

# Set the band combination to plot
bands = 

# Generate the image by running the rgb function
rgb(dataset, bands=bands, index=time_step, size=10)

# Format the time stamp for use as the plot title
time_string = str(dataset.time.isel(time=time_step).values).split('.')[0]  

# Set the title and axis labels
ax = plt.gca()
ax.set_title(f"Timestep {time_string}", fontweight='bold', fontsize=16)
ax.set_xlabel('Easting', fontweight='bold')
ax.set_ylabel('Northing', fontweight='bold')

# Display the plot
plt.show()

## Exporting data
The last task is to export the data for your study site. You can change the `filename` variable to control what the exported file is called. After the file has been created, you can download it from the Jupyter directory it was exported to.
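
If you export several time steps, a descriptive filename helps keep them apart. The sketch below is one possible approach, not something the notebook requires: `make_filename` is a hypothetical helper, and the time stamp is a made-up string in the format produced by converting an xarray time value to text.

```python
def make_filename(prefix, timestamp, extension="tiff"):
    """Build a filename such as 'site_2020-01-15T00-42-11.tiff'."""
    # Drop the fractional seconds, e.g. '.000000000'
    seconds = timestamp.split(".")[0]
    # Colons are not allowed in filenames on some systems
    safe = seconds.replace(":", "-")
    return f"{prefix}_{safe}.{extension}"


# Made-up time stamp for illustration
filename = make_filename("dead_dog_creek", "2020-01-15T00:42:11.000000000")
print(filename)
```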

In [None]:
from datacube import helpers

# You can change this, if you like.
filename = "example.tiff"

helpers.write_geotiff(dataset=dataset.isel(time=time_step), filename=filename)

## Stretch goal: Calculate NDVI

If you've come this far and you'd like to do something a bit fancier, you can have a go at calculating the normalised difference vegetation index (NDVI) over your study site. Wikipedia has a [definition of NDVI](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index).

Basically, you need to use the following formula:

$$
\begin{aligned}
\text{NDVI} & = \frac{(\text{NIR} - \text{Red})}{(\text{NIR} + \text{Red})} \\
\end{aligned}
$$
    
Some hints:
 * You can access the bands of an xarray `Dataset` (the data format we're using) by name, like this: `dataset.bandname`
 * You can do simple maths with bands by referring to them directly, like this: `dataset.bandname_1 + dataset.bandname_2`
 * The two band names you're after are `nbar_nir_1`, which is near infra-red, and `nbar_red`, which is red.
 * You can pass many arguments to the `.plot()` command to configure the image.
 One example is `cmap=colormap`, where `colormap` is the name of a [Matplotlib Colour Map](https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html).
 See if you can find a colour map that shows high values as green.
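
Before applying the formula to whole bands, it can help to try it on single numbers. The sketch below uses made-up pixel values, and `ndvi_value` is a hypothetical helper for this illustration only; your own calculation should use the dataset bands instead.

```python
def ndvi_value(nir, red):
    """Normalised difference vegetation index for a single pixel."""
    return (nir - red) / (nir + red)


# Made-up values: vegetation reflects much more near infra-red than red
vegetated = ndvi_value(nir=3000, red=500)
# Made-up values: bare ground reflects the two more evenly
bare = ndvi_value(nir=1500, red=1200)
print(round(vegetated, 3), round(bare, 3))
```

The result always falls between -1 and 1 for non-negative inputs, with higher values indicating a stronger vegetation signal.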

In [None]:
# Calculate NDVI here
# Fill in the calculation after the equals sign
ndvi = 

# This is the simple way to plot
# Note that high values are likely to be vegetation.
plt.figure(figsize=(10,10))
ndvi.isel(time=time_step).plot()
plt.show()