# Australian Gridded Climate Data <img align="right" src="../../resources/easi_logo.jpg">

#### Index
- [Overview](#Overview)
- [Setup (imports, dask, query)](#Setup)
- [Product definition (measurements, flags)](#Product-definition)
- [Visualise](#Visualise)
- [Run all products](#Run-all-products)
- [Appendix](#Appendix)

## Overview

The Australian Gridded Climate Data (AGCD) collection comprises gridded surface meteorology layers for Australia produced by the Bureau of Meteorology (BoM). The rainfall, temperature and vapour pressure data come from the BoM's network of rain gauges and weather stations. Observation station data are interpolated to a spatial resolution of 0.05 degrees latitude and longitude (~5 km x 5 km). Data are unprojected, in geographic decimal degrees, referenced to GDA94 (equivalent to WGS84 for all practical purposes). The solar irradiance data are derived from geostationary meteorological satellite imagery and are gridded in the same way.
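
As a rough check on the quoted cell size, the sketch below converts 0.05 degrees to kilometres at a few latitudes. It assumes a spherical Earth with about 111.32 km per degree of latitude, which is an approximation and not part of the AGCD documentation. `cell_size_km` is a hypothetical helper name.

```python
import math

KM_PER_DEGREE = 111.32  # Assumption: spherical-Earth approximation

def cell_size_km(resolution_deg, latitude_deg):
    """Return the approximate (north-south, east-west) cell size in km."""
    north_south = resolution_deg * KM_PER_DEGREE
    # East-west extent shrinks with the cosine of the latitude
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

# Latitudes spanning the example query extent used later in this notebook
for lat in (-10, -25, -45):
    ns, ew = cell_size_km(0.05, lat)
    print(f"lat {lat}: {ns:.2f} km N-S x {ew:.2f} km E-W")
```

Under this approximation the cells are about 5.6 km north-south and narrow east-west toward the south, so "~5 km" is best read as a nominal figure.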

Rainfall data are available for 1900 to present, temperatures from 1911 to present, vapour pressure from 1950 to present, and solar irradiance from 1990 to present. Most parameters have "day" and "month" layers and some have "RMSE" layers as well.

Key references for Bureau of Meteorology AWAP data are Jones et al. (2009) and Grant et al. (2008) and the information at http://www.bom.gov.au/climate/maps/.

#### Recalibrated rainfall

`agcd_rain_recal_day`: This product is generated by rescaling daily rainfall at the end of each month so that the sum of daily rainfalls matches the monthly reanalyses. The discrepancy between the sums of original daily rainfalls and the end-of-month reanalyses is due to differences in the interpolation methods applied to daily and monthly rainfall because of their different spatial structures. Interpolation failures in the (uncalibrated) daily data occur in sparsely gauged areas of the continent such as the Central and Western Deserts. These are removed in the recalibrated rainfall product and replaced with missing data values.

_This product is recommended for longer-term analyses that would benefit from consistency between daily and monthly values._
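
The rescaling idea described above can be illustrated for a single grid cell. This is only a sketch that assumes simple proportional scaling; the actual BoM recalibration procedure is not specified here and may differ. `rescale_daily_to_monthly` and the sample values are hypothetical.

```python
import numpy as np

def rescale_daily_to_monthly(daily, monthly_total):
    """Scale daily rainfall so that its sum equals the monthly total.

    Assumption: proportional scaling. A month with no daily rain is
    returned unchanged because there is nothing to scale.
    """
    daily = np.asarray(daily, dtype=float)
    daily_sum = daily.sum()
    if daily_sum == 0:
        return daily.copy()
    return daily * (monthly_total / daily_sum)

# Made-up daily values (mm) that sum to 20 mm, rescaled to a 30 mm month
daily = [0.0, 5.0, 10.0, 5.0]
recal = rescale_daily_to_monthly(daily, 30.0)
print(recal, recal.sum())
```

Proportional scaling keeps dry days dry and preserves the relative size of each day's rainfall, which is why it is used for this illustration.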

#### Data source and documentation

Individual data files can be manually downloaded, and verified, at http://www.bom.gov.au/climate/maps/.

CSIRO has a license agreement with BoM to receive the data and produce value-added products.

Key references:

- Jones DA, Wang W, Fawcett R (2009), High-quality spatial climate data-sets
  for Australia. Australian Meteorological and Oceanographic Journal 58:233-248
- Grant I, Jones D, Wang W, Fawcett R, Barratt D (2008), Meteorological and
  remotely sensed datasets for hydrological modelling: A contribution to the
  Australian Water Availability Project. Proceedings of the Catchment-scale
  Hydrological Modelling & Data Assimilation (CAHMDA-3) International Workshop
  on Hydrological Prediction: Modelling, Observation and Data Assimilation,
  Melbourne, January 9-11 2008.

#### EASI pipeline

| Task | Summary |
|------|---------|
| Source | Collate sources and versions of the full set of variables for all available time (CSIRO internal) |
| Download | Daily downloads from BoM FTP (CSIRO internal) |
| Preprocess | |
| Format | COG |
| Location | s3://landscapes-easi-shared/agcd/ (not public) |

Contacts: Rob Bridgart, Matt Paget

## Setup

#### Imports

In [None]:
# Data tools
import numpy as np
import xarray as xr
import pandas as pd
from datetime import datetime

# Datacube
import datacube
from datacube.utils import masking  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/masking.py
from odc.algo import enum_to_bool   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_masking.py
from odc.algo import xr_reproject   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_warp.py
from datacube.utils.geometry import GeoBox, box  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/geometry/_base.py
from datacube.utils.rio import configure_s3_access

# Holoviews, Datashader and Bokeh
import hvplot.pandas
import hvplot.xarray
import holoviews as hv
import panel as pn
import colorcet as cc
import cartopy.crs as ccrs
from datashader import reductions
from holoviews import opts
# import geoviews as gv
# from holoviews.operation.datashader import rasterize
hv.extension('bokeh', logo=False)

# Python
import sys, os, re

# Optional EASI tools
sys.path.append(os.path.expanduser('../../scripts'))
import notebook_utils

# Hide ShapelyDeprecationWarning
import warnings
warnings.filterwarnings('ignore', message='.+Shapely 2.0')

#### Dask

In [None]:
cluster, client = notebook_utils.initialize_dask(workers=(1,2), use_gateway=False)
display(cluster)
display(client)

#### AWS configuration

In [None]:
# Optional
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client)

#### ODC database

In [None]:
dc = datacube.Datacube()

#### Example query

Change any of the parameters in the query object below to adjust the location, time, projection, or spatial resolution of the returned datasets.

Use the Explorer interface to check the temporal and spatial coverage for each product:
- https://explorer.csiro.easi-eo.solutions  + /product (when available)

In [None]:
# Area name
min_longitude, max_longitude = (110,154)
min_latitude, max_latitude = (-45,-10)
min_date = '2021-01-01'
max_date = '2021-12-31'

query = {
    'x': (min_longitude, max_longitude),    # "x" axis bounds
    'y': (min_latitude, max_latitude),      # "y" axis bounds
    'time': (min_date, max_date),           # Any parsable date strings
    'group_by': 'solar_day',                # Scene ordering
    'dask_chunks': {'latitude': 2048, 'longitude': 2048},  # Dask chunks
}
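
To get a feel for how much data the example query describes, the sketch below estimates the grid shape and in-memory size for one daily product over 2021. It assumes the 0.05 degree resolution from the Overview and 4-byte (float32) values. The dtype is an assumption, and the loaded grid may differ by a cell depending on pixel alignment. `grid_shape` is a hypothetical helper.

```python
def grid_shape(lon_bounds, lat_bounds, resolution):
    """Return the approximate (ny, nx) number of cells inside the bounds."""
    nx = round((lon_bounds[1] - lon_bounds[0]) / resolution)
    ny = round((lat_bounds[1] - lat_bounds[0]) / resolution)
    return ny, nx

resolution = 0.05      # Degrees, from the Overview
bytes_per_value = 4    # Assumption: float32 values
n_days = 365           # One daily layer per day of 2021

ny, nx = grid_shape((110, 154), (-45, -10), resolution)
size_gb = ny * nx * n_days * bytes_per_value / 1e9
print(f"{ny} x {nx} cells per time step, ~{size_gb:.2f} GB per daily product")
```

Both spatial dimensions are smaller than the 2048-cell chunk size in the query, so the spatial chunking shown does not split a single time slice.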

In [None]:
# Product names follow agcd_<layer>_<period>, e.g. agcd_rain_recal_day
p = re.compile(r'agcd_(\w+)_([a-z]+)$')

def load_awap_product(product, query):

    # Load data
    data = dc.load(product=product, **query)

    # notebook_utils.heading(notebook_utils.xarray_object_size(data))
    # display(data)

    # Calculate valid (not nodata) masks for each layer
    valid_mask = masking.valid_data_mask(data)
    
    # Apply the valid-data mask to the layer named in the product
    m = p.match(product)
    layer_name = m.group(1)
    layer = data[[layer_name]].where(valid_mask[layer_name])
    layer = layer.persist()
    
    return layer, layer_name
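
The name parsing used in `load_awap_product` can be checked on its own. The sketch below applies the same regular expression to a few product names from this notebook. `split_product_name` is a hypothetical helper, and whether the first group matches the measurement name stored in the index is the assumption that `load_awap_product` relies on.

```python
import re

# Same pattern as in the notebook, written as a raw string
pattern = re.compile(r'agcd_(\w+)_([a-z]+)$')

def split_product_name(product):
    """Split an AGCD product name into (layer_name, period)."""
    m = pattern.match(product)
    if m is None:
        raise ValueError(f'Unexpected product name: {product}')
    return m.group(1), m.group(2)

for name in ('agcd_rain_recal_day', 'agcd_tmax_mean_month', 'agcd_solar_exposure_day'):
    print(name, '->', split_product_name(name))
```

Because `\w+` is greedy, everything up to the last underscore becomes the layer name and the final token becomes the period.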

## Product definition

Display the measurement definitions for the selected product.

Use `list_measurements` to show the details for a product, and `masking.describe_variable_flags` to show the flag definitions.

In [None]:
# Measurements for the selected product
product = 'agcd_rain_recal_day'  # Example product; any AGCD product name can be used
measurements = dc.list_measurements().loc[product]
measurements
# The AGCD products do not have flag_definition measurements; just missing data
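
Because these products only define missing data, masking reduces to excluding the nodata value. The NumPy sketch below shows that idea on a made-up array. The nodata value of -999 is hypothetical, and the notebook itself relies on `masking.valid_data_mask` rather than this code.

```python
import numpy as np

nodata = -999.0  # Hypothetical nodata value, for illustration only
data = np.array([[1.2, nodata],
                 [0.0, 3.4]])

# True where the cell holds a real observation
valid = data != nodata

# Replace nodata cells with NaN so they drop out of statistics and plots
masked = np.where(valid, data, np.nan)
mean_value = np.nanmean(masked)
print(masked, mean_value)
```

Converting nodata to NaN, as `load_awap_product` does with `.where(...)`, keeps the missing cells from skewing the averages and colour scale.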

## Visualise

In [None]:
clim = {
    'agcd_rain_recal_day': (0,400),
    'agcd_rain_rmse_day': (0,100),
    'agcd_rain_rmse_month': (0,200),
    'agcd_rain_total_day': (0,400),
    'agcd_rain_total_month': (0,800),
    'agcd_solar_exposure_day': (0,40),
    'agcd_solar_exposure_month': (0,40),
    'agcd_tmax_mean_day': (0,45),
    'agcd_tmax_mean_month': (0,45),
    'agcd_tmax_rmse_day': (0,10),
    'agcd_tmax_rmse_month': (0,10),
    'agcd_tmin_mean_day': (0,30),
    'agcd_tmin_mean_month': (0,30),
    'agcd_tmin_rmse_day': (0,10),
    'agcd_tmin_rmse_month': (0,10),
    'agcd_vp09_mean_day': (0,36),
    'agcd_vp09_mean_month': (0,36),
    'agcd_vp15_mean_day': (0,36),
    'agcd_vp15_mean_month': (0,36),
}

In [None]:
def awap_plot(layer, product, layer_name):

    # Generate a plot

    options = {
        'title': f'{product}: {layer_name}',
        'width': 800,
        'height': 450,
        'aspect': 'equal',
        'cmap': cc.rainbow,
        'clim': clim[product],
        'colorbar': True,
        'tools': ['hover'],
    }

    # Set the Dataset CRS
    plot_crs = ccrs.PlateCarree()

    # Native data and coastline overlay:
    # - Comment `crs`, `projection`, `coastline` to plot in native_crs coords
    # TODO: Update the axis labels to 'longitude', 'latitude' if `coastline` is used

    layer_plot = layer.hvplot.image(
        x = 'longitude', y = 'latitude',         # Dataset x,y dimension names
        rasterize = True,                        # Use Datashader
        aggregator = reductions.mean(),          # Datashader selects mean value
        precompute = True,                       # Datashader precomputes what it can
        crs = plot_crs,                          # Dataset crs
        projection = ccrs.PlateCarree(),         # Output projection (use ccrs.PlateCarree() when coastline=True)
        coastline = '10m',                       # Coastline = '10m'/'50m'/'110m'
    ).options(opts.Image(**options)).hist(bin_range = options['clim'])

    # display(layer_plot)
    # Optional: Change the default time slider to a dropdown list, https://stackoverflow.com/a/54912917
    fig = pn.panel(layer_plot, widgets={'time': pn.widgets.Select})  # widget_location='top_left'
    display(fig)

## Run all products

In [None]:
for product in clim.keys():
    try:
        layer, layer_name = load_awap_product(product, query)
        display(layer)
        awap_plot(layer, product, layer_name)
    except KeyError:
        # Skip products that are unavailable or lack the expected measurement
        continue

## Appendix