# Visualising NetCDF data in Python

This notebook tutorial was produced by the [Knowledge Systems](https://eatlas.org.au) team at the [Australian Institute of Marine Science](https://www.aims.gov.au) to guide the reader through downloading a dataset file, inspecting metadata of interest, and visualising the data on a map. This tutorial will use the [GBR photosynthetically active radiation (PAR) at 8 m depth](https://eatlas.org.au/data/uuid/eebd1438-2d4e-4f60-9055-27e6b9e58c3a) dataset from the [Benthic light as ecologically-validated GBR-wide indicator for water quality (NESP TWQ Project 5.3)](https://eatlas.org.au/nesp-twq-5/benthic-light-5-3) project, which was part of the [NESP Tropical Water Quality hub](https://nesptropical.edu.au).

If you are not already viewing this notebook in Google Colab, you can run it there by clicking the "Open in Colab" button below.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/eatlas/netcdf-python/blob/main/notebooks/visualising-netcdf.ipynb)

### Prepare the environment
The next cell installs any required libraries that are not already installed. This only needs to be done once for the lifetime of the active notebook.

In [None]:
# Re-install 'shapely' because of a bug that crashes the notebook when plotting a variable on a map.
!pip uninstall shapely --yes
!pip install shapely --no-binary shapely

# Install 3rd party libraries required by the code.
!pip3 install netcdf4 cartopy

### Download sample data
Download sample data from the PAR8 dataset.

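<b>NOTE:</b> This only needs to be done once, as the downloaded file is saved to the working directory. However, running this command multiple times will NOT cause any problems; it simply downloads the file again and overwrites the existing copy.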

In [None]:
!curl https://maps.eatlas.org.au/thredds/fileServer/NESP-TWQ-5-3_Benthic-light/xr_par8/orig/xr_par8_daily/xr_par8_daily_2019.nc --output xr_par8_daily_2019.nc
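
For readers who prefer to stay in Python, the following is a minimal sketch of the same download step that skips the transfer when the file is already on disk. The helper name `download_if_missing` is hypothetical and not part of the original notebook; it uses only the standard library.

```python
import os
import urllib.request


def download_if_missing(url, path):
    """Download url to path unless a file already exists there; return path.

    Hypothetical helper for illustration. It only checks that the file
    exists, not that an earlier download completed successfully.
    """
    if os.path.exists(path):
        # The file is already on disk, so skip the network transfer.
        return path
    urllib.request.urlretrieve(url, path)
    return path
```

It could be called with the same URL and file name as the `curl` command above, for example `download_if_missing(url, 'xr_par8_daily_2019.nc')`.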

### Connect to the dataset

In [None]:
# Import the NetCDF library for accessing and manipulating NetCDF files.
import netCDF4 as nc

# Open the PAR8 NetCDF file downloaded previously.
dataset = nc.Dataset('xr_par8_daily_2019.nc')

# Print the high-level metadata.
print(dataset)

This shows the dataset-level metadata (lower-level objects, such as variables, can have their own metadata). Some things to notice here:

1. Dimensions contained in this dataset are: `lon` (longitude, 1344 steps), `lat` (latitude, 1536 steps), `time` (365 days).
2. Variables are:
    - `lon` - the actual values of the longitude dimension, referred to as a Dimension Variable.
    - `lat` - the actual values of the latitude dimension, referred to as a Dimension Variable.
    - `par8` - the variable data we will visualise.

The metadata is also available as a Python `dict` for easier processing:

In [None]:
print(dataset.__dict__)

# As an example.
print('\nmetadata_link: ' + dataset.__dict__['metadata_link'])
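
Indexing the metadata dictionary directly raises a `KeyError` when an attribute is absent, which can happen when the same code is run against a different file. The sketch below shows a more forgiving lookup using `dict.get`. The `attributes` dictionary is a stand-in with made-up keys and values, and the `describe` helper is hypothetical; in the notebook, `dataset.__dict__` would take the place of `attributes`.

```python
# Stand-in for dataset.__dict__; the keys and values here are hypothetical.
attributes = {
    'title': 'Example dataset',
    'metadata_link': 'https://example.org/metadata',
}


def describe(attrs, key, default='(not set)'):
    """Return 'key: value', using a default when the attribute is absent."""
    return f'{key}: {attrs.get(key, default)}'


print(describe(attributes, 'metadata_link'))
print(describe(attributes, 'history'))
```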

### Dimensions
NetCDF files are often used for multi-dimensional data. Each dimension is stored as a `Dimension` object which contains pertinent information. Metadata for all dimensions can be accessed by looping through all available dimensions.

In [None]:
for dimension in dataset.dimensions.values():
    print(dimension)

An individual dimension can be accessed directly:

In [None]:
print(dataset.dimensions['lon'])

### Variables
Metadata for all variables can be accessed similarly.

In [None]:
for variable in dataset.variables.values():
    print(variable)

And for a specific variable:

In [None]:
print(dataset['par8'])

Notes about the `par8` data:

__`float32 par8(time, lat, lon)`__ - three dimensions, with the order being: `time`, `lat` and then `lon`. This is important later when accessing the data.

__`current shape = (365, 1536, 1344)`__ - the size of the data cube is 365 x 1536 x 1344, which matches the size of the dimensions from earlier.
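
To get a feel for what this shape means in practice, the following sketch works out how much memory the values would occupy if loaded uncompressed. It assumes 4 bytes per value (float32, as reported above) and ignores any compression in the file on disk, so it is an estimate of in-memory size only.

```python
import math

shape = (365, 1536, 1344)  # (time, lat, lon), as reported for par8
bytes_per_value = 4        # float32

# Number of values in the full cube and in a single daily (lat, lon) slice.
total_values = math.prod(shape)
day_values = math.prod(shape[1:])

total_bytes = total_values * bytes_per_value
day_bytes = day_values * bytes_per_value

print(f'Full cube: {total_values:,} values, {total_bytes / 1024**3:.2f} GiB')
print(f'One day:   {day_values:,} values, {day_bytes / 1024**2:.2f} MiB')
```

The full cube is several hundred times larger than a single day, which is one reason to read only the slice that is needed.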

### Visualise the data

Retrieve the latitude and longitude data. Note the `[:]` syntax, which reads all values along the single dimension, as the latitude and longitude dimension variables are 1-dimensional.

In [None]:
lons = dataset['lon'][:]
lats = dataset['lat'][:]
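
The 1-dimensional coordinate arrays are what tie a position in the data grid to a location on the map. As a sketch of that relationship, the example below finds the grid cell nearest to a given longitude and latitude. All arrays are small synthetic stand-ins with made-up values, not the real `lons`, `lats` or `par8` data, and the `nearest_index` helper is hypothetical.

```python
import numpy as np

# Synthetic stand-ins for the coordinate arrays (values are made up).
lons = np.linspace(142.0, 156.0, 8)   # 142, 144, ..., 156
lats = np.linspace(-10.0, -28.0, 10)  # -10, -12, ..., -28

# Synthetic (lat, lon) grid standing in for one day of data.
grid = np.arange(lats.size * lons.size).reshape(lats.size, lons.size)


def nearest_index(coords, target):
    """Return the index of the coordinate value closest to target."""
    return int(np.abs(coords - target).argmin())


lat_idx = nearest_index(lats, -18.3)
lon_idx = nearest_index(lons, 147.1)

# Latitude comes before longitude, matching the (time, lat, lon) order.
value = grid[lat_idx, lon_idx]
print(f'Nearest cell: lat={lats[lat_idx]}, lon={lons[lon_idx]}, value={value}')
```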

Retrieve the variable data. Note that the variable data is 3-dimensional, with the first dimension being the day, indexed from 0 (zero) to 364. Here we choose an arbitrary day (index 287) to visualise.

In [None]:
par8 = dataset['par8'][287,:,:]
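
To relate a time index to a calendar date, a simple conversion can be sketched as below. It assumes the time axis starts on 1 January 2019 and advances by exactly one day per step with no gaps; the file's 365 steps are consistent with that, but it remains an assumption here. The `index_to_date` helper is hypothetical.

```python
from datetime import date, timedelta


def index_to_date(index, year=2019):
    """Convert a zero-based day index to a date, assuming one step per day
    starting on 1 January of the given year."""
    first_day = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - first_day).days
    if not 0 <= index < days_in_year:
        raise IndexError(f'index {index} is outside 0..{days_in_year - 1}')
    return first_day + timedelta(days=index)


print(index_to_date(287))  # 2019-10-15
```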

Find the minimum and maximum values for the map legend.

In [None]:
import numpy as np
import math

# Scan the extracted data for the minimum and maximum.
min_value = math.floor(np.amin(par8))
max_value = math.ceil(np.amax(par8))
print(f'min_value: {min_value}')
print(f'max_value: {max_value}')
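
The netCDF4 library typically returns data as a NumPy masked array, with cells that hold no data (for example, fill values) hidden from calculations. The sketch below illustrates, on a small synthetic grid, why that matters for the minimum and maximum. The values and the fill value of `-999.0` are made up for illustration and are not taken from the PAR8 dataset.

```python
import math
import numpy as np

# Synthetic 3 x 4 grid standing in for one day of data (values are made up).
values = np.array([[1.2, 3.7, -999.0, 5.1],
                   [0.4, -999.0, 2.9, 6.8],
                   [4.4, 2.2, 1.1, -999.0]], dtype=np.float32)

# Mask the hypothetical fill value so it is ignored by min() and max().
day = np.ma.masked_values(values, -999.0)

min_value = math.floor(day.min())
max_value = math.ceil(day.max())

print(f'valid cells: {day.count()} of {day.size}')
print(f'min_value: {min_value}, max_value: {max_value}')
print(f'unmasked minimum would be: {values.min()}')
```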

Display the data on a map.

In [None]:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

fig = plt.figure(figsize=(16,8))
ax = plt.axes(projection=ccrs.Robinson())
ax.gridlines(linestyle='--',color='black')
ax.coastlines()
# np.arange excludes the stop value, so add 1 to keep max_value as the top contour level.
clevs = np.arange(min_value,max_value + 1,1)
plt.contourf(lons, lats, par8, clevs, transform=ccrs.PlateCarree(),cmap=plt.cm.jet)
plt.title(dataset['par8'].__dict__['long_name'], size=14)
cb = plt.colorbar(ax=ax, orientation="vertical", pad=0.02, aspect=16, shrink=0.8)
cb.set_label(dataset['par8'].__dict__['units'],size=12,rotation=90,labelpad=15)
cb.ax.tick_params(labelsize=10)
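
The contour levels determine which range of values is coloured, and `np.arange` leaves out its stop value. The sketch below compares the two ways of building the levels, using made-up bounds of 0 and 7 rather than values from the dataset.

```python
import numpy as np

# Hypothetical bounds for illustration only.
min_value, max_value = 0, 7

# Stop value excluded: the highest level is max_value - 1.
exclusive_levels = np.arange(min_value, max_value, 1)

# Stop value extended by one step: the highest level is max_value.
inclusive_levels = np.arange(min_value, max_value + 1, 1)

print(exclusive_levels)
print(inclusive_levels)
```

With the first form, data between `max_value - 1` and `max_value` falls above the highest level and is left uncoloured unless `contourf` is told to extend the colour range.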

## Links
- [Python netCDF4 library](https://unidata.github.io/netcdf4-python/)
- [GBR photosynthetically active radiation (PAR) at 8 m depth](https://eatlas.org.au/data/uuid/eebd1438-2d4e-4f60-9055-27e6b9e58c3a)
- [Benthic light as ecologically-validated GBR-wide indicator for water quality (NESP TWQ Project 5.3)](https://eatlas.org.au/nesp-twq-5/benthic-light-5-3)
- [NESP Tropical Water Quality hub](https://nesptropical.edu.au)

## Acknowledgements
These tutorials build on ideas and content from the following sources:
- [Read NetCDF Data with Python](https://towardsdatascience.com/read-netcdf-data-with-python-901f7ff61648)
