# Discovering eReefs dataset dimensions

This example demonstrates how to extract information about the spatial and temporal dimensions of an eReefs dataset. The results translate the (i,j) curvilinear grid cell indexes into longitude and latitude, and the time indexes into ISO-formatted date strings.

## Access the OPeNDAP endpoint URL for your dataset

We begin by discovering the OPeNDAP endpoint URL for the published eReefs dataset that we want to interrogate.

If you don't know this already, then you can follow [these instructions](./data-discovery.ipynb) to use the CSIRO eReefs data explorer to discover the URL you need.

The URL we are using in these examples is <https://dapds00.nci.org.au/thredds/dodsC/fx3/gbr4_v2/gbr4_simple_2023-10-20.nc>, which is the OPeNDAP endpoint for a single day (October 20, 2023) of the [GBR4 Hydrodynamics v2.0 near-real-time model results](https://marlin.csiro.au/geonetwork/srv/eng/catalog.search#/metadata/72020224-f086-434a-bbe9-a222c8e5cf0d).

The example would also work with the full aggregation of that dataset; we chose the smaller subset to keep the output of this notebook small.

We will use [emsarray](https://emsarray.readthedocs.io/en/stable/) to help us query this dataset:

In [None]:

import emsarray

# Access the dataset
dataset = emsarray.open_dataset("https://dapds00.nci.org.au/thredds/dodsC/fx3/gbr4_v2/gbr4_simple_2023-10-20.nc")


## Spatial Bounds

Discovering the 2D spatial bounds is very simple!

Under the hood, `emsarray` combines the `i` curvilinear grid-cell coordinate with the `longitude` variable, and the `j` coordinate with the `latitude` variable.


In [None]:
bounds = dict(zip(['xmin_west', 'ymin_south', 'xmax_east', 'ymax_north'], dataset.ems.bounds))
bounds
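
To make the idea behind the bounds concrete, here is a minimal sketch in plain Python. It is an illustration only, not `emsarray`'s implementation: the helper name `grid_bounds` and the tiny grid values are made up for this example, and NaN stands in for grid cells without coordinates.

```python
import math


def grid_bounds(longitudes, latitudes):
    """Return (xmin, ymin, xmax, ymax) for 2D coordinate grids, skipping NaNs."""
    lons = [v for row in longitudes for v in row if not math.isnan(v)]
    lats = [v for row in latitudes for v in row if not math.isnan(v)]
    return (min(lons), min(lats), max(lons), max(lats))


# Hypothetical 2x3 curvilinear grid, indexed [j][i]; NaN marks a missing cell.
longitude = [[142.0, 142.5, 143.0], [142.1, 142.6, float("nan")]]
latitude = [[-10.0, -10.1, -10.2], [-10.5, -10.6, float("nan")]]

print(grid_bounds(longitude, latitude))  # (142.0, -10.6, 143.0, -10.0)
```

The tuple order matches the `xmin_west`, `ymin_south`, `xmax_east`, `ymax_north` keys used above.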


## Elevations

The third spatial dimension is elevation (or depth). eReefs models have a discrete set of valid elevations, all measured in metres, where an elevation of `0` is sea level.

In the netCDF eReefs results, this is the `k` dimension-index combined with the `zc` variable.

In [None]:
elevation_variables = dataset.ems.get_all_depth_names()
elevation_result = {}

for elevation_variable in elevation_variables:
    if elevation_variable in dataset.variables:
        elevation_result[elevation_variable] = dataset[elevation_variable].values.tolist()

elevation_result
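
Once the elevations are in a plain list, the `k` index of the layer closest to a target elevation can be found with a simple search. The sketch below assumes a hypothetical helper, `nearest_layer_index`, and uses made-up `zc` values rather than the real GBR4 layers.

```python
def nearest_layer_index(layer_centres, target_elevation):
    """Return the index k of the layer centre closest to target_elevation."""
    return min(
        range(len(layer_centres)),
        key=lambda k: abs(layer_centres[k] - target_elevation),
    )


# Illustrative layer centres in metres (negative values are below sea level).
zc = [-145.0, -50.0, -12.75, -1.5, -0.5]

k = nearest_layer_index(zc, -10.0)
print(k, zc[k])  # 2 -12.75
```

The same function can be applied to any of the lists stored in `elevation_result`.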


## Time Periods

Discover the available time periods for this dataset.

The netCDF model results store time-dimension information as the number of days since 1990-01-01 00:00:00 +10.  This example translates those indices to ISO-formatted date strings.
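
As an illustration of that encoding, the sketch below decodes a raw "days since" value by hand using only the standard library. The function name `days_since_epoch_to_iso` is made up for this example. In practice `xarray` normally performs this decoding when the dataset is opened, so the manual step is shown only to explain what the stored numbers mean.

```python
from datetime import datetime, timedelta, timezone

# Epoch described above: 1990-01-01 00:00:00 at UTC+10.
EPOCH = datetime(1990, 1, 1, tzinfo=timezone(timedelta(hours=10)))


def days_since_epoch_to_iso(days):
    """Convert a raw 'days since epoch' value to an ISO 8601 string."""
    return (EPOCH + timedelta(days=days)).isoformat()


print(days_since_epoch_to_iso(0))    # 1990-01-01T00:00:00+10:00
print(days_since_epoch_to_iso(1.5))  # 1990-01-02T12:00:00+10:00
```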

In [None]:
import numpy
import pandas

# Work with the plain NumPy datetime64 values of the time coordinate
time_values = dataset.ems.time_coordinate.values

# Check we have more than one timestep!
assert len(time_values) > 1

# Round to the nearest minute to take care of rounding issues.
# (None of the eReefs model results have a time resolution higher than this)
# astype() truncates, so add half a minute first to round to the nearest minute.
differences = numpy.ediff1d(time_values)
differences = (differences + numpy.timedelta64(30, 's')).astype('timedelta64[m]')
previous_diff = None
period_start = time_values[0]

# identify the time periods
time_periods: list[tuple[numpy.datetime64, numpy.datetime64, numpy.timedelta64]] = []
for i, next_diff in enumerate(differences):
    if previous_diff is None:
        previous_diff = next_diff

    if previous_diff != next_diff:
        time_periods.append((period_start, time_values[i], previous_diff))
        period_start = time_values[i + 1]
        previous_diff = None

# Close off the final period once the loop has finished
time_periods.append((period_start, time_values[-1], next_diff))

# Format the time periods into 'start/stop/step' format
time_results = [
    '/'.join([
        numpy.datetime_as_string(start, unit='s'),
        numpy.datetime_as_string(end, unit='s'),
        pandas.Timedelta(diff).isoformat(),
    ])
    for start, end, diff in time_periods
]

time_results
