Commit

Sgrid (#559)
* Add sgrid file. Make effort to recognise valid sgrid datasets.

* Added method to assign coordinate axes to Sgrid data. Using 'X', 'Y', 'Z'.

* Basic identification of sgrid coordinates.

* Added basic logic to extract SGRID coords and assign padding.

* add short note on x_dimensions syntax

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* clarify SGRID grid dummy variable usage; tweaks

* Basic handling of SGRID data and generation of grid.

* Handle optional spaces after colon in grid data.

* Add a check for conventions attribute (Review of PR 559)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added example SGRID datasets to test folder.

* Removing print statements from grid.py

* Updates tests for SGRID (NB SOME FAIL - IN DEVELOPMENT)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add in SGRID datasets.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove duplicate SGRID dataset.

* Remove duplicate SGRID dataset.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactored to have a metadata parsing module. WIP.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added basic initial tests for metadata parsing.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Undo bug introduced in test_grid.py

* Update metadata parsing to return dict of grid kwargs.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added cf_xarray approach for sgrid grid (pending upstream changes).

* Grid metadata parsing with checks for conflicts. Passing existing Grid tests.

* black, flake8, mypy (except grid.py).

* Added autoparsing tests to grid.

* add tests for sgrid and metadata parsing.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated grids documentation.

* Updated grids documentation.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Added summary to what's new under features.

* Update xgcm/grid.py

Make grid dataset error more specific.

Co-authored-by: Julius Busecke <julius@ldeo.columbia.edu>

* Update test to match revised error message in Grid class.

* Fix docstring formatting in grid.py

* Fix errors and warnings in docs.

* Fix typo in grid error message and associated test. Fix docs typo.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Format docs with blackdoc.

* Add correct sgrid dataset to docs.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add test for 1D and 4D grids and update docs to run.

* Update docs to run.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updated documentation to pass builds.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* linting of test_metadata_parsers.py

* add dummy txt file

* Update ufunc tests with autoparse_metadata=False when user supplies coords.

* remove dummy text file added previously for testing CI.

* Update other tests with autoparse_metadata=False where required.

* Update test_padding with autoparse_metadata=False where required.

---------

Co-authored-by: Ben Orchard <thefirstmuffinman@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Julius Busecke <julius@ldeo.columbia.edu>
4 people committed Apr 13, 2023
1 parent 444e1c8 commit 8fcd0b0
Showing 18 changed files with 1,221 additions and 125 deletions.
2 changes: 1 addition & 1 deletion doc/api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ Grid
:members:

Grid ufuncs
====
===========

.. autosummary::
:toctree: generated/
Expand Down
2 changes: 1 addition & 1 deletion doc/contributor_guide.rst
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ The conda-forge package will be triggered by the Pypi release and you will have
That's it!

How to synchronize examples from xgcm-examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Most of the example notebooks in this documentation are located in the separate repo `xgcm-examples <https://github.com/xgcm/xgcm-examples>`_, which is automatically linked to `pangeo gallery <https://gallery.pangeo.io>`_. These examples are synced into this documentation using git submodules.
Currently updates in the example repo need to be manually synced to this repo with the following steps:

Expand Down
6 changes: 4 additions & 2 deletions doc/grid_ufuncs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,9 @@ Our grid object has one Axis (``"X"``), which has two coordinates, on positions
}
)
grid = Grid(ds, coords={"X": {"center": "x_c", "left": "x_g"}})
grid = Grid(
ds, coords={"X": {"center": "x_c", "left": "x_g"}}, autoparse_metadata=False
)
grid
Our data starts on the cell centers.
Expand Down Expand Up @@ -295,7 +297,7 @@ In both cases the result has the same length as the original input array.
We can also see that the result depends on the choice of boundary conditions.

Automatically Applying Boundary Conditions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Doing this manually is a chore, so xgcm allows you to apply boundary conditions automatically when using grid ufuncs.

Expand Down
105 changes: 97 additions & 8 deletions doc/grids.rst
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,9 @@ of the same axis. We do this using the ``coords`` keyword argument, as follows:
from xgcm import Grid
grid = Grid(ds, coords={"X": {"center": "x_c", "left": "x_g"}})
grid = Grid(
ds, coords={"X": {"center": "x_c", "left": "x_g"}}, autoparse_metadata=False
)
grid
The printed information about the grid indicates that xgcm has successfully
Expand All @@ -195,19 +197,105 @@ interpolation and difference operations: operating on the center coordinate
Detecting Axes from Dataset Attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is possible to avoid manually specifying the axis information via the
``coords`` keyword if the dataset contains specific metadata that can
tell xgcm about the relationship between different coordinates.
If ``coords`` is not specified, xgcm looks for this metadata in the coordinate
attributes.
It is possible to avoid manually specifying the axis if the dataset contains
specific metadata that can tell xgcm about the relationship between different
coordinates.
If the ``autoparse_metadata`` kwarg is set to ``True`` (the default), xgcm looks
for this metadata in the coordinate attributes.
Wherever possible, we try to follow established metadata conventions, rather
than defining new metadata conventions. The two main relevant conventions
than defining new metadata conventions. The main relevant conventions
are the `CF Conventions`_, which apply broadly to Climate and Forecast datasets
that follow the netCDF data model, and the `COMODO conventions`_, which define
that follow the netCDF data model, and the `COMODO conventions`_ and
`SGRID conventions`_, both of which define
specific attributes relevant to Arakawa grids. While the COMODO conventions
were designed with C-grids in mind, we find they are general enough to support
all the different Arakawa grids.

Detection and extraction of grid information from datasets is performed by a series
of metadata parsing functions that take an xarray dataset and return a (potentially
modified) dataset and dictionary of extracted Grid kwargs.
When used as part of the autoparsing functionality of the ``Grid`` class there is a
default hierarchy imposed.
For more control a user can manually use a specific autoparsing function to extract
the ``Grid`` kwargs and then pass them to the ``Grid`` constructor (after any
changes/additions) with ``autoparse_metadata=False``.

For example:

.. ipython:: python
grid = xgcm.Grid(ds)
will return a ``Grid`` object constructed from xgcm's best attempts to autoparse any
metadata in the dataset according to internal hierarchies, whilst

.. ipython:: python
ds = xr.Dataset(
{
"grid": (
(),
np.array(1, dtype="int32"),
{
"cf_role": "grid_topology",
"topology_dimension": 1,
"node_dimensions": "x_g",
"face_dimensions": "x_c: x_g (padding: high)",
},
),
},
attrs={"Conventions": "SGRID-0.3"},
coords={
"x_c": (
["x_c"],
np.arange(1, 10),
),
"x_g": (
["x_g"],
np.arange(0.5, 9),
),
},
)
ds_sgrid, grid_kwargs_sgrid = xgcm.metadata_parsers.parse_sgrid(ds)
grid = xgcm.Grid(ds, coords=grid_kwargs_sgrid["coords"], autoparse_metadata=False)
explicitly extracts SGRID metadata which is then used to construct a ``Grid`` object
without autoparsing.

SGRID data
""""""""""
The identifier xgcm looks for is 'SGRID' in the ``Conventions`` attribute.
Grid data is then contained within the ``variable`` with the ``cf_role`` of
'grid_topology'.
A set of grid axes in the order ``'X', 'Y', 'Z'`` are assigned based on the
dimensionality of the data.
Note that SGRID treatment of 3D grids and 2D grids with a vertical coordinate is
subtly different.
Both cases are handled by the autoparsing functionality to form a 3D ``Grid`` object.

SGRID 'node_dimensions' are extracted and correspond to xgcm's cell edges.
SGRID 'face' or 'volume' dimensions are then extracted with their associated 'padding'
identifier.
This corresponds to xgcm's cell centers.
Once the padding type has been extracted the correct xgcm 'position' can be assigned
to the associated cell edge coordinate as set out in the following table:

+---------------+----------+
| SGRID padding | position |
+===============+==========+
| low | right |
+---------------+----------+
| high | left |
+---------------+----------+
| both | inner |
+---------------+----------+
| none | outer |
+---------------+----------+
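
As a rough illustration of the table above (a sketch only, not xgcm's actual
parsing code), the snippet below turns a single SGRID dimension entry such as
``"x_c: x_g (padding: high)"`` into the ``{"center": ..., position: ...}``
dictionary that the ``coords`` kwarg expects for one axis. The helper name
``sgrid_entry_to_axis_coords`` and the regular expression are assumptions made
for this example.

```python
import re

# Mapping taken from the table above: SGRID padding type -> xgcm position
# assigned to the cell-edge (node) coordinate.
SGRID_PADDING_TO_POSITION = {
    "low": "right",
    "high": "left",
    "both": "inner",
    "none": "outer",
}

# Matches entries like "x_c: x_g (padding: high)".
# Spaces after the colons are treated as optional.
_ENTRY = re.compile(r"(\w+):\s*(\w+)\s*\(padding:\s*(\w+)\)")


def sgrid_entry_to_axis_coords(entry):
    """Hypothetical helper (not part of xgcm): parse one SGRID dimension entry."""
    match = _ENTRY.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"Unrecognised SGRID dimension entry: {entry!r}")
    center_dim, edge_dim, padding = match.groups()
    position = SGRID_PADDING_TO_POSITION[padding]
    # Face/volume dimension is the cell center; node dimension is the cell edge.
    return {"center": center_dim, position: edge_dim}


axis_coords = sgrid_entry_to_axis_coords("x_c: x_g (padding: high)")
print(axis_coords)
```

For the ``face_dimensions`` string used in the example dataset earlier in this
section, this yields the same center/left pairing that is passed by hand as
``coords={"X": {"center": "x_c", "left": "x_g"}}``.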

COMODO Data
"""""""""""

The key attribute xgcm looks for is ``axis``.
When creating a new grid, xgcm will search through the dataset dimensions
looking for dimensions with the ``axis`` attribute defined.
Expand Down Expand Up @@ -338,3 +426,4 @@ This is described in the :ref:`grid_topology` page.
.. _MITgcm notation: http://mitgcm.org/public/r2_manual/latest/online_documents/node31.html
.. _CF Conventions: http://cfconventions.org/
.. _COMODO Conventions: https://web.archive.org/web/20160417032300/http://pycomodo.forge.imag.fr/norm.html
.. _SGRID Conventions: https://sgrid.github.io/sgrid/
1 change: 1 addition & 0 deletions doc/ufunc_examples.rst
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ Firstly we need a two-dimensional grid, and we use similar coordinate names to t
"X": {"center": "x_c", "left": "x_g"},
"Y": {"center": "y_c", "left": "y_g"},
},
autoparse_metadata=False,
)
grid
Expand Down
12 changes: 8 additions & 4 deletions doc/whats-new.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,10 @@ v0.9.0 (unreleased)

New Features
~~~~~~~~~~~~
- Methods for autoparsing of dataset metadata to construct a :py:class:`xgcm.Grid` class have been added.
Currently these include restructured functionality for the COMODO conventions (already in xgcm) and the
addition of SGRID conventions (:issue:`109`, :pull:`559`).
By `Jack Atkinson <https://github.com/jatkinson1000>`_.

Breaking Changes
~~~~~~~~~~~~~~~~
Expand All @@ -21,6 +25,7 @@ Breaking Changes

- All functionality for generating c-grid dimensions on incomplete datasets via :py:meth:`xgcm.Grid.autogenerate`, was removed (:pull:`557`).
By `Julius Busecke <https://github.com/jbusecke>`_.

Internal Changes
~~~~~~~~~~~~~~~~
- Switch CI environment setup to micromamba (:issue:`576`, :pull:`577`).
Expand Down Expand Up @@ -153,7 +158,7 @@ Documentation
Internal Changes
~~~~~~~~~~~~~~~~
- Switch CI to use mamba (:pull:`412`, :issue:`398`).
By `Julius Busecke <https://github.com/jbusecke>`_.
By `Julius Busecke <https://github.com/jbusecke>`_.

- Add deprecation warnings for future changes in the API (:issue:`409`,:pull:`411`).
By `Julius Busecke <https://github.com/jbusecke>`_.
Expand Down Expand Up @@ -195,7 +200,7 @@ Documentation
By `Dianne Deauna <https://github.com/jdldeauna>`_.[*]_

Internal Changes
~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~

- Fixed metrics tests so some tests that previously did not run now do run, and refactored the metrics tests.
By `Tom Nicholas <https://github.com/TomNicholas>`_.[*]_
Expand Down Expand Up @@ -258,7 +263,6 @@ Bug fixes
- More reliable handling of missing values in :py:meth:`Grid.average`. Missing values between data and metrics do not have to be aligned by the user anymore. (:pull:`259`). By `Julius Busecke <https://github.com/jbusecke>`_.

- Remove outdated `example_notebooks` folder (:pull:`244`, :issue:`243`). By `Nikolay Koldunov <https://github.com/koldunovn>`_ and `Julius Busecke <https://github.com/jbusecke>`_.
.. _whats-new.0.5.0:


v0.4.0 (2020/9/2)
Expand All @@ -272,7 +276,7 @@ New Features
Default values can be overridden in individual method calls (e.g. :py:meth:`Grid.interp`) as usual.
By `Deepak Cherian <https://github.com/dcherian>`_.

Bug fixes
Bug fixes
~~~~~~~~~
- Fix for parsing fill_values as dictionary (:issue:`218`).
By `Julius Busecke <https://github.com/jbusecke>`_.
Expand Down
83 changes: 60 additions & 23 deletions xgcm/grid.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@
import xarray as xr
from dask.array import Array as Dask_Array

from . import comodo, gridops
from . import gridops, metadata_parsers
from .axis import Axis
from .grid_ufunc import (
GridUFunc,
Expand Down Expand Up @@ -113,6 +113,7 @@ def __init__(
value. (i.e. a limited form of Neumann boundary condition.)
* 'extrapolate': Set values by extrapolating linearly from the two
points nearest to the edge
Optionally a dict mapping axis name to separate values for each axis
can be passed.
face_connections : dict
Expand All @@ -129,8 +130,60 @@ def __init__(
----------
.. [1] Comodo Conventions https://web.archive.org/web/20160417032300/http://pycomodo.forge.imag.fr/norm.html
"""
if not isinstance(ds, xr.Dataset):
raise TypeError(
f"ds argument to `xgcm.Grid` must be of type xarray.Dataset, but is of type {type(ds)}"
)

self._ds = ds

# Attempt to autoparse metadata from various conventions
# Default is to do this to preserve backwards compatibility
if autoparse_metadata:
ds, parsed_kwargs = metadata_parsers.parse_metadata(ds)

# Loop over input kwargs. If None and parsed alternative available
# then replace local variable with autoparsed. If conflict raise error.
print(f"coords = {coords}")
duplicates = []
if "coords" in parsed_kwargs:
if coords is None:
coords = parsed_kwargs["coords"]
else:
duplicates.append("coords")
if "fill_value" in parsed_kwargs:
if fill_value is None:
fill_value = parsed_kwargs["fill_value"]
else:
duplicates.append("fill_value")
if "default_shifts" in parsed_kwargs:
if default_shifts is None:
default_shifts = parsed_kwargs["default_shifts"]
else:
duplicates.append("default_shifts")
if "boundary" in parsed_kwargs:
if boundary is None:
boundary = parsed_kwargs["boundary"]
else:
duplicates.append("boundary")
if "face_connections" in parsed_kwargs:
if face_connections is None:
face_connections = parsed_kwargs["face_connections"]
else:
duplicates.append("face_connections")
if "metrics" in parsed_kwargs:
if metrics is None:
metrics = parsed_kwargs["metrics"]
else:
duplicates.append("metrics")

if len(duplicates) > 0:
raise ValueError(
f"Autoparsed Grid kwargs: '{', '.join(duplicates)}' conflict with "
f"user-supplied kwargs. Run with 'autoparse_metadata=False', or "
f"autoparse and amend kwargs before calling Grid constructor."
)

if boundary:
warnings.warn(
"The `boundary` argument will be renamed "
Expand Down Expand Up @@ -168,27 +221,6 @@ def __init__(
)

if coords is None:
coords = {}
else:
# TODO this is only to retain backwards compatibility
# preferred would be to remove this line so autoparsing is combined with user input.
autoparse_metadata = False

if autoparse_metadata:
# TODO (Julius in #568) full hierarchy of conventions here
# but override with any user-given options

# try comodo parsing
comodo_ax_names = comodo.get_all_axes(ds)
parsed_coords = {}
for ax_name in comodo_ax_names:
parsed_coords[ax_name] = comodo.get_axis_positions_and_coords(
ds, ax_name
)

coords = parsed_coords | coords

if len(coords) == 0:
raise ValueError(
"Could not determine Axis names - please provide them in the coords kwarg "
"or provide a dataset from which they can be parsed"
Expand Down Expand Up @@ -523,6 +555,7 @@ def interp_like(self, array, like, boundary=None, fill_value=None):
DataArray with desired grid positions for source array
boundary : str or dict, optional,
boundary can either be one of {None, 'fill', 'extend', 'extrapolate'}
* None: Do not apply any boundary conditions. Raise an error if
boundary conditions are required for the operation.
* 'fill': Set values outside the array boundary to fill_value
Expand All @@ -532,6 +565,7 @@ def interp_like(self, array, like, boundary=None, fill_value=None):
the difference at the boundary will be zero.)
* 'extrapolate': Set values by extrapolating linearly from the two
points nearest to the edge
This sets the default value. It can be overridden by specifying the
boundary kwarg when calling specific methods.
fill_value : float, optional
Expand Down Expand Up @@ -761,12 +795,13 @@ def apply_as_grid_ufunc(
Grid universal function signature. Specifies the xgcm.Axis names and
positions for each input and output variable, e.g.,
``"(X:center)->(X:left)"`` for ``diff_center_to_left(a)`.
``"(X:center)->(X:left)"`` for ``diff_center_to_left(a)``.
boundary_width : Dict[str, Tuple[int, int]]
The widths of the boundaries at the edge of each array.
Supplied in a mapping of the form {axis_name: (lower_width, upper_width)}.
boundary : {None, 'fill', 'extend', 'extrapolate', dict}, optional
A flag indicating how to handle boundaries:
* None: Do not apply any boundary conditions. Raise an error if
boundary conditions are required for the operation.
* 'fill': Set values outside the array boundary to fill_value
Expand All @@ -775,6 +810,7 @@ def apply_as_grid_ufunc(
value. (i.e. a limited form of Neumann boundary condition.)
* 'extrapolate': Set values by extrapolating linearly from the two
points nearest to the edge
Optionally a dict mapping axis name to separate values for each axis
can be passed.
fill_value : {float, dict}, optional
Expand Down Expand Up @@ -1269,6 +1305,7 @@ def interp_2d_vector(self, vector, **kwargs):
(i.e. a Dirichlet boundary condition.)
* 'extend': Set values outside the array to the nearest array
value. (i.e. a limited form of Neumann boundary condition.)
fill_value : float, optional
The value to use in the boundary condition with `boundary='fill'`.
vector_partner : dict, optional
Expand Down
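
The six near-identical ``if`` blocks added to ``Grid.__init__`` in the xgcm/grid.py diff above all follow one rule: if the user left a kwarg as ``None``, take the autoparsed value; if both the user and the parser supplied a value, record a conflict and raise. A minimal sketch of that rule written as a loop is shown below. The helper ``merge_parsed_kwargs`` and the constant ``GRID_KWARG_NAMES`` are hypothetical names for illustration and are not part of xgcm.

```python
# Keyword names checked one by one in the Grid.__init__ hunk above.
GRID_KWARG_NAMES = (
    "coords",
    "fill_value",
    "default_shifts",
    "boundary",
    "face_connections",
    "metrics",
)


def merge_parsed_kwargs(user_kwargs, parsed_kwargs):
    """Hypothetical helper (not part of xgcm): combine user and parsed kwargs."""
    merged = dict(user_kwargs)
    duplicates = []
    for name in GRID_KWARG_NAMES:
        if name not in parsed_kwargs:
            continue
        if merged.get(name) is None:
            # The user gave nothing for this kwarg, so use the autoparsed value.
            merged[name] = parsed_kwargs[name]
        else:
            # Both sources supplied a value: record the conflict.
            duplicates.append(name)
    if duplicates:
        raise ValueError(
            f"Autoparsed Grid kwargs: '{', '.join(duplicates)}' conflict with "
            "user-supplied kwargs."
        )
    return merged


parsed = {"coords": {"X": {"center": "x_c", "left": "x_g"}}}
print(merge_parsed_kwargs({"coords": None, "boundary": "fill"}, parsed))
```

Passing a non-``None`` ``coords`` together with parsed ``coords`` raises ``ValueError``, which corresponds to the situation the documentation changes address by adding ``autoparse_metadata=False`` wherever coordinates are supplied by hand.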
