Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coordinates object permits invalid state #8883

Closed
5 tasks done
TomNicholas opened this issue Mar 28, 2024 · 2 comments · Fixed by #8886
Closed
5 tasks done

Coordinates object permits invalid state #8883

TomNicholas opened this issue Mar 28, 2024 · 2 comments · Fixed by #8886

Comments

@TomNicholas
Copy link
Contributor

TomNicholas commented Mar 28, 2024

What happened?

It is currently possible to create a Coordinates object where a variable shares a name with a dimension, but the variable is not 1D. This is explicitly forbidden by the xarray data model.

What did you expect to happen?

If you try to pass the resulting object into the Dataset constructor you get the expected error telling you that this is forbidden, but that error should have been raised by Coordinates.__init__.

Minimal Complete Verifiable Example

In [1]: from xarray.core.coordinates import Coordinates

In [2]: from xarray.core.variable import Variable

In [4]: import numpy as np

In [5]: var = Variable(data=np.arange(6).reshape(2, 3), dims=['x', 'y'])

In [6]: var
Out[6]: 
<xarray.Variable (x: 2, y: 3)> Size: 48B
array([[0, 1, 2],
       [3, 4, 5]])

In [7]: coords = Coordinates(coords={'x': var}, indexes={})

In [8]: coords
Out[8]: 
Coordinates:
    x        (x, y) int64 48B 0 1 2 3 4 5

In [10]: import xarray as xr

In [11]: ds = xr.Dataset(coords=coords)
---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 ds = xr.Dataset(coords=coords)

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:693, in Dataset.__init__(self, data_vars, coords, attrs)
    690 if isinstance(coords, Dataset):
    691     coords = coords._variables
--> 693 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    694     data_vars, coords
    695 )
    697 self._attrs = dict(attrs) if attrs else None
    698 self._close = None

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:422, in merge_data_and_coords(data_vars, coords)
    418     coords = create_coords_with_default_indexes(coords, data_vars)
    420 # exclude coords from alignment (all variables in a Coordinates object should
    421 # already be aligned together) and use coordinates' indexes to align data_vars
--> 422 return merge_core(
    423     [data_vars, coords],
    424     compat="broadcast_equals",
    425     join="outer",
    426     explicit_coords=tuple(coords),
    427     indexes=coords.xindexes,
    428     priority_arg=1,
    429     skip_align_args=[1],
    430 )

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:731, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value, skip_align_args)
    729     coord_names.intersection_update(variables)
    730 if explicit_coords is not None:
--> 731     assert_valid_explicit_coords(variables, dims, explicit_coords)
    732     coord_names.update(explicit_coords)
    733 for dim, size in dims.items():

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:577, in assert_valid_explicit_coords(variables, dims, explicit_coords)
    575 for coord_name in explicit_coords:
    576     if coord_name in dims and variables[coord_name].dims != (coord_name,):
--> 577         raise MergeError(
    578             f"coordinate {coord_name} shares a name with a dataset dimension, but is "
    579             "not a 1D variable along that dimension. This is disallowed "
    580             "by the xarray data model."
    581         )

MergeError: coordinate x shares a name with a dataset dimension, but is not a 1D variable along that dimension. This is disallowed by the xarray data model.

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

I noticed this whilst working on #8872

Environment

main

@TomNicholas TomNicholas added this to To do in Explicit Indexes via automation Mar 28, 2024
@benbovy
Copy link
Member

benbovy commented Mar 28, 2024

It is currently possible to create a Coordinates object where a variable shares a name with a dimension, but the variable is not 1D. This is explicitly forbidden by the xarray data model.

Shouldn't we allow this instead of raising an error? It is now allowed by the xarray data model and supported when opening a dataset from a file (#7989).

@TomNicholas
Copy link
Contributor Author

Oh right! I've opened #8886 to allow it instead.

Explicit Indexes automation moved this from To do to Done Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
2 participants