-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Index.validate_dataarray_coord #10137
base: main
Are you sure you want to change the base?
Conversation
... when check_default_indexes=False.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick thoughts
xarray/core/dataset.py
Outdated
k, self._variables[k], needed_dims | ||
) | ||
coords[k] = self._variables[k] | ||
except ValueError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ValueError
can be raised by many different things going wrong in Python. If we're going to check specifically for an error, we should only catch a specific Exception subclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functions with a leading underscore are marked by pyright as unused if they are not used from within the module in which they are defined. Also remove unneeded nested import.
Move check_dataarray_coords in xarray.core.coordinates module and rename it to validate_dataarray_coords (name consistent with Index.validate_dataarray_coord). Move CoordinateValidationError from xarray.core.indexes to xarray.core.coordinates module.
Thanks @shoyer for taking a look! This is now ready for (another round of) review. Compared to #10116 this PR still represents a major data model change for DataArray (in the sense that it allows overriding strict enforcement of the DataArray model in certain cases), but I think that the risk is mitigated since it is here done explicitly via @dcherian regarding your #10116 (comment), in the example notebook the DataArray reprs shown without any list of dimensions for the Coordinates section do not strike me much. That said, the |
whats-new.rst
api.rst
Same goal than #10116 using the alternative approach suggested in #10116 (comment) where the propagation (validation) of coordinates in a DataArray is delegated to their index (if any).
I find this approach cleaner than #10116. Here is the same notebook example adapted for this PR.
Index API alternatives
Index.validate_dataarray_coord(self, name: Hashable, var: Variable, dims: set[Hashable]) -> None
Index.validate_dataarray_coords(self, variables: dict[Hashable, Variable], dims: set[Hashable]) -> None
Indexes.group_by_index
while in option 1 we simply need to iterate over_variables
,_coords
and/or_indexes
).Index.validate_dataarray_coords(self, variables: dict[Hashable, Variable], dims: set[Hashable]) -> Coordinates
Option 1, the simplest and working well with IntervalIndex, is currently implemented in this PR.
cc @shoyer @dcherian