apply_ufunc: don't modify attrs on input variables #10330
Merged
The additional tests in test_ufuncs.py and test_computation.py currently fail due to GH8047, because apply_ufunc() relies on merge_coordinates_without_align(), which may overwrite .attrs on input coordinates. The additional tests in test_merge.py currently pass, because Dataset.merge() doesn't seem to be affected by the bug.
Calls to xarray.apply_ufunc() (which is used by, for instance, xarray.where()) have the following call stack:
core.apply_ufunc.apply_ufunc()
core.apply_ufunc.apply_dataarray_vfunc()
core.apply_ufunc.build_output_coords_and_indexes()
structure.merge.merge_coordinates_without_align()
structure.merge.merge_collected()
In merge_collected(), the .attrs of a coordinate in an original input array could be overwritten depending on combine_attrs, even if the intent was just to produce the desired attributes for the returned result. This very simple fix always makes a copy before assigning attributes.
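The aliasing problem described above can be illustrated without xarray. What follows is a minimal sketch in plain Python: the Var class and both merge functions are hypothetical stand-ins, not xarray's actual implementation. It shows why assigning .attrs on an object taken directly from the inputs leaks back into those inputs, and how a shallow copy avoids that while still sharing the underlying data.

```python
import copy


class Var:
    """Hypothetical stand-in for a coordinate variable (not xarray's Variable)."""

    def __init__(self, data, attrs=None):
        self.data = data
        self.attrs = dict(attrs or {})


def merge_in_place(variables, new_attrs):
    # Buggy pattern: the merged result is the very same object as the first
    # input, so assigning .attrs also changes the caller's variable.
    merged = variables[0]
    merged.attrs = dict(new_attrs)
    return merged


def merge_with_copy(variables, new_attrs):
    # Fixed pattern: shallow-copy first, then assign attributes to the copy.
    # Only the wrapper object is new; the data is still shared.
    merged = copy.copy(variables[0])
    merged.attrs = dict(new_attrs)
    return merged


# Dropping attributes in place also empties the input's attrs.
x_in = Var([0, 1, 2], {"units": "m"})
merge_in_place([x_in], {})
buggy_input_attrs = dict(x_in.attrs)

# With a shallow copy, the input keeps its attrs and the data is not duplicated.
x_in = Var([0, 1, 2], {"units": "m"})
out = merge_with_copy([x_in], {})
fixed_input_attrs = dict(x_in.attrs)
shares_data = out.data is x_in.data
```

Because only the wrapper object is copied, the cost of this pattern is one small allocation per coordinate rather than a copy of the array itself.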
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
dcherian approved these changes on May 27, 2025
Amazing, thanks!
Fixes #8047 (and at least the duplicate #8480) by copying coordinate variables before overwriting their .attrs in structure.merge.merge_collected().
In a call to apply_ufunc(), the output coordinates and their attributes are produced in merge_collected() based on the combine_attrs argument (which is "override" or "drop" depending on the keep_attrs argument). However, a side effect of the line merged_vars[name].attrs = merge_attrs(... was that the .attrs of the coordinate on the input arrays were also changed to those of the output array, i.e. dropped or replaced by those of the first input array.
The proposed solution, creating a shallow copy of the input coordinate before assigning attributes, works according to the new tests I added in test_ufuncs.py and test_computation.py, with a slight performance penalty (3% in an unpublished test I ran; I didn't install enough to run the asv performance test suite locally).
Although the last else-clause of merge_collected() has similar code assigning merged_vars[name].attrs, my tests pass without doing a copy there. Feel free to propose test cases that exercise that code (indexed_elements being empty?) and still cause a bug like #8047 or #8480.
I also extended the tests in test_merge.py to ensure that input attributes aren't affected by merge. These tests passed from the beginning, so they are not strictly related to the bug, but are meant as precautions to catch hypothetical regressions.
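The kind of check these tests perform can be sketched generically: snapshot the attributes of every input, run the operation, and compare afterwards. The helper and the Box class below are hypothetical and written only for illustration; they are not the tests added in this pull request and do not use xarray.

```python
import copy


def check_attrs_untouched(func, *inputs):
    """Call func(*inputs) and report whether every input's .attrs is unchanged."""
    # Deep-copy so that in-place mutation of the attrs dicts is detected too.
    before = [copy.deepcopy(obj.attrs) for obj in inputs]
    result = func(*inputs)
    after = [obj.attrs for obj in inputs]
    return result, before == after


class Box:
    """Hypothetical object carrying attrs, used only for this illustration."""

    def __init__(self, attrs):
        self.attrs = attrs


def well_behaved(a, b):
    # Builds the output attrs in a new dict and leaves the inputs alone.
    return Box({**a.attrs, **b.attrs})


def leaky(a, b):
    # Mutates the first input's attrs as a side effect.
    a.attrs.clear()
    return a


ok_result, ok = check_attrs_untouched(
    well_behaved, Box({"units": "m"}), Box({"long_name": "x"})
)
_, leaky_ok = check_attrs_untouched(
    leaky, Box({"units": "m"}), Box({"long_name": "x"})
)
```

Taking the snapshot with a deep copy matters here: comparing against the same dict object would not detect a mutation in place.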