Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move .rolling_exp functions from reduce to apply_ufunc #8114

Merged
merged 6 commits into from Sep 19, 2023

Conversation

max-sixty
Copy link
Collaborator

@max-sixty max-sixty commented Aug 24, 2023

A similar change should solve #6528, but let's get one finished first...

Posting for discussion, will comment inline Ready for merge

xarray/core/rolling_exp.py Outdated Show resolved Hide resolved
xarray/core/rolling_exp.py Outdated Show resolved Hide resolved
xarray/core/rolling_exp.py Outdated Show resolved Hide resolved
xarray/core/rolling_exp.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added topic-backends topic-plotting topic-cftime CI Continuous Integration tools dependencies Pull requests that update a dependency file topic-typing io and removed topic-backends topic-plotting topic-cftime CI Continuous Integration tools dependencies Pull requests that update a dependency file topic-typing io labels Sep 17, 2023
@max-sixty
Copy link
Collaborator Author

After #8183, that's pretty nice now!

I think we could start to move some of these over.

There is one failing test, but that seems like the test is testing for something apply_ufunc's keep_attrs functionality doesn't offer. Not sure whether that's an issue with the test of apply_func...

@max-sixty
Copy link
Collaborator Author

Could anyone who has context or views on attrs offer direction on this test failure?

TestDatasetRollingExp.test_rolling_exp_keep_attrs[1-numpy]

AssertionError: assert {'attr': 'z1'} == {}
  Left contains 1 more item:
  {'attr': 'z1'}
  Full diff:
  - {}
  + {'attr': 'z1'}

Otherwise I think we remove the test and fall back to the apply_ufunc behavior — I'm guessing that behavior is better tested than the current .rolling_exp...

@mathause
Copy link
Collaborator

apply_ufunc cannt not preserve `Dataset-level attributes. Interesting that this has not been reported - I'll open an issue.

I want to see this fixed but ok to drop the test for now.

@max-sixty
Copy link
Collaborator Author

Great, thanks @mathause — done

@kmuehlbauer
Copy link
Contributor

@max-sixty @mathause FYI, to implement keep_attrs for that it only need change 3 lines of code. At least to make the test work.

Index: xarray/core/computation.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/xarray/core/computation.py b/xarray/core/computation.py
--- a/xarray/core/computation.py	(revision d2940ac3db7665d4e1bf7ccfd1775eb758cfb02a)
+++ b/xarray/core/computation.py	(revision b833560b704fc5f1edc8f45de505a9b385e86549)
@@ -433,6 +433,7 @@
     join="inner",
     fill_value=None,
     on_missing_core_dim: MissingCoreDimOptions = "raise",
+    keep_attrs="override",
 ):
     """Apply a variable level function over dicts of DataArray, DataArray,
     Variable and ndarray objects.
@@ -445,7 +446,7 @@
     for name, variable_args in zip(names, grouped_by_name):
         core_dim_present = _check_core_dims(signature, variable_args, name)
         if core_dim_present is True:
-            result_vars[name] = func(*variable_args)
+            result_vars[name] = func(*variable_args, keep_attrs=keep_attrs)
         else:
             if on_missing_core_dim == "raise":
                 raise ValueError(core_dim_present)
@@ -522,6 +523,7 @@
         join=dataset_join,
         fill_value=fill_value,
         on_missing_core_dim=on_missing_core_dim,
+        keep_attrs=keep_attrs,
     )
 
     out: Dataset | tuple[Dataset, ...]

@max-sixty
Copy link
Collaborator Author

@max-sixty @mathause FYI, to implement keep_attrs for that it only need change 3 lines of code. At least to make the test work.

Nice! Added...

@max-sixty max-sixty changed the title Explore moving functions from reduce to apply_ufunc Move .rolling_exp functions from reduce to apply_ufunc Sep 18, 2023
@max-sixty max-sixty marked this pull request as ready for review September 18, 2023 07:23
@max-sixty
Copy link
Collaborator Author

Nice! Added...

But actually I'm not seeing it working — have I made a mistake? Otherwise I can revert that commit and we can split into a separate PR

@@ -445,7 +446,7 @@ def apply_dict_of_variables_vfunc(
for name, variable_args in zip(names, grouped_by_name):
core_dim_present = _check_core_dims(signature, variable_args, name)
if core_dim_present is True:
result_vars[name] = func(*variable_args)
result_vars[name] = func(*variable_args, keep_attrs=keep_attrs)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that can be any function and there is no guarantee that it will consume this argument (admittedly I am surprised why this does not error in our test suite - so I may be wrong). Also isn't this on the DataArray level while the missing attrs are on the Dataset level?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good chance that we are undertesting here. But the error is raised for the attribute on z1 (DataArray level). And this propagates keep_attrs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK this only works for core_dim_present @max-sixty, so the attributes would also have to be removed on on_missing_core_dim == "copy". But as @mathause already mentioned, it's debatable if it should loose it's attrs.

@kmuehlbauer
Copy link
Contributor

Nice! Added...

But actually I'm not seeing it working — have I made a mistake? Otherwise I can revert that commit and we can split into a separate PR

Hmm, yes probably good to revert and handle in another PR. I'll check again locally, good chance I've messed up my branch.

@mathause
Copy link
Collaborator

@kmuehlbauer & @max-sixty Sorry - I was wrong (one shouldn't try to figure these things out on the phone display...).

I think the issue is that the attrs are not dropped when a DataArray is copied (so pretty much the exact opposite of what I thought before 😂). Here is a small repro:

import xarray as xr
air = xr.tutorial.open_dataset("air_temperature")
air["a"] = 1
air.a.attrs = {"key": "value"}
xr.apply_ufunc(
    lambda x: x.mean(axis=-1),
    air,
    input_core_dims=[["time"]],
    keep_attrs=False,
    on_missing_core_dim="copy"
).a

a still has its attrs (one could debate if it should loose them or not...)

@max-sixty
Copy link
Collaborator Author

max-sixty commented Sep 18, 2023

OK, I reverted the suggested changes. Unless there's strong disagreement, let's make the more general changes to apply_ufunc in a separate PR?

@max-sixty max-sixty added the plan to merge Final call for comments label Sep 18, 2023
@max-sixty max-sixty merged commit 828ea08 into pydata:main Sep 19, 2023
30 checks passed
@max-sixty max-sixty deleted the apply_ufunc branch September 19, 2023 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments topic-rolling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rolling_exp loses coords
4 participants