Fix numbagg aggregations #282
OK awesome, thanks a lot; let me know if there's anything you need on the numbagg side!
FYI I'm running into a major blocker with how Xarray handles groups with all-NaN entries, and groups with no entries. I think you should just choose numbagg explicitly using the
262bb4a to 3454db9
Thanks a lot for doing all these; it looks like difficult and finicky work. Let me know what I can do on the numbagg side to make it less of a burden on your end, particularly things that are bad or wrong in numbagg's behavior (I'm still happy to fix bad upstream conventions if it makes your job much easier, though).
I'm not sure you should; it's really an Xarray annoyance:

import pandas as pd
import numpy as np
from xarray import Dataset

times = pd.date_range("2000-01-01", freq="6H", periods=10)
ds = Dataset(
    {
        "bar": ("time", [1, 2, 3, np.nan, np.nan, np.nan, 4, 5, np.nan, np.nan], {"meta": "data"}),
        "time": times,
    }
)
expected_time = pd.date_range("2000-01-01", freq="3H", periods=19)
expected = ds.reindex(time=expected_time)
ds.resample(time="3H").sum().bar.data
# array([ 1., nan, 2., nan, 3., nan, 0., nan, 0., nan, 0., nan, 4., nan, 5., nan, 0., nan, 0.])

^ It's NaN when there are no observations in the window, and 0 if there are only NaNs in the window. Both numpy_groupies and numbagg would just give you all 0s (i.e. the identity element), which is sensible to me. The Xarray behaviour is really an artifact of the fact that we accumulate
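To make the distinction above concrete, here is a minimal plain-Python sketch of a NaN-skipping grouped sum. It is not how Xarray, numbagg, or numpy_groupies implement this; the `grouped_nansum` helper and its `fill_empty` parameter are hypothetical names used only for illustration. It assumes integer group codes, where a group may have no members at all (empty) or only NaN members (all-NaN).

```python
import math


def grouped_nansum(values, codes, ngroups, fill_empty=0.0):
    """Sum `values` per integer group code, skipping NaNs.

    Hypothetical helper for illustration only. Groups whose members are
    all NaN sum to 0.0 (the identity element of addition). Groups with no
    members at all receive `fill_empty`.
    """
    sums = [0.0] * ngroups
    counts = [0] * ngroups  # members per group, NaN members included
    for value, code in zip(values, codes):
        counts[code] += 1
        if not math.isnan(value):
            sums[code] += value
    return [s if n > 0 else fill_empty for s, n in zip(sums, counts)]


values = [1.0, math.nan, math.nan, 4.0]
codes = [0, 2, 2, 3]  # group 1 has no members; group 2 has only NaNs

# Identity-element convention: empty and all-NaN groups both give 0.
identity_style = grouped_nansum(values, codes, ngroups=4)

# Convention matching the resample output above: empty groups give NaN,
# all-NaN groups still give 0.
nan_for_empty = grouped_nansum(values, codes, ngroups=4, fill_empty=math.nan)

print(identity_style)  # [1.0, 0.0, 0.0, 4.0]
print(nan_for_empty)   # [1.0, nan, 0.0, 4.0]
```

The only difference between the two conventions in this sketch is the fill value applied to groups that never received an observation, which is why tracking per-group membership counts is enough to convert one into the other.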
Great, definitely agree. Possibly we could change that. Pandas even does the arguably more logical thing!
Sweet, looks like we are actually using numbagg by default now. I don't understand the first row for
Nice!! And I recently enabled
Closes #281
TODO:
save seen_groups in factorize_