Description
I've been trying to use flox for multi-dimensional binning and found the API a bit tricky to understand.
For some context, I have two variables (depth(time) and temperature(time)), which I'd like to bin into time_bounds(time, bounds) and depth_bounds(time, bounds).
I can get this to work using
import cf_xarray
import flox.xarray
arr = ds.set_coords("depth")["temperature"]
coords = [reference[name] for name in ["depth", "time"]]
vertices = [
    cf_xarray.bounds_to_vertices(reference[name], bounds_dim="bounds")
    for name in ["depth_bounds", "time_bounds"]
]
flox.xarray.xarray_reduce(
    arr,
    *coords,
    expected_groups=vertices,
    isbin=[True] * len(coords),
    func="mean",
)
but in the process of getting this right I frequently hit the "Needs better message." error from
Line 219 in 51fb6e9
raise ValueError("Needs better message.")
which certainly did not help much. Even ignoring that, though, it was pretty difficult to make sense of the combination of *by, expected_groups, and isbin, and I'm not confident I won't go through the same cycle of trial and error if I retry in a few months.
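For reference, here is the mental model I ended up with for what the call above computes, as a minimal pure-Python sketch. This is not how flox implements it: the helpers `bin_index` and `binned_mean_2d` are made up for illustration, and the sketch assumes left-closed bins (`edges[i] <= value < edges[i + 1]`), which may not match the closedness flox actually uses. Each grouping variable (the `*by` arguments) is paired positionally with one set of bin edges (`expected_groups`), and `isbin=True` means the edges are interpreted as bins rather than as exact labels.

```python
from bisect import bisect_right
from collections import defaultdict


def bin_index(value, edges):
    """Return i such that edges[i] <= value < edges[i + 1], or None if outside."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def binned_mean_2d(values, by_a, by_b, edges_a, edges_b):
    """Mean of `values` per (bin of by_a, bin of by_b); empty bins are omitted."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, a, b in zip(values, by_a, by_b):
        i, j = bin_index(a, edges_a), bin_index(b, edges_b)
        if i is None or j is None:
            continue  # sample falls outside the binned range
        sums[i, j] += v
        counts[i, j] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy data standing in for temperature(time) and depth(time).
temperature = [10.0, 12.0, 20.0, 22.0]
depth = [1.0, 2.0, 6.0, 7.0]
time = [0.5, 0.7, 1.5, 1.6]

result = binned_mean_2d(
    temperature, depth, time, edges_a=[0, 5, 10], edges_b=[0, 1, 2]
)
print(result)
```

The keys of `result` are (depth bin, time bin) index pairs, so the order of the edge lists has to match the order of the grouping variables, which is exactly the part I kept getting wrong.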
Instead, I wonder if we could change the call to something like:
bins = [
    flox.Bin(along=name, labels=reference[name], bounds=reference[f"{name}_bounds"])
    for name in ["depth", "time"]
]
flox.xarray.xarray_reduce(arr, *bins, func="mean")
(leaving aside the question of which bounds convention(s) this Bin object should support)
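To make the proposal a bit more concrete, here is a rough sketch of how such a `Bin` object could be translated into the existing arguments. Everything in it is hypothetical: `Bin` does not exist in flox, `to_reduce_kwargs` is a made-up helper, and the sketch assumes contiguous bounds of shape (n, 2) given as plain Python sequences rather than DataArrays. It also leaves `labels` unused, since how labels would be attached to the result is part of what needs deciding.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Bin:
    """Hypothetical container for one binning dimension; not part of flox."""

    along: str
    labels: Sequence[float]
    bounds: Sequence[Sequence[float]]  # pairs of (lower, upper), assumed contiguous

    def vertices(self):
        """Collapse n contiguous (lower, upper) pairs into n + 1 bin edges."""
        for (_, upper), (lower, _) in zip(self.bounds, self.bounds[1:]):
            if upper != lower:
                raise ValueError(
                    f"bounds for {self.along!r} are not contiguous: {upper} != {lower}"
                )
        return [pair[0] for pair in self.bounds] + [self.bounds[-1][1]]


def to_reduce_kwargs(bins):
    """Map Bin objects onto the by / expected_groups / isbin triple."""
    return {
        "by": [b.along for b in bins],  # would be passed positionally as *by
        "expected_groups": [b.vertices() for b in bins],
        "isbin": [True] * len(bins),
    }


bins = [
    Bin(along="depth", labels=[2.5, 7.5], bounds=[[0, 5], [5, 10]]),
    Bin(along="time", labels=[0.5, 1.5], bounds=[[0, 1], [1, 2]]),
]
kwargs = to_reduce_kwargs(bins)
print(kwargs)
```

The point of the design is that each dimension's name, labels, and bounds travel together in one object, so the three parallel lists can no longer get out of sync.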
Another option might be to just use an interval index. Something like:
flox.xarray.xarray_reduce(arr, time=pd.IntervalIndex(...), depth=pd.IntervalIndex(...), func="mean")
That would be pretty close to the existing groupby interface. And we could even combine both:
flox.xarray.xarray_reduce(
    arr,
    time=flox.Bin(labels=reference["time"], bounds=reference["time_bounds"]),
    depth=flox.Bin(labels=reference["depth"], bounds=reference["depth_bounds"]),
    func="mean",
)
xref pydata/xarray#6610, where we probably want to adopt whatever signature we figure out here. Also, do tell me if you'd prefer to have this discussion in that issue instead (but figuring this out here might allow for quicker iteration). And maybe I'm trying to get xarray_reduce to do something too similar to groupby?