compose weighted with groupby, coarsen, resample, rolling etc. #3937
I think we could probably make both
I want to do something similar to xesmf's weighted regridding, but without the need to install esmpy, which has a lot of dependencies. Are variations of the following possible?
That's currently not possible. What you can try is the following:

but this only works if you don't have any NaNs.
@mathause Does the
No, that won't work. You need to mask the weights where the data is NaN. An untested and not very efficient way may be:

```python
def coarsen_weighted_mean(da, weights, dims, skipna=None, boundary="exact"):
    weighted_sum = (da * weights).coarsen(dims, boundary=boundary).sum(skipna=skipna)
    masked_weights = weights.where(da.notnull())
    sum_of_weights = masked_weights.coarsen(dims, boundary=boundary).sum()
    valid_weights = sum_of_weights != 0
    sum_of_weights = sum_of_weights.where(valid_weights)
    return weighted_sum / sum_of_weights
```

An example (without NaNs though):

```python
import xarray as xr
import numpy as np

air = xr.tutorial.open_dataset("air_temperature").air
weights = np.cos(np.deg2rad(air.lat))

# rename the weights so they are not named "lat"
weights.name = "weights"

c_w_m = coarsen_weighted_mean(air, weights, dict(lat=2), boundary="trim")

# to compare, do the same for one slice
alt = air.isel(lat=slice(0, 2)).weighted(weights).mean("lat")

# check that both give the same result
xr.testing.assert_allclose(c_w_m.isel(lat=0, drop=True), alt)
```
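Since the example above has no NaNs, here is a quick synthetic check (made-up data, no tutorial dataset needed) that the masking in `coarsen_weighted_mean` behaves as intended; the helper is repeated so the snippet runs standalone:

```python
import numpy as np
import xarray as xr

# repeat the helper from the comment above so this snippet is self-contained
def coarsen_weighted_mean(da, weights, dims, skipna=None, boundary="exact"):
    weighted_sum = (da * weights).coarsen(dims, boundary=boundary).sum(skipna=skipna)
    masked_weights = weights.where(da.notnull())
    sum_of_weights = masked_weights.coarsen(dims, boundary=boundary).sum()
    sum_of_weights = sum_of_weights.where(sum_of_weights != 0)
    return weighted_sum / sum_of_weights

data = np.arange(8.0).reshape(4, 2)
data[0, 0] = np.nan  # one missing value in the first block
da = xr.DataArray(data, dims=["lat", "lon"])
weights = xr.DataArray([1.0, 2.0, 1.0, 3.0], dims="lat")

res = coarsen_weighted_mean(da, weights, dict(lat=2), skipna=True)

# in the first (lat, lon=0) block only the lat=1 value (2.0) is valid,
# so the weighted mean collapses to exactly 2.0
```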
Would `da.weighted(weights).integrate(...)` be in scope as well?
I am interested in the coarsen-with-weights scenario that @dcherian and @mathause described here for a current project of ours. I solved the issue manually, and it's not that hard:

```python
import xarray as xr
import numpy as np

# example data with weights
data = np.arange(16).reshape(4, 4).astype(float)

# add some NaNs
data[2, 2] = np.nan
data[1, 1] = np.nan

# create some simple weights
weights = np.repeat(np.array([[1, 2, 1, 3]]).T, 4, axis=1)

da = xr.DataArray(data, dims=["x", "y"], coords={"w": (["x", "y"], weights)})

masked_weights = da.w.where(~np.isnan(da))  # .weighted() already knows how to do this
da_weighted = da * masked_weights
da_coarse = da_weighted.coarsen(x=2, y=2).sum() / masked_weights.coarsen(x=2, y=2).sum()
da_coarse
```

but I feel all of this is duplicating existing functionality (e.g. the masking of weights based on NaNs in the data) and might be sensibly streamlined into something like `da.weighted(da.w).coarsen(...).mean()`, at least from a user perspective (there might be unique challenges with the implementation that I am overlooking here). Happy to help, but I would definitely need some guidance on this one. I do believe this would provide very useful functionality for many folks who work with curvilinear grids and want to prototype things that depend on some sort of scale reduction (coarsening). Also cc'ing @TomNicholas, who is involved in the same project 🤗
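As a sanity check, one coarsened block of the manual approach above can be compared against the existing `.weighted(...).mean()` machinery; a sketch using the same example data:

```python
import numpy as np
import xarray as xr

data = np.arange(16).reshape(4, 4).astype(float)
data[2, 2] = np.nan
data[1, 1] = np.nan
weights = np.repeat(np.array([[1.0, 2.0, 1.0, 3.0]]).T, 4, axis=1)

da = xr.DataArray(data, dims=["x", "y"], coords={"w": (["x", "y"], weights)})

# manual weighted coarsening, as in the comment above
masked_weights = da.w.where(~np.isnan(da))
da_coarse = (da * masked_weights).coarsen(x=2, y=2).sum() / masked_weights.coarsen(x=2, y=2).sum()

# cross-check the first 2x2 block against .weighted(...).mean(),
# which masks NaN data internally
block = da.isel(x=slice(0, 2), y=slice(0, 2))
expected = block.weighted(block.w).mean(["x", "y"])
```

Both paths should give the same value for that block, which is what makes the duplication argument above compelling.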
It would be nice to make `weighted` work with `groupby` - e.g. #3935 (comment). However, it is not entirely clear to me how that should be done. One way would be to do:

this would require that the `groupby` operation is applied over the `weights` (how would this be done?)

Or should it be

but this seems less intuitive to me.

Or
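Until `weighted` composes with `groupby`, the same reduction can be emulated by hand, applying one `groupby` to the numerator and one to the masked weights; a sketch with made-up data and a hypothetical `label` grouping coordinate:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0],
    dims="time",
    coords={"label": ("time", ["a", "a", "b", "b"])},
)
weights = xr.DataArray([1.0, 3.0, 1.0, 1.0], dims="time")

# mask the weights wherever the data is NaN, then reduce both the
# numerator and the denominator with the same groupby
masked_weights = weights.where(da.notnull())
num = (da * masked_weights).groupby(da.label).sum()
den = masked_weights.groupby(da.label).sum()
result = num / den
```

This is the same sum-of-weighted-values over sum-of-weights pattern as the coarsen workaround, just with `groupby` as the reduction.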