Quantity.prod() result unit is incorrect with axis or where argument #867
965: Raise ValueError on ambiguous boolean conversion and add pending NumPy functions/ufuncs from issue tracker r=hgrecco a=jthielen

There were a few follow-up issues from #905 that were missing fairly simple implementations, so this PR takes care of them in preparation for the upcoming release. Part of this was resolving #866 by raising a `ValueError` due to ambiguity when casting Quantities with offset units to boolean, which is technically a breaking change (although I would argue that the previous behavior was incorrect as discussed in #866).

Also, @keewis, this implements `np.any` and `np.all`, which are two of the functions you had mentioned were needed by xarray. I think now the only one still missing is `np.prod` (#867) which will likely have to wait since there hasn't been a good solution yet.

- [x] Closes #419; Closes #470; Closes #807; Closes #866
- [x] Executed ``black -t py36 . && isort -rc . && flake8`` with no errors
- [x] The change is fully covered by automated unit tests
- [x] Documented in docs/ as appropriate
- [x] Added an entry to the CHANGES file

Co-authored-by: Jon Thielen <github@jont.cc>
couldn't we use the result of I'm not sure how to compute the units if both I'm thinking of something like

    def prod(quantity, axis=None, *args, **kwargs):
        def compute_units(unit, in_size, out_size, axis, where):
            if axis is not None and where is not None:
                return ureg.dimensionless
            if where is not None:
                exponent = np.sum(where)
            else:
                exponent = in_size // out_size
            return unit ** exponent

        result = np.prod(quantity.magnitude, axis=axis, *args, **kwargs)
        units = compute_units(
            quantity.units,
            quantity.size,
            result.size,
            axis,
            kwargs.get("where"),
        )
        return ureg.Quantity(result, units)

(but the default value of |
@keewis Good point that in the either/or case of Cases that I had in mind like the following are well-defined, but couldn't come up with a good way to handle in general:

    np.prod(np.arange(6).reshape((2, 3)) * ureg.meter, axis=1, where=[True, False, True])
|
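To make it concrete why that case is well-defined, here is a minimal sketch using plain NumPy on the magnitudes only (no Pint involved; the names `magnitudes`, `mask`, and `factors_per_row` are made up for illustration). Broadcasting the 1-D `where` against the `(2, 3)` array and summing along `axis=1` counts how many factors go into each output element:

```python
import numpy as np

# Magnitudes of the array from the example above (units left out on purpose).
magnitudes = np.arange(6).reshape((2, 3))
where = np.array([True, False, True])

# Product of the selected magnitudes along axis 1.
result = np.prod(magnitudes, axis=1, where=where)

# Broadcast the mask to the array's shape and count the selected
# factors that contribute to each output element.
mask = np.broadcast_to(where, magnitudes.shape)
factors_per_row = mask.sum(axis=1)

print(result.tolist())           # [0, 15]
print(factors_per_row.tolist())  # [2, 2]
```

Every output element is a product of the same number of factors here, so a single exponent (2) would apply to the whole result.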
I guess we could try to filter the cases where this works. I've been trying something like

    a = np.linspace(1, 2, 20).reshape(4, 5) * ureg.m
    where = a < 1.3

for which we obviously cannot compute a common unit. I think with a reshaped

    axis = 1
    exponent = np.sum(where, axis=axis)

we could use In my example (threshold
while with a threshold of
So the former would be |
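For reference, a small sketch of what those per-row counts look like for the `1.3` threshold, again on bare magnitudes without Pint (the name `distinct` is only illustrative):

```python
import numpy as np

# Same array as in the comment above, but without units attached.
a = np.linspace(1, 2, 20).reshape(4, 5)
where = a < 1.3

axis = 1
# Number of selected factors in each row of the result.
exponent = np.sum(where, axis=axis)

# Distinct non-zero counts; more than one means the rows of the
# result would need different unit exponents.
distinct = sorted(set(exponent.tolist()) - {0})

print(exponent.tolist())  # [5, 1, 0, 0]
print(distinct)           # [1, 5]
```

Two different non-zero counts show up, which is why no common unit exists for this mask.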
@keewis That almost all sounds great. I would disagree though that the former example should return units of |
I wasn't sure if returning |
Indeed, it tries to convert to |
what I wanted to suggest was that we replicate that behavior (sorry if that was ambiguous): if we can't determine a common unit and the input unit is incompatible to Edit: maybe we should do that before the (potentially expensive) computation of the result |
Ah, got it, sorry for the confusion. Doing that check ahead of time does seem like a good idea. |
I think this issue should be reopened, since #1087 didn't really resolve it directly. #1087 was great in that it took care of implementing |
1120: revise the unit computation for np.prod r=hgrecco a=keewis

#1087 left out the case where `axis` and `where` are specified and also used the size of the result to compute the output unit if only `axis` was specified. This changes that to use `a.shape[axis]` instead and also implements the support for both `axis` and `where` by broadcasting `where` against the array, applying `np.sum` along `axis` and using the only one unique value (`0` doesn't count) as an exponent. In case there's more than that, it will try to cast to `dimensionless`.

I'm not quite sure if using `np.broadcast_arrays` is the best way to get the exponents, though.

Edit: **Todo**: make the error message easier to understand

- [x] Closes #867
- [x] Executed ``black -t py36 . && isort -rc . && flake8`` with no errors
- [x] The change is fully covered by automated unit tests
- [ ] Documented in docs/ as appropriate
- [ ] Added an entry to the CHANGES file

Co-authored-by: Keewis <keewis@posteo.de>
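The exponent rule described in that PR message could be sketched roughly as follows. This is not the code from #1120: the helper name `prod_unit_exponent` is hypothetical, it works on shapes and masks only (no Pint), it uses `np.broadcast_to` rather than `np.broadcast_arrays`, and it only handles an integer `axis`. Returning `None` stands in for "no single exponent, so the caller would fall back to trying `dimensionless`"; how the merged code treats an all-`False` mask is not stated above, so returning `None` there is an assumption too.

```python
import numpy as np


def prod_unit_exponent(shape, axis=None, where=None):
    """Return the one unit exponent for a product over `shape`, or None."""
    if where is None:
        if axis is None:
            # Full reduction: every element is multiplied in.
            return int(np.prod(shape, dtype=int))
        # Reduction along one axis: a.shape[axis] factors per output element.
        return shape[axis]

    # Broadcast the mask against the array and count the selected factors
    # that go into each output element.
    mask = np.broadcast_to(np.asarray(where, dtype=bool), shape)
    counts = np.sum(mask, axis=axis)

    # Zero counts are ignored; exactly one remaining value is the exponent.
    exponents = np.unique(counts)
    exponents = exponents[exponents != 0]
    if exponents.size == 1:
        return int(exponents[0])
    return None  # assumption: caller then tries a dimensionless result


a = np.linspace(1, 2, 20).reshape(4, 5)
print(prod_unit_exponent((2, 3)))                                      # 6
print(prod_unit_exponent((2, 3), axis=1))                              # 3
print(prod_unit_exponent((2, 3), axis=1, where=[True, False, True]))   # 2
print(prod_unit_exponent(a.shape, axis=1, where=a < 1.3))              # None
```

Because the helper only needs the shape and the mask, the check can run before the (potentially expensive) product itself, as suggested earlier in the thread.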
Right now, the `.prod()` method assumes the full size of the input is collapsed in the result. This gives incorrect results when the `axis` or `where` arguments are supplied, as seen below:

The unit on the first result, where the full array is collapsed to a scalar, is correct, but the other two results should have `meter ** 2` as the unit.

I don't have a good fix for this yet, since this is the same problem mentioned in #764 (comment). If anyone has any suggestions for a performant way to determine how many unit multiplications occur given both `axis` and `where` arguments, please do let me know! Otherwise, I'll try to figure something out and get a PR in for this or include it alongside a `prod` implementation for `__array_function__`.