ENH: Provide a way to disable flattening of 0d arrays to scalars #13105
Comments
This is a pretty ugly workaround for an inconsistency in the current design. An alternative would be to simply make it a breaking change at some point. Possibly in combination with doing something about array scalars? |
I agree it's a workaround, but it has some nice properties:
|
At least the I agree it is very ugly, but unless we think we can pull off some incompatible break which fixes this sooner rather than later, it may be a practical path. If you really want to be complete, we would be adding kwargs all over the Python API (possibly even adding inconsistencies with the C-API). That said, we have to weigh the gain and cost a bit, I suppose. If numpy were the only user, I am not sure it would be worth it, since I doubt it will help us much with the breaking change if we add a kwarg to half of the numpy API. |
more like, it makes it quite a bit harder to make a change later, since we then have keywords that we'd like to remove again but they now have users
sounds like something to avoid ....
I think the gain for most end users is fairly minimal, however for libraries that implement array-like data structures or ndarray subclasses it would be very nice to fix this. A NEP to give more context here and discuss costs/benefits could be useful indeed. |
One possible alternative would be to instead spell it as |
that doesn't really work. in particular, warning if the argument is not passed in is not a good idea. you would be forcing every single user to specify this argument. that's way worse than a breaking change.
or maybe you meant, only warn if the argument is not passed and a 0-d array is passed in? that's better but also tricky, because very few people will be testing with 0-d arrays in the first place.
That would be preferred to adding new keywords everywhere probably. However, you then can't have both |
@rgommers if out is supplied, it is used. Thus, conversion to scalars does not happen anyway in that case (and things are well behaved in that sense). So, I do think that such an overloading could probably cover all cases, and is thus likely a better hack than a new kwarg everywhere (some functions might still need to have |
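The point about out can be checked directly. The following is a minimal sketch, assuming a NumPy version in which ufuncs still unpack 0-d results; the variable names are illustrative only.

```python
import numpy as np

a = np.array(1.0)  # 0-d ndarray
b = np.array(2.0)  # 0-d ndarray

# Without out=, the 0-d result is unpacked into a NumPy scalar.
decayed = np.add(a, b)
print(type(decayed).__name__, isinstance(decayed, np.ndarray))

# With out=, the ufunc writes into and returns the supplied array,
# so no conversion to a scalar takes place.
buf = np.empty((), dtype=float)
kept = np.add(a, b, out=buf)
print(kept is buf, kept.ndim)
```

The cost of this route is that the caller has to know the output shape and dtype up front in order to allocate the buffer.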
Might add this to the breaking changes doc: https://github.com/numpy/numpy/wiki/Backwards-incompatible-ideas-for-a-major-release |
I'm not sure at all I like the new keyword. Perhaps also good to think why we may want it:
Overall, I wonder if it is really such a breaking change to at least change the output to be an array scalar if all inputs were array scalars... |
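For context on what such a change would touch: today the result type of a ufunc call does not reveal whether the inputs were scalars or 0-d arrays. A small illustration, assuming the current unpacking behaviour; describe is a hypothetical helper written only for this example.

```python
import numpy as np

def describe(value):
    """Hypothetical helper: classify a ufunc result for this illustration."""
    if isinstance(value, np.ndarray):
        return f"ndarray(ndim={value.ndim})"
    if isinstance(value, np.generic):
        return "NumPy scalar"
    return type(value).__name__

inputs = {
    "Python float": 1.5,
    "NumPy scalar": np.float64(1.5),
    "0-d ndarray": np.array(1.5),
    "1-d ndarray": np.array([1.5]),
}

# Apply the same ufunc to each kind of input and record what comes back.
results = {name: describe(np.negative(x)) for name, x in inputs.items()}
for name, kind in results.items():
    print(f"{name:>12} -> {kind}")
```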
@mhvk, I might agree with preserving scalars that were given in (though there are rough edges around non-numpy scalars maybe), and I think that reductions with I used to like that idea, but becoming less certain nowadays. On the other hand we now have logic allowing non-numpy scalars to preserve themselves (using |
Sorry, addendum: I think Eric's idea of using a singleton to signal "please return arrays" should be able to replace all occurrences where
|
@seberg - thanks for the reminder of @eric-wieser's second suggestion, of using a special singleton for About the special value: currently, the documentation states: p.s. More for completeness than preference: Other arguments that could be extended/re-used are |
That's a really good argument for using the |
The further discussion in #13100 reminded me of this one: it seems we are converging here on the idea of a special value for |
< moved to top-matter > |
Yes, and I think one could even make a decorator that does the start and end... (the |
@mhvk: I thought about that, but I wasn't sure how to handle multiple input and/or output arguments |
It might for object arrays, and it's definitely more efficient with it. |
I was thinking just about the case of a ufunc-like function that has a single out-argument. Indeed, perhaps we could have a |
@seberg - I don't think the proposal here, of allowing |
Whoops, sorry, yeah, I went through the wiki and did not look carefully enough to notice that this issue discussed the workaround (in a sense) and not a change of behaviour. |
It is a bit horrifying (and there is a lot of duplicate work going on in ufuncs that should be cleaned up :(. But I can make it work for |
(The reason for looking at it was that the fix for gh-14326 seemed way too ugly once more) |
I'm not sure I see a particularly good argument for using ... instead of a bespoke singleton. |
I don't think there is, it was just laziness. The only argument is that it fits with indexing, in the sense that in indexing |
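The indexing analogy presumably refers to how 0-d arrays behave under an empty-tuple index versus an Ellipsis index; that reading is an assumption, since the comment is cut off. The indexing behaviour itself is easy to check:

```python
import numpy as np

x = np.array(5.0)   # 0-d array

unpacked = x[()]    # empty-tuple index: unpacks to a NumPy scalar
wrapped = x[...]    # Ellipsis index: stays a 0-d ndarray (a view of x)

print(type(unpacked).__name__)
print(type(wrapped).__name__, wrapped.ndim)
print(np.shares_memory(x, wrapped))
```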
I remember there were some objections to singletons regarding |
Yeah, it is somewhat (maybe too) convenient to write. For third parties overriding with |
Considering how simple this is (in part because the arguments are parsed multiple times...), I would not even mind using a private sentinel, just to use it internally. |
Great to see an actual PR! On the sentinel, just to be sure it has been considered and rejected consciously, what about the idea of using an explicit |
@mhvk |
@seberg - I realize that one can do I guess the main question is what the hoped-for end-state is. If it is to (eventually) return array scalars for 0-dim output, then perhaps my suggestion makes that easier. Obviously, though, this presumes that very little code will rely on the output being an actual scalar rather than an array scalar. I think this may well be OK, since making it an array scalar does not limit anything (but does allow new things, like using it as an Anyway, just a thought. Your PR that at least allows one to ask for array scalars to be returned is a big improvement already. |
Fair point, although I am slightly hesitant to change behaviour to the documentation, and am not sure if I like |
Agreed. I guess options are:
|
Long term, the last option would be to preserve scalars if passed in. That is in parts weird, but especially in a world where you could already tack on I hate the automatic unwrapping in general. I think it is just wrong and was tacked on in an attempt to speed things up, missing the point that scalars are only reasonably produced in very few places (such as reductions) where it is actually not unreasonable to return true scalars (not weird array scalars). |
Good point about scalars to scalars - preserving those seems logical but then one definitely needs a Also agreed that the automatic unwrapping is just wrong (for normal calls at least), which is partially why I'm trying to think how we can get to a state where that doesn't happen... |
It is a general issue, about what
And about actually changing anything... Well, right now I do not see how to get around doing a compatibility-breaking change (at which point I suppose the new behaviour here may force array outputs, even if we decide against that for all scalar inputs). Maybe we should do a "let's see what happens" release at some point by making an env-variable which is only on during a prerelease at best. |
I would lean towards @mhvk's 4, since it gives us all the tools we need to remove internal hacks, without breaking "leaf" users (we will likely break |
Belatedly reviving the option to just not convert to scalars ever as a proposal for numpy 2.0 in #24897. |
If only given 0d inputs, even if they are of type ndarray, ufuncs will decay their output to a scalar via [()] (as noted in #4563, #5819). While we can't change this behavior now without creating significant pain downstream, we could add a way to opt out of it.
#13100 raises a case where np.fix resorts to calling np.asanyarray(np.ceil(x, out=out)) in order to ensure that the result is an array. Unfortunately, this has a number of drawbacks:
- dask arrays
Proposed implementation:
- A np.leave_wrapped sentinel object that can be passed as an out argument
- Support in ufunc.__call__, np.take, ... for passing out=np.leave_wrapped, meaning "do not call PyArray_Return", causing the result to never be a scalar
- Expose PyArray_Return to Python as np.core.unpack_scalar
- np.fix as:
Original unpack_scalars=True proposal
- An unpack_scalars=True kwarg to ufunc.__call__, ufunc.reduce. When False, the current behavior of going through PyArray_Return is disabled. Alternative names:
  - decay=True
  - unpack_0d=True
  - unpack_0d_ndarray=True (PyArray_Return already does not apply to subclasses)
- A np.unpack_scalar(arr) function to expose PyArray_Return to Python code. This would not be overloadable with __array_function__, since existing uses of PyArray_Return are also not.
With these changes, the current implementation of np.fix would change from:
to
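Neither np.leave_wrapped nor np.unpack_scalar exists in NumPy, so the intended shape of the change can only be approximated. The sketch below is not the code from the proposal: unpack_scalar and fix_sketch are hypothetical names, an explicitly allocated output buffer stands in for the sentinel, and only float input is handled (duck arrays such as dask are coerced, which the real proposal aims to avoid).

```python
import numpy as np

def unpack_scalar(arr):
    """Hypothetical stand-in for the proposed np.unpack_scalar.

    Unpacks exact 0-d ndarrays to scalars. Subclasses and all other
    objects are returned unchanged (an assumption based on the note
    that the unpacking does not apply to subclasses).
    """
    if type(arr) is np.ndarray and arr.ndim == 0:
        return arr[()]
    return arr

def fix_sketch(x, out=None):
    """Round toward zero without wrapping a ufunc call in np.asanyarray."""
    x = np.asarray(x, dtype=float)
    # Stand-in for out=np.leave_wrapped: with a real output array the
    # ufunc result is never unpacked to a scalar along the way.
    res = np.empty_like(x) if out is None else out
    np.ceil(x, out=res)
    # Overwrite the non-negative entries with their floor.
    np.floor(x, out=res, where=x >= 0)
    # Unpack only at the very end, and only if the caller gave no out.
    return unpack_scalar(res) if out is None else res

print(repr(fix_sketch(-2.7)))
print(fix_sketch(np.array([2.7, -2.7])))
```

The design point is that the intermediate result stays an array throughout, and the decision to unpack is made once, explicitly, at the end of the function.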