API,ENH: Allow forcing an array result in ufuncs #28576

seberg · 2025-03-24T11:22:43Z

As of now a WIP, because it would be nice to see what _methods.py. Happy if someone just picks it up also, thought I would show how trivial it is now.

The API change is to allow out=..., for multi-argument outputs it is currently out=(..., ...), although I guess we should allow short-handing that (nobody needs to pick it per argument).

I.e. all of these return arrays (and unlike the last time around it works with subclasses, although right now it passes ... as context so that may confuse them or __array_ufunc__ implementations.):

np.add(1, 2, out=...)
a = np.add.reduce(np.arange(10), out=...)

There is some discussion in gh-14489 about this. I still think ... is the best, because out=np.ndarray doesn't generalize for __array_ufunc__, and yes, I like ... because it also indicates an array return in indexing. (arr[i, ...] always returns an array.)

The ufunc part works (not sure if 100% complete), but the _methods.py changes do not (and change behavior subtlely, if possibly correctly). The actual change is trivial (after I had refactored this a while ago).

I think we should do this, but the fact that the _methods.py changes are not trivial makes me think there may be some idea missing.

It may be that the solution is for these functions to use conversion helper once at the start and then convert to scalars based on that at the end though (rather than dragging around subclasses).

CC @mdhaber you may be interested in this. We could probably just put this through (with a few new tests) without the _methods.py changes hashed out. (There must be a few other low-hanging places where this would simplify code, that are not as maze-like as _methods.py.)

jorenham · 2025-03-24T13:49:08Z

I can type it, so I like it 👌🏻

ngoldbaum · 2025-03-24T15:51:46Z

Thanks for working on this! I think having a way to opt in to this behavior is the first and necessary step to eventually getting rid of scalars. Worth experimenting with for sure.

seberg · 2025-03-24T15:55:51Z

Thanks for working on this! I think having a way to opt in to this behavior is the first and necessary step to eventually getting rid of scalars.

Well, I have all the code to ensure that 0-D arrays in returns 0-D arrays (and reductions return scalars for axis=None).
The problem with that isn't making it work with NumPy. The main problem by now is seeing what it means for downstream, because they would start returning arrays (if they use np.asarray() even where a NumPy ufunc would not).

seberg · 2025-03-24T15:56:28Z

But yes, I agree... No matter how we evolve, having out=... for now won't stand in the way and help transitioning (as well as being immediately helpful).

mhvk

Nice to see this revived! Some transcription errors, though! (see inline). Also, I'd tend to use ... everywhere it is allowed instead of Ellipsis.

I guess a more general question is whether it makes sense to have a special name for this (lust like np.new_axis is None), maybe np.keep_as_array. Although, while None has no obvious connection to adding an axis, ... has precedent for indexing (as you note), so on balance it seems fine as is. Obviously, do need to add documentation as well.

numpy/_core/_methods.py

mhvk

Yes, fine to start with the simple case. For documentation, I guess the main place is _add_newdocs.py for the various ufunc cases (starting around l.4900). And then the tests of course...

mhvk · 2025-03-25T15:43:12Z

numpy/_core/src/umath/ufunc_object.c

-        if (_set_full_args_out(1, out_obj, &full_args) < 0) {
+        if (out_obj == Py_Ellipsis) {
+            out_obj = NULL;
+            return_scalar = 0;


Since you use NPY_FALSE below, maybe here too?

mhvk · 2025-03-25T15:43:30Z

numpy/_core/src/umath/ufunc_object.c

@@ -3487,6 +3487,7 @@ PyUFunc_GenericReduction(PyUFuncObject *ufunc,
     */
    PyObject *otype_obj = NULL, *out_obj = NULL, *indices_obj = NULL;
    PyObject *keepdims_obj = NULL, *wheremask_obj = NULL;
+    int return_scalar = 1;  /* scalar return is disabled for out=... */


Here also use NPY_TRUE, like below?

seberg · 2025-03-25T20:53:29Z

Going to mark this ready for review. It feels a bit light on documentation, so we can merge, but we may want to search if there is some other place to document this (or maybe broadcast it out through some other channels...).

mdhaber · 2025-03-26T03:55:27Z

Thanks @seberg. Happy to see movement in this direction!

As of now a WIP, because it would be nice to see what `_methods.py`. The API change is to allow `out=...`, for multi-argument outputs it is currently `out=(..., ...)`, although I guess we should allow short-handing that (nobody needs to pick it per argument). There is some discussion in numpygh-14489 about this. I still think `...` is the best, because `out=np.ndarray` doesn't generalize for `__array_ufunc__`, and yes, I like `...` because it also indicates an array return in indexing. (`arr[i, ...]` always returns an array.) --- The ufunc part works (not sure if 100% complete), but the `_methods.py` changes do not (and change behavior subtlely, if possibly correctly). The actual change is trivial (after I had refactored this a while ago). I think we should do this, but the fact that the `_methods.py` changes are not trivial makes me think there may be some idea missing. It may be that the solution is for these functions to use conversion helper once at the start and then convert to scalars based on that at the end though (rather than dragging around subclasses).

This also means we don't pass it out as a context, which is probably safer. (it is passed as part of `return_scalar=True` already)

Signed-off-by: Sebastian Berg <sebastianb@nvidia.com>

(in a sense, I care less about subclasses/arrays because I think we jus tshouldn't return scalars there, but..._

mhvk

@seberg - this looks all good, but I thought maybe the doc changes could be a bit better. I mention these inline, but after doing so, I felt that this was just as likely to lead to more delay. So, instead, I force-pushed the simpler change to your branch, and will open another one with a bit of docs follow-up. Hope that makes sense! With that, this one is all OK as far as I'm concerned.

mhvk · 2025-03-26T15:10:35Z

numpy/_core/_add_newdocs.py

        Alternate array object(s) in which to put the result; if provided, it
        must have a shape that the inputs broadcast to. A tuple of arrays
        (possible only as a keyword argument) must have length equal to the
        number of outputs; use None for uninitialized outputs to be
        allocated by the ufunc.
+        If ``out=...`` is passed, the output is guaranteed not to be an array


delete "not" after guaranteed!

Alternatively, maybe rewrite a little more so that the order is followed and that the text becomes more similar between the various methods:

Location(s) into which the result(s) are stored. If not provided or None, new array(s) are created by the ufunc. If passed as a keyword argument, can be Ellipses (``out=...``) to ensure an array is returned even if the result is 0-dimensional, or a tuple with length equal to the number of outputs (where None can be used for allocation by the ufunc).

mhvk · 2025-03-26T15:16:28Z

numpy/_core/_add_newdocs.py

        A location into which the result is stored. If not provided or None,
        a freshly-allocated array is returned. For consistency with
        ``ufunc.__call__``, if given as a keyword, this may be wrapped in a
        1-element tuple.
+        If ``out=...`` is passed, the output is guaranteed not to be an array


Either minimal change as above, or

Location into which the result is stored. If not provided or None, a freshly-allocated array is returned. If passed as a keyword argument, can be Ellipses (``out=...``) to ensure an array is returned even if the result is 0-dimensional (which is useful especially for object dtype), or a 1-element tuple (latter for consistency with ``ufunc.__call__``).

mhvk · 2025-03-26T15:26:33Z

numpy/_core/_add_newdocs.py

@@ -5276,6 +5290,7 @@
        a freshly-allocated array is returned. For consistency with
        ``ufunc.__call__``, if given as a keyword, this may be wrapped in a
        1-element tuple.
+        (``out=...`` is supported, but results are never scalar).


Following the above, maybe change to

Location into which the result is stored. If not provided or None, a freshly-allocated array is returned. For consistency with ``ufunc.__call__``, if passed as a keyword argument, can be Ellipses (``out=...``, which has the same effect as None as an array is always returned), or a 1-element tuple.

mhvk · 2025-03-26T15:27:02Z

numpy/_core/_add_newdocs.py

@@ -5357,6 +5372,7 @@
        a freshly-allocated array is returned. For consistency with
        ``ufunc.__call__``, if given as a keyword, this may be wrapped in a
        1-element tuple.
+        (``out=...`` is supported, but results are never scalar).


And perhaps the same as last one here.

seberg · 2025-03-26T16:15:46Z

So, instead, I force-pushed the simpler change to your branch

Thanks! Nothing nicer than seeing pushes to your branch (in my book) :). I'll wait for that and bring it up in the meeting tonight, but maybe we can put it in very soon then.

mhvk · 2025-03-26T16:16:47Z

@seberg - my follow-up PR is at seberg#51 - this is to your branch just so can look at it. If you think it is uncontroversial, do feel free to include it in your PR. But happy to do it as follow-up too.

ngoldbaum · 2025-03-26T16:43:19Z

I think this feature probably deserves at least a paragraph somewhere on this page: https://numpy.org/doc/stable/user/basics.ufuncs.html

seberg · 2025-03-26T16:46:38Z

I think this feature probably deserves at least a paragraph somewhere on this page:

Yes, Marten added that in his PR to mine, just need to manage to merge it (the github API doesn't let me merge it into my branch though, right now...)

EDIT: Went through now, must have just been a temporary/current slowness in github. (I might do a pass, but I think what Marten wrote is also good as is.)

Ufunc array out doc followup

ngoldbaum

Just some doc suggestions related to ... being visually (albeit not syntactically) ambiguous, particularly in a terminal without rst rendering.

Overall looks great, this is a neat new feature that I'm sure array API implementers might be interested in.

Did this get a mailing list ping? It probably deserves one.

doc/release/upcoming_changes/28576.new_feature.rst

doc/source/reference/ufuncs.rst

doc/source/user/basics.ufuncs.rst

Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>

seberg · 2025-03-26T19:49:33Z

I'm sure array API implementers might be interested in.

Maybe interesting, but not obviously/immediately useful, I think. For that you would have to make that namespace actually use it, which is possible but not as long as it is the identical namespace.
(There is also always the worry that e.g. SciPy might actually want to return scalars to look like NumPy in some places... Although, my opinion there is that SciPy should maybe lead and avoid scalars, not NumPy)

lucascolley · 2025-03-26T21:19:09Z

I'm sure array API implementers might be interested in.

👍!

Maybe interesting, but not obviously/immediately useful, I think. For that you would have to make that namespace actually use it, which is possible but not as long as it is the identical namespace.

Relatively immediate I think. In SciPy and scikit-learn we aren't using the numpy namespace proper, everything goes through array_api_compat.numpy for code that has been converted to support the standard. So I think it would just need a PR to array-api-compat to map array_api_compat.numpy.multiply to numpy.multiply(..., out=...) etc.

There is also always the worry that e.g. SciPy might actually want to return scalars to look like NumPy in some places...

Yes, but we already have the pattern return x[()] if x.ndim == 0 else x about 50 times for places where we want to return scalars for this reason but may not end up using a ufunc (or things otherwise end up coerced back to 0D arrays). So I don't think this PR will make things tangibly worse, apart from maybe needing to use that pattern in some extra places. Our test suite should be ready for this, we have assertions which forbid 0D NumPy arrays at https://github.com/scipy/scipy/blob/main/scipy/_lib/_array_api_no_0d.py.

my opinion there is that SciPy should maybe lead and avoid scalars, not NumPy

We already do in some parts of SciPy, like special.factorial and ndimage (see https://github.com/scipy/scipy/blob/0f1fd4a7268b813fa2b844ca6038e4dfdf90084a/scipy/ndimage/_support_alternative_backends.py#L39-L41). But the general vibe I get from scipy/scipy#18768 is that it seems very difficult to make a full transition in SciPy without NumPy shifting on things like

np.array(0) * 2     # scalar, not 0d array
- np.array(0)       # scalar, not 0d-array

seberg · 2025-03-27T06:38:08Z

Right, if you do your compat namespace it is useful. I am fine with you doing, but it does basically cement that you cannot plan to ever get away from that.
To be fair, I don't mind that, I just feel if that is so, then maybe NumPy should export the namespace like that and drop the pretense that it is identical.

Or... I suppose maybe just live with the fact that there are two slightly different things that tend to be called by the same name.

lucascolley · 2025-03-27T08:36:28Z

you cannot plan to ever get away from that.

But you said on the mailing list:

this doesn't preclude a general transition towards returning
scalars less aggressively (or never)

I understand that won't happen any time soon, but NumPy 3.0 perhaps??

lucascolley · 2025-03-27T10:23:10Z

Relatively immediate I think

(Modulo difficulties with supporting this for older np versions)

miccoli · 2025-03-27T18:03:50Z

doc/release/upcoming_changes/28576.new_feature.rst

+This spelling is borrowed from ``arr1d[0, ...]`` where the ``...``
+also ensures a non-scalar return.


Maybe we could also mention that arr0d[...] is a valid lvalue as in arr0d[...] = 3.14

TBH, I don't really understand why arr[...] assignment is relevant in this context? (Besides any excuse for telling folks about ... is a good excuse ;).)

You are right, my comment is not relevant.

From the observation that arr0d[()] is scalar and arr0d[...] is a 0D array I wrongly assumed that the first is not a valid lvalue...

But I just realized that arr0d.__setitem__((), 3.14) and arr0d.__setitem__(Ellipsis, 3.14) are prefectly equivalent, so please forget my comment.

seberg · 2025-04-01T10:15:50Z

Since there were no comments on the mailing list, and I don't think it's a huge change (with an approval), I'll plan to merge this in a few days unless there are comments.

lucascolley · 2025-04-01T10:36:57Z

do you have any further thoughts on #28576 (comment) ? Of course not blocking for this PR.

seberg · 2025-04-01T10:55:57Z

This summarizes most of it: #28576 (comment), I have repeated my opinions on this a few times recently...

I understand that won't happen any time soon, but NumPy 3.0 perhaps??

Maybe, but I am personally not willing to look into getting rid of scalars entirely (just explaining my opinion here over and over is enough ;)).
I am simply not fully convinced it is the right/feasible:

The pain we feel doesn't come from scalars (yes it causes implementation pain). It origiates from converting 0-D arrays to scalars so they sneaking up like ninjas.
Scalars do have some advantages:
- They are not containers (and shouldn't pretend to be). If they are, they may be very different containers (e.g. strings).
- Scalars are e.g. immutable.
- speed
I am not convinced that the transition wouldn't be utterly horrible and devastating until someone tries around a bit, and I won't be the one to do it. For example, I could imagine that pandas would require a lot of work if you were to do it.

So, the thing I have already is a branch that stop converting to 0-D arrays while returning scalars where it makes sense to do so. (See other issue.)
I can also see DTypes deciding they don't have a scalar (effectively), or even a context manager to enable this or enable my other branch optionally if things are hard...
The problem with that branch isn't actually so much NumPy, it is more figuring out what SciPy needs to do to align itself (in whatever way it wants), or what we need to provide to allow that.

Apparently, I haven't yet pointed to there from this particular PR, but this discussion should be in: #24897 or similar issues.

seberg · 2025-04-02T12:17:36Z

OK, thanks everyone, I'll put this in also because I suspect it is relevant to gh-28624.

* API,ENH: Allow forcing an array result in ufuncs This allows using `out=...` for ufuncs (including reductions) to enforce an array result. With it, the result will never be converted to a scalar, but subclasses are preserved normally. (A subclass might still misbehave and convert to a scalar anyway, but it would be a bug there.) There is some discussion in numpygh-14489 about this. I still think `...` is the best, because `out=np.ndarray` doesn't generalize for `__array_ufunc__`, and yes, I like `...` because it also indicates an array return in indexing. (`arr[i, ...]` always returns an array.) It uses this in a few places, although there are many more, e.g. `_methods.py` should probably use it, but it is very subtle there. (Also because some paths are probably subtly wrong for object dtype.) --------- Signed-off-by: Sebastian Berg <sebastianb@nvidia.com> Co-authored-by: Marten H. van Kerkwijk <mhvk@astro.utoronto.ca> Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>

github-actions bot added the 30 - API label Mar 24, 2025

mhvk reviewed Mar 24, 2025

View reviewed changes

seberg mentioned this pull request Mar 25, 2025

BUG: cross() fails with 1D arrays, int numbers, and dtype=object #27435

Open

seberg added the 59 - Needs tests label Mar 25, 2025

mhvk reviewed Mar 25, 2025

View reviewed changes

seberg marked this pull request as ready for review March 25, 2025 20:40

seberg removed the 59 - Needs tests label Mar 26, 2025

seberg added 7 commits March 26, 2025 12:06

Revert method.py changes for now (they are just trickier)

6d455b0

Tighten when out=... is allowed

e30bf54

This also means we don't pass it out as a context, which is probably safer. (it is passed as part of `return_scalar=True` already)

DOC: Add release note fragment for out=...

1384a4e

Signed-off-by: Sebastian Berg <sebastianb@nvidia.com>

Add basic tests for ufunc out=...

c248070

expand test a bit more to cover subclasses

e382fe2

(in a sense, I care less about subclasses/arrays because I think we jus tshouldn't return scalars there, but..._

DOC: Add docs about out=... to the ufunc docs

a717674

mhvk force-pushed the force-array-ufunc branch from 070b9dc to a717674 Compare March 26, 2025 16:06

mhvk approved these changes Mar 26, 2025

View reviewed changes

mhvk added 2 commits March 26, 2025 12:09

DOC: clarify ufunc docstrings for new out=... possibility

2db9992

DOC: update user/reference docs for new out=... possibility

bf82b63

Merge pull request #51 from mhvk/ufunc-array-out-doc-followup

ce92d9b

Ufunc array out doc followup

ngoldbaum reviewed Mar 26, 2025

View reviewed changes

doc/release/upcoming_changes/28576.new_feature.rst Outdated Show resolved Hide resolved

doc/source/reference/ufuncs.rst Outdated Show resolved Hide resolved

doc/source/user/basics.ufuncs.rst Outdated Show resolved Hide resolved

Apply suggestions from code review

4ba05ef

Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>

miccoli reviewed Mar 27, 2025

View reviewed changes

seberg merged commit 72e1458 into numpy:main Apr 2, 2025
72 checks passed

seberg deleted the force-array-ufunc branch April 2, 2025 12:20

seberg mentioned this pull request Apr 2, 2025

BUG: fix np.vectorize for object dtype #28624

Merged

		This spelling is borrowed from ``arr1d[0, ...]`` where the ``...``
		also ensures a non-scalar return.

Uh oh!

API,ENH: Allow forcing an array result in ufuncs #28576

API,ENH: Allow forcing an array result in ufuncs #28576

Uh oh!

Conversation

seberg commented Mar 24, 2025

Uh oh!

jorenham commented Mar 24, 2025

Uh oh!

ngoldbaum commented Mar 24, 2025

Uh oh!

seberg commented Mar 24, 2025

Uh oh!

seberg commented Mar 24, 2025

Uh oh!

mhvk left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mhvk left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg commented Mar 25, 2025

Uh oh!

mdhaber commented Mar 26, 2025

Uh oh!

mhvk left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg commented Mar 26, 2025

Uh oh!

mhvk commented Mar 26, 2025

Uh oh!

ngoldbaum commented Mar 26, 2025

Uh oh!

seberg commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ngoldbaum left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

seberg commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lucascolley commented Mar 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

seberg commented Mar 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lucascolley commented Mar 27, 2025

Uh oh!

lucascolley commented Mar 27, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

seberg commented Apr 1, 2025

seberg commented Mar 26, 2025 •

edited

Loading

seberg commented Mar 26, 2025 •

edited

Loading

lucascolley commented Mar 26, 2025 •

edited

Loading

seberg commented Mar 27, 2025 •

edited

Loading