Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Add shape and copy arguments to numpy.reshape #26292

Merged
merged 6 commits into from Apr 24, 2024

Conversation

mtsokol
Copy link
Member

@mtsokol mtsokol commented Apr 16, 2024

Hi!

This is the last PR with Array API support. It adds shape and copy arguments to the numpy.reshape, as in the Array API standard, and numpy.ndarray.reshape.

I found that existing newshape keyword is used quite a lot (GitHub search) so I don't think we should deprecate the old name. Here both newshape and shape keywords are supported.

P.S. There's also copy keyword in the reshape ... [EDIT] Implemented as suggested in the comment.

@mtsokol mtsokol self-assigned this Apr 16, 2024
@mtsokol mtsokol added this to the 2.1.0 release milestone Apr 16, 2024
@ngoldbaum
Copy link
Member

numpy.reshape calls PyArray_Newshape, where copying happens. It's a part of NUMPY_API so it won't be easy to just add NPY_COPYMODE copy argument to the signature, right?

We can just leave the wart in the C API and add a new internal-only C function that has a different signature. There are a lot of wrapper functions around the public C API like that in the internal C API.

@charris charris changed the title API: Add shape argument to numpy.reshape API: Add shape argument to numpy.reshape Apr 16, 2024
@mtsokol mtsokol force-pushed the reshape-shape-keyword branch 2 times, most recently from e3bf8e5 to bdfb126 Compare April 17, 2024 12:43


NPY_NO_EXPORT PyObject *
_reshape_with_copy_arg(PyArrayObject *array, PyArray_Dims *newdims,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can come up with a better name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK, I might have tended to use npy_reshape_with_copy_arg although there isn't much reason for the npy_ in such cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mtsokol
Copy link
Member Author

mtsokol commented Apr 17, 2024

@ngoldbaum @seberg I think it's ready for a review.

@mtsokol mtsokol changed the title API: Add shape argument to numpy.reshape API: Add shape and copy arguments to numpy.reshape Apr 17, 2024
Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall mostly good, see comments inline, mostly focusing on the python API changes

@array_function_dispatch(_reshape_dispatcher)
def reshape(a, newshape, order='C'):
def reshape(a, /, newshape=None, shape=None, *, order='C', copy=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make shape come first? Can we also make newshape a keyword-only argument?

That way you're not lying in the error message you added below when people pass newshape positionally by saying they passed shape.

Also that way we can issue a deprecation warning if anyone passes newshape as a keyword argument, encouraging them to migrate to the new name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I changed the order and added a deprecation warning.



NPY_NO_EXPORT PyObject *
_reshape_with_copy_arg(PyArrayObject *array, PyArray_Dims *newdims,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine

@@ -1,6 +1,8 @@
#ifndef NUMPY_CORE_SRC_MULTIARRAY_SHAPE_H_
#define NUMPY_CORE_SRC_MULTIARRAY_SHAPE_H_

#include "conversion_utils.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we don't put includes in headers, can you move this to the implementation files where it's needed?

Copy link
Member Author

@mtsokol mtsokol Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put this header here because _reshape_with_copy_arg declaration now contains NPY_COPYMODE that is defined in conversion_utils.h. Without it, it fails with:

error: unknown type name 'NPY_COPYMODE'

So I think it's needed. There are other header files with includes, such as dtypemeta.h.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(left a comment I quickly realized was dumb and deleted, sorry for the noise)

Maybe it would be cleaner to define the new function in conversion_utils.c?

"You cannot specify 'newshape' and 'shape' arguments "
"at the same time.")
if shape is not None:
newshape = shape
Copy link
Member

@ngoldbaum ngoldbaum Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as alluded to above, instead of this let's make newshape keyword -only and also issue a deprecation warning if anyone passes it. Old uses with a positional argument continue to work with the new argument name. Old uses with a keyword argument continue to work, but with a deprecation warning so they to fix their code before we finally remove newhape.

Please also mark here in a comment that the deprecation happened in numpy 2.1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, had some comments, but looks good to me overall. I think the result order is probably a bit niche to worry about.

Nathan's comments on the Python API make sense, and it is nice that you normalize the shape for _wrapfunc.
One comment on that, though: We should be dropping the new argument if it is None as part of the normalization (I don't think it happens later, but I may be wrong). Otherwise we break forwarding to downstream reshape.

PyArray_OrderConverter, &order)) {
if (!NpyArg_ParseKeywords(kwds, "|$O&O&", keywords,
PyArray_OrderConverter, &order,
PyArray_CopyConverter, &copy)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh well, this is an old parsing helper that is strange and slow nowadays, we should rewrite it. But I guess it doesn't have to be here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copy argument is expected to be either None, False, or True. We want to convert it to NPY_COPYMODE enum. I think PyArray_CopyConverter is the easiest way to do this (it also provides validation), right?
So IMO it has to be here. Or did you mean something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is fine. I mean the whole NpyArg_ParseKeywords construct is awkward and unnecessarily complicated now (probably not when it was written).



NPY_NO_EXPORT PyObject *
_reshape_with_copy_arg(PyArrayObject *array, PyArray_Dims *newdims,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK, I might have tended to use npy_reshape_with_copy_arg although there isn't much reason for the npy_ in such cases.

if (copy == NPY_COPY_ALWAYS) {
PyObject *newcopy = PyArray_NewCopy(array, order);
if (newcopy == NULL) {
return NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is desired: When a copy is made, do we prefer to return the equivalent of copy("K") or copy("C") (effectively?).

I can see this go both ways:

  • Prefer keeping input order when possible (i.e. change it), or
  • Prefer ensuring the memory order doesn't depend on whether a no-copy reshape would have been possible in a specific case. (I.e. this)

So maybe just a note...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer the second option, to just honor order argument in both paths - copy and no-copy, the same way. In that place I can add:

/*
 * Memory order doesn't depend on a copy/no-copy context.
 * 'order' argument is always honored. 
 */

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@mtsokol
Copy link
Member Author

mtsokol commented Apr 18, 2024

One comment on that, though: We should be dropping the new argument if it is None as part of the normalization (I don't think it happens later, but I may be wrong). Otherwise we break forwarding to downstream reshape.

By "the new argument" do you mean copy keyword? To not to pass it if it's None?

@mtsokol
Copy link
Member Author

mtsokol commented Apr 18, 2024

Ok, I addressed all comments.

@charris charris changed the title API: Add shape and copy arguments to numpy.reshape API: Add shape and copy arguments to numpy.reshape Apr 18, 2024
Comment on lines 229 to 230
Replaced by ``shape`` argument. Retained for backward
compatibility.
Copy link
Member

@ngoldbaum ngoldbaum Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Replaced by ``shape`` argument. Retained for backward
compatibility.
.. deprecated:: 2.1
Replaced by ``shape`` argument. Retained for backward
compatibility.

(please double-check that renders correctly in the html docs)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

image

@mtsokol
Copy link
Member Author

mtsokol commented Apr 24, 2024

I added one docs change and rebased it. It's ready from my side.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nitpicks, but mostly looks good. I'll let you see what to follow up on, and I guess Nathan has looked at it so may want to have a quick look too.

return _wrapfunc(a, 'reshape', newshape, order=order)
if newshape is None and shape is None:
raise TypeError(
"reshape() missing 1 required positional argument: 'shape'")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, shape=None was previously supported as "do nothing", I don't know how that would be sensible, so probably doesn't matter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I don't think shape=None is useful really (I think some people could even expect it flattens rather than "do nothing"). I would stick to throwing an error on shape=None.

if newshape is not None and shape is not None:
raise ValueError(
"You cannot specify 'newshape' and 'shape' arguments "
"at the same time.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a TypeError (bad parameters usually are), not that it matters much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(might move it into the branch below, but doesn't really matter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

kwargs = {"order": order}
if copy is not None:
kwargs["copy"] = copy
return _wrapfunc(a, 'reshape', shape, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I think it is easier to just do the two branches:

if copy is not None:
    return _wrapfunc(a, 'reshape', shape, order=order, copy=copy)
return _wrapfunc(a, 'reshape', shape, order=order)

Also avoids the overhead of creating and unpacking the dict again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done!

@ngoldbaum
Copy link
Member

Thanks @mtsokol!

@ngoldbaum ngoldbaum merged commit 2da02ea into numpy:main Apr 24, 2024
65 checks passed
@mtsokol mtsokol deleted the reshape-shape-keyword branch April 24, 2024 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants