Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: pass copy=True to __array__ #26208

Closed
ngoldbaum opened this issue Apr 3, 2024 · 3 comments · Fixed by #26215
Closed

ENH: pass copy=True to __array__ #26208

ngoldbaum opened this issue Apr 3, 2024 · 3 comments · Fixed by #26215
Assignees

Comments

@ngoldbaum
Copy link
Member

This is followup for #25941.

As of NumPy 2.0, __array__ implementations will never receive copy=True when NumPy calls the function. As a followup for adding the copy keyword, we need to first prevent numpy from making a copy before calling __array__ and pass copy=True to __array__ implementations. We also need to update the documentation to emphasize that the library is responsible for making a copy, or if making a copy is impossible, raising an error.

@mtsokol mtsokol self-assigned this Apr 4, 2024
@mtsokol
Copy link
Member

mtsokol commented Apr 4, 2024

Hi @ngoldbaum @seberg,

I followed np.array implementation and, if I understood it correctly, we need to prevent numpy from making a copy after __array__ is called rather than before.

def __array__() (with copy argument) is only called deep inside PyArray_DiscoverDTypeAndShape which is meant to discover ndims. I think the result of __array__ is stored in a cache passed to that function (first copy, 1. in the photo), but then this cache is passed in ctors.c::1622:

arr = (PyArrayObject *)(cache->arr_or_sequence);
/* we may need to cast or assert flags (e.g. copy) */
PyObject *res = PyArray_FromArray(arr, dtype, flags);

where flags still has "copy always" (second copy?, 2. in the photo).

My question: Would it be sufficient to avoid the second copy by changing "copy always" flag to "copy if needed" in flags, as the cache that was provisioned in Py_Array_Discover... already holds a copy?

@seberg
Copy link
Member

seberg commented Apr 4, 2024

Well, there are two things to note I guess:

  • When the discovery finds an array immediately, we may or may not already get a copy if we pass it on with copy=True. But that of course only works for __array__ and none of the others, so would need to add the copy logic also for others or indicate that a copy was made.
  • When the discovery finds a sequence (of arrays), a copy has to be made in any case and it doesn't make sense to keep passing on copy=True to __array__. (also we must fail for copy=False, but I think we do)

@mtsokol
Copy link
Member

mtsokol commented Apr 4, 2024

Then I can try with copy was made indicator that is set by Py_Array_Discover..., that would remove that flag for further processing.

@mtsokol mtsokol linked a pull request Apr 5, 2024 that will close this issue
@mtsokol mtsokol changed the title ENH: pass copy=True to __array__ ENH: pass copy=True to __array__ Apr 5, 2024
pitrou pushed a commit to apache/arrow that referenced this issue Apr 15, 2024
… compatibility (#41071)

### Rationale for this change

Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208)

### What changes are included in this PR?

Add a `copy=None` to the signatures of our `__array__` methods.

This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment)

### Are these changes tested?

Yes

### Are there any user-facing changes?

No (compared to usage with numpy<2)
* GitHub Issue: #39532
* GitHub Issue: #41098

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
raulcd pushed a commit to apache/arrow that referenced this issue Apr 15, 2024
… compatibility (#41071)

### Rationale for this change

Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208)

### What changes are included in this PR?

Add a `copy=None` to the signatures of our `__array__` methods.

This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment)

### Are these changes tested?

Yes

### Are there any user-facing changes?

No (compared to usage with numpy<2)
* GitHub Issue: #39532
* GitHub Issue: #41098

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…y 2.0+ compatibility (apache#41071)

### Rationale for this change

Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208)

### What changes are included in this PR?

Add a `copy=None` to the signatures of our `__array__` methods.

This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment)

### Are these changes tested?

Yes

### Are there any user-facing changes?

No (compared to usage with numpy<2)
* GitHub Issue: apache#39532
* GitHub Issue: apache#41098

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…y 2.0+ compatibility (apache#41071)

### Rationale for this change

Adapting for changes in numpy 2.0 as decribed at https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword and future changes to pass copy=True (numpy/numpy#26208)

### What changes are included in this PR?

Add a `copy=None` to the signatures of our `__array__` methods.

This does have impact on the user facing behaviour, though. Questioning that upstream at numpy/numpy#25941 (comment)

### Are these changes tested?

Yes

### Are there any user-facing changes?

No (compared to usage with numpy<2)
* GitHub Issue: apache#39532
* GitHub Issue: apache#41098

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants