QUESTION (BUG?): stride of 0 for dimension of size 1 after use of newaxis. Arbitrary strides for dimensions of 1 more generally. #22950

Closed
gnathand opened this issue Jan 5, 2023 · 4 comments

@gnathand

gnathand commented Jan 5, 2023

Describe the issue:

When np.newaxis is used to add a dimension to an array, the stride of that dimension is set to 0. This is inconsistent with the behaviour of resize(), and results in arr.strides != arr.data.strides.

It is unclear whether this is actually a bug. The following documentation suggests it is not:

"Even for contiguous arrays a stride for a given dimension arr.strides[dim] may be arbitrary if arr.shape[dim] == 1 or the array has no elements." https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flags.html

However, it feels counter-intuitive, and it is not hard to imagine how somebody could naively rely upon a particular value of stride. If this is intentional, it may be worth additional clarification, and probably somewhere more conspicuous.
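To illustrate the quoted documentation, here is a minimal sketch (not part of the original report) showing that the stride of a size-1 dimension does not affect which elements the array addresses. It uses np.lib.stride_tricks.as_strided to set that stride to an arbitrary value; the value 800 and the variable names are chosen only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

base = np.array([1, 2, 3], dtype=np.int64)

# View the same buffer as shape (1, 3) with an arbitrary stride (800 bytes)
# for the size-1 dimension. Index 0 is the only valid index along that axis,
# so the stride is never actually used to step through memory.
odd = as_strided(base, shape=(1, 3), strides=(800, 8))

same_values = np.array_equal(odd, base[np.newaxis, :])
still_c_contiguous = odd.flags.c_contiguous

print(odd.strides, same_values, still_c_contiguous)
```

The array contents and the contiguity flag are unchanged, which is why NumPy treats the stride of a size-1 dimension as free.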

Reproduce the code example:

import numpy as np

x1 = np.array([[1,2,3]])

x2 = np.array([1,2,3])
x2.resize((1,3))

x3 = np.array([1,2,3])[np.newaxis,:]

x1.shape, x2.shape, x3.shape                       # ((1, 3), (1, 3), (1, 3))
x1.strides, x2.strides, x3.strides                 # ((24, 8), (24, 8), (0, 8))
x1.data.strides, x2.data.strides, x3.data.strides  # ((24, 8), (24, 8), (24, 8))

Error message:

No response

Runtime information:

1.23.5
3.10.8 (main, Nov 4 2022, 09:21:25) [GCC 12.2.0]

Context for the issue:

No response

@eric-wieser
Member

Here's another case comparing .strides with .data.strides:

In [10]: x4 = np.array([1,1,2,2,3,3])[::2][np.newaxis,:]

In [11]: x4.shape
Out[11]: (1, 3)

In [12]: x4.strides
Out[12]: (0, 16)

In [13]: x4.data.strides
Out[13]: (0, 16)

The reason that .strides and .data.strides sometimes disagree is that numpy and PEP 3118 have different opinions on what it means for something to be C contiguous, and numpy dutifully translates its more general definition to the PEP 3118 one where possible (as for x3). For x4, neither considers it C contiguous, so there's no point replacing the 0.
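A minimal sketch of the comparison described above, assuming 8-byte integers; the helper name describe is hypothetical. It prints NumPy's C-contiguity flag next to both sets of strides, so the two cases can be checked side by side. The exported strides are not asserted here, since they come from the buffer export and the values in this thread were observed on NumPy 1.23.5.

```python
import numpy as np

def describe(arr):
    """Return (C-contiguous flag, ndarray strides, buffer-export strides)."""
    return arr.flags.c_contiguous, arr.strides, arr.data.strides

# x3: a contiguous 1-D array with a new leading axis.
x3 = np.array([1, 2, 3], dtype=np.int64)[np.newaxis, :]

# x4: every second element of a 1-D array, then a new leading axis.
x4 = np.array([1, 1, 2, 2, 3, 3], dtype=np.int64)[::2][np.newaxis, :]

for name, arr in [("x3", x3), ("x4", x4)]:
    print(name, *describe(arr))
```

NumPy flags x3 as C contiguous and x4 as not, which matches the cases where the thread reports the exported 0 being replaced (x3) and left alone (x4).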

> and it is not hard to imagine how somebody could naively rely upon a particular value of stride

Can you elaborate on how this might happen? Any such case is likely broken on arrays like np.array([1,1,2,2,3,3])[::2] anyway.
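As a sketch of the point about sliced arrays (an illustration, not something proposed in the thread): code that assumes the last stride equals the item size already fails on a strided view, so checking arr.flags or requesting a contiguous copy with np.ascontiguousarray is the safer pattern. The variable names are hypothetical.

```python
import numpy as np

sliced = np.array([1, 1, 2, 2, 3, 3], dtype=np.int64)[::2]

# The view steps over every second element: its stride is twice the item size.
naive_assumption_holds = sliced.strides[-1] == sliced.itemsize

# Instead of inspecting stride values, ask NumPy whether the layout is
# contiguous, and copy into a packed buffer only if it is not.
packed = sliced if sliced.flags.c_contiguous else np.ascontiguousarray(sliced)

print(sliced.strides, naive_assumption_holds, packed.strides)
```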

@eric-wieser
Member

(sorry, closed accidentally due to https://bugs.chromium.org/p/chromium/issues/detail?id=1124575#c1 sending the click to somewhere different to where my cursor was)

@seberg
Member

seberg commented Jan 6, 2023

Yeah, memoryview/buffer export will "fix" the strides. Note that if an array has only one dimension which is not 1, then the array may be both C and Fortran contiguous (arr.flags) and how the stride is filled in actually depends on that!
(You can't see that here easily, since I am not sure how to request a Fortran-order buffer export from Python.)
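A small sketch of the dual-contiguity point, assuming 8-byte integers. It does not request a Fortran-order buffer export; it only allocates the same (1, 3) shape in each order to show that both layouts carry both flags while the size-1 stride is filled in differently.

```python
import numpy as np

c_order = np.zeros((1, 3), dtype=np.int64, order="C")
f_order = np.zeros((1, 3), dtype=np.int64, order="F")

for name, arr in [("C", c_order), ("F", f_order)]:
    # Only one dimension has size > 1, so each array satisfies both
    # contiguity definitions regardless of how it was allocated.
    print(name, arr.strides, arr.flags.c_contiguous, arr.flags.f_contiguous)
```

Both arrays describe the same three elements in the same memory order; only the stride recorded for the size-1 dimension differs.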

@seberg
Member

seberg commented Jan 17, 2023

Going to close the issue since I can't think of any actionable item here. "Fixing" the strides without context (whether C- or F-contiguity is expected) seems unhelpful to me: users who incorrectly rely on "clean" strides will still fail, just in even more confusing code paths.

Happy to reopen if there is any thought on what could be improved, though. Thanks for opening an issue.
