Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't change dtype for non-continuous array #9496

Open
aplavin opened this issue Jul 29, 2017 · 17 comments
Open

Can't change dtype for non-continuous array #9496

aplavin opened this issue Jul 29, 2017 · 17 comments

Comments

@aplavin
Copy link

aplavin commented Jul 29, 2017

When I have a plain float array like arr = np.ones((3, 4, 2)) and want to make a complex value from each two consequtive floats it's easily possible: arr.view(complex) gives the expected result. But when I want to perform the same with a non-continuous array, like arr[:, ::2, :].view(complex), it raises ValueError: new type not compatible with array.. Clearly this conversion is possible without copies, but either I misunderstand something here or it's a bug in numpy.

Anyway, for this moment, are there any workarounds for this?

@charris
Copy link
Member

charris commented Jul 29, 2017

Hmm, seems like there is no fundamental reason that could not work.

@charris
Copy link
Member

charris commented Jul 29, 2017

Anyway, for this moment, are there any workarounds for this?

Might take the view before slicing, if that is possible.

@shoyer
Copy link
Member

shoyer commented Jul 30, 2017

Indeed. I suppose that strictly speaking, we only need the last dimension to be continuous.

Anyway, for this moment, are there any workarounds for this?

The simplest option is to make a copy, e.g., arr[:, ::2, :].copy().view(complex).

If you need to reuse the same buffer, that should also be possible -- take a look at np.lib.stride_tricks.as_strided for inspiration.

@aplavin
Copy link
Author

aplavin commented Jul 30, 2017

If possible, I would prefer not taking a copy, as the array is very large. Also I can't "take the view before slicing", because in reality I have a structured array with one of the fields being pairs of floats, which I need to convert to complex values. So, will try something along the lines of as_strided for now, thanks.

@eric-wieser
Copy link
Member

I don't think we can fix this without removing a deprecation from 2015-11-27 (1.11.0), which gives fortran-ordered arrays different and possibly conflicting semantics:

In [11]: arr = np.arange(6.0).reshape((3, 2))

In [12]: arr
Out[12]:
array([[ 0.,  1.],
       [ 2.,  3.],
       [ 4.,  5.]])

In [13]: arr.view(complex)
Out[13]:
array([[ 0.+1.j],
       [ 2.+3.j],
       [ 4.+5.j]])

In [14]: arr.T.view(complex)
C:\Users\wiese\Repos\numeric-python\numpy\runtests.py:1: DeprecationWarning: Changing the shape of non-C contiguous array by
descriptor assignment is deprecated. To maintain
the Fortran contiguity of a multidimensional Fortran
array, use 'a.T.view(...).T' instead
  #!/usr/bin/env python
Out[14]: array([[ 0.+1.j,  2.+3.j,  4.+5.j]])

@eric-wieser
Copy link
Member

Actually, the semantics do not conflict, so I think this is solvable. Patch in the works

@eric-wieser
Copy link
Member

In my initial PR to fix this (#9497, now closed), @jaimefrio made a comment against allowing it when the last axis is contiguous - which I'm repeating here to keep the discussion in one place:

Think of a 2D sliding window type of view, where both axes have the same stride. The array is not contiguous, but its last dimension is. Say the array is of doubles and you take a complex dtype view of it. What do you do with the stride of the first axis? Leave it as is? Transform it like the last axis one? Whatever you do, it's going to be surprising to some users, hence probably better not to do it.

You could define something like "weak contiguousness", where the stride of each axis is required to be a multiple of the product of the size and stride of the next one, which would cover all arrays that are slices of contiguous arrays. And allow views like these to be taken on "weakly contiguous arrays with a strongly contiguous last dimension" only, probably with no issues like above. Not sure the benefit is worth the extra complexity.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 31, 2017

My argument would be the semantics of view can become b = a.view(dtype)b[i,j,k,:] == a[i,j,k].view(dtype).

This is already the semantics for C-contiguous arrays. It's not the semantics for F-contiguous, which is presumably why those semantics were deprecated in #6747.

Relaxing the constraint to "last axis is contiguous" would also preserve this invariant.

There was a pr that would also fix this at #5508. That PR does not preserve this invariant, instead have stride-dependant behaviour. Stride-dependant behaviour (beyond deciding whether to error) sounds like a bad idea to me, because few users are aware of the strides of their arrays .

@shoyer
Copy link
Member

shoyer commented Jul 31, 2017

My argument would be the semantics of view can become b = a.view(dtype)b[i,j,k,:] == a[i,j,k].view(dtype).

Yes, I was thinking the same thing.

I agree that the behavior for view should not be stride-dependant.

@njsmith
Copy link
Member

njsmith commented Jul 31, 2017

We've already defined the semantics of a.view(dtype) for C contiguous arrays. I can't see any problem with extending the range of arrays that view succeeds on so long as (a) it always returns a view, and (b) whenever array_equal(a, b), and a.view(dtype) and b.view(dtype) are defined, then array_equal(a.view(dtype), b.view(dtype)).

I think this is a restatement of what you're saying.

@aplavin
Copy link
Author

aplavin commented Jul 31, 2017

Think of a 2D sliding window type of view, where both axes have the same stride. The array is not contiguous, but its last dimension is. Say the array is of doubles and you take a complex dtype view of it. What do you do with the stride of the first axis? Leave it as is? Transform it like the last axis one? Whatever you do, it's going to be surprising to some users, hence probably better not to do it.

Personally I don't see how supporting this case (leaving first axis strides as is) can be confusing at all - the same thing happens now with continuous arrays.

If you need to reuse the same buffer, that should also be possible -- take a look at np.lib.stride_tricks.as_strided for inspiration.

Could you please expand on this a bit? I'm not too familiar with these "inner" parts of numpy and can't get how to do this, even after looking at as_strided source.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 31, 2017

I take this remark back:

Actually, the semantics do not conflict, so I think this is solvable.

We can't fix this properly until we remove the deprecation. We're forced to either:

  • Break some old previously deprecated code
  • Change the rule from "last-axis-contiguous" to "last-axis-contiguous and (C-contiguous or not F-contiguous)", which is confusing, especially with relaxed stride checking.

This code:

a = np.zeros((1, 2, 3), dtype=np.int32).T  # fortran-contiguous, but also contiguous in the last axis
b = a.view(np.int8)

Is required to have b.shape == (12, 2, 1) by the deprecated behaviour, but if we want to keep the invariant I mention above, it should give b.shape == (3, 2, 4).

Unfortunately, we gave a DeprecationWarning, not a FutureWarning back in 1.11.0. So can we actually change this, or do we need to restart the deprecation cycle?

@shoyer
Copy link
Member

shoyer commented Jul 31, 2017

Unfortunately, we gave a DeprecationWarning, not a FutureWarning back in 1.11.0. So can we actually change this, or do we need to restart the deprecation cycle?

The conservative thing to do is to switch to a FutureWarning now and revisit this in a couple of years.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 31, 2017

@shoyer: How about doing the following before revisiting, listed by precedence:

  • C-contiguous: as before (match new semantics)
  • Fortran-contiguous + last-axis-contiguous: FutureWarning, keep deprecated behaviour
  • Fortran-contiguous: DeprecationWarning, keep deprecated behaviour
  • last-axis-contiguous: Use new semantics

This introduces as much new behavior as possible, without breaking the old code - and begins the next cycle for finally removing that code path, and completing the use of new behaviour

@charris
Copy link
Member

charris commented Jul 31, 2017

Might want to take a look at #6614. The problems came in with relaxed strides, which allow arrays to be both Fortran and C contiguous.

@eric-wieser
Copy link
Member

Wrong issue number?

@charris
Copy link
Member

charris commented Aug 1, 2017

Yep, wrong PR, see #6684 and also #6759. Note that these are not directly related, but they help explain why things are as they are.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants