REV: 8daf144 and following. Strides must be clean to be contiguous #2735

Closed
wants to merge 2 commits into from

8 participants

@seberg
Owner

This reverts those parts of PR gh-2694 that were directly related to the change in flag logic, since it seems that too much code relies at least on mostly clean strides. See also the discussion there.

Some changes remain:

  1. 0-sized arrays are handled somewhat more strictly, forcing them to have matching strides up to and including the dimension of size 0. (This means an array with itemsize=8, strides=(16,) and shape=(0,) is not contiguous.) This is mostly so that it is really always true that if an array is C-contiguous then strides[-1] == itemsize; I can't see it hurting, and it "fixes" a corner case in scikits-learn.
  2. 0-sized arrays can be both C- and F-Contiguous.
  3. 1-sized arrays will be both C- and F-Contiguous (or neither).

Though I would consider 2 and 3 to be fixes.
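
For illustration, a minimal Python sketch of the rules above (the flag values in the comments reflect this PR's intended semantics, not necessarily other numpy versions):

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    # Rule 1: a 0-sized array with "dirty" strides (itemsize 8, stride 16).
    base = np.empty(1, dtype=np.float64)
    dirty = as_strided(base, shape=(0,), strides=(16,))
    print(dirty.flags.c_contiguous)   # False under this PR (was True before)

    # Rules 2/3: size-0 and size-1 arrays with clean strides get both flags.
    one = np.ones((1, 1))
    print(one.flags.c_contiguous, one.flags.f_contiguous)   # True True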

On a side note, I think there should be code to clean up strides a little (better). This applies to:

  1. newaxis in slicing operations
  2. Copies with NPY_KEEPORDER
  3. reduce operations with keepdims=True

Oh, and I think ISONESEGMENT should check for SIZE < 1 at least a bit more accurately; the NDIM check is superfluous anyway.

seberg added some commits
@seberg seberg REV: 8daf144 and following. Strides must be clean to be contiguous
This reverts mostly 8daf144 and c48156d. It leaves some small changes
from the original behavior. Size 1 arrays are both C- and F-contiguous,
and size 0 arrays can be one, both, or neither; however it is a bit
stricter in enforcing that strides are clean up to and including the
(first) 0 dimension.
8c99ef1
@seberg seberg BUG: Fix small issue with flags setting in ctors. fa09b2d
@seberg seberg closed this
@seberg seberg reopened this
@charris
Owner

@rgommers, @njsmith, it looks like this PR should go in.

@seberg
Owner

Maybe there should be a better explanation of when exactly the flags are set. Possibly noting that it is still better not to rely on clean strides and that relying on them may break (and most of the code that this broke already has problems), especially on older numpy if the array's size is 0 or 1. Is the C-API documentation generated from source? I can write a bit in a few days.

@GaelVaroquaux

A vote from the scikit-learn side: without this PR, scikit-learn is fairly broken under numpy master. This PR fixes it. :+1: for merge on our side!

@charris
Owner

@seberg The c-api docs aren't automatically generated; you need to write something in the appropriate c-api*.rst file in doc/source. I don't know which one, though :-(.

@seberg
Owner

Ok, thanks. I think I will look at writing a little documentation later. However, if that is the only reason to keep this open, I will just do a separate PR for it.

@charris
Owner

That seems appropriate. I'm not sure what is holding this up.

@seberg
Owner

I am not sure, I guess it just needs a bit of review. Or is there any discussion needed for the actual remaining changes? I mean, it may sound weird that a size 1 array can be non-contiguous and that size 0 arrays are stricter, but that is consistent with allowing third party software to rely on what they already rely on (even if I don't like it). Also, the old behaviour was a bit odd in allowing only one of the two flags to be set (unless ndim < 2); that is removed.

@charris
Owner

I agree that it sounds a bit weird. To some extent, I think we are being driven by unwarranted assumptions in user code. @GaelVaroquaux Gael, what would it take to fix scikits-learn? I know that you have a lot of working sets out in the field and don't want to invalidate them, but is it possible to fix things up so that some time in the future we can finish rationalizing the behavior of these flags?

@stefanv
Owner

A size 1 array can and should be both C and F-contiguous.

@seberg
Owner

@stefanv OK, I can change it to that; I really don't care. The deal is, people complained about scipy and sk-learn (and probably others out there) being broken. And in reality all of these are already broken for corner cases, because of rules like "a size 1 array should (always) be both C- and F-contiguous". From my point of view, the change here "fixes" the misuse in other code, because I thought it nonsense to support people relying on clean strides for most cases but not for all cases. Of course I will gladly change it back; then I have more reason to say that these are all bugs in scipy and sk-learn, because without this change they are, they just rarely surface.

I do think that this needs to be changed back (it just breaks too much too suddenly), but whether numpy should or should not make these abuses at least work consistently for the moment, you guys tell me.

@stefanv
Owner

@seberg I feel that we should try to be consistent with the mental model of what contiguous means; I'm not sure why scikit-learn breaks, unless they depend on broken behavior inside of numpy? Perhaps I should investigate more closely; perhaps you have already?

@seberg
Owner

@stefanv to be honest I am a bit sick of this... There are two models (the first is the case in current master):

  1. The memory is in a contiguous layout.
  2. Additionally, things like strides[-1] == itemsize are true (this is the only extra point I have seen yet; I don't know if anyone relies on more, but I doubt it).

Everything that breaks in current master relies on 2. Now, numpy 1.7 etc. actually provide 2. in most cases.

If your mental model is 1., then yes, they all rely on broken behaviour in numpy. In numpy 1.7 relying on 2. is not even consistently possible (for some rarer cases, namely size 1 and 0 arrays); in current master you cannot rely on 2. in many more cases (so things break). This PR basically makes numpy consistently "broken" under model 1., but thus "fixes" the other packages.

The fixes are all very easy: wherever you use the fact that an array is contiguous, do not use strides, because all you need is itemsize and shape. It really boils down to replacing arr.strides[-1] (or arr.strides[0] for F-order) with arr.itemsize.
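
A hedged sketch of that replacement pattern (hypothetical downstream code, not from this PR):

    # Fragile: assumes "clean" strides whenever the contiguous flag is set.
    def inner_step_fragile(arr):
        assert arr.flags.c_contiguous
        return arr.strides[-1]   # may differ from itemsize for size-0/1 dims

    # Robust: a C-contiguous array's inner step is, by definition, its itemsize.
    def inner_step_robust(arr):
        assert arr.flags.c_contiguous
        return arr.itemsize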

@charris
Owner

@stefanv I believe current scipy also breaks without this code. I'm inclined to merge this fix, but I would also like sk-learn and scipy to fix things up so we can do things properly at some point in the future. Now, precisely how we get to that desired end point is a bit hazy. Suggestions on that point welcome.

@erg

The Python community doesn't pay enough attention to detail regarding array shapes, types, and corner cases, IMHO. +1 for anything that would improve the situation.

@seberg
Owner

What might mitigate this problem a little would be to clean the strides when a contiguous array is requested (with the buffer protocol, I think that is done like that already). I do not know whether Cython, etc. use "if not array.flags.c_contiguous: array = array.copy()" or "array = np.array(array, copy=False, order='C')".

If Cython, etc. use the latter method, that should actually solve all existing problems in scipy and sk-learn. It might be a little confusing, though, and will certainly not solve all occurrences of this usage out there. However, I actually think it would be consistent to do it like that, and maybe it is "good enough" to avoid massive problems out there and still make the flags more consistent with what most expect.
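
To make the two request patterns concrete, a minimal sketch (hypothetical helper names; the calls themselves are the ones quoted above, with numpy-1.x semantics of copy=False):

    import numpy as np

    def as_c_contiguous_flag_check(arr):
        # Pattern 1: copies only when the flag is unset; if the flag is set
        # but the strides are "dirty", they pass through unchanged.
        if not arr.flags.c_contiguous:
            arr = arr.copy()
        return arr

    def as_c_contiguous_request(arr):
        # Pattern 2: an explicit contiguity request; the proposal above would
        # have numpy normalize ("clean") the strides here even when no copy
        # is needed.
        return np.array(arr, copy=False, order='C')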

@seberg
Owner

Actually, it seems to me that Cython uses the buffer protocol, which should probably return cleaned-up strides in any case. I actually like the above idea, even if there is still a risk of breaking something out there. Does anyone have an idea how likely that is?

@njsmith
Owner

I agree with @charris... this is a mess and we really want to get rid of it (especially this nonsense where we need heuristics to guess whether people want their row vectors to be C-contiguous or F-contiguous based on the contiguity of the arrays they came from). But the fact that it has broken all the major C-API-using packages we've looked at so far suggests that we're not going to get away with just changing it in one fell swoop :-(.

(I don't much like any of the partial fixes, since none of them allow a row vector to be flagged as both C- and F-contiguous simultaneously, and that's the most common situation that requires nasty hard-to-test-and-maintain heuristics.)

Here's one possible migration path, very conservative:

  • Add new flags to carry the new semantics, like NPY_ARRAY_C_CONTIGUOUS_FIXED = 0x2000. (Fortunately we have many flag bits still to spare.)
  • Deprecate the old flags. For GCC and MSVC we can do this directly with something like static const int NPY_ARRAY_C_CONTIGUOUS = 0x0001 __attribute__((deprecated)); resp. __declspec(deprecated), and then everyone who uses these flags will get warnings.
  • Tell everyone that the way to fix these warnings is to audit their code to make sure it's not making unwarranted assumptions about strides, and then switch to using the new un-deprecated flags.
  • Do something similar with NPY_CORDER_FIXED, np.array(..., order="C_FIXED"), arr.flags.c_contiguous_fixed ... Ugh, this is looking uglier the more I think about it.
  • After a while, make using the old stuff an error.
  • Optional: After a while, re-add the original names, but now with the new semantics. (Optional because, if we can find a name that is less horrible than ..._FIXED in the first place, then we might be able to skip this step.)

Hmm. This wouldn't be so bad if it were just a matter of having to remember for a few releases that, say, NPY_ARRAY_C_CONTIG and NPY_ARRAY_C_CONTIGUOUS were subtly different things. But there's really no acceptable alternative to order="C", is there? And there is almost certainly code out there that does np.array(..., order="C") and then calls a C function which blithely assumes the strides are 'clean'. I don't see how to detect such code and issue a warning except by changing how order="C" works somehow.

@seberg: You found a bunch of these problems in sklearn with a "quick look" -- did it give you any ideas for how we can find more of these problems semi-automatically?

Anyone else have any ideas on heuristics we could use to issue warnings at appropriate points without drowning the user in warnings?

@seberg
Owner

I actually now think that Cython code (and with that sk-learn) is in a way OK, because Cython seems to use the buffer interface, and that probably should (and easily can) clean up the strides of the returned buffer instead of just setting them to the original strides.

Numpy could ensure clean strides for np.array(..., order='C') and equivalent C-API functions; though it wastes a few CPU cycles and is a bit tedious, it does not seem too bad to me even for the future. So maybe it is unnecessary to deprecate order. If users out there check only the flags (the code we have seen break so far doesn't) and then rely on clean strides, a deprecation process for the stride flags would of course be necessary, I guess.

I have started a bit on hacking so that the buffer interface returns cleaned-up strides and numpy also tidies up strides when explicitly asked for it (note that e.g. np.array(..., order='C', copy=False) would still have both flags as now, but the strides would be clean when interpreted as a C-contiguous array).

@njsmith I just grepped for .strides[, PyArray_STRIDES(, etc. ... it's all not too common, and if it's not used with something like -1, etc., I can assume it's fine... You could also make things crash more often by changing numpy. But semi-automatic, I am not sure. It's probably not hard for someone who knows what is going on to tell whether a project should be OK... for example I also grepped through Theano code and I think it should be fine.
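
For reference, "clean" C-order strides are fully determined by shape and itemsize, which is what makes such tidying cheap. A minimal sketch (hypothetical helper, mirroring the dims[i] ? dims[i] : 1 logic in the diff below):

    def clean_c_strides(shape, itemsize):
        # Strides a C-contiguous array "should" carry, including for
        # dimensions of size 0 or 1 (a 0 is treated as 1 for the product).
        strides = []
        step = itemsize
        for dim in reversed(shape):
            strides.append(step)
            step *= dim if dim > 0 else 1
        return tuple(reversed(strides))

    assert clean_c_strides((2, 1, 3), 8) == (24, 24, 8)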

@stefanv
Owner

@njsmith I'm just trying to clarify this in my mind: a row-vector that comes from slicing a C-contiguous array must be F-contiguous False and C-contiguous False, correct?

@seberg
Owner

It's a work in progress, but https://github.com/seberg/numpy/compare/clean_requested_strides fixes those issues I know of in scipy and sk-learn (by cleaning up strides when a buffer/array is specifically requested as contiguous), though I did not test the buffer interface change yet. It doesn't clean anything in the python API itself yet, etc. Just to note: that change would leave the flags as they are in master.

@nouiz

@seberg asked on gh-2949 for my opinion on the strides stuff. I agree that in an ideal world, programs shouldn't use the strides of dimensions of size 1 without checking them. But I think that having a simple, deterministic behavior is important.

I also think that such a change has far-reaching implications that are hard to see; the amount of code broken by this is an example. So if there is such a change, it should be done for NumPy 2.0. People don't want to re-validate all the code they wrote to check for that. Also, I think we should make it easy for people to detect problems in their code. For example, being able to compile a NumPy that uses the MAX_SSIZE_T value for those strides would probably make offending programs crash.

Whatever convention is decided, I think the aligned/C-contiguous/F-contiguous flags MUST NOT depend on the strides of those dimensions. This can be done either by assigning as strides some value that makes those flags come out as they were computed in the past, or by updating the computation of those flags to check for the new convention.

So here is the list of possible values that could be assigned as strides for dimensions of size 1:

1) Always use 0. This is useful for parallel algorithms that just compute index * stride for each dimension; there are no special cases for broadcast dimensions. In Theano GPU code, we set the strides of broadcast dimensions to 0 before calling the GPU code (see the sketch after this list).
2) Always use some dummy value like MAX_SSIZE_T (probably the wrong macro, this is just the idea). This helps detect users that rely on the strides of broadcastable dimensions, as it will probably make their program crash.
3) Use strides that make naive nested for loops do the broadcasting automatically. An example describes this best: I want to add two float64 ndarrays with shapes (10, 10) and (1, 10). Under this convention in2, the broadcast one, gets strides (-80, 8), so that the outer-loop pointer bump resets ptr_in2 back to the start of its single row:

    char *ptr_in1 = in1.data;
    char *ptr_in2 = in2.data;
    char *ptr_out = out.data;
    /* Strides are in bytes; in1 and out are C-contiguous, so after the
       inner loop their pointers already sit at the start of the next row. */
    for (int i = 0; i < out_shape[0]; i++, ptr_in2 += in2.strides[0])
        for (int j = 0; j < out_shape[1]; j++,
                ptr_in1 += in1.strides[1], ptr_in2 += in2.strides[1],
                ptr_out += out.strides[1])
            *(double *)ptr_out = *(double *)ptr_in1 + *(double *)ptr_in2;

Here, the "ptr_in2 += in2.strides[0]" (with strides[0] == -80) automatically resets ptr_in2 to the right place, so there is no special case for broadcasting. This means fewer special cases, simpler code, and it is faster than having a condition in the loop.

This convention is only good for serial code; as we move to parallel architectures, I don't think it is wise. Code can pre-arrange the strides it uses to get this effect itself.

4) Keep the past value: the previous strides are reused. This lets us leave the computation of the aligned/C/F flags unchanged, and it doesn't break much code.
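
For reference, option 1 is something stride tricks can already express today; a minimal Python sketch (illustrative only):

    import numpy as np
    from numpy.lib.stride_tricks import as_strided

    row = np.arange(10, dtype=np.float64)            # shape (10,), itemsize 8
    # Stride 0 on the broadcast dimension: b[i, j] == row[j] for every i,
    # with no data copied.
    b = as_strided(row, shape=(10, 10), strides=(0, 8))
    assert (b == row).all()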

Also, making 1-d arrays C- and F-contiguous at the same time or neither (as for size-1 arrays) would simplify implementations.

As a conclusion, I think we shouldn't break other people's code with stride changes until NumPy 2.0; those breakages are hard to find and debug. I consider the numpy strides of broadcast dimensions part of the NumPy API, as this is something needed when writing performance code. People who write C code against NumPy do it for only one reason: performance! So even if it wasn't written down formally anywhere, I think we should consider this part of the NumPy interface and deal with it as such: warn people, allow them to identify problems, and only later make changes that break code.

This discussion seems split across several places. If I missed important parts of it, can you point me to them for reading?

In any case, thanks all for your hard work. I know that improving a library without breaking stuff is not trivial.

@pv
Owner
pv commented

@seberg: as a way forward --- would it be simplest to first completely revert all the commits in gh-2694 and get that merged ASAP? After that, the "good" parts could go in a new PR (once we have a clearer idea what 3rd party code actually expects). I think it's not good to keep the master branch in a "known to be sort of wonky" state waiting for a decision on how to fix it or which parts to revert.

@seberg
Owner

Well... I honestly was not aware that this caused much trouble out there (I did not know about those cython memoryview issues). To be honest, this PR basically just reverts to the old behaviour.
Yes, it keeps more, but those parts are bug fixes and make the old behavior sane. At this time your cython example probably already fails for corner cases (think 0-sized arrays), and this would "fix" that (though maybe not quite, because for 0-sized arrays numpy sometimes multiplies by 0, but for array creation multiplies the stride by 1).

So no, I personally think it would be better to just review this and then merge it, if it should be done fast. That was the intention at one point anyway (even if, after that, I thought that just fixing the strides of explicitly requested contiguous arrays might be the better option to push forward, though that fails to "fix" np.ndarray[...] cython misuses).

Thanks for the comment @nouiz; I like the point about the stride value being 0 or such, definitely something to keep in mind.
At this time it seems to me that the only thing that could be done to push toward ultimately changing the behaviour would be to provide something like NPY_ARRAY_C_MEM_CONTIGUOUS, use it inside numpy (it seems more elegant in some places), and say that ultimately that definition may be adopted as the only one.

@pv
Owner
pv commented

@seberg: indeed, my point was more of an organizational one (I didn't take a close look at the code changes, except that the full diff is not that big) --- a full revert is easier to understand and review, as it doesn't really involve making decisions, and it removes the unnecessary pressure to do something fast. But I'm just "cheering" from the sidelines here, I trust your judgment :)

@seberg
Owner

Well, this would be very simple... though I do not know how to do such flagging (I guess a macro, but then the environment flag, etc...). In any case, it would basically be just this code, with the flag-setting changes optionally staying in (the other changes are just things that would become unnecessary, but they don't hurt either).

In any case, I somewhat think it is a decent option (hoping that, with cython changing its logic, there is not much out there that gets broken), but I really have no idea how much headache it might cause out there, so I don't have a real opinion...

@nouiz

I like the idea of an env variable to help people debug their code. Don't forget, few people compile numpy themselves, and compiling it with an optimized blas isn't trivial; so forcing people to do that is wrong, I think.

But I agree that we probably need to swap the behavior in NumPy 2.0. @GaelVaroquaux, do you also object to having an env variable in the 1.* series that enables the new behavior, to allow easy testing of libraries against it? Also, I don't think people will try to mess with it as you suggested; they have no incentive to enable the new behavior in released versions, just when they debug/develop.

@seberg
Owner

Ok, just so that I know where we stand here. First, is there agreement that it is worth making this change in the long term?
Second, we should create an environment variable to help debug this. That would mean numpy reverts to the old behaviour (unless the variable is set, at which point it would even actively give weird strides in many cases, so that errors get triggered with a higher probability during testing).

Numpy may then start defaulting to it during a beta test phase in a while, to hopefully make devs aware of whether they have problems. That would certainly be only after the cython issues get fixed, though, as before that it would trigger too often without the affected developers having any influence.
Then, probably with 2.0, it would be possible to do the actual change.

Both a compile-time and a runtime variable seem fine to me, but while I can code the rest, I would need a hint on how to implement either one to get at that variable.

@charris
Owner

@seberg Can this be closed now?

@seberg
Owner

I wish. No, this is still a problem and must be fixed, in this or some other form, before releasing 1.8...

@njsmith
Owner

@seberg: no-one seems to have objected to that summary of things, so yes, I guess we have agreement!

@njsmith
Owner

@seberg: Now that we're gearing up for 1.8, what's the status on this? Do you think you'll have time to implement the switch logic discussed above soon?

@seberg
Owner

@njsmith, I think I can get it ready. Was it a compile-time flag? I basically need to figure out where to add that flag/global variable; after that it is just mixing a bit of this PR with the other... it shouldn't be a huge amount of work.

@seberg
Owner

Closing this. For anyone who is interested, gh-3162 is the new pull request, which supersedes this one. It is basically this pull request, but it introduces a flag to keep the current master behaviour (plus additional debug help).

@seberg seberg closed this
6 numpy/core/include/numpy/ndarraytypes.h
@@ -756,9 +756,9 @@ typedef int (PyArray_FinalizeFunc)(PyArrayObject *, PyObject *);
#define NPY_ARRAY_F_CONTIGUOUS 0x0002
/*
- * Note: all 0-d arrays are C_CONTIGUOUS and F_CONTIGUOUS. An N-d
- * array that is C_CONTIGUOUS is also F_CONTIGUOUS if only
- * one axis has a dimension different from one (ie. a 1x3x1 array).
+ * Note: all 0-d arrays are C_CONTIGUOUS and F_CONTIGUOUS. If a
+ * 1-d array is C_CONTIGUOUS it is also F_CONTIGUOUS. Higher
+ * dimensional arrays can be both if they have a size of 0 or 1.
*/
/*
32 numpy/core/src/multiarray/ctors.c
@@ -3565,33 +3565,14 @@ _array_fill_strides(npy_intp *strides, npy_intp *dims, int nd, size_t itemsize,
int inflag, int *objflags)
{
int i;
- npy_bool not_cf_contig = 0;
- npy_bool nod = 0; /* A dim != 1 was found */
-
- /* Check if new array is both F- and C-contiguous */
- for (i = 0; i < nd; i++) {
- if (dims[i] != 1) {
- if (nod) {
- not_cf_contig = 1;
- break;
- }
- nod = 1;
- }
- }
-
/* Only make Fortran strides if not contiguous as well */
if ((inflag & (NPY_ARRAY_F_CONTIGUOUS|NPY_ARRAY_C_CONTIGUOUS)) ==
NPY_ARRAY_F_CONTIGUOUS) {
for (i = 0; i < nd; i++) {
strides[i] = itemsize;
- if (dims[i]) {
- itemsize *= dims[i];
- }
- else {
- not_cf_contig = 0;
- }
+ itemsize *= dims[i] ? dims[i] : 1;
}
- if (not_cf_contig) {
+ if ((nd > 1) && ((strides[0] != strides[nd-1]) || (dims[0] > 2))) {
*objflags = ((*objflags)|NPY_ARRAY_F_CONTIGUOUS) &
~NPY_ARRAY_C_CONTIGUOUS;
}
@@ -3602,14 +3583,9 @@ _array_fill_strides(npy_intp *strides, npy_intp *dims, int nd, size_t itemsize,
else {
for (i = nd - 1; i >= 0; i--) {
strides[i] = itemsize;
- if (dims[i]) {
- itemsize *= dims[i];
- }
- else {
- not_cf_contig = 0;
- }
+ itemsize *= dims[i] ? dims[i] : 1;
}
- if (not_cf_contig) {
+ if ((nd > 1) && ((strides[0] != strides[nd-1]) || (dims[nd-1] > 2))) {
*objflags = ((*objflags)|NPY_ARRAY_C_CONTIGUOUS) &
~NPY_ARRAY_F_CONTIGUOUS;
}
39 numpy/core/src/multiarray/flagsobject.c
@@ -90,8 +90,13 @@ PyArray_UpdateFlags(PyArrayObject *ret, int flagmask)
* Check whether the given array is stored contiguously
* in memory. And update the passed in ap flags apropriately.
*
- * A dimension == 1 stride is ignored for contiguous flags and a 0-sized array
- * is always both C- and F-Contiguous. 0-strided arrays are not contiguous.
+ * 0-strided arrays are not contiguous. 0-sized arrays are considered
+ * contiguous if their strides are ok up to and including the dimension
+ * that is 0. This means 0-sized arrays can be both C- and F-Contiguous.
+ * Arrays with ndim<2 are either both C- and F-Contiguous or neither.
+ * For contiguous flags to be set the strides must match even if they are
+ * unused because the corresponding dimension is 1. This is because of 3rd
+ * party code relying ie. on strides[-1] == itemsize (for C-Contiguous).
*/
static void
_UpdateContiguousFlags(PyArrayObject *ap)
@@ -104,18 +109,15 @@ _UpdateContiguousFlags(PyArrayObject *ap)
sd = PyArray_DESCR(ap)->elsize;
for (i = PyArray_NDIM(ap) - 1; i >= 0; --i) {
dim = PyArray_DIMS(ap)[i];
- /* contiguous by definition */
- if (dim == 0) {
- PyArray_ENABLEFLAGS(ap, NPY_ARRAY_C_CONTIGUOUS);
- PyArray_ENABLEFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
- return;
+ if (PyArray_STRIDES(ap)[i] != sd) {
+ is_c_contig = 0;
+ break;
}
- if (dim != 1) {
- if (PyArray_STRIDES(ap)[i] != sd) {
- is_c_contig = 0;
- }
- sd *= dim;
+ /* contiguous, if it got this far */
+ if (dim == 0) {
+ break;
}
+ sd *= dim;
}
if (is_c_contig) {
PyArray_ENABLEFLAGS(ap, NPY_ARRAY_C_CONTIGUOUS);
@@ -128,13 +130,14 @@ _UpdateContiguousFlags(PyArrayObject *ap)
sd = PyArray_DESCR(ap)->elsize;
for (i = 0; i < PyArray_NDIM(ap); ++i) {
dim = PyArray_DIMS(ap)[i];
- if (dim != 1) {
- if (PyArray_STRIDES(ap)[i] != sd) {
- PyArray_CLEARFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
- return;
- }
- sd *= dim;
+ if (PyArray_STRIDES(ap)[i] != sd) {
+ PyArray_CLEARFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
+ return;
+ }
+ if (dim == 0) {
+ break;
}
+ sd *= dim;
}
PyArray_ENABLEFLAGS(ap, NPY_ARRAY_F_CONTIGUOUS);
return;
14 numpy/core/src/multiarray/multiarraymodule.c
@@ -1517,18 +1517,26 @@ PyArray_EquivTypenums(int typenum1, int typenum2)
/*** END C-API FUNCTIONS **/
static PyObject *
-_prepend_ones(PyArrayObject *arr, int nd, int ndmin)
+_prepend_ones(PyArrayObject *arr, int nd, int ndmin, NPY_ORDER order)
{
npy_intp newdims[NPY_MAXDIMS];
npy_intp newstrides[NPY_MAXDIMS];
+ npy_intp newstride;
int i, k, num;
PyArrayObject *ret;
PyArray_Descr *dtype;
+ if (order == NPY_FORTRANORDER || PyArray_ISFORTRAN(arr) || PyArray_NDIM(arr) == 0) {
+ newstride = PyArray_DESCR(arr)->elsize;
+ }
+ else {
+ newstride = PyArray_STRIDES(arr)[0] * PyArray_DIMS(arr)[0];
+ }
+
num = ndmin - nd;
for (i = 0; i < num; i++) {
newdims[i] = 1;
- newstrides[i] = PyArray_DESCR(arr)->elsize;
+ newstrides[i] = newstride;
}
for (i = num; i < ndmin; i++) {
k = i - num;
@@ -1669,7 +1677,7 @@ _array_fromobject(PyObject *NPY_UNUSED(ignored), PyObject *args, PyObject *kws)
* create a new array from the same data with ones in the shape
* steals a reference to ret
*/
- return _prepend_ones(ret, nd, ndmin);
+ return _prepend_ones(ret, nd, ndmin, order);
clean_type:
Py_XDECREF(type);
9 numpy/core/src/multiarray/shape.c
@@ -215,8 +215,9 @@ PyArray_Newshape(PyArrayObject *self, PyArray_Dims *newdims,
* because we can't just re-use the buffer with the
* data in the order it is in.
*/
- if ((order == NPY_CORDER && !PyArray_IS_C_CONTIGUOUS(self)) ||
- (order == NPY_FORTRANORDER && !PyArray_IS_F_CONTIGUOUS(self))) {
+ if ((PyArray_SIZE(self) > 1) &&
+ ((order == NPY_CORDER && !PyArray_IS_C_CONTIGUOUS(self)) ||
+ (order == NPY_FORTRANORDER && !PyArray_IS_F_CONTIGUOUS(self)))) {
int success = 0;
success = _attempt_nocopy_reshape(self, ndim, dimensions,
newstrides, order);
@@ -1076,8 +1077,6 @@ build_shape_string(npy_intp n, npy_intp *vals)
* WARNING: If an axis flagged for removal has a shape equal to zero,
* the array will point to invalid memory. The caller must
* validate this!
- * If an axis flagged for removal has a shape larger then one,
- * the arrays contiguous flags may require updating.
*
* For example, this can be used to remove the reduction axes
* from a reduction result once its computation is complete.
@@ -1100,4 +1099,6 @@ PyArray_RemoveAxesInPlace(PyArrayObject *arr, npy_bool *flags)
/* The final number of dimensions */
fa->nd = idim_out;
+
+ PyArray_UpdateFlags(arr, NPY_ARRAY_C_CONTIGUOUS | NPY_ARRAY_F_CONTIGUOUS);
}
15 numpy/core/tests/test_api.py
@@ -204,7 +204,6 @@ def check_copy_result(x, y, ccontig, fcontig, strides=False):
def test_contiguous_flags():
a = np.ones((4,4,1))[::2,:,:]
- a.strides = a.strides[:2] + (-123,)
b = np.ones((2,2,1,2,2)).swapaxes(3,4)
def check_contig(a, ccontig, fcontig):
@@ -214,8 +213,8 @@ def check_contig(a, ccontig, fcontig):
# Check if new arrays are correct:
check_contig(a, False, False)
check_contig(b, False, False)
- check_contig(np.empty((2,2,0,2,2)), True, True)
- check_contig(np.array([[[1],[2]]], order='F'), True, True)
+ check_contig(np.empty((2,2,0,2,2)), True, False)
+ check_contig(np.array([[[1],[2]]], order='F'), False, True)
check_contig(np.empty((2,2)), True, False)
check_contig(np.empty((2,2), order='F'), False, True)
@@ -224,11 +223,11 @@ def check_contig(a, ccontig, fcontig):
check_contig(np.array(a, copy=False, order='C'), True, False)
check_contig(np.array(a, ndmin=4, copy=False, order='F'), False, True)
- # Check slicing update of flags and :
- check_contig(a[0], True, True)
- check_contig(a[None,::4,...,None], True, True)
- check_contig(b[0,0,...], False, True)
- check_contig(b[:,:,0:0,:,:], True, True)
+ # Check slicing update of flags:
+ check_contig(a[0], True, False)
+ # Would be nice if this was C-Contiguous:
+ check_contig(a[None,0,...,None], False, False)
+ check_contig(b[0,0,0,...], False, True)
# Test ravel and squeeze.
check_contig(a.ravel(), True, True)