ENH: Add support for PEP 3118 buffers with offsets (non-contiguous) #7467
The ndim > 2 case does not quite work yet, and I need to figure out where to get test cases for it.
Added support for PEP 3118 PIL-style buffers in multiarray constructor.
I fixed the ndim > 2 case, added a test and squashed the commits. This is ready for review. Note that the case of multiple nonnegative suboffsets is still untested.
numpy/core/src/multiarray/ctors.c
goto fail;
if (PyArray_SetBaseObject((PyArrayObject *)r, memoryview) < 0) {
Py_DECREF(r);
goto fail;
PyArray_NewFromDescr steals a reference to descr, so the gotos above should not go to fail:, where Py_XDECREF(descr); would release descr a second time.
Can you add some text summarizing what "PEP 3118 PIL-style buffers" are and why they need special handling? (Is this the Python Imaging Library, PIL, or something else?) I'm sure I could figure it out with some research but we reviewers are lazy folk ;-)
"PEP 3118 PIL-style buffers" are an integral part of PEP 3118. Here is the description from the PEP: """ Where would you like me to add this text (other than here)?
BTW, I cannot figure out what Travis CI is unhappy about.
}
/* compute blocksize */
blocksize = itemsize;
for (ax = last_axis + 1; ax < ndim; ++ax)
If all suboffsets[ax] < 0, then last_axis never gets initialized by the time we get here. Even if that is not really possible (is it?), the compiler does not seem able to figure it out, and that's what is making Travis unhappy. I guess the right value to initialize it to at the beginning of the function is -1?
suboffsets cannot all be negative, because that would mean no dereferencing, and the PEP requires that the suboffsets pointer be NULL in this case. If not for that requirement, the right initial value would indeed be -1, but I don't want to introduce confusion between a real -1 and a -1 that means ndim - 1, so I just initialize last_axis to 0.
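For readers unfamiliar with suboffsets, here is a minimal Python sketch of the dereferencing rule that PEP 3118 describes: add stride times index along each axis, and where the suboffset is nonnegative, follow the pointer stored at that position and then add the suboffset. It is not the code from this PR. The helper names (make_row_pointer_table, get_item_address, read_item) and the int32 item type are assumptions chosen for illustration, and ctypes is used only to build a small table of row pointers in memory.

```python
import ctypes

ITEM = ctypes.c_int32  # assumed item type for this illustration


def make_row_pointer_table(rows):
    """Build an 'array of pointers to rows' layout (hypothetical helper).

    Returns (row_buffers, table). row_buffers must be kept alive for as
    long as the addresses stored in table are used.
    """
    row_type = ITEM * len(rows[0])
    row_buffers = [row_type(*row) for row in rows]
    table_type = ctypes.c_void_p * len(rows)
    table = table_type(*(ctypes.addressof(rb) for rb in row_buffers))
    return row_buffers, table


def get_item_address(buf_addr, strides, suboffsets, indices):
    """Follow the PEP 3118 rule for locating one item.

    For each axis: advance by stride * index; if the suboffset for that
    axis is nonnegative, dereference the pointer found there and add the
    suboffset to the result.
    """
    pointer = buf_addr
    for stride, suboffset, index in zip(strides, suboffsets, indices):
        pointer += stride * index
        if suboffset >= 0:
            pointer = ctypes.c_void_p.from_address(pointer).value + suboffset
    return pointer


rows = [[1, 2, 3], [4, 5, 6]]
row_buffers, table = make_row_pointer_table(rows)

# Axis 0 steps through the pointer table and is dereferenced (suboffset 0);
# axis 1 steps through the items of a row and is not (suboffset -1).
strides = (ctypes.sizeof(ctypes.c_void_p), ctypes.sizeof(ITEM))
suboffsets = (0, -1)


def read_item(indices):
    addr = get_item_address(ctypes.addressof(table), strides, suboffsets, indices)
    return ITEM.from_address(addr).value


values = [[read_item((i, j)) for j in range(3)] for i in range(2)]
```

In this layout only axis 0 has a nonnegative suboffset, so everything after that axis is an ordinary strided block, which is presumably the kind of block the blocksize loop quoted above measures.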
numpy/core/src/multiarray/ctors.c
_copy_from_pil_style(char* dest, int ndim, char *buf, npy_intp *shape,
npy_intp *strides, npy_intp *suboffsets, npy_intp itemsize)
{
int ax, last_axis = 0;
It may be good to add a comment like /* setting last_axis to anything to silence compiler warnings */ so that no one ends up spending long hours trying to figure out why 0 and not the seemingly more reasonable -1.
I added a comment. Please note that I plan to squash the commits that start with "FIX:" after the review.
These types of buffers will be copied, while typically we use a view into the buffer, as far as I understand. Just for consideration; I am not quite sure yet whether there is anything to worry about at all.
@seberg - yes, I thought about this. Maybe we should only allow While it is not possible to create an array view of a PIL buffer, it is possible to create an array of views into its components in many cases, but I don't think this is what the default constructor should do.
Yeah, that kind of special array of arrays should not happen; it would be a different thing/function.
It's already the case that
It is not that they change, but that the same type might sometimes give a copy and sometimes a view. Of course, anyone could easily make such an object already. But yes, I am not sold on taking the safe but maybe uncomfortable side here. Just unsure.
I'm not 100% sure myself, but I feel like most code that calls
And in general code that calls
What of pillow?
@charris - I've installed pillow and apparently it does not support PEP 3118.
I'll investigate how it manages to be compatible with numpy. It probably supports one of the older protocols:
It looks like pillow supports array interface version 3 (whatever that is) but not PEP 3118.
Isn't it ironic, given that the PEP cites PIL as an application?
The PEP came too late, PIL died around 2009 ;) Hence Pillow. The two are incompatible, so remove PIL before installing Pillow. You might open an issue on the Pillow site; I really don't think numpy should special-case it, rather the other way around. Note that scikit-image supports PIL images among other things. I'm not clear on exactly what functionality you need, but perhaps that will help.
I'm inclined to close this unless someone wants to keep it open.
Well, PIL-style buffers can be used by other people as well, possibly DyND for example. As far as I understand, all that "PIL style" means is the use of suboffsets, i.e. pointers to chunks.
The original poster's use case is PyQ upthread. Given that there are users and it's in a standard (PEP 3118), supporting it seems reasonable, even if the name is out of date. Though I guess one might ask if pyq would be better off using a part of PEP 3118 that other people actually picked up on :-) Is there an easy way to delegate the unpacking to something like
Maybe a stupid question, but by any chance, PyQ does not implement their GetBuffer in such a way that using PyObject_GetBuffer with the
@seberg - PyQ is Python running inside kdb+ and uses native objects as data buffers. In kdb+, vector objects are extremely simple: an 8-byte header (with reference count, type code etc.), followed by a 64-bit integer length, followed by a contiguous array of elements. There is no support for arrays with ndim>1, but you can have vectors of vectors where each element is a pointer to a similar structure. These vectors of vectors are flexible enough to emulate arrays with an arbitrary ndim (and more).
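To make the "vectors of vectors" layout concrete, here is a small Python model of it, together with the copy into one contiguous block that a consumer like numpy would need. This is only an analogy, not kdb+ or PyQ code: nested Python lists stand in for vectors of pointers, array('q') stands in for a flat vector of 64-bit integers, and copy_nested_to_contiguous is a hypothetical helper that assumes non-empty, rectangular nesting.

```python
from array import array


def copy_nested_to_contiguous(node):
    """Copy nested vectors into one flat array and report the n-d shape.

    Leaves are array objects (contiguous elements); inner nodes are lists
    of children (the stand-in for a vector of pointers). Assumes every
    inner node is non-empty.
    """
    if isinstance(node, array):
        # A leaf is already contiguous; copy it so the result owns its data.
        return (len(node),), array(node.typecode, node)

    parts = [copy_nested_to_contiguous(child) for child in node]
    child_shape, first = parts[0]
    flat = array(first.typecode)
    for shape, data in parts:
        if shape != child_shape:
            raise ValueError("ragged nesting cannot be represented as an n-d array")
        flat.extend(data)
    return (len(node),) + child_shape, flat


# A 2 x 2 x 2 "tensor" built from four independent leaf vectors.
tensor = [
    [array("q", [1, 2]), array("q", [3, 4])],
    [array("q", [5, 6]), array("q", [7, 8])],
]
shape, flat = copy_nested_to_contiguous(tensor)
```

Because each leaf lives in its own allocation, no single (pointer, strides) pair can describe the whole structure, which is why a consumer that wants one contiguous block has to copy rather than take a view.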
@njsmith - it does and it works fine for ndim <= 1 (scalars and 1d vectors). However, the layout of matrices and higher rank tensors is an array of pointers to arrays (of pointers ...). It would be quite awkward to have
Ping. Is there anything I can do to move this forward?
This adds quite a bit of non-trivial C code. What do you think of an alternative implementation strategy: if the buffer has suboffsets, create a bytearray from it (letting the bytearray code worry about dealing with the complicated bits), and then use the bytearray as our backing buffer for the array?
@njsmith - what do you mean by bytearray? Whatever it is, is it currently capable of initializing itself from a buffer with suboffsets?
I mean the built-in type
Using
I'm interested in moving this forward :).
This should make this PR quite doable I think. To summarize my understanding of the way we could implement this: If a buffer with suboffsets is encountered during For Before I proceed, are there any objections to me opening a new PR?
I think it is still of interest, but maybe we should close the PR anyway, unless we want to pick it up in the foreseeable future. The use of EDIT: About my "copy-flag" comments above. Let's not worry about it. I think the only reasonable solution for that is to add the
We could open an issue. This PR really needs a rebase to continue.
I'll see if I can rebase this with a small effort.
Considering that this has been stalled for so long and nobody ever really asked for it, I am going to close this. However, we should make sure to error out gracefully for these buffers. If anyone finds that this is important, please feel free to resurface it, but the PR here probably needs a pretty big rebase.
Added support for PEP 3118 PIL-style buffers in multiarray
constructor.
Closes #5412.