Cython ND Array iterator #1510
What's the difference between pxi and pyx? |
iter.next()

del iter
assert np.sum(arr) == sum
Don't you need to import numpy somewhere? Also, don't overwrite built-in names...
Again, maybe my Cython noob-ness showing, but is it possible to make this more Pythonic? Here's how I would like it to read:
while iter:
    total += next(iter)
I particularly dislike the
Finally, quite often, we want not just the value but the current index. Is there a way for you to get that easily with a
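For reference, a minimal Python-level sketch of getting both the value and the current index, using NumPy's own `np.ndenumerate` and `np.nditer` with the `multi_index` flag. This is not the Cython API proposed in this PR; the array and the variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical example array; any N-D array works the same way.
arr = np.arange(6, dtype=np.float64).reshape(2, 3)

# np.ndenumerate yields (index_tuple, value) pairs in C order.
total = 0.0
visited = []
for index, value in np.ndenumerate(arr):
    visited.append(index)
    total += value

# np.nditer can also expose the current index when the
# multi_index flag is requested at construction time.
it = np.nditer(arr, flags=["multi_index"])
nditer_indices = []
for value in it:
    nditer_indices.append(it.multi_index)
```

Both loops run at Python speed, so they show the desired calling style rather than the performance a Cython-level iterator is after.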
@jni, a |
/cc @cournape |
/cc @njsmith |
I am summoned! Uh. What am I looking at, and why?
I probably missed the discussion, but how is this useful or better than |
@vighneshbirodkar is the |
I think they should be .pxd, not .pxi?
Why is this not a PR to Cython then? It should be relatively easy to generate the appropriate code whenever the iterable is a numpy array? So, that you can actually use nice syntax, e.g. |
This looks really weird to me. Aside from the part where you're writing a .pxi instead of a .pxd... you can't Like, if you just call the Python function |
Hello @njsmith, @stefanv, thanks for looking into this. Yes, the functionality offered by my code is indeed trivial. My only intent is to let its user not have to do pointer manipulation and C imports in order to iterate over an ND array. With this, all they have to do is call As you said TL;DR: The API (when complete) will let the user iterate over an ND array and its neighborhood without being familiar with the NumPy C API. Edit: corrected he to they
If you just remove the free() then you'll have a memory leak. You need to either call Py_D
Friendly reminder that not all programmers are male.
You're using nditer, just via a really limited api. The loss of speed of using the full api would be unmeasurable for any reasonable sized image and would make your code simpler -- all you really need for your purposes is some utility functions (not a class) that call next and return properly cast pointers. For the neighborhood iterator then yeah iirc there is currently no python wrapper for the constructor function so you also need to wrap that. (Maybe numpy should have a python level api for this built in though, if you get inspired... ;-)) |
@njsmith Now that you point it out, the class does seem like overkill; I will make the changes soon. I would disagree with you regarding the loss of speed. Python iterators would make Python function calls, which come with their own overhead. On the other hand
@njsmith I'll run a quick benchmark and share the result soon. |
@njsmith See https://github.com/vighneshbirodkar/cy_iter/blob/master/iter.pyx
The code is not exactly the same, but similar. The results on my system were:
Numpy took 0.000637054443359
ans = 2161916
Cython took 0.00178599357605
ans = 2161916
Python took 0.204711914062
ans = 2161916
I am assuming that the eventual operation intended is not something as simple as the arithmetic sum. Now think about iterating over neighbors, let's say over a 2D image. For each pixel, you would have a Python iterator going over 8 values. The difference would be even more obvious then.
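A rough Python-only sketch of this kind of comparison is below. It is not the benchmark linked above: the array contents, its size, and the helper names `python_sum` and `timed` are assumptions made for illustration, and it covers only the NumPy and pure-Python cases, not the Cython one.

```python
import time

import numpy as np

# Hypothetical test image; the values are arbitrary small integers.
arr = (np.arange(200 * 200) % 7).reshape(200, 200)


def python_sum(a):
    """Sum every element with a Python-level loop over np.nditer."""
    total = 0
    for value in np.nditer(a):
        total += int(value)
    return total


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


numpy_ans, numpy_time = timed(np.sum, arr)
python_ans, python_time = timed(python_sum, arr)

print("Numpy took", numpy_time, "ans =", int(numpy_ans))
print("Python took", python_time, "ans =", python_ans)
```

Absolute timings depend on the machine and the NumPy version, so only the relative gap between the two is meaningful.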
Read again, you're seeing what you expect me to say, not what I'm saying
@njsmith Also, multiple |
@njsmith (edit: If) I am not wrong the |
Ah, yeah, you're probably right :-). I am not at all an expert on the details of numpy's iteration APIs (and they're very confusing!), so while I tried to figure out what exactly was going on with I can't really say whether any particular API would be more or less convenient than just using the existing numpy API directly. If the only difference is that the new API is better documented then you're probably better off submitting doc patches :-). If you have other improvement ideas then we'd certainly like to hear them upstream too, though that wouldn't help in the short term (b/c skimage needs to support older numpys). |
Just my 2 cents - I have no problem with Cython up through typed memoryviews, but find the NumPy C API difficult to grok. Even if it's possible using pointers and the C API presently, there's a reason few of our algorithms go this route right now - it's a barrier both to implementation and maintenance. An efficient and elegant Cython implementation, even if just a wrapper (and even if slightly less efficient), would let us realize these gains more generally across the package. |
Since we support numpy 0.9, we can't assume the Npy_* API available to us. And, as the documentation[1] suggests, neighbourhood iteration is not I'll go ahead and refactor this class into utility functions using the same
[1] http://docs.scipy.org/doc/numpy/reference/c-api.iterator.html
Hi Vighnesh, You may want to take a look at the Cython implementation of labeling in Anyway, @njsmith pointed you in the wrong direction, but with good intentions. ;-) As you mentioned elsewhere, this is highly overlapping with @aman-iitj and his GSoC project. This said, I by no means want to push you away from working on this, but rather pull you in so that your work is coordinated with Aman's. |
cdef inline cnp.float_t get_float(self):
    """ Returns the current element as `float`. """
    cdef cnp.float_t *address = <cnp.float_t*>self.iter.dataptr
Unfortunately you cannot do this... At least not without previously checking that the array being iterated is aligned and its type not byteswapped. The dtype object has a pointer to a copyswap function, see here, that can be used to copy data into an aligned buffer which you can then dereference without worries.
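To illustrate the two conditions at the Python level (a sketch only, not the C-level `copyswap` route described above): `ndarray.flags.aligned` reports whether the buffer is suitably aligned for its dtype, and `dtype.isnative` reports whether the byte order matches the machine. The helper names `can_cast_pointer_directly` and `as_safe` are hypothetical, and the misaligned array is built artificially from an offset byte buffer.

```python
import numpy as np


def can_cast_pointer_directly(arr):
    """True if arr's data is aligned and stored in native byte order."""
    return bool(arr.flags.aligned and arr.dtype.isnative)


def as_safe(arr):
    """Return an aligned, native-byte-order array, copying only if needed."""
    native_dtype = arr.dtype.newbyteorder("=")
    return np.require(arr, dtype=native_dtype, requirements=["ALIGNED"])


native = np.arange(4, dtype=np.float64)

# Same values, stored with the opposite byte order (byteswapped).
swapped = native.astype(native.dtype.newbyteorder())

# A float64 view starting one byte into a uint8 buffer is misaligned.
raw = np.zeros(8 * 4 + 1, dtype=np.uint8)
misaligned = raw[1:].view(np.float64)

for name, a in [("native", native), ("swapped", swapped), ("misaligned", misaligned)]:
    print(name, can_cast_pointer_directly(a), can_cast_pointer_directly(as_safe(a)))
```

Reading such an array through a plain `cnp.float_t*` cast would give wrong values in the byteswapped case and an unaligned access in the misaligned one, which is why the check (or a copy) has to come first.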
Hello @jaimefrio
I plan to revisit this soon.
Could you elaborate on what you mean by aligned? Or could you point me to some documentation that explains it?
And I am guessing that by byteswapped you mean I should check the endianness of the array?
@jaimefrio If it can be written more clearly in C, shouldn't we do that and wrap the C with Cython? |
Thoughts appreciated.
Will submit the neighborhood iterator once this is approved.