-
-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
np.any and np.all short-circuiting (Trac #1237) #1835
Comments
Attachment added by trac user lucaciti on 2009-09-24: fix1237.patch |
trac user lucaciti wrote on 2009-09-24 Please note that in the table the rows with "//" mean that the condition is never satisfied. This means that the improvement we see between the starred and the non-starred versions cannot be due to the short-circuiting. I suspect it is due to the fact that
Therefore, I think, the same trick I used for logical_and and logical_or could be extended to all functions and all reduction operations would benefit from it (e.g. the ubiquitous x.sum(), x.mean(), ...). One thing I do not get is why the timings for "any" and "all" are different and the order changes between the starred and non-starred version. The only difference is in the operation and/or and the check for the short-circuiting ==0/!=0. |
@charris wrote on 2009-09-25 What happens if you do the same trick for add? |
trac user lucaciti wrote on 2009-09-25 I specialized the loops for the common operations so that they use a different loop for inplace operations and for reductions. It looks like having a specialized loop for inplace operations only gives a slight boost (at least with the vector length and data types I have tried). Instead, the specialized reduce loop gives a significant boost around: 60% faster. As many operations rely on reductions (e.g. mean, std, ...), this can have a widespread positive effect on the speed. Below the results of the test I have done so far. For "any" and "all" only the most favourable (a short-circuit at the first element) and the least favourable ("zeros" for "any" and "ones" for "all") cases are shown.
|
Attachment added by trac user lucaciti on 2009-09-25: fix-1237_rev2.patch |
@charris wrote on 2009-09-25 Looks the reduce part is worth doing. Can you think up a pair of macros for this case? I'm thinking of something that looks like
I kind of detest burying control in macros, but sometimes it is the only way to make all the code look alike, which is important in massively repetitive stuff like the loops. If you come up with something, make it a separate patch. |
trac user lucaciti wrote on 2009-09-25 Hello! You mean the reduce and the short-circuiting? I agree with you, I like the "if" outside the macro better. I think the solution I found isn't that bad. Let me know if you meant something different. |
Attachment added by trac user lucaciti on 2009-09-25: fix-1237_rev3.patch |
@charris wrote on 2009-09-25 Looks better, but the use of a macro argument bothers me, the idea is to hide most of the repetitive stuff that doesn't serve to clarify the code. I don't think the inplace case is worth the hassle. Style wise, the rest of the code has been using
I'm not adamant about that and seem to be in the minority in that preference, but until it is discussed on the list and someone (else) steps forward to change all the current code I would prefer it to conform to current usage. Make two patches, one for the macros, a second for the loop changes. |
trac user lucaciti wrote on 2009-09-25 I am fine with the style thing and the two patches. But concerning to the "use of a macro argument", I do not see how to remove it. The "repeat" thing happens before the preprocessor, right? If you have a better solution you can implement it differently, I do not mind, honestly. Thanks! |
@charris wrote on 2009-09-25 I was hinting that you could bring the style thing up on the list, where I suspect you will find plenty of support. I was instigating a revolt against myself ;) On the macro argument thing, the Boolean section is an anomaly because the output type differs from the input type. There are some other functions like that, but they are all unary. Hmm, I think the template argument should take at most the type and that the variable name io1 should be hardwired. That would make the function of the template a bit clearer. Let me think about it a bit more. |
Attachment added by trac user lucaciti on 2009-09-26: fix-1237_rev4.patch |
trac user lucaciti wrote on 2009-09-26 The macro thing looks better now. I tend to use "else" between the two braces but it is a matter of taste. I think I understand the reason why numpy programmers like the other way better. I think they mentally associate the open brace with a semicolon, just ignore the closed brace... et voilà, they can pretend they are programming in python :-D |
@charris wrote on 2009-09-26 Looks good, thanks. Maybe use BINARY_REDUCE_LOOP, but if you don't do that I'll commit it as is tomorrow. I prefer the not-two-braces thing because my aging eyes find it a bit easier to catch the "else" than the "}" when scanning down the edge of the code, especially if 'else' is colorized. Most do prefer the two-braces, however, and that style takes up less space. |
Attachment added by trac user lucaciti on 2009-09-27: fix-1237_rev5.patch |
@charris wrote on 2009-09-27 Applied in r7420, thanks. I removed some trailing whitespace and changed
to
You can probably set up your editor to highlight trailing whitespace. There are also some scripts that will do that. |
trac user lucaciti wrote on 2009-09-27 Thanks! Perfect. |
Original ticket http://projects.scipy.org/numpy/ticket/1237 on 2009-09-24 by trac user lucaciti, assigned to unknown.
I noticed that python's "any" can be faster than numpy's "any" (and the similarly for "all").
Then I wondered why.
I realized that numpy implements "any" as logical_or.reduce (and "all" as logical_and.reduce).
This means that numpy cannot take advantage of short-circuiting.
Looking at the timings confirmed my suspects.
I think python fetches one element at the time from the array and as soon as any of them is true it returns true.
Instead, numpy goes on until the end of the array even if the very first element is already true.
Looking at the code I think I found a way to fix it.
In the following table any(x) is python's version, np.any(x) is numpy's, while np.any(x) is mine.
The text was updated successfully, but these errors were encountered: