Pythran may make a function slower? #1753
Comments
Thanks @charlotte12l for the detailed report. Your test case is interesting. Although I fail to reproduce your timings (I still get a slight performance improvement with pythran, see below), we can probably do better, and it's great to explore that. So first, here is how I run the benchmark: I modified your test function to remove noise:
and run the test through
python kernel: to understand where the pythran version spends time, I use
And then it's some pythran internal, but basically a third of the time is spent in the |
Thanks for your reply! I think optimizing array copy will be very useful! I can help test it after you implement it :) As for this function, I think the speed may vary on different machines, inputs, and even on different The good cases:
The bad cases:
|
Can you rerun your test now that #1867 is merged in? It may change the performance of the above code. |
Now the timing results are as follows; the Pythran version now seems slightly better than the Python version. Does that meet your expectations?
The Pythran function:
|
Yeah, as long as we no longer have a slowdown, the bug can be considered fixed. Let's discuss the profitability of swapping implementations in a scipy ticket. |
Addendum: considering f8/f9, there may be cache effects, so it's probably better to test each function independently. |
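One way to time each function independently is to run every measurement in a fresh interpreter, so one candidate cannot warm caches for the next. The following is only a minimal sketch using the standard library; `time_in_fresh_process` is a hypothetical helper, and the `setup`/`stmt` strings are placeholders rather than the benchmark used in this thread.

```python
import subprocess
import sys


def time_in_fresh_process(setup, stmt, number=1000, repeat=5):
    """Return the best per-call time in seconds of `stmt`, measured in a new interpreter."""
    # The child process runs timeit on its own, so it shares no Python-level
    # state or warmed-up data with other measurements.
    script = (
        "import timeit\n"
        f"times = timeit.repeat({stmt!r}, setup={setup!r}, "
        f"number={number}, repeat={repeat})\n"
        f"print(min(times) / {number})\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", script],
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout.strip())


if __name__ == "__main__":
    # Placeholder workloads; replace them with the import and call of each
    # function under test (for example the Python and the Pythran builds).
    setup = "data = list(range(1000))"
    for label, stmt in [("builtin sum", "sum(data)"), ("sorted copy", "sorted(data)")]:
        best = time_in_fresh_process(setup, stmt, number=200, repeat=3)
        print(f"{label}: {best * 1e6:.2f} us per call")
```

Taking the minimum over repeats reduces noise from other processes; hardware-level effects (CPU frequency scaling, shared system load) are not removed by this approach.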
I tried to use Pythran to improve this function:
and the test function is
However, when I compare its performance with the pure python version, I'm surprised to find it is slower...
Profiling `kendall_p_exact(n, c)` with line_profiler, I noticed that `np.cumsum` consumes about 75% of the total time. I guessed that the Pythran-accelerated `cumsum` might be slower than the original `np.cumsum`, which would explain why `test_pythran_kendall_p_exact()` is slower than `test_orig_kendall_p_exact()`. I did a small test, but found that the Pythran `cumsum` is actually faster than `np.cumsum`...
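An isolated micro-benchmark of a single operation can be sketched as below, sweeping over input sizes because the ranking of two implementations may change with size. This is an illustration under assumptions, not the test from the report: `bench_over_sizes` and `manual_cumsum` are hypothetical names, and the two pure-Python cumulative sums only stand in for `np.cumsum` and a Pythran-compiled equivalent.

```python
import timeit
from itertools import accumulate


def manual_cumsum(values):
    """Cumulative sum with an explicit loop (stand-in for one implementation)."""
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out


def bench_over_sizes(funcs, make_input, sizes, number=200, repeat=5):
    """Return {size: {name: best seconds per call}} for every candidate in `funcs`."""
    results = {}
    for size in sizes:
        data = make_input(size)  # same input object for every candidate
        results[size] = {
            name: min(timeit.repeat(lambda: fn(data), number=number, repeat=repeat)) / number
            for name, fn in funcs.items()
        }
    return results


if __name__ == "__main__":
    candidates = {
        "accumulate": lambda xs: list(accumulate(xs)),
        "manual loop": manual_cumsum,
    }
    table = bench_over_sizes(candidates, lambda n: [1.0] * n, sizes=[10, 1000, 100000], number=20)
    for size, timings in table.items():
        row = ", ".join(f"{name}: {t * 1e6:.1f} us" for name, t in timings.items())
        print(f"n={size}: {row}")
```

To compare the real candidates, the dictionary values would be replaced by `np.cumsum` and the function exported from the Pythran-compiled module, with `make_input` building a NumPy array of the requested length.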