Pythran version of scipy.optimize._group_columns #13336
@rgommers this one looks good. The CI issue seems unrelated. My local benchmarks show an interesting speedup:
$ python -m timeit -s 'from numpy import array; n = 200; m = n - 12; x = array(range(n)); y = array(range(12, 12 +n)); xy = array(range(n*n)).reshape((n,n)); from _group_columns import group_sparse as gs, group_dense as gd;' 'gd(n, m, xy)'
pythran:
@rgommers gentle ping ;-)
Thanks for the ping, and sorry for the delay @serge-sans-paille. I'm kind of distracted by a proposal deadline until the 19th. That's a massive speedup; I guess there's a serious problem in the Cython code somehow. The code looks like a correct line-by-line translation, so it's not clear to me why there should be a ~700x performance difference.
I can't reproduce that. I get the same result for Pythran:
Cython code gives an exception for that, it only works with
Then if I change Pythran to
So the performance gain is close to 2x. Looks like you swapped the numbers and wrote
Oh, got it: I was using Python int in my setup, which probably made my Cython measurement meaningless. I've updated the Pythran export line to only accept C
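For readers following the dtype discussion, here is a minimal sketch of how the benchmark input could be checked and cast before timing. It rebuilds `xy` the same way as the timeit setup quoted earlier in the thread; the assumption that the compiled functions want a C `int` buffer (`numpy.intc`) is mine, since the thread only hints at it, and the name `xy_c` is purely illustrative.

```python
import numpy as np

n = 200

# Same construction as the timeit setup in the thread: the element type is
# NumPy's platform-dependent default integer, not necessarily a C int.
xy = np.array(range(n * n)).reshape((n, n))
print("default dtype:", xy.dtype)

# Assumption: the compiled group_dense expects C int data, so cast explicitly.
# Values go up to n*n - 1 = 39999, which fits comfortably in a C int.
xy_c = xy.astype(np.intc)
print("cast dtype:", xy_c.dtype, "itemsize:", xy_c.itemsize)
```

Timing both implementations on the same explicitly typed array keeps the comparison about the grouping loop itself rather than about a dtype mismatch or an implicit conversion.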
@rgommers is there anything I should do for that one?
This LGTM. This function is used only in least_squares, it looks like. I did some quick timings on the first example in its docstring (Rosenbrock function with trf method), and the speedup is ~8%.
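A timing of that kind could be sketched roughly as follows. This is not the reviewer's actual script: the residual function follows the Rosenbrock example from the `least_squares` docstring as I recall it, and the repeat counts and the names `x0` and `per_call` are arbitrary choices for illustration.

```python
import timeit

import numpy as np
from scipy.optimize import least_squares


def fun_rosenbrock(x):
    # Residuals of the Rosenbrock function, minimized at x = (1, 1).
    return np.array([10 * (x[1] - x[0] ** 2), 1 - x[0]])


x0 = np.array([2.0, 2.0])

# Solve once to confirm the setup converges before timing anything.
res = least_squares(fun_rosenbrock, x0, method="trf")
print("solution:", res.x)

# Take the best of a few repeats to reduce noise from the machine.
runs = timeit.repeat(
    lambda: least_squares(fun_rosenbrock, x0, method="trf"),
    number=5,
    repeat=3,
)
per_call = min(runs) / 5
print(f"best time per call: {per_call * 1e3:.3f} ms")
```

Running the same script against builds with and without the change would give the two numbers to compare; a difference of a few percent is easily within noise on a short problem like this, so more repeats may be needed.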
In it goes, thanks @serge-sans-paille