-
-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: improve siegelslopes via pythran #14430
Conversation
527cb98
to
69c5a79
Compare
69c5a79
to
d2f1ffe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @charlotte12l! This looks straightforward, except for the same discussion as in gh-14429 about the many signatures. Let's resolve that on gh-14429 and then see if we're happy with this PR as it is now.
b9f4775
to
533ede2
Compare
…inner via Pythran
533ede2
to
d263843
Compare
I just left the public function and doc in Python, as we discussed before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @charlotte12l, this is looking quite good now. One more design issue to deal with about dtypes.
scipy/stats/_siegelslopes_pythran.py
Outdated
import numpy as np | ||
|
||
|
||
# pythran export siegelslopes(int[:] or float[:], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one thing to look at still is this type annotation. The types are not constrained in the public Python function, so this won't work for, e.g., float32
or int16
. I think we still need to figure out exactly how to deal with dtypes in these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a balance to strike between binary size, performance, and maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like the original Python function had support for different data types in mind, as x
is explicitly cast to float
at the beginning. Were we hoping to fix this in the Pythran version?
(Just to confirm, since you don't mention complex dtypes here - contributors probably don't often have complex generalizations in mind when they write stats
functions. Do we have a policy about support for complex dtypes in stats functions?)
For now, I think that a Pythranized version of this function would be useful even if it were just for float64
(and maybe also float32
). The benefit of integer versions seems less compelling here. There could be some memory savings for int8
and int16
, but it will probably be slower as the type needs to be promoted to do the division.
If we want to be more careful about dtypes
throughout stats in the future, we could come back to it. In the interest of seeing these PRs through, would supporting just those one or two types make sense?
This looked close, so I went ahead and |
This PR renames |
The Pythran version is 10x faster than the original Python version. cc @rgommers @serge-sans-paille
Timing codes: