Faster and more accurate sinpi, cospi, sincospi for Float32, Float64 #41744
Force-pushed from 153405e to 889ef74.
|
Thanks Oscar. For my own curiosity, how did you generate these polynomial coefficients? If it's not too much code, it would be cool to record that for future usage with other special functions, so that you're not the only person who knows how to do this. Test failures look like |
|
The coefficient generation is slightly complex, but the short version is to use Remez to generate a minimax polynomial over the range of inputs, specialized to the even/oddness of the polynomial. I really need to put the general principles all into a blog post at some point, but the TLDR for this is |
|
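A minimal sketch of the even/odd specialization mentioned above, written in Python for illustration only. It is not the code used in this PR, and every name in it is hypothetical. Because sin(πx) is odd, it can be written as x·P(x²), so only a polynomial in x² has to be fitted. The sketch does not implement Remez: it interpolates at Chebyshev nodes, which is usually close to minimax and is a common starting point for a Remez iteration. The interval [0, 1/4] and the choice of 6 terms are assumptions.

```python
import math

def chebyshev_nodes(a, b, n):
    """n Chebyshev nodes of the first kind, mapped to the interval [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos(math.pi * (2 * k + 1) / (2 * n))
            for k in range(n)]

def divided_differences(xs, ys):
    """Newton divided-difference coefficients of the interpolant through (xs, ys)."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, u):
    """Evaluate the Newton-form polynomial at u with a Horner-like recurrence."""
    acc = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        acc = acc * (u - xs[i]) + coef[i]
    return acc

def g(u):
    """sin(pi*sqrt(u)) / sqrt(u): the even part left after factoring x out of sin(pi*x)."""
    if u == 0.0:
        return math.pi  # limit as u -> 0
    r = math.sqrt(u)
    return math.sin(math.pi * r) / r

def make_sinpi_kernel(terms=6, half_width=0.25):
    """Build x -> x * P(x^2), where P interpolates g on [0, half_width^2]."""
    us = chebyshev_nodes(0.0, half_width ** 2, terms)
    coef = divided_differences(us, [g(u) for u in us])
    def kernel(x):
        return x * newton_eval(us, coef, x * x)
    return kernel

sinpi_kernel = make_sinpi_kernel()

# Largest absolute deviation from the libm-based value on a grid over [0, 1/4].
max_err = max(abs(sinpi_kernel(i / 4000) - math.sin(math.pi * i / 4000))
              for i in range(1001))
```

Fitting in x² halves the number of coefficients for a given degree and makes the kernel exactly odd in floating point, since `kernel(-x)` only flips the sign of the final multiplication. A true Remez pass would then adjust the nodes until the error equioscillates.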
This has been surprisingly tricky to get right. At the moment, the only failing test is the behavior as to whether |
|
Adding triage label to discuss signs of zeros. |
|
removing triage tag after research suggests Base was correct. |
|
Is it worth trying to get this done? I suppose too late for 1.8 but maybe 1.9? |
|
yeah. it's still on my radar. I just haven't had the time recently. |
|
re: signed zeros, the IEEE754-2019 spec states
|
|
I think this is finally ready to go. Anyone want to review? |
gbaraldi left a comment
LGTM
Instead of using `sin_kernel` and `cos_kernel`, use better kernels that need less precision. I benchmark this as 14% faster for Float64, 7% faster for Float32.
Accuracy for Float64 is 0.64 ULPs compared to 0.75 for master.
Accuracy for Float32 is 0.5 ULP.
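For readers unfamiliar with ULP figures like the ones above, here is a rough Python sketch of how such an error can be measured. It is not the benchmark used in this PR: the naive `math.sin(math.pi * x)` candidate, the sample grid, and all helper names are assumptions chosen for illustration. The reference value comes from a Taylor series evaluated in 60-digit decimal arithmetic, and the error is expressed in units of the spacing between doubles at the exact result. `math.ulp` requires Python 3.9 or later.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 60
# pi to 50 decimal places, far more than double precision needs.
PI = Decimal("3.14159265358979323846264338327950288419716939937510")

def sinpi_reference(x):
    """sin(pi*x) to roughly 50 digits via Taylor series; intended for |x| <= 1/4."""
    y = PI * Decimal(x)  # Decimal(float) converts the double exactly
    term, total, k = y, y, 1
    while abs(term) > Decimal("1e-55"):
        term = -term * y * y / ((2 * k) * (2 * k + 1))
        total += term
        k += 1
    return total

def ulp_error(approx, exact):
    """Error of a double `approx` in ULPs, measured at the exact result."""
    return float(abs(Decimal(approx) - exact) / Decimal(math.ulp(float(exact))))

def naive_sinpi(x):
    """Hypothetical candidate: multiply by a rounded pi, then call sin."""
    return math.sin(math.pi * x)

samples = [i / 4096 for i in range(1, 1025)]  # grid over (0, 1/4]
max_ulp = max(ulp_error(naive_sinpi(x), sinpi_reference(x)) for x in samples)
print(f"max error of naive_sinpi on the grid: {max_ulp:.3f} ULP")
```

A correctly rounded result would never exceed 0.5 ULP under this measure, which is why a figure of 0.5 ULP is the best a Float32 implementation can report.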