-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Add expm1(::Float16) #40867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add expm1(::Float16) #40867
Conversation
|
A bit off-topic: Why is this file mixing so much between using |
|
|
|
Ok. Thanks. You have a point that the paper concerns matrix polynomials. I agree that it is more complicated at this low-level, but I thought a rule-of-thumb of computation cost for scalar is the same: Multiplication is much more expensive than addition, and division (which is not present in the paper I referenced) is even more expensive.
Okay. I can accept this directly assuming "small polynomials" means the degree is max (say) 4 or 5. The file has polys with degrees 8 (or maybe higher). Horner eval of poly of degree 8 requires 8 multiplications. With 8 multiplications we can obtain a poly of degree 25 using Paterson-Stockmayer, see figure 1 in http://eprints.maths.manchester.ac.uk/2676/1/fasi18.pdf I don't understand why a Paterson-Stockmayer approach would be less favorable for fma's. The estrin scheme also seems to require quite a few multiplications. |
|
Ping @mfasi who is an expert on Paterson-Stockmayer. What do you think about the use of PS for scalar polynomial evaluation? |
|
Okay. On second thought maybe my comment is less useful for this than I first thought. You have to count also coefficient * argument multiplication as a regular multiplication and the PS is not as competitive. |
|
It seems it is possible to get less fma instructions with a PS approach for a polynomial of degree 8. I am happy to discuss it further in a better place: https://discourse.julialang.org/t/fmas-and-poly-eval/61421 |
|
Can I get review on this? This fixes a fairly major bug ( |
musm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, looks fine to me. I did not verify coefficients, but I trust they're good presumably generated via same method (Remez.jl).
|
What's buildkite? I haven't seen it before. |
Co-authored-by: Mustafa M <mus-m@outlook.com>
Co-authored-by: Mustafa M <mus-m@outlook.com>
|
Merging in a day or 2 sans objections. |
add expm1(::Float16) and tests for expm1(-floatmax(T)) Co-authored-by: Mustafa M <mus-m@outlook.com> Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com>
add expm1(::Float16) and tests for expm1(-floatmax(T)) Co-authored-by: Mustafa M <mus-m@outlook.com> Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com>
Not sure if this is a regression, but it's definitely bad. Also, fixes a problems I found in
expm1(::Float32)with very small inputs.