Fused multiply-add instruction #23342
Comments
For JS, would something like this be enough?

function fma(a, b, c) {
  return (a * b) + c
}

And an explanation in the documentation for JS targets.
A quick Google search gives:

assert( 0.1 * 10 - 1 == 0)
assert( fma(0.1, 10, -1) == 5.55112e-17) # Example from en.cppreference.com

I would expect a different rounding behaviour from an fma function than from an axpy computation.

P.S.: The term axpy comes from the BLAS (Basic Linear Algebra Subprograms) specification.
Then make it inside a |
Most recent high-performance x86 CPUs should do a uop fusion on axpy; the FMA3 instructions really exist for increased accuracy. Not really certain about ARM-land.
Feel free to open a pull request; see also nim-lang/RFCs#92
@ringabout Sorry, I do not have time to implement this.
Summary
I would like to add the fused-multiply-add instruction either directly to system or to std/math.
This instruction is crucial for high-performance computing. It computes a product and an addition in a single instruction: fma(a, x, y) = a*x + y. Not only is this instruction faster, it also incurs fewer rounding errors: the product is computed as if with infinite precision, and only the final sum is rounded.
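The single rounding can be observed on the example from cppreference: 0.1 is not exactly representable as a double, so the exact product 0.1 * 10 lies slightly above 1. The following is a minimal C++ sketch, assuming IEEE 754 binary64 doubles and a correctly rounded `std::fma`; the helper names `mul_add_separate` and `mul_add_fused` are illustrative only and not part of any proposal.

```cpp
#include <cmath>

// Separate multiply and add: the product a * x is rounded first,
// then the sum is rounded again (two roundings in total).
double mul_add_separate(double a, double x, double y) {
    // volatile keeps the compiler from contracting the two
    // operations into a fused multiply-add on its own.
    volatile double product = a * x;
    return product + y;
}

// Fused multiply-add: a * x + y is evaluated exactly and rounded once.
double mul_add_fused(double a, double x, double y) {
    return std::fma(a, x, y);
}
```

With a = 0.1, x = 10 and y = -1, the separate version returns 0, while the fused version returns 2^-54 (about 5.55112e-17), the part of the product that the intermediate rounding discards.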
The problem with this instruction is that some operations should not use a fused multiply-add. Indeed, when one wants to compute the product of two complex numbers a = x + iy and b = x' + iy', the imaginary part is given by yx' + xy'. This sum of two products can be computed with fma in two ways, but neither is accurate enough to tell whether the product is purely real or has a nonzero imaginary part. (See e.g. Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, 2002.)
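A common illustration of this issue is the product of a complex number with its conjugate, which is real in exact arithmetic. Below is a hedged C++ sketch of the imaginary part yx' + xy' computed both ways, assuming IEEE 754 binary64 doubles; the function names `imag_separate` and `imag_fused` are hypothetical.

```cpp
#include <cmath>

// Imaginary part of (x + i*y) * (xp + i*yp), i.e. y*xp + x*yp,
// with both products rounded separately before the addition.
double imag_separate(double x, double y, double xp, double yp) {
    // volatile keeps the compiler from fusing a product into the sum.
    volatile double p1 = y * xp;
    volatile double p2 = x * yp;
    return p1 + p2;
}

// Same quantity with the first product fused into the addition:
// x*yp is rounded beforehand, y*xp is not.
double imag_fused(double x, double y, double xp, double yp) {
    return std::fma(y, xp, x * yp);
}
```

For (0.1 + 10i) * (0.1 - 10i), the two rounded products cancel exactly in `imag_separate`, which returns 0. `imag_fused` returns 2^-54 instead, the rounding error of one product, so a real result appears to have an imaginary part. Fusing the other product gives -2^-54, so neither ordering avoids the problem.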
Consequently, it is up to the programmer and not the compiler to decide, given the context, whether or not a fused multiply-add should be used.
This difference in rounding compared with the two separate instructions enables the use of error-free transformations such as the two-product error-free transformation of Ogita et al. (2005).
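As a sketch of how such a transformation uses the instruction, the FMA-based two-product can be written in a few lines of C++. This assumes IEEE 754 binary64 doubles, a correctly rounded `std::fma`, and no overflow or underflow; the name `two_product_fma` is chosen here for illustration.

```cpp
#include <cmath>
#include <utility>

// Returns (p, e) where p is the rounded product a * b and e is its
// rounding error, so that a * b == p + e holds exactly
// (barring overflow and underflow).
std::pair<double, double> two_product_fma(double a, double b) {
    double p = a * b;               // rounded product
    double e = std::fma(a, b, -p);  // exact a * b minus p, rounded once
    return {p, e};
}
```

For a = 0.1 and b = 10 this yields p = 1 and e = 2^-54. The error term is only recoverable because the fused operation does not round the product before subtracting p, which is why the programmer needs explicit access to fma.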
Description
Currently, I import in my projects:
A solution would be to add these imports in std/math. We would now be able to use error-free transformations without these imports:
Alternatives
No response
Examples
No response
Backwards Compatibility
There might be some issues with the JS backend, as always when it comes to floating-point arithmetic support.
I propose to align with the C++ specification.
Links
https://en.wikipedia.org/wiki/Multiply%E2%80%93accumulate_operation
https://en.cppreference.com/w/cpp/numeric/math/fma
https://epubs.siam.org/doi/epdf/10.1137/030601818