Closed
Description
The naive dot product implementation currently uses "explicit" SIMD instructions:
`function add!(acc::DotAcc, x, y)`
This presumably prevents LLVM from fusing those into `vfmadd` instructions, which is a shame since fused multiply-adds would probably be both more accurate and more efficient. We should either:
- change those to non-explicit ("fuseable") SIMD instructions, relying on LLVM to turn them into `vfmadd`s, or
- explicitly use an `fma` here.
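To make the two options concrete, here is a minimal scalar sketch of the three accumulation styles. The function names `dot_separate`, `dot_muladd` and `dot_fma` are hypothetical and are not part of AccurateArithmetic; the real `add!` works on SIMD vectors rather than scalars, so this only illustrates the idea using `Base.muladd` and `Base.fma`.

```julia
# Hypothetical scalar stand-ins for the accumulation step discussed above.

# Separate multiply and add: the product is rounded, then the sum is rounded.
function dot_separate(x, y)
    acc = zero(eltype(x))
    @inbounds for i in eachindex(x, y)
        acc += x[i] * y[i]
    end
    return acc
end

# "Fuseable" variant: muladd lets the compiler emit a fused multiply-add
# when that is profitable, but does not require it.
function dot_muladd(x, y)
    acc = zero(eltype(x))
    @inbounds for i in eachindex(x, y)
        acc = muladd(x[i], y[i], acc)
    end
    return acc
end

# Explicit variant: fma always computes x*y + acc with a single rounding,
# falling back to a software implementation without hardware support.
function dot_fma(x, y)
    acc = zero(eltype(x))
    @inbounds for i in eachindex(x, y)
        acc = fma(x[i], y[i], acc)
    end
    return acc
end
```

The accuracy argument comes from the single rounding: with `a = 1 + 2.0^-30` and `b = 1 - 2.0^-30`, the rounded product `a * b` is exactly `1.0`, whereas `fma(a, b, -1.0)` recovers the `-2.0^-60` that the separate multiplication loses. Whether `muladd` actually fuses depends on the target CPU and on LLVM's choices, so inspecting `@code_native` would be needed to confirm that `vfmadd` instructions are emitted.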
I wonder if this is what explains the differences between `OpenBLAS.ddot` and `AccurateArithmetic.dot_naive` observed in the paper (fig. 4).