add fma #106
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage Diff
              master     #106      +/-
Coverage      96.95%   82.74%   -14.22%
Files              3        2        -1
Lines            197      197
Hits             191      163       -28
Misses             6       34       +28
src/bfloat16.jl
Outdated
@@ -440,5 +440,7 @@ for F in (:abs, :abs2, :sqrt, :cbrt,
end
end

Base.fma(x::BFloat16, y::BFloat16, z::BFloat16) = BFloat16(fma(Float32(x), Float32(y), Float32(z)))
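The added line widens each operand to Float32, fuses there, and narrows the result once. A minimal self-contained sketch of that pattern follows. `ToyBF16` and `unfused` are hypothetical names used only for illustration; they are not part of the package under review, and the rounding code is an assumption about how a bfloat16 conversion can be written, not the package's implementation.

```julia
# Toy stand-in for a bfloat16 type (hypothetical, for illustration only).
struct ToyBF16
    bits::UInt16
end

# Float32 -> ToyBF16: keep the upper 16 bits, rounding to nearest, ties to even.
function ToyBF16(x::Float32)
    isnan(x) && return ToyBF16(0x7fc0)          # canonical quiet NaN
    u = reinterpret(UInt32, x)
    u += 0x00007fff + ((u >> 16) & 0x00000001)  # rounding bias
    return ToyBF16((u >> 16) % UInt16)
end

# ToyBF16 -> Float32 is exact: pad the low 16 bits with zeros.
Base.Float32(x::ToyBF16) = reinterpret(Float32, UInt32(x.bits) << 16)

# Same shape as the line under review: widen, fuse in Float32, narrow once.
Base.fma(x::ToyBF16, y::ToyBF16, z::ToyBF16) =
    ToyBF16(fma(Float32(x), Float32(y), Float32(z)))

# Unfused variant for comparison: the product is rounded before the add.
unfused(x::ToyBF16, y::ToyBF16, z::ToyBF16) =
    ToyBF16(Float32(ToyBF16(Float32(x) * Float32(y))) + Float32(z))

x = ToyBF16(1.0078125f0)   # 1 + 2^-7
y = ToyBF16(1.0078125f0)   # 1 + 2^-7
z = ToyBF16(-1.015625f0)   # -(1 + 2^-6)

fused_result   = Float32(fma(x, y, z))      # keeps the 2^-14 term of the product
unfused_result = Float32(unfused(x, y, z))  # loses it when the product is rounded
```

The inputs are chosen so that the exact product, 1 + 2^-6 + 2^-14, needs more significand bits than a bfloat16 holds. The fused path keeps the low term; the unfused path rounds it away before the addition.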
This looks very inefficient on hardware with native bf16 support.
Something like
ccall("llvm.fma.bf16", llvmcall, BFloat16, (BFloat16, BFloat16, BFloat16), x, y, z)
would be better (although this should probably be handled by Julia's codegen directly).
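For context, the same `ccall(..., llvmcall, ...)` form can be tried on a type that LLVM supports everywhere. The sketch below uses the `llvm.fma.f32` intrinsic with Float32; `fma_intrinsic` is a hypothetical helper name, and whether the `bf16` variant suggested above lowers correctly depends on the Julia and LLVM versions in use, which this example does not test.

```julia
# Hypothetical helper: call the LLVM fma intrinsic for Float32 directly.
fma_intrinsic(x::Float32, y::Float32, z::Float32) =
    ccall("llvm.fma.f32", llvmcall, Float32, (Float32, Float32, Float32), x, y, z)

# Small inputs whose result is exact, so the value is easy to check by hand.
result = fma_intrinsic(2f0, 3f0, 1f0)   # 2*3 + 1
```

On hardware without a fused multiply-add instruction, LLVM falls back to a software routine, so the result is still correctly fused but may be slow.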
Why not just use llvmcall?
That would be better.
I was not sure it worked correctly (at one time it did not).
I am making the change.
No description provided.