Consider the following LLVM IR:
define bfloat @do_fma(bfloat %a, bfloat %b, bfloat %c) {
  %res = call bfloat @llvm.fma.bf16(bfloat %a, bfloat %b, bfloat %c)
  ret bfloat %res
}
LLVM turns this into the equivalent of:
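define bfloat @do_fma(bfloat %a, bfloat %b, bfloat %c) {
  %a_f32 = fpext bfloat %a to float
  %b_f32 = fpext bfloat %b to float
  %c_f32 = fpext bfloat %c to float
  %res_f32 = call float @llvm.fma.f32(float %a_f32, float %b_f32, float %c_f32)
  %res = fptrunc float %res_f32 to bfloat
  ret bfloat %res
}
This is a miscompilation, however: float does not have enough precision to perform a fused multiply-add for bfloat without double rounding becoming an issue. For instance, do_fma(0x1.40p+127, 0x1.04p+0, 0x1.00p-133) should return 0x1.46p+127, but LLVM's lowering to a float FMA gives the incorrect result 0x1.44p+127. The exact value is 0x1.45p+127 + 0x1.00p-133, which lies just above the midpoint between the neighbouring bfloat values 0x1.44p+127 and 0x1.46p+127, so it must round up. The float FMA first rounds the tiny addend away, leaving exactly 0x1.45p+127; the fptrunc then sees an exact tie and rounds to even, giving 0x1.44p+127.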
Just using double instead of float would not be a correct lowering either: it gives the same incorrect result for the example above. (Using the reasoning from #128450 (comment), a 126 + 127 + 8 = 261-bit significand would be required for double rounding not to be a problem with this lowering.) I suspect the best option here is to lower to a libcall instead.
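The example above can be checked with exact rational arithmetic. The following is a minimal Python sketch, not LLVM code: `round_to_format` is a hypothetical helper written for this illustration that models round-to-nearest-even for positive values, ignoring overflow, and the 260-bit and 261-bit formats are imaginary intermediates used only to probe the bound for this one input.

```python
from fractions import Fraction


def round_to_format(x: Fraction, precision: int, min_exp: int) -> Fraction:
    """Round a positive exact value to `precision` significand bits using
    round-to-nearest-even. `min_exp` is the smallest normal exponent, which
    fixes the spacing in the subnormal range. Overflow is not modelled."""
    if x == 0:
        return x
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    e = max(e, min_exp)
    ulp = Fraction(2) ** (e - (precision - 1))
    q = x / ulp
    n = q.numerator // q.denominator  # floor(q)
    rem = q - n
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return n * ulp


def bf16(x):
    return round_to_format(x, 8, -126)


def f32(x):
    return round_to_format(x, 24, -126)


def f64(x):
    return round_to_format(x, 53, -1022)


# The inputs from the example; all three are exactly representable in bfloat.
a = Fraction(float.fromhex("0x1.40p+127"))
b = Fraction(float.fromhex("0x1.04p+0"))
c = Fraction(float.fromhex("0x1.00p-133"))

exact = a * b + c                # infinitely precise a*b + c
correct = bf16(exact)            # a true bfloat FMA rounds only once
via_f32 = bf16(f32(exact))       # float FMA followed by fptrunc
via_f64 = bf16(f64(exact))       # double FMA followed by fptrunc

# Hypothetical wide intermediates (assumption: effectively unbounded exponent).
via_260 = bf16(round_to_format(exact, 260, -10**6))
via_261 = bf16(round_to_format(exact, 261, -10**6))

for name, value in [("correct", correct), ("via_f32", via_f32),
                    ("via_f64", via_f64), ("via_260", via_260),
                    ("via_261", via_261)]:
    print(name, float(value).hex())
```

For this input, the single-rounding result is 0x1.46p+127, while both the float and double intermediates yield 0x1.44p+127. A 260-bit intermediate still produces the wrong result here and a 261-bit one does not, which is consistent with the bound quoted above, though a single input does not prove it in general.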