-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Fix wrong operand value when floating-point value is used as operand of type i16 #84106
Conversation
You can test this locally with the following command:git-clang-format --diff 3f7aa042b657671319f994ad3fb7c3eb79a6fe00 57a134c493f8cb2afe1e31a7a4fa4270463706f5 -- llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp View the diff from clang-format here.diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 5050aec261..fd8ce9e07f 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -2065,8 +2065,8 @@ bool AMDGPUOperand::isInlinableImm(MVT type) const {
break;
}
return isInlineableLiteralOp16(
- static_cast<uint16_t>(FPLiteral.bitcastToAPInt().getZExtValue()), type,
- AsmParser->hasInv2PiInlineImm());
+ static_cast<uint16_t>(FPLiteral.bitcastToAPInt().getZExtValue()),
+ type, AsmParser->hasInv2PiInlineImm());
}
// Check if single precision literal is inlinable
|
70996de
to
e48c168
Compare
… operand of type i16 Based on the section "OPF_INV2PI_16" of the spec, when a floating-point value is used as operand of type `i16`, the 32-bit representation of the constant truncated to the 16 LSBs should be used. Currently we directly use the FP16 representation, which doesn't conform with the spec. For example, when `0.5` is used, for now we take it as `0x3800` because that is the encoding of `<half 0.5>`. Instead, it should be `0x3f000000` truncated to 16 LSB, which is `0x0000`.
Is there any script to update those tests? It is quite a nightmare to update them manually. |
@@ -2045,9 +2047,26 @@ bool AMDGPUOperand::isInlinableImm(MVT type) const { | |||
return false; | |||
|
|||
if (type.getScalarSizeInBits() == 16) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant with the switch over the handled 16-bit types
v_cvt_f16_u16_e32 v5, 0.5 | ||
// GFX10: encoding: [0xff,0xa0,0x0a,0x7e,0x00,0x38,0x00,0x00] | ||
// GFX10: encoding: [0x80,0xa0,0x0a,0x7e] | ||
|
||
v_cvt_f16_u16_e32 v5, -4.0 | ||
// GFX10: encoding: [0xff,0xa0,0x0a,0x7e,0x00,0xc4,0x00,0x00] | ||
// GFX10: encoding: [0x80,0xa0,0x0a,0x7e] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note these two instructions now assemble to identical binary since the low 16 bits of f32 0.5 and f32 -4.0 are identical. Can you add tests with a more interesting literal, like inv2pi, which has non-0 low 16 bits?
This patch will be merged into 530f0e6 when it is relanded. |
Based on the section "OPF_INV2PI_16" of the spec, when a floating-point value is
used as operand of type
i16
, the 32-bit representation of the constanttruncated to the 16 LSBs should be used. Currently we directly use the FP16
representation, which doesn't conform with the spec.
For example, when
0.5
is used, for now we take it as0x3800
because that isthe encoding of
<half 0.5>
. Instead, it should be0x3f000000
truncated to 16LSB, which is
0x0000
.