Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ExpandLargeFpConvert] Fix bug in int-to-fp expansion. #85370

Merged
merged 1 commit into from
Mar 15, 2024

Conversation

bevin-hansson
Copy link
Contributor

When deciding whether to perform rounding on the significand,
the generated IR was using (width - leading zeros - 1) rather
than (width - leading zeros). This is different from how the
routine in compiler-rt does it:

int sd = srcBits - clzSrcT(a);
int e = sd - 1;
if (sd > dstMantDig) {

This bug means that the following code, when built on -O0:

#include <stdio.h>

_BitInt(233) v_1037 = 0;

int main(void)
{
    v_1037 = 18014398509481982wb;
    double d = v_1037;
    printf("d = %f\n", d);

    return 0;
}

prints "d = 9007199254740992.000000", which is incorrect.
The correct result is "d = 18014398509481982.000000".

When deciding whether to perform rounding on the significand,
the generated IR was using (width - leading zeros - 1) rather
than (width - leading zeros). This is different from how the
routine in compiler-rt does it:

    int sd = srcBits - clzSrcT(a);
    int e = sd - 1;
    if (sd > dstMantDig) {

This bug means that the following code, when built on -O0:

    #include <stdio.h>

    _BitInt(233) v_1037 = 0;

    int main(void)
    {
        v_1037 = 18014398509481982wb;
        double d = v_1037;
        printf("d = %f\n", d);

        return 0;
    }

prints "d = 9007199254740992.000000", which is incorrect.
The correct result is "d = 18014398509481982.000000".
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 15, 2024

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-transforms

Author: Bevin Hansson (bevin-hansson)

Changes

When deciding whether to perform rounding on the significand,
the generated IR was using (width - leading zeros - 1) rather
than (width - leading zeros). This is different from how the
routine in compiler-rt does it:

int sd = srcBits - clzSrcT(a);
int e = sd - 1;
if (sd &gt; dstMantDig) {

This bug means that the following code, when built on -O0:

#include &lt;stdio.h&gt;

_BitInt(233) v_1037 = 0;

int main(void)
{
    v_1037 = 18014398509481982wb;
    double d = v_1037;
    printf("d = %f\n", d);

    return 0;
}

prints "d = 9007199254740992.000000", which is incorrect.
The correct result is "d = 18014398509481982.000000".


Full diff: https://github.com/llvm/llvm-project/pull/85370.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/ExpandLargeFpConvert.cpp (+1-1)
  • (modified) llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-si129tofp.ll (+15-15)
  • (modified) llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-ui129tofp.ll (+15-15)
diff --git a/llvm/lib/CodeGen/ExpandLargeFpConvert.cpp b/llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
index 78ad2a25d0e47a..308f13c19f7564 100644
--- a/llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
+++ b/llvm/lib/CodeGen/ExpandLargeFpConvert.cpp
@@ -375,7 +375,7 @@ static void expandIToFP(Instruction *IToFP) {
   Value *Sub2 = Builder.CreateSub(Builder.getIntN(BitWidthNew, BitWidth - 1),
                                   FloatWidth == 128 ? Call : Cast);
   Value *Cmp3 = Builder.CreateICmpSGT(
-      Sub2, Builder.getIntN(BitWidthNew, FPMantissaWidth + 1));
+      Sub1, Builder.getIntN(BitWidthNew, FPMantissaWidth + 1));
   Builder.CreateCondBr(Cmp3, IfThen4, IfElse);
 
   // if.then4:
diff --git a/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-si129tofp.ll b/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-si129tofp.ll
index 3961fec5b3be18..76f5248302badc 100644
--- a/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-si129tofp.ll
+++ b/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-si129tofp.ll
@@ -15,12 +15,12 @@ define half @si129tohalf(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 24
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 24
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 25, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 26, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 25, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 26, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[TMP3]], 1
@@ -100,12 +100,12 @@ define float @si129tofloat(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 24
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 24
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 25, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 26, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 25, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 26, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[TMP3]], 1
@@ -184,12 +184,12 @@ define double @si129todouble(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 53
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 53
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 54, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 55, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 54, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 55, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[TMP3]], 1
@@ -273,12 +273,12 @@ define x86_fp80 @si129tox86_fp80(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i129 129, [[TMP4]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i129 128, [[TMP4]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP7]], 113
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP6]], 113
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i129 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i129 114, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i129 115, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i129 114, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i129 115, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[TMP3]], 1
@@ -357,12 +357,12 @@ define fp128 @si129tofp128(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i129 129, [[TMP4]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i129 128, [[TMP4]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP7]], 113
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP6]], 113
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i129 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i129 114, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i129 115, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i129 114, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i129 115, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[TMP3]], 1
diff --git a/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-ui129tofp.ll b/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-ui129tofp.ll
index e05ff198ecc33b..96d87a5cace98b 100644
--- a/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-ui129tofp.ll
+++ b/llvm/test/Transforms/ExpandLargeFpConvert/X86/expand-large-fp-convert-ui129tofp.ll
@@ -15,12 +15,12 @@ define half @ui129tohalf(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 24
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 24
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 25, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 26, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 25, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 26, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[A]], 1
@@ -100,12 +100,12 @@ define float @ui129tofloat(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 24
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 24
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 25, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 26, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 25, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 26, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[A]], 1
@@ -184,12 +184,12 @@ define double @ui129todouble(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i32 129, [[TMP5]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i32 128, [[TMP5]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP7]], 53
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i32 [[TMP6]], 53
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i32 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i32 54, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i32 55, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i32 54, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i32 55, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[A]], 1
@@ -273,12 +273,12 @@ define x86_fp80 @ui129tox86_fp80(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i129 129, [[TMP4]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i129 128, [[TMP4]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP7]], 113
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP6]], 113
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i129 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i129 114, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i129 115, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i129 114, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i129 115, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[A]], 1
@@ -357,12 +357,12 @@ define fp128 @ui129tofp128(i129 %a) {
 ; CHECK-NEXT:    [[TMP5:%.*]] = trunc i129 [[TMP4]] to i32
 ; CHECK-NEXT:    [[TMP6:%.*]] = sub i129 129, [[TMP4]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = sub i129 128, [[TMP4]]
-; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP7]], 113
+; CHECK-NEXT:    [[TMP8:%.*]] = icmp sgt i129 [[TMP6]], 113
 ; CHECK-NEXT:    br i1 [[TMP8]], label [[ITOFP_IF_THEN4:%.*]], label [[ITOFP_IF_ELSE:%.*]]
 ; CHECK:       itofp-if-then4:
 ; CHECK-NEXT:    switch i129 [[TMP6]], label [[ITOFP_SW_DEFAULT:%.*]] [
-; CHECK-NEXT:    i129 114, label [[ITOFP_SW_BB:%.*]]
-; CHECK-NEXT:    i129 115, label [[ITOFP_SW_EPILOG:%.*]]
+; CHECK-NEXT:      i129 114, label [[ITOFP_SW_BB:%.*]]
+; CHECK-NEXT:      i129 115, label [[ITOFP_SW_EPILOG:%.*]]
 ; CHECK-NEXT:    ]
 ; CHECK:       itofp-sw-bb:
 ; CHECK-NEXT:    [[TMP9:%.*]] = shl i129 [[A]], 1

@bevin-hansson bevin-hansson merged commit f623adb into llvm:main Mar 15, 2024
5 of 6 checks passed
@jayfoad
Copy link
Contributor

jayfoad commented Mar 15, 2024

This commit causes test/CodeGen/AMDGPU/itofp.i128.ll to fail in my Release+Asserts build. I don't understand why it didn't fail in the CI build too.

@nico
Copy link
Contributor

nico commented Mar 15, 2024

Looks like this breaks check-llvm: http://45.33.8.238/linux/133290/step_12.txt

Please take a look and revert for now if it takes a while to fix.

@arsenm
Copy link
Contributor

arsenm commented Mar 15, 2024

This commit causes test/CodeGen/AMDGPU/itofp.i128.ll to fail in my Release+Asserts build. I don't understand why it didn't fail in the CI build too.

Probably a race, they were pushed very close

@arsenm
Copy link
Contributor

arsenm commented Mar 15, 2024

Looks like this breaks check-llvm: http://45.33.8.238/linux/133290/step_12.txt

Please take a look and revert for now if it takes a while to fix.

Should just be regenerate test checks

@bevin-hansson
Copy link
Contributor Author

bevin-hansson commented Mar 15, 2024

I'll try reproducing it and regenerating the tests, though it will take a little while for me to rebuild from trunk.

@jayfoad
Copy link
Contributor

jayfoad commented Mar 15, 2024

I'll try reproducing it and regenerating the tests.

I'm already set up to do that myself...

@bevin-hansson
Copy link
Contributor Author

I'll try reproducing it and regenerating the tests.

I'm already set up to do that myself...

Okay, if it's not too much trouble you should probably do it instead.

jayfoad added a commit that referenced this pull request Mar 15, 2024
@jayfoad
Copy link
Contributor

jayfoad commented Mar 15, 2024

Done!

@bevin-hansson
Copy link
Contributor Author

Thanks!

@FreddyLeaf
Copy link
Contributor

Thanks for this fix!

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Apr 27, 2024
When deciding whether to perform rounding on the significand,
the generated IR was using (width - leading zeros - 1) rather
than (width - leading zeros). This is different from how the
routine in compiler-rt does it:

    int sd = srcBits - clzSrcT(a);
    int e = sd - 1;
    if (sd > dstMantDig) {

This bug means that the following code, when built on -O0:

    #include <stdio.h>

    _BitInt(233) v_1037 = 0;

    int main(void)
    {
        v_1037 = 18014398509481982wb;
        double d = v_1037;
        printf("d = %f\n", d);

        return 0;
    }

prints "d = 9007199254740992.000000", which is incorrect.
The correct result is "d = 18014398509481982.000000".

(cherry picked from commit f623adb)
Change-Id: Ieb9c0c786a1b781200f40e6229e69231c8fabe15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants