
[FastISel][AArch64] Compare Instruction Miscompilation Fix #75993

Merged

Conversation

brendaso1
Contributor

@brendaso1 brendaso1 commented Dec 20, 2023

When an shl instruction is folded into a compare instruction, a miscompilation occurs if the CMP is also sign-extended. For the following IR:

%op3 = shl i8 %op2, 3
%tmp3 = icmp eq i8 %tmp2, %op3

it used to generate

cmp w8, w9, sxtb #3

which sign-extends w9, shifts the result left by 3, and then compares it with the value in w8. However, the IR requires %op2 to be shifted left first (within 8 bits), and only then are the comparison operands extended. Moreover, if sign extension is used instead of zero extension, the sample test miscompiles. This PR fixes the issue by not folding the left shift into the CMP instruction, and by creating a zero-extended value rather than a sign-extended value.
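Concretely, with the byte values the new test produces (%tmp2 = 0xC8 and %op2 = 0xD9), the two lowerings disagree. The following C++ sketch models the semantics for illustration only; it is not the actual codegen:

```cpp
#include <cstdint>

// IR semantics: `shl i8` wraps to 8 bits, and only then are the operands
// extended consistently for the 32-bit compare.
bool ir_semantics(uint8_t tmp2, uint8_t op2) {
  uint8_t op3 = static_cast<uint8_t>(op2 << 3); // shl i8 %op2, 3
  return tmp2 == op3;                           // icmp eq i8
}

// Buggy lowering `cmp w8, w9, sxtb #3`: w9 is sign-extended from 8 bits
// FIRST and then shifted, so bits that `shl i8` would discard survive.
bool buggy_lowering(uint8_t tmp2, uint8_t op2) {
  int32_t lhs = static_cast<int8_t>(tmp2);                          // sxtb of w8
  int32_t rhs = static_cast<int32_t>(static_cast<int8_t>(op2)) * 8; // sxtb, then lsl #3
  return lhs == rhs;
}
```

With these inputs, ir_semantics returns true (0xD9 << 3 wraps to 0xC8, equal to %tmp2), while the folded form compares -56 against -312 and returns false, taking the wrong branch.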

When shl is folded into a compare instruction, a miscompilation occurs
when the CMP instruction is also sign-extended. This fix avoids
folding shl into the cmp instruction when the second operand is
extended, and also turns equality operations into unsigned
comparisons.
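The "unsigned comparisons" part of the fix relies on how LLVM classifies ICmp predicates: CmpInst::isSigned() and CmpInst::isUnsigned() both return false for eq/ne. So passing CI->isUnsigned() caused equality compares to be treated as signed (sign-extended), while !CI->isSigned() treats them as unsigned. A minimal sketch of that classification (a simplified enum, not the real LLVM API):

```cpp
#include <cassert>

// Simplified model of LLVM's ICmp predicate classification: equality
// predicates are neither signed nor unsigned.
enum class Pred { EQ, NE, SGT, SLT, UGT, ULT };

bool isSigned(Pred p)   { return p == Pred::SGT || p == Pred::SLT; }
bool isUnsigned(Pred p) { return p == Pred::UGT || p == Pred::ULT; }

// Old call site: emitCmp(..., isUnsigned(p))  -> false for EQ => sign-extend.
// New call site: emitCmp(..., !isSigned(p))   -> true  for EQ => zero-extend.
```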


@llvmbot
Collaborator

llvmbot commented Dec 20, 2023

@llvm/pr-subscribers-backend-aarch64

Author: None (brendaso1)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/75993.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64FastISel.cpp (+6-2)
  • (added) llvm/test/CodeGen/AArch64/fastisel-shift-and-cmp.ll (+33)
diff --git a/llvm/lib/Target/AArch64/AArch64FastISel.cpp b/llvm/lib/Target/AArch64/AArch64FastISel.cpp
index 9b8162ce8dd4d0..ddd2c4eebe491c 100644
--- a/llvm/lib/Target/AArch64/AArch64FastISel.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FastISel.cpp
@@ -1233,7 +1233,11 @@ unsigned AArch64FastISel::emitAddSub(bool UseAdd, MVT RetVT, const Value *LHS,
       isValueAvailable(RHS)) {
     if (const auto *SI = dyn_cast<BinaryOperator>(RHS))
       if (const auto *C = dyn_cast<ConstantInt>(SI->getOperand(1)))
-        if ((SI->getOpcode() == Instruction::Shl) && (C->getZExtValue() < 4)) {
+        if ((SI->getOpcode() == Instruction::Shl) && (C->getZExtValue() < 4) &&
+            !NeedExtend) {
+          // We can only fold instructions that don't need extension, because
+          // in an add/sub instruction the operand is first extended and then
+          // shifted
           Register RHSReg = getRegForValue(SI->getOperand(0));
           if (!RHSReg)
             return 0;
@@ -2426,7 +2430,7 @@ bool AArch64FastISel::selectBranch(const Instruction *I) {
       }
 
       // Emit the cmp.
-      if (!emitCmp(CI->getOperand(0), CI->getOperand(1), CI->isUnsigned()))
+      if (!emitCmp(CI->getOperand(0), CI->getOperand(1), !CI->isSigned()))
         return false;
 
       // FCMP_UEQ and FCMP_ONE cannot be checked with a single branch
diff --git a/llvm/test/CodeGen/AArch64/fastisel-shift-and-cmp.ll b/llvm/test/CodeGen/AArch64/fastisel-shift-and-cmp.ll
new file mode 100644
index 00000000000000..5dad290134bbd7
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/fastisel-shift-and-cmp.ll
@@ -0,0 +1,33 @@
+; RUN: llc --global-isel=false -fast-isel -O0 -mtriple=aarch64-none-none < %s | FileCheck %s
+
+; Check that the shl instruction does not get folded together with
+; the cmp instruction, which would cause a miscompilation.
+
+@A = dso_local global [5 x i8] c"\C8\AA\C8\AA\AA"
+@.str = private unnamed_addr constant [13 x i8] c"TRUE BRANCH\0A\00"
+@.str.1 = private unnamed_addr constant [14 x i8] c"FALSE BRANCH\0A\00"
+
+define dso_local i32 @main() {
+
+  %tmp = load i8, ptr getelementptr inbounds ([5 x i8], ptr @A, i64 0, i64 1)
+  %tmp2 = load i8, ptr getelementptr inbounds ([5 x i8], ptr @A, i64 0, i64 2)
+  %op1 = xor i8 %tmp, -49
+  %op2 = mul i8 %op1, %op1
+; CHECK-NOT: cmp [[REGS:.*]] #[[SHIFT_VAL:[0-9]+]]
+  %op3 = shl i8 %op2, 3
+  %tmp3 = icmp eq i8 %tmp2, %op3
+  br i1 %tmp3, label %_true, label %_false
+
+_true:
+  %res = call i32 (ptr, ...) @printf(ptr noundef @.str)
+  br label %_ret
+
+_false:
+  %res2 = call i32 (ptr, ...) @printf(ptr noundef @.str.1)
+  br label %_ret
+
+_ret:
+  ret i32 0
+}
+
+declare i32 @printf(ptr noundef, ...) 
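The new comment in the emitAddSub hunk is about the ordering baked into AArch64's extended-register operand form: the extension is applied before the shift. A small C++ model of the two orderings (sxtb semantics per the A64 ISA; function names here are illustrative):

```cpp
#include <cstdint>

// Extended-register operand, e.g. `cmp wN, wM, sxtb #s`: sign-extend the
// low byte first, then shift left by s (0-4).
int32_t sxtb_then_lsl(uint32_t reg, unsigned s) {
  int32_t extended = static_cast<int8_t>(reg & 0xFF); // sxtb
  return extended * (1 << s);                         // lsl #s (multiply avoids UB on negatives)
}

// What a folded `shl i8` actually requires: shift within 8 bits (discarding
// high bits), then extend for the wide compare.
int32_t shl_i8_then_sxtb(uint32_t reg, unsigned s) {
  uint8_t shifted = static_cast<uint8_t>((reg & 0xFF) << s);
  return static_cast<int8_t>(shifted);
}
```

For the test's value 0xD9 with shift 3, these give -312 and -56 respectively, which is why the fold is only safe when no extension is needed.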

Collaborator

Is it possible to simplify the test? I think it could

  • remove dso_local
  • remove the globals
  • remove the loads
  • remove the calls
  • remove --global-isel=false and -O0?

There are some existing tests in llvm/test/CodeGen/AArch64/arm64-fast-isel-icmp.ll, but they don't seem to include any with shift amounts at the moment.

Contributor Author

hmm in that case do you think it's more beneficial if I put this test in llvm/test/CodeGen/AArch64/arm64-fast-isel-icmp.ll?

Collaborator

Yeah that sounds like it might be a good place for it. Feel free to use the update_llc_test_checks script too to generate the check lines.

if ((SI->getOpcode() == Instruction::Shl) && (C->getZExtValue() < 4)) {
if ((SI->getOpcode() == Instruction::Shl) && (C->getZExtValue() < 4) &&
!NeedExtend) {
// We can only fold instructions that doesn't need extension because
Collaborator

Is it possible to reach this point now, if NeedExtend==false and ExtendType != AArch64_AM::InvalidShiftExtend?

Contributor Author

I didn't add ExtendType != AArch64_AM::InvalidShiftExtend because there is already a check before this nested-if statement in line 1232. Do you think it's still better to add the check?

Collaborator

Yeah - I meant that if ExtendType != InvalidShift then does that mean that NeedExtend is always false? Is it ever possible to get to the emitAddSub_rx(..) line now?

Contributor Author

Ah, gotcha. You're right: if ExtendType != AArch64_AM::InvalidShiftExtend, NeedExtend is always true, so the logic would never be reached. Maybe it's a better idea to remove that part of the logic altogether?

- remove dso_local
- remove the globals
- remove the loads
- remove the calls
- remove --global-isel=false and -O0
@brendaso1
Contributor Author

I have a check that is failing but I am not sure what the cause is. It looks like the last tests that failed are on MLIR: https://buildkite.com/llvm-project/github-pull-requests/builds/25125, but my fix has nothing to do with that part of the code, and all the automated checks were passing a few days ago. Is it because this branch does not have the most up-to-date commits from main? Should I merge or rebase my branch onto main in this case?

@davemgreen
Collaborator

Yep I think we can ignore those errors, they look unrelated.

Collaborator

@davemgreen davemgreen left a comment

Can you use the update_llc_test_checks.py script to update the test? Then this LGTM, thanks.

@brendaso1
Contributor Author

Added the generated CHECK lines to the unit tests. Let's see how the builders look.

Also, when I did more testing, it seems like the sample test wouldn't fail even if the Cmp is sign-extended rather than zero-extended in assembly. So I removed the related changes as well.

@davemgreen
Collaborator

Thanks. This LGTM. Are you happy for it to be squashed and merged?

@brendaso1
Contributor Author

brendaso1 commented Jan 3, 2024

Yes let's do it! 😁

@aemerson aemerson merged commit 923f6ac into llvm:main Jan 3, 2024
4 checks passed
@aemerson
Contributor

aemerson commented Jan 3, 2024

Thanks for the fix @brendaso1. I find the "squash and merge" button satisfying, so I hit it for you. If you contribute 2-3 more patches you can request commit access yourself.
