Skip to content

Conversation

v01dXYZ
Copy link
Contributor

@v01dXYZ v01dXYZ commented Jun 27, 2024

Fixes #96893.

Tasks remaining:

  • Add in the comment about EHPad it's also necessary for indirect targets.
  • Add a sanity test with a timeout (I could not set a timeout duration for a specific test only so I add the test file).

also the test was only added for x86, tell me if you think it's not enough.

FYI, I wrote another PR #104240 to solve the same issue that doesn't use check explicitly for FallThrough not being a Indirect Target block. Feel free to indicate which PR is best.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@v01dXYZ v01dXYZ force-pushed the 96893-branch-folding-callbr-indirect-targets-infinite-loop branch from 2bd22fb to ad95db2 Compare August 14, 2024 16:16
@v01dXYZ v01dXYZ marked this pull request as ready for review August 14, 2024 16:20
@llvmbot
Copy link
Member

llvmbot commented Aug 14, 2024

@llvm/pr-subscribers-backend-x86

Author: None (v01dXYZ)

Changes

Fixes #96893.

Tasks remaining:

  • Add in the comment about EHPad it's also necessary for indirect targets.
  • Add a sanity test with a timeout (I could not set a timeout duration for a specific test only so I add the test file).

Full diff: https://github.com/llvm/llvm-project/pull/96956.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/BranchFolding.cpp (+15-9)
  • (added) llvm/test/CodeGen/X86/branchfolding-no-inf-loop-with-callbr-indirect-blocks.ll (+18)
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index 1dc278586f1178..3a03a86cb44ada 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -1757,18 +1757,24 @@ bool BranchFolder::OptimizeBlock(MachineBasicBlock *MBB) {
       // advantage in "falling through" to an EH block, so we don't want to
       // perform this transformation for that case.
       //
-      // Also, Windows EH introduced the possibility of an arbitrary number of
-      // successors to a given block.  The analyzeBranch call does not consider
-      // exception handling and so we can get in a state where a block
-      // containing a call is followed by multiple EH blocks that would be
+      // Also, some constructs introduced the possibility of an arbitrary number
+      // of successors to a given block.
+      //
+      // - Windows EH: with EH blocks
+      // - CALLBR instruction: with Indirect Target blocks
+      //
+      // The analyzeBranch call does not consider exception handling or inline
+      // asm with control flow instructions. So we can get in a state where a
+      // block containing a call is followed by multiple blocks that would be
       // rotated infinitely at the end of the function if the transformation
-      // below were performed for EH "FallThrough" blocks.  Therefore, even if
-      // that appears not to be happening anymore, we should assume that it is
-      // possible and not remove the "!FallThrough()->isEHPad" condition below.
+      // below were performed for such blocks. Therefore, we should assume
+      // that it is possible and not remove the "!FallThrough()->isEHPad &&
+      // !FallThrough->isInlineAsmIndirectTarget()" condition below (even if
+      // in the case of EHPad that appears not to be happening anymore).
       MachineBasicBlock *PrevTBB = nullptr, *PrevFBB = nullptr;
       SmallVector<MachineOperand, 4> PrevCond;
-      if (FallThrough != MF.end() &&
-          !FallThrough->isEHPad() &&
+      if (FallThrough != MF.end() && !FallThrough->isEHPad() &&
+          !FallThrough->isInlineAsmBrIndirectTarget() &&
           !TII->analyzeBranch(PrevBB, PrevTBB, PrevFBB, PrevCond, true) &&
           PrevBB.isSuccessor(&*FallThrough)) {
         MBB->moveAfter(&MF.back());
diff --git a/llvm/test/CodeGen/X86/branchfolding-no-inf-loop-with-callbr-indirect-blocks.ll b/llvm/test/CodeGen/X86/branchfolding-no-inf-loop-with-callbr-indirect-blocks.ll
new file mode 100644
index 00000000000000..5a9557c3e343f8
--- /dev/null
+++ b/llvm/test/CodeGen/X86/branchfolding-no-inf-loop-with-callbr-indirect-blocks.ll
@@ -0,0 +1,18 @@
+; RUN: llc -mtriple x86_64 %s -stop-after=branch-folder
+
+; Check branch-folder do not keep swapping %indirect.target.block.1 and
+; %indirect.target.block.2
+define void @no_inf_loop() {
+entry:
+  br label %bb1
+
+bb1:
+  callbr void asm "", "!i,!i"() to label %bb1
+    [label %indirect.target.block.1, label %indirect.target.block.2]
+
+indirect.target.block.1:
+  br label %bb1
+
+indirect.target.block.2:
+  br label %bb1
+}

In OptimizeBlock, add support for CALLBR and indirect target blocks to
avoid infinite loop similarly to what is done with INVOKE and
EHPad (ie do not move the MBB to the end if fallthrough is an indirect
target block).

Improvement based on the commit ff6a1ed dealing with INVOKE and EHPad.

Reminder of CALLBR RFC:
https://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
@v01dXYZ v01dXYZ force-pushed the 96893-branch-folding-callbr-indirect-targets-infinite-loop branch from ad95db2 to 54edc37 Compare August 14, 2024 16:39
@v01dXYZ
Copy link
Contributor Author

v01dXYZ commented Aug 14, 2024

About the comment explaining why it's necessary to check Fallthrough not being EHPad or Indirect Target, I kept the following wording even if in the case of EHPad that appears not to be happening anymore. It seems to be wrong as the test branchfolding-catchpads.ll will inf loop if removing the check. So it still happens. @andykaylor do you think it's fair to remove this part from the comment ?

@andykaylor
Copy link
Contributor

@v01dXYZ It's been a very long time since I wrote that comment, but I think what I meant was that the check for EHPad is definitely necessary, and even if, at some time in the future, it looked as if the infinite loop wasn't happening if the check was removed, the check still shouldn't be removed. My description in the original code review (https://reviews.llvm.org/D27582) says that the fix had been reverted because it caused compile time issues and the original case wasn't failing anymore. Then I found another case where the infinite loop occurred, so when I recommitted the fix I added a note warning against reverting the fix again. I guess I never completely understood the conditions that triggered the infinite loop, but I was certain that the potential will always be there.

So, maybe it's worth rewording the comment, but I think it would be good to keep it in some form. How about this:

"Changes to the branch folding algorithm may allow the regression tests to pass without these checks, but the checks should still not be removed because the possibility for an infinite loop remains as long as there can be multiple blocks that meet the criteria for being moved to the end."

Maybe we should consider some more general solution where we keep track of which blocks we've moved to the end and skip those in subsequent loop iterations. I haven't looked at this code in a very long time, so I don't know how hard that would be to implement.

Copy link
Contributor

@andykaylor andykaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix looks good to me.

@andykaylor andykaylor requested review from rnk and HaohaiWen August 15, 2024 00:41
Copy link
Contributor

@HaohaiWen HaohaiWen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andykaylor
Copy link
Contributor

I hadn't noticed your question about whether x86 testing is sufficient. It probably isn't. I don't see anything target-specific in your test, and it looks like the branch folding pass is added for all targets if not explicitly disabled. Can you move the test to CodeGen/Generic?

@v01dXYZ
Copy link
Contributor Author

v01dXYZ commented Aug 16, 2024

I would like to replace the PrevBB->isSuccessor(&*FallThrough) with (PrevTBB == &*FallThrough) || (PrevFBB == &*FallThrough). I think it's the cause of the infinite loop.

It seems to be related to the fact that analyzeBranch(PrevBB, ...) ignores the instructions for INVOKE/INLINEASM_BR while those constructs still affect the successors/predecessors of the blocks. That's because those instructions are not terminators.

Using isSuccessor to check if a Block is a target of the terminating branch is faulty.

If the return value is as if the branch is analyzable, given a successor of PrevBB, it does not mean the branch refers to this specific successor. BTW if the branch is analyzable, the branch exists (PrevBB could not be branchless because of previous checks ensuring PrevBB does not fallthough).

For example, in the case of PrevBB having a JMP to MBB at the end (EBB0, EBB1 are blocks only used by an INLINEASM_BR). When using PrevBB->isSuccessor(&*FallThrough), the transform is applied and MBB is moved at the end. The next two iterations will do the same for EBB0, EBB1. After the third iteration, we are back at the initial placement. Thus, the infinite loop.

    +----------+               +----------+
    | PrevBB   |--+            | PrevBB   |--+
    | JMP MBB  |  |            | JMP MBB  |  |
    +----------+  |            +----------+  |
         |        |          _______/        |
         v        |         /                |
    +----------+  |         |  +----------+  |
    |   MBB    |  |         |  | EBB0     |<-+
    +----------+  |   ==>   |  +----------+  |
                  |         |                |
    +----------+  |         |  +----------+  |
    | EBB0     |<-+         |  | EBB0     |<-+
    +----------+  |         |  +----------+
                  |         |
    +----------+  |         |  +----------+
    | EBB1     |<-+         +->| MBB      |
    +----------+               +----------+

analyzeBranch returns false and set only PrevTBB pointing to MBB while not setting PrevFBB. Since PrevTBB != FallThrough, the transform is not applied when using PrevTBB/PrevFBB instead of PrevBB->isSuccessor(...).

Keeping the check about FallThrough not being a EHPad block is not absolutely necessary, but it prevents uselessly calling the expensive analyzeBranch since a catchpad could only be reached from an INVOKE.

As Indirect Targets could be reached from a branch, the check about not being an Indirect Target is conservative to prevent the infinite loop from reappearing.

@v01dXYZ
Copy link
Contributor Author

v01dXYZ commented Aug 23, 2024

@andykaylor I forgot to directly mention you.

@v01dXYZ
Copy link
Contributor Author

v01dXYZ commented Aug 28, 2024

@andykaylor ping (sorry for the fat paragraph). To summarize, I think we should only use the result of analyzeBranch instead of calling isSuccessor.

@andykaylor
Copy link
Contributor

@v01dXYZ Your analysis makes sense to me. I'm a bit concerned by your remark that analyzeBranch ignores INLINEASM_BR, but I suppose in the case of inline assembly we probably wouldn't eliminate an explicit branch instruction anyway, so maybe placing that block after the block that jumps to it would be of limited value anyway. In any case, that's a separate problem, and you should go ahead with the change you suggested to replace isSuccessor().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compiler hangs when using indirect labels
4 participants