[BOLT][AArch64] Fixes assertion errors occurred when perf2bolt was executed #83394

Open

kaadam commented Feb 29, 2024

BOLT only checks for the most common indirect branch pattern during branch analysis. This patch extends the logic with two other indirect branch patterns that differ slightly from the expected one.

Since these patterns are not related to jump tables (JT), they are marked as UNKNOWN branches.

Fixes: #83114

github-actions bot commented Feb 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

yota9 commented Mar 4, 2024

Hello, thank you for the change.
But aren't these patterns related to JT? We load the table address and the offset, add them, and jump, so I'm not completely sure why we want to skip them.
Also, please add a proper test.

kaadam commented Mar 26, 2024

Hi @yota9, sorry for the delayed response.
I have started working on handling these two patterns; maybe you can give me some pointers about them.
When preprocessing an indirect branch, we should extract PCRelBase and, if possible, JumpTable (label address) + Offset.

Consider the pattern below. The address produced by the adr instruction could be PCRelBase and, following the original logic, maybe the "JT" as well. But we only have the address of an entry in the sigall_set "section"; do we need to know the label address of sigall_set?

      //   adr     x6, 0x219fb0 <sigall_set+0x88> => PCRelBase, JT?
      //   add     x6, x6, x14, lsl #2
      //   ldr     w7, [x6]
      //   add     x6, x6, w7, sxtw
      //   br      x6
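To make the address arithmetic of this pattern concrete, here is a minimal sketch in C that models the five instructions. It is only an illustration, not BOLT code: `adr_target`, `branch_target`, `mem`, and `mem_base` are hypothetical names, and `mem` stands for a byte image of the binary. Note that after the first add, x6 holds the entry address, so the loaded 32-bit value is applied relative to the entry itself rather than to the table base.

```c
#include <stdint.h>

/* adr Xd, #imm: Xd = address of the adr instruction + signed immediate. */
uint64_t adr_target(uint64_t pc, int64_t imm) {
    return pc + (uint64_t)imm;
}

/*
 * Models the rest of the pattern:
 *   add x6, x6, x14, lsl #2   ; entry  = base + index * 4
 *   ldr w7, [x6]              ; value  = 32-bit word stored at entry
 *   add x6, x6, w7, sxtw      ; target = entry + sign_extend(value)
 *   br  x6
 * `mem` is a hypothetical little-endian byte image whose first byte
 * lives at address `mem_base`.
 */
uint64_t branch_target(const uint8_t *mem, uint64_t mem_base,
                       uint64_t table_base, uint64_t index) {
    uint64_t entry = table_base + (index << 2);
    const uint8_t *p = mem + (entry - mem_base);
    /* Assemble the 32-bit word byte by byte (little-endian). */
    uint32_t raw = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t offset = (int32_t)raw; /* sxtw: treat the word as signed */
    return entry + (uint64_t)(int64_t)offset;
}
```

With a table at 0x1000 whose entry 1 holds 0x20, `branch_target` yields 0x1024, i.e. the entry address 0x1004 plus 0x20.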

A snippet of the --print-cfg output:

.Ltmp7286 (10 instructions, align : 1)
  CFI State : 0
  Predecessors: .LFT5416
    00000268:   ldp     q5, q6, [x1], #0x20
    0000026c:   stp     q1, q2, [x3], #0x20
    00000270:   and     x3, x3, #0xfffffffffffffff0
    00000274:   sub     x2, x2, #0x20
    00000278:   hint    #0x0 # NOP: 1
    0000027c:   adr     x6, #-365276
    00000280:   add     x6, x6, x14, lsl #2
    00000284:   ldr     w7, [x6]
    00000288:   add     x6, x6, w7, sxtw
    0000028c:   br      x6 # UNKNOWN CONTROL FLOW
  CFI State: 0

Binary attached:
bubblesort_arm64_static.zip

yota9 commented Mar 26, 2024

Hi @kaadam! sigall_set is just the closest preceding symbol in the binary; it probably has nothing to do with the JT. Consider 0x219fb0 as the base address.

paschalis-mpeis commented

Two cases are identified in this patch and, to my understanding, the first of them (ShiftVal != 2) is hit by users more often. Virtually any stage-0 binary that is statically linked should hit it.

With the patch as is (i.e. returning 'false'), BOLT won't be able to build the CFG for such functions, meaning fewer optimizations but a stable binary regardless.
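The trade-off described above can be sketched as follows. This is a simplified illustration in C under stated assumptions, not BOLT's real code: `MatchedAdd`, `match_expected_pattern`, `classify`, and the `BranchKind` values are all hypothetical names, and the thread does not say which instruction the shift value belongs to, so it is modeled as a plain field.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a decoded instruction;
 * not BOLT's actual data structures. */
struct MatchedAdd {
    unsigned shift; /* shift amount found while matching the pattern */
};

enum BranchKind { BRANCH_JUMP_TABLE, BRANCH_UNKNOWN };

/* Returns true only for the expected shift-by-2 form. Any other shift
 * is reported as "no match" instead of tripping an assertion. */
bool match_expected_pattern(const struct MatchedAdd *add) {
    if (add->shift != 2)
        return false; /* bail out: the caller falls back to UNKNOWN */
    return true;
}

/* An unmatched branch is left as unknown control flow: no CFG is built
 * for the function, but the output binary stays stable. */
enum BranchKind classify(const struct MatchedAdd *add) {
    return match_expected_pattern(add) ? BRANCH_JUMP_TABLE : BRANCH_UNKNOWN;
}
```

The design choice here is to prefer a conservative "unknown" result over a crash; properly parsing the extra patterns is the follow-up work discussed in this thread.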

@kaadam has made some good progress locally, to properly parse these patterns, but it's still incomplete.

So I would suggest that we do this in 2 steps:

1) have a quick workaround in place:

Merge for now something like the latest commit, plus some added test:

Any .c file that is statically linked with libc triggers the first error.
Maybe we can emit an additional warning before returning false, or use --print-cfg or -v=1 and FileCheck the output.

test.c:

int main() { return 1; }

in lit test:

clang -O0 test.c -static -o out
llvm-bolt out -o out.bolt

2) use follow-up patches for a proper solution:

Parse the 2 patterns, extract any relevant info (JumpTable/Offset/Scale/Base), and adjust tests as needed.
The case where DefJTBaseAdd is an ADR would need an extra test. There's already good local progress in this direction.

Successfully merging this pull request may close these issues.

[BOLT] Perf2bolt is failing with assertion on AArch64
3 participants