[BOLT] Avoid reference updates for non-JT symbol operands #88838

Merged
merged 1 commit into from
Apr 30, 2024

Conversation

linsinan1995

Add a check to skip updating references for operands that do not directly refer to jump table symbols but fall within a jump table's address range to prevent unintended modifications.

@linsinan1995

The validate-memrefs pass wrongly rewrites a correct reference into a jump table reference, which changes the execution result.

(a.out is compiled from jt-symbol-disambiguation-4.s attached in this PR)

+ ./a.out
FF
+ ./llvm-bolt -v=2 -jump-tables=move a.out -o a.out-opt
+ ./a.out-opt
5FFC00E
+ ./llvm-bolt -v=2 a.out -o a.out-opt-nomove
+ ./a.out-opt-nomove
FF

a.out

0000000000401160 <foo>:
  401160:       48 c7 c0 00 00 00 00    mov    $0x0,%rax
  401167:       ff 24 c5 18 20 40 00    jmpq   *0x402018(,%rax,8) // JT label address 0x402018
                        40116a: R_X86_64_32S    .rodata+0x18
  ...

0000000000401130 <main>:
  401130:       48 c7 c0 f0 ff ff ff    mov    $0xfffffffffffffff0,%rax
  401137:       8b 90 19 20 40 00       mov    0x402019(%rax),%edx // var `c` address 0x402008
                        401139: R_X86_64_32S    c+0x11
  40113d:       89 d6                   mov    %edx,%esi

0000000000402008 <c>:
  402008:       01 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00     ................
  402018:       71 11 40 00 00 00 00 00 71 11 40 00 00 00 00 00     q.@.....q.@.....
  ...

a.out-opt

0000000000800122 <foo>:
  800122:       48 c7 c0 00 00 00 00    mov    $0x0,%rax
  800129:       ff 24 c5 20 03 80 00    jmpq   *0x800320(,%rax,8) # JT label address 0x800320
  ...

0000000000800100 <main>:
  800100:       48 c7 c0 f0 ff ff ff    mov    $0xfffffffffffffff0,%rax
  800107:       8b 90 21 03 80 00       mov    0x800321(%rax),%edx
  80010d:       89 d6                   mov    %edx,%esi

0000000000800320 <.rodata.cold>:
  800320:       30 01                   xor    %al,(%rcx)
  800322:       80 00 00              addb   $0x0,(%rax)
  800325:       00 00                   add    %al,(%rax)
  800327:       00 30                   add    %dh,(%rax)
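
To make the address arithmetic in the two dumps concrete, here is a minimal C++ sketch. The constants are copied from the disassembly above; the helper name `effectiveAddress` is introduced only for illustration and is not part of BOLT.

```cpp
#include <cstdint>

// Effective address of `disp(%base)`: displacement plus base register,
// computed with 64-bit wraparound.
constexpr uint64_t effectiveAddress(uint64_t Disp, int64_t Base) {
  return Disp + static_cast<uint64_t>(Base);
}

// Addresses taken from the dumps above.
constexpr uint64_t CAddr = 0x402008;     // start of `c` in a.out
constexpr uint64_t OldJTAddr = 0x402018; // jump table label in a.out
constexpr uint64_t NewJTAddr = 0x800320; // jump table label in a.out-opt
constexpr int64_t Rax = -0x10;           // set by `mov $0xfffffffffffffff0,%rax`

// In a.out the displacement is c+0x11, and the load reads the byte at c+1.
static_assert(effectiveAddress(CAddr + 0x11, Rax) == CAddr + 1,
              "original load stays inside c");

// The displacement taken alone is one byte past the jump table label,
// which is what makes it look like a jump table reference.
static_assert(CAddr + 0x11 == OldJTAddr + 1,
              "displacement falls inside the jump table range");

// In a.out-opt the displacement follows the moved jump table, so the load
// now reads 15 bytes before the moved table instead of reading from c.
static_assert(effectiveAddress(NewJTAddr + 1, Rax) == NewJTAddr - 0xf,
              "rewritten load no longer points into c");
```

The byte at `c+1` is `ff` in the dump of `c`, which is consistent with the `FF` printed by the unmodified binary.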

main before validate-memrefs (from the BOLT log)

.LBB07 (8 instructions, align : 1)
  Entry Point
  CFI State : 0
    00000000: 	movq	$-0x10, %rax
    00000007: 	movl	c+17(%rax), %edx
    0000000d: 	movl	%edx, %esi
    0000000f: 	movl	$SYMBOLat0x402038, %edi
    00000014: 	movl	$0x0, %eax
    00000019: 	callq	printf@PLT
    0000001e: 	xorl	%eax, %eax
    00000020: 	retq
  CFI State: 0

main after validate-memrefs (from the BOLT log)

.LBB07 (8 instructions, align : 1)
  Entry Point
  CFI State : 0
    00000000: 	movq	$-0x10, %rax
    00000007: 	movl	"JUMP_TABLE/foo/1.0"+1(%rax), %edx
    0000000d: 	movl	%edx, %esi
    0000000f: 	movl	$SYMBOLat0x402038, %edi
    00000014: 	movl	$0x0, %eax
    00000019: 	callq	printf@PLT
    0000001e: 	xorl	%eax, %eax
    00000020: 	retq
  CFI State: 0

@linsinan1995 linsinan1995 force-pushed the validate-memrefs branch 2 times, most recently from 01deba3 to 696f3a7 Compare April 16, 2024 07:16
@maksfb

maksfb commented Apr 16, 2024

Thank you for the fix. How did you discover the problem?

@linsinan1995

> Thank you for the fix. How did you discover the problem?

Hi @maksfb, I found this problem in an internal test suite built with the clang/LLVM toolchain. After some investigation, I saw that SCEV-based optimizations in LLVM, such as loop-reduce and slsr, can generate this kind of pattern.

LLVM IR log:

define dso_local i32 @main(i32 noundef %0, ptr nocapture noundef readnone %1) local_unnamed_addr #0 {
  ...
  br label %5

5:                                                ; preds = %5, %2
  %6 = phi i64 [ 0, %2 ], [ %62, %5 ]
  %7 = phi i32 [ %4, %2 ], [ %61, %5 ]
  ...
  %36 = getelementptr inbounds [8 x [2 x i8]], ptr @c, i64 0, i64 %6, i64 1
  %37 = load i8, ptr %36, align 1, !tbaa !8
  %38 = sext i8 %37 to i32
  ...
  %50 = ashr i32 %49, 8
  %62 = add nuw nsw i64 %6, 1
  %63 = icmp eq i64 %62, 8
  br i1 %63, label %64, label %5, !llvm.loop !9

*** IR Dump After Loop Strength Reduction (loop-reduce) ***
; Preheader:
  ...
  br label %5

; Loop:
5:                                                ; preds = %5, %2
  %6 = phi i64 [ %64, %5 ], [ -16, %2 ]
  ...
  %37 = getelementptr i8, ptr @c, i64 %6
  %38 = getelementptr i8, ptr %37, i64 17
  %39 = load i8, ptr %38, align 1, !tbaa !8
  %40 = sext i8 %39 to i32
  %41 = xor i32 %35, %40
  %42 = sext i32 %41 to i64

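In this case, LLVM rewrites getelementptr inbounds [8 x [2 x i8]], ptr @c, i64 0, i64 %6, i64 1 into two simpler operations: %37 = getelementptr i8, ptr @c, i64 %6 and getelementptr i8, ptr %37, i64 17. The induction variable now starts at -16, and the constant 17 compensates for it, so the first access is still c+1.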
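
The equivalence of the two addressing forms can be checked with a small C++ sketch. The function names are hypothetical, and the per-iteration step of 2 for the reduced induction variable is an assumption inferred from the `[2 x i8]` element size, since the increment is elided in the IR dump.

```cpp
#include <cstdint>

// Byte offset from `c` of element c[i][1] in an [8 x [2 x i8]] array.
constexpr int64_t originalOffset(int64_t I) { return I * 2 + 1; }

// Induction variable after loop-reduce: starts at -16 (from the phi in the
// dump). The step of 2 per iteration is an assumption, not shown in the dump.
constexpr int64_t reducedIV(int64_t I) { return -16 + 2 * I; }

// Byte offset from `c` after loop-reduce: induction variable plus the
// constant 17 from the second getelementptr.
constexpr int64_t reducedOffset(int64_t I) { return reducedIV(I) + 17; }

// Both forms address the same byte on each of the 8 iterations.
constexpr bool sameForAllIterations() {
  for (int64_t I = 0; I < 8; ++I)
    if (originalOffset(I) != reducedOffset(I))
      return false;
  return true;
}

static_assert(sameForAllIterations(),
              "loop-reduce keeps the accessed addresses unchanged");
```

The accesses never leave `c`; only the constant part of the address expression, `c + 17`, lands past the end of the 16-byte array.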

After PHI elimination:

  %7:gr64 = MOV64ri32 -16
  %0:gr32 = MOV32rm $rip, 1, $noreg, @b, $noreg :: (dereferenceable load (s32) from @b, !tbaa !4)
  %60:gr64 = COPY killed %7:gr64
  %61:gr32 = COPY killed %0:gr32

bb.1 (%ir-block.5):
; predecessors: %bb.0, %bb.1
  successors: %bb.2(0x04000000), %bb.1(0x7c000000); %bb.2(3.12%), %bb.1(96.88%)

  %2:gr32 = COPY killed %61:gr32
  %1:gr64 = COPY killed %60:gr64
  ...
  %41:gr32 = MOVSX32rm8 %1:gr64, 1, $noreg, @c + 17, $noreg :: (invariant load (s8) from %ir.38, !tbaa !8)
  %42:gr32 = XOR32rr killed %39:gr32(tied-def 0), killed %41:gr32, implicit-def dead $eflags
  ...
  %60:gr64 = COPY killed %4:gr64
  %61:gr32 = COPY %3:gr32
  JCC_1 %bb.1, 5, implicit killed $eflags
  JMP_1 %bb.2

Then we have

movq	$-0x10, %rax
movl	c+17(%rax), %edx

@maksfb

maksfb commented Apr 18, 2024

Thanks for the context. It's interesting that you hit a corner case where the pass does the exact opposite of what it is supposed to do: it fetches the symbol at TargetAddress - 1, which happens to be a jump table symbol, and uses it in place of the original non-jump-table symbol.

Instead of verifying if the symbol Sym matches the first label of the jump table, I suggest we detect the jump table at the address of the symbol. I.e.:

  JumpTable *JT = BC.getJumpTableContainingAddress(BD->getAddress());

This way, we are also making sure the symbol will not collide with other symbols registered at the same address. Then you'll also have to adjust the way the new object is created and used:

  MCSymbol *NewSym = BC.getOrCreateGlobalSymbol(BD->getAddress() - 1, "DATAat");
  BC.MIB->setOperandToSymbolRef(Inst, OperandNum, NewSym, Offset + 1, &*BC.Ctx,
                                0);
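
As a rough illustration of why the lookup address matters, the following C++ sketch models the two checks on a toy registry. It is not BOLT code: `JumpTableRange`, `insideJumpTable`, `aliasesByTarget` and `aliasesBySymbol` are hypothetical names, and the 16-byte table size is an assumption based on the two entries visible in the dump of `.rodata`.

```cpp
#include <cstdint>

// Toy stand-in for a jump table's address range.
struct JumpTableRange {
  uint64_t Start;
  uint64_t Size;
};

// The single jump table from the example binary. The size of 16 bytes is an
// assumption (two 8-byte entries are visible in the dump).
constexpr JumpTableRange Tables[] = {{0x402018, 16}};

// True if Addr falls inside any registered jump table.
constexpr bool insideJumpTable(uint64_t Addr) {
  for (const JumpTableRange &JT : Tables)
    if (Addr >= JT.Start && Addr - JT.Start < JT.Size)
      return true;
  return false;
}

// Check keyed on Symbol + Addend: flags `c+17` as a jump table reference.
constexpr bool aliasesByTarget(uint64_t SymAddr, uint64_t Addend) {
  return insideJumpTable(SymAddr + Addend);
}

// Check keyed on the symbol's own address: `c` is not a jump table.
constexpr bool aliasesBySymbol(uint64_t SymAddr) {
  return insideJumpTable(SymAddr);
}

static_assert(aliasesByTarget(0x402008, 17), "c+17 lands in the jump table");
static_assert(!aliasesBySymbol(0x402008), "c itself is outside the table");
```

With the lookup keyed on the symbol's address, the `c+17` operand from `main` is left alone, while a symbol that really sits at the jump table address is still detected.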

@maksfb

maksfb commented Apr 25, 2024

@linsinan1995, does the above make sense to you?

@linsinan1995

@maksfb sorry for the delay. Thank you for your suggestions. The changes have been made accordingly.

@maksfb maksfb left a comment

LGTM. Thanks again!

@linsinan1995

linsinan1995 commented Apr 29, 2024

@maksfb, one thing I am not sure about is the IsLegitAccess check in this pass. I think a case like bolt/test/runtime/X86/jt-symbol-disambiguation.s is possible where both .end_of_table and the JT are in the same function, and then BOLT will generate wrong code again.

For example:

  .text
  .globl _start
  .type _start, @function
_start:
  .cfi_startproc
  movq   (%rsp), %rdi
  xor    %rax,%rax
  and    $0x3,%rdi
  leaq   .JT1(%rip), %rax
  movslq  (%rax, %rdi, 4), %rdi
  addq   %rax, %rdi
  jmpq   *%rdi
.LBB1:
  movl   $0x1,%eax
  jmp    .LBB5
.LBB2:
  movl   $0x2,%eax
  jmp    .LBB5
.LBB3:
  movl   $0x3,%eax
  jmp    .LBB5
.LBB4:
  movl   $0x4,%eax
.LBB5:
  leaq   .start_of_table(%rip), %rsi  # iterator
  leaq   .end_of_table(%rip), %rdi    # iterator end
.LBB6:
  cmpq %rsi, %rdi
  je .LBB7
  movq (%rsi), %rbx
  leaq 8(%rsi), %rsi            # ++iterator
  jmp .LBB6
.LBB7:
  xor   %rdi, %rdi
  callq exit@PLT
  .cfi_endproc
  .size _start, .-_start

# ----
# Data section
# ----
  .section .rodata,"a",@progbits
  .p2align 3
.start_of_table:
  .quad 123
  .quad 456
  .quad 789
.end_of_table:
.JT1:
  .long .LBB1 - .JT1
  .long .LBB2 - .JT1
  .long .LBB3 - .JT1
  .long .LBB4 - .JT1

Although this code is not generated directly by the compiler, BOLT may need more checks for such overlapping-address cases, or at least more warnings in the future.
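
The overlap in the assembly above can be spelled out with section-relative offsets. This is only a sketch: the offsets assume `.start_of_table` sits at offset 0 of `.rodata`, and the names are illustrative.

```cpp
#include <cstdint>

// Section-relative layout of the .rodata data above, assuming
// .start_of_table is at offset 0.
constexpr uint64_t StartOfTable = 0;
constexpr uint64_t EndOfTable = StartOfTable + 3 * 8; // three .quad entries
constexpr uint64_t JT1 = EndOfTable;                  // label directly follows
constexpr uint64_t JT1Size = 4 * 4;                   // four .long entries

// Address-only test for "points into the jump table".
constexpr bool insideJT1(uint64_t Offset) {
  return Offset >= JT1 && Offset - JT1 < JT1Size;
}

// The end-iterator label and the jump table label share one address, so an
// address-only check cannot tell the two uses apart.
static_assert(EndOfTable == JT1, "both labels resolve to the same address");
static_assert(insideJT1(EndOfTable), "end iterator looks like a JT reference");
```

Here both `.JT1` and `.end_of_table` are referenced from `_start`, so knowing which function owns the jump table does not separate the two uses either.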

@maksfb

maksfb commented Apr 29, 2024

Yes, you are correct. We planned to add data flow analysis to disambiguate such cases.

@linsinan1995 linsinan1995 merged commit 9d5411f into llvm:main Apr 30, 2024
4 checks passed
# REQUIRES: system-linux


# RUN: %clang -no-pie %s -o %t.exe -Wl,-q

This test fails on my AArch64 Linux machine.
The clang command line seems to be missing something that explicitly tells it to target x86. Some of the other tests in this directory appear to do that indirectly by adding %cflags to the clang command line.

On it.

Reverted.

@aaupov aaupov left a comment

Please address the issues with the test.

# REQUIRES: system-linux


# RUN: %clang -no-pie %s -o %t.exe -Wl,-q

Please use llvm-mc + lld for assembly tests (check bolt/test/X86 for examples).


# RUN: %clang -no-pie %s -o %t.exe -Wl,-q

# RUN: %t.exe

@aaupov aaupov Apr 30, 2024

Runnable tests need to be under bolt/test/runtime. But in this case, there's no need to run the binary to verify the behavior. Please remove the lines that run the binary.

# RUN: %t.exe
# RUN: llvm-bolt -funcs=main,foo/1 %t.exe -o %t.exe.bolt -jump-tables=move
# RUN: %t.exe.bolt

Please add a CHECK line verifying the intended behavior of the pass: that the output binary contains the correct reference.

# REQUIRES: system-linux


# RUN: %clang -no-pie %s -o %t.exe -Wl,-q

Reverted.

ayermolo pushed a commit to ayermolo/llvm-project that referenced this pull request May 30, 2024
Summary:
In ValidateMemRefs pass, when we validate references of the form
`Symbol + Addend`, we should check Symbol against aliasing a jump table
instead of the Symbol + Addend value.

llvm#88838

Test Plan: NFC

Reviewers: aaupov, #llvm-bolt

Reviewed By: aaupov

Differential Revision: https://phabricator.intern.facebook.com/D56213679

Tags: accept2ship
maksfb added a commit that referenced this pull request Jun 4, 2024
In ValidateMemRefs pass, when we validate references in the form of
`Symbol + Addend`, we should check `Symbol` not `Symbol + Addend`
against aliasing a jump table.

Recommitting with a modified test case:
#88838

Co-authored-by: sinan <sinan.lin@linux.alibaba.com>
vedantparanjape-amd pushed a commit to vedantparanjape-amd/llvm-project that referenced this pull request Jun 7, 2024
In ValidateMemRefs pass, when we validate references in the form of
`Symbol + Addend`, we should check `Symbol` not `Symbol + Addend`
against aliasing a jump table.

Recommitting with a modified test case:
llvm#88838

Co-authored-by: sinan <sinan.lin@linux.alibaba.com>