BPF: make 32bit register spill with 64bit alignment
In llvm, for non-alu32 mode, the stack alignment is 64bit, so there is
only one 64bit spill per 64bit slot. For alu32 mode, the stack alignment
is 32bit, so it is possible to have two 32bit spills per
64bit slot.

Currently, the bpf kernel verifier does not preserve register state
for 32bit spills. That is, a 32bit register may hold a constant
value or a bounded range before the spill. After reload from the
stack, the information is lost, and sometimes this may cause
verifier failure. For a 64bit register spill, the verifier
does try to preserve the register state across the reload.

The current verifier can be modestly changed to handle one
32bit spill per 64bit stack slot with state-preserving reload.
Handling two 32bit spills per 64bit stack slot will require
substantial changes.

This patch changes the stack alignment for alu32 to be 64bit.
This way, for any 64bit slot in alu32 mode, only one
32bit or 64bit register value can be saved. Together
with the previously mentioned verifier enhancement, a 32bit
spill can be handled with its state preserved.
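
The effect of the alignment change can be sketched with a small model. This is a hypothetical simplification, not LLVM's frame layout code: the names `align_up` and `assign_spill_offsets` are made up for illustration, and the model assumes slots are handed out in order, growing downward from the frame pointer r10, with every slot rounded up to a fixed alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align; align must be a power of two. */
static unsigned align_up(unsigned n, unsigned align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model (an assumption, not taken from LLVM): give each
 * spill of sizes[i] bytes its own slot below r10, aligning every slot
 * to slot_align bytes. out[i] receives the distance below r10, so a
 * value of 8 means the spill lives at r10 - 8.
 * Returns the total number of frame bytes used.
 */
static unsigned assign_spill_offsets(const unsigned *sizes, size_t n,
                                     unsigned slot_align, unsigned *out)
{
    unsigned depth = 0;

    for (size_t i = 0; i < n; i++) {
        depth = align_up(depth + sizes[i], slot_align);
        out[i] = depth;
    }
    return depth;
}
```

Under this model, spill sizes of 4, 8, 4, 4, 4 bytes with `slot_align` set to 8 land at distances 8, 16, 24, 32 and 40 below r10, which is the same spacing as the r10 - 8 through r10 - 40 stores checked by the spill-alu32.ll test in this commit: no two 32bit spills share a 64bit slot.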

Note that llvm stack slot coalescing
seems to only do adjacent packing, which may leave some holes
in the stack. For example,
   stack slot 8   <== 8 bytes
   stack slot 4   <== 8 bytes with 4 byte hole
   stack slot 8   <== 8 bytes
   stack slot 4   <== 4 bytes

Differential Revision: https://reviews.llvm.org/D109073
yonghong-song committed Sep 21, 2021
1 parent 58abc8c commit ea72b03
Showing 2 changed files with 36 additions and 1 deletion.
2 changes: 1 addition & 1 deletion llvm/lib/Target/BPF/BPFRegisterInfo.td
@@ -36,7 +36,7 @@ foreach I = 0-11 in {
}

// Register classes.
-def GPR32 : RegisterClass<"BPF", [i32], 32, (add
+def GPR32 : RegisterClass<"BPF", [i32], 64, (add
(sequence "W%u", 1, 9),
W0, // Return value
W11, // Stack Ptr
35 changes: 35 additions & 0 deletions llvm/test/CodeGen/BPF/spill-alu32.ll
@@ -0,0 +1,35 @@
; RUN: llc -march=bpf -mcpu=v3 < %s | FileCheck %s
;
; Source code:
; void foo(int, int, int, long, int);
; int test(int a, int b, int c, long d, int e) {
; foo(a, b, c, d, e);
; __asm__ __volatile__ ("":::"r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "memory");
; foo(a, b, c, d, e);
; return 0;
; }
; Compilation flag:
; clang -target bpf -S -emit-llvm -O2 -mcpu=v3 t.c

; Function Attrs: nounwind
define dso_local i32 @test(i32 %a, i32 %b, i32 %c, i64 %d, i32 %e) local_unnamed_addr #0 {
entry:
tail call void @foo(i32 %a, i32 %b, i32 %c, i64 %d, i32 %e) #2
tail call void asm sideeffect "", "~{r0},~{r1},~{r2},~{r3},~{r4},~{r5},~{r6},~{r7},~{r8},~{r9},~{memory}"() #2

; CHECK: *(u32 *)(r10 - 8) = w5
; CHECK: *(u64 *)(r10 - 16) = r4
; CHECK: *(u32 *)(r10 - 24) = w3
; CHECK: *(u32 *)(r10 - 32) = w2
; CHECK: *(u32 *)(r10 - 40) = w1
; CHECK: call foo

tail call void @foo(i32 %a, i32 %b, i32 %c, i64 %d, i32 %e) #2
ret i32 0
}

declare dso_local void @foo(i32, i32, i32, i64, i32) local_unnamed_addr #1

attributes #0 = { nounwind "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="v3" }
attributes #1 = { "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="v3" }
attributes #2 = { nounwind }
