This repository contains the tool described in our upcoming OSDI'20 paper "Specification, implementation, and verification of just-in-time compilers for an in-kernel virtual machine".
This is a tool for formally verifying the Linux kernel BPF JITs that builds on our verification framework Serval. We have used this tool to aid the development in the following ways:
-
develop the RV32 JIT:
-
review the patches that add support for far jumps and branching in RV64:
-
find bugs, validate patches, and write test cases:
- arm, bpf: Fix bugs with ALU64 {RSH, ARSH} BPF_K shift by 0
- arm, bpf: Fix offset overflow for BPF_MEM BPF_DW
- arm64: insn: Fix two bugs in encoding 32-bit logical immediates
- bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh
- bpf, riscv: Fix tail call count off by one in RV32 BPF JIT
- bpf, riscv: Fix stack layout of JITed code on RV32
- riscv, bpf: Fix offset range checking for auipc+jalr on RV64
- bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_K shift by 0
- bpf, x32: Fix bug with ALU64 {LSH, RSH, ARSH} BPF_X shift by 0
- bpf, x32: Fix bug with JMP32 JSET BPF_X checking upper bits
- bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension
- bpf, x86_32: Fix clobbering of dst for BPF_JSET
- bpf, x86: Fix encoding for lower 8-bit registers in BPF_STX BPF_B
- bpf, selftests: Add test for BPF_STX BPF_B storing R10
- selftests: bpf: Add test for JMP32 JSET BPF_X with upper bits set
- selftests: bpf: add tests for shifts by zero
-
add optimizations:
- bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction
- bpf, arm: Optimize ALU ARSH K using asr immediate instruction
- bpf, arm64: Optimize AND,OR,XOR,JSET BPF_K using arm64 logical immediates
- bpf, arm64: Optimize ADD,SUB,JMP BPF_K using arm64 add/sub immediates
- bpf, riscv: Enable zext optimization for more RV64G ALU ops
- Merge branch 'bpf-rv64-jit'
-
add RVC support to the RV64 JIT:
-
Download and install Racket (tested on v8.9).
-
Uninstall any previous versions of Serval:
raco pkg remove serval
- Clone the repository and install a good version of Serval with BPF JIT verification and synthesis:
git clone --recursive 'https://github.com/uw-unsat/jitterbug.git'
cd jitterbug
raco pkg install --auto ./serval
You may also wish to install the Boolector SMT solver. We tested using Boolector v3.2.1 configured to use the CaDiCal SAT solver. The verification and synthesis will attempt to use it first, before falling back to Z3, which may take significantly longer (e.g., more than 10x slower for synthesis).
-
arch/
contains the C code of the BPF JITs from the Linux kernel. -
racket/<arch>/
contains the Racket implementations of the BPF JITs that correspond to the C code underarch/
.racket/<arch>/spec.rkt
contains the specification for<arch>
.racket/<arch>/synthesis.rkt
(if present) contains the synthesis infrastructure and sketch for the BPF JIT for<arch>
.racket/rv32/bpf_jit_comp32.c.tmpl
is a template used with theracket/rv32/bpf_jit_comp32.rkt
to generate the final C implementation.
-
racket/lib/
contains the BPF JIT verification infrastructure.racket/lib/bpf-common.rkt
contains the BPF JIT correctness specification and other common BPF functionality.racket/lib/riscv-common.rkt
provides features specific to the JIT for the RISC-V ISA.racket/lib/bvaxiom.rkt
axiomatizes bvmul/bvudiv/bvurem.racket/lib/lemmas.lean
contains the axiomatization of bitvector operations in Lean.
-
racket/test/
contains verification and synthesis test cases for different classes of instructions.
Run all of the verification test cases in parallel using 8 jobs:
raco test --jobs 8 racket/test
You can also run individual files for a specific class of instructions:
raco test racket/test/rv64/verify-alu32-x.rkt
As an example, let's inject a bug fixed in commit 1e692f09e091.
Specifically, remove the zero extension for BPF_ALU|BPF_ADD|BPF_X
in racket/rv64/bpf_jit_comp64.rkt
:
[((BPF_ALU BPF_ADD BPF_X)
(BPF_ALU64 BPF_ADD BPF_X))
(emit (if is64 (rv_add rd rd rs) (rv_addw rd rd rs)) ctx)
(when (&& (! is64) (! (->prog->aux->verifier_zext ctx)))
(emit_zext_32 rd ctx))]
Now we have a buggy JIT implementation:
[((BPF_ALU BPF_ADD BPF_X)
(BPF_ALU64 BPF_ADD BPF_X))
(emit (if is64 (rv_add rd rd rs) (rv_addw rd rd rs)) ctx)]
Run the verification:
raco test racket/test/rv64/verify-alu32-x.rkt
Verification will fail with a counterexample:
[ RUN ] "VERIFY (BPF_ALU BPF_ADD BPF_X)"
<unknown>: <unknown>: bpf-jit-specification: Final registers must match
--------------------
riscv64-alu32-x tests > VERIFY (BPF_ALU BPF_ADD BPF_X)
FAILURE
name: check-unsat?
location: [...]/bpf-common.rkt:218:4
params:
'((model
[x?$0 #f]
[r0$0 (bv #x0000000080000000 64)]
[r1$0 (bv #x0000000000000000 64)]
[insn$0 (bv #x00800000 32)]
[offsets$0 (fv (bitvector 32)~>(bitvector 32))]
[target-pc-base$0 (bv #x0000000000000000 64)]
[off$0 (bv #x8000 16)]))
--------------------
The counterexample produced by the tool gives BPF register values
that will trigger the bug: it chooses r0
to be 0x0000000080000000
and r1
to be 0x0000000000000000
. This demonstrates the bug
because the RISC-V instructions sign extend the result of the 32-bit
addition, where the BPF instruction performs zero extension.
The verification works by proving equivalence between a BPF instruction and the RISC-V instructions generated by the JIT for that instruction.
As a concrete example, consider verifying the BPF_ALU|BPF_ADD|BPF_X
instruction. The verification starts by constructing a symbolic BPF
instruction where the destination and source register fields can
take on any legit value:
BPF_ALU32_REG(BPF_ADD, {{rd}}, {{rs}})
Next, the verifier symbolically evaluates the JIT on the BPF instruction, producing a set of possible sequences of symbolic RISC-V instructions:
addw {{rv_reg(rd)}} {{rv_reg(rd)}} {{rv_reg(rs)}}
slli {{rv_reg(rd)}} {{rv_reg(rd)}} 32
srli {{rv_reg(rd)}} {{rv_reg(rd)}} 32
Here, rv_reg
denotes the RISC-V register associated
with a particular BPF register.
Next, the tool proves that every possible sequence of generated
RISC-V instructions has the equivalent behavior as the original BPF
instruction, for all possible choices of registers rd
and rs
,
and for all initial values of all RISC-V general-purpose registers.
To do so, it leverages automated verifiers provided by the Serval
verification framework, as follows.
The verifier starts with a symbolic BPF state and symbolic RISC-V
state, called bpf-state
and riscv-state
, assuming that the two
initial states match, e.g., riscv-state[rv_reg(reg)] == bpf-state[reg]
for all reg
. Next, it runs the Serval BPF and RISC-V verifiers
on the BPF instruction over bpf-state
and every possible sequence
of generated RISC-V instructions over riscv-state
, respectively.
It then proves that the final BPF and RISC-V states still match.
Support for more complex instructions, such as jumps and branches,
requires additional care. For the details, you can see the JIT
correctness definition in lib/bpf-common.rkt:verify-jit-refinement
.
This complexity comes from having to prove that the generated
instructions preserve a correspondence between the BPF program
counter and the RISC-V program counter.
To help develop and optimize the RV32 JIT, we used Rosette's program synthesis feature to synthesize per-instruction compilers for a subset of BPF instructions.
The synthesis process takes as input the BPF and RISC-V interpreters from Serval, the BPF JIT correctness specification, and a program sketch which describes the structure of the code to search for. Synthesis then exhaustively searches increasingly large candidate sequences in the search space until it finds one that satisfies the JIT correctness specification.
You can try this feature out by running the following:
raco test racket/test/rv32/synthesize-alu64-x.rkt
It will produce output similar to the following:
riscv32-alu64-x synthesis
Running test "SYNTHESIZE (BPF_ALU64 BPF_SUB BPF_X)"
Using solver #<boolector>
Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 0
Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 1
Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 2
Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 3
Synthesizing for op '(BPF_ALU64 BPF_SUB BPF_X) with size 4
Solution found for '(BPF_ALU64 BPF_SUB BPF_X):
void emit_op(u8 rd, u8 rs, s32 imm, struct rv_jit_context *ctx) {
emit(rv_sub(RV_REG_T1, hi(rd), hi(rs)), ctx);
emit(rv_sltu(RV_REG_T0, lo(rd), lo(rs)), ctx);
emit(rv_sub(hi(rd), RV_REG_T1, RV_REG_T0), ctx);
emit(rv_sub(lo(rd), lo(rd), lo(rs)), ctx);
}
cpu time: 3513 real time: 141078 gc time: 324
In this example, synthesis was able to find a sequence of four RV32
instructions that correctly emulate the behavior of a BPF_ALU64 BPF_SUB BPF_X
instruction. This solution is printed out as a C
function emit_op
. You can see how this is integrated into the
final JIT in arch/riscv/net/bpf_jit_comp32.c
, in the emit_rv32_alu_r64
function, in the BPF_SUB
case. Note that the instructions in the
JIT may be different than the solution the solver on your machine
happens to find.
You can play around with synthesizing other instructions by looking
in the other racket/test/rv32/synthesize-*.rkt
files, and
uncommenting cases for other instructions in those files.
Not every instruction sequence in the final JIT was found using synthesis. Many synthesis queries either take extremely long or do not produce any results: especially those that require very long instruction sequences (e.g., 64-bit shifts), or those that use non-linear arithmetic (multiplication, division, etc.) For these instructions, we manually write an implementation and verify the correctness using the other techniques described in this guide.
The RV32 JIT is split in two parts: the Racket implementation in
racket/rv32/bpf_jit_comp32.rkt
contains the code required for
generating RV32 instruction sequences for a given BPF instruction,
and racket/rv32/bpf_jit_comp32.c.tmpl
is a template containing
glue C code required to have the JIT interface compile in C and
interface with the rest of the kernel. The template contains holes
that are filled in by extracting the corresponding Racket source
code to C.
One can generate arch/riscv/net/bpf_jit_comp32.c
with the header
arch/riscv/net/bpf_jit.h
via:
make gen
These files can be copied to the Linux kernel source tree for building and testing.
The test cases under racket/test/
show which instructions the JIT
is currently verified for. For those instructions, it proves that
the JIT is correct for all possible initial register values, for
all jump offsets, for all immediate values, etc.
The verification proves the JIT correct for individual BPF instructions at a time. There are several properties of the JIT that are currently not specified or verified:
-
Limited support for verifying prologue and epilogue, on riscv32 only.
-
Limited support for verifying call/exit instructions:
BPF_CALL
,BPF_TAIL_CALL
, andBPF_EXIT
are supported on riscv32 and riscv64;BPF_CALL
andBPF_EXIT
are supported on arm32 and arm64.
The verification makes the following assumptions:
-
The
ctx->offset
mapping has been correctly constructed. -
The BPF program has passed the kernel's BPF verifier: e.g., it assumes no divide-by-zero or out-of-range shifts.
-
The number of BPF instructions is less than 16M and the generated RISC-V image is smaller than 128MB. These bounds can be increased but will increase overall verification time for jumps.
-
The translation between the verified JIT in Racket and the C code is trusted. Any mismatches in this translation can mean there are bugs in the C version not present in the verified one.