Reject BPF program if uninit stack is accessed
Reject programs if registers are used before initialized
Make undefined behavior check optional

Signed-off-by: Alan Jowett <alanjo@microsoft.com>
Alan-Jowett committed May 21, 2024
1 parent 08b451b commit 1156cfc
Showing 7 changed files with 594 additions and 84 deletions.
94 changes: 94 additions & 0 deletions libfuzzer/README.md
@@ -0,0 +1,94 @@
# ubpf_fuzzer

This is a libFuzzer-based fuzzer for uBPF.

To build, run:
```
cmake \
-G Ninja \
-S . \
-B build \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DUBPF_ENABLE_LIBFUZZER=1
cmake --build build
```

To run, create one folder for the corpus and one for the artifacts of any crashes found, then start the fuzzer:

```
mkdir corpus
mkdir artifacts
build/bin/ubpf_fuzzer corpus -artifact_prefix=artifacts/
```

Optionally, add `-jobs=100` to run 100 fuzzing jobs, each of which runs until it finds a crash, so that up to 100 crashes are gathered in one invocation.
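
libFuzzer can start from an empty corpus, but a hand-made seed may help it reach loadable programs sooner. The harness in this commit reads each input as a 32-bit program length, followed by the program bytes, followed by the test memory. Below is a minimal sketch of packing such an input. `pack_fuzzer_input` is a hypothetical helper, not part of uBPF, and the sketch assumes a little-endian length prefix, which is how `split.sh` decodes it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of uBPF): builds one fuzzer input in the
// layout the harness expects: [u32 program length][program][memory].
// Assumption: the length prefix is little-endian, matching split.sh.
std::vector<uint8_t> pack_fuzzer_input(const std::vector<uint8_t>& program,
                                       const std::vector<uint8_t>& memory)
{
    const uint32_t length = static_cast<uint32_t>(program.size());
    std::vector<uint8_t> out;
    out.reserve(4 + program.size() + memory.size());

    // Emit the length one byte at a time, least significant byte first.
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<uint8_t>((length >> shift) & 0xff));
    }

    out.insert(out.end(), program.begin(), program.end());
    out.insert(out.end(), memory.begin(), memory.end());
    return out;
}
```

Writing the returned bytes to a file in `corpus/` before starting the fuzzer gives libFuzzer that input as a starting point.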

This will produce a lot of output that looks like:
```
#529745 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 264872 rss: 429Mb L: 50/188 MS: 3 CrossOver-ChangeBit-EraseBytes-
#529814 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 264907 rss: 429Mb L: 45/188 MS: 4 ChangeBit-ShuffleBytes-PersAutoDict-EraseBytes- DE: "\005\000\000\000\000\000\000\000"-
#530202 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 265101 rss: 429Mb L: 52/188 MS: 3 ChangeByte-ChangeASCIIInt-EraseBytes-
#531224 REDUCE cov: 518 ft: 934 corp: 443/22Kb lim: 2875 exec/s: 265612 rss: 429Mb L: 73/188 MS: 2 CopyPart-PersAutoDict- DE: "\001\000\000\000"-
#531750 REDUCE cov: 518 ft: 934 corp: 443/22Kb lim: 2875 exec/s: 265875 rss: 429Mb L: 45/188 MS: 1 EraseBytes-
#532127 REDUCE cov: 519 ft: 935 corp: 444/22Kb lim: 2875 exec/s: 266063 rss: 429Mb L: 46/188 MS: 2 ChangeBinInt-ChangeByte-
#532246 REDUCE cov: 519 ft: 935 corp: 444/22Kb lim: 2875 exec/s: 266123 rss: 429Mb L: 66/188 MS: 4 ChangeBit-CrossOver-ShuffleBytes-EraseBytes-
#532357 NEW cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266178 rss: 429Mb L: 55/188 MS: 1 ChangeBinInt-
#532404 REDUCE cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266202 rss: 429Mb L: 57/188 MS: 2 ChangeBit-EraseBytes-
#532486 REDUCE cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266243 rss: 429Mb L: 44/188 MS: 2 EraseByte
```

If the fuzzer finds a crash, it stops and prints a report like:
```
=================================================================
==376403==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffca9d3cda0 sp 0x7ffca9d3cb98 T0)
==376403==Hint: pc points to the zero page.
==376403==The signal is caused by a READ memory access.
==376403==Hint: address points to the zero page.
#0 0x0 (<unknown module>)
#1 0x50400001a48f (<unknown module>)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>)
==376403==ABORTING
MS: 1 ChangeByte-; base unit: cea14e5e2ecdc723b9beb640471a18b4ea529f75
0x28,0x0,0x0,0x0,0xb4,0x50,0x10,0x6a,0x6a,0x4a,0x6a,0x2d,0x2e,0x1,0x0,0x0,0x0,0x0,0x0,0x0,0x4,0x21,0x0,0x0,0x0,0x0,0x95,0x95,0x26,0x21,0xfc,0xff,0xff,0xff,0x95,0x95,0x95,0x95,0x97,0xb7,0x97,0x97,0x0,0x8e,0x0,0x24,
(\000\000\000\264P\020jjJj-.\001\000\000\000\000\000\000\004!\000\000\000\000\225\225&!\374\377\377\377\225\225\225\225\227\267\227\227\000\216\000$
artifact_prefix='artifacts/'; Test unit written to artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
Base64: KAAAALRQEGpqSmotLgEAAAAAAAAEIQAAAACVlSYh/P///5WVlZWXt5eXAI4AJA==
```

To triage the crash, post-process the artifact with `split.sh`:
```
libfuzzer/split.sh artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
Extracting program-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Extracting memory-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Disassembling program-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Program size: 40
Memory size: 2
Disassembled program:
mov32 %r0, 0x2d6a4a6a
jgt32 %r1, %r0, +0
add32 %r1, 0x95950000
jgt32 %r1, 0x9595ffff, -4
exit
Memory contents:
00000000: 0024 .$
```
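
The split shown above follows directly from the input layout: the first four bytes hold the program length (`0x28` = 40), the next 40 bytes are the program, and the remaining 2 bytes are the memory. The sketch below performs the same split in C++ on the crash bytes printed earlier. `split_artifact` is a hypothetical name, and the 8-byte instruction size is an assumption standing in for `sizeof(ebpf_inst)`. It decodes the length byte by byte and compares it against the remaining size, which sidesteps unaligned reads and integer overflow in the bounds check.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// The crash input from the report above: 4-byte length, 40-byte program, 2-byte memory.
const Bytes crash_artifact = {
    0x28, 0x00, 0x00, 0x00,
    0xb4, 0x50, 0x10, 0x6a, 0x6a, 0x4a, 0x6a, 0x2d,
    0x2e, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x04, 0x21, 0x00, 0x00, 0x00, 0x00, 0x95, 0x95,
    0x26, 0x21, 0xfc, 0xff, 0xff, 0xff, 0x95, 0x95,
    0x95, 0x95, 0x97, 0xb7, 0x97, 0x97, 0x00, 0x8e,
    0x00, 0x24};

// Illustrative only: splits one fuzzer input into program and memory.
// Returns false for inputs that are too short or carry an invalid length.
bool split_artifact(const Bytes& input, Bytes& program, Bytes& memory)
{
    const std::size_t header_size = 4;
    const std::size_t instruction_size = 8; // assumed sizeof(ebpf_inst)
    if (input.size() < header_size) {
        return false;
    }

    // Decode the little-endian length without casting the buffer pointer.
    uint32_t program_length = 0;
    for (std::size_t i = 0; i < header_size; i++) {
        program_length |= static_cast<uint32_t>(input[i]) << (8 * i);
    }

    // Compare against the bytes that remain after the header.
    if (program_length == 0 || program_length > input.size() - header_size ||
        program_length % instruction_size != 0) {
        return false;
    }

    const auto program_begin = input.begin() + header_size;
    const auto memory_begin = program_begin + program_length;
    program.assign(program_begin, memory_begin);
    memory.assign(memory_begin, input.end());
    return true;
}
```

Calling `split_artifact(crash_artifact, program, memory)` yields a 40-byte program and the 2-byte memory `00 24`, matching the sizes reported by `split.sh`.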

To reproduce the crash, run:
```
build/bin/ubpf_fuzzer artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
```

Alternatively, reproduce it with ubpf-test:
```
build/bin/ubpf-test --mem artifacts/memory-7036cbef2b568fa0b6e458a9c8062571a65144e1 artifacts/program-7036cbef2b568fa0b6e458a9c8062571a65144e1 --jit
```

202 changes: 148 additions & 54 deletions libfuzzer/libfuzz_harness.cc
@@ -10,7 +10,6 @@
#include <string>
#include <sstream>


extern "C"
{
#include "ebpf.h"
@@ -42,26 +41,107 @@ int null_printf(FILE* stream, const char* format, ...)
return 0;
}

/**
* @brief Accept an input buffer and size.
*
* @param[in] data Pointer to the input buffer.
* @param[in] size Size of the input buffer.
* @return -1 if the input is invalid
* @return 0 if the input is valid and processed.
*/
int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
typedef std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> ubpf_vm_ptr;

ubpf_vm_ptr create_ubpf_vm(const std::vector<uint8_t>& program_code)
{
// Assume the fuzzer input is as follows:
// 32-bit program length
// program byte
// test data
// Automatically free the VM when it goes out of scope.
std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> vm(ubpf_create(), ubpf_destroy);

// Copy memory into a writable buffer.
std::vector<uint8_t> memory;
if (vm == nullptr) {
// Failed to create the VM.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return vm;
}

ubpf_toggle_undefined_behavior_check(vm.get(), true);

char* error_message = nullptr;

ubpf_set_error_print(vm.get(), null_printf);

if (ubpf_load(vm.get(), program_code.data(), program_code.size(), &error_message) != 0) {
// The program failed to load, due to a validation error.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
free(error_message);
vm.reset();
return vm;
}

ubpf_toggle_bounds_check(vm.get(), true);

if (ubpf_register_external_dispatcher(vm.get(), test_helpers_dispatcher, test_helpers_validator) != 0) {
// Failed to register the external dispatcher.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
vm.reset();
return vm;
}

if (ubpf_set_instruction_limit(vm.get(), 10000, nullptr) != 0) {
// Failed to set the instruction limit.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
vm.reset();
return vm;
}

return vm;
}

bool call_ubpf_interpreter(const std::vector<uint8_t>& program_code, std::vector<uint8_t>& memory, std::vector<uint8_t>& ubpf_stack, uint64_t& interpreter_result)
{
auto vm = create_ubpf_vm(program_code);

if (vm == nullptr) {
// VM creation failed.
return false;
}

// Execute the program using the input memory.
if (ubpf_exec_ex(vm.get(), memory.data(), memory.size(), &interpreter_result, ubpf_stack.data(), ubpf_stack.size()) != 0) {
// VM execution failed.
return false;
}

// VM execution succeeded.
return true;
}

bool call_ubpf_jit(const std::vector<uint8_t>& program_code, std::vector<uint8_t>& memory, std::vector<uint8_t>& ubpf_stack, uint64_t& jit_result)
{
auto vm = create_ubpf_vm(program_code);

char* error_message = nullptr;

if (vm == nullptr) {
// VM creation failed.
return false;
}

auto fn = ubpf_compile_ex(vm.get(), &error_message, JitMode::ExtendedJitMode);

if (fn == nullptr) {
free(error_message);

// Compilation failed.
return false;
}

jit_result = fn(memory.data(), memory.size(), ubpf_stack.data(), ubpf_stack.size());

// Compilation and execution succeeded.
return true;
}

bool call_linux_jit(const std::vector<uint8_t>& program_code, std::vector<uint8_t>& memory, std::vector<uint8_t> ubpf_stack, uint64_t& linux_jit_result);

bool split_input(const uint8_t* data, std::size_t size, std::vector<uint8_t>& program, std::vector<uint8_t>& memory)
{
if (size < 4)
return -1;
return false;

uint32_t program_length = *reinterpret_cast<const uint32_t*>(data);
uint32_t memory_length = size - 4 - program_length;
@@ -71,22 +151,25 @@ int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
if (program_length > size) {
// The program length is larger than the input size.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return -1;
return false;
}

if (program_length == 0) {
// The program length is zero.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return -1;
return false;
}

if (program_length + 4u > size) {
// The program length plus the 4-byte length header is larger than the input size.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return -1;
return false;
}

if ((program_length % sizeof(ebpf_inst)) != 0) {
// The program length needs to be a multiple of sizeof(ebpf_inst).
// This is not interesting, as the fuzzer input is invalid.
return false;
}

// Copy any input memory into a writable buffer.
@@ -95,53 +178,64 @@ int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
std::memcpy(memory.data(), memory_start, memory_length);
}

// Automatically free the VM when it goes out of scope.
std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> vm(ubpf_create(), ubpf_destroy);
program.resize(program_length);
std::memcpy(program.data(), program_start, program_length);

if (vm == nullptr) {
// Failed to create the VM.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return -1;
}
return true;
}

char* error_message = nullptr;
/**
* @brief Accept an input buffer and size.
*
* @param[in] data Pointer to the input buffer.
* @param[in] size Size of the input buffer.
* @return -1 if the input is invalid
* @return 0 if the input is valid and processed.
*/
int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
{
// Assume the fuzzer input is as follows:
// 32-bit program length
// program bytes
// test data

if (ubpf_load(vm.get(), program_start, program_length, &error_message) != 0) {
// The program failed to load, due to a validation error.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
free(error_message);
std::vector<uint8_t> program;
std::vector<uint8_t> memory;
std::vector<uint8_t> ubpf_stack(3*4096);

if (!split_input(data, size, program, memory)) {
// The input is invalid. Not interesting.
return -1;
}

ubpf_set_error_print(vm.get(), null_printf);
uint64_t interpreter_result = 0;
uint64_t jit_result = 0;

ubpf_toggle_bounds_check(vm.get(), true);

if (ubpf_register_external_dispatcher(vm.get(), test_helpers_dispatcher, test_helpers_validator) != 0) {
// Failed to register the external dispatcher.
if (!call_ubpf_interpreter(program, memory, ubpf_stack, interpreter_result)) {
// Failed to load or execute the program in the interpreter.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
return -1;
return 0;
}

if (ubpf_set_instruction_limit(vm.get(), 10000, nullptr) != 0) {
// Failed to set the instruction limit.
// This is not interesting, as the fuzzer input is invalid.
// Do not add it to the corpus.
if (!split_input(data, size, program, memory)) {
// The input is invalid. Not interesting.
return -1;
}

uint64_t result = 0;

// Execute the program using the input memory.
if (ubpf_exec(vm.get(), memory.data(), memory.size(), &result) != 0) {
// The program passed validation during load, but failed during execution.
// due to a runtime error. Add it to the corpus as it may be interesting.
if (!call_ubpf_jit(program, memory, ubpf_stack, jit_result)) {
// Failed to load or execute the program in the JIT.
// This is not interesting, as the fuzzer input is invalid.
return 0;
}

// If interpreter_result is not equal to jit_result, raise a fatal signal
if (interpreter_result != jit_result) {
printf("%lx ubpf_stack\n", reinterpret_cast<uintptr_t>(ubpf_stack.data()) + ubpf_stack.size());
printf("interpreter_result: %lx\n", interpreter_result);
printf("jit_result: %lx\n", jit_result);
throw std::runtime_error("interpreter_result != jit_result");
}

// Program executed successfully.
// Add it to the corpus as it may be interesting.
return 0;
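
The harness above runs each program through both the interpreter and the JIT and treats a mismatch in the returned value as a failure. The same idea can be sketched independently of uBPF as follows. `Engine` and `results_agree` are hypothetical names introduced for illustration; the real harness calls `ubpf_exec_ex` and `ubpf_compile_ex` directly. Where the harness calls `split_input` a second time before the JIT run, this sketch instead gives each engine its own copy of the memory.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for "run this program in one engine".
// Returns false if the engine rejected or failed to run the program.
using Engine = std::function<bool(const std::vector<uint8_t>& program,
                                  std::vector<uint8_t>& memory,
                                  uint64_t& result)>;

// Illustrative differential check: run two engines on private copies of the
// memory and throw if both succeed but disagree on the result.
// Returns true only when both engines ran and agreed.
bool results_agree(const Engine& first, const Engine& second,
                   const std::vector<uint8_t>& program,
                   const std::vector<uint8_t>& memory)
{
    std::vector<uint8_t> first_memory = memory;
    std::vector<uint8_t> second_memory = memory;
    uint64_t first_result = 0;
    uint64_t second_result = 0;

    if (!first(program, first_memory, first_result) ||
        !second(program, second_memory, second_result)) {
        return false; // One engine did not run, so there is nothing to compare.
    }

    if (first_result != second_result) {
        // An uncaught exception aborts the process, which a fuzzer reports as a crash.
        throw std::runtime_error("engine results differ");
    }
    return true;
}
```

Like the harness, this compares only the returned values. Comparing `first_memory` and `second_memory` as well would be a possible extension, but it is not something this commit does.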
37 changes: 37 additions & 0 deletions libfuzzer/split.sh
@@ -0,0 +1,37 @@
#!/bin/bash

# Split the file name into path and base name
path=$(dirname "$1")
base=$(basename "$1")

# Get the first 4 bytes from the file (the length of the program)
input="$(xxd -p -l 4 "$1")"
# Convert from little endian
input="${input:6:2}${input:4:2}${input:2:2}${input:0:2}"

# Convert input from hex string to value
length=$((16#$input))

# Extract the hash part from the file name
hash=$(echo "$base" | cut -d'-' -f2-)

# Copy the program to a file named program-$hash
echo "Extracting program-$hash..."
dd if="$1" of="$path/program-$hash" bs=1 skip=4 count="$length" 2> /dev/null

echo "Extracting memory-$hash..."
# Copy the rest to a file named memory-$hash
dd if="$1" of="$path/memory-$hash" bs=1 skip=$((4 + length)) 2> /dev/null

echo "Disassembling program-$hash..."
# Disassemble the program using bin/ubpf-disassembler
bin/ubpf-disassembler "$path/program-$hash" > "$path/program-$hash.asm"

echo "Program size: $(stat -c %s "$path/program-$hash")"
echo "Memory size: $(stat -c %s "$path/memory-$hash")"

echo "Disassembled program:"
cat "$path/program-$hash.asm"

echo "Memory contents:"
xxd "$path/memory-$hash"
