Reject BPF program if uninitialized stack or registers are accessed during interpret path (#445)

* Reject BPF program if uninit stack is accessed
Reject programs if registers are used before being initialized
Make undefined behavior check optional

Signed-off-by: Alan Jowett <alanjo@microsoft.com>

* Apply suggestions from code review

Co-authored-by: Will Hawkins <whh8b@obs.cr>
Signed-off-by: Alan Jowett <alanjo@microsoft.com>

* PR feedback

Signed-off-by: Alan Jowett <alanjo@microsoft.com>

---------

Signed-off-by: Alan Jowett <alanjo@microsoft.com>
Co-authored-by: Will Hawkins <whh8b@obs.cr>
Alan-Jowett and hawkinsw committed May 21, 2024
1 parent 6789eee commit 2868ce4
Showing 7 changed files with 631 additions and 82 deletions.
93 changes: 93 additions & 0 deletions libfuzzer/README.md
@@ -0,0 +1,93 @@
# ubpf_fuzzer

This is a libFuzzer-based fuzzer for uBPF.
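
For orientation, a libFuzzer target is a single entry point that libFuzzer calls once per generated input. The sketch below is a minimal, hypothetical target and not the uBPF harness: `decode_le32` is a stand-in for code under test, and the `-1` return relies on libFuzzer versions that support rejecting inputs from the corpus.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical code under test: decode a little-endian 32-bit value.
static uint32_t decode_le32(const uint8_t* bytes)
{
    return static_cast<uint32_t>(bytes[0]) | (static_cast<uint32_t>(bytes[1]) << 8) |
           (static_cast<uint32_t>(bytes[2]) << 16) | (static_cast<uint32_t>(bytes[3]) << 24);
}

// libFuzzer calls this function once for every input it generates.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
{
    if (size < 4) {
        // Too short to decode. Returning -1 asks libFuzzer not to keep the input.
        return -1;
    }

    const uint32_t value = decode_le32(data);

    // Property check: a failed assert aborts the process, which libFuzzer reports as a crash.
    assert((value & 0xffu) == data[0]);

    // The input was processed; libFuzzer may add it to the corpus.
    return 0;
}
```

A target like this is normally compiled with `clang++ -fsanitize=fuzzer,address`, which supplies `main` and the mutation engine.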

To build, run:
```
cmake \
    -G Ninja \
    -S . \
    -B build \
    -DCMAKE_BUILD_TYPE=Debug \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -DUBPF_ENABLE_LIBFUZZER=1
cmake --build build
```

To run, create folders for the corpus and for the artifacts of any crashes found, then run the fuzzer:

```
mkdir corpus
mkdir artifacts
build/bin/ubpf_fuzzer corpus -artifact_prefix=artifacts/
```

Optionally, add `-jobs=100` to run 100 fuzzing jobs. Each job stops at the first crash it finds, so this gathers up to 100 crashes in one invocation.

This will produce a lot of output that looks like:
```
#529745 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 264872 rss: 429Mb L: 50/188 MS: 3 CrossOver-ChangeBit-EraseBytes-
#529814 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 264907 rss: 429Mb L: 45/188 MS: 4 ChangeBit-ShuffleBytes-PersAutoDict-EraseBytes- DE: "\005\000\000\000\000\000\000\000"-
#530202 REDUCE cov: 516 ft: 932 corp: 442/22Kb lim: 2875 exec/s: 265101 rss: 429Mb L: 52/188 MS: 3 ChangeByte-ChangeASCIIInt-EraseBytes-
#531224 REDUCE cov: 518 ft: 934 corp: 443/22Kb lim: 2875 exec/s: 265612 rss: 429Mb L: 73/188 MS: 2 CopyPart-PersAutoDict- DE: "\001\000\000\000"-
#531750 REDUCE cov: 518 ft: 934 corp: 443/22Kb lim: 2875 exec/s: 265875 rss: 429Mb L: 45/188 MS: 1 EraseBytes-
#532127 REDUCE cov: 519 ft: 935 corp: 444/22Kb lim: 2875 exec/s: 266063 rss: 429Mb L: 46/188 MS: 2 ChangeBinInt-ChangeByte-
#532246 REDUCE cov: 519 ft: 935 corp: 444/22Kb lim: 2875 exec/s: 266123 rss: 429Mb L: 66/188 MS: 4 ChangeBit-CrossOver-ShuffleBytes-EraseBytes-
#532357 NEW cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266178 rss: 429Mb L: 55/188 MS: 1 ChangeBinInt-
#532404 REDUCE cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266202 rss: 429Mb L: 57/188 MS: 2 ChangeBit-EraseBytes-
#532486 REDUCE cov: 520 ft: 936 corp: 445/22Kb lim: 2875 exec/s: 266243 rss: 429Mb L: 44/188 MS: 2 EraseByte
```

Eventually it will probably crash and produce a message like:
```
=================================================================
==376403==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffca9d3cda0 sp 0x7ffca9d3cb98 T0)
==376403==Hint: pc points to the zero page.
==376403==The signal is caused by a READ memory access.
==376403==Hint: address points to the zero page.
#0 0x0 (<unknown module>)
#1 0x50400001a48f (<unknown module>)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>)
==376403==ABORTING
MS: 1 ChangeByte-; base unit: cea14e5e2ecdc723b9beb640471a18b4ea529f75
0x28,0x0,0x0,0x0,0xb4,0x50,0x10,0x6a,0x6a,0x4a,0x6a,0x2d,0x2e,0x1,0x0,0x0,0x0,0x0,0x0,0x0,0x4,0x21,0x0,0x0,0x0,0x0,0x95,0x95,0x26,0x21,0xfc,0xff,0xff,0xff,0x95,0x95,0x95,0x95,0x97,0xb7,0x97,0x97,0x0,0x8e,0x0,0x24,
(\000\000\000\264P\020jjJj-.\001\000\000\000\000\000\000\004!\000\000\000\000\225\225&!\374\377\377\377\225\225\225\225\227\267\227\227\000\216\000$
artifact_prefix='artifacts/'; Test unit written to artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
Base64: KAAAALRQEGpqSmotLgEAAAAAAAAEIQAAAACVlSYh/P///5WVlZWXt5eXAI4AJA==
```

To triage the crash, post-process it with:
```
libfuzzer/split.sh artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
Extracting program-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Extracting memory-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Disassembling program-7036cbef2b568fa0b6e458a9c8062571a65144e1...
Program size: 40
Memory size: 2
Disassembled program:
mov32 %r0, 0x2d6a4a6a
jgt32 %r1, %r0, +0
add32 %r1, 0x95950000
jgt32 %r1, 0x9595ffff, -4
exit
Memory contents:
00000000: 0024 .$
```
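
The split shown above follows the input layout that the harness assumes: a 32-bit program length, then the program bytes, then the test memory. The sketch below illustrates that split in C++ under stated assumptions. It is not the `split.sh` script, `split_fuzzer_input` is a hypothetical name, and the length is decoded as little-endian (the harness reads it in host byte order).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split a fuzzer input laid out as [4-byte length][program bytes][memory bytes].
// Returns false if the input is malformed.
bool split_fuzzer_input(
    const std::vector<uint8_t>& input, std::vector<uint8_t>& program, std::vector<uint8_t>& memory)
{
    constexpr std::size_t header_size = 4;
    constexpr std::size_t instruction_size = 8; // one eBPF instruction

    if (input.size() < header_size) {
        return false;
    }

    // Assumption: little-endian length field.
    const std::size_t program_length =
        static_cast<std::size_t>(input[0]) | (static_cast<std::size_t>(input[1]) << 8) |
        (static_cast<std::size_t>(input[2]) << 16) | (static_cast<std::size_t>(input[3]) << 24);

    // Validate before doing arithmetic with the length, so nothing can wrap around.
    if (program_length == 0 || program_length > input.size() - header_size ||
        program_length % instruction_size != 0) {
        return false;
    }

    const uint8_t* program_start = input.data() + header_size;
    const uint8_t* memory_start = program_start + program_length;
    program.assign(program_start, memory_start);
    memory.assign(memory_start, input.data() + input.size());
    return true;
}

// The crash input printed in the AddressSanitizer log above (46 bytes).
const std::vector<uint8_t> example_crash = {
    0x28, 0x00, 0x00, 0x00,                         // program length: 40
    0xb4, 0x50, 0x10, 0x6a, 0x6a, 0x4a, 0x6a, 0x2d, // mov32 %r0, 0x2d6a4a6a
    0x2e, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // jgt32 %r1, %r0, +0
    0x04, 0x21, 0x00, 0x00, 0x00, 0x00, 0x95, 0x95, // add32 %r1, 0x95950000
    0x26, 0x21, 0xfc, 0xff, 0xff, 0xff, 0x95, 0x95, // jgt32 %r1, 0x9595ffff, -4
    0x95, 0x95, 0x97, 0xb7, 0x97, 0x97, 0x00, 0x8e, // exit
    0x00, 0x24,                                     // memory
};
```

Applied to `example_crash`, this yields a 40-byte program and the 2-byte memory `00 24`, matching the `Program size` and `Memory size` lines reported above.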

To repro the crash, you can run:
```
build/bin/ubpf_fuzzer artifacts/crash-7036cbef2b568fa0b6e458a9c8062571a65144e1
```

Or you can repro it using ubpf_test:
```
build/bin/ubpf-test --mem artifacts/memory-7036cbef2b568fa0b6e458a9c8062571a65144e1 artifacts/program-7036cbef2b568fa0b6e458a9c8062571a65144e1 --jit
```

231 changes: 179 additions & 52 deletions libfuzzer/libfuzz_harness.cc
@@ -10,14 +10,14 @@
 #include <string>
 #include <sstream>

-
 extern "C"
 {
 #include "ebpf.h"
 #include "ubpf.h"
 }

 #include "test_helpers.h"
+#include <cassert>

 uint64_t test_helpers_dispatcher(uint64_t p0, uint64_t p1,uint64_t p2,uint64_t p3, uint64_t p4, unsigned int idx, void* cookie) {
     UNREFERENCED_PARAMETER(cookie);
@@ -42,26 +42,138 @@ int null_printf(FILE* stream, const char* format, ...)
     return 0;
 }

+typedef std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> ubpf_vm_ptr;
+
 /**
- * @brief Accept an input buffer and size.
+ * @brief Create a ubpf vm object and load the program code into it.
  *
- * @param[in] data Pointer to the input buffer.
- * @param[in] size Size of the input buffer.
- * @return -1 if the input is invalid
- * @return 0 if the input is valid and processed.
+ * @param[in] program_code The program code to load into the VM.
+ * @return A unique pointer to the ubpf_vm object or nullptr if the VM could not be created.
  */
-int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
+ubpf_vm_ptr create_ubpf_vm(const std::vector<uint8_t>& program_code)
 {
-    // Assume the fuzzer input is as follows:
-    // 32-bit program length
-    // program byte
-    // test data
+    // Automatically free the VM when it goes out of scope.
+    std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> vm(ubpf_create(), ubpf_destroy);

-    // Copy memory into a writable buffer.
-    std::vector<uint8_t> memory;
+    if (vm == nullptr) {
+        // Failed to create the VM.
+        // This is not interesting, as the fuzzer input is invalid.
+        // Do not add it to the corpus.
+        return {nullptr, nullptr};
+    }

+    ubpf_toggle_undefined_behavior_check(vm.get(), true);
+
+    char* error_message = nullptr;
+
+    ubpf_set_error_print(vm.get(), null_printf);
+
+    if (ubpf_load(vm.get(), program_code.data(), program_code.size(), &error_message) != 0) {
+        // The program failed to load, due to a validation error.
+        // This is not interesting, as the fuzzer input is invalid.
+        // Do not add it to the corpus.
+        free(error_message);
+        return {nullptr, nullptr};
+    }
+
+    ubpf_toggle_bounds_check(vm.get(), true);
+
+    if (ubpf_register_external_dispatcher(vm.get(), test_helpers_dispatcher, test_helpers_validator) != 0) {
+        // Failed to register the external dispatcher.
+        // This is not interesting, as the fuzzer input is invalid.
+        // Do not add it to the corpus.
+        return {nullptr, nullptr};
+    }
+
+    if (ubpf_set_instruction_limit(vm.get(), 10000, nullptr) != 0) {
+        // Failed to set the instruction limit.
+        // This is not interesting, as the fuzzer input is invalid.
+        // Do not add it to the corpus.
+        return {nullptr, nullptr};
+    }
+
+    return vm;
+}
+
+/**
+ * @brief Invoke the ubpf interpreter with the given program code and input memory.
+ *
+ * @param[in] program_code The program code to execute.
+ * @param[in,out] memory The input memory to use when executing the program. May be modified by the program.
+ * @param[in,out] ubpf_stack The stack to use when executing the program. May be modified by the program.
+ * @param[out] interpreter_result The result of the program execution.
+ * @return true if the program executed successfully.
+ * @return false if the program failed to execute.
+ */
+bool call_ubpf_interpreter(const std::vector<uint8_t>& program_code, std::vector<uint8_t>& memory, std::vector<uint8_t>& ubpf_stack, uint64_t& interpreter_result)
+{
+    auto vm = create_ubpf_vm(program_code);
+
+    if (vm == nullptr) {
+        // VM creation failed.
+        return false;
+    }
+
+    // Execute the program using the input memory.
+    if (ubpf_exec_ex(vm.get(), memory.data(), memory.size(), &interpreter_result, ubpf_stack.data(), ubpf_stack.size()) != 0) {
+        // VM execution failed.
+        return false;
+    }
+
+    // VM execution succeeded.
+    return true;
+}
+
+/**
+ * @brief Execute the given program code using the ubpf JIT.
+ *
+ * @param[in] program_code The program code to execute.
+ * @param[in,out] memory The input memory to use when executing the program. May be modified by the program.
+ * @param[in,out] ubpf_stack The stack to use when executing the program. May be modified by the program.
+ * @param[out] interpreter_result The result of the program execution.
+ * @return true if the program executed successfully.
+ * @return false if the program failed to execute.
+ */
+bool call_ubpf_jit(const std::vector<uint8_t>& program_code, std::vector<uint8_t>& memory, std::vector<uint8_t>& ubpf_stack, uint64_t& jit_result)
+{
+    auto vm = create_ubpf_vm(program_code);
+
+    char* error_message = nullptr;
+
+    if (vm == nullptr) {
+        // VM creation failed.
+        return false;
+    }
+
+    auto fn = ubpf_compile_ex(vm.get(), &error_message, JitMode::ExtendedJitMode);
+
+    if (fn == nullptr) {
+        free(error_message);
+
+        // Compilation failed.
+        return false;
+    }
+
+    jit_result = fn(memory.data(), memory.size(), ubpf_stack.data(), ubpf_stack.size());
+
+    // Compilation succeeded.
+    return true;
+}
+
+/**
+ * @brief Copy the program and memory from the input buffer into separate buffers.
+ *
+ * @param[in] data The input buffer from the fuzzer.
+ * @param[in] size The size of the input buffer.
+ * @param[out] program The program code extracted from the input buffer.
+ * @param[out] memory The input memory extracted from the input buffer.
+ * @return true if the input buffer was successfully split.
+ * @return false if the input buffer is malformed.
+ */
+bool split_input(const uint8_t* data, std::size_t size, std::vector<uint8_t>& program, std::vector<uint8_t>& memory)
+{
     if (size < 4)
-        return -1;
+        return false;

     uint32_t program_length = *reinterpret_cast<const uint32_t*>(data);
     uint32_t memory_length = size - 4 - program_length;
@@ -71,22 +183,25 @@ int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
     if (program_length > size) {
         // The program length is larger than the input size.
         // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
+        return false;
     }

     if (program_length == 0) {
         // The program length is zero.
         // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
+        return false;
     }

     if (program_length + 4u > size) {
         // The program length is larger than the input size.
         // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
+        return false;
     }

+    if ((program_length % sizeof(ebpf_inst)) != 0) {
+        // The program length needs to be a multiple of sizeof(ebpf_inst_t).
+        // This is not interesting, as the fuzzer input is invalid.
+        return false;
+    }
+
     // Copy any input memory into a writable buffer.
@@ -95,53 +210,65 @@ int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
         std::memcpy(memory.data(), memory_start, memory_length);
     }

-    // Automatically free the VM when it goes out of scope.
-    std::unique_ptr<ubpf_vm, decltype(&ubpf_destroy)> vm(ubpf_create(), ubpf_destroy);
+    program.resize(program_length);
+    std::memcpy(program.data(), program_start, program_length);

-    if (vm == nullptr) {
-        // Failed to create the VM.
-        // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
-    }
+    return true;
+}

-    char* error_message = nullptr;
+/**
+ * @brief Accept an input buffer and size.
+ *
+ * @param[in] data Pointer to the input buffer.
+ * @param[in] size Size of the input buffer.
+ * @return -1 if the input is invalid
+ * @return 0 if the input is valid and processed.
+ */
+int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
+{
+    // Assume the fuzzer input is as follows:
+    // 32-bit program length
+    // program byte
+    // test data

-    if (ubpf_load(vm.get(), program_start, program_length, &error_message) != 0) {
-        // The program failed to load, due to a validation error.
-        // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        free(error_message);
+    std::vector<uint8_t> program;
+    std::vector<uint8_t> memory;
+    std::vector<uint8_t> ubpf_stack(3*4096);
+
+    if (!split_input(data, size, program, memory)) {
+        // The input is invalid. Not interesting.
         return -1;
     }

-    ubpf_set_error_print(vm.get(), null_printf);
+    uint64_t interpreter_result = 0;
+    uint64_t jit_result = 0;

-    ubpf_toggle_bounds_check(vm.get(), true);
-
-    if (ubpf_register_external_dispatcher(vm.get(), test_helpers_dispatcher, test_helpers_validator) != 0) {
-        // Failed to register the external dispatcher.
+    if (!call_ubpf_interpreter(program, memory, ubpf_stack, interpreter_result)) {
+        // Failed to load or execute the program in the interpreter.
         // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
+        return 0;
     }

-    if (ubpf_set_instruction_limit(vm.get(), 10000, nullptr) != 0) {
-        // Failed to set the instruction limit.
-        // This is not interesting, as the fuzzer input is invalid.
-        // Do not add it to the corpus.
-        return -1;
+    if (!split_input(data, size, program, memory)) {
+        // The input was successfully split, but failed to split again.
+        // This should not happen.
+        assert(!"split_input failed");
     }

-    uint64_t result = 0;
-
-    // Execute the program using the input memory.
-    if (ubpf_exec(vm.get(), memory.data(), memory.size(), &result) != 0) {
-        // The program passed validation during load, but failed during execution.
-        // due to a runtime error. Add it to the corpus as it may be interesting.
+    if (!call_ubpf_jit(program, memory, ubpf_stack, jit_result)) {
+        // Failed to load or execute the program in the JIT.
+        // This is not interesting, as the fuzzer input is invalid.
         return 0;
     }

+    // If interpreter_result is not equal to jit_result, raise a fatal signal
+    if (interpreter_result != jit_result) {
+        printf("%lx ubpf_stack\n", reinterpret_cast<uintptr_t>(ubpf_stack.data()) + ubpf_stack.size());
+        printf("interpreter_result: %lx\n", interpreter_result);
+        printf("jit_result: %lx\n", jit_result);
+        throw std::runtime_error("interpreter_result != jit_result");
+    }
+
     // Program executed successfully.
     // Add it to the corpus as it may be interesting.
     return 0;
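
The reworked harness runs the same program through the interpreter and the JIT and treats any difference in results as a failure. The sketch below isolates that differential pattern without depending on uBPF. `Executor` and `run_differential` are hypothetical names, and the callables stand in for `call_ubpf_interpreter` and `call_ubpf_jit`. Giving each executor its own copy of the memory plays the role of the second `split_input` call, since the first run may have modified the buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical executor signature: runs a program against `memory` (which it may modify),
// stores the return value in `result`, and returns false if it could not run.
using Executor = std::function<bool(std::vector<uint8_t>& memory, uint64_t& result)>;

// Returns true if both executors ran and agreed, false if either could not run.
// Throws std::runtime_error if both ran but produced different results.
bool run_differential(
    const Executor& reference, const Executor& candidate, const std::vector<uint8_t>& initial_memory)
{
    std::vector<uint8_t> reference_memory = initial_memory;
    uint64_t reference_result = 0;
    if (!reference(reference_memory, reference_result)) {
        return false;
    }

    // Start the second run from a fresh copy so that writes made by the first run
    // cannot influence the comparison.
    std::vector<uint8_t> candidate_memory = initial_memory;
    uint64_t candidate_result = 0;
    if (!candidate(candidate_memory, candidate_result)) {
        return false;
    }

    if (reference_result != candidate_result) {
        // An uncaught exception terminates the process, which a fuzzer records as a crash.
        throw std::runtime_error("reference_result != candidate_result");
    }
    return true;
}
```

In a fuzz target the exception is left uncaught on purpose, so the mismatching input is saved as a crash artifact for triage.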