Sandboxing loads and stores #13

eira-fransham · 2018-12-19T13:30:57Z

This is the first JIT that I've worked on, so I don't know how one goes about same-process memory isolation without generating check-address-and-trap instructions. Obviously check-and-trap is viable, but I feel like it must be possible to jack into the OS's (and therefore, hardware's) memory protection mechanisms to get the same protections with better performance. I assume it works by calling into the operating system to set accessible memory regions before jumping into wasm code and then resetting the accessible regions afterwards or when calling into host functions, but how do you stop the wasm code from doing an i32.store onto the program counter (by using some method to guess the location of it) without also preventing it from writing to the stack?

pepyakin · 2018-12-19T14:02:30Z

You can implement it as follows on 64-bit platforms.

When you compile code, let's say that you keep the start of the linear memory in some register B. You compile i32.load or i32.store as follows: you take the address given as the operand, zero-extend it, add it to B, and then perform the access. Because the operand is 32 bits wide (validation guarantees this), you can only access addresses from B to B+4GB (*).
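As a rough illustration of why the 32-bit operand confines accesses to the window above B, here is a minimal sketch in plain Rust arithmetic. The names `host_address`, `sign_extended_address`, and `WINDOW` are made up for this example, and `base` simply stands in for register B; real generated code would do this with machine instructions rather than function calls.

```rust
/// Size of the window reachable through a zero-extended 32-bit operand: 4 GiB.
const WINDOW: u64 = 1 << 32;

/// Host address touched for a wasm address operand.
/// `base` stands in for register B (hypothetical name, for illustration only).
fn host_address(base: u64, wasm_addr: u32) -> u64 {
    // Zero-extension keeps the result inside [base, base + WINDOW).
    base + u64::from(wasm_addr)
}

/// What goes wrong if the operand is sign-extended instead: a large
/// operand becomes a negative displacement and lands below `base`.
fn sign_extended_address(base: u64, wasm_addr: u32) -> u64 {
    base.wrapping_add(wasm_addr as i32 as i64 as u64)
}
```

With zero-extension the largest operand reaches `base + WINDOW - 1`; with sign-extension the same operand would reach `base - 1`, which is outside the sandboxed region.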

At run time, when you instantiate a linear memory, you reserve 4GB(*) worth of virtual memory pages. You do that unconditionally (even if only 1 page was requested, you still map all 4GB). However, you mark as readable/writable only the part that was explicitly requested by the memory definition. The rest of the 4GB (if any) is marked as inaccessible.

When you jump to compiled code, you pass the start of the memory in register B, as the compiled code expects. When code compiled from i32.load/i32.store gets an out-of-bounds address, it is guaranteed to hit a region that was marked as inaccessible, and thus trap.
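The reservation scheme can be modelled without touching the OS at all. The sketch below is only a model, not a real mapping: `Reservation`, `Access`, and the constants are hypothetical names, and the `access` method imitates what the hardware's page protection would decide for an access that starts at a 32-bit address.

```rust
const WASM_PAGE: u64 = 64 * 1024; // wasm page size: 64 KiB
const RESERVED: u64 = 1 << 32; // address space reserved per memory: 4 GiB

#[derive(Debug, PartialEq)]
enum Access {
    Ok,
    Fault,
}

/// Model of one reserved region: only the first `accessible` bytes are
/// readable/writable, the rest is reserved but inaccessible.
struct Reservation {
    accessible: u64,
}

impl Reservation {
    fn new(pages: u64) -> Self {
        let accessible = pages * WASM_PAGE;
        assert!(accessible <= RESERVED, "cannot expose more than is reserved");
        Reservation { accessible }
    }

    /// Decides what the MMU would do for a `len`-byte access at `addr`.
    /// Any byte outside the accessible prefix makes the access fault.
    fn access(&self, addr: u32, len: u64) -> Access {
        let end = u64::from(addr) + len;
        if end <= self.accessible {
            Access::Ok
        } else {
            Access::Fault
        }
    }
}
```

Note that the compiled code contains no comparison at all in this scheme; the `if` above is done by the hardware, and a `Fault` shows up as a signal that the runtime has to catch.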

Then it is up to you to catch this trap. On POSIX-like systems you should be able to install a signal handler before jumping to generated code and save a jmp_buf with setjmp. In case of an OOB access, the handler will be invoked synchronously on the offending thread. You can then find the appropriate jmp_buf and jump to it using longjmp.

See:

Catching traps bytecodealliance/wasmtime#15
Implement wasm trap handlers. bytecodealliance/wasmtime#27
https://webassembly.github.io/spec/core/exec/instructions.html#exec-load
https://github.com/WebAssembly/spec/blob/master/papers/pldi2017.pdf (chapter "Virtual Memory techniques")

(*) There is a catch. It is actually possible to address more than 4GB: i32.load and i32.store also have a 32-bit offset immediate, so you can actually address up to 8GB. But that doesn't change much.
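To make the footnote concrete, the worst-case reach can be worked out directly. This is a small arithmetic sketch; `effective_address` and `required_reservation` are names invented here, and the access width is left as a parameter rather than assumed.

```rust
/// Effective address of a wasm memory access: the 32-bit operand plus the
/// 32-bit offset immediate, added without wrapping (so it needs 33 bits).
fn effective_address(addr: u32, offset: u32) -> u64 {
    u64::from(addr) + u64::from(offset)
}

/// Bytes of address space needed so that even the worst-case access of
/// `max_access_bytes` bytes stays inside the reserved region.
fn required_reservation(max_access_bytes: u64) -> u64 {
    effective_address(u32::MAX, u32::MAX) + max_access_bytes
}
```

The largest effective address is 2^33 - 2, so with an 8-byte access the last byte touched lies a few bytes past the 8GB mark. A reservation of this kind would therefore need a little slack beyond 8GB, for example one extra inaccessible page.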

sunfishcode · 2018-12-19T15:48:19Z

What @pepyakin said :-). Also, wasmtime has support for allocating such memory regions, handling signals, and unwinding with longjmp, as well as other runtime and WebAssembly module support, so I think it'd make sense to start looking at integrating lightbeam with wasmtime.

A first step might be using wasmtime-environ to decode the Module as I sketched out here. Once that's in place, we might want to move lightbeam to live within the wasmtime repository. We'll probably need to make some changes to wasmtime to integrate it well, but we can do that.

sunfishcode · 2018-12-24T19:50:30Z

I propose the following plan:

First, start using wasmtime-environ for parsing the module as sketched out here. Wasmtime-environ, via wasmparser, currently expects to consume the whole wasm module as a &[u8], which isn't ideal, but is fixable.
Then, we can start to teach lightbeam how to obtain the base address and length of a linear memory. Wasmtime-runtime's main way of communicating with JITs is through the VMContext struct and the other data structures in that file. A pointer to the VMContext struct is passed as an extra argument at the end in all compiled wasm functions (the vmctx argument, in Cranelift).

For reference, the pointer to the array of linear memory records is here. To make it easy to access these fields, wasmtime-environ provides functions that return the offsets to these fields. Construct a VMOffsets instance (which just needs to know the target pointer size), and then vmctx_memories tells you the offset from the vmctx pointer to the memories pointer, and then index_vmmemory_definition_base and index_vmmemory_definition_current_length provide the offsets from the memories pointer to the fields containing the base address and length of linear memory (for now, use index 0 as wasm only supports one linear memory).
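The offset-chasing described above might look roughly like the following. Everything here is a stand-in: `Context`, `MemoryRecord`, `Offsets`, and their methods are hypothetical and do not reproduce the actual layout or API of wasmtime's VMContext or VMOffsets. The sketch only shows the shape of the loads a JIT would emit: one load to get the memories pointer, then two loads at fixed offsets from it.

```rust
/// Hypothetical record for one linear memory (not wasmtime's real layout).
#[repr(C)]
struct MemoryRecord {
    base: *mut u8,
    current_length: usize,
}

/// Hypothetical context struct passed to compiled functions.
#[repr(C)]
struct Context {
    memories: *const MemoryRecord,
}

/// Hypothetical offset table; it only needs the target pointer size.
struct Offsets {
    pointer_size: usize,
}

impl Offsets {
    /// Offset from the context pointer to the memories pointer.
    fn ctx_memories(&self) -> usize {
        0
    }
    /// Offset from the memories pointer to record `index`'s base field.
    fn record_base(&self, index: usize) -> usize {
        index * 2 * self.pointer_size
    }
    /// Offset from the memories pointer to record `index`'s length field.
    fn record_current_length(&self, index: usize) -> usize {
        index * 2 * self.pointer_size + self.pointer_size
    }
}

/// Performs the same three loads generated code would perform.
/// Safety: `ctx` must point to a valid `Context` whose `memories`
/// pointer refers to at least one valid `MemoryRecord`.
unsafe fn load_base_and_length(ctx: *const u8, offsets: &Offsets) -> (*mut u8, usize) {
    unsafe {
        let memories = (ctx.add(offsets.ctx_memories()) as *const *const u8).read();
        let base = (memories.add(offsets.record_base(0)) as *const *mut u8).read();
        let length = (memories.add(offsets.record_current_length(0)) as *const usize).read();
        (base, length)
    }
}
```

Index 0 is hard-coded in the loads, matching the note that only one linear memory is supported for now.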

Once we have the base address and the length, we can emit the code to do the bounds check if needed, and the access.

It's simplest to start by reloading the base and length for each access. Later we can make it cleverer about keeping the base and length in registers, though we'll need to be careful about caching these values across a potential heap resize.
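A minimal sketch of the reload-on-every-access approach, assuming an explicit bounds check is needed. `LinearMemory`, `Trap`, and `load_u32` are illustrative names only, and a `Vec<u8>` stands in for the real mapping, which conveniently also moves on resize in the way a cached base pointer would not survive.

```rust
const PAGE: usize = 64 * 1024; // wasm page size: 64 KiB

#[derive(Debug, PartialEq)]
struct Trap;

/// Stand-in for a linear memory whose base and length can change on grow.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    fn new(pages: usize) -> Self {
        LinearMemory { bytes: vec![0; pages * PAGE] }
    }

    /// Growing may reallocate, so both base and length can change.
    fn grow(&mut self, delta_pages: usize) {
        let new_len = self.bytes.len() + delta_pages * PAGE;
        self.bytes.resize(new_len, 0);
    }

    fn base_and_length(&self) -> (*const u8, usize) {
        (self.bytes.as_ptr(), self.bytes.len())
    }
}

/// Bounds-checked 32-bit little-endian load, as i32.load would behave.
fn load_u32(mem: &LinearMemory, addr: u32, offset: u32) -> Result<u32, Trap> {
    // Reload base and length on every access; nothing is cached.
    let (base, length) = mem.base_and_length();
    // 64-bit arithmetic so addr + offset cannot wrap around.
    let ea = u64::from(addr) + u64::from(offset);
    if ea + 4 > length as u64 {
        return Err(Trap);
    }
    let mut buf = [0u8; 4];
    // In bounds: the check above guarantees ea..ea+4 lies inside the buffer.
    unsafe { std::ptr::copy_nonoverlapping(base.add(ea as usize), buf.as_mut_ptr(), 4) };
    Ok(u32::from_le_bytes(buf))
}
```

An address that traps before `grow` succeeds afterwards, which is exactly the case a length cached in a register across the resize would get wrong.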

There's admittedly a lot here, and a lot of this is new API surface area in Wasmtime, so please ask if there's anything that doesn't make sense here.

eira-fransham closed this as completed Mar 12, 2019
