Const integer bounds approach to bounds checking #194
Comments
@Lichtso Maintain the following metadata in
// fields should always be positive
struct SectionInfo {
    bp_host_addr: i64,
    len: i64,
}
struct EbpfElf {
    ..
    // sections as indexed from 1 to 4
    section_infos: [SectionInfo; 4],
}
// memory remapping ops
const SUBTRACT_HOST_ADDR: u8 = 0;
const ADD_HOST_ADDR: u8 = 1;
const SUBTRACT_LEN: u8 = 2;
const ADD_LEN: u8 = 3;
struct OffsetEntry {
    // byte offset of the i64 immediate within the text section
    offset: isize,
    op: u8,
}
struct JitProgram {
    ..
    // sections as indexed from 1 to 4
    section_infos: [SectionInfo; 4],
    remap_offsets: [Vec<OffsetEntry>; 4],
}
impl JitProgram {
    fn remap_memory(&mut self, new_section_infos: [SectionInfo; 4]) {
        // byte pointer, because entry.offset counts bytes, not i64 elements
        let text_ptr = self._sections.text_section as *mut u8;
        for ((info, new_info), offsets) in self.section_infos.iter().zip(&new_section_infos).zip(&self.remap_offsets) {
            for entry in offsets {
                let modifier = match entry.op {
                    0 => info.bp_host_addr - new_info.bp_host_addr, // SUBTRACT_HOST_ADDR
                    1 => new_info.bp_host_addr - info.bp_host_addr, // ADD_HOST_ADDR
                    2 => info.len - new_info.len, // SUBTRACT_LEN
                    3 => new_info.len - info.len, // ADD_LEN
                    _ => unreachable!("unknown remap op"),
                };
                // immediates inside emitted x86 are not guaranteed to be 8-byte aligned
                unsafe {
                    let imm = text_ptr.offset(entry.offset) as *mut i64;
                    imm.write_unaligned(imm.read_unaligned().wrapping_add(modifier));
                }
            }
        }
        self.section_infos = new_section_infos;
    }
}
impl JitCompiler {
    ..
    fn emit_memory_translation(..) {
        // subtract input section's host base ptr addr from x86 i64 immediate.
        jit.add_offset_entry(
            section_index::INPUT,
            remap_ops::SUBTRACT_HOST_ADDR,
            (jit.offset_in_text_section + opcode_leading_size) as isize,
        );
        ..
    }
} |
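The patching step in the pseudocode above can be sketched without `unsafe` on a plain byte buffer. This is only an illustration under assumptions: `RemapOp`, `remap_section`, and the `usize` offset are hypothetical names and choices, not part of the project's API, and the immediates are assumed to be little-endian i64 values as on x86-64.

```rust
/// Hypothetical stand-in for one section's host mapping.
#[derive(Clone, Copy)]
pub struct SectionInfo {
    pub bp_host_addr: i64,
    pub len: i64,
}

/// Hypothetical enum form of the four remapping ops from the proposal.
#[allow(dead_code)]
#[derive(Clone, Copy)]
pub enum RemapOp {
    SubtractHostAddr,
    AddHostAddr,
    SubtractLen,
    AddLen,
}

pub struct OffsetEntry {
    /// Byte offset of the i64 immediate inside the text buffer.
    pub offset: usize,
    pub op: RemapOp,
}

/// Patches every recorded immediate so that it refers to `new` instead of `old`.
/// Returns `None` if an entry points outside the buffer; entries before the
/// failing one have already been patched at that point.
pub fn remap_section(
    text: &mut [u8],
    old: SectionInfo,
    new: SectionInfo,
    entries: &[OffsetEntry],
) -> Option<()> {
    for entry in entries {
        // An immediate holding `x - old_base` must become `x - new_base`,
        // so the delta is `old_base - new_base`, and symmetrically for the rest.
        let delta = match entry.op {
            RemapOp::SubtractHostAddr => old.bp_host_addr.wrapping_sub(new.bp_host_addr),
            RemapOp::AddHostAddr => new.bp_host_addr.wrapping_sub(old.bp_host_addr),
            RemapOp::SubtractLen => old.len.wrapping_sub(new.len),
            RemapOp::AddLen => new.len.wrapping_sub(old.len),
        };
        let end = entry.offset.checked_add(8)?;
        let slot = text.get_mut(entry.offset..end)?;
        // Byte-wise read and write, so the immediate may sit at any alignment.
        let mut imm = [0u8; 8];
        imm.copy_from_slice(slot);
        let patched = i64::from_le_bytes(imm).wrapping_add(delta);
        slot.copy_from_slice(&patched.to_le_bytes());
    }
    Some(())
}
```

Each entry costs one scattered 8-byte read-modify-write, so the total cost grows with the number of memory accesses the JIT emitted.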
I think this part here won't work:
The |
Hmm, I believe this approach requires making a local copy of the executable every time a VM is created. Each instance needs its own memory mapping, and therefore its own uniquely remapped executable. So maybe I ought to first do a local copy into the VM, i.e. the VM should have a local copy of the |
@Lichtso, maybe in order to save on copies, I should make the semantics copy-on-write if shared, else write in place. One executor could be about 1MB or so, so this can save on copies. A 1MB copy takes about 50-120us, which is already somewhat long. It's not clear the remapping operation will be quick either, since it needs to write to many random memory locations. |
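For what it's worth, "copy-on-write if shared, else write in place" is what the standard library's `Arc::make_mut` does. A minimal sketch, assuming the text section is held as an `Arc<Vec<u8>>` (the `TextSection` alias and `text_for_patching` helper are hypothetical, not the project's types):

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the JIT-compiled text section.
pub type TextSection = Vec<u8>;

/// Returns a mutable view of the text section for patching.
/// If other `Arc` handles to the same section exist, the contents are cloned
/// first and `text` is pointed at the private copy; otherwise the existing
/// allocation is mutated in place.
pub fn text_for_patching(text: &mut Arc<TextSection>) -> &mut TextSection {
    Arc::make_mut(text)
}
```

With this shape, the first VM created from a freshly compiled executable pays no copy, and only VMs that share the executable with others pay for the clone.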
Here is a simplified strategy to bounds checking:
|
I will close this for now, as #196 was merged. |
Related: #193
Here is the x86 generated by rust.godbolt with flag -O from the following source:
source
Here is the equivalent with offset = -4:
offset = 4:
You will get a different result with -C opt-level=2 and -C opt-level=3
Just to give some basic ideas for how to inline the bounds checks as x86.
As an idea of the overhead, this is about 10 instructions, which at IPC = 1 and 4GHz is about 2.5ns. This compares to the current measured overhead of addr translation of about 9-12ns (also at 4GHz).
So the potential savings here is about 3.6x to 4.8x per addr translation (9/2.5 to 12/2.5), which doesn't seem that great.
It does seem like bounds check elimination (BCE) might be necessary here...
I believe that in order to achieve WASM-like performance, you really do need linear memory. The VM address is an offset, which doesn't require any lower bounds checking, so can be directly inlined as a u64 address by adding the base pointer. This makes a WASM bounds check just a single cmp and jmp, which maybe takes 1 or 2 cycles, and makes it easier for the processor to do speculative prefetching.
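To make the difference concrete, here is a rough sketch in Rust of the two checks. The function names and the fixed 8-byte access width are assumptions for illustration, and the machine code a JIT or rustc actually produces for this may contain more than one compare.

```rust
/// Linear memory: the VM address is already an offset from the base,
/// so only the upper bound needs checking.
pub fn load_u64_linear(mem: &[u8], vm_offset: u64) -> Option<u64> {
    // Last offset at which an 8-byte access still fits; this also rejects
    // memories shorter than 8 bytes.
    let last_valid = (mem.len() as u64).checked_sub(8)?;
    if vm_offset > last_valid {
        return None;
    }
    // Cannot truncate: vm_offset is no larger than mem.len().
    let start = vm_offset as usize;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&mem[start..start + 8]);
    Some(u64::from_le_bytes(buf))
}

/// Region-based memory: the VM address is absolute, so it needs a lower-bound
/// check against the region's VM base before the upper-bound check.
pub fn load_u64_region(mem: &[u8], region_vm_base: u64, vm_addr: u64) -> Option<u64> {
    let offset = vm_addr.checked_sub(region_vm_base)?;
    load_u64_linear(mem, offset)
}
```

The region variant also has to pick the right region first, which the sketch leaves out; that lookup is part of what the linear layout avoids.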