Improve Address Translation Performance without Breaking Changes #196
Lichtso commented Jul 12, 2021 (edited)
- Concatenates all read-only sections (including the text section) into one.
- Requires exactly one heap section.
- As a result, there are always exactly 5 memory regions:
  - 0: NULL
  - 1: Program & Constants (read-only)
  - 2: Stack
  - 3: Heap
  - 4: Input
- Enforces that the virtual address of each memory region is aligned.
- Makes address translation constant-time by interpreting the upper half of a pointer as the region index.
- Replaces the Rust call in the JIT with a native x86 implementation.
- Makes stack frame gaps in the VM address space optional / configurable.
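
To illustrate the constant-time lookup described in the list above, here is a minimal sketch, not the code from this PR. It assumes 64-bit VM addresses whose upper 32 bits select one of the five regions and whose lower 32 bits are the offset into that region. The names `Region`, `Access`, `REGION_SHIFT`, `translate` and `example_regions`, as well as the host addresses and region sizes, are hypothetical and chosen only for illustration.

```rust
/// Kind of memory access being translated (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    Load,
    Store,
}

/// One memory region; the field names are assumptions, not the rbpf API.
#[derive(Debug, Clone, Copy)]
struct Region {
    host_addr: u64, // start of the region in host memory
    len: u64,       // size of the region in bytes
    writable: bool, // whether stores are permitted
}

/// Assumption: the upper 32 bits of a VM address are the region index.
const REGION_SHIFT: u32 = 32;

/// Translates a VM address to a host address without searching the regions:
/// the region is selected by a shift, the offset by a mask.
fn translate(regions: &[Region; 5], access: Access, vm_addr: u64, len: u64) -> Option<u64> {
    let index = (vm_addr >> REGION_SHIFT) as usize;
    let offset = vm_addr & ((1u64 << REGION_SHIFT) - 1);
    let region = regions.get(index)?; // index >= 5 is rejected
    if access == Access::Store && !region.writable {
        return None; // store into a read-only region
    }
    let end = offset.checked_add(len)?;
    if end > region.len {
        return None; // out of bounds; the empty NULL region always ends up here
    }
    region.host_addr.checked_add(offset)
}

/// Example layout following the five regions listed above (made-up values).
fn example_regions() -> [Region; 5] {
    [
        Region { host_addr: 0, len: 0, writable: false },            // 0: NULL
        Region { host_addr: 0x1_0000, len: 4096, writable: false },  // 1: Program & Constants
        Region { host_addr: 0x2_0000, len: 4096, writable: true },   // 2: Stack
        Region { host_addr: 0x3_0000, len: 4096, writable: true },   // 3: Heap
        Region { host_addr: 0x4_0000, len: 65536, writable: true },  // 4: Input
    ]
}
```

With this layout, `translate(&example_regions(), Access::Store, (4u64 << 32) + 8, 8)` resolves into the Input region, while any access through address 0 fails because the NULL region has length zero.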
497c274 to 957e295
This is the code now generated for address translation of a store (write):
Lines shifted to the right are optional and can be turned off by |
rbpf systematic benchmarks
Analysis for Field Mul

Using the following two pieces of code to measure ld/st time without overhead:

    and r1, 64917
    mov r3, 4
    lsh r3, 32
    add r1, r3
    ldxdw r3, [r1]
    stxdw [r1], r1
    add r1, r3
    add r2, 1
    jlt r2, 100000, -9
    exit

and

    and r1, 64917
    mov r3, 4
    lsh r3, 32
    add r1, r3
    add r1, r3
    add r2, 1
    jlt r2, 100000, -7
    exit

we obtain:
My guess is that:
If a u128 primitive type were exposed, either for use by hand-optimised code or as a capability of the LLVM codegen, one could probably get much closer to native performance: about 2x, with WASM-like ld/st overhead.
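
To make the u128 point concrete, here is a hedged sketch of the widening 64x64 to 128-bit multiply that field multiplication is built from. It is an illustration only, not code from rbpf or from the benchmark; the function names `mul_wide_u128` and `mul_wide_u64` are hypothetical. The first version relies on a native `u128`, the second emulates the same result using only 64-bit operations.

```rust
/// Widening multiply using a native 128-bit type: one multiplication.
/// Returns (low 64 bits, high 64 bits) of the product.
fn mul_wide_u128(a: u64, b: u64) -> (u64, u64) {
    let p = (a as u128) * (b as u128);
    (p as u64, (p >> 64) as u64)
}

/// The same result with only 64-bit arithmetic: four partial products
/// of 32-bit halves plus explicit carry handling.
fn mul_wide_u64(a: u64, b: u64) -> (u64, u64) {
    const MASK: u64 = 0xFFFF_FFFF;
    let (a_lo, a_hi) = (a & MASK, a >> 32);
    let (b_lo, b_hi) = (b & MASK, b >> 32);

    let ll = a_lo * b_lo;
    let lh = a_lo * b_hi;
    let hl = a_hi * b_lo;
    let hh = a_hi * b_hi;

    // Middle 32-bit column: the carry out of `ll` plus the low halves of
    // both cross terms. The sum is below 2^34, so it cannot overflow.
    let mid = (ll >> 32) + (lh & MASK) + (hl & MASK);

    let lo = (ll & MASK) | (mid << 32);
    let hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    (lo, hi)
}
```

The emulated version needs four multiplications and several shifts, masks and additions per limb product, which is the kind of overhead a native u128 operation would avoid.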
5e99071 to a45e9d2