Initial implementation of JIT engine for amd64 target (#60)

This commit adds the initial implementation of the Just-In-Time compilation engine. To avoid a massive PR, it introduces a minimal JIT engine: only a subset of instructions is supported, just enough to execute a Fibonacci function.

Supported instructions are

* call
* if
* br_if
* i64.const
* i64.sub
* i64.le_u
* local.get

but they are enough to prove that it is actually feasible to implement a complete JIT engine purely in Go!

Notably, this commit adds the jit package, which implements the JIT engine for WebAssembly purely in Go. See wasm/jit/README.md for details on design choices and considerations.

Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
tetratelabs · Dec 9, 2021 · 297f9db
1 parent 2488f2c
commit 297f9db

Showing 24 changed files with 3,693 additions and 47 deletions.
diff --git a/RATIONALE.md b/RATIONALE.md
@@ -11,3 +11,7 @@ runtime vs interpreting Wasm directly (the `naivevm` interpreter).
 
 Note: `microwasm` was never specified formally, and only exists in a historical codebase of wasmtime:
 https://github.com/bytecodealliance/wasmtime/blob/v0.29.0/crates/lightbeam/src/microwasm.rs
+
+### JIT engine implementation
+
+See [wasm/jit/RATIONALE.md](wasm/jit/RATIONALE.md).
diff --git a/go.mod b/go.mod
@@ -3,4 +3,8 @@ module github.com/tetratelabs/wazero
 // temporarily support go 1.16 per #37
 go 1.16
 
-require github.com/stretchr/testify v1.5.1
+require (
+	github.com/stretchr/testify v1.5.1
+	// Once we reach some maturity, remove this dep and implement our own assembler.
+	github.com/twitchyliquid64/golang-asm v0.15.1
+)
diff --git a/go.sum b/go.sum
@@ -5,6 +5,8 @@ github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZN
 github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
 github.com/stretchr/testify v1.5.1 h1:nOGnQDM7FYENwehXlg/kFVnos3rEvtKTjRvOWSzb6H4=
 github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA=
+github.com/twitchyliquid64/golang-asm v0.15.1 h1:SU5vSMR7hnwNxj24w34ZyCi/FmDZTkS4MhqMhdFk5YI=
+github.com/twitchyliquid64/golang-asm v0.15.1/go.mod h1:a1lVb/DtPvCB8fslRZhAngC2+aY1QWCk3Cedj/Gdt08=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
 gopkg.in/yaml.v2 v2.2.2 h1:ZCJp+EgiOT7lHqUV2J862kp8Qj64Jo6az82+3Td9dZw=

diff --git a/wasm/engine.go b/wasm/engine.go
@@ -9,4 +9,12 @@ type Engine interface {
 	Call(f *FunctionInstance, args ...uint64) (returns []uint64, err error)
 	// Compile compiles down the function instance.
 	Compile(f *FunctionInstance) error
+	// PreCompile prepares the compilation for the given function instances.
+	// This is called for all the instances in a module instance
+	// before Compile is called. That is necessary because
+	// the JIT engine needs to assign a unique id to each function instance
+	// before it compiles any function. Concretely, the JIT engine
+	// uses the ids when emitting call instructions against
+	// not-yet-compiled function instances.
+	PreCompile(fs []*FunctionInstance) error
 }
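
The ordering constraint described in the `PreCompile` comment can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than the code in this commit: `functionInstance`, `engine`, and the textual `call <id>` pseudo-instructions are assumptions made for illustration, and no machine code is emitted. The sketch only shows why ids must be assigned to all functions before any body is compiled: a function may call one that is compiled later, or the two may call each other.

```go
package main

import "fmt"

// functionInstance is a hypothetical stand-in for wasm.FunctionInstance.
type functionInstance struct {
	name    string
	callees []*functionInstance // functions called from this function's body
}

// engine is a hypothetical, simplified engine. It emits textual
// pseudo-instructions instead of machine code.
type engine struct {
	ids      map[*functionInstance]int
	compiled map[int][]string
}

func newEngine() *engine {
	return &engine{ids: map[*functionInstance]int{}, compiled: map[int][]string{}}
}

// PreCompile assigns an id to every function before any body is compiled.
func (e *engine) PreCompile(fs []*functionInstance) error {
	for _, f := range fs {
		if _, ok := e.ids[f]; !ok {
			e.ids[f] = len(e.ids)
		}
	}
	return nil
}

// Compile emits "call <id>" for each callee. The callee only needs an id;
// its body may not have been compiled yet.
func (e *engine) Compile(f *functionInstance) error {
	id, ok := e.ids[f]
	if !ok {
		return fmt.Errorf("%s has no id: PreCompile must run first", f.name)
	}
	code := []string{}
	for _, c := range f.callees {
		calleeID, ok := e.ids[c]
		if !ok {
			return fmt.Errorf("callee %s has no id", c.name)
		}
		code = append(code, fmt.Sprintf("call %d", calleeID))
	}
	e.compiled[id] = code
	return nil
}

func main() {
	a := &functionInstance{name: "a"}
	b := &functionInstance{name: "b"}
	a.callees = []*functionInstance{b} // a calls b, which is compiled after a
	b.callees = []*functionInstance{a} // b calls a: mutual recursion

	e := newEngine()
	fs := []*functionInstance{a, b}
	if err := e.PreCompile(fs); err != nil {
		panic(err)
	}
	for _, f := range fs {
		if err := e.Compile(f); err != nil {
			panic(err)
		}
	}
	fmt.Println(e.compiled[0], e.compiled[1])
}
```

Without the `PreCompile` pass, compiling `a` would fail in this sketch because `b` has no id yet, which mirrors the reason given in the interface comment.
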
diff --git a/wasm/jit/RATIONALE.md b/wasm/jit/RATIONALE.md
@@ -0,0 +1,101 @@
+# Just-In-Time compilation engine
+
+This package implements the JIT engine for WebAssembly *purely written in Go*.
+In this document, we describe the background, the technical difficulties, and some of the design choices.
+
+## General limitations on pure Go JIT engines
+
+In a Go program, each goroutine has its own stack, and each item on a goroutine stack is managed by the Go runtime for garbage collection, etc.
+
+These impose some difficulties on a JIT engine purely written in Go because we *cannot* use native push/pop instructions to save/restore temporary variables spilled from registers. As a result, we cannot invoke Go functions from JITed native code with the native `call` instruction, since it involves stack manipulation.
+
+*TODO: maybe it is possible to hack the runtime to make it possible to achieve function calls with `call`.*
+
+## How to generate native codes
+
+Currently we rely on [`twitchyliquid64/golang-asm`](https://github.com/twitchyliquid64/golang-asm) to assemble native code. The library is just a copy of the official Go compiler's assembler with modified import paths. So once we reach some maturity, we could implement our own assembler to drop this dependency, as having fewer dependencies is one of the primary goals of this project.
+
+The assembled native code is represented as `[]byte`, and the slice region is marked as executable via the `mmap` system call.
+
+## How to enter native codes
+
+Assuming that we have native code as `[]byte`, it is straightforward to enter the native code region via
+Go assembly code. In this package, we have a function without a body called `jitcall`:
+
+```go
+func jitcall(codeSegment, engine, memory uintptr)
+```
+
+where we pass `codeSegment uintptr` as the first argument. This pointer points to the first instruction to be executed. It can be easily derived from `[]byte` via `unsafe.Pointer`:
+```go
+code := []byte{}
+/* ...Compilation ...*/
+codeSegment := uintptr(unsafe.Pointer(&code[0]))
+jitcall(codeSegment, ...)
+```
+
+`jitcall` is implemented in [jit_amd64.s](./jit_amd64.s) as a convenience layer that complies with Go's official calling convention; we delegate the task of jumping into the code segment to this Go assembly code.
+
+## How to achieve function calls
+
+Given that we cannot use the `call` instruction at all in native code, here's how we achieve function calls back and forth between Go and (JITed) Wasm native functions.
+
+The general principle is that every function call consists of 1) emitting an instruction to record the continuation program counter in `engine.continuationAddressOffset` and 2) emitting a `return` instruction.
+
+For example, the following Wasm code
+
+```
+0x3: call 1
+0x5: i64.const 100
+```
+
+will be compiled as 
+
+```
+mov [engine.functionCallIndex] $1 ;; Set the index of call target function to functionCallIndex field of engine.
+mov [engine.continuationAddressOffset] $0x05 ;; Set the continuation address to continuationAddressOffset field of engine.
+return ;; Return from the function.
+mov ... $100 ;; This is the beginning of program *after* function return.
+```
+
+This way, the engine, which enters the native code via `jitcall`, can know the continuation address of the caller's function frame: 
+
+```go
+case jitCallStatusCodeCallWasmFunction:
+    nextFunc := e.compiledWasmFunctions[e.functionCallIndex]
+    // Calculate the continuation address so
+    // we can resume this caller function frame.
+    currentFrame.continuationAddress = currentFrame.f.codeInitialAddress + e.continuationAddressOffset
+    currentFrame.continuationStackPointer = e.currentStackPointer + nextFunc.outputNum - nextFunc.inputNum
+    currentFrame.baseStackPointer = e.currentBaseStackPointer
+```
+
+and call into another function in the JIT engine's main loop:
+
+```go
+    // Create the callee frame.
+    frame := &callFrame{
+        continuationAddress: nextFunc.codeInitialAddress,
+        f:                   nextFunc,
+        // Set the caller frame so we can return back to the current frame!
+        caller: currentFrame,
+        // Set the base pointer to the beginning of the function inputs
+        baseStackPointer: e.currentBaseStackPointer + e.currentStackPointer - nextFunc.inputNum,
+    }
+```
+
+After the callee code finishes executing, we return to the caller's code at the recorded continuation address:
+
+```go
+case jitStatusReturned:
+    // Meaning that the current frame exits
+    // so we just get back to the caller's frame.
+    callerFrame := currentFrame.caller
+    e.callFrameStack = callerFrame
+    e.currentBaseStackPointer = callerFrame.baseStackPointer
+    e.currentStackPointer = callerFrame.continuationStackPointer
+```
+
+To summarize, every function call is achieved by returning to Go code (`engine.exec`'s main loop) with some continuation info, and entering the callee native code (or host function) from there. That, of course, comes with some overhead because each function call takes two steps (returning to the `jitcall` call site AND entering `jitcall` again) versus just a `call` instruction (or `jmp`) in ordinary native code.
+
+Note that this mechanism is a minimal PoC implementation. In the near future, we would like to achieve function calls without returning to `engine.exec`'s main loop, and instead `jmp` directly to the callee native code.
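
The call-by-returning scheme described in the rationale can be simulated in plain Go. The following is a sketch under stated assumptions, not the engine added by this commit: ordinary Go functions stand in for native code segments, an integer `pc` stands in for the continuation address, and the names `status`, `compiledFunc`, `requestCall`, and `runFib` are hypothetical. Only `functionCallIndex` and `continuationAddressOffset` mirror field names mentioned in the text. A "compiled" function never calls another one directly; it records the call target and where to resume, then returns to the main loop.

```go
package main

import "fmt"

type status int

const (
	statusReturned     status = iota // the current function has finished
	statusCallFunction               // the current function requests a call
)

// compiledFunc stands in for a native code segment. It resumes at pc and
// runs until it finishes or requests a call.
type compiledFunc func(e *engine, pc int) status

type engine struct {
	functions                 []compiledFunc
	stack                     []uint64 // shared value stack
	functionCallIndex         int      // written by code before "returning" for a call
	continuationAddressOffset int      // where the caller resumes afterwards
}

// requestCall records the call target and the continuation, then hands
// control back to the main loop instead of calling directly.
func (e *engine) requestCall(index, continuation int) status {
	e.functionCallIndex = index
	e.continuationAddressOffset = continuation
	return statusCallFunction
}

type callFrame struct {
	f      compiledFunc
	pc     int
	caller *callFrame
}

// exec is the main loop: every call and every return passes through it.
func (e *engine) exec(entry int) {
	frame := &callFrame{f: e.functions[entry]}
	for frame != nil {
		switch frame.f(e, frame.pc) {
		case statusCallFunction:
			frame.pc = e.continuationAddressOffset // resume point for the caller
			frame = &callFrame{f: e.functions[e.functionCallIndex], caller: frame}
		case statusReturned:
			frame = frame.caller
		}
	}
}

// fib expects its argument n on top of the stack and replaces it with fib(n).
func fib(e *engine, pc int) status {
	s := e.stack
	switch pc {
	case 0: // entry: stack is [..., n]
		n := s[len(s)-1]
		if n <= 1 {
			return statusReturned // n is already the result
		}
		e.stack = append(s, n-1)
		return e.requestCall(0, 1)
	case 1: // stack is [..., n, fib(n-1)]
		n := s[len(s)-2]
		e.stack = append(s, n-2)
		return e.requestCall(0, 2)
	default: // pc == 2: stack is [..., n, fib(n-1), fib(n-2)]
		r1, r2 := s[len(s)-2], s[len(s)-1]
		e.stack = append(s[:len(s)-3], r1+r2)
		return statusReturned
	}
}

func runFib(n uint64) uint64 {
	e := &engine{functions: []compiledFunc{fib}, stack: []uint64{n}}
	e.exec(0)
	return e.stack[0]
}

func main() {
	fmt.Println(runFib(10))
}
```

Each recursive step in this sketch costs one return to `exec` plus one re-entry, which is the two-step overhead the rationale mentions; the planned direct `jmp` would remove that round trip.
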