wasm: introduce one-off eval function, use it instead
* wasm/sdk: check version, call old eval path for ABI 1.1

  Fixes open-policy-agent#3146.

* docs/wasm: document addition as ABI 1.2

* wasm-sdk: overwrite previous inputs, don't accumulate them

  There is a little room for optimization here: should the input
  ever grow so large that it eats up too much precious heap space,
  we could look into changing this so that the memory used for it
  can be reclaimed.

* internal/compiler/wasm: commit generated wasm

  I've noticed that since the CI build running on macos-latest doesn't
  have docker installed, it cannot update these files itself at build
  time. We thus end up with macos binaries that have the wasm binary
  data from the main branch, not the PR.

  This can be observed from the test failure:

      Run make ci-binary-smoke-test-wasm BINARY=opa_darwin_amd64
      chmod +x "_release/0.31.0-dev/opa_darwin_amd64"
      "_release/0.31.0-dev/opa_darwin_amd64" eval -t "wasm" 'time.now_ns()'
      make: *** [ci-binary-smoke-test-wasm] Error 2
      {
        "errors": [
          {
            "message": "caller not found: opa_eval (opa_eval)"
          }
        ]
      }
      Error: Process completed with exit code 2.

  Since I had previously committed the CSV data that drives the dead
  code elimination process, that optimization had failed to find a
  function it expected to have.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
srenatus committed Jul 15, 2021
1 parent f3284cf commit e30b182
Showing 17 changed files with 259 additions and 74 deletions.
4 changes: 4 additions & 0 deletions capabilities.json
@@ -3503,6 +3503,10 @@
{
"version": 1,
"minor_version": 1
},
{
"version": 1,
"minor_version": 2
}
]
}
2 changes: 1 addition & 1 deletion compile/compile_test.go
@@ -498,7 +498,7 @@ func TestCompilerWasmTargetWithCapabilitiesMismatch(t *testing.T) {

for note, wabis := range map[string][]ast.WasmABIVersion{
"none": {},
-"mismatch": {{Version: 0}, {Version: 1, Minor: 2}},
+"mismatch": {{Version: 0}, {Version: 1, Minor: 2000}},
} {
t.Run(note, func(t *testing.T) {
caps := ast.CapabilitiesForThisVersion()
59 changes: 34 additions & 25 deletions docs/content/wasm.md
@@ -108,7 +108,7 @@ import functions are dependencies of the compiled policies.

Wasm modules built using OPA 0.27.0 onwards contain a global variable named
`opa_wasm_abi_version` that has a constant i32 value indicating the ABI version
-this module requires. Described below you find ABI version 1.
+this module requires. Described below you find ABI versions `1.x`.

There's another i32 constant exported, `opa_wasm_abi_minor_version`, used
to track backwards-compatible changes.
@@ -129,31 +129,40 @@ Export[19]:

Note the `i32=1` of `global[1]`, exported by the name of `opa_wasm_abi_version`.

#### Exports
##### Version notes

ABI | Notes
--- | ---
1.0 | Start of ABI versioning.
1.1 | Adds export `memory`.
1.2 | Adds exported function `opa_eval`.

The primary exported functions for interacting with policy modules are:

| Function Signature | Description|
| --- | --- |
| <span class="opa-keep-it-together">`int32 eval(ctx_addr)`</span> | Evaluates the loaded policy with the provided evaluation context. The return value is reserved for future use. |
| <span class="opa-keep-it-together">`value_addr builtins(void)`</span> | Returns the address of a mapping of built-in function names to numeric identifiers that are required by the policy. |
| <span class="opa-keep-it-together">`value_addr entrypoints(void)`</span> | Returns the address of a mapping of entrypoints to numeric identifiers that can be selected when evaluating the policy. |
| <span class="opa-keep-it-together">`ctx_addr opa_eval_ctx_new(void)`</span> | Returns the address of a newly allocated evaluation context. |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_input(ctx_addr, value_addr)`</span> | Set the input value to use during evaluation. This must be called before each `eval()` call. If the input value is not set before evaluation, references to the `input` document result produce no results (i.e., they are undefined.) |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_data(ctx_addr, value_addr)`</span> | Set the data value to use during evalutaion. This should be called before each `eval()` call. If the data value is not set before evalutaion, references to base `data` documents produce no results (i.e., they are undefined.) |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_entrypoint(ctx_addr, entrypoint_id)`</span> | Set the entrypoint to evaluate. By default, entrypoint with id `0` is evaluated. |
| <span class="opa-keep-it-together">`value_addr opa_eval_ctx_get_result(ctx_addr)`</span> | Get the result set produced by the evaluation process. |
| <span class="opa-keep-it-together">`addr opa_malloc(int32 size)`</span> | Allocates size bytes in the shared memory and returns the starting address. |
| <span class="opa-keep-it-together">`void opa_free(addr)`</span> | Free a pointer. Calls `opa_abort` on error. |
| <span class="opa-keep-it-together">`value_addr opa_json_parse(str_addr, size)`</span> | Parses the JSON serialized value starting at str_addr of size bytes and returns the address of the parsed value. The parsed value may refer to a null, boolean, number, string, array, or object value. |
| <span class="opa-keep-it-together">`value_addr opa_value_parse(str_addr, size)`</span> | The same as `opa_json_parse` except Rego set literals are supported. |
| <span class="opa-keep-it-together">`str_addr opa_json_dump(value_addr)`</span> | Dumps the value referred to by `value_addr` to a null-terminated JSON serialized string and returns the address of the start of the string. Rego sets are serialized as JSON arrays. Non-string Rego object keys are serialized as strings. |
| <span class="opa-keep-it-together">`str_addr opa_value_dump(value_addr)`</span> | The same as `opa_json_dump` except Rego sets are serialized using the literal syntax and non-string Rego object keys are not serialized as strings. |
| <span class="opa-keep-it-together">`void opa_heap_ptr_set(addr)`</span> | Set the heap pointer for the next evaluation. |
| <span class="opa-keep-it-together">`addr opa_heap_ptr_get(void)`</span> | Get the current heap pointer. |
| <span class="opa-keep-it-together">`int32 opa_value_add_path(base_value_addr, path_value_addr, value_addr)`</span> | Add the value at the `value_addr` into the object referenced by `base_value_addr` at the given path. The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Existing values will be updated. On success the value at `value_addr` is no longer owned by the caller, it will be freed with the base value. The path must be freed by the caller after use (see `opa_free`). If an error occurs the base value will remain unchanged. Example: base object `{"a": {"b": 123}}`, path `["a", "x", "y"]`, and value `{"foo": "bar"}` will yield `{"a": {"b": 123, "x": {"y": {"foo": "bar"}}}}`. Returns an error code (see below). |
| <span class="opa-keep-it-together">`int32 opa_value_remove_path(base_value_addr, path_value_addr)`</span> | Remove the value from the object referenced by `base_value_addr` at the given path. Values removed will be freed. The path must be freed by the caller after use (see `opa_free`). The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Returns an error code (see below). |
#### Exports

The primary exported functions for interacting with policy modules are listed below.
In the ABI column, you can find the ABI version with which the export was introduced.

| Function Signature | Description | ABI
| --- | --- | --- |
| <span class="opa-keep-it-together">`int32 eval(ctx_addr)`</span> | Evaluates the loaded policy with the provided evaluation context. The return value is reserved for future use. | 1.0 |
| <span class="opa-keep-it-together">`value_addr builtins(void)`</span> | Returns the address of a mapping of built-in function names to numeric identifiers that are required by the policy. | 1.0 |
| <span class="opa-keep-it-together">`value_addr entrypoints(void)`</span> | Returns the address of a mapping of entrypoints to numeric identifiers that can be selected when evaluating the policy. | 1.0 |
| <span class="opa-keep-it-together">`ctx_addr opa_eval_ctx_new(void)`</span> | Returns the address of a newly allocated evaluation context. | 1.0 |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_input(ctx_addr, value_addr)`</span> | Set the input value to use during evaluation. This must be called before each `eval()` call. If the input value is not set before evaluation, references to the `input` document result produce no results (i.e., they are undefined.) | 1.0 |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_data(ctx_addr, value_addr)`</span> | Set the data value to use during evalutaion. This should be called before each `eval()` call. If the data value is not set before evalutaion, references to base `data` documents produce no results (i.e., they are undefined.) | 1.0 |
| <span class="opa-keep-it-together">`void opa_eval_ctx_set_entrypoint(ctx_addr, entrypoint_id)`</span> | Set the entrypoint to evaluate. By default, entrypoint with id `0` is evaluated. | 1.0 |
| <span class="opa-keep-it-together">`value_addr opa_eval_ctx_get_result(ctx_addr)`</span> | Get the result set produced by the evaluation process. | 1.0 |
| <span class="opa-keep-it-together">`addr opa_malloc(int32 size)`</span> | Allocates size bytes in the shared memory and returns the starting address. | 1.0 |
| <span class="opa-keep-it-together">`void opa_free(addr)`</span> | Free a pointer. Calls `opa_abort` on error. | 1.0 |
| <span class="opa-keep-it-together">`value_addr opa_json_parse(str_addr, size)`</span> | Parses the JSON serialized value starting at str_addr of size bytes and returns the address of the parsed value. The parsed value may refer to a null, boolean, number, string, array, or object value. | 1.0 |
| <span class="opa-keep-it-together">`value_addr opa_value_parse(str_addr, size)`</span> | The same as `opa_json_parse` except Rego set literals are supported. | 1.0 |
| <span class="opa-keep-it-together">`str_addr opa_json_dump(value_addr)`</span> | Dumps the value referred to by `value_addr` to a null-terminated JSON serialized string and returns the address of the start of the string. Rego sets are serialized as JSON arrays. Non-string Rego object keys are serialized as strings. | 1.0 |
| <span class="opa-keep-it-together">`str_addr opa_value_dump(value_addr)`</span> | The same as `opa_json_dump` except Rego sets are serialized using the literal syntax and non-string Rego object keys are not serialized as strings. | 1.0 |
| <span class="opa-keep-it-together">`void opa_heap_ptr_set(addr)`</span> | Set the heap pointer for the next evaluation. | 1.0 |
| <span class="opa-keep-it-together">`addr opa_heap_ptr_get(void)`</span> | Get the current heap pointer. | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_add_path(base_value_addr, path_value_addr, value_addr)`</span> | Add the value at the `value_addr` into the object referenced by `base_value_addr` at the given path. The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Existing values will be updated. On success the value at `value_addr` is no longer owned by the caller, it will be freed with the base value. The path must be freed by the caller after use (see `opa_free`). If an error occurs the base value will remain unchanged. Example: base object `{"a": {"b": 123}}`, path `["a", "x", "y"]`, and value `{"foo": "bar"}` will yield `{"a": {"b": 123, "x": {"y": {"foo": "bar"}}}}`. Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_remove_path(base_value_addr, path_value_addr)`</span> | Remove the value from the object referenced by `base_value_addr` at the given path. Values removed will be freed. The path must be freed by the caller after use (see `opa_free`). The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`str_addr opa_eval(addr, entrypoint_id, value_addr, str_addr, int32, addr, format)`</span> | One-off policy evaluation method. Its arguments are everything needed to evaluate: entrypoint, address of data in memory, address and length of input JSON string in memory, heap address to use, and the output format (`0` is JSON, `1` is "value", i.e. serialized Rego values). The first argument is reserved for future use and must be `0`. Returns the address to the serialised result value. | 1.2 |

The addresses passed and returned by the policy modules are 32-bit integer
offsets into the shared memory region. The `value_addr` parameters and return
@@ -162,7 +171,7 @@ values refer to OPA value data structures: `null`, `boolean`, `number`,

__Error codes:__

-OPA WASM Error codes are int32 values defined as:
+OPA Wasm Error codes are int32 values defined as:

| Value | Name | Description |
|-------|------|-------------|
6 changes: 6 additions & 0 deletions internal/compiler/wasm/opa/callgraph.csv
@@ -200,6 +200,12 @@ opa_cmp_lt,opa_boolean
opa_cmp_lte,opa_value_compare
opa_cmp_lte,opa_boolean
opa_eval_ctx_new,opa_malloc
opa_eval,opa_abort
opa_eval,opa_heap_ptr_set
opa_eval,opa_value_parse
opa_eval,eval
opa_eval,opa_value_dump
opa_eval,opa_json_dump
__force_import_opa_builtins,opa_builtin0
__force_import_opa_builtins,opa_builtin1
__force_import_opa_builtins,opa_builtin2
4 changes: 2 additions & 2 deletions internal/compiler/wasm/opa/opa.go

Large diffs are not rendered by default.

Binary file modified internal/compiler/wasm/opa/opa.wasm
Binary file not shown.
33 changes: 13 additions & 20 deletions internal/compiler/wasm/wasm.go
@@ -27,7 +27,7 @@ import (
const (
opaWasmABIVersionVal = 1
opaWasmABIVersionVar = "opa_wasm_abi_version"
-opaWasmABIMinorVersionVal = 1
+opaWasmABIMinorVersionVal = 2
opaWasmABIMinorVersionVar = "opa_wasm_abi_minor_version"
)

@@ -899,20 +899,7 @@ func (c *Compiler) replaceBooleanFunc() error {
c.appendInstr(instruction.GetLocal{Index: 0})
c.appendInstr(instruction.Select{})

// replace the code segment
var idx uint32
for _, fn := range c.module.Names.Functions {
if fn.Name == opaBoolean {
idx = fn.Index - uint32(c.functionImportCount())
}
}
var buf bytes.Buffer
if err := encoding.WriteCodeEntry(&buf, c.code); err != nil {
return err
}

c.module.Code.Segments[idx].Code = buf.Bytes()
return nil
return c.storeFunc(opaBoolean, c.code)
}

func (c *Compiler) compileBlock(block *ir.Block) ([]instruction.Instruction, error) {
@@ -1487,11 +1474,17 @@ func (c *Compiler) compileExternalCall(stmt *ir.CallStmt, id int32, result *[]in

func (c *Compiler) emitFunctionDecl(name string, tpe module.FunctionType, export bool) {

typeIndex := c.emitFunctionType(tpe)
c.module.Function.TypeIndices = append(c.module.Function.TypeIndices, typeIndex)
c.module.Code.Segments = append(c.module.Code.Segments, module.RawCodeSegment{})
idx := uint32((len(c.module.Function.TypeIndices) - 1) + c.functionImportCount())
c.funcs[name] = idx
var idx uint32
if old, ok := c.funcs[name]; ok {
c.debug.Printf("function declaration for %v is being emitted multiple times (overwriting old index %d)", name, old)
idx = old
} else {
typeIndex := c.emitFunctionType(tpe)
c.module.Function.TypeIndices = append(c.module.Function.TypeIndices, typeIndex)
c.module.Code.Segments = append(c.module.Code.Segments, module.RawCodeSegment{})
idx = uint32((len(c.module.Function.TypeIndices) - 1) + c.functionImportCount())
c.funcs[name] = idx
}

if export {
c.module.Export.Exports = append(c.module.Export.Exports, module.Export{
99 changes: 99 additions & 0 deletions internal/wasm/sdk/internal/wasm/vm.go
@@ -28,13 +28,16 @@ type VM struct {
instance *wasmtime.Instance // Pointer to avoid unintented destruction (triggering finalizers within).
intHandle *wasmtime.InterruptHandle
policy []byte
abiMajorVersion int32
abiMinorVersion int32
memory *wasmtime.Memory
memoryMin uint32
memoryMax uint32
entrypointIDs map[string]int32
baseHeapPtr int32
dataAddr int32
evalHeapPtr int32
evalOneOff func(context.Context, int32, int32, int32, int32, int32) (int32, error)
eval func(context.Context, int32) error
evalCtxGetResult func(context.Context, int32) (int32, error)
evalCtxNew func(context.Context) (int32, error)
@@ -105,6 +108,14 @@ func newVM(opts vmOpts) (*VM, error) {
return nil, fmt.Errorf("get interrupt handle: %w", err)
}

v.abiMajorVersion, v.abiMinorVersion, err = getABIVersion(i, store)
if err != nil {
return nil, fmt.Errorf("invalid module: %w", err)
}
if v.abiMajorVersion != int32(1) || (v.abiMinorVersion != int32(1) && v.abiMinorVersion != int32(2)) {
return nil, fmt.Errorf("invalid module: unsupported ABI version: %d.%d", v.abiMajorVersion, v.abiMinorVersion)
}

v.store = store
v.instance = i
v.policy = opts.policy
@@ -122,6 +133,9 @@ func newVM(opts vmOpts) (*VM, error) {
v.evalCtxSetInput = func(ctx context.Context, a int32, b int32) error {
return callVoid(ctx, v, "opa_eval_ctx_set_input", a, b)
}
v.evalOneOff = func(ctx context.Context, ep, dataAddr, inputAddr, inputLen, heapAddr int32) (int32, error) {
return call(ctx, v, "opa_eval", 0 /* reserved */, ep, dataAddr, inputAddr, inputLen, heapAddr, 1 /* value output */)
}
v.evalCtxSetEntrypoint = func(ctx context.Context, a int32, b int32) error {
return callVoid(ctx, v, "opa_eval_ctx_set_entrypoint", a, b)
}
@@ -238,9 +252,88 @@ func newVM(opts vmOpts) (*VM, error) {
return v, nil
}

func getABIVersion(i *wasmtime.Instance, store wasmtime.Storelike) (int32, int32, error) {
major := i.GetExport(store, "opa_wasm_abi_version").Global()
minor := i.GetExport(store, "opa_wasm_abi_minor_version").Global()
if major != nil && minor != nil {
majorVal := major.Get(store)
minorVal := minor.Get(store)
if majorVal.Kind() == wasmtime.KindI32 && minorVal.Kind() == wasmtime.KindI32 {
return majorVal.I32(), minorVal.I32(), nil
}
}
return 0, 0, fmt.Errorf("failed to read ABI version")
}

// Eval performs an evaluation of the specified entrypoint, with any provided
// input, and returns the resulting value dumped to a string.
func (i *VM) Eval(ctx context.Context, entrypoint int32, input *interface{}, metrics metrics.Metrics, seed io.Reader, ns time.Time) ([]byte, error) {
if i.abiMinorVersion < int32(2) {
return i.evalCompat(ctx, entrypoint, input, metrics, seed, ns)
}

metrics.Timer("wasm_vm_eval").Start()
defer metrics.Timer("wasm_vm_eval").Stop()

mem := i.memory.UnsafeData(i.store)
inputAddr, inputLen := int32(0), int32(0)

// NOTE: we'll never free the memory used for the input string during
// the one evaluation, but we'll overwrite it on the next evaluation.
heapPtr := i.evalHeapPtr

if input != nil {
metrics.Timer("wasm_vm_eval_prepare_input").Start()
var raw []byte
switch v := (*input).(type) {
case []byte:
raw = v
case *ast.Term:
raw = []byte(v.String())
case ast.Value:
raw = []byte(v.String())
default:
var err error
raw, err = json.Marshal(v)
if err != nil {
return nil, err
}
}
inputLen = int32(len(raw))
inputAddr = i.evalHeapPtr
heapPtr += inputLen
copy(mem[inputAddr:inputAddr+inputLen], raw)

metrics.Timer("wasm_vm_eval_prepare_input").Stop()
}

// Setting the ctx here ensures that it'll be available to builtins that
// make use of it (e.g. `http.send`); and it will spawn a go routine
// cancelling the builtins that use topdown.Cancel, when the context is
// cancelled.
i.dispatcher.Reset(ctx, seed, ns)

metrics.Timer("wasm_vm_eval_call").Start()
resultAddr, err := i.evalOneOff(ctx, int32(entrypoint), i.dataAddr, inputAddr, inputLen, heapPtr)
if err != nil {
return nil, err
}
metrics.Timer("wasm_vm_eval_call").Stop()

data := i.memory.UnsafeData(i.store)[resultAddr:]
n := bytes.IndexByte(data, 0)
if n < 0 {
n = 0
}

// Skip free'ing input and result JSON as the heap will be reset next round anyway.
return data[:n], nil
}

// evalCompat evaluates a policy using multiple calls into the VM to set the stage.
// It's been superceded with ABI version 1.2, but still here for compatibility with
// Wasm modules lacking the needed export (i.e., ABI 1.1).
func (i *VM) evalCompat(ctx context.Context, entrypoint int32, input *interface{}, metrics metrics.Metrics, seed io.Reader, ns time.Time) ([]byte, error) {
metrics.Timer("wasm_vm_eval").Start()
defer metrics.Timer("wasm_vm_eval").Stop()

@@ -640,6 +733,12 @@ func callOrCancel(ctx context.Context, vm *VM, name string, args ...int32) (inte
}
}
if msg != "" {
// TODO(sr): Out of bounds memory access is a trap, too!
// This "interrupted at" is a bit misleading, however, currently
// the only way to fix this is by looking at the string
// `t.Error()` which also contains a (long, prettily) rendered
// backtrace.
// See also https://github.com/bytecodealliance/wasmtime-go/issues/63
msg = "interrupted at " + msg
}
}
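
The new `Eval` path copies the serialized input to the current heap pointer, advances the pointer past it, and later reads the result as a NUL-terminated string from linear memory. The sketch below replays those two steps on a plain byte slice standing in for the module's memory. `placeInput` and `readCString` are hypothetical names, and the bounds check is an addition for the sketch that the diff itself does not contain.

```go
package main

import (
	"bytes"
	"fmt"
)

// placeInput copies raw into mem at heapPtr and returns the input address,
// its length, and the heap pointer advanced past the input.
// The bounds check is an assumption of this sketch, not taken from the diff.
func placeInput(mem []byte, heapPtr int32, raw []byte) (addr, length, next int32, err error) {
	length = int32(len(raw))
	if heapPtr < 0 || int(heapPtr)+len(raw) > len(mem) {
		return 0, 0, heapPtr, fmt.Errorf("input of %d bytes does not fit at offset %d", len(raw), heapPtr)
	}
	copy(mem[heapPtr:heapPtr+length], raw)
	return heapPtr, length, heapPtr + length, nil
}

// readCString returns the bytes starting at addr up to, but excluding, the
// first NUL byte. Like the diff, it yields an empty slice if no NUL is found.
func readCString(mem []byte, addr int32) []byte {
	data := mem[addr:]
	n := bytes.IndexByte(data, 0)
	if n < 0 {
		n = 0
	}
	return data[:n]
}

func main() {
	mem := make([]byte, 64) // stand-in for the Wasm linear memory

	addr, n, heap, err := placeInput(mem, 16, []byte(`{"user":"alice"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(addr, n, heap)

	// Pretend the evaluation wrote a NUL-terminated result at the new heap
	// pointer; the zeroed memory after it acts as the terminator.
	resultAddr := heap
	copy(mem[resultAddr:], `[{"result":true}]`)
	fmt.Printf("%s\n", readCString(mem, resultAddr))
}
```

Because each evaluation starts writing at the same base heap pointer, the next input overwrites the previous one instead of accumulating, which matches the "overwrite previous inputs" note in the commit message.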
5 changes: 1 addition & 4 deletions internal/wasm/sdk/opa/capabilities/capabilities.go
@@ -6,10 +6,7 @@

package capabilities

-const abiVersion = 1
-const abiMinorVersion = 1
-
// ABIVersions returns the ABI versions that this SDK supports
func ABIVersions() [][2]int {
-return [][2]int{{abiVersion, abiMinorVersion}}
+return [][2]int{{1, 1}, {1, 2}}
}
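
Taken together, the SDK changes in this commit accept modules with ABI 1.1 or 1.2 and only use the one-off `opa_eval` export from minor version 2 onwards. A minimal sketch of that decision logic follows; the `abiVersion` type and both function names are hypothetical and only mirror the conditions visible in `newVM` and `Eval` above.

```go
package main

import "fmt"

// abiVersion is a hypothetical helper type for this sketch.
type abiVersion struct {
	Major, Minor int32
}

// supported mirrors the check in newVM: major must be 1, minor 1 or 2.
func supported(v abiVersion) bool {
	return v.Major == 1 && (v.Minor == 1 || v.Minor == 2)
}

// useOneOffEval mirrors the check in Eval: modules below minor version 2
// lack the opa_eval export and take the compatibility path instead.
func useOneOffEval(v abiVersion) bool {
	return v.Minor >= 2
}

func main() {
	for _, v := range []abiVersion{{1, 1}, {1, 2}, {1, 3}, {2, 0}} {
		if !supported(v) {
			fmt.Printf("%d.%d: unsupported ABI version\n", v.Major, v.Minor)
			continue
		}
		fmt.Printf("%d.%d: one-off eval = %t\n", v.Major, v.Minor, useOneOffEval(v))
	}
}
```

Keeping the support check separate from the path selection means a later minor version only needs to be added to the first function, as long as it still exports `opa_eval`.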
1 change: 0 additions & 1 deletion internal/wasm/sdk/opa/opa_bench_test.go
@@ -21,7 +21,6 @@ func BenchmarkWasmRego(b *testing.B) {
policy := compileRegoToWasm("a = true", "data.p.a = x", false)
instance, _ := opa.New().
WithPolicyBytes(policy).
WithMemoryLimits(131070, 2*131070). // TODO: For some reason unlimited memory slows down the eval_ctx_new().
WithPoolSize(1).
Init()
