Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm: reduce number of VM calls for policy evaluation #3627

Conversation

srenatus
Copy link
Contributor

@srenatus srenatus commented Jul 9, 2021

This adds a new function export to the wasm modules, opa_eval: It offers a faster eval functionality that requires less VM calls to evaluate a policy.

Where before we had to do these VM calls to evaluate a policy with input:

<opa_heap_ptr_set>
<opa_eval_ctx_new>
<opa_eval_ctx_set_data>
<opa_eval_ctx_set_entrypoint>
<opa_malloc>
<opa_value_parse>
<opa_eval_ctx_set_input>
<eval>
<opa_eval_ctx_get_result>
<opa_value_dump>

Now, it'll be one call: opa_eval, that takes the entrypoint, a pointer to data, a pointer (+ length) of the input JSON, and a heap ptr. It returns the same result pointer you'd gotten before (as end result of the entire call set).

👉 The added function changes our ABI from 1.1 to 1.2.

Fixes #3146.


TODO

  • give the export a better name (💭 opa_eval...?)
  • bump abi minor version
  • add integration test for being able to eval ABI 1.1 module
  • docs
  • compile some benchmarks
  • use new method in make wasm-rego-test

$ benchstat main.bench branch.bench
name         old time/op    new time/op    delta
WasmRego-16    84.3µs ± 6%    15.1µs ± 0%  -82.07%  (p=0.008 n=5+5)

name         old alloc/op   new alloc/op   delta
WasmRego-16    6.05kB ± 0%    2.24kB ± 0%  -62.92%  (p=0.008 n=5+5)

name         old allocs/op  new allocs/op  delta
WasmRego-16       266 ± 0%        65 ± 0%  -75.60%  (p=0.008 n=5+5)

@srenatus srenatus changed the title wasm: introduce one-off eval function, use it instead wasm: reduce number of VM calls for policy evaluation Jul 9, 2021
@srenatus srenatus force-pushed the sr/wasm/experiment-with-one-off-eval-function branch 4 times, most recently from ffdcefe to 30393e2 Compare July 9, 2021 16:30
@srenatus
Copy link
Contributor Author

Mishandled the heap ptr in vm.go, we'll keep accumulating inputs, only append them instead of overwriting. Will fix.

@srenatus srenatus force-pushed the sr/wasm/experiment-with-one-off-eval-function branch 2 times, most recently from e205155 to b397de9 Compare July 12, 2021 08:39
@srenatus srenatus marked this pull request as ready for review July 12, 2021 08:59
Copy link
Member

@tsandall tsandall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a few minor comments/thoughts. Looks like this provides a 50% speedup on trivial queries (e.g., opa bench -t wasm 'true')

| <span class="opa-keep-it-together">`addr opa_heap_ptr_get(void)`</span> | Get the current heap pointer. | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_add_path(base_value_addr, path_value_addr, value_addr)`</span> | Add the value at the `value_addr` into the object referenced by `base_value_addr` at the given path. The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Existing values will be updated. On success the value at `value_addr` is no longer owned by the caller, it will be freed with the base value. The path must be freed by the caller after use (see `opa_free`). If an error occurs the base value will remain unchanged. Example: base object `{"a": {"b": 123}}`, path `["a", "x", "y"]`, and value `{"foo": "bar"}` will yield `{"a": {"b": 123, "x": {"y": {"foo": "bar"}}}}`. Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_remove_path(base_value_addr, path_value_addr)`</span> | Remove the value from the object referenced by `base_value_addr` at the given path. Values removed will be freed. The path must be freed by the caller after use (see `opa_free`). The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`str_addr opa_eval(entrypoint_id, value_addr, str_addr, int32, addr, format)`</span> | One-off policy evaluation method. Its arguments are everything needed to evaluate: entrypoint, address of data in memory, address and length of input JSON string in memory, heap address to use, and the output format (`0` is "value", `1` is JSON). Returns the address to the serialised result value. | 1.2 |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: WDYT about tacking an extra param on that we could (ab)use in the future if needed. The signature would be...

str_addr opa_eval(addr, entrypoint_id, value_addr, str_addr, int32, addr, format)

Where the first addr could be ignored for now but reserved for future use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I was wondering about whether we should flip want_json around so that 0 specifies JSON formatting. On the one hand, most callers will only want/be able to deal with JSON output so making 0 specify JSON would cater to the masses. On the other hand, we could just recommend that people request Rego values for output; if the policy query generates a set when it's not supposed to, then the JSON parser in the app will catch it.

I'm fine either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, we could just recommend that people request Rego values for output; if the policy query generates a set when it's not supposed to, then the JSON parser in the app will catch it.

The result set will always be a set, so there's no happy path here, I think 🤔

I'm fine flipping json/rego though! 👍

internal/compiler/wasm/wasm.go Outdated Show resolved Hide resolved
# NOTE(sr): this is the last image emitting wasm modules with ABI 1.1
IMAGE := openpolicyagent/opa:0.30.1

bundle.tar.gz: t.rego
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see this hooked up anywhere. Do we want to keep this directory? Same for tmp.wasm above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not hooked up, but the generated file is used here: https://github.com/open-policy-agent/opa/pull/3627/files#diff-98755a90ab5e33f048eb9d54795984851c84b9697b16d9b18e5083753eee10ebR278

I thought it better to not commit generated data without the instructions on how to generate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ @tsandall just realised I've merged this without having resolved that thing. I'll do a cleanup PR if you think differently about generated files there! 😃

return majorVal.I32(), minorVal.I32(), nil
}
}
return 0, 0, fmt.Errorf("couldn't retrieve ABI version")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm unsure what error actually gets printed but this would seem to indicate an invalid wasm module got passed into the VM... I'm wondering if something like `invalid module: failed to read ABI version" would be more helpful.

// Eval performs an evaluation of the specified entrypoint, with any provided
// input, and returns the resulting value dumped to a string.
func (i *VM) Eval(ctx context.Context, entrypoint int32, input *interface{}, metrics metrics.Metrics, seed io.Reader, ns time.Time) ([]byte, error) {
if i.abiMinorVersion != int32(2) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we make this check for less than 1.2? I'm just thinking about insulating from issues when we end up with 1.3 for some unrelated reason...

@@ -21,7 +21,6 @@ func BenchmarkWasmRego(b *testing.B) {
policy := compileRegoToWasm("a = true", "data.p.a = x", false)
instance, _ := opa.New().
WithPolicyBytes(policy).
WithMemoryLimits(131070, 2*131070). // TODO: For some reason unlimited memory slows down the eval_ctx_new().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎊

@srenatus
Copy link
Contributor Author

@tsandall thanks for the feedback, if you could have another look, that would be great 🚀

tsandall
tsandall previously approved these changes Jul 14, 2021
| <span class="opa-keep-it-together">`addr opa_heap_ptr_get(void)`</span> | Get the current heap pointer. | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_add_path(base_value_addr, path_value_addr, value_addr)`</span> | Add the value at the `value_addr` into the object referenced by `base_value_addr` at the given path. The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Existing values will be updated. On success the value at `value_addr` is no longer owned by the caller, it will be freed with the base value. The path must be freed by the caller after use (see `opa_free`). If an error occurs the base value will remain unchanged. Example: base object `{"a": {"b": 123}}`, path `["a", "x", "y"]`, and value `{"foo": "bar"}` will yield `{"a": {"b": 123, "x": {"y": {"foo": "bar"}}}}`. Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`int32 opa_value_remove_path(base_value_addr, path_value_addr)`</span> | Remove the value from the object referenced by `base_value_addr` at the given path. Values removed will be freed. The path must be freed by the caller after use (see `opa_free`). The `path_value_addr` must point to an array value with string keys (eg: `["a", "b", "c"]`). Returns an error code (see below). | 1.0 |
| <span class="opa-keep-it-together">`str_addr opa_eval(addr, entrypoint_id, value_addr, str_addr, int32, addr, format)`</span> | One-off policy evaluation method. Its arguments are everything needed to evaluate: entrypoint, address of data in memory, address and length of input JSON string in memory, heap address to use, and the output format (`0` is JSON, `1` is "value", i.e. serialized Rego values). The first argument is reserved for future use (ignored for now). Returns the address to the serialised result value. | 1.2 |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: let's mention that callers should supply NULL for the first parameter for now. This way we'll be able to differentiate between set and unset in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update it. I think I'll have it say "pass 0", since there's no "native" concept of NULL in wasm... 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, pass 0 makes sense.

WASM_EXPORT(opa_eval)
char *opa_eval(void *, int entrypoint, opa_value *data, char *input, uint32_t input_len, uint32_t heap, bool want_value)
{
opa_heap_ptr_set(heap);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Check that the first parameter is NULL and return error in case?

@srenatus srenatus force-pushed the sr/wasm/experiment-with-one-off-eval-function branch from d6eef8e to 5dcaf1d Compare July 14, 2021 19:42
* wasm/sdk: check version, call old eval path for ABI 1.1

  Fixes open-policy-agent#3146.

* docs/wasm: document addition as ABI 1.2

* wasm-sdk: overwrite previous inputs, don't accumulate them

  There is a little room for optimization here, should the input
  ever grow so large that it eats up too much precious heap space,
  we could look into changing this so that the memory used for it
  can be reclaimed.

* internal/compiler/wasm: commit generated wasm

  I've noticed that since the CI build running on macos-latest doesn't
  have docker installed, it cannot update these files itself at build
  time. We thus end up with macos binaries that have the wasm binary
  data from the main branch, not the PR.

  This can be observed from the test failure:

      Run make ci-binary-smoke-test-wasm BINARY=opa_darwin_amd64
      chmod +x "_release/0.31.0-dev/opa_darwin_amd64"
      "_release/0.31.0-dev/opa_darwin_amd64" eval -t "wasm" 'time.now_ns()'
      make: *** [ci-binary-smoke-test-wasm] Error 2
      {
        "errors": [
          {
            "message": "caller not found: opa_eval (opa_eval)"
          }
        ]
      }
      Error: Process completed with exit code 2.

  Since I had previously commit the CSV data that drives the dead
  code elimination process, that optimization had failed to find a
  function it expected to have.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
@srenatus srenatus force-pushed the sr/wasm/experiment-with-one-off-eval-function branch from 5dcaf1d to e30b182 Compare July 15, 2021 07:15
@srenatus srenatus merged commit c0c3cd1 into open-policy-agent:main Jul 15, 2021
@tsandall
Copy link
Member

tsandall commented Jul 16, 2021 via email

@srenatus
Copy link
Contributor Author

@tsandall I'll clean them up, sure.

dolevf pushed a commit to dolevf/opa that referenced this pull request Nov 4, 2021
…ent#3627)

* wasm/sdk: check version, call old eval path for ABI 1.1

  Fixes open-policy-agent#3146.

* docs/wasm: document addition as ABI 1.2

* wasm-sdk: overwrite previous inputs, don't accumulate them

  There is a little room for optimization here, should the input
  ever grow so large that it eats up too much precious heap space,
  we could look into changing this so that the memory used for it
  can be reclaimed.

* internal/compiler/wasm: commit generated wasm

  I've noticed that since the CI build running on macos-latest doesn't
  have docker installed, it cannot update these files itself at build
  time. We thus end up with macos binaries that have the wasm binary
  data from the main branch, not the PR.

  This can be observed from the test failure:

      Run make ci-binary-smoke-test-wasm BINARY=opa_darwin_amd64
      chmod +x "_release/0.31.0-dev/opa_darwin_amd64"
      "_release/0.31.0-dev/opa_darwin_amd64" eval -t "wasm" 'time.now_ns()'
      make: *** [ci-binary-smoke-test-wasm] Error 2
      {
        "errors": [
          {
            "message": "caller not found: opa_eval (opa_eval)"
          }
        ]
      }
      Error: Process completed with exit code 2.

  Since I had previously commit the CSV data that drives the dead
  code elimination process, that optimization had failed to find a
  function it expected to have.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
Signed-off-by: Dolev Farhi <farhi.dolev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

internal wasm sdk: check ABI version
2 participants