
Wasmtime: refactor the pooling allocator for components #6835

Merged
merged 11 commits into bytecodealliance:main on Aug 18, 2023

Conversation

fitzgen
Member

@fitzgen fitzgen commented Aug 10, 2023

We used to have one index allocator, an index per instance, and give out N
tables and M memories to every instance regardless of how many tables and
memories it actually needed.

Now we have one index allocator for memories and another for tables. An
instance is no longer associated with a single index; instead, each of its
memories and tables has its own index. We allocate exactly as many tables and
memories as the instance actually needs.

Ultimately, this gives us better component support, where a component instance
might have varying numbers of internal tables and memories.

Additionally, you can now limit the number of tables, memories, and core
instances a single component can allocate from the pooling allocator, even if
there is the capacity for that many available. This is to give embedders tools
to limit individual component instances and prevent them from hogging too much
of the pooling allocator's resources.


TODO before landing:

  • Update RELEASES.md with a heads up about the config changes and give a small guide of how to migrate existing set ups
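
Conceptually, each of the new per-resource pools (one for memories, one for tables) hands out slot indices from a free list, so an instance claims only as many slots as it actually uses. The sketch below is illustrative only; the names (`SimpleIndexAllocator`, `alloc`, `dealloc`) are made up for this example and are not Wasmtime's actual internals.

```rust
// Minimal sketch of a per-resource index allocator. One of these would
// exist for the memory pool and another for the table pool.
struct SimpleIndexAllocator {
    free: Vec<u32>, // slot indices currently available
}

impl SimpleIndexAllocator {
    fn new(capacity: u32) -> Self {
        // All slots start out free.
        SimpleIndexAllocator { free: (0..capacity).rev().collect() }
    }

    /// Hand out a free slot index, or `None` if the pool is exhausted.
    fn alloc(&mut self) -> Option<u32> {
        self.free.pop()
    }

    /// Return a slot to the pool when its memory/table is deallocated.
    fn dealloc(&mut self, index: u32) {
        self.free.push(index);
    }
}

fn main() {
    let mut memories = SimpleIndexAllocator::new(2);
    let a = memories.alloc().unwrap();
    let _b = memories.alloc().unwrap();
    assert!(memories.alloc().is_none()); // pool exhausted
    memories.dealloc(a);
    assert!(memories.alloc().is_some()); // slot reusable after dealloc
    println!("ok");
}
```

Because memories and tables now draw from separate allocators, a module with one memory and ten tables no longer ties up ten memory slots it never uses.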

@fitzgen fitzgen requested a review from a team as a code owner August 10, 2023 22:54
@fitzgen fitzgen removed the request for review from a team August 10, 2023 22:55
/// The `MemoryAllocationIndex` was given from our `InstanceAllocator` and
/// must be given back to the instance allocator when deallocating each
/// memory.
memories: PrimaryMap<DefinedMemoryIndex, (MemoryAllocationIndex, Memory)>,
Member Author

Unsure whether this is better as-written, or if moving the MemoryAllocationIndex into wasmtime_runtime::Memory is better. Feel free to bike shed.
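
To illustrate the bikeshed above, here is a minimal sketch of the two representations, using simplified stand-in types (a plain `Vec` in place of `PrimaryMap`, toy `Memory` and `MemoryAllocationIndex` structs), not the real `wasmtime_runtime` definitions:

```rust
// Simplified stand-ins for the real types.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MemoryAllocationIndex(u32);

struct Memory {
    bytes: Vec<u8>,
}

// Option 1 (as written in this PR): the allocation index rides alongside
// the memory in the map, so deallocation can hand the index back to the
// pool. Every access to the memory itself goes through `.1`.
struct StoreOption1 {
    memories: Vec<(MemoryAllocationIndex, Memory)>,
}

// Option 2 (the alternative floated here): embed the index in the memory
// type itself, trading the tuple accesses for a named field.
struct MemoryWithIndex {
    alloc_index: MemoryAllocationIndex,
    bytes: Vec<u8>,
}

struct StoreOption2 {
    memories: Vec<MemoryWithIndex>,
}

fn main() {
    let s1 = StoreOption1 {
        memories: vec![(MemoryAllocationIndex(7), Memory { bytes: vec![0; 16] })],
    };
    assert_eq!(s1.memories[0].1.bytes.len(), 16); // tuple access
    assert_eq!(s1.memories[0].0, MemoryAllocationIndex(7));

    let s2 = StoreOption2 {
        memories: vec![MemoryWithIndex {
            alloc_index: MemoryAllocationIndex(7),
            bytes: vec![0; 16],
        }],
    };
    assert_eq!(s2.memories[0].alloc_index, MemoryAllocationIndex(7)); // named field
    println!("ok");
}
```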

@github-actions github-actions bot added the wasmtime:api and wasmtime:config labels Aug 11, 2023
@github-actions

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:config"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


@github-actions

github-actions bot commented Aug 11, 2023

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this checklist:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.


@fitzgen fitzgen requested review from a team as code owners August 11, 2023 19:09
@github-actions github-actions bot added the fuzzing and wasmtime:docs labels Aug 11, 2023
@github-actions

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "fuzzing", "wasmtime:docs"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


@fitzgen fitzgen force-pushed the refactor-pooling-allocator branch 2 times, most recently from a9587b8 to fcc174e Compare August 11, 2023 22:30
@github-actions github-actions bot added the wasi label Aug 11, 2023
@fitzgen fitzgen requested review from cfallin and removed request for alexcrichton August 14, 2023 23:13
@fitzgen
Member Author

fitzgen commented Aug 14, 2023

Alex is out of office till the end of the week; do you think you could take a look at this @cfallin?

@cfallin
Member

cfallin commented Aug 15, 2023

Alex is out of office till the end of the week; do you think you could take a look at this @cfallin?

I can possibly take a look, but I'm dealing with pretty bad wrist RSI right now and trying to learn to use my machine with voice dictation so it might take quite a lot of time. if it can wait until Alex is back maybe that's better...

@jameysharp
Contributor

The first two commits in this PR are tiny enough that I've just reviewed them and would be happy to sign off on them. Unfortunately the third commit is the interesting part and is a little more overwhelming, and I can't say much about it yet.

On a brief skim I can at least say that moving MemoryAllocationIndex into wasmtime_runtime::Memory like you suggest would remove a bunch of changes which just add .1 into various places. (Similarly for table allocations, I assume?) I don't know what other impact that would have so I'm not sure why you didn't go with that option to begin with.

The other thing that jumps out at me is that extracting MemoryPool/TablePool/StackPool to separate modules looks like it might be easy to split out as a separate PR to reduce the amount of churn in this commit.

@cfallin
Member

cfallin commented Aug 15, 2023

I'd be happy to do a live review over Zoom if that would help... I'm just awfully slow at typing right now!

Member

@cfallin cfallin left a comment

Together with the earlier Zoom review and associated comments, this overall looks great to me! High-quality implementation with good attention paid to safety (e.g. index newtypes). A few comments below as well but nothing too major.

crates/wasmtime/src/config.rs (resolved, outdated)
crates/wasmtime/src/config.rs (resolved)
crates/wasmtime/src/config.rs (resolved)
crates/wasmtime/src/store.rs (resolved, outdated)
@fitzgen
Member Author

fitzgen commented Aug 16, 2023

FWIW, 25 instantiation benchmarks "improved", while 9 "regressed". I think this is basically all within the noise.

sequential/default/data_segments.wat
                        time:   [16.207 µs 16.300 µs 16.396 µs]
                        change: [-14.937% -13.832% -12.750%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
sequential/pooling/data_segments.wat
                        time:   [4.1824 µs 4.2111 µs 4.2460 µs]
                        change: [-1.4282% +0.2191% +1.7369%] (p = 0.79 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

parallel/default/data_segments.wat: with 1 thread
                        time:   [16.374 µs 16.470 µs 16.579 µs]
                        change: [-12.172% -10.245% -8.4601%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
parallel/default/data_segments.wat: with 2 threads
                        time:   [22.580 µs 22.767 µs 22.995 µs]
                        change: [-9.2143% -7.3138% -4.8914%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe
parallel/default/data_segments.wat: with 3 threads
                        time:   [32.501 µs 32.825 µs 33.211 µs]
                        change: [-5.8662% -3.5161% -0.8508%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  5 (5.00%) high mild
  11 (11.00%) high severe
parallel/default/data_segments.wat: with 4 threads
                        time:   [55.409 µs 56.588 µs 57.990 µs]
                        change: [-6.6418% -3.2057% +0.3236%] (p = 0.08 > 0.05)
                        No change in performance detected.
parallel/pooling/data_segments.wat: with 1 thread
                        time:   [4.2061 µs 4.2405 µs 4.2806 µs]
                        change: [-0.5488% +1.1592% +2.8516%] (p = 0.17 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  9 (9.00%) high mild
  2 (2.00%) high severe
parallel/pooling/data_segments.wat: with 2 threads
                        time:   [4.9711 µs 5.0071 µs 5.0478 µs]
                        change: [+0.0367% +1.6271% +3.1874%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe
parallel/pooling/data_segments.wat: with 3 threads
                        time:   [5.5409 µs 5.6522 µs 5.8000 µs]
                        change: [+2.4451% +7.1764% +11.976%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  7 (7.00%) high mild
  8 (8.00%) high severe
parallel/pooling/data_segments.wat: with 4 threads
                        time:   [6.0691 µs 6.3499 µs 6.7057 µs]
                        change: [-1.7330% +7.2518% +17.575%] (p = 0.13 > 0.05)
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  20 (20.00%) high severe

deserialize/deserialize/data_segments.wat
                        time:   [33.431 µs 33.809 µs 34.249 µs]
                        change: [+1.4039% +3.1953% +5.1095%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  3 (3.00%) high mild
  9 (9.00%) high severe
deserialize/deserialize_file/data_segments.wat
                        time:   [31.674 µs 31.989 µs 32.413 µs]
                        change: [-1.6587% +0.5155% +3.0517%] (p = 0.66 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) high mild
  12 (12.00%) high severe

sequential/default/empty.wat
                        time:   [2.8884 µs 2.9012 µs 2.9153 µs]
                        change: [-13.844% -8.4599% -3.6245%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
sequential/pooling/empty.wat
                        time:   [2.9106 µs 2.9333 µs 2.9626 µs]
                        change: [-4.1436% -1.5437% +1.6409%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

parallel/default/empty.wat: with 1 thread
                        time:   [2.9121 µs 2.9299 µs 2.9504 µs]
                        change: [+1.0205% +2.5318% +4.2334%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
  10 (10.00%) high mild
  7 (7.00%) high severe
parallel/default/empty.wat: with 2 threads
                        time:   [3.2615 µs 3.3018 µs 3.3505 µs]
                        change: [+1.9848% +3.3641% +4.8919%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  14 (14.00%) high severe
parallel/default/empty.wat: with 3 threads
                        time:   [3.4042 µs 3.4264 µs 3.4529 µs]
                        change: [+1.3432% +3.2608% +5.7222%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe
parallel/default/empty.wat: with 4 threads
                        time:   [3.5997 µs 3.7497 µs 3.9238 µs]
                        change: [-2.5449% +3.1724% +8.8255%] (p = 0.29 > 0.05)
                        No change in performance detected.
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) high mild
  17 (17.00%) high severe
parallel/pooling/empty.wat: with 1 thread
                        time:   [2.9411 µs 2.9568 µs 2.9740 µs]
                        change: [-6.8359% -4.3504% -2.0078%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
parallel/pooling/empty.wat: with 2 threads
                        time:   [3.2878 µs 3.3114 µs 3.3471 µs]
                        change: [-12.572% -10.707% -9.0176%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
parallel/pooling/empty.wat: with 3 threads
                        time:   [3.4820 µs 3.5166 µs 3.5603 µs]
                        change: [-11.531% -8.9122% -6.6623%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
parallel/pooling/empty.wat: with 4 threads
                        time:   [3.6409 µs 3.7537 µs 3.8965 µs]
                        change: [-16.527% -11.957% -6.6742%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
  6 (6.00%) high mild
  13 (13.00%) high severe

deserialize/deserialize/empty.wat
                        time:   [30.528 µs 30.743 µs 31.004 µs]
                        change: [-2.9185% -0.4947% +1.5731%] (p = 0.68 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
deserialize/deserialize_file/empty.wat
                        time:   [31.130 µs 31.350 µs 31.596 µs]
                        change: [-1.4528% -0.2984% +0.7568%] (p = 0.62 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

sequential/default/spidermonkey.wasm
                        time:   [17.126 µs 17.343 µs 17.598 µs]
                        change: [-24.108% -23.194% -22.072%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
sequential/pooling/spidermonkey.wasm
                        time:   [5.6331 µs 5.6638 µs 5.6966 µs]
                        change: [-4.0009% -2.8480% -1.6732%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

parallel/default/spidermonkey.wasm: with 1 thread
                        time:   [17.304 µs 17.442 µs 17.610 µs]
                        change: [-23.116% -20.805% -18.284%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe
parallel/default/spidermonkey.wasm: with 2 threads
                        time:   [31.028 µs 31.269 µs 31.553 µs]
                        change: [-1.1847% +0.3038% +1.7367%] (p = 0.68 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
parallel/default/spidermonkey.wasm: with 3 threads
                        time:   [39.188 µs 39.735 µs 40.395 µs]
                        change: [-14.835% -12.205% -9.7141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
  8 (8.00%) high mild
  8 (8.00%) high severe
parallel/default/spidermonkey.wasm: with 4 threads
                        time:   [59.137 µs 60.393 µs 61.785 µs]
                        change: [-10.352% -6.7148% -2.9688%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
parallel/pooling/spidermonkey.wasm: with 1 thread
                        time:   [5.7202 µs 5.7692 µs 5.8275 µs]
                        change: [-5.0476% -2.2400% +0.5254%] (p = 0.13 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  10 (10.00%) high mild
  3 (3.00%) high severe
parallel/pooling/spidermonkey.wasm: with 2 threads
                        time:   [6.9303 µs 6.9849 µs 7.0466 µs]
                        change: [-0.1129% +4.3773% +9.6818%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
parallel/pooling/spidermonkey.wasm: with 3 threads
                        time:   [8.0897 µs 8.1976 µs 8.3418 µs]
                        change: [-1.6239% +3.7225% +9.3172%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high severe
parallel/pooling/spidermonkey.wasm: with 4 threads
                        time:   [9.6367 µs 10.057 µs 10.593 µs]
                        change: [-4.8839% +3.2482% +12.152%] (p = 0.44 > 0.05)
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  5 (5.00%) high mild
  15 (15.00%) high severe

deserialize/deserialize/spidermonkey.wasm
                        time:   [11.630 ms 11.684 ms 11.743 ms]
                        change: [-3.1513% -2.3446% -1.5742%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild
Benchmarking deserialize/deserialize_file/spidermonkey.wasm: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
deserialize/deserialize_file/spidermonkey.wasm
                        time:   [1.1071 ms 1.1189 ms 1.1354 ms]
                        change: [-1.2033% +1.1429% +3.8242%] (p = 0.38 > 0.05)
                        No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) high mild
  13 (13.00%) high severe

sequential/default/small_memory.wat
                        time:   [11.358 µs 11.437 µs 11.523 µs]
                        change: [-39.079% -38.266% -37.471%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
sequential/pooling/small_memory.wat
                        time:   [3.9798 µs 4.0016 µs 4.0261 µs]
                        change: [+1.0485% +2.5775% +4.5576%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

parallel/default/small_memory.wat: with 1 thread
                        time:   [11.213 µs 11.275 µs 11.353 µs]
                        change: [-32.990% -32.172% -31.378%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
parallel/default/small_memory.wat: with 2 threads
                        time:   [20.071 µs 20.145 µs 20.225 µs]
                        change: [-12.427% -11.448% -10.464%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
parallel/default/small_memory.wat: with 3 threads
                        time:   [25.655 µs 25.978 µs 26.414 µs]
                        change: [-15.242% -12.123% -9.2153%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe
parallel/default/small_memory.wat: with 4 threads
                        time:   [37.618 µs 38.369 µs 39.122 µs]
                        change: [-15.044% -12.768% -10.341%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
parallel/pooling/small_memory.wat: with 1 thread
                        time:   [3.9885 µs 4.0232 µs 4.0678 µs]
                        change: [+0.6034% +2.4216% +4.0956%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  5 (5.00%) high mild
  9 (9.00%) high severe
parallel/pooling/small_memory.wat: with 2 threads
                        time:   [4.7548 µs 4.7850 µs 4.8202 µs]
                        change: [-0.2723% +0.8170% +1.7389%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
parallel/pooling/small_memory.wat: with 3 threads
                        time:   [4.9112 µs 4.9459 µs 4.9913 µs]
                        change: [-2.0811% +1.1116% +4.6606%] (p = 0.54 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe
parallel/pooling/small_memory.wat: with 4 threads
                        time:   [5.4325 µs 5.7838 µs 6.2023 µs]
                        change: [+1.8127% +10.297% +19.397%] (p = 0.02 < 0.05)
                        Performance has regressed.
Found 21 outliers among 100 measurements (21.00%)
  1 (1.00%) high mild
  20 (20.00%) high severe

deserialize/deserialize/small_memory.wat
                        time:   [30.289 µs 30.493 µs 30.719 µs]
                        change: [-1.3081% +1.0727% +3.6305%] (p = 0.42 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
deserialize/deserialize_file/small_memory.wat
                        time:   [31.315 µs 31.534 µs 31.788 µs]
                        change: [-0.9317% +0.3888% +1.8105%] (p = 0.60 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe

sequential/default/wasi.wasm
                        time:   [16.558 µs 16.649 µs 16.750 µs]
                        change: [-25.611% -24.753% -23.899%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
sequential/pooling/wasi.wasm
                        time:   [5.4302 µs 5.4648 µs 5.5051 µs]
                        change: [-1.5963% -0.4439% +0.6725%] (p = 0.45 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

parallel/default/wasi.wasm: with 1 thread
                        time:   [16.963 µs 17.099 µs 17.247 µs]
                        change: [-22.726% -21.819% -20.838%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
parallel/default/wasi.wasm: with 2 threads
                        time:   [29.800 µs 30.070 µs 30.396 µs]
                        change: [-8.0261% -6.9323% -5.8986%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
parallel/default/wasi.wasm: with 3 threads
                        time:   [41.575 µs 42.025 µs 42.543 µs]
                        change: [-14.413% -12.331% -10.346%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe
parallel/default/wasi.wasm: with 4 threads
                        time:   [62.538 µs 63.781 µs 65.096 µs]
                        change: [-12.456% -9.5987% -6.6781%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
parallel/pooling/wasi.wasm: with 1 thread
                        time:   [5.4351 µs 5.4636 µs 5.4958 µs]
                        change: [-4.5508% -3.2820% -2.1305%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe
parallel/pooling/wasi.wasm: with 2 threads
                        time:   [6.6933 µs 6.7343 µs 6.7819 µs]
                        change: [+4.2012% +6.7091% +9.2872%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  7 (7.00%) high mild
  9 (9.00%) high severe
parallel/pooling/wasi.wasm: with 3 threads
                        time:   [7.8589 µs 7.9351 µs 8.0393 µs]
                        change: [+2.0956% +3.7618% +5.3506%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
parallel/pooling/wasi.wasm: with 4 threads
                        time:   [9.5130 µs 9.8876 µs 10.356 µs]
                        change: [-6.2590% +0.6151% +7.5760%] (p = 0.86 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  3 (3.00%) high mild
  13 (13.00%) high severe

deserialize/deserialize/wasi.wasm
                        time:   [204.86 µs 206.12 µs 207.50 µs]
                        change: [-0.8695% +1.3453% +3.7360%] (p = 0.25 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
deserialize/deserialize_file/wasi.wasm
                        time:   [80.138 µs 80.533 µs 80.982 µs]
                        change: [-4.6080% -1.8013% +1.1720%] (p = 0.24 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

@fitzgen fitzgen added this pull request to the merge queue Aug 17, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 17, 2023
@fitzgen fitzgen added this pull request to the merge queue Aug 17, 2023
The exact `cfg`s that unlock the tests that use these are platform- and feature-
dependent and end up being about five conditions long. It's simpler to just
allow unused for when we are testing on other platforms or don't have the
compile-time features enabled.
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 17, 2023
@fitzgen fitzgen enabled auto-merge August 17, 2023 19:19
@fitzgen fitzgen added this pull request to the merge queue Aug 17, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 17, 2023
Also fix a couple scenarios where we could leak indices if allocating an index
for a memory/table succeeded but then creating the memory/table itself failed.
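
The index-leak fix described in that commit message follows a standard pattern: if constructing the resource fails after its index was already claimed, return the index to the allocator before propagating the error. A hedged sketch with hypothetical names (not the actual Wasmtime code):

```rust
// Sketch of the leak fix: if creating the memory fails after an index was
// claimed, the index must be given back before returning the error, or
// that pool slot leaks forever.
struct IndexAllocator {
    free: Vec<u32>,
}

impl IndexAllocator {
    fn alloc(&mut self) -> Option<u32> {
        self.free.pop()
    }
    fn dealloc(&mut self, index: u32) {
        self.free.push(index);
    }
}

struct Memory;

// Stand-in for actually creating/mapping a memory; `fail` simulates an
// OS-level failure such as mmap returning an error.
fn create_memory(fail: bool) -> Result<Memory, String> {
    if fail { Err("mmap failed".to_string()) } else { Ok(Memory) }
}

fn allocate(alloc: &mut IndexAllocator, fail: bool) -> Result<(u32, Memory), String> {
    let index = alloc.alloc().ok_or("pool exhausted")?;
    match create_memory(fail) {
        Ok(memory) => Ok((index, memory)),
        Err(e) => {
            // The fix: give the index back instead of leaking it.
            alloc.dealloc(index);
            Err(e)
        }
    }
}

fn main() {
    let mut a = IndexAllocator { free: vec![0] };
    assert!(allocate(&mut a, true).is_err());
    // Because the error path deallocated, the slot is still usable.
    assert!(allocate(&mut a, false).is_ok());
    println!("ok");
}
```

The debug assertion that the pool is empty on drop (also mentioned above) is what catches this class of bug: any leaked index shows up as a non-empty free-list deficit when the pool is torn down.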
@fitzgen fitzgen enabled auto-merge August 18, 2023 19:52
@fitzgen fitzgen added this pull request to the merge queue Aug 18, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 18, 2023
@fitzgen fitzgen enabled auto-merge August 18, 2023 21:01
@fitzgen fitzgen added this pull request to the merge queue Aug 18, 2023
Merged via the queue into bytecodealliance:main with commit a34427a Aug 18, 2023
18 checks passed
@fitzgen fitzgen deleted the refactor-pooling-allocator branch August 18, 2023 22:04
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Aug 19, 2023
…ance#6835)

* Wasmtime: Rename `IndexAllocator` to `ModuleAffinityIndexAllocator`

We will have multiple kinds of index allocators soon, so clarify which one this
is.

* Wasmtime: Introduce a simple index allocator

This will be used in future commits refactoring the pooling allocator.

* Wasmtime: refactor the pooling allocator for components

We used to have one index allocator, an index per instance, and give out N
tables and M memories to every instance regardless of how many tables and
memories it actually needed.

Now we have one index allocator for memories and another for tables. An
instance is no longer associated with a single index; instead, each of its
memories and tables has its own index. We allocate exactly as many tables and
memories as the instance actually needs.

Ultimately, this gives us better component support, where a component instance
might have varying numbers of internal tables and memories.

Additionally, you can now limit the number of tables, memories, and core
instances a single component can allocate from the pooling allocator, even if
there is the capacity for that many available. This is to give embedders tools
to limit individual component instances and prevent them from hogging too much
of the pooling allocator's resources.
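To make the shape of this change concrete, here is a minimal sketch of a "simple" index allocator: a free list of slot indices, with one allocator per resource kind (memories, tables) rather than one index per instance. The type and method names are illustrative, not Wasmtime's actual internals.

```rust
// A free-list index allocator. The pooling allocator would hold one of
// these per resource kind (one for memories, one for tables).
struct SimpleIndexAllocator {
    free: Vec<u32>,
}

impl SimpleIndexAllocator {
    fn new(capacity: u32) -> Self {
        // Reversed so that `pop` hands out the lowest indices first.
        Self { free: (0..capacity).rev().collect() }
    }

    /// Hand out a free slot index, or `None` if the pool is exhausted.
    fn alloc(&mut self) -> Option<u32> {
        self.free.pop()
    }

    /// Return a slot index to the pool when its resource is deallocated.
    fn dealloc(&mut self, index: u32) {
        debug_assert!(!self.free.contains(&index), "double free of slot index");
        self.free.push(index);
    }
}

fn main() {
    // One allocator per resource kind, instead of one index per instance.
    let mut memories = SimpleIndexAllocator::new(2);
    let mut tables = SimpleIndexAllocator::new(2);
    let m0 = memories.alloc().unwrap();
    let t0 = tables.alloc().unwrap();
    println!("memory slot {m0}, table slot {t0}");
    memories.dealloc(m0);
    tables.dealloc(t0);
}
```

With this split, a component instance that needs three memories and one table takes three memory slots and one table slot, instead of reserving the per-instance maximum of both.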

* Remove unused file

Messed up from rebasing, this code is actually just inline in the index
allocator module.

* Address review feedback

* Fix benchmarks build

* Fix ignoring test under miri

The `async_functions` module is not even compiled-but-ignored with miri; it is
completely `cfg`ed off. Therefore we have to do the same with this test that
imports stuff from that module.

* Fix doc links

* Allow testing utilities to be unused

The exact `cfg`s that unlock the tests that use these utilities are platform-
and feature-dependent and end up being something like five conditions, which is
super long. It's simpler to just allow unused when we are testing on other
platforms or don't have the compile-time features enabled.

* Debug assert that the pool is empty on drop, per Alex's suggestion

Also fix a couple scenarios where we could leak indices if allocating an index
for a memory/table succeeded but then creating the memory/table itself failed.

* Fix windows compile errors
geekbeast pushed a commit to geekbeast/wasmtime that referenced this pull request Aug 21, 2023
…eature/kserve

* 'feature/kserve' of github.com:geekbeast/wasmtime:
  Refactor Wasmtime CLI to support components (bytecodealliance#6836)
  Bump the wasm-tools family of crates (bytecodealliance#6861)
  Wasmtime: refactor the pooling allocator for components (bytecodealliance#6835)
@alexcrichton alexcrichton left a comment
This all looks great to me, thanks again for tackling this!

/// The `TableAllocationIndex` was given from our `InstanceAllocator` and
/// must be given back to the instance allocator when deallocating each
/// table.
tables: PrimaryMap<DefinedTableIndex, (TableAllocationIndex, Table)>,
I mentioned this in person as well, but would it be possible to eschew this index (and the one above) and infer the index from an address in the pooling allocator?
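The suggestion above relies on a property of pooling allocators: all slots live in one contiguous region with a fixed stride, so a slot index can be recomputed from an address instead of being stored alongside each table or memory. A hedged sketch of that arithmetic, with hypothetical `base` and `slot_size` parameters:

```rust
// Recover a slot index from an address inside a pooled region where slot
// `i` starts at `base + i * slot_size`. Purely illustrative; Wasmtime's
// real pool layout and types differ.
fn index_from_addr(base: usize, slot_size: usize, addr: usize) -> usize {
    debug_assert!(addr >= base, "address below the pool's base");
    debug_assert_eq!((addr - base) % slot_size, 0, "address not slot-aligned");
    (addr - base) / slot_size
}

fn main() {
    let base = 0x1000;
    let slot_size = 0x10000; // 64 KiB per slot, for illustration
    let addr = base + 3 * slot_size;
    assert_eq!(index_from_addr(base, slot_size, addr), 3);
}
```

The trade-off is that the address-to-index mapping only works for addresses actually inside the pool, so any path that can see a non-pooled table or memory still needs another way to tell the two apart.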

// Every `InstanceAllocatorImpl` is an `InstanceAllocator` when used
// correctly. Also, no one is allowed to override this trait's methods, they
// must use the defaults. This blanket impl provides both of those things.
impl<T: InstanceAllocatorImpl> InstanceAllocator for T {}
Would it be possible to keep this as one trait? Inferring why this was split into two, it seems like it wants to guarantee that the default implementations of methods are used, but this is purely internal and it's already an unsafe trait, so I think that should be enough to cover the bases? (I don't think we're at risk of duplicating these default trait method implementations anywhere)
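For readers unfamiliar with the pattern being discussed: splitting into two traits with a blanket impl means the "impl" trait holds the overridable hooks, while the public trait's default methods can never be overridden, because the blanket impl is the only impl of it. A minimal sketch with illustrative names (not Wasmtime's actual traits):

```rust
// The inner trait: the only thing implementors are allowed to customize.
trait AllocatorImpl {
    fn raw_alloc(&self) -> u32;
}

// The public trait: default methods that callers use. No type implements
// this directly, so the defaults cannot be overridden.
trait Allocator: AllocatorImpl {
    fn alloc_checked(&self) -> Option<u32> {
        let n = self.raw_alloc();
        if n == u32::MAX { None } else { Some(n) }
    }
}

// Every `AllocatorImpl` is an `Allocator`, using exactly the defaults.
impl<T: AllocatorImpl> Allocator for T {}

struct Pool;
impl AllocatorImpl for Pool {
    fn raw_alloc(&self) -> u32 {
        7
    }
}

fn main() {
    assert_eq!(Pool.alloc_checked(), Some(7));
}
```

The review comment's point is that since the trait is internal and already `unsafe`, a single trait with default methods would likely provide the same practical guarantee with less machinery.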

let table = mem::take(table);
assert!(table.is_static());
fn decrement_core_instance_count(&self) {
self.live_core_instances.fetch_sub(1, Ordering::AcqRel);
Mind throwing in a debug assert here that the return value is not 0? (e.g. this never goes negative)

(and the other decrement methods too)
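The suggested assertion works because `fetch_sub` returns the counter's previous value, so checking that it is non-zero catches the counter wrapping "negative" in debug builds. A hedged sketch with an illustrative struct (field names match the snippet above, but the surrounding type is invented):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Counters {
    live_core_instances: AtomicUsize,
}

impl Counters {
    fn decrement_core_instance_count(&self) {
        // `fetch_sub` returns the value *before* the subtraction, so a
        // previous value of 0 means the counter just wrapped around.
        let prev = self.live_core_instances.fetch_sub(1, Ordering::AcqRel);
        debug_assert!(prev != 0, "live core instance count underflow");
    }
}

fn main() {
    let c = Counters { live_core_instances: AtomicUsize::new(1) };
    c.decrement_core_instance_count();
    assert_eq!(c.live_core_instances.load(Ordering::Acquire), 0);
}
```

The same check applies to each of the other decrement methods (memories, tables), since every counter shares the increment/decrement pairing.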

alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Aug 31, 2023
This commit addresses some more fallout from bytecodealliance#6835 by updating some
error messages and adding clauses for new conditions. Namely:

* Module compilation is now allowed to fail when the module may have
  more memories/tables than the pooling allocator allows per-module.
* The error message for the core instance limit being reached has been
  updated.
github-merge-queue bot pushed a commit that referenced this pull request Aug 31, 2023
* Fix some warnings on nightly Rust

* Fix some more fuzz-test cases from pooling changes

This commit addresses some more fallout from #6835 by updating some
error messages and adding clauses for new conditions. Namely:

* Module compilation is now allowed to fail when the module may have
  more memories/tables than the pooling allocator allows per-module.
* The error message for the core instance limit being reached has been
  updated.
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Sep 6, 2023
…6943)

* Fix some warnings on nightly Rust

* Fix some more fuzz-test cases from pooling changes

This commit addresses some more fallout from bytecodealliance#6835 by updating some
error messages and adding clauses for new conditions. Namely:

* Module compilation is now allowed to fail when the module may have
  more memories/tables than the pooling allocator allows per-module.
* The error message for the core instance limit being reached has been
  updated.
Labels
fuzzing Issues related to our fuzzing infrastructure wasi Issues pertaining to WASI wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:config Issues related to the configuration of Wasmtime wasmtime:docs Issues related to Wasmtime's documentation
4 participants