Add shared memories #4187
Conversation
Force-pushed from d58ae3e to d74a422
Force-pushed from d74a422 to e9c4f5f
Label bot (cc @fitzgen, @peterhuene): this pull request has been labeled "cranelift", "cranelift:module", "wasmtime:api", "wasmtime:ref-types".
Force-pushed from f2ffbee to 5b765d5
I think this is some good progress! I realize this might already be getting to be somewhat of an unwieldy PR, so if you'd like, I think various things below can also be deferred to later. If you'd prefer to do that, I think it would be good to make an issue with the remaining work items and to make sure everything is reflected in it before landing this.

One other issue not mentioned below, though, that I want to be sure not to forget: all of Wasmtime's interactions with shared memories through intrinsics like `memory.init` are going to need to be updated for shared memories. I think in general Wasmtime needs to be very carefully audited for the rampant usage of `VMMemoryDefinition` and have that all locked down through some different abstraction in the case of shared or non-shared memory. I think a static distinction at the top level with `Memory` and `SharedMemory` will help with this (I commented below, but I think we may actually want to prevent a `SharedMemory` from being viewed as a `Memory`), but we'll still want to audit internals for implementations like `memory.init`, `memory.fill`, `memory.copy`, etc.
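To illustrate the kind of static distinction being suggested, here is a minimal, hypothetical Rust sketch (the `RuntimeMemory` enum and its fields are invented for illustration, not Wasmtime's actual types): internals that must match on the variant cannot accidentally treat a shared memory as an owned `VMMemoryDefinition`.

```rust
/// Simplified stand-in for the per-memory runtime metadata.
#[derive(Debug, Clone, Copy)]
struct VMMemoryDefinition {
    base: usize,
    current_length: usize,
}

/// Shared across threads; in a real implementation access would go through
/// an `Arc` plus synchronization, which this sketch elides.
struct SharedMemory {
    def: VMMemoryDefinition,
}

/// A top-level static distinction: code paths like `memory.fill` must
/// handle the shared case explicitly instead of reading raw fields.
enum RuntimeMemory {
    Owned(VMMemoryDefinition),
    Shared(SharedMemory),
}

impl RuntimeMemory {
    fn current_length(&self) -> usize {
        match self {
            // Owned by one instance: a plain read is fine.
            RuntimeMemory::Owned(def) => def.current_length,
            // Shared: the match forces callers to acknowledge this case.
            RuntimeMemory::Shared(s) => s.def.current_length,
        }
    }
}

fn main() {
    let owned = RuntimeMemory::Owned(VMMemoryDefinition { base: 0, current_length: 65536 });
    let shared = RuntimeMemory::Shared(SharedMemory {
        def: VMMemoryDefinition { base: 0, current_length: 131072 },
    });
    println!("{} {}", owned.current_length(), shared.current_length());
}
```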
Force-pushed from c6410bc to ec08f2c
Label bot (cc @peterhuene): this pull request has been labeled "wasmtime:c-api".
Force-pushed from 5aa957b to 15e23fb
These are some pieces I saw going through, but I think I'll want to do one more final pass this week before landing. Overall I think this is looking quite good and is basically ready to land; I've mostly just got some little things here and there.
```diff
 /// Return the indexed `VMMemoryDefinition`.
 fn memory(&self, index: DefinedMemoryIndex) -> VMMemoryDefinition {
-    unsafe { *self.memory_ptr(index) }
+    unsafe { VMMemoryDefinition::load(self.memory_ptr(index)) }
 }

 /// Set the indexed memory to `VMMemoryDefinition`.
```
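As a rough illustration of why a dedicated `load` replaces the plain dereference in the diff above, here is a hedged sketch (types and fields are simplified and hypothetical): the length field is read with an atomic load so that a concurrent `memory.grow` on another thread does not race with a non-atomic struct copy.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Simplified stand-in: the length may be concurrently updated by another
/// thread growing a shared memory, so it is stored atomically.
struct VMMemoryDefinition {
    base: *mut u8,
    current_length: AtomicUsize,
}

/// A by-value snapshot of the definition, safe to hand around.
struct VMMemoryDefinitionSnapshot {
    base: *mut u8,
    current_length: usize,
}

impl VMMemoryDefinition {
    /// Read each field individually; the length uses an atomic load so the
    /// read is well-defined even while another thread is growing the memory.
    unsafe fn load(ptr: *const VMMemoryDefinition) -> VMMemoryDefinitionSnapshot {
        let def = &*ptr;
        VMMemoryDefinitionSnapshot {
            base: def.base,
            current_length: def.current_length.load(Ordering::SeqCst),
        }
    }
}

fn main() {
    let def = VMMemoryDefinition {
        base: std::ptr::null_mut(),
        current_length: AtomicUsize::new(65536),
    };
    let snap = unsafe { VMMemoryDefinition::load(&def) };
    println!("{}", snap.current_length);
}
```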
Reading over this again, could this function perhaps be changed to take an `OwnedMemoryIndex`, since those are the only ones we should ever set?
I don't see it. `Instance::memory` is called by `Instance::get_memory`, which is used by `memory_copy`, `memory_fill`, etc. That functionality should still be present for shared memory, I think.
Sorry, I couldn't place this comment well because the PR doesn't actually cover the relevant line of code; I mean the `set_memory` function just below this comment.
Ok, see my comment here: it's tricky to re-calculate the owned index for every `memory.grow`.
I made a general pass over this and added a few questions below. I'll defer to @alexcrichton for final approval, but I think addressing some of the things below would help clarify and boost confidence a bit. Thanks!
crates/runtime/src/vmcontext.rs (outdated)
```rust
/// Return the current length of the [`VMMemoryDefinition`] by performing a
/// relaxed load. Concurrent growth of shared memory may alter the value of
/// this field in unexpected ways: do not use this function for situations
/// in which a precise length is needed.
```
This doc-comment seems a little unclear to me: if the result isn't exactly precise at all times, how might it differ? Is it always smaller or equal (because memory can only grow, but the value may be outdated)? Or something else? It seems we need a stronger statement here to rely on it at all.

I also wonder if this is necessary vs. a variant that uses `Ordering::Acquire` and makes a stronger claim? Or, in other words, can we justify why a "fast and imprecise" variant is needed?
Actually, that raises another question for me about atomic orderings, which sort of furthers my concerns about using `ptr::copy` for `memory.copy` and friends. If one thread witnesses that the shared memory has a particular size, I don't know what the guarantees are about witnessing other parts of memory in terms of memory orderings. For example, should this be `Acquire`? Should this be `SeqCst`? Do wasm modules otherwise need to `atomic.fence` before calling host functionality, or something like that?

I personally know very little about the atomic memory model for wasm and how it's all intended to work out, so there may not be a great answer here. That being said, it's almost certainly not wrong to fall back to `SeqCst`, so I might say we should do that to be maximally conservative?
I believe the reason @alexcrichton suggested this relaxed function was for all the cases where the length is retrieved but no synchronization is necessary (i.e., anything related to owned memory). I could certainly make this `SeqCst`, but that doesn't totally make sense for all the owned memory cases. Since, as @cfallin mentioned, `current_length()` can only differ by appearing smaller than what another thread sees, I will update the documentation comment accordingly. This function should be precise for any owned memories (regardless of what ordering is used) and will be safely imprecise, but under-estimating, for shared memories (at least until the spec changes to allow memory shrinking in some way).
One thing to consider here is the effect on the synchronization of this precise field, the current length, but there are also the effects of whether we can see the rest of the concurrent memory modifications or such. For example, if we read the current length at a particular value, what does that mean for reading other pieces of memory?

I'm not sure what the answer is, but I'm pretty sure it's affected by the memory orderings here. I don't know what the desired memory orderings are, so I would personally use `SeqCst` everywhere because I know it's not wrong. Then again, `SeqCst` can also be buggy because it provides too strong of an ordering, which means we can't relax it in the future when it turns out we're not supposed to have such a strong ordering.

Anyway, these are also theoretical concerns of mine and don't necessarily need any changes at this time.
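The trade-off being discussed can be sketched as follows (the `MemoryLength` wrapper is hypothetical; the real field lives inside `VMMemoryDefinition`): both orderings return the same value in a single-threaded test, but only the `SeqCst` variant participates in a total order with other `SeqCst` operations, while `Relaxed` guarantees only atomicity of the load itself and establishes no happens-before edge with a growing thread's other writes.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the shared memory's length field.
struct MemoryLength(AtomicUsize);

impl MemoryLength {
    /// Fast, imprecise read: may under-report for shared memories (growth
    /// only, so it never over-reports) and synchronizes nothing else.
    fn byte_size_relaxed(&self) -> usize {
        self.0.load(Ordering::Relaxed)
    }

    /// Conservative read: `SeqCst` is "not wrong", at the cost of making it
    /// harder to relax later if a weaker ordering turns out to suffice.
    fn byte_size_seqcst(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let len = MemoryLength(AtomicUsize::new(65536));
    println!("{} {}", len.byte_size_relaxed(), len.byte_size_seqcst());
}
```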
```rust
// VMMemoryDefinition` to it and dereference that when
// atomically growing it.
let from_offset = self.offsets.vmctx_vmmemory_pointer(def_index);
let memory = func.create_global_value(ir::GlobalValueData::Load {
```
I think in the sense of "make sure we don't feed `current_length` to the heap-access machinery in Cranelift" safety, we could maybe make `current_length_offset` an `Option` here and return `None` for the shared case. Then, below, wherever we use it, unwrap it with "expected current-length field (always present on non-shared memories)" or something of the sort as an `expect` error?
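A minimal sketch of the suggested shape (the function signature and the offset value are made up for illustration): returning `None` for shared memories makes any code path that would feed the length offset to the heap-access machinery fail loudly instead of silently miscompiling.

```rust
/// Hypothetical: `None` for shared memories, so downstream code that
/// assumes a current-length field must unwrap explicitly.
fn current_length_offset(shared: bool) -> Option<u32> {
    if shared {
        None
    } else {
        Some(8) // 8 is a made-up field offset, for illustration only
    }
}

fn main() {
    // Non-shared case: the unwrap documents the invariant as an error string.
    let offset = current_length_offset(false)
        .expect("expected current-length field (always present on non-shared memories)");
    println!("{offset}");
}
```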
This change adds the ability to use shared memories in Wasmtime when the [threads proposal] is enabled. Shared memories are annotated as `shared` in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

In order to implement this in Wasmtime, there are two main cases to cover:

- a program may simply create a shared memory and possibly export it; this means that Wasmtime itself must be able to create shared memories
- a user may create a shared memory externally and pass it in as an import during instantiation; this is the case when the program contains code like `(import "env" "memory" (memory 1 1 shared))`; this case is handled by a new Wasmtime API type, `SharedMemory`

Because of the first case, this change allows any of the current memory-creation mechanisms to work as-is. Wasmtime can still create either static or dynamic memories in either on-demand or pooling modes, and any of these memories can be considered shared. When shared, the `Memory` runtime container will lock appropriately during `memory.size` and `memory.grow` operations; since all memories use this container, it is an ideal place for implementing the locking once and once only.

The second case is covered by the new `SharedMemory` structure. It uses the same `Mmap` allocation under the hood as non-shared memories, but allows the user to perform the allocation externally to Wasmtime and share the memory across threads (via an `Arc`). The pointer address to the actual memory is carefully wired through and owned by the `SharedMemory` structure itself.

This means that there are differing views of where to access the pointer (i.e., `VMMemoryDefinition`): for owned memories (the default), the `VMMemoryDefinition` is stored directly by the `VMContext`; in the `SharedMemory` case, however, the `VMContext` must point to this separate structure.

To ensure that the `VMContext` can always point to the correct `VMMemoryDefinition`, this change alters the `VMContext` structure. Since a `SharedMemory` owns its own `VMMemoryDefinition`, the `defined_memories` table in the `VMContext` becomes a sequence of pointers: in the shared memory case, they point to the `VMMemoryDefinition` owned by the `SharedMemory`, and in the owned memory case (i.e., not shared) they point to `VMMemoryDefinition`s stored in a new table, `owned_memories`.

This change adds an additional indirection (through the `*mut VMMemoryDefinition` pointer) that could add overhead. Using an imported memory as a proxy, we measured a 1-3% overhead of this approach on the `pulldown-cmark` benchmark. To avoid this, Cranelift-generated code will special-case the owned memory access (i.e., load a pointer directly to the `owned_memories` entry) for `memory.size` so that only shared memories (and imported memories, as before) incur the indirection cost.
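The layout change described in the commit message can be sketched roughly as follows (`VMContextSketch` and its fields are a hypothetical simplification, not Wasmtime's real `VMContext`): every defined memory is reached through a pointer, and owned memories store their actual definitions in a separate table.

```rust
/// Simplified stand-in for the per-memory runtime metadata.
struct VMMemoryDefinition {
    base: *mut u8,
    current_length: usize,
}

struct VMContextSketch {
    /// One pointer per defined memory. For shared memories this points at
    /// the definition owned by the `SharedMemory` structure; for owned
    /// memories it points into `owned_memories`. This extra hop is the
    /// indirection the commit message measures at 1-3% overhead.
    defined_memories: Vec<*mut VMMemoryDefinition>,
    /// Storage for the definitions of non-shared (owned) memories.
    owned_memories: Vec<Box<VMMemoryDefinition>>,
}

fn main() {
    let mut ctx = VMContextSketch {
        defined_memories: Vec::new(),
        owned_memories: Vec::new(),
    };
    // Register one owned memory: the definition lives in `owned_memories`,
    // and `defined_memories` holds a pointer to it (Box keeps it stable).
    let mut owned = Box::new(VMMemoryDefinition {
        base: std::ptr::null_mut(),
        current_length: 65536,
    });
    let ptr: *mut VMMemoryDefinition = &mut *owned;
    ctx.owned_memories.push(owned);
    ctx.defined_memories.push(ptr);

    // Generated code would follow the pointer to read the length.
    let len = unsafe { (*ctx.defined_memories[0]).current_length };
    println!("{len}");
}
```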
Force-pushed from 5975db1 to 14ce663
Ok thanks again for doing all this @abrown and being patient with reviews! I'm going to approve and merge this now. Further work is tracked on #4245 but I think this is a solid enough base to build on which should be workable enough for the next level of experimentation and feedback on the implementation.