Add shared memories
This change adds the ability to use shared memories in Wasmtime when the
[threads proposal] is enabled. Shared memories are annotated as `shared`
in the WebAssembly syntax, e.g., `(memory 1 1 shared)`, and are
protected from concurrent access during `memory.size` and `memory.grow`.

[threads proposal]: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md

In order to implement this in Wasmtime, there are two main cases to
cover:
    - a program may simply create a shared memory and possibly export it;
    this means that Wasmtime itself must be able to create shared
    memories
    - a user may create a shared memory externally and pass it in as an
    import during instantiation; this is the case when the module
    contains code like `(import "env" "memory" (memory 1 1
    shared))`, and it is handled by a new Wasmtime API
    type, `SharedMemory`

Because of the first case, this change allows any of the current
memory-creation mechanisms to work as-is. Wasmtime can still create
either static or dynamic memories in either on-demand or pooling modes,
and any of these memories can be considered shared. When shared, the
`Memory` runtime container will lock appropriately during `memory.size`
and `memory.grow` operations; since all memories use this container, it
is the ideal place to implement the locking exactly once.
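As a rough illustration, the locking contract can be sketched with a
std-only model (the names and layout here are illustrative, not
Wasmtime's actual types):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

const WASM_PAGE_SIZE: usize = 65536;

// Illustrative stand-in for the shared-memory container: the linear
// memory sits behind an `RwLock` so `size` and `grow` cannot race.
#[derive(Clone)]
pub struct SharedMemoryModel {
    pages: Arc<RwLock<Vec<u8>>>,
    max_pages: usize,
}

impl SharedMemoryModel {
    pub fn new(min_pages: usize, max_pages: usize) -> Self {
        SharedMemoryModel {
            pages: Arc::new(RwLock::new(vec![0; min_pages * WASM_PAGE_SIZE])),
            max_pages,
        }
    }

    // `memory.size`: a read lock is enough since nothing is mutated.
    pub fn size(&self) -> usize {
        self.pages.read().unwrap().len() / WASM_PAGE_SIZE
    }

    // `memory.grow`: a write lock serializes concurrent growers.
    // Returns the old size in pages, or `None` past the maximum.
    pub fn grow(&self, delta: usize) -> Option<usize> {
        let mut mem = self.pages.write().unwrap();
        let old = mem.len() / WASM_PAGE_SIZE;
        if old + delta > self.max_pages {
            return None;
        }
        mem.resize((old + delta) * WASM_PAGE_SIZE, 0);
        Some(old)
    }
}

fn main() {
    let mem = SharedMemoryModel::new(1, 8);
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let m = mem.clone();
            thread::spawn(move || m.grow(1).is_some())
        })
        .collect();
    assert!(handles.into_iter().all(|h| h.join().unwrap()));
    // One initial page plus four concurrent one-page grows.
    assert_eq!(mem.size(), 5);
}
```

An `RwLock` fits this access pattern well: many concurrent
`memory.size` readers, one `memory.grow` writer at a time.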

The second case is covered by the new `SharedMemory` structure. It uses
the same `Mmap` allocation under the hood as non-shared memories, but
allows the user to perform the allocation externally to Wasmtime and
share the memory across threads (via an `Arc`). The pointer to the
actual memory is carefully wired through and owned by the
`SharedMemory` structure itself. As a result, the location of a
memory's `VMMemoryDefinition` now depends on ownership: for owned
memories (the default), the `VMMemoryDefinition` is stored directly in
the `VMContext`; in the `SharedMemory` case, however, the `VMContext`
must instead point to that separate structure.

To ensure that the `VMContext` can always point to the correct
`VMMemoryDefinition`, this change alters the `VMContext` structure.
Since a `SharedMemory` owns its own `VMMemoryDefinition`, the
`defined_memories` table in the `VMContext` becomes a sequence of
pointers: in the shared-memory case, they point to the
`VMMemoryDefinition` owned by the `SharedMemory`; in the owned-memory
case (i.e., not shared), they point to `VMMemoryDefinition`s stored in
a new table, `owned_memories`.
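The resulting layout can be modeled in a few lines of std-only Rust (a
hypothetical simplification; the real `VMContext` is a raw,
offset-addressed allocation rather than a Rust struct):

```rust
// Hypothetical model of the new layout, not Wasmtime's real types.
struct VMMemoryDefinition {
    base: *mut u8,
    current_length: usize,
}

struct VMContextModel {
    // One pointer per defined memory...
    defined_memories: Vec<*mut VMMemoryDefinition>,
    // ...pointing here for owned (non-shared) memories only.
    owned_memories: Vec<VMMemoryDefinition>,
}

fn main() {
    // A definition owned elsewhere, as a `SharedMemory` would own it.
    let mut shared_buf = vec![0u8; 2 * 65536];
    let mut shared_def = VMMemoryDefinition {
        base: shared_buf.as_mut_ptr(),
        current_length: shared_buf.len(),
    };

    let mut owned_buf = vec![0u8; 65536];
    let mut ctx = VMContextModel {
        defined_memories: Vec::new(),
        owned_memories: vec![VMMemoryDefinition {
            base: owned_buf.as_mut_ptr(),
            current_length: owned_buf.len(),
        }],
    };

    // Owned memory: the pointer targets the instance's own table.
    let owned_ptr: *mut VMMemoryDefinition = &mut ctx.owned_memories[0];
    ctx.defined_memories.push(owned_ptr);
    // Shared memory: the pointer targets the external definition.
    ctx.defined_memories.push(&mut shared_def);

    // Generated code dereferences one pointer either way.
    let lengths: Vec<usize> = ctx
        .defined_memories
        .iter()
        .map(|&def| unsafe { (*def).current_length })
        .collect();
    assert_eq!(lengths, vec![65536, 2 * 65536]);
    assert!(!unsafe { (*ctx.defined_memories[0]).base }.is_null());
}
```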

This change adds an additional indirection (through the `*mut
VMMemoryDefinition` pointer) that could add overhead. Using an imported
memory as a proxy, we measured a 1-3% overhead for this approach on the
`pulldown-cmark` benchmark. To avoid this, Cranelift-generated code
special-cases the owned-memory access (i.e., it loads a pointer
directly to the `owned_memories` entry) for `memory.size` so that only
shared memories (and imported memories, as before) incur the
indirection cost.
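For reference, the defined-to-owned index mapping that this split
requires (count the non-shared memories preceding a defined memory) can
be sketched as follows; this is a simplified, hypothetical version of
the new `owned_memory_index` helper in which shared memories map to
`None`:

```rust
// Each defined memory is flagged shared or not; owned (non-shared)
// memories occupy slots in `owned_memories` in declaration order.
fn owned_memory_index(shared_flags: &[bool], defined: usize) -> Option<usize> {
    if defined >= shared_flags.len() || shared_flags[defined] {
        return None; // out of range, or shared: no owned slot
    }
    // The owned index is the count of non-shared memories before it.
    Some(shared_flags[..defined].iter().filter(|&&s| !s).count())
}

fn main() {
    // memories: [owned, shared, owned]
    let flags = [false, true, false];
    assert_eq!(owned_memory_index(&flags, 0), Some(0));
    assert_eq!(owned_memory_index(&flags, 1), None);
    assert_eq!(owned_memory_index(&flags, 2), Some(1));
}
```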
abrown committed May 24, 2022
1 parent 140b835 commit e9c4f5f
Showing 22 changed files with 857 additions and 143 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 2 additions & 2 deletions cranelift/object/tests/basic.rs
@@ -200,7 +200,7 @@ fn libcall_function() {

#[test]
#[should_panic(
expected = "Result::unwrap()` on an `Err` value: Backend(Symbol \"function\\0with\\0nul\\0bytes\" has a null byte, which is disallowed"
expected = "Result::unwrap()` on an `Err` value: Backend(Symbol \"function\\u{0}with\\u{0}nul\\u{0}bytes\" has a null byte, which is disallowed"
)]
fn reject_nul_byte_symbol_for_func() {
let flag_builder = settings::builder();
@@ -224,7 +224,7 @@ fn reject_nul_byte_symbol_for_func() {

#[test]
#[should_panic(
expected = "Result::unwrap()` on an `Err` value: Backend(Symbol \"data\\0with\\0nul\\0bytes\" has a null byte, which is disallowed"
expected = "Result::unwrap()` on an `Err` value: Backend(Symbol \"data\\u{0}with\\u{0}nul\\u{0}bytes\" has a null byte, which is disallowed"
)]
fn reject_nul_byte_symbol_for_data() {
let flag_builder = settings::builder();
5 changes: 3 additions & 2 deletions crates/cranelift/src/compiler.rs
@@ -16,7 +16,7 @@ use cranelift_codegen::{MachSrcLoc, MachStackMap};
use cranelift_entity::{EntityRef, PrimaryMap};
use cranelift_frontend::FunctionBuilder;
use cranelift_wasm::{
DefinedFuncIndex, DefinedMemoryIndex, FuncIndex, FuncTranslator, MemoryIndex, SignatureIndex,
DefinedFuncIndex, FuncIndex, FuncTranslator, MemoryIndex, OwnedMemoryIndex, SignatureIndex,
WasmFuncType,
};
use object::write::Object;
@@ -332,8 +332,9 @@ impl wasmtime_environ::Compiler for Compiler {
let memory_offset = if ofs.num_imported_memories > 0 {
ModuleMemoryOffset::Imported(ofs.vmctx_vmmemory_import(MemoryIndex::new(0)))
} else if ofs.num_defined_memories > 0 {
// TODO shared?
ModuleMemoryOffset::Defined(
ofs.vmctx_vmmemory_definition_base(DefinedMemoryIndex::new(0)),
ofs.vmctx_vmmemory_definition_base(OwnedMemoryIndex::new(0)),
)
} else {
ModuleMemoryOffset::None
68 changes: 52 additions & 16 deletions crates/cranelift/src/func_environ.rs
@@ -1368,18 +1368,37 @@ impl<'module_environment> cranelift_wasm::FuncEnvironment for FuncEnvironment<'m

fn make_heap(&mut self, func: &mut ir::Function, index: MemoryIndex) -> WasmResult<ir::Heap> {
let pointer_type = self.pointer_type();

let is_shared = self.module.memory_plans[index].memory.shared;
let (ptr, base_offset, current_length_offset) = {
let vmctx = self.vmctx(func);
if let Some(def_index) = self.module.defined_memory_index(index) {
let base_offset =
i32::try_from(self.offsets.vmctx_vmmemory_definition_base(def_index)).unwrap();
let current_length_offset = i32::try_from(
self.offsets
.vmctx_vmmemory_definition_current_length(def_index),
)
.unwrap();
(vmctx, base_offset, current_length_offset)
if is_shared {
// As with imported memory, the `VMMemoryDefinition` for a
// shared memory is stored elsewhere. We store a `*mut
// VMMemoryDefinition` to it and dereference that when
// atomically growing it.
let from_offset = self.offsets.vmctx_vmmemory_pointer(def_index);
let memory = func.create_global_value(ir::GlobalValueData::Load {
base: vmctx,
offset: Offset32::new(i32::try_from(from_offset).unwrap()),
global_type: pointer_type,
readonly: true,
});
let base_offset = i32::from(self.offsets.vmmemory_definition_base());
let current_length_offset =
i32::from(self.offsets.vmmemory_definition_current_length());
(memory, base_offset, current_length_offset)
} else {
let owned_index = self.module.owned_memory_index(def_index).expect("TODO");
let owned_base_offset =
self.offsets.vmctx_vmmemory_definition_base(owned_index);
let owned_length_offset = self
.offsets
.vmctx_vmmemory_definition_current_length(owned_index);
let current_base_offset = i32::try_from(owned_base_offset).unwrap();
let current_length_offset = i32::try_from(owned_length_offset).unwrap();
(vmctx, current_base_offset, current_length_offset)
}
} else {
let from_offset = self.offsets.vmctx_vmmemory_import_from(index);
let memory = func.create_global_value(ir::GlobalValueData::Load {
@@ -1693,16 +1712,33 @@ impl<'module_environment> cranelift_wasm::FuncEnvironment for FuncEnvironment<'m
) -> WasmResult<ir::Value> {
let pointer_type = self.pointer_type();
let vmctx = self.vmctx(&mut pos.func);
let is_shared = self.module.memory_plans[index].memory.shared;
let base = pos.ins().global_value(pointer_type, vmctx);
let current_length_in_bytes = match self.module.defined_memory_index(index) {
Some(def_index) => {
let offset = i32::try_from(
self.offsets
.vmctx_vmmemory_definition_current_length(def_index),
)
.unwrap();
pos.ins()
.load(pointer_type, ir::MemFlags::trusted(), base, offset)
if is_shared {
let offset =
i32::try_from(self.offsets.vmctx_vmmemory_pointer(def_index)).unwrap();
let vmmemory_ptr =
pos.ins()
.load(pointer_type, ir::MemFlags::trusted(), base, offset);
// TODO should be an atomic_load (need a way to do atomic_load + offset).
pos.ins().load(
pointer_type,
ir::MemFlags::trusted(),
vmmemory_ptr,
i32::from(self.offsets.vmmemory_definition_current_length()),
)
} else {
let owned_index = self.module.owned_memory_index(def_index).expect("TODO");
let offset = i32::try_from(
self.offsets
.vmctx_vmmemory_definition_current_length(owned_index),
)
.unwrap();
pos.ins()
.load(pointer_type, ir::MemFlags::trusted(), base, offset)
}
}
None => {
let offset = i32::try_from(self.offsets.vmctx_vmmemory_import_from(index)).unwrap();
25 changes: 22 additions & 3 deletions crates/environ/src/module.rs
@@ -10,15 +10,15 @@ use std::mem;
use std::ops::Range;
use wasmtime_types::*;

/// Implemenation styles for WebAssembly linear memory.
/// Implementation styles for WebAssembly linear memory.
#[derive(Debug, Clone, Hash, Serialize, Deserialize)]
pub enum MemoryStyle {
/// The actual memory can be resized and moved.
Dynamic {
/// Extra space to reserve when a memory must be moved due to growth.
reserve: u64,
},
/// Addresss space is allocated up front.
/// Address space is allocated up front.
Static {
/// The number of mapped and unmapped pages.
bound: u64,
@@ -160,7 +160,7 @@ pub enum MemoryInitialization {
/// which might reside in a compiled module on disk, available immediately
/// in a linear memory's address space.
///
/// To facilitate the latter fo these techniques the `try_static_init`
/// To facilitate the latter of these techniques the `try_static_init`
/// function below, which creates this variant, takes a host page size
/// argument which can page-align everything to make mmap-ing possible.
Static {
@@ -919,6 +919,25 @@ impl Module {
}
}

/// Convert a `DefinedMemoryIndex` into an `OwnedMemoryIndex`. Returns None
/// if the index is an imported memory.
#[inline]
pub fn owned_memory_index(&self, memory: DefinedMemoryIndex) -> Option<OwnedMemoryIndex> {
if memory.index() >= self.memory_plans.len() {
return None;
}
// Once we know that the memory index is not greater than the number of
// plans, we can iterate through the plans up to the memory index and
// count how many are not shared (i.e., owned).
let owned_memory_index = self
.memory_plans
.iter()
.take(memory.index())
.filter(|(_, mp)| !mp.memory.shared)
.count();
Some(OwnedMemoryIndex::new(owned_memory_index))
}

/// Test whether the given memory index is for an imported memory.
#[inline]
pub fn is_imported_memory(&self, index: MemoryIndex) -> bool {
4 changes: 2 additions & 2 deletions crates/environ/src/module_environ.rs
@@ -240,7 +240,7 @@ impl<'a, 'data> ModuleEnvironment<'a, 'data> {
EntityType::Function(sig_index)
}
TypeRef::Memory(ty) => {
if ty.shared {
if ty.shared && !self.validator.features().threads {
return Err(WasmError::Unsupported("shared memories".to_owned()));
}
self.result.module.num_imported_memories += 1;
@@ -296,7 +296,7 @@ impl<'a, 'data> ModuleEnvironment<'a, 'data> {

for entry in memories {
let memory = entry?;
if memory.shared {
if memory.shared && !self.validator.features().threads {
return Err(WasmError::Unsupported("shared memories".to_owned()));
}
let plan = MemoryPlan::for_memory(memory.into(), &self.tunables);
57 changes: 48 additions & 9 deletions crates/environ/src/vmoffsets.rs
@@ -14,7 +14,8 @@
// imported_memories: [VMMemoryImport; module.num_imported_memories],
// imported_globals: [VMGlobalImport; module.num_imported_globals],
// tables: [VMTableDefinition; module.num_defined_tables],
// memories: [VMMemoryDefinition; module.num_defined_memories],
// memories: [*mut VMMemoryDefinition; module.num_defined_memories],
// owned_memories: [VMMemoryDefinition; module.num_owned_memories],
// globals: [VMGlobalDefinition; module.num_defined_globals],
// anyfuncs: [VMCallerCheckedAnyfunc; module.num_escaped_funcs],
// }
@@ -26,6 +27,7 @@ use crate::{
use cranelift_entity::packed_option::ReservedValue;
use more_asserts::assert_lt;
use std::convert::TryFrom;
use wasmtime_types::OwnedMemoryIndex;

/// Sentinel value indicating that wasm has been interrupted.
// Note that this has a bit of an odd definition. See the `insert_stack_check`
@@ -67,6 +69,8 @@ pub struct VMOffsets<P> {
pub num_defined_tables: u32,
/// The number of defined memories in the module.
pub num_defined_memories: u32,
/// The number of memories owned by the module instance.
pub num_owned_memories: u32,
/// The number of defined globals in the module.
pub num_defined_globals: u32,
/// The number of escaped functions in the module, the size of the anyfuncs
@@ -86,6 +90,7 @@ pub struct VMOffsets<P> {
imported_globals: u32,
defined_tables: u32,
defined_memories: u32,
owned_memories: u32,
defined_globals: u32,
defined_anyfuncs: u32,
size: u32,
@@ -133,16 +138,23 @@ pub struct VMOffsetsFields<P> {
pub num_defined_tables: u32,
/// The number of defined memories in the module.
pub num_defined_memories: u32,
/// The number of memories owned by the module instance.
pub num_owned_memories: u32,
/// The number of defined globals in the module.
pub num_defined_globals: u32,
/// The numbe of escaped functions in the module, the size of the anyfunc
/// The number of escaped functions in the module, the size of the anyfunc
/// array.
pub num_escaped_funcs: u32,
}

impl<P: PtrSize> VMOffsets<P> {
/// Return a new `VMOffsets` instance, for a given pointer size.
pub fn new(ptr: P, module: &Module) -> Self {
let num_shared_memories = module
.memory_plans
.iter()
.filter(|p| p.1.memory.shared)
.count();
VMOffsets::from(VMOffsetsFields {
ptr,
num_imported_functions: cast_to_u32(module.num_imported_funcs),
@@ -152,6 +164,7 @@ impl<P: PtrSize> VMOffsets<P> {
num_defined_functions: cast_to_u32(module.functions.len()),
num_defined_tables: cast_to_u32(module.table_plans.len()),
num_defined_memories: cast_to_u32(module.memory_plans.len()),
num_owned_memories: cast_to_u32(module.memory_plans.len() - num_shared_memories),
num_defined_globals: cast_to_u32(module.globals.len()),
num_escaped_funcs: cast_to_u32(module.num_escaped_funcs),
})
@@ -181,13 +194,14 @@ impl<P: PtrSize> VMOffsets<P> {
num_defined_tables: _,
num_defined_globals: _,
num_defined_memories: _,
num_owned_memories: _,
num_defined_functions: _,
num_escaped_funcs: _,

// used as the initial size below
size,

// exhaustively match teh rest of the fields with input from
// exhaustively match the rest of the fields with input from
// the macro
$($name,)*
} = *self;
@@ -211,6 +225,7 @@ impl<P: PtrSize> VMOffsets<P> {
defined_anyfuncs: "module functions",
defined_globals: "defined globals",
defined_memories: "defined memories",
owned_memories: "owned memories",
defined_tables: "defined tables",
imported_globals: "imported globals",
imported_memories: "imported memories",
@@ -237,6 +252,7 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
num_defined_functions: fields.num_defined_functions,
num_defined_tables: fields.num_defined_tables,
num_defined_memories: fields.num_defined_memories,
num_owned_memories: fields.num_owned_memories,
num_defined_globals: fields.num_defined_globals,
num_escaped_funcs: fields.num_escaped_funcs,
runtime_limits: 0,
@@ -251,6 +267,7 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
imported_globals: 0,
defined_tables: 0,
defined_memories: 0,
owned_memories: 0,
defined_globals: 0,
defined_anyfuncs: 0,
size: 0,
@@ -303,7 +320,9 @@ impl<P: PtrSize> From<VMOffsetsFields<P>> for VMOffsets<P> {
size(defined_tables)
= cmul(ret.num_defined_tables, ret.size_of_vmtable_definition()),
size(defined_memories)
= cmul(ret.num_defined_memories, ret.size_of_vmmemory_definition()),
= cmul(ret.num_defined_memories, ret.size_of_vmmemory_pointer()),
size(owned_memories)
= cmul(ret.num_owned_memories, ret.size_of_vmmemory_definition()),
align(16),
size(defined_globals)
= cmul(ret.num_defined_globals, ret.size_of_vmglobal_definition()),
@@ -445,6 +464,12 @@ impl<P: PtrSize> VMOffsets<P> {
pub fn size_of_vmmemory_definition(&self) -> u8 {
2 * self.pointer_size()
}

/// Return the size of `*mut VMMemoryDefinition`.
#[inline]
pub fn size_of_vmmemory_pointer(&self) -> u8 {
self.pointer_size()
}
}

/// Offsets for `VMGlobalImport`.
@@ -604,6 +629,12 @@ impl<P: PtrSize> VMOffsets<P> {
self.defined_memories
}

/// The offset of the `owned_memories` array.
#[inline]
pub fn vmctx_owned_memories_begin(&self) -> u32 {
self.owned_memories
}

/// The offset of the `globals` array.
#[inline]
pub fn vmctx_globals_begin(&self) -> u32 {
@@ -667,11 +698,19 @@ impl<P: PtrSize> VMOffsets<P> {
self.vmctx_tables_begin() + index.as_u32() * u32::from(self.size_of_vmtable_definition())
}

/// Return the offset to `VMMemoryDefinition` index `index`.
/// Return the offset to the `*mut VMMemoryDefinition` at index `index`.
#[inline]
pub fn vmctx_vmmemory_definition(&self, index: DefinedMemoryIndex) -> u32 {
pub fn vmctx_vmmemory_pointer(&self, index: DefinedMemoryIndex) -> u32 {
assert_lt!(index.as_u32(), self.num_defined_memories);
self.vmctx_memories_begin() + index.as_u32() * u32::from(self.size_of_vmmemory_definition())
self.vmctx_memories_begin() + index.as_u32() * u32::from(self.size_of_vmmemory_pointer())
}

/// Return the offset to the owned `VMMemoryDefinition` at index `index`.
#[inline]
pub fn vmctx_vmmemory_definition(&self, index: OwnedMemoryIndex) -> u32 {
assert_lt!(index.as_u32(), self.num_owned_memories);
self.vmctx_owned_memories_begin()
+ index.as_u32() * u32::from(self.size_of_vmmemory_definition())
}

/// Return the offset to the `VMGlobalDefinition` index `index`.
@@ -735,13 +774,13 @@ impl<P: PtrSize> VMOffsets<P> {

/// Return the offset to the `base` field in `VMMemoryDefinition` index `index`.
#[inline]
pub fn vmctx_vmmemory_definition_base(&self, index: DefinedMemoryIndex) -> u32 {
pub fn vmctx_vmmemory_definition_base(&self, index: OwnedMemoryIndex) -> u32 {
self.vmctx_vmmemory_definition(index) + u32::from(self.vmmemory_definition_base())
}

/// Return the offset to the `current_length` field in `VMMemoryDefinition` index `index`.
#[inline]
pub fn vmctx_vmmemory_definition_current_length(&self, index: DefinedMemoryIndex) -> u32 {
pub fn vmctx_vmmemory_definition_current_length(&self, index: OwnedMemoryIndex) -> u32 {
self.vmctx_vmmemory_definition(index) + u32::from(self.vmmemory_definition_current_length())
}

1 change: 1 addition & 0 deletions crates/runtime/Cargo.toml
@@ -14,6 +14,7 @@ edition = "2021"
wasmtime-environ = { path = "../environ", version = "=0.38.0" }
wasmtime-fiber = { path = "../fiber", version = "=0.38.0", optional = true }
wasmtime-jit-debug = { path = "../jit-debug", version = "=0.38.0", features = ["gdb_jit_int"] }
wasmtime-types = { path = "../types", version = "=0.38.0" }
region = "2.1.0"
libc = { version = "0.2.112", default-features = false }
log = "0.4.8"
