WASI: Implement experimental threading support #16207

Luukdegram · 2023-06-25T16:59:29Z

This PR implements threading support for WASI. Note that WASI-threads is still an experimental feature and not all runtimes support it yet. The main goal of this PR is to not only add support for threads but also to gather feedback to improve the feature upstream.

Currently, the WASI-threads proposal defines a single API entry, wasi.thread-spawn, which lets us pass a pointer to a context for later use. Calling it asks the host to create an OS thread. Once the thread has been created, the host environment calls the exported function wasi_thread_start, providing us with a thread ID and the original pointer we passed to thread-spawn. Although the host is responsible for creating and initializing the OS thread, it is up to the WebAssembly module to set up and initialize the memory for the thread. For this, we allocate enough memory to store:

- A guard page (to help prevent threads from overwriting each other's memory).
  - This is done as best effort, as there's no notion of read-only memory yet.
- A new stack for the thread (set to the user-specified size or a single page, whichever is larger).
- A TLS segment.
- The remaining memory required to store our Instance, which holds the metadata required to bootstrap the thread.
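
As a rough illustration of the regions listed above, the offsets could be computed along these lines. This is a sketch under assumptions, not the code from this PR: `ThreadLayout`, `alignUp` and `computeLayout` are hypothetical names, the guard is assumed to be exactly one page at the start of the block, and the total is assumed to be rounded up to whole pages.

```zig
const std = @import("std");

/// Size of one WebAssembly linear-memory page (64 KiB).
const wasm_page_size: usize = 64 * 1024;

/// Hypothetical result type; the field names are illustrative only.
const ThreadLayout = struct {
    guard_offset: usize,
    stack_offset: usize,
    tls_offset: usize,
    instance_offset: usize,
    total_size: usize,
};

/// Rounds `value` up to a multiple of `alignment`, which must be a power of two.
fn alignUp(value: usize, alignment: usize) usize {
    return (value + alignment - 1) & ~(alignment - 1);
}

/// Places the guard page, stack, TLS segment and instance metadata back to back.
fn computeLayout(
    requested_stack: usize,
    tls_size: usize,
    tls_align: usize,
    instance_size: usize,
    instance_align: usize,
) ThreadLayout {
    // Assumption: the guard occupies exactly one page at the start of the block.
    const guard_offset: usize = 0;
    const stack_offset = guard_offset + wasm_page_size;

    // The stack is the user-specified size or a single page, whichever is larger.
    const stack_size = if (requested_stack > wasm_page_size) requested_stack else wasm_page_size;

    // Each following region starts at the next suitably aligned address.
    const tls_offset = alignUp(stack_offset + stack_size, tls_align);
    const instance_offset = alignUp(tls_offset + tls_size, instance_align);

    // Assumption: the total allocation is rounded up to whole pages.
    const total_size = alignUp(instance_offset + instance_size, wasm_page_size);

    return ThreadLayout{
        .guard_offset = guard_offset,
        .stack_offset = stack_offset,
        .tls_offset = tls_offset,
        .instance_offset = instance_offset,
        .total_size = total_size,
    };
}
```

With no requested stack size, a 32-byte TLS segment and 24 bytes of metadata, this sketch ends up with three pages: one guard page, one stack page, and one page holding the TLS segment and metadata.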

Upon thread creation, we then initialize the TLS segment and set the __tls_base pointer to this new TLS segment so loads and stores to TLS work correctly. We also set the stack pointer to the new stack that we created upon the call to spawn.

For WASI, we ask the user to provide an allocator so that allocators are aware of any memory allocated by spawning a thread. Without this, the user could easily overwrite the memory we allocate during spawn, as we can only grow pages sequentially and Wasm provides no way to tell which pages are reserved and which are not. This also allows us to free the memory during join so the user's allocator can reuse it, as there's currently no way in WebAssembly to free memory.
Unfortunately, this meant having to use a 'hack' during detach to free the memory: we reset the stack pointer upon thread exit so that we can free the memory using the allocator without using the stack that is being freed. The other option is to leak memory when a user uses detach. In the future, we can use memory.discard to handle freeing this memory and maybe remove the allocator from the API.

The next steps outside this PR are:

- Enable std tests, which requires us to:
  - Update CI Wasmtime to at least v7.0.0.
  - Update build.zig/test-runner to pass the correct flags to Wasmtime to enable/run threading support.
- Add support for threads in WASI-LibC.

For those wanting to play around with this, here's a test program, along with the CLI invocations for both Zig and Wasmtime:

const std = @import("std");
pub fn main() !void {
    var threads: [3]std.Thread = undefined;
    var timer = try std.time.Timer.start();
    for (&threads, 0..3) |*thread, id| {
        thread.* = try std.Thread.spawn(.{ .allocator = std.heap.page_allocator }, myFunc, .{threads.len - id});
    }

    // Wait for every thread to finish. No index capture here: an unused
    // capture is a compile error in Zig.
    for (threads) |thread| {
        thread.join();
    }
    std.debug.print("Total runtime: '{d}' ms\n", .{timer.read() / std.time.ns_per_ms});
}

fn myFunc(id: usize) void {
    std.debug.print("Sleeping {d} seconds on thread '{d}'\n", .{ id, std.Thread.getCurrentId() });
    std.time.sleep(id * std.time.ns_per_s);
    std.debug.print("Finished '{d}'\n", .{std.Thread.getCurrentId()});
}

Build thread.zig and run on Wasmtime:

zig build-exe thread.zig -femit-bin=thread.wasm -target wasm32-wasi --shared-memory -mcpu=mvp+atomics+bulk_memory --import-memory --export=wasi_thread_start --export-memory

wasmtime thread.wasm --wasm-features threads --wasi-modules experimental-wasi-threads

Output:

Sleeping 3 seconds on thread '1246793758'
Sleeping 2 seconds on thread '694100545'
Sleeping 1 seconds on thread '290362801'
Finished '290362801'
Finished '694100545'
Finished '1246793758'
Total runtime: '3010' ms

andrewrk

Nice work!

src/target.zig

src/Compilation.zig

This flag allows the user to force-export the memory to the host environment. This is useful when the memory is imported from the host but must also be exported. This is (currently) required to pass the memory validation of runtimes when using threads. In the future this may become an error instead.

When the user enables the linker feature 'shared-memory', we do not force a single-threaded build. The linker already verifies that all other CPU features required for threads are enabled. This is true for both WASI and freestanding.

Implements std's `Futex` for the WebAssembly target using Wasm's `atomics` instruction set. When the `atomics` cpu feature is disabled we emit a compile-error.
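
User code that wants to know up front whether this Futex implementation is usable could perform a similar feature check itself. The following is a minimal sketch, assuming `std.Target.wasm.featureSetHas` and `builtin.cpu` behave as they did in Zig releases around the time of this PR; `hasWasmAtomics` is a hypothetical helper name, not part of std.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Reports whether the current compilation targets WebAssembly with the
/// `atomics` CPU feature enabled (for example via `-mcpu=mvp+atomics+bulk_memory`).
pub fn hasWasmAtomics() bool {
    // The wasm feature set is only meaningful when compiling for wasm32/wasm64.
    if (!builtin.cpu.arch.isWasm()) return false;
    return std.Target.wasm.featureSetHas(builtin.cpu.features, .atomics);
}
```

On non-WebAssembly targets the helper simply returns false, so it can be called from target-independent code.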

This implements a first version of spawning a WASI thread. For a new thread to be created, we calculate the size required to store TLS, the new stack, and metadata. This size is then allocated using a user-provided allocator. After a new thread is spawned, the host will call into our bootstrap procedure. This bootstrap procedure will then initialize the TLS segment and set the newly spawned thread's TID. It will also set the stack pointer to the newly created stack to ensure we do not clobber the main thread's stack. When bootstrapping the thread is completed, we will call the user's function on this new thread.

We now store the original allocator that was used to allocate the memory required for the thread. This allocator can then be used in any cleanup functionality to ensure the memory is freed correctly. Secondly, we now use a function to set the stack pointer instead of generating a function using global assembly. This is a lot cleaner and more readable.

We now reset the Thread ID to 0 and wake up the main thread listening for the thread to finish. We use inline assembly as we cannot use the stack to set the thread ID as it could possibly clobber any of the memory. Currently, we leak the memory that was allocated for the thread. We need to implement a way where we can clean up the memory without using the stack (as the stack is stored inside this same memory).

When `join` detects that a thread has completed, it frees the memory allocated for that thread. For this we must first copy the allocator, because the allocated memory itself holds the reference to the original allocator. If we freed the memory through that stored reference, we would end up with UB, as the allocator would be freeing the memory it is stored in.
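
The copy-before-free pattern can be sketched in isolation as follows. `ThreadMemory`, `allocThreadMemory` and `release` are hypothetical stand-ins for illustration and do not mirror the actual types in std.

```zig
const std = @import("std");

/// Hypothetical stand-in for the per-thread bookkeeping block.
const ThreadMemory = struct {
    /// The allocator that produced this very block.
    allocator: std.mem.Allocator,
    tid: u32,
};

/// Allocates the block and records the allocator inside it.
fn allocThreadMemory(allocator: std.mem.Allocator) !*ThreadMemory {
    const mem = try allocator.create(ThreadMemory);
    mem.* = .{ .allocator = allocator, .tid = 1 };
    return mem;
}

/// Frees the block without reading from it after it has been released.
fn release(mem: *ThreadMemory) void {
    // Copy the allocator out of the block first, so the free below does not
    // depend on state that lives inside the memory being freed.
    const allocator = mem.allocator;
    allocator.destroy(mem);
}

test "free a block that stores its own allocator" {
    // std.testing.allocator reports leaks, so this also checks the block is freed.
    const mem = try allocThreadMemory(std.testing.allocator);
    release(mem);
}
```

The explicit local copy keeps the free independent of how the compiler passes the allocator struct to `destroy`.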

When a thread is detached from the main thread, we automatically clean up any allocated memory. For this we first reset the stack pointer to the original stack pointer of the main thread so we can safely free the memory, which also contains the thread's stack.

When targeting WebAssembly, we default to a single-threaded build, as threads are still experimental. The user can, however, enable a multi-threaded build by specifying '-fno-single-threaded'. It is a compile error to enable this flag without also enabling shared-memory.
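
Because single-threaded remains the default for WebAssembly, code meant to build in both modes can branch on `builtin.single_threaded`. A minimal sketch, where `work` and `runWork` are hypothetical names chosen for illustration:

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical unit of work that writes its result through a pointer.
fn work(result: *u32) void {
    result.* = 42;
}

/// Runs `work` on a separate thread when threads are available, else inline.
pub fn runWork(allocator: std.mem.Allocator) !u32 {
    var result: u32 = 0;
    if (builtin.single_threaded) {
        // Single-threaded build (the WebAssembly default): call directly.
        work(&result);
    } else {
        // Multi-threaded build: on WASI the allocator backs the thread's memory.
        const thread = try std.Thread.spawn(.{ .allocator = allocator }, work, .{&result});
        thread.join();
    }
    return result;
}
```

Since `builtin.single_threaded` is known at compile time, the branch that calls `std.Thread.spawn` is not analyzed in single-threaded builds, so the same source compiles either way.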

andrewrk requested changes Jun 25, 2023


Luukdegram added 9 commits June 26, 2023 20:00

std: implement Futex for WebAssembly

ea0d4c8


Luukdegram force-pushed the wasi-threads branch from 8557c98 to 87b8a05 Compare June 26, 2023 18:01

andrewrk approved these changes Jun 26, 2023


Luukdegram merged commit 622c5f3 into ziglang:master Jun 27, 2023
10 checks passed

Luukdegram deleted the wasi-threads branch June 27, 2023 16:28

Luukdegram restored the wasi-threads branch June 27, 2023 16:28

Luukdegram deleted the wasi-threads branch June 27, 2023 16:28

Luukdegram restored the wasi-threads branch June 27, 2023 16:31

Luukdegram mentioned this pull request Jul 18, 2023

wasm-linker: finish shared-memory & TLS implementation #16439

Merged

This was referenced Jul 23, 2023

Add support for wasm32-wasi-threads #15484

Closed

Adding support for threadlocal variables for Wasm #15935

Closed
