Implement support for proc macros on wasm32-wasip2#157590
Implement support for proc macros on wasm32-wasip2#157590Mark-Simulacrum wants to merge 1 commit into
Conversation
This is an initial draft implementation. A few problems need to be
addressed before this could be merged:
* Figure out testing strategy -- currently, this hardcodes compiletest
to build in wasip2, which clearly won't work to land. Most likely
we'll need an opt-in of some kind, especially given that building the
standard library for wasip2 currently requires downloading a
wasi sysroot.
* This duplicates the proc macro ABI with a WIT file for all the
individual functions. There's not too much complexity inherent to the
WIT approach -- the duplication doesn't seem to cost us much, since
the ABI is pretty simple. But it does mean the base data structures
need replicating, though some of that could be eliminated by always
using the codegen'd struct/enums in the C bridge as well.
* There's also some impedance mismatch, e.g., Span is Copy and so we
end up leaking resources to mitigate the issue there. I think the
right thing is probably to expose it as a u32 wrapper (similar to
the old ABI).
But in the meantime this does pass tests (confirmed on aarch64-apple and
x86_64-linux, the two systems I have readily available), so opening up a
draft PR to share the initial state.
Some not necessarily blocking challenges:
* `cc` crate dependency causes us to need an old wasmtime dependency. I
think we should be able to relax the upstream constraint but haven't
looked into it yet.
|
The job Click to see the possible cause of the failure (guessed by this bot) |
|
My planned next steps here are to probably refactor away some of the leaking (particularly around Span), figure out a path towards testing in some CI job, and figuring out reviewer(s) for the code. It's not really that much net-new code that's interesting so my hope is that isn't too hard to get happen. |
| pub mod wasi_bindgen { | ||
| // We currently cannot depend on the wit-bindgen proc macro or the implementation backing it | ||
| // (wit-bindgen-{rust, core}) from within the library/ tree because they have a relatively large | ||
| // amount of reasonable dependencies that don't have the rustc-dep-of-std cargo feature already |
There was a problem hiding this comment.
rustc-dep-of-std only matters for target deps, not host deps like build scripts and proc-macros. The more important reason that using the proc macro directly isn't possible here is that it would break cross-compilation. The proc-macro shipped in the rust-std component would be compiled for the wrong host. Using wasm proc-macros would solve that, but that did require a wasm proc-macro implementation that works on all host architecturee, unlike wasmtime.
| buf.slice(|buf| &buf[data_start..(data_start + metadata_len)]) | ||
| } | ||
| CrateFlavor::Rmeta => get_rmeta_metadata_section(filename)?, | ||
| CrateFlavor::Wasm => { |
There was a problem hiding this comment.
I didn't need any special case in my implementation. I guess because the object crate supports wasm modules, but not wasm components.
| #[derive(PartialEq, Clone, Debug, StableHash, Encodable, Decodable)] | ||
| pub struct CrateSource { | ||
| pub dylib: Option<PathBuf>, | ||
| pub wasm: Option<PathBuf>, |
There was a problem hiding this comment.
Why not reuse the dylib field like on native?
There was a problem hiding this comment.
We need to know whether we've loaded wasm or native code, right? I guess we could sniff the header of the file or check the extension, but both seem worse?
| // FIXME: This is a bit of a hack to make our crate loading code avoid needing a | ||
| // 'wasm_proc_macro' search path. We probably *do* actually want wasm proc macros to | ||
| // have their own search path inside the session, in which case this wouldn't be | ||
| // needed. |
There was a problem hiding this comment.
For me doing a second crate search for the wasm32-wasi target worked without needing to change any filenames. Adding a lib prefix would break on Windows if you use the host prefix during the crate search.
There was a problem hiding this comment.
Are you sure loading from directories worked for you? --extern with a path to wasm indeed works without this but finding the wasm component without an explicit path pointing to it (from -L) didn't.
But yeah, this code is a bit hacky, would need some cleanup. I'd be surprised if it worked out of the box though as the defaults don't map well (e.g., lib prefix on Linux, .so vs .wasm, etc)?
| [dependencies.wasmtime] | ||
| version = "38" | ||
| default-features = false | ||
| features = ["runtime", "winch", "component-model", "all-arch", "std", "cranelift"] |
There was a problem hiding this comment.
For this you can drop the all-arch feature as that's only necessary when cross-compiling to other architectures. The host architecture is always supported by default, however.
| impl Default for Ctx { | ||
| fn default() -> Self { | ||
| Ctx { | ||
| ctx: wasmtime_wasi::WasiCtx::builder().inherit_stdout().inherit_stderr().build(), |
There was a problem hiding this comment.
FWIW this is where deterministic randomness would get inserted via methods like this. This'll probably want to use inherit_env as well or at least pass the well-known cargo/rustc env vars into the macro.
| .unwrap(); | ||
| let ctx = Ctx::default(); | ||
| let mut store = wasmtime::Store::new(&engine, ctx); | ||
| let bindings = generated::ProcMacro::instantiate(&mut store, &component, &linker).unwrap(); |
There was a problem hiding this comment.
One thing I might recommend here is to split the instantiate part into instantiate_pre followed by instantiate-with-that-InstancePre. That way the closures below can close over the InstancePre<Ctx> which would mean that per-macro invocation would bypass linker creation and some instantiation costs, making those cheaper.
| let mut store = wasmtime::Store::new(&engine, ctx); | ||
| let bindings = generated::ProcMacro::instantiate(&mut store, component, &linker).unwrap(); |
There was a problem hiding this comment.
One point worth noting here is that by creating a separate instance for each expansion that's pretty different from proc-macros of today which share the same address space. By re-instantiating it'd be running the wasm in complete isolation from other macro invocations, which would break state stored in the macro itself in a Rust static for example.
There was a problem hiding this comment.
Storing state already breaks with rust-analyzer anyway. Rust-analyzer has multiple proc macro server instances and reuses them across crates and caches everything based on the explicit token stream inputs.
| #![feature(decl_macro)] | ||
| #![feature(negative_impls)] | ||
| #![feature(panic_can_unwind)] | ||
| #![cfg_attr(not(target_arch = "wasm32"), feature(panic_can_unwind))] |
There was a problem hiding this comment.
For most of target_arch = "wasm32" I'd recommend switching to target_family = "wasm" (handles a hypothetical future wasm64 target transparently)
| // Note that even if we could use the macro directly we'd still need some post-processing (or | ||
| // upstream a change to the macro to add an option / gate on cfg), see the string replacement | ||
| // done on the internal macro's implementation for details. |
There was a problem hiding this comment.
One possible other way to dice this onion is to have build-dependencies on wit-bindgen-{rust,core} which are include!-d here. That enables postprocessing as well as keeping things out of rustc.
| core = { path = "../core" } | ||
| rustc-literal-escaper = { version = "0.0.7", features = ["rustc-dep-of-std"] } | ||
|
|
||
| [target.'cfg(all(target_os = "wasi", target_env = "p2"))'.dependencies] |
There was a problem hiding this comment.
Could this cfg be target_family = "wasm" to match the source? (or target_arch = "wasm32" as-is today in the source)
| bound: bound, | ||
| } | ||
|
|
||
| resource span { |
There was a problem hiding this comment.
One option here is to have type span = u64 or maybe even record span { internal: u64 }. That would avoid the need for Box::leak throughout the impelmentation and would be a bit cheaper as well. Given the trusted nature of wasm proc macros resource may not be the best fit here.
If that's done, then all of these methods would become free-functions not nested within a resource
| @@ -1352,7 +1353,23 @@ impl<'test> TestCx<'test> { | |||
| let aux_path = self.resolve_aux_path(source_path); | |||
| let mut aux_props = self.props.from_aux_file(&aux_path, self.revision, self.config); | |||
| if aux_type == Some(AuxType::ProcMacro) { | |||
There was a problem hiding this comment.
Does this mean that all preexisting proc-macro tests are becoming wasm-proc-macro tests?
| builder | ||
| .ensure(compile::Std::new(test_compiler, TargetSelection::from_user("wasm32-wasip2"))); |
There was a problem hiding this comment.
One way I might recommend doing this is, as you mentioned already, first having the testing of wasm proc macros being conditional. With that the opt-in would then perform a validation of config.toml and/or the configuration at a high level and ensure that things like the wasm32-wasip2 target, wasm-component-ld, etc, are all included. That way it might need less edits in the build system and more "make sure you configure things right ahead of time"
|
☔ The latest upstream changes (presumably #157616) made this pull request unmergeable. Please resolve the merge conflicts. |
This is an initial draft implementation. A few problems need to be addressed before this could be merged:
But in the meantime this does pass tests (confirmed on aarch64-apple and x86_64-linux, the two systems I have readily available), so opening up a draft PR to share the initial state.
Some not necessarily blocking pieces:
cccrate dependency causes us to need an old wasmtime dependency. I think we should be able to relax the upstream constraint but haven't looked into it yet.cc rust-lang/all-hands-2026#73