New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support 32-bit build. #44
Conversation
a0ff982
to
25ad3c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool, thanks for working on this!
First, be aware that Cranelift's x86-32 support isn't robust or tested yet. While we've made an effort to fill in the x86-32 pieces as we go, but we've mostly been focusing on x86-64, so there may be some missing pieces yet. That said, it'd be great to start testing it, and Wasmtime support is a good way to start! And I can help mentor anyone interested in working on x86-32 support.
With the i64 -> i32 changes, I expect you'll need to remove casts like cast::i32(self.offsets.vmctx_vmshared_signature_id(sig_index))
in func_environ.rs etc.
(You can ignore the Appveyor CI for now, as we're still working on getting that online.)
e8fb1b6
to
733ce0c
Compare
Looks like the 64-bit tests all pass now (with the rustfmt commit added). I'm happy to continue working on x86 32-bit support. However, I should note that this is my procrastination project for when I should be working on my PhD dissertation. So I can't commit to anything or a timeline. Brief background: My immediate personal interest is adding an optional dlopen/dlsym style API so that I can get the Wasm implementation of mal / make-a-lisp running on wasmtime (in addition to running in node, wac and warpy). In wac/wace, I implemented imports using dlopen/dlsym. In wasmtime (as in wac/wace) I would limit it to 32-bit mode and x86/little endian so that the calling boundary can be fairly easily defined without crazy thunking infrastructure (or needing to bindgen define every lib function beforehand). Thus the reason why I was wanting to get 32-bit support working. If an optional dlopen/dlsym API is within scope, I'll send a PR if/when I complete it, otherwise I'll just maintain it as a fork. Technical question related to that: I need to determine the actual memory location of the linear memory used within an API function. I figured out that vmctx is passed as a final argument to function imports and I am able to get the mmap address of memory 0 and print the bytes in my dlsym function import: unsafe {
let memdef = (&mut *vmctx)
.instance_contents()
.memory(DefinedMemoryIndex::new(0));
let m = &slice::from_raw_parts(memdef.base, 1000);
print!("first 1000 bytes of m: {:?}\n", m); Here is the relevant wat I'm using to debug this: (memory 1 16)
(data (i32.const 0)
"fputs\00")
(func $main
(call $print_i32 (i32.load8_u (i32.const 0)))
(call $print_i32 (i32.load8_u (i32.const 1)))
...
(call $print_i32 (call $dlsym (i32.const 0) (i32.const 0)))
... The result of running the wasm with that debug code:
So it appears that the "fputs\00" string is correctly being setup in memory, but I'm not sure why the memory pointed to by vmctx doesn't seem to show the active memory. Any thoughts? |
I've just now landed the current solution to this. Take a look at the commit message for 00a4e93. In short, There unfortunately isn't an example of this in the tree yet, though I have one that I hope to land before long. Eventually, WebAssembly will gain the ability to pass references to linear memory directly as arguments, at which time this mechanism can be phased out. If anything is unclear here, feel free to ask! |
I've rebased to pick up the upstream rustfmt changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Looking at the patch again, I do have a few more comments.
I changed your suggestions. I also added an additional commit to fix compilation errors I was getting with rustc 1.31.1 |
Signatures/functions/imports/exports etc are defined as varuint32 in the WebAssembly specification so use u32 rather than u64. Decrease the static memory constants for 32-bit addressing mode so that they fit within 32-bit memory constraints. Conditionalize cmake compile of SignalHandlers.cpp so that -m32 is passed when building 32-bit. Add a no-op match for Reloc::X86CallPCRel4 during linking. This is probably the wrong thing, but it allows the tests to pass. Using the same logic from the Reloc::X86PCRel4 case did not work.
Thanks, looks good! |
The global map for storing `*Func` objects previously stored `*Store`, but that creates a reference cycle between the Rust and the Go heaps which can't be cleaned up. Instead this commit updates the logic to instead use a "thread local" variable to temporarily store the current freelist during a call and moves the storage outside of the global variables. This should allow everything to get garbage collected as usual. Closes bytecodealliance#42
The specification allows duplicated occurrences of the same `tag` in filter list on the `resume` instruction (and `resume_throw` instruction). Only the first occurrence is relevant as it shadows the subsequent ones, however, cranelift complains if we emit duplicated case labels in a `Switch`, e.g. ``` $ ./target/debug/wasmtime wast -W=exceptions,function-references,typed-continuations dup.wast thread '<unnamed>' panicked at 'Tried to set the same entry 0 twice', cranelift/frontend/src/switch.rs:58:9 ``` This patch compiles only the first occurrence of a tag and skips subsequent entries.
This is tested with both
cargo test
andcargo test --target i686-unknown-linux-gnu
.Signatures/functions/imports/exports etc are defined as varuint32 in
the WebAssembly specification so use u32 rather than u64.
Decrease the static memory constants for 32-bit addressing mode so
that they fit within 32-bit memory constraints.
Conditionalize cmake compile of SignalHandlers.cpp so that -m32 is
passed when building 32-bit.
Add a no-op match for Reloc::X86CallPCRel4 during linking. This is
probably the wrong thing, but it allows the tests to pass. Using the
same logic from the Reloc::X86PCRel4 case did not work.