feat: compiler flag for registering external Wasm functions #4413

Draft
wants to merge 18 commits into base: master

rvanasa
Contributor

@rvanasa rvanasa commented Feb 22, 2024

This PR introduces the --rts-function flag to register custom extern "C"-style language bindings in the RTS. The goal is for third-party tools such as Mops and Vessel to be able to include language bindings via their sources command for custom mo-rts.wasm files.

Example workflow to manually try this out:

  • Recompile the RTS with a new Rust language binding:
#[no_mangle]
pub unsafe extern "C" fn echo(value: u32) -> u32 {
    value
}
  • Create a Motoko file (ffi.mo):
import Prim "mo:prim";

let input : Nat32 = 5;
let expected : Nat32 = 5;

let result = (prim "rts:echo" : Nat32 -> Nat32)(input);

Prim.debugPrint(debug_show result);

assert result == expected;
  • Run the following commands (the MOC_*_RTS variables are configured automatically in the Nix environment):
MOC_UNLOCK_PRIM=1 moc --rts-function echo -c ffi.mo -wasi-system-api
wasmtime result-ffi.wasm

Feel free to suggest other ways of passing this definition. Hoping to make this relatively simple for third-party tooling to register these interfaces while keeping everything readable at a quick glance.

Progress:

  • --rts-function flag
  • rts: primitives to access arbitrary FFI bindings in the RTS
  • Infer input and return types from Wasm exports
  • Restrict rts: primitives to registered functions
  • Test --rts-function flag and rts: primitives
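
As a rough illustration of the "restrict rts: primitives to registered functions" item, the check can be modelled as a lookup against the names passed via --rts-function. This is a sketch only, not the compiler's implementation (which is not shown in this thread); RtsRegistry, from_flags, and resolve are hypothetical names introduced here.

```rust
use std::collections::HashSet;

/// Hypothetical model of the names collected from `--rts-function` flags.
pub struct RtsRegistry {
    registered: HashSet<String>,
}

impl RtsRegistry {
    /// Build the registry from the flag values, e.g. `["echo"]`.
    pub fn from_flags<'a>(names: impl IntoIterator<Item = &'a str>) -> Self {
        RtsRegistry {
            registered: names.into_iter().map(String::from).collect(),
        }
    }

    /// Map a primitive name such as "rts:echo" to the RTS export name,
    /// rejecting anything that was not registered on the command line.
    pub fn resolve<'p>(&self, prim: &'p str) -> Result<&'p str, String> {
        let name = prim
            .strip_prefix("rts:")
            .ok_or_else(|| format!("not an rts primitive: {prim}"))?;
        if self.registered.contains(name) {
            Ok(name)
        } else {
            Err(format!("RTS function `{name}` is not registered"))
        }
    }
}
```

With a registry built from ["echo"], resolving "rts:echo" succeeds, while "rts:missing" or a name without the rts: prefix yields an error.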

Next steps:

  • Either an import syntax for RTS functions or some way to unlock prim expressions for specific Motoko files
  • Tools for building custom mo-rts.wasm files
  • Possibly reserve memory for external language bindings

@crusso
Contributor

crusso commented Feb 22, 2024

Nice start!

Something like this could work, but the mapping between Rust and Motoko values isn't that simple. Even an i32 doesn't map to Nat32, and, what's worse, the mapping will depend on the compiled code and choice of GC.

What we could do is provide some Rust functions for constructing/destructing Motoko values to Rust equivalents, and have the Rust code use those functions and perhaps assert the intended Motoko type (maybe using a macro) for the compiler to pick by inspecting the custom RTS (perhaps via custom section declaring the types).

The RTS code already uses the type Value (=u32) for this, and we could maybe provide a collection of helper functions that treat Value as an abstract type, enforcing the expected representation invariants.

toNat32 : u32 -> Value
fromNat32 : Value -> u32
toArray : [Value] -> Value

etc.

#[no_mangle]
#[Motoko("<A>A -> A")]
pub unsafe extern "C" fn echo(value: Value) -> Value {
    value
}

#[no_mangle]
#[Motoko("Nat32 -> Nat32")]
pub unsafe extern "C" fn succ(value: Value) -> Value {
    // Unwrap the Motoko Nat32, add one, and wrap the result again.
    toNat32(fromNat32(value) + 1)
}
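
To make the "abstract Value plus helper functions" idea concrete, here is a minimal self-contained sketch. It assumes a toy encoding in which Value wraps the u32 payload directly; the real RTS representation depends on the compiled code and the GC, as noted above, so the bodies of to_nat32 and from_nat32 are placeholders. The snake_case helper names are assumptions of this sketch, not existing RTS functions.

```rust
/// Toy stand-in for the RTS `Value` type (a u32 word in the RTS).
/// ASSUMPTION: the payload is stored as-is; the real encoding differs
/// and depends on the compiler mode and GC.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Value(u32);

/// Hypothetical helper: wrap a Rust u32 as a Motoko Nat32 value.
pub fn to_nat32(n: u32) -> Value {
    Value(n)
}

/// Hypothetical helper: unwrap a Motoko Nat32 value into a Rust u32.
pub fn from_nat32(v: Value) -> u32 {
    v.0
}

/// FFI-style binding that only touches `Value` through the helpers.
/// A real export would also need a stable symbol name (e.g. `#[no_mangle]`).
/// Wrapping addition is chosen here for illustration only.
pub extern "C" fn succ(value: Value) -> Value {
    to_nat32(from_nat32(value).wrapping_add(1))
}
```

Because the binding never inspects the bits of Value itself, only the helpers would need to change if the representation changes.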

Usually you'd have to worry about the GC running at the same time and avoid introducing GC holes (allocated data that isn't visible to the GC and could get collected under your feet). But we might be able to avoid all that hassle because we know the GC won't run until the end of the message anyway, after all the FFI code has done its stuff and returned.


github-actions bot commented Feb 22, 2024

Comparing from 3f3af73 to 0800c53:
The produced WebAssembly code seems to be completely unchanged.

@rvanasa
Contributor Author

rvanasa commented Feb 22, 2024

...the mapping will depend on the compiled code and choice of GC.

Right; I see what you're saying about this. The assumption is that the external tooling is targeting a specific combination of GC and debug/release, so in this case the tooling would be able to specify the correct Wasm value types.

Edit: I removed the DSL in favor of simply reading the types from the RTS Wasm file itself, which solves this problem.

What we could do is provide some Rust functions for constructing/destructing Motoko values to Rust equivalents, and have the Rust code use those functions and perhaps assert the intended Motoko type (maybe using a macro) for the compiler to pick by inspecting the custom RTS (perhaps via custom section declaring the types).

Saw this after writing everything else (I almost suggested this as a possibility but thought there might be some performance reason not to do it). This would certainly solve the problem of needing to pass different flags depending on GC, etc. Beginning to look into this now.

...maybe provide a collection of helper functions that treat Value as an abstract type, enforcing the expected representation invariants.

It seems reasonable to build the Motoko attribute (from the example above) directly into the RTS codebase, although it could also be valuable to expose this as an external library so that code generation tools similar to Canpack could build the RTS Wasm file from many individual crates without losing editor type checking, if that makes sense.

Here are some examples of what I'm imagining (presumably handled by external preprocessing tools, but we could build the Motoko attribute directly into the RTS codebase):

  • my_rust_crate/Cargo.toml
# ...

[dependencies]
moc-bindgen = "1.2.3"
  • my_rust_crate/src/lib.rs
use moc_bindgen::{motoko, Value};

#[motoko]
pub fn hello(name: String) -> String {
   format!("Hello, {name}!").into()
}

#[motoko]
pub fn custom(custom_value: Value) -> Value {
   // ...
}

#[motoko]
pub fn get_record_field<A>(arg: RecordStruct<A>) -> A {
   arg.value
}

#[motoko]
struct RecordStruct<A> {
   pub value: A
}

moc_bindgen::export!();

// etc.

Alternatively:

moc_bindgen::export! {
  pub fn hello(name: String) -> String {
     format!("Hello, {name}!")
  }
  
  pub fn get_record_field<A>(arg: RecordStruct<A>) -> A {
     arg.value
  }

  struct RecordStruct<A> {
     pub value: A
  }

  // ...
};
  • dfx_project/mops.toml
# ...

[rust-dependencies]
my-rust-crate = "0.1.2"

Another approach could be to pass Motoko type strings in the --rts-function flag and handle everything in the compiler itself. The original hesitation was that it would add unnecessary complexity for most use cases, such as if we wanted to create something along the lines of wasm-bindgen for Motoko. I was planning to implement this but wanted to try this other approach first to get a sense for what would be involved.

I suppose yet another approach could be to remove the --rts-function flags and statically determine the Wasm types from usages of (prim "rts:*" ...)() (or some new syntax for this purpose).

@luc-blaeser
Contributor

This is very interesting. Thank you for the investigation and work!
I believe this would offer a power mode for experienced developers, giving them efficient interoperability with carefully selected Rust functionality.
I see a challenge in ensuring memory safety, namely guaranteeing that third-party code cannot break Motoko's memory management. (Otherwise, if we let users assess the correctness on their own and take the corresponding risk, we would probably get memory corruption issue reports that could be due to unsafe FFI imports, and it would be difficult to triage such reports.)
As a possible solution, we could include an analysis to verify that the imported function only uses safe Rust, does not use any dynamic allocations, cannot receive any pointers from Motoko, and possibly meets other safety requirements. Alternatively, when allowing allocations through our custom Rust-interop-allocator in the RTS, we would need to prevent static variables (as otherwise untracked low-level pointers could be retained by Rust across IC messages, impacting GC correctness).
But I believe this is a very important topic and we should continue with it.

@rvanasa
Contributor Author

rvanasa commented Mar 15, 2024

Thanks for taking a look! Adding a warning if a custom RTS function uses unsafe Rust seems like a great idea. I also fully agree with your evaluation that this FFI approach would require advanced Rust knowledge in general, which we could at least somewhat improve with static code analysis.
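
A very naive version of such a warning could scan the binding source for the unsafe and static keywords. The sketch below is only a textual check under stated assumptions: scan_source is a hypothetical name, line comments and the 'static lifetime are skipped, and string literals or block comments are not handled. A real analysis would work on the parsed syntax tree rather than on raw text.

```rust
/// Return one warning per `unsafe` or `static` keyword found in `source`.
/// ASSUMPTION: purely textual; string literals and block comments are not handled.
pub fn scan_source(source: &str) -> Vec<String> {
    let mut warnings = Vec::new();
    for (index, line) in source.lines().enumerate() {
        // Ignore everything after a line comment marker.
        let code = line.split("//").next().unwrap_or("");
        // The `'static` lifetime is not a static variable, so drop it first.
        let code = code.replace("'static", "");
        // Split into identifier-like tokens and look for the two keywords.
        for token in code.split(|c: char| !(c.is_alphanumeric() || c == '_')) {
            match token {
                "unsafe" => warnings.push(format!("line {}: uses `unsafe`", index + 1)),
                "static" => warnings.push(format!("line {}: declares a `static`", index + 1)),
                _ => {}
            }
        }
    }
    warnings
}
```

Run against the echo binding from the PR description, this reports the unsafe on the function signature line.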

@rvanasa rvanasa mentioned this pull request Mar 19, 2024
12 tasks