feat: compiler flag for registering external Wasm functions #4413

rvanasa · 2024-02-22T17:37:00Z

This PR introduces the --rts-function flag to register custom extern "C"-style language bindings in the RTS. The goal is for third-party tools such as Mops and Vessel to be able to include language bindings for custom mo-rts.wasm files via their sources commands.

Example workflow to manually try this out:

Recompile the RTS with a new Rust language binding:

#[no_mangle]
pub unsafe extern "C" fn echo(value: u32) -> u32 {
    value
}

Create a Motoko file (ffi.mo):

import Prim "mo:prim";

let input : Nat32 = 5;
let expected : Nat32 = 5;

let result = (prim "rts:echo" : Nat32 -> Nat32)(input);

Prim.debugPrint(debug_show result);

assert result == expected;

Run the following commands (with MOC_*_RTS variables automatically configured in Nix environment):

MOC_UNLOCK_PRIM=1 moc --rts-function echo -c ffi.mo -wasi-system-api
wasmtime result-ffi.wasm

Feel free to suggest other ways of passing this definition. Hoping to make this relatively simple for third-party tooling to register these interfaces while keeping everything readable at a quick glance.

Progress:

--rts-function flag
rts: primitives to access arbitrary FFI bindings in the RTS
Infer input and return types from Wasm exports
Restrict rts: primitives to registered functions
Test --rts-function flag and rts: primitives

Next steps:

Either an import syntax for RTS functions or some way to unlock prim expressions for specific Motoko files
Tools for building custom mo-rts.wasm files
Possibly reserve memory for external language bindings

crusso · 2024-02-22T18:43:33Z

Nice start!

Something like this could work, but the mapping between Rust and Motoko values isn't that simple. Even an i32 doesn't map to Nat32, and, what's worse, the mapping will depend on the compiled code and choice of GC.

What we could do is provide some Rust functions for constructing/destructing Motoko values to Rust equivalents, and have the Rust code use those functions and perhaps assert the intended Motoko type (maybe using a macro) for the compiler to pick by inspecting the custom RTS (perhaps via custom section declaring the types).

The RTS code already uses the type Value (=u32) for this, and we could maybe provide a collection of helper functions that treat Value as an abstract type, enforcing the expected representation invariants.

toNat32 : u32 -> Value
fromNat32 : Value -> u32
toArray : [Value] -> Value

etc.
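To make the idea of an abstract Value concrete, here is a minimal sketch of such helpers as a Rust newtype with a private field. The encoding used (an unboxed scalar shifted left by one bit, with the low bit reserved for heap pointers) is a toy assumption for illustration only. It is not Motoko's actual representation, which depends on the compiled code and the GC. The method names from_nat32 and to_nat32 are likewise hypothetical stand-ins for toNat32 and fromNat32.

```rust
/// Opaque handle to a Motoko value. The field is private, so binding code
/// cannot forge or inspect raw bits and must go through the helpers below.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Value(u32);

// ASSUMPTION (toy encoding, not the real RTS layout):
//   low bit 0 => unboxed scalar stored in the upper 31 bits
//   low bit 1 => heap pointer (not modelled in this sketch)
impl Value {
    /// Encode a Nat32 that fits in 31 bits. Returns None for larger numbers,
    /// which a real RTS would box on the heap instead.
    pub fn from_nat32(n: u32) -> Option<Value> {
        if n > (u32::MAX >> 1) {
            None
        } else {
            Some(Value(n << 1))
        }
    }

    /// Decode a Nat32, checking the representation invariant first.
    pub fn to_nat32(self) -> Option<u32> {
        if self.0 & 1 == 0 {
            Some(self.0 >> 1)
        } else {
            None
        }
    }
}

/// A binding written only against the abstract API: it never touches the
/// raw bits, so it keeps working if the encoding behind Value changes.
pub fn succ(v: Value) -> Option<Value> {
    let n = v.to_nat32()?;
    Value::from_nat32(n.checked_add(1)?)
}
```

With this shape, succ applied to the encoding of 5 yields the encoding of 6, and values that do not fit the toy encoding are rejected instead of being silently misread.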

#[no_mangle]
#[Motoko("<A>A->A")]
pub unsafe extern "C" fn echo(value: Value) -> Value {
    value
}

#[no_mangle]
#[Motoko("Nat32 -> Nat32")]
pub unsafe extern "C" fn succ(value: Value) -> Value {
    // Unwrap the Nat32, increment it, and wrap the result again.
    toNat32(fromNat32(value) + 1)
}

Usually you'd have to worry about the GC running at the same time and avoid introducing GC holes (allocated data that isn't visible to the GC and could get collected under your feet). But we might be able to avoid all that hassle because we know the GC won't run until the end of the message anyway, after all the FFI code has done its stuff and returned.

github-actions · 2024-02-22T19:05:48Z

Comparing from 3f3af73 to 0800c53:
The produced WebAssembly code seems to be completely unchanged.

rvanasa · 2024-02-22T21:25:58Z

...the mapping will depend on the compiled code and choice of GC.

Right; I see what you're saying about this. The assumption is that the external tooling is targeting a specific combination of GC and debug/release, so in this case the tooling would be able to specify the correct Wasm value types.

Edit: I removed the DSL in favor of simply reading the types from the RTS Wasm file itself, which solves this problem.
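As an illustration of what reading the types from the Wasm file involves, here is a minimal sketch in Rust that looks up the signature of an exported function directly from the binary. It is not the compiler's implementation (which lives in the OCaml code base). The names export_type, FuncType and ValType are hypothetical, and the sketch assumes a Wasm 1.0 module using only the four numeric value types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValType { I32, I64, F32, F64 }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FuncType { pub params: Vec<ValType>, pub results: Vec<ValType> }

struct Reader<'a> { bytes: &'a [u8], pos: usize }

impl<'a> Reader<'a> {
    fn byte(&mut self) -> Option<u8> {
        let b = *self.bytes.get(self.pos)?;
        self.pos += 1;
        Some(b)
    }
    /// Unsigned LEB128 integer of at most 32 bits.
    fn leb(&mut self) -> Option<u32> {
        let (mut result, mut shift) = (0u32, 0u32);
        loop {
            if shift >= 32 { return None; }
            let b = self.byte()?;
            result |= u32::from(b & 0x7f) << shift;
            if b & 0x80 == 0 { return Some(result); }
            shift += 7;
        }
    }
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let slice = self.bytes.get(self.pos..end)?;
        self.pos = end;
        Some(slice)
    }
    /// Length-prefixed byte string (import and export names).
    fn name(&mut self) -> Option<&'a [u8]> {
        let n = self.leb()? as usize;
        self.take(n)
    }
    fn val_types(&mut self) -> Option<Vec<ValType>> {
        let n = self.leb()?;
        (0..n).map(|_| match self.byte()? {
            0x7f => Some(ValType::I32),
            0x7e => Some(ValType::I64),
            0x7d => Some(ValType::F32),
            0x7c => Some(ValType::F64),
            _ => None, // other value types are out of scope for this sketch
        }).collect()
    }
    /// Table and memory limits: flags, minimum, optional maximum.
    fn limits(&mut self) -> Option<()> {
        let flags = self.byte()?;
        self.leb()?;
        if flags & 1 != 0 { self.leb()?; }
        Some(())
    }
}

/// Returns the type of the function exported as `name`, provided the module
/// defines it itself (re-exported imports are not resolved here).
pub fn export_type(wasm: &[u8], name: &str) -> Option<FuncType> {
    let mut r = Reader { bytes: wasm, pos: 0 };
    if r.take(8)? != b"\0asm\x01\0\0\0" { return None; } // magic + version 1
    let (mut types, mut defined) = (Vec::new(), Vec::new());
    let (mut imported_funcs, mut export_idx) = (0u32, None);
    while r.pos < wasm.len() {
        let id = r.byte()?;
        let size = r.leb()? as usize;
        let mut s = Reader { bytes: r.take(size)?, pos: 0 };
        // Only the type, import, function and export sections are inspected.
        let count = if matches!(id, 1 | 2 | 3 | 7) { s.leb()? } else { 0 };
        for _ in 0..count {
            match id {
                1 => { // type section entry: 0x60, params, results
                    if s.byte()? != 0x60 { return None; }
                    let (params, results) = (s.val_types()?, s.val_types()?);
                    types.push(FuncType { params, results });
                }
                2 => { // import entry: only function imports shift indices
                    s.name()?;
                    s.name()?;
                    match s.byte()? {
                        0x00 => { s.leb()?; imported_funcs += 1; }
                        0x01 => { s.byte()?; s.limits()?; }
                        0x02 => s.limits()?,
                        0x03 => { s.byte()?; s.byte()?; }
                        _ => return None,
                    }
                }
                3 => defined.push(s.leb()?), // type index per defined function
                _ => { // export entry: name, kind, index
                    let (field, kind, idx) = (s.name()?, s.byte()?, s.leb()?);
                    if kind == 0 && field == name.as_bytes() { export_idx = Some(idx); }
                }
            }
        }
    }
    // Function indices count imports first, then locally defined functions.
    let local = export_idx?.checked_sub(imported_funcs)? as usize;
    types.get(*defined.get(local)? as usize).cloned()
}
```

For the echo binding from the description, a lookup of "echo" in the compiled RTS would be expected to report one i32 parameter and one i32 result, which is the information the removed DSL previously had to spell out by hand.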

What we could do is provide some Rust functions for constructing/destructing Motoko values to Rust equivalents, and have the Rust code use those functions and perhaps assert the intended Motoko type (maybe using a macro) for the compiler to pick by inspecting the custom RTS (perhaps via custom section declaring the types).

I saw this after writing everything else (I almost suggested it as a possibility but thought there might be some performance reason not to do it). This would certainly solve the problem of needing to pass different flags depending on GC, etc. Beginning to look into this now.

...maybe provide a collection of helper functions that treat Value as an abstract type, enforcing the expected representation invariants.

It seems reasonable to build the Motoko attribute (from the example above) directly into the RTS codebase, although it could also be valuable to expose this as an external library so that code generation tools similar to Canpack could build the RTS Wasm file from many individual crates without losing editor type checking, if that makes sense.

Here are some examples of what I'm imagining (presumably handled by external preprocessing tools, but we could build the Motoko attribute directly into the RTS codebase):

my_rust_crate/Cargo.toml

# ...

[dependencies]
moc-bindgen = "1.2.3"

my_rust_crate/src/lib.rs

use moc_bindgen::{motoko, Value};

#[motoko]
pub fn hello(name: String) -> String {
   format!("Hello, {name}!").into()
}

#[motoko]
pub fn custom(custom_value: Value) -> Value {
   // ...
}

#[motoko]
pub fn get_record_field<A>(arg: RecordStruct<A>) -> A {
   arg.value
}

#[motoko]
struct RecordStruct<A> {
   pub value: A
}

moc_bindgen::export!();

// etc.

Alternatively:

moc_bindgen::export! {
  pub fn hello(name: String) -> String {
     format!("Hello, {name}!")
  }
  
  pub fn get_record_field<A>(arg: RecordStruct<A>) -> A {
     arg.value
  }

  struct RecordStruct<A> {
     pub value: A
  }

  // ...
};

dfx_project/mops.toml

# ...

[rust-dependencies]
my-rust-crate = "0.1.2"

Another approach could be to pass Motoko type strings in the --rts-function flag and handle everything in the compiler itself. The original hesitation was that it would add unnecessary complexity for most use cases, such as if we wanted to create something along the lines of wasm-bindgen for Motoko. I was planning to implement this but wanted to try this other approach first to get a sense for what would be involved.
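To give a sense of what handling type strings would involve, here is a minimal Rust sketch that parses a first-order signature such as "Nat32 -> Nat32". The accepted format, the Signature struct and parse_signature are all hypothetical. This is not the DSL that was removed from this PR, and generic signatures such as "<A>A->A" are deliberately rejected.

```rust
/// Hypothetical parsed form of a type string such as "(Nat32, Nat64) -> Bool".
#[derive(Debug, PartialEq, Eq)]
pub struct Signature {
    pub params: Vec<String>,
    pub result: String,
}

/// Accepts only simple type names: an uppercase ASCII letter followed by
/// ASCII letters or digits (for example Nat32, Bool, Int64).
fn is_type_name(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_uppercase())
        && chars.all(|c| c.is_ascii_alphanumeric())
}

pub fn parse_signature(text: &str) -> Result<Signature, String> {
    // Split the argument list from the result type at the first arrow.
    let (lhs, rhs) = text
        .split_once("->")
        .ok_or_else(|| format!("missing '->' in {text:?}"))?;

    let result = rhs.trim();
    if !is_type_name(result) {
        return Err(format!("unsupported result type {result:?}"));
    }

    // Parentheses around the parameter list are optional.
    let lhs = lhs.trim();
    let lhs = lhs
        .strip_prefix('(')
        .and_then(|inner| inner.strip_suffix(')'))
        .unwrap_or(lhs);

    let params = if lhs.trim().is_empty() {
        Vec::new()
    } else {
        lhs.split(',')
            .map(|p| {
                let p = p.trim();
                if is_type_name(p) {
                    Ok(p.to_string())
                } else {
                    Err(format!("unsupported parameter type {p:?}"))
                }
            })
            .collect::<Result<Vec<_>, _>>()?
    };

    Ok(Signature { params, result: result.to_string() })
}
```

Even this restricted form needs validation and error reporting, and it says nothing yet about how each Motoko type maps to Wasm value types, which is where most of the added complexity would sit.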

I suppose yet another approach could be to remove the --rts-function flags and statically determine the Wasm types from usages of (prim "rts:*" ...)() (or some new syntax for this purpose).

luc-blaeser · 2024-03-15T16:05:47Z

This is very interesting. Thank you for the investigation and work!
I believe this would offer a power mode for experienced developers, giving them efficient interoperability with carefully selected Rust functionality.
I see a challenge in ensuring memory safety, namely guaranteeing that third-party code cannot break Motoko's memory management. (Otherwise, if we allowed users to assess correctness on their own and take the corresponding risk, we would probably get memory corruption issue reports that could be due to unsafe FFI imports, and such reports would be difficult to triage.)
As a possible solution, we could include an analysis to verify that the imported function only uses safe Rust, does not use any dynamic allocations, cannot receive any pointers from Motoko, and possibly meets other safety requirements. Alternatively, when allowing allocations through our custom Rust-interop allocator in the RTS, we would need to prevent static variables (as otherwise untracked low-level pointers could be retained by Rust across IC messages, impacting GC correctness).
But I believe this is a very important topic and we should continue with it.
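As a rough illustration of the kind of check described above, here is a deliberately naive Rust sketch that scans binding source text for the unsafe keyword, static items, and a few heap-allocating standard types. The Finding enum and lint_binding_source are hypothetical names. A real analysis would need to work on the parsed syntax tree or on compiler output, since this word scan is fooled by comments, string literals, macros and type aliases.

```rust
/// Hypothetical categories of findings a binding lint might report.
#[derive(Debug, PartialEq, Eq)]
pub enum Finding {
    /// The `unsafe` keyword appears somewhere in the source.
    UnsafeCode,
    /// A `static` item appears, which could retain pointers across messages.
    StaticItem,
    /// A standard type that allocates on the Rust heap is mentioned.
    HeapType(String),
}

/// Word-level scan of Rust source text. ASSUMPTION: a heuristic only, it
/// does not parse Rust and can report false positives and negatives.
pub fn lint_binding_source(src: &str) -> Vec<Finding> {
    // Drop the `'static` lifetime so it is not mistaken for a static item.
    let src = src.replace("'static", "");
    let mut findings = Vec::new();
    // Split on every character that cannot be part of an identifier.
    for word in src.split(|c: char| !(c.is_alphanumeric() || c == '_')) {
        match word {
            "unsafe" => findings.push(Finding::UnsafeCode),
            "static" => findings.push(Finding::StaticItem),
            "Box" | "Vec" | "String" | "Rc" | "Arc" => {
                findings.push(Finding::HeapType(word.to_string()))
            }
            _ => {}
        }
    }
    findings
}
```

Run against the echo binding from the PR description, this would report a single UnsafeCode finding, because that function is declared as pub unsafe extern "C".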

rvanasa · 2024-03-15T16:51:20Z

Thanks for taking a look! Adding a warning if a custom RTS function uses unsafe Rust seems like a great idea. I also fully agree with your evaluation that this FFI approach would require advanced Rust knowledge in general, which we could at least somewhat improve with static code analysis.

rvanasa added 6 commits February 21, 2024 20:14

Add '--rts-function' CLI flag to register RTS bindings

62a9fc0

Add 'rts:*' prim expressions

57555b5

Add DSL parser for RTS function types

76e29c3

Simplify DSL token logic

e397f52

Local refactor

46e693f

Misc

35792b8

rvanasa added 12 commits February 23, 2024 13:49

Misc

08be052

Read export function types from RTS Wasm file

62daada

Update comment

a6b70e0

Merge branch 'master' of https://github.com/dfinity/motoko into ryan/ffi

ff583f0

Remove unused 'open'

32c469f

Rename local variable

43d9ef5

Restrict 'rts:' prim to custom RTS functions

82362a3

Fix warning

2b177d9

Misc

69dcc2a

Update comment

93889fe

Simplify

c8baa22

Refactor / fix

0800c53

rvanasa mentioned this pull request Mar 4, 2024

feat: custom RTS functions using a Wasm custom section #4424

Draft

rvanasa mentioned this pull request Mar 19, 2024

experiment: custom RTS functions #4438

Draft

12 tasks
