Support Imported Functions #4

Closed
tessi opened this issue Jan 16, 2020 · 12 comments
Labels
help wanted Extra attention is needed

Comments

@tessi
Owner

tessi commented Jan 16, 2020

The host of a WASM instance can provide functions to the WASM instance which can be called from WASM code. This is commonly called "imported functions".

An example program (in the WAT format) that expects the function imports.imported_func to be provided:

(module
  (func $i (import "imports" "imported_func") (param i32))
  (func (export "exported_func")
    i32.const 42
    call $i))

Wasmex does not support imported functions yet. This ticket is the place to brainstorm potential implementation ideas.

Passing Elixir/Erlang functions as callbacks

We implement WASM execution in a NIF. If our NIF gets a fun term from Erlang, we must be able to invoke that function and receive the result.

Unfortunately, Erlang's NIF API does not allow calling function terms. The only function-related API seems to be one that detects whether a term is a function or not.

Send a message to an Erlang process

There is the NIF API call enif_send, which can send a message to an Erlang process. We could have a separate Erlang process running which listens for these messages, executes the requested function, and calls into the NIF again with the result. This requires some magic with threads and mutexes, though.

I found two people who implemented/proposed this:

The only way to call an Erlang function from a NIF is to send a message
to a callback process that will then run the function. You can send the
fun in this message but to keep it in the NIF you will probably need to
copy it in an environment you allocated yourself (and then copy it again
in the environment that will send the message).

Calling a function from a NIF is easy. Having the NIF receive the result
of the callback is much harder.

My solution involves having a separate NIF function for receiving the
result, called by the callback process at the end of the callback execution.

This NIF function will then require a mutex lock/cond to store the
result where I will be able to read it afterwards, and to signal the
waiting thread that it can read it.

My own code runs in a separate thread from the schedulers, if you need
to wait from inside a scheduler thread things might get funny if there's
only one scheduler or if the callback takes too long.

Suffice to say that it's probably not very efficient to do all this.

This seems to be a viable option, although complex to implement and probably not very efficient.
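The round trip described above (send a message to a callback process, then let that process hand the result back through a second function guarded by a mutex and a condition variable) can be sketched in plain Rust. This is a minimal sketch under assumptions, not Wasmex code: an mpsc channel stands in for enif_send, a spawned thread stands in for the Erlang callback process, and the names CallbackRequest, receive_callback_result, spawn_callback_process, and call_imported are hypothetical.

```rust
use std::sync::{mpsc, Arc, Condvar, Mutex};
use std::thread;

/// Slot where the callback process stores the result, plus a condition
/// variable used to wake up the thread that waits for it.
type ResultSlot = Arc<(Mutex<Option<i32>>, Condvar)>;

/// Message sent to the callback process (hypothetical stand-in for what
/// would be sent with `enif_send`).
struct CallbackRequest {
    function: String,
    args: Vec<i32>,
    reply_to: ResultSlot,
}

/// Stand-in for the separate NIF function that the callback process calls
/// at the end of the callback: store the result, then signal the waiter.
fn receive_callback_result(slot: &ResultSlot, value: i32) {
    let (lock, cond) = &**slot;
    *lock.lock().unwrap() = Some(value);
    cond.notify_one();
}

/// Stand-in for the Erlang callback process: receives requests, runs the
/// requested function, and reports the result back.
fn spawn_callback_process(rx: mpsc::Receiver<CallbackRequest>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for req in rx {
            let value: i32 = match req.function.as_str() {
                "sum" => req.args.iter().sum(),
                _ => 0, // unknown functions return 0 in this sketch
            };
            receive_callback_result(&req.reply_to, value);
        }
    })
}

/// Called from the thread that executes WASM code: send the request, then
/// block on the condition variable until the result has been stored.
fn call_imported(tx: &mpsc::Sender<CallbackRequest>, function: &str, args: Vec<i32>) -> i32 {
    let slot: ResultSlot = Arc::new((Mutex::new(None), Condvar::new()));
    tx.send(CallbackRequest {
        function: function.to_string(),
        args,
        reply_to: Arc::clone(&slot),
    })
    .expect("callback process is gone");

    let (lock, cond) = &*slot;
    let mut guard = lock.lock().unwrap();
    // Loop to guard against spurious wakeups.
    while guard.is_none() {
        guard = cond.wait(guard).unwrap();
    }
    let value = guard.take().expect("result was just checked");
    value
}
```

As the quoted caveat says, the waiting side should not be a scheduler thread; in this sketch the caller of call_imported simply blocks until the result arrives.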

@tessi tessi added the help wanted Extra attention is needed label Jan 16, 2020
@bamorim

bamorim commented Mar 3, 2020

@tessi first, thanks for the work you've put into this.

I was about to start a project exactly like this (integrating wasmer with Elixir via rustler).

On the topic of supporting imported functions, maybe we should start with functions that have no return value, just to test the NIF->Elixir way.

I think there is a lot to experiment with on how to generate namespaces for imports.

One thing that would already be great would be to be able to stitch multiple wasm modules together (one module's exports fulfilling another's imports).

The high-level API on the Elixir side should, I think, be something like:

defmodule Wasmex.Instance do
  @type t :: any()

  # A specific callback can be passed as an anonymous function or an MFA tuple.
  @type callback :: (... -> any()) | {atom(), atom(), non_neg_integer()}

  # A namespace that exposes other callbacks can be another instance, a map
  # from name to callback, or a module name (maybe we could have a behaviour
  # or expose all functions, not sure yet). I'd stick with the map for now.
  @type namespace :: t() | %{String.t() => callback()} | atom()

  # Imports is just a map from name to namespace.
  @type imports :: %{String.t() => namespace()}

  @type opt :: {:imports, imports()}
  @type opts :: [opt()]

  # Stub so that the spec compiles; the real implementation is still open.
  @spec from_bytes(binary(), opts()) :: t()
  def from_bytes(_bytes, _opts), do: raise("not implemented yet")
end

Maybe the functions should always receive an environment.

After reading the link you suggested, it looks like the best thing would be to have every instance run in its own thread, right? And then we wrap the instance with some mutex/signals to receive values from Elixir.

If that is the case, maybe instance should be wrapped and communicated by a GenServer, in order to serialize calls in a standard way.

I think I'll give this a shot next week.

@bamorim

bamorim commented Mar 3, 2020

I've been thinking of something along these lines (a lot of it is probably incorrect since I haven't thought deeply about it, so take it with a grain of salt):

[image: architecture drawing]

@bamorim

bamorim commented Mar 3, 2020

Ah, and of course, instantiation would also call the NIF to create the thread, get a ref, and pass that ref to the caller and callbacker GenServers (they can be just simple processes, but GenServer is a nice abstraction).

@bamorim

bamorim commented Mar 3, 2020

Another, crazier idea I had (that could actually solve other Rustler problems, not only wasm-related ones) is to come up with some kind of bridge between a Rust actor system (such as actix) and Erlang processes.

Maybe something where we would spawn "twin actors" (a Rust actix actor that has an Erlang process counterpart) or something like that. Something that would allow message passing between actix actors and Erlang processes. Then we could build on top of that.

@tessi
Owner Author

tessi commented Mar 3, 2020

Hey @bamorim thanks for reaching out! I was very much looking for a partner to think this through with. You may just be the hero I need ;)

First, I agree that we should use GenServer to wrap WASM instances. Even more so because it would allow us to solve another related issue (see #6). In the mindset of "divide and conquer" we should probably implement that first, and then come back to imported functions.

Let's assume we have the GenServer part settled, we need a good API for imported functions. I appreciate your take in defining it. From a first read, it sounds reasonable when importing elixir-defined functions. But I'm not 100% sure yet how to properly link two instances -- doing that through elixir land would probably be horribly inefficient. As you said, we would also need to link memory of two instances - I think you proposed to give a pointer to the memory into every elixir-side function. I think, if we want to handle that, we should properly set up imported memory (WASM modules can export and import memory).

Another problem I see on the horizon is how to properly call into wasmer. As far as I can see, it does not have a generic way to import functions. They try to protect their users "the Rust way" and properly type-check everything. See https://github.com/wasmerio/wasmer/blob/master/lib/runtime-core/src/typed_func.rs . We, however, would need to inject a more generic function (e.g. one that gets all arguments passed in as a Vec). Maybe @Hywan could help us out here.
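To make the "all arguments passed in as a Vec" idea concrete, here is a minimal sketch in plain Rust. It deliberately does not use any wasmer API; Value, HostFn, and wrap_i32_binary are hypothetical names, and the only assumption is that values are one of the four numeric WASM types. It shows how a statically typed closure could be adapted to a dynamically typed calling convention, with the type check moved to call time.

```rust
/// The four numeric value types of WebAssembly, carried dynamically.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
}

/// A dynamically typed host function: all arguments come in as a slice,
/// all results go out as a Vec. Type errors surface at call time.
type HostFn = Box<dyn Fn(&[Value]) -> Result<Vec<Value>, String>>;

/// Adapts a typed `(i32, i32) -> i32` closure to the generic convention.
fn wrap_i32_binary(f: impl Fn(i32, i32) -> i32 + 'static) -> HostFn {
    Box::new(move |args: &[Value]| match args {
        [Value::I32(a), Value::I32(b)] => Ok(vec![Value::I32(f(*a, *b))]),
        other => Err(format!("expected (i32, i32), got {:?}", other)),
    })
}
```

The trade-off is the one named above: the typed approach rejects a mismatching function at compile time, while the generic one can only report it when the function is called.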

It all gives me a little headache. This is why I would very much like to tackle this issue step by step. First, refactor the interface to use processes/GenServer as in #6, then handling super simple imported functions.

Your "twin actors" idea really resonated with me. I am a fan of "symmetric" architectures. And this would be a great fit for it. As you said, it would be great to have this either built-in to rustler or in a separate library which we would use here. What do you think?

Also, many thanks for your architecture drawing. I took it as inspiration for the diagram in #6. Maybe you could have a look there.

TL;DR: Many ❤️ for contributing thoughts. I would recommend tackling #6 first as a first step to implement this. Alternatively (or additionally) we should refine your "twin actors" idea, as I find it really compelling.

@bamorim

bamorim commented Mar 5, 2020

Hi @tessi, happy that you liked it.

Today and tomorrow I'm attending Code BEAM SF and I have a talk to give today, so I won't have much time, but next week I can dive deep into that.

I agree with you, making the instance run in another thread and everything be async is a good starting point.

About the interface, the way I'm thinking is basically like how GenServers work: you just pass a message. You want to call fun with arg1 and arg2? Well, just send a message (GenServer.call or cast) to our "dispatcher" with {:fun, [arg1, arg2]}. The GenServer calls the NIF with that data, which dispatches it to our Actix actor running the instance. The actor then translates the Erlang terms into f64, f32, etc., calls the instance, gets the return value, and sends it in a message to the GenServer, which returns it to the caller.

That will not work that great for bigger data (such as large strings) because of all the copying, but that can be a later optimization (I have some ideas to explore, like trying to use Erlang's binary heap for the shared memory, but that would require going deep into BEAM internals).
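The message flow above can be sketched with a plain thread and channels. This is only an illustration under assumptions: a std thread stands in for the Actix actor, an mpsc channel stands in for the NIF dispatch, and Term, Call, invoke, spawn_instance_actor, and call are hypothetical names. The invoke function merely pretends to be a WASM instance with two exported functions.

```rust
use std::sync::mpsc;
use std::thread;

/// Loosely typed argument, as it might arrive from the Elixir side.
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Int(i64),
    Float(f64),
}

/// One request to the actor, comparable to `{:fun, [arg1, arg2]}`,
/// together with the channel on which the reply is expected.
struct Call {
    function: String,
    args: Vec<Term>,
    reply_to: mpsc::Sender<Result<Term, String>>,
}

/// Translates a loosely typed term into a concrete numeric type.
fn to_f64(term: &Term) -> f64 {
    match term {
        Term::Int(i) => *i as f64,
        Term::Float(f) => *f,
    }
}

/// Stand-in for calling an exported function on the instance.
fn invoke(function: &str, args: &[Term]) -> Result<Term, String> {
    match (function, args) {
        ("add", [Term::Int(a), Term::Int(b)]) => Ok(Term::Int(a + b)),
        ("scale", [x, factor]) => Ok(Term::Float(to_f64(x) * to_f64(factor))),
        _ => Err(format!("cannot call {} with {:?}", function, args)),
    }
}

/// Spawns the thread that owns the instance; calls are handled one at a
/// time, in the order in which they arrive.
fn spawn_instance_actor() -> mpsc::Sender<Call> {
    let (tx, rx) = mpsc::channel::<Call>();
    thread::spawn(move || {
        for call in rx {
            let result = invoke(&call.function, &call.args);
            // The caller may have given up waiting; ignore send errors.
            let _ = call.reply_to.send(result);
        }
    });
    tx
}

/// Synchronous request/reply, similar in spirit to `GenServer.call`.
fn call(actor: &mpsc::Sender<Call>, function: &str, args: Vec<Term>) -> Result<Term, String> {
    let (reply_to, reply) = mpsc::channel();
    actor
        .send(Call { function: function.to_string(), args, reply_to })
        .map_err(|_| "instance actor is gone".to_string())?;
    reply
        .recv()
        .map_err(|_| "no reply from instance actor".to_string())?
}
```

Because a single thread owns the instance and drains one queue, calls are serialized without extra locking, which is the property the GenServer wrapper is meant to provide.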

I'd like to start working on that on monday (or maybe saturday if I get bored hahaha).

Just to sync timezones, you are based in Germany, right?

@tessi
Owner Author

tessi commented Mar 5, 2020

Oh my gosh. Have fun doing your talk and being famous :)

I am about to bring my baby to bed but will have another thought about it later.

And yes, Germany :) so quite some time difference to bridge

@tessi
Owner Author

tessi commented Mar 5, 2020

Reading through it, I like the plan. Thanks for starting work on it! Tell me if there is anything I can support you with

@tessi
Owner Author

tessi commented Apr 14, 2020

With #7 merged, we can go forward with this ticket.

I see that wasmer 0.15.0 implemented polymorphic host function calls, which is another requirement to enable imported functions.

I guess updating to the latest wasmer release is a good next step.

@tessi
Owner Author

tessi commented Apr 17, 2020

Update: We updated to the latest wasmer release and I am currently working on a prototype to make imported functions happen.

Now thinking about the interface (@bamorim did a first draft here ❤️ and I am stealing some ideas from it). My thoughts so far are:

  • We want WASM import objects and export objects to be very similar.
  • The import/export object (or map, from Elixir's viewpoint) should not only hold functions, but also memory. And later also globals or tables.
  • Such an import object should be a Map on the Elixir side: it contains UTF-8 strings as keys and "namespace-like" objects as values. "Namespace-like" objects can be namespaces (probably a separate struct with corresponding namespace data in Rust land) or an instance. This would allow us to link instances efficiently.
  • The way we currently retrieve the exported WASM memory should be changed to instead retrieve the export object and fetch the memory from there, should it be exported.
  • A namespace struct allows registration of entries. These can be functions (but also memory, globals, or tables later), each having an identifier string.
  • I imagine three ways to register import functions in a namespace:
  1. using an anonymous function
  2. using module, name, arity
  3. using an exported function from a WASM instance

Each method needs to provide the function signature. This allows us to properly convert values from WASM types and helps wasmer typecheck the signature of the given function against what the WASM instance expects.

Namespace.new()
  |> Namespace.register_function("sum", &(&1 + &2), [:uint32, :uint32], [:uint32])
  |> Namespace.register_function("sum_variant2", MyMathModule, :sum, 2, [:uint32, :uint32], [:uint32])
  |> Namespace.register_function("sum_variant3", Namespace.get(Map.get(some_export_object, "exports"), "sum"), [:uint32, :uint32], [:uint32])
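On the Rust side, the "register with a signature, then check it against what the instance expects" idea could look roughly like the sketch below. It is not the Wasmex or wasmer implementation; ValType, Signature, ImportedFunction, and Namespace are hypothetical names, and callbacks are restricted to i32 values to keep the example short.

```rust
use std::collections::HashMap;

/// WASM value types that may appear in a function signature.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValType {
    I32,
    I64,
    F32,
    F64,
}

/// Parameter and result types of a function.
#[derive(Debug, Clone, PartialEq)]
struct Signature {
    params: Vec<ValType>,
    results: Vec<ValType>,
}

/// A registered import: the declared signature plus the host callback.
/// Simplification for this sketch: callbacks only see i32 values.
struct ImportedFunction {
    signature: Signature,
    callback: Box<dyn Fn(&[i32]) -> Vec<i32>>,
}

/// Maps identifier strings to registered functions.
#[derive(Default)]
struct Namespace {
    functions: HashMap<String, ImportedFunction>,
}

impl Namespace {
    fn new() -> Self {
        Self::default()
    }

    /// Builder-style registration, mirroring the pipe-based Elixir draft.
    fn register_function(
        mut self,
        name: &str,
        callback: impl Fn(&[i32]) -> Vec<i32> + 'static,
        params: Vec<ValType>,
        results: Vec<ValType>,
    ) -> Self {
        let signature = Signature { params, results };
        let function = ImportedFunction { signature, callback: Box::new(callback) };
        self.functions.insert(name.to_string(), function);
        self
    }

    /// Looks up an import and checks its declared signature against the
    /// signature the WASM module expects for that import.
    fn resolve(&self, name: &str, expected: &Signature) -> Result<&ImportedFunction, String> {
        let function = self
            .functions
            .get(name)
            .ok_or_else(|| format!("missing import {}", name))?;
        if &function.signature != expected {
            return Err(format!(
                "signature mismatch for {}: expected {:?}, got {:?}",
                name, expected, function.signature
            ));
        }
        Ok(function)
    }
}
```

Checking the signature once at link time means a mismatching callback is rejected when the instance is created instead of failing in the middle of a WASM call.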

@tessi
Owner Author

tessi commented Apr 21, 2020

I have a first hacked-together version at #9 - will improve it (and remove hacks there) over the next days.

Turns out my plan to make Namespace a separate resource (shared between Rust and Elixir) wasn't accepted by the Rust compiler. For now I decided to step back a bit and use plain maps to describe the import object when instantiating a WASM module.

@tessi
Owner Author

tessi commented May 3, 2020

done by merging #9

@tessi tessi closed this as completed May 3, 2020

2 participants