Proposal: OCAP bindings #1291
Comments
@jgravelle-google @fgmccabe This is the proposal I mentioned in the last webidl video-call.
Thanks for the thoughtful writeup of the idea! I think the use case you've identified is important. It's possible that what you want is already expressible in terms of existing proposals. In particular, even without ☃ bindings, the type imports and exports proposal has recently been factored out of the GC proposal (thanks @rossberg!), and I think these give you what you want:
For memory management of a linear-memory-implemented abstract type, I think you could either:
Yeah, this proposal depends on typed imports. I've edited the description to make that clearer. Although typed imports/exports cover a lot of the design space I had in mind when writing the draft for this, I do think the two features you mentioned (arbitrary, non-reference type exports and
I guess you can shim other types with an i31, but ultimately what you want is a type that matches the C++/Rust/whatever representation of your capability with as little friction as possible. (actually, I'm a little confused when reading the GC/typed import specs; does On the other hand, there's probably an implementation cost for VMs; they have to make sure they can store arbitrary valtypes in tables; and if capabilities can be valtypes, the implementation can't convert them to anyref or rely on a generic reference implementation. That said, I think being able to export and store arbitrary types makes for a coherent type system; and I suspect the implementation costs are small.
By the client, do you mean the importing module? That works as a polyfill, but if you want safe interop even with malicious code, you need the host to enforce ref counting. Otherwise you can always rely on GC, but this proposal is mostly meant for languages and hosts that don't have GC.
I think that's out of scope for snowman bindings. Probably as a post-MVP extension for GC.
Abstract
This is a proposal for adding inter-module bindings in the form of opaque capabilities that can be exported by arbitrary wasm modules.
Rationale
WebAssembly is currently good at executing a self-contained program managing a monolithic block of memory. It lacks key features when it comes to communicating fine-grained data between modules, or between a module and its host.
Some use cases are made harder by lacking these features:
The simplest use cases, where wasm code calls host functions which immediately return data (DOM access, WASI), can be covered by simple "static" bindings, which seems to be the direction WebIDL/Snowperson-bindings are headed in.
However, in some cases, you may want to pass persistent data back and forth to a module, without copying it all as a JSON/Protobuf object, eg:
This proposal tries to outline what a scheme capable of supporting the above code would look like.
Description
Each module can export capability types; from the module's perspective, these capabilities are simple wrappers around wasm value types (i32, f64, ref); from the perspective of other modules and instances, they are opaque types that can only be copied, moved, dropped, and passed back to the original module instance; they cannot be forged, mutated or downcast from anyref or other capability types. They can only be stored in local variables and tables (similar to reference types).
Capabilities are statically-checked, nominal types. Modules can't swap the values of two capabilities with the same type name, unless they come from the same instance of the same module (that last part must be checked at runtime).
Note that, because of the static type-checking requirement, these types implement an "object capability" scheme, not an "object-oriented programming" scheme. Concepts usually associated with OOP, such as vtables and polymorphism, aren't covered by this proposal. In fact, this proposal doesn't require function references at all, though it depends on the typed imports proposal.
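To make the rules above more concrete, here is a rough C++ model of what they amount to. It is only a sketch: Capability, ExportingModule, InstanceId and the tag types are hypothetical names invented for illustration, not proposed syntax. Distinct tag types stand in for the static, nominal check, and the owner id stands in for the runtime "same instance" check.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// Hypothetical names throughout; none of this is proposed syntax.
using InstanceId = std::uint32_t;

// Tag types give each capability type a distinct nominal identity.
struct DatabaseTag {};
struct FileTag {};

class ExportingModule;  // forward declaration for the friend grant below

template <typename Tag>
class Capability {
    // Copy, move and destruction use the implicit (public) defaults.
    // Construction and unwrapping are reserved to the exporting module,
    // so importers cannot forge or inspect the wrapped value.
    friend class ExportingModule;
    Capability(InstanceId owner, std::int32_t raw) : owner_(owner), raw_(raw) {}
    InstanceId owner_;
    std::int32_t raw_;
};

// Static, nominal check: two capability types never unify.
static_assert(!std::is_same<Capability<DatabaseTag>, Capability<FileTag>>::value,
              "capability types are nominal");

class ExportingModule {
public:
    explicit ExportingModule(InstanceId id) : id_(id) {}

    Capability<DatabaseTag> open_database(std::int32_t raw) const {
        return Capability<DatabaseTag>(id_, raw);
    }

    // Runtime check: the same type name is not enough, the capability
    // must have been produced by this very instance.
    std::int32_t unwrap(const Capability<DatabaseTag>& cap) const {
        if (cap.owner_ != id_) {
            throw std::invalid_argument("capability from another instance");
        }
        return cap.raw_;
    }

private:
    InstanceId id_;
};
```

An importer can copy the value returned by open_database and hand it back to the instance that produced it, but passing it to a second instance of the same module is rejected at runtime.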
Memory Management
For every capability type, modules also provide lifetime hooks rc_increment and rc_decrement. These hooks are private, accessible only to the host, and allow the host to provide the module with trusted information about its capabilities' lifetimes, in the form of a reference count.
Using these hooks, modules are responsible for the memory-safety of capabilities in relation to their linear memory. For instance, a C++ module exporting shared pointers as capabilities would be responsible for making sure that the linear memory slice these shared pointers are stored in isn't overwritten with garbage data.
The wasm host is only responsible for calling the hooks when capabilities are copied or dropped.
As a result, a program may end up with reference cycles that can't be trivially detected at compile time. As with most reference-counting schemes, the developer is ultimately responsible for making sure these cycles don't happen.
An alternative lifetime scheme based on garbage collection may be added, but isn't part of this proposal. The rationale is that most hosts and languages are capable of implementing an RC scheme; but some languages (eg Rust/C++) and hosts may not be able / willing to implement a GC scheme.
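As an illustration of the host's side of this contract, the sketch below models a host-held handle that calls the hooks on copy and on drop. LifetimeHooks and HostHandle are hypothetical names, and the sketch assumes the module hands out each capability already counted once, so creating the first handle does not call rc_increment.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical shape of the per-type lifetime hooks a module would export.
struct LifetimeHooks {
    std::function<void(std::int32_t)> rc_increment;
    std::function<void(std::int32_t)> rc_decrement;
};

// Host-side handle: the host, not the importing module, calls the hooks.
class HostHandle {
public:
    // Assumption: the capability arrives with one reference already counted,
    // so adopting it does not touch the hooks.
    HostHandle(const LifetimeHooks* hooks, std::int32_t raw)
        : hooks_(hooks), raw_(raw) {}

    // Copying a capability notifies the exporting module.
    HostHandle(const HostHandle& other)
        : hooks_(other.hooks_), raw_(other.raw_) {
        if (hooks_ != nullptr) hooks_->rc_increment(raw_);
    }

    // Moving transfers the existing reference; no hook is called.
    HostHandle(HostHandle&& other) noexcept
        : hooks_(std::exchange(other.hooks_, nullptr)), raw_(other.raw_) {}

    // Copy-and-swap assignment: the old reference is dropped with `other`.
    HostHandle& operator=(HostHandle other) noexcept {
        std::swap(hooks_, other.hooks_);
        std::swap(raw_, other.raw_);
        return *this;
    }

    // Dropping a capability notifies the exporting module.
    ~HostHandle() {
        if (hooks_ != nullptr) hooks_->rc_decrement(raw_);
    }

private:
    const LifetimeHooks* hooks_;
    std::int32_t raw_;
};
```

Because only copies and drops go through the hooks, a moved-from handle contributes nothing to the count, which matches the "copied or dropped" wording above.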
Overhead
The overhead of capabilities should be constant-time at worst. Capabilities might be stored as a tuple of a pointer to a store plus its declared type.
In the worst case, passing a capability to a function should only incur a pointer check at runtime. In the best case, the function call could be inlined and using the capability could be equivalent to a single pointer read/write.
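A possible reading of that representation, as a sketch only: Store, TypeDecl, CapabilityRef and accepts are hypothetical names, and the layout is an assumption about how a host might implement this, not something the proposal mandates.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side types: a store per module instance, and a
// descriptor per exported capability type.
struct Store {};
struct TypeDecl { const char* name; };

// The capability as the host might keep it: owning store, declared type,
// and the wrapped wasm value (an i32 in this sketch).
struct CapabilityRef {
    const Store* store;
    const TypeDecl* type;
    std::int32_t raw;
};

// The declared type is assumed to be checked statically, when the importing
// module is validated. The only per-call cost is this pointer comparison.
inline bool accepts(const Store& callee_store, const CapabilityRef& cap) {
    return cap.store == &callee_store;
}
```

The check does not depend on how many capabilities exist or how large the wrapped data is, which is what keeps the overhead constant-time.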
Example
The following example is meant to be indicative of what exporting and importing OCAP bindings would look like; the syntax is still fairly loose, and takes a few shortcuts.
main.cpp:
Database.wasm exports:
In the above example, the Database module exports:
The C++ part of the code isn't aware that the capabilities it imports are i32 values; it can only manipulate them through Database_* functions.
Also, note that shared_ptr::increment + shared_ptr::decrement are different from Database_request_close + Database_close. Unlike the reference-counting hooks, the Database_close function isn't guarded by the host. Which means, for instance, that the following code can compile:
and should be accounted for in the Database implementation.
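One way the Database side could account for an unguarded Database_close is sketched below. Everything except the name Database_close is hypothetical: Database_open, Database_query, the Connection struct and the standalone rc_decrement function are stand-ins invented for illustration, and the signatures are assumptions. The idea is that closing only marks the connection, while the slot itself is reclaimed when the host reports that the last reference is gone.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical module-side state: the i32 behind each capability is a key
// into this map.
struct Connection {
    bool closed = false;
    int ref_count = 1;  // assumption: created with one reference
};

static std::unordered_map<std::int32_t, Connection> g_connections;
static std::int32_t g_next_id = 1;

// Hypothetical constructor for a Database capability.
std::int32_t Database_open() {
    std::int32_t id = g_next_id++;
    g_connections[id] = Connection{};
    return id;
}

// Not guarded by the host: may be called twice, or before later uses.
// It only marks the connection; the slot stays valid.
void Database_close(std::int32_t db) {
    auto it = g_connections.find(db);
    if (it != g_connections.end()) it->second.closed = true;
}

// Hypothetical operation: reports failure instead of touching a closed
// connection.
bool Database_query(std::int32_t db, const std::string& sql) {
    (void)sql;  // the query text is irrelevant to this sketch
    auto it = g_connections.find(db);
    return it != g_connections.end() && !it->second.closed;
}

// Stand-in for the host-only lifetime hook: only here is the slot freed.
void rc_decrement(std::int32_t db) {
    auto it = g_connections.find(db);
    if (it != g_connections.end() && --it->second.ref_count == 0) {
        g_connections.erase(it);
    }
}
```

With this split, using a capability after closing it is a recoverable error rather than a memory-safety problem, and the memory behind it is released only once the host has dropped every copy.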
Snowperson bindings dependency
This proposal is meant as an extension of snowperson-bindings.
For those who don't know, snowperson-bindings is the WIP name of an intermediate layer between wasm and WebIDL-bindings that was first presented at the June CG meeting.
This proposal is written under the assumption that snowperson-bindings will cover passing and returning typed data between functions (eg structs, arrays, enums, unions). This assumption may be invalidated, as snowperson-bindings are still evolving.
Note that even if, say, structs aren't a part of snowperson bindings, they may still be emulated with OCAP bindings, eg:
may become:
Possible improvements
Interop with JS
Interoperability with JS is outside the scope of this proposal.
Capabilities could be exported as opaque objects (maybe using private fields) with an explicit dispose() method. These objects could only be produced from and passed to methods from the module exporting them.
Generics
Some use cases need generics.
For instance, if a module A wants to create an object graph, and import a function from a module B that reads through or mutates that graph (eg B exports pathfinding functions), there are several possible implementations:
Using generics may be desirable if B was developed with no knowledge of A (eg B was downloaded from a package manager), and in any situation where a user wants to pass complex data to a module without allocating directly in that module.
However, generics fall outside the scope of this proposal.
Const types
Some functions may be declared as taking a read-only view of a capability. While this is mostly impossible to verify for a wasm host, it could still enable optimizations when compiling the module importing these functions.
The host would need to take into account the possibility of a malicious module lying in its exported definition, so that the optimizations would produce logically incorrect, but memory-safe code.
Tuple types
Modules might want to bind capabilities to tuple types. For instance, std::shared_ptr<T> is usually implemented as a tuple of a T* pointer and a pointer to the reference counter, for performance reasons.