This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

How far do we want to go with a type system? #6

Closed
magcius opened this issue Nov 11, 2017 · 9 comments

Comments

@magcius

magcius commented Nov 11, 2017

This proposal tries to use annotations to describe most simple types seen on the web, along with a flat calling convention.

However, WebIDL has a fairly robust type system containing nullables, sequence types, record types, and more, all of which can be composed together. For now, we could mandate that each "variant" of a type requires an explicit host binding and get around things that way.

For instance, CanvasRenderingContext2D.setLineDash could take a Float64Sequence, which one would need to construct and push items to manually:

Float64Sequence* seq = Float64Sequence_new();
Float64Sequence_push(seq, 2);
Float64Sequence_push(seq, 3);
...
CanvasRenderingContext2D_setLineDash(ctx, seq);
Float64Sequence_free(seq);

But it might be interesting to look at supporting a variant-ish type system where we can describe a complex WebIDL value directly, rather than special casing a variety of "simple types" in isolation and expecting bindings to fill in the rest.

@lukewagner
Member

For your Float64Array example, I think a minor extension to the current ARRAY_BUFFER binding type would be the ability to construct any type of typed array from a (begin, end) i32 pair.
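A minimal sketch of what the host side of such a binding might do, assuming the (begin, end) pair is a byte range into the module's linear memory. The helper name `viewFloat64` is hypothetical, and a plain ArrayBuffer stands in for the buffer of a real WebAssembly.Memory; neither comes from the proposal.

```typescript
// Hypothetical host-side helper: build a Float64Array view over the byte
// range [begin, end) of linear memory, without copying.
function viewFloat64(buffer: ArrayBuffer, begin: number, end: number): Float64Array {
  const byteLength = end - begin;
  if (begin % 8 !== 0 || byteLength < 0 || byteLength % 8 !== 0) {
    throw new RangeError("range must be 8-byte aligned and hold whole f64 values");
  }
  return new Float64Array(buffer, begin, byteLength / 8);
}

// Stand-in for a module's linear memory (assumption: one 64 KiB page).
const memoryBuffer = new ArrayBuffer(65536);

// Pretend the wasm side wrote the dash pattern [2, 3] at byte offset 16.
new Float64Array(memoryBuffer, 16, 2).set([2, 3]);

// The binding would receive (16, 32) as two i32 values.
const dash = viewFloat64(memoryBuffer, 16, 32);
```

Because the view aliases linear memory, the wasm side would not need the manual new/push/free sequence for this case.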

But I think you're making a more general point that WebIDL supports more interesting compound types than a simple flat list of argument bindings could express. I was actually realizing the same thing while writing my comment in #3: the value of a dictionary key could be more than a plain i32; you might want an OBJECT_HANDLE or even a dictionary.

Based on this, I think what we'll need here is a little expression language, with one expression per argument of the callee. I was thinking maybe we could follow the lead of the constant expressions we use for data segment offsets and say that the value of every WebIDL argument is a wasm expression that has special validation rules. However, while constant expressions are a pure subset of normal wasm, these WebIDL expressions would be a subset extended with additional ops.

@magcius
Author

magcius commented Nov 15, 2017

I think one important decision we have to make is how much we want to make this accessible at runtime.

This spec doesn't allow me to call something with a variable number of arguments, for instance, so element.classList.add('foo', 'bar', 'baz') would be off-limits. Or, for any case where one might want to pass either an object reference or a string, the user has to statically import it as either "object reference" or "string", or, for other languages, the compiler has to generate the whole set of combinations and do runtime dispatch.

There's an argument to be made for just declaring a runtime JS variant API in a spec somewhere, and using the OBJECT_HANDLE mechanism to e.g. bind DOMTokenList.toggle's optional argument:

import void JSVal_new_string(int32 slot, char* p, int len) __attribute__((wasmjsdom("object_handle:JSVal string")));
import void JSVal_new_boolean_true(int32 slot) __attribute__((wasmjsdom("object_handle:JSVal")));
import void JSVal_new_boolean_false(int32 slot) __attribute__((wasmjsdom("object_handle:JSVal")));
import void JSVal_new_undefined(int32 slot) __attribute__((wasmjsdom("object_handle:JSVal")));

import void DOMTokenList_toggle(int32 slot, int32 token, int32 force) __attribute__((wasmjsdom("object_handle:DOMTokenList object_handle:JSVal object_handle:JSVal")));

typedef int32 DOMTokenList;
typedef enum { off, on, toggle } ForceMode;

// Pass `toggle` as the mode to leave the optional `force` argument undefined.
void SetClassName(DOMTokenList tokenList, ForceMode mode) {
    int32 token = _alloc_slot_JSVal();
    // sizeof counts the trailing NUL, so subtract one to get the string length.
    JSVal_new_string(token, "cool-css-class", sizeof("cool-css-class") - 1);
    int32 force = _alloc_slot_JSVal();
    switch (mode) {
    case on: JSVal_new_boolean_true(force); break;
    case off: JSVal_new_boolean_false(force); break;
    case toggle: JSVal_new_undefined(force); break;
    }
    DOMTokenList_toggle(tokenList, token, force);
}

This is basically an escape hatch for edge cases, where runtime variance is required, but it frees up the annotation language from reaching extreme levels of complexity and gives us a lot of extra breathing room.

@lukewagner
Member

Right, I think we'll be able to express the completely general case (constructing a JS array of argument JS values and then calling imported Function.prototype.apply) and then it's just a matter of optimizing common/hot call signatures.
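As a rough illustration of that general case, here is a sketch of a host-side shim that gathers argument values into a JS array and calls through Function.prototype.apply. The handle table, `allocHandle`, `callWithHandles`, and `FakeTokenList` are all hypothetical names made up for this example; the real binding operations were not settled in this thread.

```typescript
// Hypothetical handle table: wasm refers to JS values by i32 slot index.
const handles: unknown[] = [];

function allocHandle(value: unknown): number {
  handles.push(value);
  return handles.length - 1;
}

// General-case call: collect the argument values into a JS array, then
// invoke the callee through Function.prototype.apply.
function callWithHandles(fnSlot: number, thisSlot: number, argSlots: number[]): number {
  const fn = handles[fnSlot] as (...args: unknown[]) => unknown;
  const args = argSlots.map((slot) => handles[slot]);
  return allocHandle(fn.apply(handles[thisSlot], args));
}

// Stand-in for DOMTokenList, whose add() accepts any number of tokens.
class FakeTokenList {
  tokens: string[] = [];
  add(...newTokens: string[]): void {
    this.tokens.push(...newTokens);
  }
}

const list = new FakeTokenList();
const argSlots = ["foo", "bar", "baz"].map((token) => allocHandle(token));
callWithHandles(allocHandle(list.add), allocHandle(list), argSlots);
```

This path handles variadic calls at the cost of one array allocation and one handle per argument, which is why dedicated bindings for hot signatures would still be worth having.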

What's important is to keep the design of host bindings extensible (like the rest of wasm) so we can grow the set of binding operations over time and have the initial host bindings proposal be an MVP.

Also, once host bindings are out, I think new Web APIs that are performance sensitive will be designed with an eye toward what maps naturally to host bindings. At TPAC just recently, we already saw this to be the case in discussions with the WebGPU and WebAudio folks.

@DLehenbauer

In my opinion, it would be a healthy exercise for this proposal to briefly enumerate the WebIDL types/concepts and describe their mapping to host-bindings. I realize this would be a bit tedious, but it would systematically force the right discussions rather than tackle them ad hoc (#5 #11 #13).

(I would be willing to help w/the tedious parts.)

Thoughts?

@DLehenbauer

DLehenbauer commented Jan 9, 2018

For instance, it would be useful to start with the WebIDL primitive types and specify how they are to be marshalled <-> wasm.

These mappings are straightforward:

WebIDL                  WebAssembly
long                    i32
unsigned long           i32
long long               i64
unsigned long long      i64
float                   f32*
unrestricted float      f32
double                  f64*
unrestricted double     f64

*Interface implementer throws TypeError for NaN/infinities at runtime. (i.e., not really a distinct type from the unrestricted variant.)
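A small sketch of the runtime check described in the footnote, assuming the host sees the wasm f32/f64 argument as a JS number. The function name `toRestricted` is hypothetical; the behaviour (reject NaN and infinities with a TypeError, after rounding to single precision for float) follows how WebIDL converts JS values to its restricted floating-point types.

```typescript
// Hypothetical conversion from a wasm f32/f64 argument to a restricted
// WebIDL float/double. Unrestricted variants would skip the finiteness check.
function toRestricted(value: number, idlType: "float" | "double"): number {
  // For "float", round to single precision first; a large double can
  // overflow to Infinity at this step.
  const rounded = idlType === "float" ? new Float32Array([value])[0] : value;
  if (!isFinite(rounded)) {
    throw new TypeError(idlType + " value must be finite");
  }
  return rounded;
}
```

For example, `toRestricted(1.5, "float")` returns 1.5, while `toRestricted(NaN, "double")` throws.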

Boolean might be the first interesting type to discuss. A couple of options:

  1. Treat like C++ 'bool' (i.e., a 0 or 1)
  2. Similar to what @magcius suggested in "Binding types for Nullable" (#5), reserve a pre-initialized table slot for true and false.

(The latter could be initialized w/references to the sentinel objects used by a script engine to represent true/false.)
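The two options can be sketched side by side from the host's point of view. The slot numbers, the table, and both function names are assumptions made for illustration only.

```typescript
// Option 1: a C++-style bool passed as an i32. Any nonzero value is treated
// as true, as C++ does when converting an integer to bool.
function boolFromI32(value: number): boolean {
  return value !== 0;
}

// Option 2: a handle table whose first two slots are reserved for the
// boolean values (the slot numbers are an arbitrary assumption).
const FALSE_SLOT = 0;
const TRUE_SLOT = 1;
const slotTable: unknown[] = [false, true];

function boolFromSlot(slot: number): boolean {
  if (slot !== FALSE_SLOT && slot !== TRUE_SLOT) {
    throw new TypeError("slot does not hold a boolean");
  }
  return slotTable[slot] as boolean;
}
```

Option 1 needs no table lookup, while option 2 lets a boolean travel through any binding that already accepts an object handle.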

I was also curious if i8/i16 types were considered for wasm, and the motivation for excluding them. (In particular, if the decision was influenced by lack of JS support... I can dig around a bit and see if I can find a discussion.)

@annevk
Member

annevk commented Jan 10, 2018

I don't think the mappings are that straightforward, given that IDL only works with JavaScript thus far, so the underlying C++^H^H^HRust might not anticipate an integer in the full i64 range, for instance (https://heycam.github.io/webidl/#es-long-long).

@DLehenbauer

You're right, there's something here I hadn't considered.

The simplest support for host-bindings would likely marshal as a JS Var under the covers. This means that the actual mapping would be a two-step conversion of 'long long' <-> Var (number) <-> i64. (i.e., the conversion from 'long long' <-> i64 would be "lossy" for |value| > 2^53.)
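To make the loss concrete, here is a sketch that converts a 64-bit value to a JS number. It assumes the i64 is handed over as two unsigned 32-bit halves; `u64ToNumber` is a hypothetical helper, not something the proposal defines.

```typescript
// Convert an unsigned 64-bit integer, given as two 32-bit halves, to a JS
// number. The result is rounded to the nearest double, so precision is lost
// once the value exceeds 2^53.
function u64ToNumber(hi: number, lo: number): number {
  return hi * 4294967296 + lo; // hi * 2^32 + lo
}

const twoTo53 = u64ToNumber(0x200000, 0);        // 2^53, exactly representable
const twoTo53PlusOne = u64ToNumber(0x200000, 1); // 2^53 + 1 rounds back to 2^53

// Near 2^64 adjacent doubles are 2048 apart, so these two distinct 64-bit
// values also collapse to the same number.
const nearMaxA = u64ToNumber(0xffffffff, 0);
const nearMaxB = u64ToNumber(0xffffffff, 1000);
```

Here `twoTo53 === twoTo53PlusOne` and `nearMaxA === nearMaxB` both hold, which is the lossy behaviour a spec would have to permit, require, or avoid.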

A few options for how to spec this:

  1. Map as 'long long' -> i64, but permit the lossy conversion.
  2. Map as 'long long' -> i64, but require the lossy conversion for consistency.
  3. Map as 'long long' -> i64, and require hosts to bind directly to the underlying C++ to avoid a lossy conversion.
  4. Map as 'long long' -> f64.

(I'm interested to hear others' opinions.)

I think we'll be okay w/the range of i64 values. JavaScript can already pass values in a similar range (as you approach 2^64, the precision of a double is about +/-2000), so the underlying implementation should be prepared to handle values in a similar range to i64.

PS - Also worth pointing out, there are only a handful of existing APIs that use 'long long' types. Most 'long long's are returned values. Only a handful of APIs like Blob.slice(..) consume a 'long long'.

@magcius
Author

magcius commented Jan 12, 2018

So one correction I want to make here is that it is currently possible to map Boolean using the JSON annotation, by passing the string "true" or "false", and these strings can just be in constant memory, too. This is also discounting the dynamic, OBJECT_HANDLE-based jsval API I proposed in an earlier comment.
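A sketch of how that could look on the host side, assuming a JSON-annotated argument arrives as a (begin, end) byte range in linear memory and is parsed with JSON.parse. The calling convention and the name `readJsonArg` are assumptions for illustration, not the proposal's definition.

```typescript
// Hypothetical host-side handling of a JSON-annotated argument: read the
// bytes in [begin, end) from linear memory and parse them as JSON.
function readJsonArg(memory: Uint8Array, begin: number, end: number): unknown {
  let text = "";
  for (let i = begin; i < end; i++) {
    text += String.fromCharCode(memory[i]); // ASCII only, for brevity
  }
  return JSON.parse(text);
}

// Stand-in for constant data in the module's memory:
// "true" at bytes 0..4 and "false" at bytes 4..9.
const constantData = new Uint8Array([116, 114, 117, 101, 102, 97, 108, 115, 101]);

const parsedTrue = readJsonArg(constantData, 0, 4);
const parsedFalse = readJsonArg(constantData, 4, 9);
```

The module only ever passes two fixed pointer pairs, so no allocation is needed on the wasm side, at the price of a parse on every call.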

Are people more interested in nailing down the annotation type system mapping now, or bootstrapping on some form of generic OBJECT_HANDLE-based jsval API for now and leaving the shortcut annotations for later?

@pchickey
Collaborator

Closing as out-of-date: these concepts don't map to the current proposal, which has evolved a lot since this issue was opened.


5 participants