Draft specs of idl and bindings #52

fgmccabe · 2019-07-22T20:00:26Z

Initial draft of specs for type schema and bindings

PoignardAzur · 2019-07-23T10:41:39Z

I haven't had the time to take an in-depth look at the PR, but I'm getting the feeling it's solving the wrong problem.

I'm not sure there's much of a point in having a text IDL other than wast. People could always compile the binary IDL from WebIDL or cap'n proto or C headers.
The abstract semantic types feel like a leaky abstraction. Eg I don't understand the point of having an abstract integer type, instead of i32, u32, i64, u64.
It's way too early to be thinking about generics.

rossberg

In general, this seems like a reasonable direction, although, obviously, many details still need working out.

However, I agree with @PoignardAzur that it is way too premature to add features like generic types, type imports, and probably exceptions. As long as they aren't even in Wasm, I don't think it makes sense to advance them into the IDL.

Things like try expressions in coercions also make me worry that the coercion language will quickly Greenspun into a fully-fledged programming language, which would kind of imply that it is the wrong abstraction.

Finally, the syntax could fit Wasm's better. But syntax is only a secondary concern at this point.

rossberg · 2019-07-23T11:08:37Z

proposals/webidl-bindings/bindings.md

+
+    A *BindingLambda* is an expression that encapsulates either access to an
+operation or provision of an operation. Binding lambdas have a function
+signature; if it is an access binding then the signature is a WASM function


Suggested change

signature; if it is an access binding then the signature is a WASM function

signature; if it is an access binding then the signature is a Wasm function

rossberg · 2019-07-23T11:09:38Z

proposals/webidl-bindings/bindings.md

+
+ Access lambda
+  An _access lambda_ represents a specification of how an operation specified in
+wi-IDL (typically an imported function) is accessed from WASM. It takes the


Suggested change

wi-IDL (typically an imported function) is accessed from WASM. It takes the

wi-IDL (typically an imported function) is accessed from Wasm. It takes the

rossberg · 2019-07-23T11:10:11Z

proposals/webidl-bindings/bindings.md

+
+    where Namei are identifiers and Expression is an expression involving the
+various Name parameters together with appropriate lifting and lowering
+operators. The type of this lambda is a WASM function type.


Suggested change

operators. The type of this lambda is a WASM function type.

operators. The type of this lambda is a Wasm function type.

What function type does this syntax imply?

rossberg · 2019-07-23T11:12:48Z

proposals/webidl-bindings/bindings.md

+For example, the lambda:
+
+```
+(access ($B $L) ($integer-to-i32 (utf8-count (base-len-as-string $B $L))))


Did you mean

Suggested change

(access ($B $L) ($integer-to-i32 (utf8-count (base-len-as-string $B $L))))

(access ($B $L) (integer-to-i32 ($utf8-count (base-len-as-string $B $L))))

rossberg · 2019-07-23T11:13:08Z

proposals/webidl-bindings/bindings.md

+(access ($B $L) ($integer-to-i32 (utf8-count (base-len-as-string $B $L))))
+```
+
+denotes an access lambda that accesses a function utf8-count (which presumably counts the number of unicode code points in a string). The sub-expression


Is this a user-defined Wasm function?
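
One possible reading of the expression under discussion can be sketched in Python. This is an illustration only: the helper names mirror the operators in the draft, but their semantics are assumptions (in particular that `utf8-count` counts code points, as the draft text presumes, and that `integer-to-i32` fails on out-of-range values), and `memory` is a stand-in for Wasm linear memory.

```python
# Hypothetical model of the operators in the draft example; not the spec's definitions.

def base_len_as_string(memory: bytes, base: int, length: int) -> str:
    """Lift a (base, length) region of linear memory to an IDL string (assumed UTF-8)."""
    return memory[base:base + length].decode("utf-8")


def utf8_count(s: str) -> int:
    """The accessed function: counts Unicode code points (assumption from the draft text)."""
    return len(s)


def integer_to_i32(n: int) -> int:
    """Lower an abstract integer to an i32, raising if it does not fit (assumed behaviour)."""
    if not -2**31 <= n < 2**31:
        raise OverflowError(f"{n} does not fit in i32")
    return n


def access(memory: bytes, base: int, length: int) -> int:
    """Models (access ($B $L) (integer-to-i32 ($utf8-count (base-len-as-string $B $L))))."""
    return integer_to_i32(utf8_count(base_len_as_string(memory, base, length)))


# Two padding bytes, then "héllo"; 'é' occupies two bytes in UTF-8.
memory = b"\x00\x00" + "héllo".encode("utf-8")
result = access(memory, 2, 6)  # 6 bytes in memory, 5 code points
```

In this reading, the `$`-prefixed name is the user-defined function being accessed, while the unprefixed names are built-in lifting and lowering operators.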

rossberg · 2019-07-23T12:22:39Z

proposals/webidl-bindings/idl.md

+
+>Question: Should type hiding be supported? I.e., if a quantified form introduces
+>a type variable that is already in scope does it have the effect of occluding
+>the outer name. Convenience and composability says yes, security says no.


I may misunderstand the question, but AFAICT, composability yells "no". ;)

rossberg · 2019-07-23T12:26:54Z

proposals/webidl-bindings/idl.md

+
+<div class="example">
+```
+(type Boolean [cases true false])


This may become problematic if you ever want to add subtyping to the IDL, since it would force Bool to be compatible with any variant super and subtype.

rossberg · 2019-07-23T12:31:28Z

proposals/webidl-bindings/idl.md

+
+Note that we specifically do not elaborate the numeric types -- for example into
+signed vs unsigned, or 8bit vs 64bit. Such elaboration can be accounted for by
+appropriate operators.


I don't think it flies to have types that do not prescribe a value range. You need to make sure that two sides communicating through an IDL do not make inconsistent assumptions about what that value range is and apply inconsistent binding operators.
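
A small Python sketch of the kind of mismatch described here, assuming a hypothetical producer that lowers an abstract integer as an unsigned 32-bit value while a consumer lifts the same bits as signed. The function names are illustrative and not part of the draft.

```python
import struct

# Hypothetical binding operators; the names are not taken from the draft.

def lower_as_u32(value: int) -> bytes:
    """Producer side: lower an abstract integer, assuming an unsigned 32-bit range."""
    return struct.pack("<I", value)


def lift_as_s32(bits: bytes) -> int:
    """Consumer side: lift the same 32 bits, assuming a signed range."""
    return struct.unpack("<i", bits)[0]


def lift_as_u32(bits: bytes) -> int:
    """Consumer side: lift the same 32 bits, assuming an unsigned range."""
    return struct.unpack("<I", bits)[0]


bits = lower_as_u32(4_294_967_295)
signed_view = lift_as_s32(bits)    # the consumer sees -1
unsigned_view = lift_as_u32(bits)  # the consumer sees 4294967295
```

Nothing in an interface that only says "integer" tells the consumer which of the two lifts is the right one.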

rossberg · 2019-07-23T12:36:37Z

proposals/webidl-bindings/idl.md

+
+#### Nominal Type {#nominal-type}
+
+A nominal type expression is used to identify types by name:


It's not clear what you mean by "nominal". Is this just a type alias, or does it introduce a fresh type? "Nominal typing" refers to the latter, but the description kinda suggests the former. Especially since true nominal types would not easily allow self-contained interface descriptions.

rossberg · 2019-07-23T12:41:13Z

proposals/webidl-bindings/idl.md

+
+```
+Signature => 'type' NominalType '=' Type
+Signature => 'all' TypeVar .. TypeVar '.' NominalType '=' Type


If you allow type parameters without restricting type recursion then type equality is undecidable.

Also, the all is odd here, since this is a lambda, not a quantifier.

fgmccabe · 2019-07-23T16:52:34Z

1. The text representation is there to (a) hash out ideas in a more reasonable space than binary and (b) allow people to write down APIs. However, there definitely is/will be a binary encoding that is a key part of the spec. We have already drafted a binary encoding for webIDL, but it looks like we are not going the webIDL route at the moment.

2. The IDL types are abstract for a reason. In any given provision/use of an API the coercion operators are paired: a lifting operator and a lowering operator. It is the pair that is actually important; the abstract IDL in the middle is only there to act as a kind of discriminator: to limit the space of operators that have to be specified. Another way of saying this is: if you have to have a lowering operator of the form <integer>-to-i32, it makes no sense to elaborate this to i32-to-i32, i8-to-i32, i64-to-i32, i31-to-i32 and so on. Especially when you will also have to have lifting operators of the form i32-as-i32, i32-as-i8, i32-as-i64, i32-as-i31, etc. It is simply easier to have a single concept of integer and map to it and from it.

3. The motivation for generics in this spec comes from a few different directions: the use of quantifiers to define scoping, the requirement to have collection types (think vector of points for a graphics API), the desire that no types are 'special', and the fact that it is easier to define them now. That does not imply that we need to fully elaborate them at the moment.

4. Exception handling is there because coercions can fail. Even simple coercions can fail (e.g., mathematically and physically, the idea of losing the most significant bits when coercing from 64 bits to 32 (or 32 to 31) is absurd); but more importantly, any coercions that affect memory can also fail. Furthermore, IMO, exceptions should be treated no differently than regular return values: if a called function throws an exception, that exception may also need to be coerced.


rossberg · 2019-07-23T18:28:56Z

The IDL types are abstract for a reason. In any given provision/use of
an API the coercion operators are paired: a lifting operator and a lowering
operator. It is the pair that is actually important; the abstract IDL in
the middle is only there to act as a kind of discriminator: to limit the
space of operators that have to be specified.

Another way of saying this is: if you have to have a lowering operator of
the form <integer>-to-i32, it makes no sense to elaborate this to
i32-to-i32, i8-to-i32, i64-to-i32, i31-to-i32 and so on. Especially when you
will also have to have lifting operators of the form i32-as-i32, i32-as-i8,
i32-as-i64, i32-as-i31, etc. It is simply easier to have a single concept of
integer and map to it and from it.

To be useful for programmers and tools alike, an IDL needs to enable expressing self-contained interface descriptions. With what you describe, looking at the interface of a module would not be enough -- to program against an interface correctly I'd need to know how it is implemented, i.e., what binding operators are used on the other end, so that I can match them up. That of course would miss the most important point of an IDL, which is to decouple interface from implementation.

The motivation for generics in this spec come from a few different
directions: the use of quantifiers to define scoping, the requirement to
have collection types (think vector of points for a graphics API), the
desire that no types are 'special', and the fact that it is easier to
define them now. That does not imply that we need to fully elaborate them
at the moment.

Oh, I know -- and agree with -- all the reasons for having polymorphism. And it should be included eventually. But you need to start with something simple, and in particular, you want to avoid getting ahead of the language, or you will likely create something incompatible.

Exception handling is there because coercions can fail. Even simple
coercions can fail (e.g., mathematically and physically, the idea of losing
the most significant bits when coercing from 64 bits to 32 (or 32 to 31) is
absurd); but more importantly, any coercions that affect memory can also
fail.

Yes, but that does not necessarily imply the need for exception handling. It could just trap. Again, I understand that such handling may be desirable eventually. But as long as simple trapping is deemed "good enough" for Wasm itself, there is little reason why the binding layer needs to be special.

Furthermore, IMO, exceptions should be treated no differently than regular
return values: if a called function throws an exception, that exception may
also need to be coerced.

I am aware of your stance on this, but of course the IDL should match what will actually be provided by Wasm (which also happens to match most languages). Another example where it's preferable to not get ahead of the language.

(In general, I'd suggest to focus on a bindings "MVP". It often is tempting to put "all the features" into a proposal, but that doesn't tend to work well. Wasm itself is an example of incrementality by design. That often is painful, and it can be frustrating how long it takes us to add certain features to Wasm now although they had been planned from the beginning, but it improves chances of success.)

jgravelle-google · 2019-07-23T18:43:23Z

Meta-level point:

but it improves chances of success

To elaborate on how I view "success": this means both 1) improving the odds that a proposal reaches consensus by minimizing moving parts (political success), and 2) insulating against mistakes through incremental design, since having real-world users helps validate that the design works (technical success).

I believe both successes are entirely crucial to most wasm spec-related things.

Not said to have an opinion on the current design (yet), just mentioning the strategic lens I'm thinking of here

PoignardAzur · 2019-07-23T19:10:53Z

Exception handling is there because coercions can fail. Even simple
coercions can fail (e.g., mathematically and physically, the idea of losing
the most significant bits when coercing from 64 bits to 32 (or 32 to 31) is
absurd); but more importantly, any coercions that affect memory can also
fail.

To be blunt, the fact that these coercions can fail is a pretty good sign that the type system is a leaky abstraction.

For example, let's say you have

- a type Integer that can be either i32 or i64
- a module that exports a function foobar that takes an Integer

If foobar will throw/trap if you pass it an Integer that can't be converted to i32, then foobar doesn't actually take an Integer, it takes an i32, and it should announce that; the conversion/trapping should happen on the caller side.

I think that goes back to the discussion we had the other day, about how do we determine what is within the scope of snowman bindings, and what adds too much complexity. I think non-trivial coercions that introduce their own rules and can potentially fail should be the concern of compilers, not wasm hosts.

The text representation is there to (a) hash out ideas in a more
reasonable space than binary and (b) to allow people to write down APIs.
However, there definitely is/will be a binary encoding that is a key part
of the spec. We have already drafted a binary encoding for webIDL, but it
looks like we are not going the webIDL route at the moment.

I'm sorry to be negative, but I really had a hard time reading the type notations; it might be that I'm not used enough to mathematical notations. Still, I would have found it easier to understand if the IDL had been something closer to eg the capnproto IDL.

fgmccabe · 2019-07-23T20:29:45Z

I think that I may not have been clear in everything I wrote. When invoking an imported function exceptions can arise both within that function and 'at the edges' due to the coercion of arguments/return values/exceptions. This represents two 'sources' of exceptions.

To be blunt, the fact that these coercions can fail is a pretty good sign
that the type system is a leaky abstraction.

I simply do not agree with this! Leaky abstractions refer to the situation where things that should be hidden are not. Regardless of whether it is hidden or not, coercions can fail -- almost by definition.

A big part of the reason that we need coercion expressions in the first place and an IDL is that wasm's type system is not oriented to the requirements of publishing and consuming APIs. It is oriented to the proclivities for C-style languages' representation of data. There is a semantic gap that must be bridged.

On the other hand, having coercion expressions is also extremely useful: for example, it allows minor differences between consumer and producer of an API to be smoothed over. This facilitates reuse of code. For example, suppose a function takes an i32 integer (not a pointer) and a consumer 'sends it' an i64. Depending on the choice of coercion operator, this could be either handled dynamically with an exception, or statically by refusing to bind if there is an actual mismatch.

In the case of a provider offering a function that actually takes an i32 argument, the lowering part of the coercion is also specified by the provider. So, if the provider does not want to deal with the possibility of failing to coerce an i64 then they specify a 'rigid' coercion. If the provider wants to give flexibility with the possibility of reporting an out of range error, then they use a lowering operator that throws exceptions. If the provider is happy truncating the original then they use a lowering operator that truncates. So, the mechanism we have allows for lots of different scenarios.

With respect to the notation, apologies. On the other hand, a spec is not really intended to be 'user friendly'. Its main task is precision.
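
The three provider-side choices described in this comment (rigid, throwing, truncating) might look roughly like the following Python sketch. The operator names, the `BindError` type and the bind-time check are all hypothetical; the draft does not define these operators in this form.

```python
I32_MIN, I32_MAX = -2**31, 2**31 - 1


class BindError(Exception):
    """Raised at bind time when a rigid coercion sees mismatched types (hypothetical)."""


def bind_rigid(source_type: str):
    """'Rigid' coercion: refuse to bind at all unless the source is already an i32."""
    if source_type != "i32":
        raise BindError(f"cannot bind {source_type} to i32")
    return lambda value: value


def lower_throwing(value: int) -> int:
    """Flexible coercion: accept any integer, report out-of-range values at call time."""
    if not I32_MIN <= value <= I32_MAX:
        raise OverflowError(f"{value} is out of range for i32")
    return value


def lower_truncating(value: int) -> int:
    """Truncating coercion: keep the low 32 bits and reinterpret them as signed."""
    low = value & 0xFFFFFFFF
    return low - 2**32 if low >= 2**31 else low


identity = bind_rigid("i32")       # binds; an "i64" source would raise BindError here
small = lower_throwing(42)         # in range, passes through unchanged
wrapped = lower_truncating(2**32 + 5)  # high bits are dropped
```

The difference between the three is when, and whether, a mismatch becomes visible: at bind time, at call time, or never.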


PoignardAzur · 2019-07-24T13:24:00Z

On the other hand, having coercion expressions is also extremely useful:
for example, it allows minor differences between consumer and producer of
an API to be smoothed over. This facilitates reuse of code.

I don't think generality of the kind you're describing facilitates reuse of code.

Modular code is easiest to reuse when it follows design by contract. In particular, when writing a function's prototype / API, its parameter types should represent the function's preconditions.

A function shouldn't have to worry about its precondition being broken. E.g., if a function takes an enum of A, B and C, then it should assume that these are the only three values it's ever going to get; if a function takes a u8, it should assume it's only ever going to receive values from 0 to 255; if it takes an s8, from -128 to 127.

If the provider wants to give flexibility with the possibility of reporting an out
of range error, then they use a lowering operator that throws exceptions.
If the provider is happy truncating the original then they use a lowering
operator that truncates.

I don't think that kind of flexibility is desirable.

In general, out-of-range errors and other precondition breaks should be checked on the caller side, not the callee side.

In 99% of use cases, what you want in your IDL is to export

function someFunction : (arg1: u16, arg2: u16, arg3: string) -> (s32, s32)

On the caller side, the compiler will transcompile and/or import the IDL file, and match the above types to their equivalent in the caller's language (eg Rust's u16 or D's ushort or Java's unsigned short or C++'s uint16_t).

The compiler is the one that handles the conversion, and issues a warning / a runtime exception / whatever in case of narrowing conversions. The wasm host shouldn't worry about any of that.
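
As a rough illustration of the caller-side checking argued for here, a compiler-generated stub could validate ranges before the call ever reaches the export. The sketch below is hypothetical: `some_function` is a stand-in body for the exported function from the example signature, and `call_some_function` stands in for whatever wrapper a toolchain would emit.

```python
# Value ranges for the fixed-width types mentioned in this comment.
RANGES = {
    "u8": (0, 255),
    "s8": (-128, 127),
    "u16": (0, 65535),
    "s32": (-2**31, 2**31 - 1),
}


def check(type_name: str, value: int) -> int:
    """Reject a value that falls outside the declared type's range."""
    lo, hi = RANGES[type_name]
    if not lo <= value <= hi:
        raise ValueError(f"{value} is out of range for {type_name}")
    return value


def some_function(arg1: int, arg2: int, arg3: str):
    """Stand-in for the export; it may assume its u16 preconditions hold."""
    return (arg1 + arg2, len(arg3))


def call_some_function(arg1: int, arg2: int, arg3: str):
    """Caller-side stub: narrowing checks happen here, not in the callee or host."""
    return some_function(check("u16", arg1), check("u16", arg2), arg3)


ok = call_some_function(1, 2, "abc")
```

With this split, the export's signature fully states its preconditions and the host only passes values through.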

fgmccabe · 2019-07-24T15:51:55Z

I agree that design by contract is in many ways preferable. However, when you are crossing ownership boundaries, you cannot trust your caller - you have to verify your preconditions on the callee side.

The flexibility I am referring to is that of being able to make small adjustments that would otherwise prevent use.

The original reason I did not elaborate the integer type was that I saw that the coercion expressions had the same effect; and I knew that there would in any case have to be a reasonably large range of coercion operators. And having a lot of integer types in the idl pollutes it a fair amount - because now you have to have a lot more coercion operators.

As for your assertion that the most important case of an API is:

function someFunction : (arg1: u16, arg2: u16, arg3: string) -> (s32, s32)

I beg to differ. It depends entirely on your use case and your view of the ecosystem. I could just as easily argue that

func processCC:(who : person, cc: creditCard, amount : currency) -> verificationResponse

is equally as important; especially in the long run when more people are deploying wasm-only applications and libraries. Even if you leave that heady scenario out, you also have:

func paintLines:(sequence<Point> points, gc:graphicsContext) -> () throws graphicsException

as something that we have to be able to do today.


fitzgen

First off, thanks for writing these drafts up, @fgmccabe.

My overall impression is that this seems to propose too much, and the snowman language seems too expressive for something that is ultimately only intended to lift Wasm values into IDL values and lower them back again. I agree with the others calling for something smaller and more "MVP".

Additionally, my understanding was that we are aiming to carve out a subset of Web IDL (or at least something that we can line up 1 to 1 with Web IDL constructs) so that we can easily meet our original goal: fast and direct access to DOM methods. This proposal has a bunch more stuff that Web IDL doesn't: not only quantified and existential types, but even just sum types don't exist in Web IDL. I'm concerned both with the amount of scope creep here, and that we are forgetting and failing our original goal as the scope creeps.

A few additional comments and questions inline below.

Thanks!

fitzgen · 2019-07-24T16:52:35Z

docs/index.html

@@ -0,0 +1,881 @@
+<!doctype html><html lang="en">


This file is just the output of running bikeshed on proposals/webidl-bindings/binary.bs, right?

fitzgen · 2019-07-24T16:54:36Z

proposals/webidl-bindings/binary.bs

+convenient to support a binary representation of the webIDL.
+
+This specification mirrors the official specification [[!webIDL]] in most
+ways. However, the binary encoding may admit non-legal forms of


What is the relationship between the snowman bindings language with Web IDL and the spec for a binary representation of Web IDL?

fitzgen · 2019-07-24T17:15:55Z

proposals/webidl-bindings/binary.bs

+Type => ErrorType
+Type => ParameterizedType
+Type => BufferSourceType
+Type => CompoundType


We should be de-duplicating / referring to compound types by index (like my experimental implementation does) to save on code size.

In fact, I can't find where Type transitively goes to some sort of reference to an already-defined type at all, so for example a FunctionType cannot take a previously defined dictionary type as a parameter under these current rules. Is some sort of a production like Type => ReferenceToPreviouslyDefinedType missing?
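
The de-duplication and by-index referencing suggested here can be sketched as a simple interning table. This is a hypothetical model in Python: the tuple shapes used for type definitions and the `("ref", index)` convention are assumptions for illustration, not the encoding from binary.bs or from the experimental implementation.

```python
class TypeTable:
    """Interns compound type definitions so that each is stored once and referred to by index."""

    def __init__(self):
        self._types = []   # index -> definition
        self._index = {}   # definition -> index

    def intern(self, type_def: tuple) -> int:
        """Return the index of type_def, adding it to the table only if it is new."""
        if type_def not in self._index:
            self._index[type_def] = len(self._types)
            self._types.append(type_def)
        return self._index[type_def]

    def __len__(self) -> int:
        return len(self._types)


table = TypeTable()
point = table.intern(("dictionary", (("x", "double"), ("y", "double"))))
# A function type whose parameter refers to the dictionary by index.
draw = table.intern(("function", (("ref", point),), ()))
# Interning the same dictionary again yields the existing index.
again = table.intern(("dictionary", (("x", "double"), ("y", "double"))))
```

A production of the kind asked about would then only need to carry the index, not a second copy of the definition.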

fitzgen · 2019-07-24T17:20:03Z

proposals/webidl-bindings/bindings.md

+W-ProvisionLambda ::= '(' 'provide' W-Tuple W-ProvisionExpression ')'
+```
+
+#### Provision Lambda Type


So the idea is that instead of

- mapping tuples of Web IDL values into tuples of Wasm values (incoming binding maps in the explainer) and vice versa (outgoing binding maps in the explainer),

- and then combining an incoming and outgoing binding map into either an import binding (with an outgoing binding map for arguments and incoming binding map for return values) or an export binding (with an incoming binding map for arguments and outgoing binding map for return values)

we now have

- access lambdas and provision lambdas that contain both lifting and lowering operators,

- but the lifting/lowering operators are separated from each other by which side (parameter or return) of the function whose values they are coercing

- and we have ^ to splice nested lists into the current list

correct?

The original approach has two benefits, as I see it:

1. The incoming and outgoing binding maps contain only lowering and lifting coercions respectively, while the access and provision lambdas contain heterogeneous expression trees of coercions. The lack of heterogeneity in the original allows for reusing opcodes across lifting and lowering coercions, which means that ultimately we can keep all the opcodes in the one-byte LEB128 space longer before reaching any two-byte LEB128s, which helps code size.

2. The original incoming/outgoing binding map formulation didn't require list splicing operators. To convert into n values, you just provided n expressions and each operator evaluated to a single value. The new approach requires an "hourglass" shape and the ^ splicer since the function being bound exists as a single W-expression but can return many values.
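
To make the code-size point concrete, here is a minimal unsigned LEB128 encoder in Python. It only shows where the one-byte boundary falls; that operator opcodes would be encoded this way is taken from the comment above, and the specific opcode numbers are arbitrary.

```python
def encode_uleb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (7 payload bits per byte)."""
    if n < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


last_one_byte = encode_uleb128(127)   # still fits in a single byte
first_two_byte = encode_uleb128(128)  # needs a second byte
```

Only 128 opcodes fit in a single byte, so sharing numbers between the lifting and lowering namespaces delays the point at which opcodes start costing two bytes.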

fitzgen · 2019-07-24T17:22:37Z

proposals/webidl-bindings/bindings.md

+
+The three arguments to `string-to-base-ptr` are the `String` value itself, the
+base address of a block of linear memory -- together with the size of the
+buffer. The returned result is the address of the utf8-encoded sequence of bytes


Am I understanding correctly that these three arguments are runtime arguments, and none are immediates? Where is the buffer supposed to come from? Are we going to allow some subset of Wasm instructions inside access/provision lambdas so that they can call allocators themselves instead of building the concept of allocators into coercion operators?

If there isn't enough room for the incoming string in the given allocation buffer, what happens? Does this try calling an allocator function? Throw? Only write a substring into the buffer?
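
The open question about an undersized buffer can be made concrete with a Python sketch. Everything here is hypothetical: the draft does not say which behaviour applies, the `on_overflow` parameter is invented for illustration, and only two of the possibilities raised above (throwing, or writing a substring) are modelled.

```python
def string_to_base_ptr(s: str, memory: bytearray, base: int, size: int,
                       on_overflow: str = "throw") -> tuple:
    """Write s as UTF-8 into memory[base:base+size]; return (base, bytes_written).

    on_overflow is a made-up knob: "throw" raises, "truncate" writes a prefix.
    """
    encoded = s.encode("utf-8")
    if len(encoded) > size:
        if on_overflow == "throw":
            raise BufferError(f"need {len(encoded)} bytes, buffer has {size}")
        # Truncate, then drop any partial code point left at the cut.
        encoded = encoded[:size].decode("utf-8", errors="ignore").encode("utf-8")
    memory[base:base + len(encoded)] = encoded
    return base, len(encoded)


memory = bytearray(8)
fits = string_to_base_ptr("héllo", memory, 0, 8)  # 6 bytes fit in 8
written = bytes(memory[:6])

small = bytearray(2)
# Cutting after 2 bytes would split 'é', so only 'h' is written.
cut = string_to_base_ptr("héllo", small, 0, 2, on_overflow="truncate")
```

A third option, calling back into an allocator, would require the coercion operator to know about allocation, which is exactly the design question being asked.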

fgmccabe · 2019-07-24T18:04:32Z

First off, thanks for writing these drafts up, @fgmccabe. My overall impression is that this seems to propose too much, and the snowman language seems too expressive for something that is ultimately only intended to lift Wasm values into IDL values and lower them back again. I agree with the others calling for something smaller and more "MVP".

The design of the IDL is intended to be 'spanning' rather than 'minimal'. In fact, my colleagues will have heard my opinion of the term 'minimum viable product'; I prefer the term 'minimum awesome product'. On the other hand, I also strongly agree that we should resist scope creep and feature-itis. The biggest issue is that the gap between WASM types (which are oriented to the machine) and IDL types (which are oriented to interoperability) is fundamentally much larger than some might hope.

Additionally, my understanding was that we are aiming to carve out a subset of Web IDL (or at least something that we can line up 1 to 1 with Web IDL constructs) so that we can easily meet our original goal: fast and direct access to DOM methods. This proposal has a bunch more stuff that Web IDL doesn't: not only quantified and existential types, but even just sum types don't exist in Web IDL. I'm concerned both with the amount of scope creep here, and that we are forgetting and failing our original goal as the scope creeps.

Access to the web APIs was the original motivation. But it does not motivate everyone who is involved. Incidentally, sum types DO exist in webIDL. This proposal is effectively a cleaned up version of webIDL with some omissions (no extended attributes) and some reorganization (in particular having a definite concept of scope -- which is what the quantifiers are for).

A few additional comments and questions inline below. Thanks!

In docs/index.html:

@@ -0,0 +1,881 @@
+<!doctype html><html lang="en">

This file is just the output of running bikeshed on proposals/webidl-bindings/binary.bs, right?

Yes.

In proposals/webidl-bindings/binary.bs:

+ "href" : "http://heycam.github.io/webidl",
+ "title" : "Web IDL",
+ "publisher" : "W3C",
+ "status" : "Editor's Draft"
+ }
+}
+</pre>
+
+Introduction {#intro}
+=====================
+
+In order to be able to process webIDL signatures efficiently it is
+convenient to support a binary representation of the webIDL.
+
+This specification mirrors the official specification [[!webIDL]] in most
+ways. However, the binary encoding may admit non-legal forms of

What is the relationship between the snowman bindings language with Web IDL and the spec for a binary representation of Web IDL?

Historical. Originally we were going to use webIDL; we are currently on a path that diverges somewhat from that. The 'binary.bs' specification is obsolete and we are currently on an editing effort to incorporate a binary encoding in the main document.

In proposals/webidl-bindings/binary.bs:

+that is part of webIDL itself then it is encoded using a small number
+that denotes a negative number in the LEB encoding. If the type is a
+reference to a user-defined type; i.e., its a reference to a type that
+is declared in one of InterfaceDef, DictionaryDef, EnumerationDef then
+it is represented as a positive number: the index of its definition in
+the top-level sequence.
+
+```
+Type => AnyType
+Type => NumericType
+Type => StringType
+Type => ObjectType
+Type => ErrorType
+Type => ParameterizedType
+Type => BufferSourceType
+Type => CompoundType

We should be de-duplicating / referring to compound types by index (like my experimental implementation does <https://github.com/rustwasm/wasm-webidl-bindings/blob/master/BINARY.md#references-to-web-idl-types>) to save on code size.

Agreed.
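
To make the de-duplication idea concrete, here is a minimal sketch of a type table that interns structurally equal compound types and hands back an index. The `TypeTable` class and the tuple-based type descriptions are illustrative assumptions, not part of the draft spec or of the linked implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a compound type is a nested tuple such as
# ("sequence", "DOMString"). Tuples are hashable, so equal shapes collide.
TypeDesc = Tuple


class TypeTable:
    """Stores each distinct compound type once and refers to it by index."""

    def __init__(self) -> None:
        self.entries: List[TypeDesc] = []
        self._index: Dict[TypeDesc, int] = {}

    def intern(self, desc: TypeDesc) -> int:
        # Reuse the existing index when the same shape was seen before.
        idx = self._index.get(desc)
        if idx is None:
            idx = len(self.entries)
            self.entries.append(desc)
            self._index[desc] = idx
        return idx


table = TypeTable()
first = table.intern(("sequence", "DOMString"))
other = table.intern(("record", "DOMString", "long"))
again = table.intern(("sequence", "DOMString"))  # same index as `first`
```

Every later use of a compound type then costs one index rather than a repeated inline definition, which is where the code-size saving would come from.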

In fact, I can't find where Type transitively goes to some sort of reference to an already-defined type at all, so for example a FunctionType cannot take a previously defined dictionary type as a parameter under these current rules. Is some sort of a production like Type => ReferenceToPreviouslyDefinedType missing?

This is handled through the NominalType rule (not in binary.bs though). The binary representation of a nominal type in the encoding is exactly an index into an outer scoped table. That table would then either define the type import or define an algebraic combination.

------------------------------

In proposals/webidl-bindings/bindings.md <#52 (comment)>:

> +### Provision Lambda
> +
> +A _ProvisionLambda_ is a function that defines how an exported EB function is
> +realized term terms of exported wasm elements.
> +
> +```
> +ProvisionLambda => 'provide' Tuple '=>' ProvisionExpression
> +```
> +
> +or, as a W-Expression form:
> +
> +```
> +W-ProvisionLambda ::= '(' 'provide' W-Tuple W-ProvisionExpression ')'
> +```
> +
> +#### Provision Lambda Type

So the idea is that instead of

- mapping tuples of Web IDL values into tuples of Wasm values (incoming binding maps in the explainer) and vice versa (outgoing binding maps in the explainer),
- and then combining an incoming and outgoing binding map into either an import binding (with an outgoing binding map for arguments and incoming binding map for return values) or an export binding (with an incoming binding map for arguments and outgoing binding map for return values)

we now have

- access lambdas and provision lambdas that contain both lifting and lowering operators,
- but the lifting/lowering operators are separated from each other by which side (parameter or return) of the function whose values they are coercing
- and we have ^ to splice nested lists into the current list

correct?

More-or-less, yes.

The original approach has two benefits, as I see it:

1. The incoming and outgoing binding maps contain only lowering and lifting coercions respectively, while the access and provision lambdas contain heterogeneous expression trees of coercions. The lack of heterogeneity in the original allows for reusing opcodes across lifting and lowering coercions, which means that ultimately we can keep all the opcodes in the one-byte LEB128 space longer before reaching any two-byte LEB128s, which helps code size.
2. The original incoming/outgoing binding map formulation didn't require list splicing operators. To convert into n values, you just provided n expressions and each operator evaluated to a single value. The new approach requires an "hourglass" shape and the ^ splicer since the function being bound exists as a single W-expression but can return many values.

It turns out that you need a mix anyway; for a couple of basic reasons: in order to handle 'callbacks' you have to invert the normal relationship; and even within a single coercion operator you will have a mix of terms. For example, the lowering operator for string values needs a way of representing both the incoming string (an IDL value) and the area of linear memory to put it into (wasm values). As for splat; while wasm may be a stack machine, that is not true for most API languages (not true for webIDL itself for example). IMO, an explicit splat operator is more likely to be understandable. (The other corner case of a sub-expression not returning any values at all is very unlikely in the context of a coercion expression; it corresponds to a stateful coercion, something we want to avoid.)
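
For readers weighing the code-size point raised above, the following sketch shows why opcode numbering matters under unsigned LEB128: values 0 through 127 fit in one byte and anything larger needs at least two. The function name `encode_uleb128` is illustrative; the encoding itself is the standard unsigned LEB128 scheme.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


one_byte = encode_uleb128(127)   # highest opcode that still fits in one byte
two_bytes = encode_uleb128(128)  # first opcode that spills into two bytes
```

So a design that can share opcodes between lifting and lowering operators stays under the 128-opcode threshold for longer.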

------------------------------

In proposals/webidl-bindings/bindings.md <#52 (comment)>:

> +```
> +(integer-to-i32 (i64-as-integer X)) == X
> +```
> +is not, in general.
> +
> +#### string-to-base-ptr
> +
> +The `string-to-base-ptr` maps a wi-IDL `String` value into a block of webAssembly linear memory -- as a sequence of utf-8 encoded bytes.
> +
> +```
> +(string-to-base-ptr : (String i32 i32) => (i32 i32) $throws i32
> +```
> +
> +The three arguments to `string-to-base-ptr` are the `String` value itself, the
> +base address of a block of linear memory -- together with the size of the
> +buffer. The returned result is the address of the utf8-encoded sequence of bytes

Am I understanding correctly that these three arguments are runtime arguments, and none are immediates? Where is the buffer supposed to come from? Are we going to allow some subset of Wasm instructions inside access/provision lambdas so that they can call allocators themselves instead of building the concept of allocators into coercion operators?

Yes, they are run-time. This is one of the drivers for needing additional sub-expressions within a coercion expression. You need to be able to identify the buffer -- probably by evaluating a wasm function call.

If there isn't enough room for the incoming string in the given allocation buffer, what happens? Does this try calling an allocator function? Throw? Only write a substring into the buffer?

One could in principle imagine having a coercion operator with an additional allocator function argument. This would be different from the one already described, but valid. Personally, I am wary of having any form of automatic allocation of return values, because this opens up my library/application to abuse from external libraries. Furthermore, even if you have an allocator function, allocation can still fail and so we must also handle that scenario. The 'allocate within buffer' is a simple 'sandbox' that allows us to design for what must happen anyway; if we solve it for that, we solve it for all.
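
As a rough illustration of the 'allocate within buffer' model, the sketch below lowers a string into a caller-supplied region of a simulated linear memory and fails when the region is too small. The Python function `string_to_base_ptr`, the `BufferTooSmall` exception, and the choice to raise rather than truncate are assumptions made for illustration; the thread does not settle what the operator does on overflow.

```python
from typing import Tuple


class BufferTooSmall(Exception):
    """Hypothetical failure signal; stands in for the `$throws` in the draft."""


def string_to_base_ptr(memory: bytearray, value: str,
                       base: int, size: int) -> Tuple[int, int]:
    """Write `value` as UTF-8 into memory[base:base+size]; return (ptr, len)."""
    if base < 0 or size < 0 or base + size > len(memory):
        raise ValueError("buffer lies outside linear memory")
    encoded = value.encode("utf-8")
    if len(encoded) > size:
        # Assumption: fail outright instead of allocating or truncating.
        raise BufferTooSmall(f"need {len(encoded)} bytes, have {size}")
    memory[base:base + len(encoded)] = encoded
    return base, len(encoded)


linear_memory = bytearray(16)  # stand-in for a wasm linear memory
ptr, length = string_to_base_ptr(linear_memory, "h\u00e9llo", 4, 8)
```

The callee never allocates: the caller decides where the bytes may go, and the only thing left to specify is the failure path.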

Refactored variants to be part of a signature rather than being a union type.

lukewagner

Thanks for all the initial work! I'm not quite sure yet, but I think I like the generalization of having binding lambdas being able to contain 0..N calls to imports/exports and just using validation rules to ensure that the types line up. It does seem to add some more complexity to an optimizing trampoline compiler, but we can see how that manifests in practice.

I'd like to +1 and emphasize some comments that @rossberg has already made:

- we shouldn't add anything to this proposal that will be coming to core wasm since (1) it absolutely must be designed to align with the core wasm feature, and (2) if we do anything before the core wasm feature is stabilized (or even proposed), it's very likely we'll make an alignment error. So in particular, I think these don't belong in the "MVP":
  - anything to do with exceptions (that should block on wasm exceptions)
  - nominal types (which should, I think, instead be type imports and exports)
  - quantified types (those aren't even proposed yet)
- we should have explicitly-signed integer types with fixed ranges. An infinite-range Integer type is more work, not strictly required for the initial set of use cases afaics, and also introduces the need to have binding operators that specify (possibly configurable) out-of-range behavior.

Additionally:

- I don't think there should be a special Resource<T> type bindings, core wasm reference types should be used instead.
- If we can define a module type and its textual representation as part of core wasm (as discussed in design/#1289), then I don't think we need a separate IDL for bindings; it's sufficient just to reuse what's defined for core wasm, by simply allowing binding types in place of core wasm types. Thus, I'd very much prefer we didn't frame this proposal as defining a new IDL, but instead as defining a new set of types that can be used within the existing text format for defining module types (signatures).

fgmccabe · 2019-07-24T21:11:35Z

Some responses inline

On Wed, Jul 24, 2019 at 1:50 PM Luke Wagner wrote:

> Thanks for all the initial work! I'm not *quite* sure yet, but I think I like the generalization of having binding lambdas being able to free contain 0..N calls to imports/exports and just using validation rules to ensure that the types line up. It does seem to add some more complexity to an optimizing trampoline compiler, but we can see how that manifests in practice.
>
> I'd like to +1 and emphasize some comments that @rossberg <https://github.com/rossberg> has already made:
>
> - we shouldn't add anything to this proposal that will be coming to core wasm since (1) it absolutely must be designed to align with the core wasm feature, (2) if we do anything before the core wasm feature is stablized (or even proposed), it's very likely we'll likely make an alignment error. So in particular, I think these don't belong in the "MVP":
>   - anything to do with exceptions (that should block on wasm exceptions <https://github.com/webassembly/exception-handling>)
>   - nominal types (which should I think instead type imports and exports <https://github.com/webassembly/type-imports>)
>   - quantified types (those aren't even proposed yet)

I am ok with 'holding off' on some of these. I do not agree that type imports count as nominal types.

> - we should have explicitly-signed integer types with fixed ranges. An infinite-range Integer type is more work, not strictly required for the initial set of use cases afaics and also introduces the need to have binding operators that specify (possibly configurable) out-of-range behavior.

You need to handle out of range semantics in any case (i64->i32). What you need to handle should be dependent on the complete chain of coercion operators, not some arbitrary point in the middle.
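
To illustrate the out-of-range question, here is a small sketch of two possible behaviors for narrowing an i64 value to i32: trapping and wrapping. The function names and the idea of offering exactly these two policies are illustrative assumptions, not operators from the draft.

```python
I32_MIN = -(2 ** 31)
I32_MAX = 2 ** 31 - 1


def i64_to_i32_trap(x: int) -> int:
    """Narrow to i32, raising when the value does not fit."""
    if not (I32_MIN <= x <= I32_MAX):
        raise OverflowError(f"{x} is out of range for i32")
    return x


def i64_to_i32_wrap(x: int) -> int:
    """Narrow to i32 by keeping the low 32 bits (two's complement)."""
    low = x & 0xFFFFFFFF
    return low - 2 ** 32 if low >= 2 ** 31 else low


# A value that began life as an i32 survives widening and narrowing intact,
# so whether a check is needed depends on the whole chain of coercions.
round_trip = i64_to_i32_trap(int(I32_MAX))
wrapped = i64_to_i32_wrap(2 ** 31)
```

Both policies agree on in-range inputs, so the choice only matters for chains that can actually produce an out-of-range intermediate value.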

> Additionally:
>
> - I don't think there should be a special Resource<T> type bindings, core wasm reference types should be used instead.

I think that it is dangerous to conflate wasm types with idl types. wasm types are about how memory is used to realize applications; idl types are about interoperability. Two corner cases:

- A compiler may be free to change its mapping of language constructs to wasm types based on optimization opportunities, resulting in non-predictable encodings for language values. (I have a compiler that does exactly this.)
- Two languages have the right to map the same concept very differently (using different record layouts, using pointer compression, etc.).

In general, this will still be true after GC types are implemented. It is highly likely that something like Resource<T> would be mapped to wasm reference types in many situations; but then again, a given language may have its own idea about what makes a resource.

> - If we can define a module type and its textual representation as part of core wasm (as discussed in design/#1289 <WebAssembly/design#1289 (comment)>), then I don't think we need a *separate* IDL for bindings; it's sufficient just to reuse what's defined for core wasm, by simply allowing binding types in place of core wasm types. Thus, I'd very much prefer we didn't frame this proposal as defining a *new IDL*, but instead as defining a *new set of types* that can be used within the *existing* text format for defining module types (signatures).

I think I need more context to understand this. I have not been following this discussion closely.


PoignardAzur · 2019-07-24T21:47:07Z

> If we can define a module type and its textual representation as part of core wasm (as discussed in design/#1289), then I don't think we need a separate IDL for bindings; it's sufficient just to reuse what's defined for core wasm, by simply allowing binding types in place of core wasm types. Thus, I'd very much prefer we didn't frame this proposal as defining a new IDL, but instead as defining a new set of types that can be used within the existing text format for defining module types (signatures).

You mean use the existing S-expression syntax with new instruction names for new types?

lukewagner · 2019-07-24T21:56:36Z

@PoignardAzur What's being discussed in design/#1289 is just describing a module's type (i.e. "signature" or "interface"), and thus it only contains the types, not the binding expressions, which are implementation details of either side of the interface.

fgmccabe added 9 commits April 18, 2019 15:21

Draft of binary encoding spec

32d6734

Refactored proposal to eliminate use of named types.

418609b

Creating a space for the HTML to show up on github

1303bcd

rename binary.html to index.html

dd7eb4b

Fixed typos in first example

49b2789

MW on spec.

4ef87f7

MW

a9ee350

Initial draft of specs for bindings and IDL

a1f5058

Added an example operator

89c70dc

rossberg reviewed Jul 23, 2019

fitzgen reviewed Jul 24, 2019

Added skeleton of binary encoding.

a6ebbdf

Refactored variants to be part of a signature rather than being a union type.

lukewagner suggested changes Jul 24, 2019

Removed the Map/2 type from standard types

ffe9160

fgmccabe added 2 commits July 25, 2019 14:28

More details on binary encoding

e317474

Remove redundant binary spec

769f90c

lukewagner mentioned this pull request Aug 19, 2019

Rewrite explainer (again) to take into account new names, scope, and concepts #59

Merged

lukewagner deleted the branch WebAssembly:master September 21, 2021 19:19

lukewagner closed this Sep 21, 2021


	signature; if it is an access binding then the signature is a WASM function
	signature; if it is an access binding then the signature is a Wasm function

	wi-IDL (typically an imported function) is accessed from WASM. It takes the
	wi-IDL (typically an imported function) is accessed from Wasm. It takes the

	operators. The type of this lambda is a WASM function type.
	operators. The type of this lambda is a Wasm function type.

	(access ($B $L) ($integer-to-i32 (utf8-count (base-len-as-string $B $L))))
	(access ($B $L) (integer-to-i32 ($utf8-count (base-len-as-string $B $L))))


		#### Nominal Type {#nominal-type}

		A nominal type expression is used to identify types by name:
