Extensible encoding of function signatures #636

rossberg · 2016-04-05T12:23:54Z

As agreed upon at the summit. Better supports future extension with

new forms of type definitions,
functions with multiple return values,
potential mixture of base type constructors (i32..f64) and structured types in common positions, by avoiding overlap in their encoding.

ghost · 2016-04-05T13:22:57Z

BinaryEncoding.md

-| return_type | `value_type?` | the return type of the function, with `0` indicating no return type |
 | param_types | `value_type*` | the parameter types of the function |
+| return_count | `uint8` | the number of results from the function (0 or 1) |
+| return_type | `value_type?` | the result type of the function (if return_count is 1) |


Just as with sections, a string might be needed for each class of type declaration to avoid name clashes, and a size will be needed to skip unknown type declarations.

Might it be prudent to just define the return_count as a varuint32?
Could the return_type be defined a having return_count repeated value_type entries?

The 0 and 1 return_count limitation might not even be a limit for some tools. For example a tool to convert to the text format need not be limited to just 0 and 1, so perhaps the limit should be noted elsewhere, perhaps in validation rules for the MVP, and not a property of the binary encoding?

This is adding one byte per function that seems unnecessary and elsewhere some people have objected to this.

The function signatures are ordered, in the same order as the function bodies are defined, and the count defines the number of functions which seems useful for loading a vector etc. The count now would be the number of type declarations.

Is there any technical reason for not just using separate sections for future classes of type declarations?

On 5 April 2016 at 15:23, JSStats notifications@github.com wrote:

In BinaryEncoding.md
#636 (comment):

| param_types | value_type* | the parameter types of the function |
+| return_count | uint8 | the number of results from the function (0 or 1) |
+| return_type | value_type? | the result type of the function (if return_count is 1) |

Just as with sections, a string might be needed for each class of type
declaration to avoid name clashes, and a size will be needed to skip
unknown type declarations.

Unlike for unknown sections, which an implementation can ignore, there is
no sane way to handle unknown types, so that's not useful here.

Might it be prudent to just define the return_count as a varuint32?
Could the return_type be defined a having return_count repeated value_type
entries?

Note that uint8 is forward compatible with varuint32, and so is the
treatment of the count. That is intentional, of course.

The 0 and 1 return_count limitation might not even be a limit for some

tools. For example a tool to convert to the text format need not be limited
to just 0 and 1, so perhaps the limit should be noted elsewhere, perhaps in
validation rules for the MVP, and not a property of the binary encoding?

That would extend the AST, though. In particular, it would introduce
malformed types into the AST, which we currently don't have.

This is adding one byte per function that seems unnecessary and elsewhere

some people have objected to this.

No, this is one byte per different function type, typically a much
smaller number.

The function signatures are ordered, in the same order as the function

bodies are defined, and the count defines the number of functions which
seems useful for loading a vector etc. The count now would be the number of
type declarations.

I don't follow.

Is there any technical reason for not just using separate sections for
future classes of type declarations?

Yes: then you would need distinguished ways to reference the different
kinds of types, in places where either would be allowed. I.e., you'd put
the cost on each use site instead of the fewer definition sites.

Sorry, forgot that the functions just have a signature index, and see the utility of them all sharing an index space, thanks. Just a thought, but might it be useful to fit the base types into this same index space, so that where base type indexes are currently accepted these could potentially be other types? If not then how would a local be declared to have a struct type in future etc?

Just a thought, but might it be useful to fit the base types into this
same index space, so that where base type indexes are currently accepted
these could potentially be other types? If not then how would a local be
declared to have a struct type in future etc?

Yes, that's what the third bullet in the PR description is alluding to, and
the reason for having 0x05 as the code for the function type constructor,
which is disjoint from base types.

I think we might want to have a bit more index space for the primitive (and other nullary) type constructors. Or maybe that doesn't matter....

Yes, I bet you to it on the retargetted PR. :)

Understood that it's future compatible, but can we write the length to be a varuint32 just to avoid the question/answer altogether? I was planning to remove the only other uint8 (the exported field of the memory section) and then uint8 altogether to nail down this point that, except for a handful of careful special cases, all ints are variable-length.

rossberg · 2016-04-05T14:03:04Z

[Retargeted in #640.]

lukewagner · 2016-04-05T15:48:31Z

BinaryEncoding.md

-| Field | Type | Description |
+| Field | Type/Value | Description |
 | ----- |  ----- | ----- |
+| constructor | `0x05` | the function type constructor |


Let me check my understanding of this interpretation of "constructor":

we're viewing i32 and the other local types as having nullary constructors that construct the local type

0x5, the function constructor, constructs a function type given parameters/returns

in the local type declaration of a function body, we have a list of (lists of) locals that declare their type via integer index. In a future of richer (GC) types where we had an safe/opaque function-pointer type, the way you declare a local of func-ptr type would be to first emit the index of the function constructor, 0x5 which told the decoder to decode a second index which was an index into this types section array. At a higher level, you're saying first which constructor and then providing the arguments to the constructor.

If we had a "struct type" constructor, it could be, say, 0x6, and the same scheme above would apply and the validation rules would require that the index following a local type 0x5/0x6 had to match on the constructor field

If that's all right, then that seems great. Maybe it would be good to update GC.md to include this plan of record.

On Tue, Apr 5, 2016 at 5:49 PM, Luke Wagner notifications@github.com
wrote:

In BinaryEncoding.md
#636 (comment):

| ----- | ----- | ----- |
+| constructor | 0x05 | the function type constructor |

Let me check my understanding of this interpretation of "constructor":

we're viewing i32 and the other local types as having nullary
constructors that construct the local type

0x5, the function constructor, constructs a function type given
parameters/returns

in the local type declaration of a function body, we have a list of
(lists of) locals that declare their type via integer index. In a future of
richer (GC) types where we had an safe/opaque function-pointer type, the
way you declare a local of func-ptr type would be to first emit the index
of the function constructor, 0x5 which told the decoder to decode a
second index which was an index into this types section array. At a higher
level, you're saying first which constructor and then providing the
arguments to the constructor.

If we had a "struct type" constructor, it could be, say, 0x6, and
the same scheme above would apply and the validation rules would require
that the index following a local type 0x5/0x6 had to match on the
constructor field

If that's all right, then that seems great. Maybe it would be good to
update GC.md to include this plan of record.

I was thinking that we might unify the index spaces so that primitives are
low numbers (easily recognizable), and numbers larger than that would just
be indexes into the types section. Then there's no secondary validation
needed.

—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub
https://github.com/WebAssembly/design/pull/636/files/bb8a6e43af9bc6b5b904a5124062b59ae06d1304#r58563196

I see, so if the was greater-than a fixed base, you'd subtract the base and use that as the index into the types section? (Or we could do the analogous thing with signed varint32 and negatives.)

And now I remember the original question that motivated my post: given this single index space, then why does the constructor field need to start at any other value than 0? Function (and later struct, array) constructors are not in the same index space as the primitive types nor the elements of the types section.

On Tue, Apr 5, 2016 at 6:53 PM, Luke Wagner notifications@github.com
wrote:

In BinaryEncoding.md
#636 (comment):

| ----- | ----- | ----- |
+| constructor | 0x05 | the function type constructor |

And now I remember the original question that motivated my post: given
this single index space, then why does the constructor field need to
start at any other value than 0? Function (and later struct, array)
constructors are not in the same index space anymore.

Yep, good point.

—
You are receiving this because you commented.
Reply to this email directly or view it on GitHub
https://github.com/WebAssembly/design/pull/636/files/bb8a6e43af9bc6b5b904a5124062b59ae06d1304#r58574302

Restructure encoding of function signatures

bb8a6e4

ghost reviewed Apr 5, 2016
View reviewed changes

rossberg mentioned this pull request Apr 5, 2016

Extensible encoding of function signatures #640

Merged

rossberg closed this Apr 5, 2016

lukewagner reviewed Apr 5, 2016
View reviewed changes

Uh oh!

Extensible encoding of function signatures #636

Extensible encoding of function signatures #636

Uh oh!

Conversation

rossberg commented Apr 5, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rossberg commented Apr 5, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants