Tweaks to binary section format?

Having finished a first iteration of the encoder & decoder for the spec, I'd like to make a couple of small suggestions regarding the structure of sections in the binary. Lumping them together here.
##### Section Headers

Instead of
`(payload_size; name_string; payload)`
as the overall section structure, can we swap that to
`(name_string; payload_size; payload)`
? Two minor advantages:
- It makes the offset computation slightly less confusing.
- It allows simply viewing a section as a pair
  `(name_string; payload_string)`,
  which is particularly natural wrt handling and skipping unknown sections.

With that change, skipping over sections requires skipping over their name first. But I have a hard time imagining a reason for skipping a section without even knowing what it is.
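
To illustrate the pair view, here is a minimal sketch of a section reader for the proposed `(name_string; payload_size; payload)` layout. It assumes that both the name length and the payload size are unsigned LEB128 integers and that names are UTF-8; the helper names (`read_varuint`, `read_string`, `read_sections`) are made up for this example and are not taken from the spec.

```python
import io

def read_varuint(stream):
    """Read an unsigned LEB128 integer (assumed encoding for all sizes)."""
    result, shift = 0, 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of stream")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result
        shift += 7

def read_string(stream):
    """Read a length-prefixed byte string: a varuint size, then that many bytes."""
    size = read_varuint(stream)
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("truncated string")
    return data

def read_sections(data, known):
    """Yield (name, payload) pairs, skipping sections whose name is not in `known`."""
    stream = io.BytesIO(data)
    while stream.tell() < len(data):
        # Name and payload are decoded by the same routine,
        # so a section really is just a pair of strings.
        name = read_string(stream).decode("utf-8")
        payload = read_string(stream)
        if name in known:
            yield name, payload
```

Skipping an unknown section costs nothing extra here: its payload is read as an opaque string and dropped, without the reader needing a separate offset calculation.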
##### Section Names

Honestly, the current names are super verbose. Would anybody be opposed to shortening them a bit, to something nicer? I'd suggest:

```
"signatures"          -> "types"
"import_table"        -> "imports"
"function_signatures" -> "functions"
"export_table"        -> "exports"
"start_function"      -> "start"
"function_bodies"     -> "code"
"data_segments"       -> "data"
```

Note that the `signatures` section may be generalised to contain other kinds of type definitions in the future, so the current name is not a good fit.
##### Function Bodies

Function bodies are implicitly delimited by the byte size of their encoding. This is unfortunate for a couple of reasons:
- The expression decoder cannot just operate on an abstract stream; it has to take this secondary end-of-stream condition into account. The spec currently handles this by a somewhat ad-hoc notion of "substream", but really, this pierces the stream abstraction in an ugly manner.
- This is the only piece of the binary format that stands in the way of formulating the entire format unambiguously as a grammar, because it depends on non-local (and lower-level) context information. I find it rather sad to get 99% there but lose it on the last meter.

I'd hence like to suggest adding an explicit `end` opcode to functions. Pros:
- The binary format is fully structured and can be entirely parsed linearly from an abstract byte stream, without paying attention to any size information.
- Sizes would only be needed (a) to seek through the stream if desired, and (b) to validate that they are consistent. That decouples concerns nicely.

I'm aware that there are concerns that the extra byte is "redundant", but is one byte per function a big deal? We recently saved much more by improving the representation of locals (or would by shortening section names :) ).
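
As a rough sketch of what the `end` opcode buys, the decoder below reads a function body from a plain byte stream and stops at the terminator, never consulting a size. The opcode values (`OP_NOP`, `OP_CONST`, `OP_END`) and the one-byte immediate are invented for this toy example and do not correspond to the actual opcode table. The declared size is checked in a separate wrapper, which is the decoupling described above.

```python
import io

# Hypothetical opcode values, chosen only for this illustration.
OP_NOP, OP_CONST, OP_END = 0x00, 0x10, 0xFF

def read_body(stream):
    """Decode operators until the explicit end opcode; no size information needed."""
    ops = []
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("function body is missing its end opcode")
        op = chunk[0]
        if op == OP_END:
            return ops
        if op == OP_CONST:
            # Toy immediate: a single byte that follows the opcode.
            imm = stream.read(1)
            if not imm:
                raise EOFError("truncated immediate")
            ops.append(("const", imm[0]))
        elif op == OP_NOP:
            ops.append(("nop",))
        else:
            raise ValueError(f"unknown opcode {op:#x}")

def read_sized_body(stream, declared_size):
    """Decode a body, then validate the declared size as a separate concern."""
    start = stream.tell()
    ops = read_body(stream)
    if stream.tell() - start != declared_size:
        raise ValueError("declared size does not match decoded body")
    return ops
```

Note that a byte equal to the end opcode inside an immediate is not mistaken for the terminator, since the decoder consumes immediates as part of their operator.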

