Make format startup-/streaming-friendly #86

Open
Yoric opened this issue Mar 25, 2018 · 3 comments
Labels
Specification: Byte streams · Specifications: Container format · Tools: Optimization levers (Stuff that users of command-line tools can decide depending on what they want to optimize)

Comments

@Yoric
Collaborator

Yoric commented Mar 25, 2018

Generally, we want all data that is required during startup to appear before data that is only required later.

So, what data do we need during startup?

  • The toplevel;
  • Functions/Methods that are executed immediately (and their nested functions that are executed recursively, etc.).

Our assumption here is that we should optimize startup speed for code that is outside of any Skippable node, and that the encoder should figure out the rest.

One way to do this would be to change the format from what we have now:

[grammar]
// All node definitions
[strings]
// All string definitions
[ast]
// Ast definitions

into

[grammar]
// Node definitions used during startup.
[strings]
// String definitions used during startup.
[ast]
// Ast definitions used during startup.
[grammar]
// Node definitions used only after startup.
[strings]
// String definitions used only after startup.
[ast]
// Ast definitions used only after startup.
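As a rough illustration of the proposed ordering, here is a minimal sketch of how an encoder might lay out its sections. The `Entry` type, the `startup` flag and the `layout` function are all hypothetical names invented for this example; how the encoder actually decides what is needed at startup is left open by this issue.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    kind: str      # "grammar", "strings" or "ast"
    name: str
    startup: bool  # hypothetical flag: True if the encoder considers it needed at startup

def layout(entries):
    """Return sections in stream order: all startup tables, then all post-startup tables."""
    sections = []
    for startup in (True, False):
        for kind in ("grammar", "strings", "ast"):
            names = [e.name for e in entries if e.kind == kind and e.startup == startup]
            sections.append((kind, names))
    return sections

entries = [
    Entry("grammar", "FunctionDeclaration", True),
    Entry("grammar", "ClassDeclaration", False),
    Entry("strings", "main", True),
    Entry("strings", "onClick", False),
    Entry("ast", "toplevel", True),
    Entry("ast", "lazy_handler", False),
]
sections = layout(entries)
for kind, names in sections:
    print(f"[{kind}] {names}")
```

The point of the sketch is only the ordering: a decoder that reads the first three sections has everything the encoder marked as startup data before any post-startup byte arrives.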

Semantics

  • if parsing a node definition that is used during startup requires something that appears in a post-startup table, raise a SyntaxError – this does not include code hidden in a Skippable;
  • if executing a node during startup requires something that appears in a post-startup table (through dethunkification), raise a DelayedSyntaxError.

In either case, the encoder is in charge of deciding where to best put grammar/strings/ast definitions. This is both an optimization lever and a question of semantics.

Rationale for the second point: attempting to execute a node that depends on something provided in a later table means blocking run-to-completion until we have finished receiving network data. This is both complicated to implement and hard to specify, as receiving network data is observable by the DOM, which could in turn trigger JS code.
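The two rules above can be sketched as a pair of checks. This is a toy model under stated assumptions, not a decoder: the sets `STARTUP` and `POST_STARTUP`, and the functions `check_parse` and `check_execute`, are hypothetical, and `DelayedSyntaxError` is a stand-in Python exception for the error named in this issue.

```python
class DelayedSyntaxError(Exception):
    """Stand-in for the DelayedSyntaxError named in this issue."""

# Hypothetical contents of the startup and post-startup tables.
STARTUP = {"toplevel", "init"}
POST_STARTUP = {"lazy_handler"}

def check_parse(ref, inside_skippable=False):
    """Parse-time rule: startup code must not reference a post-startup entry,
    unless the reference is hidden inside a Skippable."""
    if ref in POST_STARTUP and not inside_skippable:
        raise SyntaxError(f"startup definition references post-startup entry {ref!r}")
    return True

def check_execute(ref):
    """Execution-time rule: running a node during startup that needs a
    post-startup entry (through dethunkification) fails instead of blocking."""
    if ref in POST_STARTUP:
        raise DelayedSyntaxError(f"{ref!r} is only available after startup")
    return True
```

Note that the same reference can pass `check_parse` (because it sits inside a Skippable) and still fail `check_execute` if startup code ends up running it, which is why the placement decision is a matter of semantics and not only of optimization.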

Further

Ideally, we'd like to get full streaming compilation/interpretation. This may mean more than 2 levels.

[grammar]
// Node definitions used during stage 1 (startup).
[strings]
// String definitions used during stage 1 (startup).
[ast]
// Ast definitions used during stage 1 (startup).
[grammar]
// Node definitions used during stage 2.
[strings]
// String definitions used during stage 2.
[ast]
// Ast definitions used during stage 2.
[grammar]
// Node definitions used during stage 3.
[strings]
// String definitions used during stage 3.
[ast]
// Ast definitions used during stage 3.
// ...

With the definition that any lookup in a table first looks in the stage 1 table, then, if the stage 1 table is not long enough to contain the index, in the stage 2 table, and so on.
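One possible reading of this lookup rule, sketched below, is that indices run continuously across stages, so an index past the end of stage 1 falls through into stage 2. That reading is an assumption, and `StagedTable` with its methods is a hypothetical name used only for illustration.

```python
class StagedTable:
    """A table delivered in stages; indices continue from one stage to the next (assumed)."""

    def __init__(self):
        self.stages = []

    def add_stage(self, entries):
        """Record the table of the next stage once it has been received."""
        self.stages.append(list(entries))

    def lookup(self, index):
        """Look in stage 1 first; if that table is too short, continue in stage 2, etc."""
        remaining = index
        for stage in self.stages:
            if remaining < len(stage):
                return stage[remaining]
            remaining -= len(stage)
        raise LookupError(f"index {index} is not in any stage received so far")

strings = StagedTable()
strings.add_stage(["main", "init"])   # stage 1 (startup)
first = strings.lookup(1)             # found in stage 1
strings.add_stage(["onClick"])        # stage 2 arrives later
later = strings.lookup(2)             # falls through to stage 2
```

In this model, a lookup that fails before the relevant stage has arrived is exactly the case for which error semantics still need to be decided.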

Again, we'll let the encoder decide where best to place the data, and we'll need to decide on the semantics for errors.

@Yoric Yoric added Specification: Byte streams Specifications: Container format Tools: Optimization levers Stuff that users of command-line tools can decide depending on what they want to optimize labels Mar 25, 2018
@Yoric Yoric changed the title Make format startup-friendly Make format startup-/streaming-friendly Mar 25, 2018
@syg
Collaborator

syg commented Mar 28, 2018

I think full streaming compilation is the realistic goal, see issue at https://github.com/binast/ecmascript-binary-ast/issues/12

I think streaming interpretation, given the deferred function-at-a-time error model we're going with, will end up slowing down the more widely useful streaming parsing use case.

@Yoric
Collaborator Author

Yoric commented Apr 4, 2018

We reached agreement over in binast/ecmascript-binary-ast#12, in particular not to call "streaming interpretation" what I was calling "streaming interpretation".

@Yoric
Collaborator Author

Yoric commented Apr 24, 2018

Discussing with @lukewagner, we realized that most VMs will have difficulties implementing the semantics in which we wait during execution for the loading of a function that appears further down in the stream.

So, amending

if executing a node during startup requires something that appears in a post-startup table (through dethunkification), should we raise a DelayedSyntaxError or just take the performance hit?

into

if executing a node during startup requires something that appears in a post-startup table (through dethunkification), we raise a DelayedSyntaxError.
