Tidy up preprocessor and parser #12

Merged
34 commits merged on Jul 16, 2023

Commits on Jul 2, 2023

  1. Fix typos in preprocessor tests

    These don’t really affect the preprocessor because it doesn’t care about
    the details, but it’s less confusing if the modules in the tests are at
    least syntactically valid.
    tomstuart committed Jul 2, 2023
    b6f09f5

Commits on Jul 4, 2023

  1. Extract #assert_preprocess_module_fields test helper

    Since almost all of the preprocessor tests are for fields within a
    single module body, we can reduce boilerplate (and indentation levels)
    by introducing an assertion which automatically wraps its arguments in
    `(module …)`.
    tomstuart committed Jul 4, 2023
    3cf39b6
  2. Update preprocessor tests to not depend upon type use abbreviation

    Most of the preprocessor tests use syntax which relies upon the ability
    to omit an explicit type use [0] and have it automatically inserted when
    abbreviations are expanded. To prevent these tests from breaking when we
    implement desugaring of type use abbreviations we need to update them to
    contain full type uses which are unaffected by the expansion of that
    abbreviation.
    
    When adding an unabbreviated type use, I’ve removed the corresponding
    inline declarations unless they’re the focus of the test.
    
    [0] https://webassembly.github.io/spec/core/text/modules.html#abbreviations
    tomstuart committed Jul 4, 2023
    207c4ad

Commits on Jul 5, 2023

  1. Pretty print S-expressions when a preprocessor test fails

    This makes it easier to see what’s gone wrong.
    tomstuart committed Jul 5, 2023
    27c94c4
  2. Preprocess instructions inside global definitions

    As with `offset` and element expressions, the test is demonstrating
    desugaring of instructions in an invalid program because that’s the
    correct behaviour for the parsing phase regardless of validation. In
    practice only constant instructions [0] may appear inside a global
    definition [1] but we have no way to demonstrate desugaring of constant
    instructions yet.
    
    [0] https://webassembly.github.io/spec/core/valid/instructions.html#valid-constant
    [1] https://webassembly.github.io/spec/core/valid/modules.html#globals
    tomstuart committed Jul 5, 2023
    dc2465b
  3. Recursively preprocess expanded inline import

    We need to desugar an inline import after expanding it because its
    description can contain other abbreviations (e.g. multiple anonymous
    parameters and results in the case of function imports).
    
    I overlooked this earlier because I didn’t write a test to cover it and
    presumably the official WebAssembly test suite doesn’t cover it either,
    so no tests failed when I removed inline import support from the parser.
    tomstuart committed Jul 5, 2023
    ea67cc0
  4. Make preprocessing for export and start fields explicit

    These are the only two remaining kinds of field and neither supports any
    abbreviations, so we can list them as an intentional no-op case and have
    all field kinds explicitly represented.
    tomstuart committed Jul 5, 2023
    41f065c
  5. Decouple reading from result generation everywhere in preprocessor

    This separation exists in most places in the preprocessor, so for the
    sake of stylistic consistency we should do it everywhere.
    tomstuart committed Jul 5, 2023
    02663dc
  6. Rename module field parsing methods to match language spec and preprocessor

    There’s no reason for these to be different, and the longer names are
    clearer.
    tomstuart committed Jul 5, 2023
    7ed0b57
  7. Extract #fresh_id helper in preprocessor

    It’s still the wrong way of doing it, but at least now it’s only wrong
    once.
    tomstuart committed Jul 5, 2023
    b2d3ea8

Commits on Jul 6, 2023

  1. Extract ReadOptionalId helper module from preprocessor and parser

    This logic is repeated in a lot of places, so let’s extract and
    reuse it.
    
    IDs are essentially always optional so I’ve chosen to write a single
    method which unconditionally tries to read an ID (and may return nil)
    rather than a pair of #can_read_id?/#read_id methods. That does mean we
    have to check for `id.nil?` in a couple of places rather than asking a
    dedicated predicate, but it’s rare enough that I think this trade-off is
    the right one.
    tomstuart committed Jul 6, 2023
    4f3c2e9
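
A minimal sketch of the single-method approach described in the ReadOptionalId commit above. It assumes the S-expression is held as an array of string tokens and nested arrays, and that an ID is a token beginning with `$`. The `TinyParser` class, the `s_expression` reader, the `ID_REGEXP` constant and the `read_optional_id` method name are illustrative assumptions, not the project’s actual code.

```ruby
# Hypothetical helper: unconditionally tries to read an ID and returns nil
# when the next token is not one.
module ReadOptionalId
  # Assumption: an ID is a string token beginning with "$".
  ID_REGEXP = /\A\$.+\z/

  private

  def read_optional_id
    # Regexp#=== is false for nil and for arrays, so no extra guards are needed.
    s_expression.shift if s_expression.first in ID_REGEXP
  end
end

# Hypothetical includer standing in for the preprocessor or the parser.
class TinyParser
  include ReadOptionalId

  attr_reader :s_expression

  def initialize(s_expression)
    @s_expression = s_expression
  end

  def parse_function
    s_expression.shift => 'func'
    id = read_optional_id # may be nil
    { id: id, rest: s_expression }
  end
end

named = TinyParser.new(['func', '$add', ['param', 'i32']]).parse_function
anonymous = TinyParser.new(['func', ['param', 'i32']]).parse_function
```

Callers that care whether an ID was present check `id.nil?` on the returned value, which is the trade-off the commit message describes.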
  2. Use pattern matching to check for index when parsing br_table instruction

    This is how we do it everywhere else, and there’s no reason for it to be
    different here. The pattern match also has the advantage of tolerating
    non-string values (e.g. arrays), which makes the #can_read_list? call
    unnecessary.
    
    Removing the numbered parameter changes the block’s arity, so we need to
    add an explicit dummy parameter to keep #repeatedly happy. In hindsight
    I don’t think the `until:` API is a good idea so I’m going to change
    that soon.
    tomstuart committed Jul 6, 2023
    08436fe
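
A small sketch of why a regexp pattern match tolerates non-string values, as the br_table commit above notes. The `INDEX_REGEXP` constant is a simplified assumption (a numeric index or a `$` identifier), and `index?` and `leading_indexes` are hypothetical names used only for illustration.

```ruby
# Assumption: a simplified index pattern (numeric index or $identifier).
INDEX_REGEXP = /\A(?:\d+|\$.+)\z/

# `in` tests the pattern with Regexp#===, which returns false for non-string
# values such as arrays instead of raising.
def index?(value)
  value in INDEX_REGEXP
end

# Collect the leading run of indexes from an instruction's operands, stopping
# at the first operand that is not an index (for example a nested list).
def leading_indexes(operands)
  operands.take_while { |operand| index?(operand) }
end

indexes = leading_indexes(['0', '$exit', '1', ['i32.const', '0']])
list_is_index = index?(['i32.const', '0'])
```

Because the nested list simply fails to match, no separate "is the next thing a list?" check is needed before testing for an index.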
  3. Extract ReadIndex helper module from AST parser

    This provides a nice expressive interface for dealing with indexes,
    moves a complicated regexp out of the parser, and puts the
    implementation in a helper where the preprocessor will be able to share
    it in future.
    tomstuart committed Jul 6, 2023
    adf38a6

Commits on Jul 16, 2023

  1. Add ?: to regular expressions without named groups

    In 8ea2aa0 I removed unnecessary `?:` from regular expressions which
    already didn’t capture unnamed groups (because they also contained named
    ones), but I neglected to check whether other regular expressions were
    unintentionally capturing unnamed groups, so I’m fixing that now.
    
    Although this adds more syntactic noise to the regular expressions, I
    think our intent is clearer if we only capture groups which we intend to
    use, so I’m prepared to pay the price.
    tomstuart committed Jul 16, 2023
    18f1b51
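
The capturing behaviour that the `?:` commit above relies on can be illustrated with three throwaway regular expressions. These patterns are made up for the example and are not taken from the project.

```ruby
# With no named groups, plain parentheses capture.
capturing = /\A(0x)?(\d+)\z/.match('0x42').captures

# Adding ?: turns the first group into a non-capturing one.
non_capturing = /\A(?:0x)?(\d+)\z/.match('0x42').captures

# When a named group is present, Ruby does not capture unnamed groups,
# so ?: would be redundant here.
with_named = /\A(0x)?(?<digits>\d+)\z/.match('0x42').captures
```

Here `capturing` holds two captures, while `non_capturing` and `with_named` each hold only the digits.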
  2. Clean up action read in #parse_assert_return and #parse_assert_trap

    The pattern match will raise an exception if it fails, so there’s no
    need to do it ourselves.
    tomstuart committed Jul 16, 2023
    9483690
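
A sketch of the behaviour the commit above depends on: a rightward pattern match (`=>`) raises `NoMatchingPatternError` by itself when the value doesn’t fit. The `read_invoke` method and the shape of the action are assumptions made for illustration.

```ruby
# Hypothetical reader: destructures an action with a rightward pattern match.
# No explicit "else raise" is needed because => raises on failure.
def read_invoke(action)
  action => ['invoke', String => name, *arguments]
  { name: name, arguments: arguments }
end

ok = read_invoke(['invoke', '"add"', ['i32.const', '1']])

error =
  begin
    read_invoke(['get', '"x"'])
  rescue NoMatchingPatternError => e
    e
  end
```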
  3. Use #parse_typeuse when parsing call_indirect instruction

    The grammar says a type use appears here [0] so we should be reusing all
    of the appropriate parsing machinery.
    
    This creates the opportunity to check that no named parameters appear in
    the type use, which we’ll do in the next commit.
    
    [0] https://webassembly.github.io/spec/core/text/instructions.html#text-call-indirect
    tomstuart committed Jul 16, 2023
    c770866
  4. Check that no named parameters appear in call_indirect type use

    We already do this in #parse_blocktype because of its side condition
    which requires a type use to contain only unnamed entries [0]. The same
    side condition applies to `call_indirect`’s type use [1] so we should
    perform the same check.
    
    [0] https://webassembly.github.io/spec/core/text/instructions.html#text-instr-block
    [1] https://webassembly.github.io/spec/core/text/instructions.html#text-call-indirect
    tomstuart committed Jul 16, 2023
    06c71da
  5. Use #parse_results when parsing select instruction

    The grammar says zero or more results appear here [0] so we should be
    reusing the appropriate parsing machinery.
    
    [0] https://webassembly.github.io/spec/core/text/instructions.html#parametric-instructions
    tomstuart committed Jul 16, 2023
    499b1e1
  6. Use #parse_index when parsing table.init instruction

    These are indexes [0], so we should be reusing the index-parsing machinery.
    
    [0] https://webassembly.github.io/spec/core/text/instructions.html#table-instructions
    tomstuart committed Jul 16, 2023
    af92d91
  7. Use Kernel#loop in #repeatedly helper

    This’ll allow the loop to be terminated by the `StopIteration` exception
    instead of needing to support custom conditions through arguments.
    tomstuart committed Jul 16, 2023
    f3681f3
  8. Replace until: with raising StopIteration in #repeatedly callers

    It’s admittedly a bit “exceptions for control flow”, but this way of
    using `StopIteration` is idiomatic in Ruby so I think giving the caller
    arbitrary control over loop termination is easier to understand than
    relying upon a mystery `until:` argument.
    
    I’ve resisted the temptation to push `raise StopIteration` further down
    the call stack (into e.g. #read_list, or even #read) to avoid spooky
    action at a distance; I think the logic is easiest to understand when
    `raise StopIteration` appears syntactically within the #repeatedly
    block.
    
    I’ve also chosen a guard-style raise (`raise … if …`) rather than
    putting the whole body of the block inside a conditional (`if … else
    raise … end`) because the former puts the `raise` lexically adjacent to
    the #repeatedly call instead of pushing it further away (e.g. inside the
    `else` clause) which should improve clarity. This choice also helps to
    keep the indentation levels under control.
    tomstuart committed Jul 16, 2023
    f87fbb2
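
A minimal sketch of a `#repeatedly`-style helper built on `Kernel#loop`, together with a caller using the guard-style `raise StopIteration` described in the commit above. The body of `repeatedly` (collecting each yielded value into an array) and the `read_params` caller are assumptions for illustration; the project’s helper may differ.

```ruby
# Hypothetical helper: Kernel#loop rescues StopIteration and ends the loop,
# so the block decides when to stop.
def repeatedly
  [].tap do |results|
    loop do
      results << yield
    end
  end
end

# Hypothetical caller: the guard-style raise sits directly under the
# #repeatedly call rather than inside an else clause.
def read_params(tokens)
  repeatedly do
    raise StopIteration unless tokens.first in ['param', *]
    tokens.shift
  end
end

tokens = [['param', 'i32'], ['param', 'i64'], ['result', 'i32']]
params = read_params(tokens)
```

After the call, `params` holds the two `param` lists and `tokens` still contains the unread `result` list.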
  9. Remove unused until: argument from #repeatedly helper

    We always use `StopIteration` now.
    tomstuart committed Jul 16, 2023
    347c635
  10. Use #repeatedly helper instead of while/until in preprocessor and parser

    This avoids us having to do the concatenating ourselves.
    
    Again I’ve chosen guard-style `raise StopIteration if …` clauses to keep
    them adjacent to the #repeatedly call for clarity’s sake.
    tomstuart committed Jul 16, 2023
    bff8044
  11. Disable YJIT when running tests

    I’d like to use destructuring with optional block parameters in the
    implementation of #unzip_pairs, but there’s a bug in YJIT [0] which
    prevents this from working. The easiest solution for now is to turn YJIT
    off.
    
    [0] Shopify/yjit#313
    tomstuart committed Jul 16, 2023
    192014b
  12. Use optional block parameters in #unzip_pairs

    When `pairs` is empty then `pairs.transpose` is `[]`, which gets
    destructured into a pair of nils for the two block parameters. We can
    replace the nils in this case by defaulting each parameter to the empty
    array.
    tomstuart committed Jul 16, 2023
    dd5488c
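
A sketch of the optional-block-parameter technique described in the commit above. The method body is an assumption about how such an `unzip_pairs` might look, not the project’s exact code, and it relies on the interpreter’s normal block destructuring (the commit before it mentions a YJIT bug affecting this).

```ruby
# Hypothetical implementation: transpose turns a list of pairs into a pair of
# lists. For an empty input, transpose returns [], which destructures to no
# values at all, so each block parameter falls back to its default of [].
def unzip_pairs(pairs)
  pairs.transpose.then do |lefts = [], rights = []|
    [lefts, rights]
  end
end

unzipped = unzip_pairs([['$x', 'i32'], [nil, 'i64']])
empty = unzip_pairs([])
```

Without the defaults, the empty case would produce a pair of nils instead of a pair of empty arrays.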
  13. Push #unzip_pairs into #parse_parameters and #parse_locals

    The only uses of #unzip_pairs are to unzip the results of these methods,
    so we might as well have them return an unzipped result.
    tomstuart committed Jul 16, 2023
    505caa5
  14. Use #unzip_pairs in #parse_results

    This was the only caller of #parse_declarations which didn’t unzip its
    result with #unzip_pairs. By doing it consistently across all callers we
    create the opportunity to have #parse_declarations return unzipped
    results directly, which we’ll do in the next commit.
    tomstuart committed Jul 16, 2023
    Configuration menu
    Copy the full SHA
    53ecd0d View commit details
    Browse the repository at this point in the history
  15. Push #unzip_pairs into #parse_declarations

    The only uses of #unzip_pairs are to unzip the result of the
    #parse_declarations method, so we might as well have it return an
    unzipped result.
    tomstuart committed Jul 16, 2023
    c98e965
  16. Inline #unzip_pairs into #parse_declarations

    This is now the only caller of #unzip_pairs, so we can safely build it
    into #parse_declarations directly.
    tomstuart committed Jul 16, 2023
    b951678
  17. Support multiple memories in AST parser

    Validation constrains modules to at most one memory [0] but the parser
    doesn’t need to worry about that. It creates more regularity in the
    parser to support arbitrarily many memories (just like functions,
    tables, globals etc) and let validation deal with any problems.
    
    [0] https://webassembly.github.io/spec/core/valid/modules.html#valid-module
    tomstuart committed Jul 16, 2023
    f253bb0
  18. Initialise all module field vectors together in AST parser

    Now that these are all arrays (except for `start`) we can consolidate
    the setup of their initial values.
    tomstuart committed Jul 16, 2023
    6df4e7a
  19. Consolidate references to singleton memory in interpreter

    I’d like to support multiple memories here to match the AST parser, and
    that change will be easier if we assign `current_module.memory` to a
    local variable instead of repeatedly using it inline.
    tomstuart committed Jul 16, 2023
    17157a4
  20. Support multiple runtime memories in interpreter

    Validation won’t actually allow multiple memories [0] but it makes the
    implementation more regular, and more like the formal WebAssembly
    language specification, if we assume they’re possible. All of the memory
    instructions implicitly operate on memory index zero [1] so we’ve now
    made that explicit in their implementation.
    
    The #initialise_memories method no longer raises an error if the memory
    index is non-zero; this check should’ve been happening during validation
    anyway (which we’re not doing yet) so losing it doesn’t matter.
    
    [0] https://webassembly.github.io/spec/core/valid/modules.html#valid-module
    [1] https://webassembly.github.io/spec/core/syntax/instructions.html#memory-instructions
    tomstuart committed Jul 16, 2023
    1ddfef3
  21. Extract #allocate_instances helper in interpreter

    Now that modules have multiple memories, we can use the same approach
    for all relevant fields during allocation [0].
    
    This makes it clear that exports are different and shouldn’t be “built”
    in the same way; we could’ve already seen this from the definition of
    module allocation [1]. I’m not going to get distracted by making any
    more changes to the interpreter at the moment, but we should come back
    to this in future.
    
    [0] https://webassembly.github.io/spec/core/exec/modules.html#allocation
    [1] https://webassembly.github.io/spec/core/exec/modules.html#alloc-module
    tomstuart committed Jul 16, 2023
    9042b94