Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preprocess only genuine type uses inside instructions #13

Merged
merged 10 commits into from Jul 16, 2023

Conversation

tomstuart
Copy link
Owner

As described in d0f5675, type uses can appear inside a few different kinds of instruction (block, loop, if and call_indirect) so we need to find and preprocess them to make it possible to add support for the type use abbreviation in future.

Our previous implementation was hacky and depended upon finding every syntactic occurrence of param and result inside an instruction and desugaring multiple anonymous occurrences. This is fine for that specific abbreviation but unsuitable for type uses more generally because some occurrences of result aren’t type uses: a select instruction uses result but that isn’t a type use, and a block type may or may not contain a type use depending upon whether type and param appear and how many results it has.

This PR refines the implementation to correctly identify and preprocess only genuine type uses, taking into account the rules for when a block type should be treated as a type use.

We want to use #process_typeuse for processing parameters and results,
but #process_typeuse doesn’t expect to be called inside #read_list, so
we have to separate the two cases first.
This makes it easier to add more cases for specific instruction types,
which is what we want to do next.
If we intend to desugar type uses, we have to know where they all are.
Type uses can occur inside `block`/`loop`/`if` and `call_indirect`
instructions, so we need to preprocess them there.
The `select` instruction contains `result` types [0] which look
syntactically like type uses but explicitly aren’t in this context, so
we have to be careful to avoid triggering the type use preprocessor.

The tests in this commit enforce the constraint that a `select`
instruction’s `result` types must be left alone by any desugaring of the
type use abbreviation since they’re not type uses. We don’t have any
such desugaring yet, so the tests don’t currently constrain anything,
but they’ll prevent regressions when we do implement it.

[0] https://webassembly.github.io/spec/core/text/instructions.html#parametric-instructions
At the moment we’re treating any occurrence of `result` as a type use,
but that brute force approach isn’t correct because `result` can appear
in a block type [0] without necessarily being a type use, so we need to
be a bit more surgical.

d0f5675 already added test coverage for
this behaviour when it introduced the initial “hack” to expand parameter
and result abbreviations inside instructions, so we don’t need to test
it again here.

We’re not yet observing the rules about whether a block type constitutes
a type use, but introducing #process_blocktype gives us a place to do it
in future.

[0] https://webassembly.github.io/spec/core/text/instructions.html#text-blocktype
A block type isn’t a type use in certain circumstances [0] so we need to
read all of its parts first to make it possible to decide (in subsequent
commits) whether it’s actually a type use.

It’s important that we expand abbreviations before trying to decide
whether this is a type use or not. For example, `(param)` is an
abbreviation for no parameters, so the syntactic presence of `param`
doesn’t necessarily indicate a type use.

[0] https://webassembly.github.io/spec/core/text/instructions.html#text-blocktype
Per the spec [0], “the special case of a type use that is syntactically
empty […] is not regarded as an abbreviation for an inline function
type, but is parsed directly into an optional value type”.

The tests in this commit enforce the constraint that an empty block type
should be left alone by any desugaring of the type use abbreviation
since it’s not a type use. We don’t have any such desugaring yet, so the
tests don’t currently constrain anything, but they’ll prevent
regressions when we do implement it.

[0] https://webassembly.github.io/spec/core/text/instructions.html#text-blocktype
Per the spec [0], “the special case of a type use that […] consists of
only a single result is not regarded as an abbreviation for an inline
function type, but is parsed directly into an optional value type”.

The tests in this commit enforce the constraint that a single-result
block type should be left alone by any desugaring of the type use
abbreviation since it’s not a type use. We don’t have any such
desugaring yet, so the tests don’t currently constrain anything, but
they’ll prevent regressions when we do implement it.

[0] https://webassembly.github.io/spec/core/text/instructions.html#text-blocktype
We already have plenty of tests which exercise `call_indirect` (because
it was our go-to for an instruction containing a type use) so we’ll just
test that subsequent instructions aren’t ignored.
All legitimate type uses are now covered by instruction-specific cases,
so we no longer need the blunt instrument of looking for them
everywhere.
@tomstuart tomstuart merged commit d434813 into main Jul 16, 2023
2 checks passed
@tomstuart tomstuart deleted the preprocess-type-uses-in-instructions branch July 16, 2023 15:37
tomstuart added a commit that referenced this pull request Jul 19, 2023
To desugar the type use abbreviation [0] we’ll need access to a list of
all the module’s type definitions [1] when preprocessing a type use [2],
but the full list of type definitions will only be available once the
complete module has been processed. This creates a chicken-and-egg
situation: we can’t desugar a type use until we’ve processed all of the
enclosing module’s fields, but many of those fields will contain type
uses which need to be desugared during processing.

This commit breaks the deadlock by refactoring the preprocessor to allow
some field-processing methods to return a “deferred result”, namely a
proc which must be called with a list of type definitions in order to
produce the final S-expression. Deferred results give us the ability to
process a module’s fields first and then provide type definitions later;
the next step will be to generate the type definitions during processing
so we can make use of that ability.

This continues the preparatory work from #12 and #13. We’re still in the
“make the change easy” [3] phase, so there’s no externally visible
difference in the preprocessor’s behaviour yet.

[0] https://webassembly.github.io/spec/core/text/modules.html#abbreviations
[1] https://webassembly.github.io/spec/core/text/modules.html#types
[2] https://webassembly.github.io/spec/core/text/modules.html#type-uses
[3] https://twitter.com/KentBeck/status/250733358307500032
tomstuart added a commit that referenced this pull request Jul 20, 2023
)

The #process_module method is ready to provide type definitions to the
deferred results produced by preprocessing the module’s fields so that
the type use abbreviation [0] can be desugared, but we don’t currently
have any way to communicate those type definitions to #process_module.
Although the processed fields do include type definition fields, they’re
all deferred until we’re ready to provide the type definitions — another
chicken-and-egg situation.

This commit adds a “list of type definitions” return value
to #process_type_definition and refactors other preprocessor methods to
allow that value to propagate up to #process_module. The next (and
final!) step will be to populate it with real type definitions and use
those to desugar the type use abbreviation.

This concludes the preparatory work begun in #12, #13 and #14, and
therefore still doesn’t introduce any externally visible changes to the
preprocessor’s behaviour.

[0] https://webassembly.github.io/spec/core/text/modules.html#abbreviations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant