Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cranelift-isle: codegen from new IR #5435

Merged
merged 18 commits into from
Jan 23, 2023

Conversation

jameysharp
Copy link
Contributor

ISLE's existing code-generation strategy doesn't generate the most efficient matching order for rules. This PR completely replaces it.

With this PR applied, wasmtime compile retires 2% fewer instructions on the pulldown-cmark and spidermonkey benchmarks from Sightglass.

A dev build of cranelift-codegen from an empty target/ directory takes 2% less time. The build script, invoking ISLE, takes a little longer, but Rust can compile the generated code faster, so it balances out.

This PR is ready for review, but I'm expecting review to take a few rounds of back-and-forth while we figure out what comments I should have written.

@jameysharp jameysharp requested a review from cfallin December 14, 2022 02:48
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Dec 14, 2022
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@jameysharp
Copy link
Contributor Author

The "Doc build" and "Fuzz Targets" CI failures look to be due to using an older nightly Rust version (nightly-2022-09-07) than the stable release we're using for everything else. I used the new let-else syntax in this PR, which stabilized last month in Rust 1.65. I could remove that syntax from this PR without much disappointment.

The ubuntu-latest test failures tell me I forgot to run the ISLE tests, again. Most of those failures are due to integer constants that don't fit in their specified types, like what we discussed in #5423. There are also a couple hitting some asserts that I added at the last minute, so the asserts may be wrong.

@jameysharp
Copy link
Contributor Author

I've split the "easy" parts of this into separate commits in #5441 in hopes that we can get review of those parts out of the way quickly.

TODO: write a useful commit message
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments from our pair-review so far -- most things making sense, just some confusion around equality constraints. To be continued!

cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
for &impure in rule.impure.iter() {
self.use_expr(impure);
}
self.use_expr(rule.result);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify invariant here and elsewhere: are the bindings named in Condition variants required to already be use_expr'd, or does the consumer of the Conditions do so?

cranelift/isle/isle/src/serialize.rs Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the doc-comments -- with updates, this is much clearer to me now!

The equality-constraint handling is still pretty subtle.HasControlFlow::Equal candidates seem to come from best_single_binding's logic that scans over other candidates, given a first one that is Equal. The rule-partitioning that determines if a HasControlFlow::Equal selects a rule then uses the rule's equivalence-classes containing two two sides of the equality and... checks if they have any overlap? The bigger picture here -- how we ensure that when we eventually include a rule's right-hand side, we've satisfied all equalities, what the above rule partitioning is really doing (the tested equality... implies some subset of the required equalities? Does not contradict them?), how we refine the list of remaining required equalities in the context of the check, etc... it would be great to see some documentation on this.

Other than that, I'm pretty happy with the rest; I've looked over the new codegen.rs and it seems pretty reasonable. And the overall scheme for serialization into control-flow makes sense. Let's just sort the equality checking and I think this is good to go!

cranelift/isle/isle/src/codegen.rs Show resolved Hide resolved
cranelift/isle/isle/src/codegen.rs Outdated Show resolved Hide resolved
This changes the order that some control flow is emitted in the egraph
optimization rules, but as far as I can tell didn't change what
conditions are actually checked.
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really close to complete! I have some thoughts below but otherwise I think I'm satisfied that the logic is correct, so I'm going to go ahead and r+ it (with outstanding comments addressed).

Jamey and I talked a bit offline about this and we came up with the idea of a "translation validation" approach to get a little more confidence in the new codegen: the idea is to build a symbolic representation of the result of codegen (inputs from external extractors being the terminals and constructor calls being the results), derive this both from the original rules processed in priority order linearly (i.e., the "naive ISLE semantics") and the tree of control flow generated here. If we can canonicalize and show equivalence, then we have a translation validation for a particular ISLE compilation; and we could actually enable this by default in islec if fast enough, or at least in debug-assertion builds if not. (The check is at ISLE compile time and has no compiler-runtime cost.) This actually gives us better assurance than we have today with the current ISLE codegen; for a given build of Cranelift, all of this logic would then be outside the TCB and we know the rules are being evaluated properly. (Then we just need to validate the rules themselves.)

I think Jamey is going to see if there's a not-too-terrible path to implementing this (a few days to a week of work?) before merging this, because if it works it would be the best possible assurance that all of this really subtle logic is correct; to state my position in advance though, if this turns out to be more difficult than expected, I'm fine with merging this first, as the use in Cranelift (and associated fuzzing and testsuites) are a reasonable test of the ISLE rule handling as well; or at least, as good as we have today.

cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
cranelift/isle/isle/src/serialize.rs Outdated Show resolved Hide resolved
// equivalence class. By specifying the same binding site twice,
// we're checking whether the binding site is in any equivalence
// class at all.
set_score(&mut candidate.score, HasControlFlow::Equal(source, source))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about introducing a function on HasControlFlow called something like any_equiv_class that constructs the Equal(source, source) case? Something like:

impl HasControlFlow {
  fn any_equivalence_class(source: BindingId) -> Self {
    Self::Equal(source, source)
  }
}

It might help when seeing uses where the args are the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both you and Chris found the Equal(x, x) pattern confusing, I've implemented an alternative. What do you think of 2bb7988?

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked over the new handling of equalities and it seems reasonable enough; combined with my earlier review and Trevor's subsequent reviews I have no objections to seeing this merged!

@jameysharp jameysharp merged commit bfc6aad into bytecodealliance:main Jan 23, 2023
@jameysharp jameysharp deleted the isle-codegen branch January 23, 2023 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants