Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new matcher representation for use in parse_tt #95555

Merged
merged 2 commits into from
Apr 4, 2022

Conversation

nnethercote
Copy link
Contributor

@nnethercote nnethercote commented Apr 1, 2022

By transforming the matcher into a different form, parse_tt can run faster and be easier to understand.

r? @petrochenkov

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Apr 1, 2022
@nnethercote nnethercote marked this pull request as draft April 1, 2022 06:54
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
is a bad representation for the traversal.
- `TokenTree` is nested, and there's a bunch of expensive and fiddly
  state required to handle entering and exiting nested submatchers.
- There are three positions (sequence separators, sequence Kleene ops,
  and end of the matcher) that are represented by an index that exceeds
  the end of the `&[TokenTree]`, which is clumsy and error-prone.

This commit introduces a new representation called `MatcherLoc` that is
designed specifically for matching. It fixes all the above problems,
making the code much easier to read. A `&[TokenTree]` is converted to a
`&[MatcherLoc]` before matching begins. Despite the cost of the
conversion, it's still a net performance win, because various pieces of
traversal state are computed once up-front, rather than having to be
recomputed repeatedly during the macro matching.

Some improvements worth noting.
- `parse_tt_inner` is *much* easier to read. No more having to compare
  `idx` against `len` and read comments to understand what the result
  means.
- The handling of `Delimited` in `parse_tt_inner` is now trivial.
- The three end-of-sequence cases in `parse_tt_inner` are now handled in
  three separate match arms, and the control flow is much simpler.
- `nameize` is no longer recursive.
- There were two places that issued "missing fragment specifier" errors:
  one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
  latter was never executed. There's now a single place issuing these
  errors, in `compute_locs()`.
- The number of heap allocations done for a `check full` build of
  `async-std-1.10.0` (an extreme example of heavy macro use) drops from
  11.8M to 2.6M, and most of these occur outside of macro matching.
- The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough
  that it no longer needs boxing, which partly accounts for the
  reduction in allocations.
- The rest of the drop in allocations is due to the removal of
  `MatcherKind`, because we no longer need to record anything for the
  parent matcher when entering a submatcher.
- Overall it reduces code size by 45 lines.
To match the order the variants are declared in.
@nnethercote nnethercote marked this pull request as ready for review April 4, 2022 07:22
@nnethercote
Copy link
Contributor Author

r? @petrochenkov

Best reviewed one commit at a time. Apologies for the first commit being quite large; the change is big and invasive enough that there wasn't really a good way to break it into smaller pieces.

@nnethercote
Copy link
Contributor Author

Here are some instruction count results for check full builds. Those marked with * are from the rustc-perf suite.

Benchmark Profile Scenario % Change Significance Factor?
async-std-1.10.0 check full -9.94% 49.68x
time-macros-0.2.3 check full -7.60% 38.01x
quote-stress check full -7.17% 35.84x
yansi-0.5.0 check full -5.89% 29.43x
inotify-0.10.0 check full -3.52% 17.58x
*deep-vector check full -1.90% 9.51x
*html5ever-0.26.0 check full -1.71% 8.56x
*token-stream-stress check full -1.36% 6.80x
pest_generator-2.1.3 check full -0.41% 2.05x
futures-macro-0.3.19 check full -0.37% 1.84x
num-derive-0.3.3 check full -0.35% 1.76x
libc-0.2.121 check full -0.41% 2.04x
ctor-0.1.21 check full -0.35% 1.73x
wast-39.0.0 check full -0.33% 1.64x
syn-1.0.89 check full -0.32% 1.61x
mockall_derive-0.11.0 check full -0.30% 1.52x

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Apr 4, 2022
@bors
Copy link
Contributor

bors commented Apr 4, 2022

⌛ Trying commit 0bd47e8 with merge 6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302...

@bors
Copy link
Contributor

bors commented Apr 4, 2022

☀️ Try build successful - checks-actions
Build commit: 6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302 (6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302)

@rust-timer
Copy link
Collaborator

Queued 6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302 with parent ec667fb, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302): comparison url.

Summary:

  • Primary benchmarks: mixed results
  • Secondary benchmarks: mixed results
Regressions 😿
(primary)
Regressions 😿
(secondary)
Improvements 🎉
(primary)
Improvements 🎉
(secondary)
All 😿 🎉
(primary)
count1 4 10 19 11 23
mean2 0.3% 1.0% -1.0% -2.6% -0.7%
max 0.4% 2.8% -2.0% -4.6% 0.4%

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Footnotes

  1. number of relevant changes

  2. the arithmetic mean of the percent change

@rustbot rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Apr 4, 2022
@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Apr 4, 2022

📌 Commit 0bd47e8 has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 4, 2022
@bors
Copy link
Contributor

bors commented Apr 4, 2022

⌛ Testing commit 0bd47e8 with merge 6a9080b...

@bors
Copy link
Contributor

bors commented Apr 4, 2022

☀️ Test successful - checks-actions
Approved by: petrochenkov
Pushing 6a9080b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 4, 2022
@bors bors merged commit 6a9080b into rust-lang:master Apr 4, 2022
@rustbot rustbot added this to the 1.62.0 milestone Apr 4, 2022
@nnethercote nnethercote deleted the parse_tt-new-representation branch April 4, 2022 20:51
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6a9080b): comparison url.

Summary:

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: mixed results
Regressions 😿
(primary)
Regressions 😿
(secondary)
Improvements 🎉
(primary)
Improvements 🎉
(secondary)
All 😿 🎉
(primary)
count1 3 10 18 11 21
mean2 0.4% 1.0% -1.0% -2.4% -0.8%
max 0.4% 2.8% -2.0% -4.6% -2.0%

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

Footnotes

  1. number of relevant changes

  2. the arithmetic mean of the percent change

@rustbot rustbot removed the perf-regression Performance regression. label Apr 4, 2022
@nnethercote
Copy link
Contributor Author

I looked at the regressions for wg-grammar. It's due to the conversion in compute_locs. In particular, wg-grammar has a very large machine-generated macro. It's squeezed onto a single line without any spaces, and is over 12,000 characters, with 84 rules. This is very much a worst-case scenario for compute_locs, and this comment applies.

Likewise for hyper-0.14.18, though the effect there is much smaller. There are a number of macros in that benchmark, I'm not sure which one(s) is responsible, but the Cachegrind diff shows that compute_locs accounts for much of the (very small) regression.

Overall I'm not worried because the speed benefits outweigh the regressions, but I'll still see if I can improve things some more, either by doing what the comment above suggests, or something else.

nnethercote added a commit to nnethercote/rust that referenced this pull request Apr 5, 2022
In rust-lang#95555 this was moved out of `parse_tt_inner` and `nameize` into
`compute_locs`. But the next commit will be moving `compute_locs`
outwards to a place that isn't suitable for the missing fragment
identifier checking. So this reinstates the old checking.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 6, 2022
…r-rule, r=petrochenkov

Call `compute_locs` once per rule

This fixes the small regressions on `wg-grammar` and `hyper-0.14.18` seen in rust-lang#95555.

r? `@petrochenkov`
@mark-i-m
Copy link
Member

mark-i-m commented Apr 7, 2022

This is a very nice code improvement! Thanks @nnethercote !

JohnTitor added a commit to JohnTitor/rustc-dev-guide that referenced this pull request Apr 15, 2022
JohnTitor added a commit to JohnTitor/rustc-dev-guide that referenced this pull request Apr 15, 2022
nnethercote added a commit to nnethercote/rust that referenced this pull request Apr 22, 2022
PR rust-lang#95159 unintentionally changed the behaviour of declarative macro
matching for `NoDelim` delimited sequences.
- Before rust-lang#95159, delimiters were implicit in `mbe::Delimited`. When
  doing macro matching, normal delimiters were generated out of thin air
  as necessary, but `NoDelim` delimiters were not. This was done within
  `TokenTree::{get_tt,len}`.
- After rust-lang#95159, the delimiters were explicit. There was an unintentional
  change whereby `NoDelim` delimiters were represented explicitly just
  like normal delimeters.
- rust-lang#95555 introduced a new matcher representation (`MatcherLoc`) and the
  `NoDelim` delimiters were made explicit within it, just like
  `mbe::Delimited`.
- rust-lang#95797 then changed `mbe::Delimited` back to having implicit
  delimiters, but because matching is now being done on `MatcherLoc`,
  the behavioural change persisted.

The fix is simple: remove the explicit `NoDelim` delimiters in the
`MatcherLoc` representation. This gives back the original behaviour.

As for why this took so long to be found: it seems that `NoDelim`
sequences are unusual. It took a macro that generates another macro
(from the `yarte_lexer` crate, found via a crater run) to uncover this.

Fixes rust-lang#96305.
bors added a commit to rust-lang-ci/rust that referenced this pull request May 22, 2022
Remove box syntax from rustc_mir_dataflow and rustc_mir_transform

Continuation of rust-lang#87781, inspired by rust-lang#97239. The usages that this PR removes have not appeared from nothing, instead the usage in `rustc_mir_dataflow` and `rustc_mir_transform` was from rust-lang#80522 which split up `rustc_mir`, and which was filed before I filed rust-lang#87781, so it was using the state from before my PR. But it was merged after my PR was merged, so the `box_syntax` uses were able to survive here. Outside of this introduction due to the code being outside of the master branch at the point of merging of my PR, there was only one other introduction of box syntax, in rust-lang#95159. That box syntax was removed again though in rust-lang#95555. Outside of that, `box_syntax` has not made its reoccurrance in compiler crates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants