Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start removing Nonterminal #114647

Closed
wants to merge 18 commits into from

Conversation

nnethercote
Copy link
Contributor

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 9, 2023
@nnethercote
Copy link
Contributor Author

nnethercote commented Aug 9, 2023

This is a successor to #96724, where I started doing this with NtExpr and then gave up because it was difficult. It is a necessary predecessor for fixing #67062.

The code is draft quality. So far I have removed NtVis and NtTy, both of which are easier than NtExpr. The pretty printing of vis and ty captures is now much worse. #114571 would fix that, by improving pretty printing of token streams.

It's unclear to me how to best stage this work, whether to do it in pieces, or all in a single huge PR.

cc @petrochenkov

@nnethercote nnethercote marked this pull request as draft August 9, 2023 06:03
@rust-log-analyzer

This comment has been minimized.

@petrochenkov petrochenkov self-assigned this Aug 9, 2023
@nnethercote nnethercote force-pushed the start-rm-Nonterminal branch 3 times, most recently from 86999c2 to 1034bc0 Compare August 10, 2023 07:02
@nnethercote
Copy link
Contributor Author

The latest update removes NtPat.

@nnethercote nnethercote force-pushed the start-rm-Nonterminal branch 3 times, most recently from 390563a to be30bb6 Compare August 12, 2023 22:35
@nnethercote
Copy link
Contributor Author

The latest update removes NtItem. Doing this made me realize that removing Token::Interpolated/Nonterminal will remove this restriction, because a test about that restriction started passing.

@petrochenkov
Copy link
Contributor

The PR is marked as draft, so marking is as waiting on author as well (I'll be busy for the next few days, and won't be able to review anyway).

Once this is ready I'll ask to split all the preparational commits + one of the smaller nonterminals (NtVis) to a separate PR.
@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 13, 2023
@nnethercote
Copy link
Contributor Author

#114571 should be considered ahead of this one. It improves token stream pretty printing greatly, which helps minimize changes to tests in this PR.

@nnethercote
Copy link
Contributor Author

The latest update removes NtMeta.

@nnethercote
Copy link
Contributor Author

The latest update removes NtPath.

@nnethercote
Copy link
Contributor Author

And now NtBlock.

@bors
Copy link
Contributor

bors commented Aug 15, 2023

☔ The latest upstream changes (presumably #114803) made this pull request unmergeable. Please resolve the merge conflicts.

@nnethercote
Copy link
Contributor Author

The latest updates remove NtExpr, NtLiteral, NtIdent, NtLifetime, Nonterminal, and Token::Interpolated. The commit removing NtExpr and NtLiteral is rough, with lots of njn: comments and a couple of test failures. But it all basically works.

I haven't yet added handling of invisible delimiters from proc macros, which will be required to fix #67062.

@petrochenkov: it might be worth taking a look now, to see what you think of the overall approach.

@rust-log-analyzer

This comment has been minimized.

@nnethercote
Copy link
Contributor Author

That test failure above on the bail! call is a known one from the NtExpr/NtLiteral removal.

I have opened #114915 for all the preliminary cleanup commits.

@nnethercote nnethercote mentioned this pull request Aug 16, 2023
This will make things nicer as more variants are added to
`ParseNtResult`.
Token sequences delimited with these invisible delimiters will
eventually replace `Token::Interpolated` for declarative macro
expansions.

There are no uses yet, but this commit does modify
`nonterminal_may_begin_with` so that every `Interpolated` test is now
mirrored with a `InvisibleSource::MetaVar` test. This means that `Nt*`
cases can now be removed from that function one at a time without any
other changes required.
We now use invisible delimiters for expanded `vis` fragments, instead of
`Token::Interpolated`.

This slightly changes some pretty-printed output in tests, but not in
any significant way.
Notes about tests:

- tests/ui/parser/macro/trait-object-macro-matcher.rs: the syntax error
  is duplicated, because it occurs now when parsing the decl macro
  input, and also when parsing the expanded decl macro. But this won't
  show up for normal users due to error de-duplication.

- The output of stringify! is uglier, due to `print_tts` producing
  uglier output than AST pretty-printing.
This requires adding `stream_pretty_printing_compatibility_hack`, which
will replace `nt_pretty_printing_compatibility_hack` once `NtStmt` is
removed.

Notable test changes:
- `tests/ui/macros/nonterminal-matching.rs` now passes. Removal of
  `Token::Interpolated` will remove the "captured metavariables except
  for `:tt`, `:ident` and `:lifetime` cannot be compared to other
  tokens" restriction.

- `tests/ui/proc-macro/expand-to-derive.rs`: the diff looks large but
  the only difference is the insertion of a single invisible-delimited
  group. I hope this doesn't cause any new `ProceduralMasquerade`-type
  problems.
This means `nt_pretty_printing_compatibility_hack` can also be removed.

The handling of statements in `transcribe` is slightly different to
other nonterminal kinds, due to the lack of `from_ast` implementation
for empty statements.
Note: there was an existing code path involving `Interpolated` in
`MetaItem::from_tokens` that was dead. This commit transfers that to the
new form, but puts an `unreachable!` call inside it.
[xt] move from_token

Getting a test failure here:
```
Building tool error_index_generator (stage1 -> stage2, x86_64-unknown-linux-gnu)
   Compiling cfg-if v1.0.0
    ...
   Compiling mdbook v0.4.31
error: internal compiler error: the following error was constructed but not emitted

error: unexpected token: keyword `self`
   --> /home/njn/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mdbook-0.4.31/src/book/summary.rs:275:31
    |
275 |                         bail!(self.parse_error("Suffix chapters cannot be followed by a list"));
    |                               ^^^^

thread 'rustc' panicked at compiler/rustc_errors/src/diagnostic_builder.rs:775:21:
error was constructed but not emitted
```
Remnants of `Token::Interpolated` still exist in the form of the new
`token::IdentMv` and `token::LifetimeMv` tokens. I did them like that
because there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to
work with invisible delimiters would have been a pain. (Maybe it could
be done in a follow-up.)

Fully kills off the "captured metavariables except for `:tt`, `:ident`
and `:lifetime` cannot be compared to other tokens" restriction.
@nnethercote
Copy link
Contributor Author

nnethercote commented Aug 22, 2023

The last commit partially implements removal of invisible delimiter skipping. This is enough to make the examples in #67062 work, but the problems mentioned above occur. The code in that commit is very rough.

@Dylan-DPC
Copy link
Member

@nnethercote any updates on this? this has ended up bitrotting heavily

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 18, 2024
…ors,cjgillot

macro_rules: Preserve all metavariable spans in a global side table

This PR preserves spans of `tt` metavariables used to pass tokens to declarative macros.
Such metavariable spans can then be used in span combination operations like `Span::to` to improve all kinds of diagnostics.

Spans of non-`tt` metavariables are currently kept in nonterminal tokens, but the long term plan is remove all nonterminal tokens from rustc parser and rely on the proc macro model with invisible delimiters (rust-lang#114647, rust-lang#67062).
In particular, `NtIdent` nonterminal (corresponding to `ident` metavariables) becomes easy to remove when this PR lands (rust-lang#119412 does it).

The metavariable spans are kept in a global side table keyed by `Span`s of original tokens.
The alternative to the side table is keeping them in `SpanData` instead, but the performance regressions would be large because any spans from tokens passed to declarative macros would stop being inline and would work through span interner instead, and the penalty would be paid even if we never use the metavar span for the given original span.
(But also see the comment on `fn maybe_use_metavar_location` describing the map collision issues with the side table approach.)

There are also other alternatives - keeping the metavar span in `Token` or `TokenTree`, but associating it with `Span` itsel is the most natural choice because metavar spans are used in span combining operations, and those operations are not necessarily tied to tokens.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 18, 2024
…ors,cjgillot

macro_rules: Preserve all metavariable spans in a global side table

This PR preserves spans of `tt` metavariables used to pass tokens to declarative macros.
Such metavariable spans can then be used in span combination operations like `Span::to` to improve all kinds of diagnostics.

Spans of non-`tt` metavariables are currently kept in nonterminal tokens, but the long term plan is remove all nonterminal tokens from rustc parser and rely on the proc macro model with invisible delimiters (rust-lang#114647, rust-lang#67062).
In particular, `NtIdent` nonterminal (corresponding to `ident` metavariables) becomes easy to remove when this PR lands (rust-lang#119412 does it).

The metavariable spans are kept in a global side table keyed by `Span`s of original tokens.
The alternative to the side table is keeping them in `SpanData` instead, but the performance regressions would be large because any spans from tokens passed to declarative macros would stop being inline and would work through span interner instead, and the penalty would be paid even if we never use the metavar span for the given original span.
(But also see the comment on `fn maybe_use_metavar_location` describing the map collision issues with the side table approach.)

There are also other alternatives - keeping the metavar span in `Token` or `TokenTree`, but associating it with `Span` itsel is the most natural choice because metavar spans are used in span combining operations, and those operations are not necessarily tied to tokens.
@nnethercote
Copy link
Contributor Author

There are still some good ideas in here that I'd like to pursue, so I'll keep it open.

@nnethercote
Copy link
Contributor Author

My next attempt at this is now in #124141.

nnethercote added a commit to nnethercote/rust that referenced this pull request Apr 22, 2024
The extra span is now recorded in the new `TokenKind::InterpolatedIdent` and
`TokenKind::InterpolatedLifetime`. These both consist of a single token, and so
there's no operator precedence problems with inserting them directly into the
token stream.

The other way to do this would be to wrap the ident/lifetime in invisible
delimiters, but there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to work with
invisible delimiters would have been a pain. (Maybe it could be done in a
follow-up.)

This change might not seem like much of a win, but it's a first step toward the
much bigger and long-desired removal of `Nonterminal` and
`TokenKind::Interpolated`. That change is big and complex enough that it's
worth doing this piece separately. (Indeed, this commit is based on part of a
late commit in rust-lang#114647, a prior attempt at that big and complex change.)
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 22, 2024
…=<try>

Remove `NtIdent` and `NtLifetime`

This is one part of the bigger "remove `Nonterminal` and `TokenKind::Interpolated`" change drafted in rust-lang#114647. More details in the individual commit messages.

r? `@petrochenkov`
nnethercote added a commit to nnethercote/rust that referenced this pull request May 13, 2024
The extra span is now recorded in the new `TokenKind::InterpolatedIdent` and
`TokenKind::InterpolatedLifetime`. These both consist of a single token, and so
there's no operator precedence problems with inserting them directly into the
token stream.

The other way to do this would be to wrap the ident/lifetime in invisible
delimiters, but there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to work with
invisible delimiters would have been a pain. (Maybe it could be done in a
follow-up.)

This change might not seem like much of a win, but it's a first step toward the
much bigger and long-desired removal of `Nonterminal` and
`TokenKind::Interpolated`. That change is big and complex enough that it's
worth doing this piece separately. (Indeed, this commit is based on part of a
late commit in rust-lang#114647, a prior attempt at that big and complex change.)
nnethercote added a commit to nnethercote/rust that referenced this pull request May 13, 2024
The extra span is now recorded in the new `TokenKind::NtIdent` and
`TokenKind::NtLifetime`. These both consist of a single token, and so
there's no operator precedence problems with inserting them directly
into the token stream.

The other way to do this would be to wrap the ident/lifetime in invisible
delimiters, but there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to work with
invisible delimiters would have been a pain. (Maybe it could be done in a
follow-up.)

This change might not seem like much of a win, but it's a first step toward the
much bigger and long-desired removal of `Nonterminal` and
`TokenKind::Interpolated`. That change is big and complex enough that it's
worth doing this piece separately. (Indeed, this commit is based on part of a
late commit in rust-lang#114647, a prior attempt at that big and complex change.)
nnethercote added a commit to nnethercote/rust that referenced this pull request May 13, 2024
The extra span is now recorded in the new `TokenKind::NtIdent` and
`TokenKind::NtLifetime`. These both consist of a single token, and so
there's no operator precedence problems with inserting them directly
into the token stream.

The other way to do this would be to wrap the ident/lifetime in invisible
delimiters, but there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to work with
invisible delimiters would have been a pain. (Maybe it could be done in a
follow-up.)

This change might not seem like much of a win, but it's a first step toward the
much bigger and long-desired removal of `Nonterminal` and
`TokenKind::Interpolated`. That change is big and complex enough that it's
worth doing this piece separately. (Indeed, this commit is based on part of a
late commit in rust-lang#114647, a prior attempt at that big and complex change.)
bors added a commit to rust-lang-ci/rust that referenced this pull request May 14, 2024
…=petrochenkov

Remove `NtIdent` and `NtLifetime`

This is one part of the bigger "remove `Nonterminal` and `TokenKind::Interpolated`" change drafted in rust-lang#114647. More details in the individual commit messages.

r? `@petrochenkov`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants