yarte_lexer 0.0.1 compilation regression #96305

nnethercote · 2022-04-22T03:51:42Z

@lqd pointed out a failure from a recent crater run.

Bisection identifies the 7th commit of #95159 as the problem: "Eliminate TokenTreeOrTokenTreeSlice". Interestingly, #95797 reverted many of the changes from that 7th commit, but the problem persists.

UPDATE: See this comment for the explanation.

I have a reduced test case. It doesn't fit in a single file, instead requiring two crates.

Two-crate test case

Crate y, Cargo.toml:

[package]
name = "y"
version = "0.1.0"
edition = "2018"
[dependencies]

Crate y, src/lib.rs:

#![feature(trace_macros)]
trace_macros!(true);

macro_rules! ascii_builder {
    ($($n:literal)+) => {
        #[macro_export]
        macro_rules! ascii {
            $(($n) => { $n };)+
            ($t:tt) => { compile_error!("no match") };
        }
    };
}

ascii_builder!('a' 'b' 'c');

Crate z (which sits next to crate y in the filesystem), Cargo.toml:

[package]
name = "z2"
version = "0.1.0"
edition = "2021"
[dependencies]
y = { path = "../y" }

Crate z, src/main.rs:

#![feature(trace_macros)]
trace_macros!(true);

fn main() {
    let _x = y::ascii!('a');
}

Note: The original code defines ascii and ascii_builder within the yarte_lexer crate and then uses ascii within a unit test. For my reduced test case I used two separate crates because it made it easier to use tools like cargo expand, and to avoid any issues specific to unit tests.

Output

If I compile with cargo check using a version prior to #95159, compilation succeeds with this output:

    Checking y v0.1.0 (/home/njn/dev/y)
note: trace_macro
  --> /home/njn/dev/y/src/lib.rs:14:1
   |
14 | ascii_builder!('a' 'b' 'c');
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expanding `ascii_builder! { 'a' 'b' 'c' }`
   = note: to `#[macro_export] macro_rules! ascii
           {
               ('a') => { 'a' } ; ('b') => { 'b' } ; ('c') => { 'c' } ; ($t : tt) =>
               { compile_error! ("no match") } ;
           }`

    Checking z2 v0.1.0 (/home/njn/dev/z3)
note: trace_macro
 --> src/main.rs:5:14
  |
5 |     let _x = y::ascii!('a');
  |              ^^^^^^^^^^^^^^
  |
  = note: expanding `ascii! { 'a' }`
  = note: to `'a'`

    Finished dev [unoptimized + debuginfo] target(s) in 0.21s

If I do the same with a version after #95159, compilation fails with this output:

    Checking y v0.1.0 (/home/njn/dev/y)
note: trace_macro
  --> /home/njn/dev/y/src/lib.rs:14:1
   |
14 | ascii_builder!('a' 'b' 'c');
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expanding `ascii_builder! { 'a' 'b' 'c' }`
   = note: to `#[macro_export] macro_rules! ascii
           {
               ('a') => { 'a' } ; ('b') => { 'b' } ; ('c') => { 'c' } ; ($t : tt) =>
               { compile_error! ("no match") } ;
           }`

    Checking z2 v0.1.0 (/home/njn/dev/z2)
error: no match
 --> src/main.rs:5:14
  |
5 |     let _x = y::ascii!('a');
  |              ^^^^^^^^^^^^^^
  |
  = note: this error originates in the macro `y::ascii` (in Nightly builds, run with -Z macro-backtrace for more info)

note: trace_macro
 --> src/main.rs:5:14
  |
5 |     let _x = y::ascii!('a');
  |              ^^^^^^^^^^^^^^
  |
  = note: expanding `ascii! { 'a' }`
  = note: to `compile_error! ("no match")`

error: could not compile `z2` due to previous error

The obvious difference is the compile error. Interestingly, the expansion of ascii_builder in crate y is the same in both cases. So it's not at all clear why the failure occurs. There's a rule matching 'a' in both cases.

Single file attempt

I tried and failed to reproduce this in a single file. Here's what I had:

macro_rules! ascii_builder {
    ($($n:literal)+) => {
        #[macro_export]
        macro_rules! ascii {
            $(($n) => { $n };)+
            ($t:tt) => { compile_error!("no match") };
        }
    };
}

ascii_builder!('a' 'b' 'c');

fn main() {
    let _x = ascii!('a');
}

I just get the same compile error from above, no matter which version of the compiler I use:

error: no match
  --> y.rs:9:26
   |
9  |             ($t:tt) => { compile_error!("no match") };
   |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
...
17 |     let _x = ascii!('a');
   |              ----------- in this macro invocation
   |
   = note: this error originates in the macro `ascii` (in Nightly builds, run with -Z macro-backtrace for more info)

error: aborting due to previous error

I honestly don't know why this single-file version fails but the two-crate version works, and whether that is valid or a bug. It feels like it should compile.

Metadata difference

The only other clue I have is that the metadata has changed slightly. The older (pre-regression) compiler produces this:

-rw-rw-r-- 1 njn njn 3282 Apr 22 13:39 liby-d5c1be4b71acf3c4.rmeta

The newer (post-regression) compiler produces this:

-rw-rw-r-- 1 njn njn 3278 Apr 22 13:39 liby-d5c1be4b71acf3c4.rmeta

I don't know of an easy way to compare metadata files, so I don't know what accounts for the slight size difference.

Conclusion

This is a weird one, involving a macro that generates another macro that is then used in another crate. This explains why the change in behaviour took a while to be noticed.

I have looked through the commit responsible, but I can't see any changes related to all of the above.

@petrochenkov, any ideas?

The text was updated successfully, but these errors were encountered:

nnethercote · 2022-04-22T03:59:48Z

Oh, I can probably guess what caused the metadata change: mbe::Delimited's tts field was renamed as all_tts. (It was subsequently renamed back to tts in #95797). And that type is marked with derive(Encodable, Decodable).

nnethercote · 2022-04-22T04:53:44Z

Aha! I have worked it out. The 7th commit in #95159 unintentionally changed the behaviour of NoDelim-delimited sequences. I guess these don't come up very often, but a macro that generates a macro is such a case? There is an easy fix, which I will post shortly. Though I don't know how to write a test case for it, which is unfortunate.

(For anyone wondering how I worked this out: I took the 7th commit from #95159 and combined it with the diff from #95797 using combinediff. The result was pretty small, and the removal of TokenTree::{get_tt,len} was a big chunk of it. So then I stared at them until I realized that the NoDelim case was suspicious.)

I'm still unsure why the single-file test case above doesn't work, and whether it should.

PR rust-lang#95159 unintentionally changed the behaviour of declarative macro matching for `NoDelim` delimited sequences. - Before rust-lang#95159, delimiters were implicit in `mbe::Delimited`. When doing macro matching, normal delimiters were generated out of thin air as necessary, but `NoDelim` delimiters were not. This was done within `TokenTree::{get_tt,len}`. - After rust-lang#95159, the delimiters were explicit. There was an unintentional change whereby `NoDelim` delimiters were represented explicitly just like normal delimeters. - rust-lang#95555 introduced a new matcher representation (`MatcherLoc`) and the `NoDelim` delimiters were made explicit within it, just like `mbe::Delimited`. - rust-lang#95797 then changed `mbe::Delimited` back to having implicit delimiters, but because matching is now being done on `MatcherLoc`, the behavioural change persisted. The fix is simple: remove the explicit `NoDelim` delimiters in the `MatcherLoc` representation. This gives back the original behaviour. As for why this took so long to be found: it seems that `NoDelim` sequences are unusual. It took a macro that generates another macro (from the `yarte_lexer` crate, found via a crater run) to uncover this. Fixes rust-lang#96305.

apiraino · 2022-04-22T09:35:13Z

Assigning priority as discussed in the Zulip thread of the Prioritization Working Group.

@rustbot label -I-prioritize +P-medium

petrochenkov · 2022-04-22T15:01:24Z

@nnethercote

I'm still unsure why the single-file test case above doesn't work, and whether it should.

Likely because in the single-file case literal fragments are kept as AST pieces in Nonterminal tokens, but in the cross-crate case they are converted into tokens streams including NoDelim delimiters so they can be written to crate metadata.

The cross-crate case is "normative" because it's pure token stream.
The fact that it works locally is kind of a variation of #67062.

nnethercote · 2022-05-05T23:15:17Z

@petrochenkov said the following in #96307:

Given that the macro in #96305 correctly reports an error if used locally, I'd say that

It would be better to report this case early, at macro definition site, when parsing the LHS of the macro definition.

is the right course of action here.

See Nonterminal/NoDelim when parsing a macro LHS => report an error (or a deny-by-default deprecation lint if there are too many regressions).

petrochenkov · 2022-05-06T07:39:34Z

After some thought, perhaps we should indeed land #96307 to keep the old behavior at call site, until the definition site diagnostics (#96307 (comment)) are actually implemented and turned into errors.

petrochenkov · 2022-05-06T07:40:55Z

until the definition site diagnostics (#96307 (comment)) are actually implemented and turned into errors.

This needs an open issue though, so it doesn't get lost.

nnethercote added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Apr 22, 2022

rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Apr 22, 2022

nnethercote mentioned this issue Apr 22, 2022

Fix macro matching for NoDelim delimited sequences. #96307

Closed

nnethercote changed the title ~~yarte_lexer 0.0.1 compilation regression due to #95159~~ yarte_lexer 0.0.1 compilation regression Apr 22, 2022

rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Apr 22, 2022

ascjones mentioned this issue May 16, 2022

Invisible delimiters quoted for Group #97076

Closed

JohnTitor added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. and removed regression-untriaged Untriaged performance or correctness regression. labels May 23, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

yarte_lexer 0.0.1 compilation regression #96305

yarte_lexer 0.0.1 compilation regression #96305

nnethercote commented Apr 22, 2022 •

edited

Loading

nnethercote commented Apr 22, 2022

nnethercote commented Apr 22, 2022

apiraino commented Apr 22, 2022

petrochenkov commented Apr 22, 2022

nnethercote commented May 5, 2022

petrochenkov commented May 6, 2022

petrochenkov commented May 6, 2022

yarte_lexer 0.0.1 compilation regression #96305

yarte_lexer 0.0.1 compilation regression #96305

Comments

nnethercote commented Apr 22, 2022 • edited Loading

Two-crate test case

Output

Single file attempt

Metadata difference

Conclusion

nnethercote commented Apr 22, 2022

nnethercote commented Apr 22, 2022

apiraino commented Apr 22, 2022

petrochenkov commented Apr 22, 2022

nnethercote commented May 5, 2022

petrochenkov commented May 6, 2022

petrochenkov commented May 6, 2022

nnethercote commented Apr 22, 2022 •

edited

Loading