Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic iterating over proc_macro::TokenStream #55640

Closed
dtolnay opened this issue Nov 3, 2018 · 14 comments
Closed

Panic iterating over proc_macro::TokenStream #55640

dtolnay opened this issue Nov 3, 2018 · 14 comments
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)

Comments

@dtolnay
Copy link
Member

dtolnay commented Nov 3, 2018

In the following proc macro invocation the compiler passes in the input $crate as a TokenStream but panics when iterating over that TokenStream. Specifically the panic happens in next().

rustc 1.31.0-nightly (de9666f 2018-10-31)

repro/src/lib.rs

extern crate proc_macro;
use self::proc_macro::TokenStream;

#[proc_macro]
pub fn repro(input: TokenStream) -> TokenStream {
    input.into_iter().next();
    unimplemented!()
}

testing/src/lib.rs

macro_rules! m {
    () => {
        repro::repro!($crate);
    };
}

m!();
error: proc macro panicked                                                                                                     
 --> src/main.rs:3:9                                                                                                           
  |                                                                                                                            
3 |         repro::repro!($crate);                                                                                             
  |         ^^^^^^^^^^^^^^^^^^^^^^                                                                                             
...                                                                                                                            
7 | m!();                                                                                                                      
  | ----- in this macro invocation                                                                                             
  |                                                                                                                            
  = help: message: `"$crate"` is not a valid identifier

Originally reported in dtolnay/paste#3. Mentioning @softprops.

@dtolnay dtolnay added the A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) label Nov 3, 2018
@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

cc @petrochenkov Looks like $crate is a single ident so it need splitting.

@dtolnay
Copy link
Member Author

dtolnay commented Nov 4, 2018

Don't split it, allow $crate as a single ident. Macro_rules does this correctly.

macro_rules! is_ident {
    ($i:ident) => {};
}

macro_rules! check {
    () => {
        is_ident!($crate);
    };
}

check!();

Otherwise all proc macro parsing code will need separate cases for many places that an ident is parsed or stored to also accept $ crate.

@petrochenkov
Copy link
Contributor

petrochenkov commented Nov 4, 2018

This is a tricky case, yes.
$crate produced by declarative macros is accepted by the ident matcher (including in path segments), so it should somehow act as a single identifier at least for compatibility.

@petrochenkov
Copy link
Contributor

Derive macro infra currently does a horrible thing called eliminate_crate_var replacing the $crate segment either with a crate name or with nothing (depending on the context) before passing tokens to the macro.
Fn-like and attribute macros don't do that (thankfully).

@dtolnay
How derive macros treat the is_ident!($crate) case when the $crate is turned into nothing?
Does it happen in practice?

@dtolnay
Copy link
Member Author

dtolnay commented Nov 4, 2018

I believe we don't currently have any code in any proc macro library specifically for accommodating $crate.

@petrochenkov
Copy link
Contributor

There's one thing I'm sure in - $crate after processing by the decl-macro it belongs to is no longer two tokens, it's one token - reserved/keyword identifier - in all respects.
It's very similar to crate, but slightly different in terms of resolution (Note: I don't think crate with tweaked span/hygiene can emulate $crate, unfortunately).

If we introduced a new keyword, say dollar_crate, with the same meaning, we could easily replace the "processed $crate" with it.
The obvious solution is to extend the lexical grammar of identifiers, relax validation in Ident::new, and treat "$crate" itself as such keyword.
It just doesn't feel right though, e.g. one issue with it is that such "keyword" can't survive stringification and reparsing.

@dtolnay
Copy link
Member Author

dtolnay commented Nov 4, 2018

I would be fine with accepting "$crate" in Ident::new. Lots of things don't survive stringification and reparsing so that shouldn't be a requirement.

@petrochenkov
Copy link
Contributor

Some status update: the ICE no longer happens because #49219 removed all the validation for tokens coming to proc macros from the compiler, but the underlying issue is still there.

@eddyb
Copy link
Member

eddyb commented Dec 8, 2018

@petrochenkov Wait there was validation? Where exactly, I might've missed it?

@petrochenkov
Copy link
Contributor

petrochenkov commented Dec 8, 2018

Ident::new(...)/Punct::new(...) instead of unchecked Ident { /* fields */ }/Punct { /* fields */ } like it's done now in proc_macro_server.rs / fn from_internal.

@petrochenkov
Copy link
Contributor

Input from the compiler is kind of controlled, so validation there is not so important as for the user-facing proc_macro::Ident::new, but it still was there exactly to catch unexpected cases / bugs like this issue.

@eddyb
Copy link
Member

eddyb commented Dec 8, 2018

Before:

            Ident(ident, false) => tt!(self::Ident::new(&ident.as_str(), Span(span))),
            Ident(ident, true) => tt!(self::Ident::new_raw(&ident.as_str(), Span(span))),

After:

            Ident(ident, is_raw) => tt!(Ident {
                sym: ident.name,
                is_raw
            }),

Just looking at this, without remembering exactly what happened, I suspect a rebase killed this accidentally. I don't remember noticing reinterning was happening.

I guess my version also preserves gensyms when it shouldn't?

@petrochenkov
Copy link
Contributor

I guess my version also preserves gensyms ...

Yes.

... when it shouldn't?

I'm not sure what should happen here.
(I'm only sure that we need to get rid of gensyms in general.)

@petrochenkov
Copy link
Contributor

Fixed in #56647

bors added a commit that referenced this issue Dec 20, 2018
Rework treatment of `$crate` in procedural macros

Important clarification: `$crate` below means "processed `$crate`" or "output `$crate`". In the input of a decl macro `$crate` is just two separate tokens, but in the *output of a decl macro* `$crate` is a single keyword identifier (#55640 (comment)).

First of all, this PR removes the `eliminate_crate_var` hack.
`$crate::foo` is no longer replaced with `::foo` or `::crate_name::foo` in the input of derive proc macros, it's passed to the macro instead with its precise span and hygiene data, and can be treated as any other path segment keyword (like `crate` or `self`) after that. (Note: `eliminate_crate_var` was never used for non-derive proc macros.)

This creates an annoying problem - derive macros still may stringify their input before processing and expect `$crate` survive that stringification and refer to the same crate (the Rust 1.15-1.29 way of doing things).
Moreover, the input of proc macro attributes and derives (but not fn-like proc macros) also effectively survives stringification before being passed to the macro (also for legacy implementation reasons).

So we kind of resurrect the `eliminate_crate_var` hack in reduced form, but apply it only to AST pretty-printing.
If an AST fragment is pretty-printed, the resulting *text* will have `$crate` replaced with `crate` or `::crate_name`. This should be enough to keep all the legacy cases working.

Closes #55640
Closes #56622
r? @ghost
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..)
Projects
None yet
Development

No branches or pull requests

3 participants