Incorrect and inconsistent jointness of tokens in desugared doc comment #49596
Comments
@dtolnay I think I know what's happening here in terms of where the extra jointness comes from. I was unable to get a minimal reproduction, but I'll see what I can do about the spurious `Joint`.
The bug is not related to the PR; that's just the first time the error was reported. I can reproduce the failed assertion on master, but there is a description of what happens in the PR. Basically, there is a two-line doc comment, but only the first line is processed.
Yeah, I couldn't reproduce this in a straightforward minimized macro either, but checking out the structopt PR at commit 5000840d1a6b79cf704caa927b4f5b15d29978c9 does reproduce it. Their problem is here + here, where a `Joint` spacing shows up.
Ok, thanks for the info! I've confirmed that this is fixed by #49597, where I jiggered things around a bit.
Repro script:

```sh
#!/bin/sh

cargo new --lib structopt_derive
cargo new --lib repro

echo >structopt_derive/src/lib.rs '
#![feature(proc_macro)]
extern crate proc_macro;

use proc_macro::{TokenStream, TokenTreeIter, TokenTree, TokenNode, Spacing};

#[proc_macro_derive(StructOpt)]
pub fn derive_structopt(input: TokenStream) -> TokenStream {
    let mut iter = input.into_iter();
    assert_eq!(iter.next().unwrap().to_string(), "struct");
    assert_eq!(iter.next().unwrap().to_string(), "S");

    let mut inner = unwrap_group(iter.next().unwrap());

    assert_eq!(inner.next().unwrap().to_string(), "#");
    let mut x = unwrap_group(inner.next().unwrap());
    assert_eq!(x.next().unwrap().to_string(), "doc");
    println!("{:?} {}", unwrap_spacing(x.next().unwrap()), x.next().unwrap());

    assert_eq!(inner.next().unwrap().to_string(), "#");
    let mut y = unwrap_group(inner.next().unwrap());
    assert_eq!(y.next().unwrap().to_string(), "doc");
    println!("{:?} {}", unwrap_spacing(y.next().unwrap()), y.next().unwrap());

    TokenStream::empty()
}

fn unwrap_group(tt: TokenTree) -> TokenTreeIter {
    match tt.kind {
        TokenNode::Group(_, s) => s.into_iter(),
        _ => unimplemented!(),
    }
}

fn unwrap_spacing(tt: TokenTree) -> Spacing {
    match tt.kind {
        TokenNode::Op(_, s) => s,
        _ => unimplemented!(),
    }
}
'

echo >>structopt_derive/Cargo.toml '
[lib]
proc-macro = true
'

echo >repro/src/lib.rs '
#![allow(dead_code)]

#[macro_use]
extern crate structopt_derive;

fn f() {
    #[derive(StructOpt)]
    struct S {
        /// X
        /// Y
        #[doc(hidden)]
        foo: bool,
    }
}
'

echo >>repro/Cargo.toml '
structopt_derive = { path = "../structopt_derive" }
'

cargo build --manifest-path repro/Cargo.toml
```

Output
Interestingly, in the script, if you move the struct out of the function, the output changes.
@alexcrichton when you say "jiggered things around a bit", is that of the intentional sort or the accidentally-fixed-it sort? It would be good to understand why the token stream was different depending on whether the macro is invoked inside a function or outside; it's not clear from your PR what might have fixed that.
@dtolnay sure yeah, worth documenting! So I'm not really sure why, but unfortunately I fat-fingered this a bit: the doc comment expansion code also uses the same construction. In the PR I made, both are explicitly changed to the same spacing.
This is really what I was interested in. I filed #49604 to follow up. Thanks!
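The invariant under discussion (only an `Op` immediately followed by another `Op` should be `Joint`) can be sketched as a standalone check. This is an illustrative model only; the `Tok` type and `jointness_ok` helper are made up for this sketch and are not real `proc_macro` items:

```rust
// Simplified model of token spacing, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Spacing {
    Alone,
    Joint,
}

#[derive(Debug, Clone)]
enum Tok {
    Op(char, Spacing),
    Other(&'static str), // stand-in for idents, literals, groups, ...
}

/// Check the jointness invariant: a token may be `Joint` only if it is
/// an `Op` and the *next* token is also an `Op`.
fn jointness_ok(tokens: &[Tok]) -> bool {
    tokens.iter().enumerate().all(|(i, t)| match t {
        Tok::Op(_, Spacing::Joint) => matches!(tokens.get(i + 1), Some(Tok::Op(..))),
        _ => true,
    })
}

fn main() {
    // `+=`: the `+` is Joint (glued to `=`), the `=` is Alone.
    let plus_eq = [Tok::Op('+', Spacing::Joint), Tok::Op('=', Spacing::Alone)];
    assert!(jointness_ok(&plus_eq));

    // A `#` marked Joint but followed by a bracketed group violates the
    // invariant -- the situation reported in this issue.
    let bad = [Tok::Op('#', Spacing::Joint), Tok::Other("[doc = \"...\"]")];
    assert!(!jointness_ok(&bad));
}
```

Under this rule, both desugared doc comments in the repro should report `Alone` for the `#`, which is why the observed `Joint` on one of them looks like a bug.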
…chenkov

proc_macro: Reorganize public API

This commit is a reorganization of the `proc_macro` crate's public user-facing API. This is the result of a number of discussions at the recent Rust All-Hands, where we're hoping to get the `proc_macro` crate into ship shape for stabilization of a subset of its functionality in the Rust 2018 release.

The reorganization here is motivated by experiences from the `proc-macro2`, `quote`, and `syn` crates on crates.io (and other crates which depend on them). The main focus is future flexibility, along with making a few more operations consistent and/or fixing bugs. A summary of the changes made from today's `proc_macro` API is:

* The `TokenNode` enum has been removed, and the public fields of `TokenTree` have also been removed. Instead the `TokenTree` type is now a public enum (what `TokenNode` was), and each variant is an opaque struct which internally contains `Span` information. This makes the various tokens a bit more consistent, requires fewer wrappers, and otherwise provides good future compatibility, as opaque structs are easy to modify later on.

* `Literal` integer constructors have been expanded to be unambiguous as to what they're doing and also to allow for more future flexibility. Previously, constructors like `Literal::float` and `Literal::integer` were used to create unsuffixed literals, and the concrete methods like `Literal::i32` would create a suffixed token. This wasn't immediately clear to all users (the suffixed/unsuffixed aspect), and having *one* constructor for unsuffixed literals required us to pick a largest type, which may not always be appropriate. To fix these issues, all constructors are now of the form `Literal::i32_unsuffixed` or `Literal::i32_suffixed` (for all integral types). This should allow future compatibility as well as making it immediately clear what's suffixed and what isn't.

* Each variant of `TokenTree` internally contains a `Span` which can also be configured via `set_span`. For example, `Literal` and `Term` now both internally contain a `Span` rather than having it stored in an auxiliary location.

* Constructors of all tokens are called `new` now (aka `Term::intern` is gone), and most do not take spans. Manufactured tokens typically don't have a fresh span to go with them, and the span is purely used for error reporting, **except** the span for `Term`, which currently affects hygiene. The default span for all these constructed tokens is `Span::call_site()` for now. The `Term` type's constructor explicitly requires passing in a `Span` to provide future-proofing against possible hygiene changes. It's intended that a first pass of stabilization will likely only stabilize `Span::call_site()`, which is an explicit opt-in for "I would like no hygiene here please". The intention here is to make this explicit in procedural macros, to be forwards-compatible with a hygiene-specifying solution.

* Some of the conversions for `TokenStream` have been simplified a little.

* The `TokenTreeIter` iterator was renamed to `token_stream::IntoIter`.

Overall, the hope is that this is the "final pass" at the API of `TokenStream` and most of `TokenTree` before stabilization. Explicitly left out here are any changes to `Span`'s API, which will likely need to be re-evaluated before stabilization.

All changes in this PR have already been reflected in the [`proc-macro2`], `quote`, and `syn` crates. New versions of all these crates have also been published to crates.io. Once this lands in nightly, I plan on making an internals post again summarizing the changes made here and also calling on all macro authors to give the APIs a spin and see how they work. Hopefully, pending no major issues, we can then have an FCP to stabilize later this cycle!

[`proc-macro2`]: https://docs.rs/proc-macro2/0.3.1/proc_macro2/

Closes rust-lang#49596
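The shape of the reorganized API described above can be sketched in plain Rust. This is a toy model, not the actual `proc_macro` source; the struct fields and string representations here are assumptions made for illustration:

```rust
#![allow(dead_code)]

// Toy stand-in for the opaque proc_macro::Span.
#[derive(Debug, Clone, Copy)]
struct Span;

// TokenTree is now a public enum (what TokenNode was); each variant is
// an opaque struct that carries its own Span.
#[derive(Debug, Clone)]
enum TokenTree {
    Term(Term),
    Literal(Literal),
}

#[derive(Debug, Clone)]
struct Term {
    sym: String,
    span: Span,
}

impl Term {
    // Constructors are called `new`; Term explicitly requires a Span
    // because its span currently affects hygiene.
    fn new(sym: &str, span: Span) -> Term {
        Term { sym: sym.to_string(), span }
    }
}

#[derive(Debug, Clone)]
struct Literal {
    text: String,
    span: Span,
}

impl Literal {
    // Suffixedness is spelled out in the constructor name.
    fn i32_suffixed(n: i32) -> Literal {
        Literal { text: format!("{}i32", n), span: Span }
    }
    fn i32_unsuffixed(n: i32) -> Literal {
        Literal { text: n.to_string(), span: Span }
    }
}

fn main() {
    // Span here plays the role of Span::call_site() in the real API.
    let term = Term::new("doc", Span);
    assert_eq!(term.sym, "doc");
    assert_eq!(Literal::i32_suffixed(1).text, "1i32");
    assert_eq!(Literal::i32_unsuffixed(1).text, "1");
}
```

The design point is that opaque variant structs (rather than public fields plus a separate `TokenNode` kind enum) leave room to add data, such as spans, to individual token kinds without breaking users.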
I have not minimized this yet, but in TeXitoi/structopt#88 we are seeing inexplicable behavior when iterating over the tokens of a struct field's doc comment. Related to #49545, so mentioning @alexcrichton.
One of their test cases contains the following struct.
Within their macro implementation, we are seeing the desugared doc comment of

/// Fooify a bar

having an `Alone` spacing, while the desugared

/// and a baz

has a `Joint` spacing. I believe the `Joint` is incorrect, because only an `Op` followed by another `Op` should be able to have `Joint` spacing. I stuck the following loop at the top of their derive entry point:
and it indicates that the first doc comment has `kind: Tree` while the second has `kind: JointTree`.
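For context, the derive never sees the `///` syntax: each doc comment line is desugared into an outer `#[doc = "..."]` attribute before the macro runs, which is why the repro above finds a `#` token followed by a `doc` group. A minimal string-level sketch of that desugaring (the helper name is made up; the real compiler works on token trees, not strings):

```rust
// Sketch of doc-comment desugaring, for illustration only.
// `/// text` becomes the outer attribute `#[doc = " text"]`.
fn desugar_doc_comment(line: &str) -> Option<String> {
    let text = line.strip_prefix("///")?;
    Some(format!("#[doc = \"{}\"]", text))
}

fn main() {
    assert_eq!(
        desugar_doc_comment("/// Fooify a bar").as_deref(),
        Some("#[doc = \" Fooify a bar\"]")
    );
    assert_eq!(
        desugar_doc_comment("/// and a baz").as_deref(),
        Some("#[doc = \" and a baz\"]")
    );
    // In the desugared form the `#` is an `Op` followed by a bracketed
    // group, not by another `Op`, so its spacing should be `Alone` for
    // every doc comment; the inconsistency between the two lines is the
    // bug reported here.
}
```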