Procedural macros #1566

Merged (3 commits, Dec 13, 2016)

Conversation

@nrc
Member

nrc commented Apr 1, 2016

This RFC proposes an evolution of Rust's procedural macro system (aka syntax
extensions, aka compiler plugins). It specifies syntax for the definition of
procedural macros, gives a high-level view of their implementation in the
compiler, and outlines how they interact with the compilation process.

At the highest level, macros are defined by implementing functions marked with
a #[macro] attribute. Macros operate on a list of tokens provided by the
compiler and return a list of tokens that the macro use is replaced by. We
provide low-level facilities for operating on these tokens. Higher level
facilities (e.g., for parsing tokens to an AST) should exist as library crates.

Which delimiter was used should be available to the macro implementation via the
`MacroContext`. I believe this is maximally flexible - the macro implementation
can throw an error if it doesn't like the delimiters used.
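As a sketch of what this passage describes, a macro implementation could inspect the delimiter and reject ones it doesn't support. Everything here (`Delimiter`, `check_delimiter`) is a hypothetical stand-in, not an API from the RFC:

```rust
// Hypothetical delimiter kinds a MacroContext might report.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Delimiter {
    Paren,   // ( )
    Bracket, // [ ]
    Brace,   // { }
}

// A macro implementation that only accepts ( ) and [ ], throwing an
// error for { }, as the RFC suggests it should be allowed to.
fn check_delimiter(used: Delimiter) -> Result<(), String> {
    match used {
        Delimiter::Paren | Delimiter::Bracket => Ok(()),
        Delimiter::Brace => Err("this macro does not accept braces".to_string()),
    }
}

fn main() {
    assert!(check_delimiter(Delimiter::Bracket).is_ok());
    assert!(check_delimiter(Delimiter::Brace).is_err());
}
```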


@oli-obk

oli-obk Apr 1, 2016

Contributor

I prefer this over hiding it from the macro implementor. I personally think that things like vec!{1, 2, 3} and vec!(5, 6, 7) should be forbidden or at least linted against (for backwards compat).


@jimmycuadra


jimmycuadra commented Apr 1, 2016

Rendered

(I am SOOOOO excited about this. :D)

@steveklabnik

Member

steveklabnik commented Apr 1, 2016

I am also psyched to see movement on this 😄

I think we could first design the syntax, interfaces, etc. and later evolve into
a process-separated model (if desired). However, if this is considered an
essential feature of macro reform, then we might want to consider the interfaces
more thoroughly with this in mind.


@sfackler

sfackler Apr 1, 2016

Member

I would very much like to move towards this kind of setup, but I think that the interface proposed here should work just fine in that world so we shouldn't necessarily block on figuring this out right now.



@eddyb

eddyb Apr 2, 2016

Member

We might have to do clever things like use shared memory alongside message-based IPC to get the most out of such a model, but I believe it would be worth it.

Dynamic linking for plugin systems works in practice but just making it safe to unload said plugins is an entire area of design Rust hasn't even touched yet (although lifetimes would play a very important role).

I would rather have an IPC solution that lets me:

  • download a single binary and have a working cross-compiler without even system dependencies (static linking against musl on linux)
  • safely unload plugins after expansion without introducing complexity into the language
  • use a forking zygote to share most of the plugins across compilations (via cargo or RLS)
    • EDIT: This technique can also be used to (relatively) cheaply erase any per-thread/process state in between expansions of the same macro, if we want to completely deny that (it would make order dependence much harder, I think) - since we control the compilation of the macro crate, we could just ban statics altogether, but that doesn't account for them messing with pthreads on their own
  • implement my own libmacro in an alternative Rust compiler
  • more exciting: implement my own libmacro for a Rust tool which is not a full compiler
Dollar, // `$`
// Not actually sure if we need this or if semicolons can be treated like
// other punctuation.
Semicolon, // `;`


@eddyb

eddyb Apr 2, 2016

Member

Can't we use Punctuation for all 3 of these?



@nrc

nrc Apr 5, 2016

Member

I'm not really sure; there does seem to be some advantage to making these special. However, I think it is probably a convenience rather than essential. I think I would start implementing and maybe change this later.


pub enum CommentKind {
    Regular,
    InnerDoc,
    OuterDoc,
}


@eddyb

eddyb Apr 2, 2016

Member

Don't we expand doc comments into attributes nowadays? I recall being able to use them with macros taking attributes.



@pczarn

pczarn Apr 4, 2016

Yes, doc comments become doc attributes in macro input.



@nrc

nrc Apr 5, 2016

Member

Yeah. It is gross and hacky and I have no idea why we even have doc comment attributes. I would like to kill them in the surface syntax, but that seems a bit unlikely given the discussion on that RFC :-(

I would prefer to have explicit doc comments. Is there a reason to treat them like attributes? (In macros, I mean; I realise that the compiler/rustdoc wants them to be attributes.)



@eddyb

eddyb Apr 5, 2016

Member

You can use them in macro_rules macros with just generic attribute pass-through.
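The pass-through eddyb mentions works because a doc comment arrives as a `#[doc = "..."]` attribute, which a generic `#[$attr:meta]` matcher captures like any other attribute. A minimal runnable sketch (the macro name is invented for illustration):

```rust
// A generic attribute matcher: doc comments are matched as
// `#[doc = "..."]` attributes and passed through unchanged.
macro_rules! item_with_attrs {
    ($(#[$attr:meta])* fn $name:ident() -> $t:ty { $e:expr }) => {
        $(#[$attr])*
        fn $name() -> $t { $e }
    };
}

item_with_attrs! {
    /// This doc comment reaches the macro as a `#[doc = "..."]` attribute.
    fn answer() -> u32 { 42 }
}

fn main() {
    assert_eq!(answer(), 42);
}
```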


Currently, procedural macros are dynamically linked with the compiler. This
prevents the compiler from being statically linked, which is sometimes desirable. An
alternative architecture would have procedural macros compiled as independent
programs and have them communicate with the compiler via IPC.


@eddyb

eddyb Apr 2, 2016

Member

I think we can start by running each plugin's registrar and all expansion in a thread dedicated to that plugin.
This will avoid accidental dependence on thread-local state (which we have some of, most notably the string interner).

If we have a &mut context, we should be able to temporarily "transfer ownership" to the expanding thread, without the context even implementing Sync, just Send.
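The ownership-transfer idea can be sketched with scoped threads (stable since Rust 1.63; `MacroCtx` is an invented stand-in for the compiler's context). Lending a `&mut` to the worker thread only requires the context to be `Send`, not `Sync`, since exactly one thread touches it at a time:

```rust
use std::thread;

// An invented context type: Send is enough, no Sync required,
// because access is exclusive (&mut) at all times.
struct MacroCtx {
    expansions: u32,
}

fn expand_on_dedicated_thread(ctx: &mut MacroCtx) {
    // Scoped threads let us lend `&mut MacroCtx` to the worker; the
    // borrow ends when the scope joins, so exclusive access "returns"
    // to the caller afterwards.
    thread::scope(|s| {
        s.spawn(|| {
            ctx.expansions += 1; // exclusive access inside the thread
        });
    });
}

fn main() {
    let mut ctx = MacroCtx { expansions: 0 };
    expand_on_dedicated_thread(&mut ctx);
    expand_on_dedicated_thread(&mut ctx);
    assert_eq!(ctx.expansions, 2);
}
```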



@DemiMarie

DemiMarie Jun 7, 2016

Another advantage of the IPC solution (which OCaml has chosen for its own syntax extensions) is that crashes in the plugin don't crash the compiler. Furthermore, should it become desirable, the plugin could be run in a process with reduced OS-level privileges.


Both procedural macros and constant evaluation are mechanisms for running Rust
code at compile time. Currently, and under the proposed design, they are
considered completely separate features. There might be some benefit in letting
them interact.


@eddyb

eddyb Apr 2, 2016

Member

The main problem with mixing macros and constants is that if you somehow run impure procedural macro code for each monomorphization of a generic type, you generally cannot enforce determinism so safe code can break coherence and possibly even type safety (not that unsafe should let you do such things either).

OTOH, I would welcome a TokenStream -> Result<TokenStream, ...> method on MacroContext for compiling just a constant and outputting its evaluated form (if available - ADT "trees" with primitive literals as leaves should work), but I don't have a specific usecase in mind right now.



@9il

9il Jan 6, 2018

Hi, I am looking for an analog of D's static if in Rust. It is needed to implement fast, full-featured multidimensional arrays; Rust's ndarray cargo package is slower than D's ndslice. Rust is going to have constant template parameters. Mixing macros and constants would allow implementing a macro like static_if!(ctfe_cond, codeBlock1, codeBlock2), if I am not wrong, and it would make Rust the most generic system programming language ever.


pub struct TokenTree {
    pub kind: TokenKind,
    pub span: Span,
    pub hygiene: HygieneObject,
}


@eddyb

eddyb Apr 2, 2016

Member

Might want to consider combining the span and the hygiene information - couldn't the expansion traces be deduced from hygiene scopes?



@Zoxc

Zoxc Apr 2, 2016

How will we create TokenTrees?



@eddyb

eddyb Apr 2, 2016

Member

I'd start with better quasi-quoting and see what else we need from there.
One worry I have about Sequence is that it might be expensive to convert into the optimized representation.
Given that the average sequence has 8 tokens, it would be a waste to keep an actual tree in memory, when a lot of cases could fit in the same space the pointers take right now.



@nrc

nrc Apr 5, 2016

Member

Hmm, it might be a good idea to combine Spans and HygieneObject here. They have different roles, but are kind of two sides of the same coin. Spans are informative and transparent; HygieneObjects are meant to be normative and opaque. I can imagine that macro authors might want to change them independently, but the common case definitely would be to operate on both at the same time.



@nrc

nrc Apr 5, 2016

Member

I'll cover creating token trees in an upcoming RFC



@eddyb

eddyb Apr 6, 2016

Member

How does Origin sound as a name for the combined span & hygiene info?


pub struct TokenStream(Vec<TokenTree>);
// A borrowed TokenStream
pub struct TokenSlice<'a>(&'a [TokenTree]);


@eddyb

eddyb Apr 2, 2016

Member

These two types might be better served by Cow<'a, [TokenTree]> - or if an optimized representation such as "reference-counted ropes of slices" is chosen, no lifetime might be needed at all.

That said, having a lifetime in both the stream type and the context type expands the design space significantly, and the following is probably really close to optimal:

pub struct TokenStream<'a>(Vec<&'a [CompactToken]>);
pub struct MacroContext<'a> {
    tokens: &'a TypedArena<Vec<CompactToken>>,
    ...
}

We must be careful to only provide APIs which can be implemented efficiently regardless of representation, such as forward/reverse iterators (but not random access iterators).


@Zoxc

Zoxc commented Apr 2, 2016

I am missing Space, Tab and Newline from the tokens. Without them, it's hard to tell the difference between + = and +=. They would also allow you to embed languages which are newline sensitive, like assembly.

@thepowersgang

Contributor

thepowersgang commented Apr 2, 2016

They're not needed; += should be a distinct token from + =

@eddyb

Member

eddyb commented Apr 2, 2016

@Zoxc spaces are intentionally omitted; they complicate everything. We skip those tokens in the lexer (Reader::real_token) in the current parser.

@eddyb

Member

eddyb commented Apr 3, 2016

@nrc I've been talking to @mrmonday about his pnet crate, which gives you the tools to define arbitrary packets - network is the main target, but I suspect IPC or serialization would also be valid usecases; I hear Dropbox is using it for something.

From what I understand, it would be really helpful if, given an arbitrary type path (e.g. the type of a field), a procedural macro could query the MacroContext and get some information about the type definition (a struct or enum, AFAICT).

My reading of #1560 suggests this should be possible, and if nothing else, having a macro next to the type definition would also work, i.e. given field: path::to::Type, pnet could query path::to::Type__pnet_info (previously generated by pnet) or at least defer to it for decisions.

@Zoxc

Zoxc commented Apr 4, 2016

Are you supposed to be able to generate Spans dynamically for tokens you generate? For example, include! should make a new Span for the tokens in the external file. That means there is at least hope of inferring whitespace/newlines from Spans. It would make the fields of Comment and String seem redundant, since those could easily be inferred from the Span.

I wonder if it would be a good idea to include an Unknown token, which could contain an AST or newer tokens not supported by the current version of libmacro. Inline assembly seems to be a good candidate to put inside such a token, since asm! cannot currently return tokens.

@eddyb


Member

eddyb commented Apr 4, 2016

@Zoxc Well, one solution to make matches over tokens, from stable crates, non-exhaustive is to have a dummy variant which is unstable - InlineAsm is another good example of an unstable variant.
Not doing anything like this would force libmacro into a fixed set of tokens which cannot be extended.
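A rough illustration of the dummy-variant trick eddyb describes. The types are invented for illustration; in today's Rust the same effect is more idiomatically achieved by marking the enum `#[non_exhaustive]`:

```rust
// An illustrative token enum. The extra variant would be hidden/unstable
// in a real library, forcing stable crates to write a `_` arm so new
// variants (e.g. InlineAsm) can be added without breaking matches.
#[derive(Debug)]
enum TokenKind {
    Word(String),
    Punctuation(char),
    // Stand-in for an unstable/hidden dummy variant.
    __Nonexhaustive,
}

fn describe(tok: &TokenKind) -> &'static str {
    match tok {
        TokenKind::Word(_) => "word",
        TokenKind::Punctuation(_) => "punctuation",
        // The wildcard arm keeps this match valid if new kinds appear.
        _ => "unknown",
    }
}

fn main() {
    assert_eq!(describe(&TokenKind::Word("foo".into())), "word");
    assert_eq!(describe(&TokenKind::__Nonexhaustive), "unknown");
}
```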

It would be nice to allow procedural macros to be defined in the crate in which
they are used, as well as in separate crates (mentioned above). This complicates
things since it breaks the invariant that a crate is designed to be used at
either compile-time or runtime. I leave it for the future.


@pczarn

pczarn Apr 4, 2016

N.B. This would allow an implementation of macro_rules! in terms of procedural macro generation, assuming that procedural macros could be defined in macro expansions.



@eddyb

eddyb Apr 4, 2016

Member

That wouldn't really be allowed, I wouldn't think. It's hard enough to make it work before expansion, during expansion it would be a nightmare.
Unless we had "macro modules" which were split off as an entire crate, disregarding anything else in the current crate (i.e. they would be the root of their own crate). But that might get a bit confusing.



@Ericson2314

Ericson2314 Apr 5, 2016

Contributor

On a different note, this is weird wrt cross compiling. It would be annoying to not be able to cross-compile a crate because the macros defined in it use some (imported) items that can't be built on the host platform.


@nrc

Member

nrc commented Apr 5, 2016

@Zoxc whitespace is elided from the tokens. Since whitespace is the only thing elided, it should always be possible to recreate it from the spans.
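A toy sketch of how spans recover elided whitespace: with spans as byte ranges into the source, adjacency tells you whether whitespace separated two tokens, so `+=` and `+ =` stay distinguishable. The `Span` type here is illustrative, not the compiler's:

```rust
// A span as (start, end) byte offsets into the source text.
#[derive(Clone, Copy)]
struct Span {
    start: usize,
    end: usize,
}

// If one token ends exactly where the next begins, no whitespace
// was elided between them.
fn adjacent(a: Span, b: Span) -> bool {
    a.end == b.start
}

fn main() {
    // Source: "a += b; c + = d;"
    //          0123456789...
    // Spans for `+` and `=` in `a += b`: bytes 2..3 and 3..4.
    let plus = Span { start: 2, end: 3 };
    let eq = Span { start: 3, end: 4 };
    assert!(adjacent(plus, eq)); // no gap: this was `+=`

    // Spans for `+` and `=` in `c + = d`: bytes 10..11 and 12..13.
    let plus2 = Span { start: 10, end: 11 };
    let eq2 = Span { start: 12, end: 13 };
    assert!(!adjacent(plus2, eq2)); // one-byte gap: whitespace was present
}
```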

@eddyb

Member

eddyb commented Apr 5, 2016

@Zoxc @nrc Whitespace and comments: I just realized the example tokens in this RFC include comments, which might be an annoyance; you'll want to ignore them if you want to ignore whitespace. Unless they are doc comments, but in that case see the discussion above about doc comments and attributes.

@Zoxc


Zoxc commented Apr 5, 2016

I would like a simpler API which deals with spans, expansions and hygiene objects. This can be used to build the higher level API proposed in this RFC. Here is the API I would like:

// A contiguous string containing tokens with source location information.
pub struct Span(..);

// This represents a macro expansion. It is passed to the #[macro] function.
pub struct Expansion {
    // `Span` is the source of this macro, including the `macro!()` syntax
    pub span: Span,
    // A map of positions in the span onto macro expansions. macro!(a!() b!()) would have 2 entries here.
    pub macros: HashMap<usize, Expansion>,
    // A map of positions in the span onto hygiene objects.
    pub hygiene: HashMap<usize, HygieneObject>,
}

impl Span {
    pub fn as_str<'s>(&'s self, context: &Context) -> &'s str { .. }
    pub fn new(context: &Context, str: &str) -> Span { .. }
}

pub struct Context(..);

The macros and hygiene fields should not be exposed, and should instead be accessed by methods somehow, to allow for more efficient representations. There should also be a way to edit Spans which preserves location information.

This requires all macros to return things that can be represented by spans, expansions and hygiene objects. So asm! will have to move to an intrinsic for this to work.

The benefits of this API:

  • It's very light on allocations. It can reuse input source code, so it can be very space efficient. Using identity macros (without macros or hygiene objects inside) would require no allocations.
  • It perfectly represents whitespace, newlines, comments and all future tokens, unlike the proposed API.
  • It still gives macros which only want to deal with valid Rust tokens (no whitespace, comments or newlines) a way to do that, via some API on top which deals with Tokens.
@eddyb

Member

eddyb commented Apr 5, 2016

@Zoxc First impression: not bad. Definitely doesn't interfere directly with anything I can think of (it's a bit like the lower-level internal codemap API).
However, an efficient token representation would have to be part of libmacro (as opposed to being implemented in an entirely stable crate) to be able to avoid repeated lexing between the compiler and procedural macros.

@plietar

plietar commented Apr 6, 2016

Great to see this moving forward. Two comments:

  • custom deriving isn't mentioned. This is quite a big use case for compiler plugins for eg serde. Passing in the struct/enum definition into the macro as tokens would feel clunky. It would probably be better to pass the list of fields/variants, but this may limit compatibility.

  • some macros embed rust syntax inside their own custom syntax (thinking of json_macros or my own protobuf-macros which parse rust expressions only to include them as is in the generated ast).

    If they do their own parsing, then when a new expression syntax comes along (say the new ? suffix operator for example) and they aren't updated to take it into account, then the new syntax may not be used inside the macro. However the macro doesn't care about the syntax used, it just wants an expression.

    The compiler could provide some parsing as a service for plugins, either generating an opaque expression AST that plugins may include within the tokens (feels clumsy), or simply validate the token tree on behalf of the plugin, which outputs the original tokens.

    Alternatively, the plugins can just include the token tree in the output without making sure it is an actual expression. This however breaks the "It is the procedural macro's responsibility to ensure that the tokens parse without error" rule stated by the RFC, although it is really the user's fault.

    In any case, I think the RFC should mention this in some way.

Member

nrc commented Apr 8, 2016

@eddyb - I can imagine use cases where a macro might want the comments (although rare). Certainly I've written tools where having the comments would have been useful. I hope that we can include the comments, but present an API which effectively filters them out. Strawman: TokenStream could have an iter method which ignores comments and iter_all (or iter_with_comments or something) which iterates over all tokens including comments.

I hope that the most popular mode for operation will be passing the token stream straight to a parsing library so that library can then ignore the comments unless necessary and the macro author never sees them.

The second argument is the tokens for the AST node the attribute is placed on.
Note that in order to compute the tokens to pass here, the compiler must be able
to parse the code the attribute is applied to. However, the AST for the node
passed to the macro is discarded, it is not passed to the macro nor used by the

@pnkfelix

pnkfelix Oct 13, 2016

Member

this phrasing cannot be correct as stated: "the AST for the node passed to the macro [...] is not passed to the macro" ...

  • It may seem like I am acting dense, but no, I am honestly confused by the construction of the sentence.

Q: Is the idea here just that whatever be the structure passed as the second argument to a proc_macro_attribute procedure, nothing that procedure does can affect the AST that second argument was generated from?

  • it's hard to see how this distinction matters, since the AST is replaced by the one generated from parsing the returned tokens, no?
// Word is defined by Unicode Standard Annex 31 -
// [Unicode Identifier and Pattern Syntax](http://unicode.org/reports/tr31/)
Word(Symbol),
Punctuation(char),

@pnkfelix

pnkfelix Oct 13, 2016

Member

You gave an example above of the token stream ['a', 'b', 'c', ';']. My first impression upon reading that is that all four of the listed tokens belong to the same category, but then I saw this Word/Punctuation distinction in the variants here.

Is ';' considered Punctuation? (Edit: oh, answer is, it's listed above as its own variant, Semicolon.)

  • What about ,?
  • What about +?
  • What about && or -> or <-?
  • Edit: ah, down below I see that you have pointed out multi-char operators as an open question. So I guess all of these cases are considered Punctuation, and it's up to the parser to take a sequence [&, &] and turn it into &&? Sounds painful for macro authors (indeed, that pain seems to also be discussed below...)
Member

pnkfelix commented Oct 13, 2016

@rfcbot concern multi-char-operators

The RFC lists "Punctuation(char) and multi-char operators" as an open question. I think I would want to see a particular solution to this question settled before we accept. (Some were listed but I cannot infer which would actually be implemented.)

position.
I had hoped to represent each character as a separate token. However, to make
pattern matching backwards compatible, we would need to combine some tokens. In

@nikomatsakis

nikomatsakis Oct 19, 2016

Contributor

Can someone elaborate on this? I am confused. What does it mean to be "completely backwards compatible" -- backwards compatible with what?

@eddyb

eddyb Oct 19, 2016

Member

Something like able to tell apart << and < <, presumably? Consider a << b > ::c vs a < < b > ::c.

Char(char),
// These tokens are treated specially since they are used for macro
// expansion or delimiting items.

@nikomatsakis

nikomatsakis Oct 19, 2016

Contributor

Can someone elaborate a bit here too -- why do these tokens need to be treated specially? It seems like some particular procedural macro might treat them separately, but why are we involved here? Maybe I'm missing some context.

@eddyb

eddyb Oct 19, 2016

Member

They shouldn't be, we just had a bad history of treating $foo (or worse, $bar:ty) as a single token.
cc @jseyfried (who I believe has removed at least some of those cases)

@jseyfried

jseyfried Oct 19, 2016

We only treat $foo and $bar:ty as a single token when parsing with a non-zero quote_depth, which (afaik) happens only in macro_rules's tt fragment parser and in the quasi-quoter. In particular, today's procedural macros see $foo as two tokens.
(btw, I agree that these tokens shouldn't be treated specially)

Contributor

nikomatsakis commented Oct 19, 2016

On the topic of multi-character operators, a trick I used in my LR(1) Rust grammar (that I stole from somewhere else, the Java grammar maybe?) seems relevant. In that case, I changed it so that the character < generates one of two distinct tokens depending on what comes next. So e.g. if you have a < b that is these tokens: ID <[] ID whereas a << b is these tokens: ID <[<] <[] ID. Note that < is either <[<] (meaning: a < followed by another < without whitespace) or <[] (meaning: a < followed by whitespace or some other thing).

One could imagine having PunctuationAlone(char) and PunctuationJoint(char), where the latter means "a punctuation character with another punctuation character coming immediately afterwards". So << would be PunctuationJoint('<') PunctuationAlone('<'). Probably with better names.

This trick basically has the best of both worlds from what I can tell.

Contributor

nikomatsakis commented Oct 19, 2016

Procedural macros which take an identifier before the argument list (e.g, foo! bar(...)) will not be supported (at least initially).

Strictly for the record, I personally think these can be a really useful form, but I'm happy to leave them out.

My main use case is something like neon, which uses (or plans to use) macros to declare JavaScript classes. It's just nicer to write:

class! Foo {
}

and less great to write:

neon! {
    class Foo {
    }
}

But I know there are all kinds of niggly details (do you accept class! Foo: Bar { ... }? etc) and leaving it out to start seems fine.

Member

eddyb commented Oct 19, 2016

So << would be PunctuationJoint('<') PunctuationAlone('<').

Not quite (see below). What about Op(Joint | Alone, char)?

  • eat(<): Op(_, '<')
  • eat(<<): Op(Joint, '<') Op(_, '<') (could be <<<T>::CONST)

EDIT: I like Op more, it's shorter than Punctuation. Any better suggestions?
Also, who else wishes they could write a pattern once and then reuse it?

pattern Op2(a: char, b: char) = [Op(Joint, a), Op(_, b)];
pattern Shl = Op2('<', '<');
pattern Shr = Op2('>', '>');

Add slice-like pattern-matching on random-access iterators and you have yourself one of the nicest setups for writing parsers, ever, in a systems programming language, maybe nicer than some FP ones.

EDIT2: Ahhhh I want this so badly (and this doesn't even use VG/const generics):

pattern BinOpChar = '+' | '-' | '*' | '/' | '%' | '^' | '|' | '&';
pattern Op2(a: char, b: char) = [Op(Joint, a), Op(_, b)];
pattern Op3(a: char, b: char, c: char) = [Op(Joint, a), Op(Joint, b), Op(_, c)];
pattern Op1Assign([a]: [char; 1]) = Op2(a, '=');
pattern Op2Assign([a, b]: [char; 2]) = Op3(a, b, '=');
pattern OpAssign(op: [char]) = Op2Assign(op @ ['<', '<']) |
                               Op2Assign(op @ ['>', '>']) |
                               Op1Assign(op @ [BinOpChar]);
// ...
// Random example of "lowering":
match tokens {
    [x @ Ident(_), ..OpAssign(op), ..] => [x, Op(Alone, '='), x, op],
    // ...
}
Contributor

Ericson2314 commented Oct 19, 2016

I am concerned about the lack of an analog to Racket's (import (for-syntax ..)), but this also applies to macros 1.1 so probably best to discuss it in that tracking issue, and update this post-merge based on the resolution there.

Contributor

nikomatsakis commented Oct 31, 2016

@pnkfelix would you consider this approach to adjacent operators (or, more specifically, @eddyb's single-variant formulation here) to resolve your "multi-char-operators" concern?

@nrc, maybe edit RFC to include it?

Zoxc commented Oct 31, 2016

@nikomatsakis What about DSLs which want custom operators like <=> without allowing <= > too?

Member

eddyb commented Oct 31, 2016

@Zoxc Read my description (which @nikomatsakis linked above), you can check for chaining indefinitely, I have an example for compound assignment such as >>=, which can be distinguished from >> =.

Member

pnkfelix commented Nov 2, 2016

@nrc @nikomatsakis @eddyb I think the options you have outlined would address my concern, (which was mostly about the fact that the known problem was unaddressed; I did not mean to imply the problem was unsolvable.)

Member

pnkfelix commented Nov 3, 2016

@rfcbot resolved multi-char-operators

Member

nrc commented Nov 28, 2016

ping for approval - @aturon @withoutboats #1566 (comment)

I plan to update the RFC with the recent discussion about multi-char-operators.

Member

aturon commented Nov 29, 2016

@nrc I've re-read the RFC and the thread. Like most everyone else, I'm broadly in favor of the overall direction here (operating on token streams, easing the declaration system). My sense is that there's a lot of room for iteration on the details, but I'm taking this RFC as largely about setting the overall direction of exploration. In particular, I imagine that libproc_macro itself is going to take significant iteration that will feed back into the design elements here.

I'm also in agreement with the various points of scope cutting (e.g., making proc_macro work at crate granularity, leaving out foo! bar (..) macros). There's going to be a lot of work to put this new system together, so anywhere we can punt extensions to the future, we should.

👍 from me!

Contributor

withoutboats commented Nov 30, 2016

👍 from me. I don't think this is impacted by the conversation on #1584, at least not in any way that should block the RFC from being accepted.

rfcbot commented Nov 30, 2016

🔔 This is now entering its final comment period, as per the review above. 🔔

@aturon aturon referenced this pull request Dec 13, 2016

Open

Tracking issue for RFC 1566: Procedural macros #38356

24 of 31 tasks complete

@aturon aturon merged commit 1c2a50d into rust-lang:master Dec 13, 2016

Member

aturon commented Dec 13, 2016

The RFC bot has gotten stuck, but almost two weeks have elapsed since FCP, and positive consensus around this feature remains. I'm merging the RFC! Thanks @nrc!

Tracking issue

two kinds exist today, and other than naming (see
[RFC 1561](https://github.com/rust-lang/rfcs/pull/1561)) the syntax for using
these macros remains unchanged. If the macro is called `foo`, then a function-
like macro is used with syntax `foo!(...)`, and an attribute-like macro with

@est31

est31 Dec 14, 2016

Contributor

So this means you can't call procedural macros with [] syntax anymore, like vec! for example does?

@Connicpu

Connicpu Dec 21, 2016

No, it's talking about function-like (foo!(...), foo![...], foo!{...}) vs attribute macros (#[my_macro] ...)

bors added a commit to rust-lang/rust that referenced this pull request Jan 17, 2017

Auto merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`

This implements `#[proc_macro_attribute]` as described in rust-lang/rfcs#1566

The following major (hopefully non-breaking) changes are included:

* Refactor `proc_macro::TokenStream` to use `syntax::tokenstream::TokenStream`.
    * `proc_macro::tokenstream::TokenStream` no longer emits newlines between items, this can be trivially restored if desired
    * `proc_macro::TokenStream::from_str` does not try to parse an item anymore, moved to `impl MultiItemModifier for CustomDerive` with more informative error message

* Implement `#[proc_macro_attribute]`, which expects functions of the kind `fn(TokenStream, TokenStream) -> TokenStream`
    * Reactivated `#![feature(proc_macro)]` and gated `#[proc_macro_attribute]` under it
    * `#![feature(proc_macro)]` and `#![feature(custom_attribute)]` are mutually exclusive
    * adding `#![feature(proc_macro)]` makes the expansion pass assume that any attributes that are not built-in, or introduced by existing syntax extensions, are proc-macro attributes

* Fix `feature_gate::find_lang_feature_issue()` to not use `unwrap()`

    * This change wasn't necessary for this PR, but it helped debugging a problem where I was using the wrong feature string.

* Move "completed feature gate checking" pass to after "name resolution" pass

    * This was necessary for proper feature-gating of `#[proc_macro_attribute]` invocations when the `proc_macro` feature flag isn't set.

Prototype/Litmus Test: [Implementation](https://github.com/abonander/anterofit/blob/proc_macro/service-attr/src/lib.rs#L13) -- [Usage](https://github.com/abonander/anterofit/blob/proc_macro/service-attr/examples/post_service.rs#L35)

bors added a commit to rust-lang/rust that referenced this pull request Jan 18, 2017

Auto merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jan 19, 2017

Rollup merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`

bors added a commit to rust-lang/rust that referenced this pull request Jan 20, 2017

Auto merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`

bors added a commit to rust-lang/rust that referenced this pull request Jan 20, 2017

Auto merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jan 20, 2017

Rollup merge of #38842 - abonander:proc_macro_attribute, r=jseyfried
Implement `#[proc_macro_attribute]`


@chriskrycho referenced this pull request Mar 29, 2017

Closed: Document all features #9 (18 of 48 tasks complete)

bors added a commit to rust-lang/rust that referenced this pull request May 16, 2018

Auto merge of #50473 - petrochenkov:pmapi, r=alexcrichton
Review proc macro API 1.2

cc #38356

Summary of applied changes:
- Documentation for proc macro API 1.2 is expanded.
- Renamed APIs: `Term` -> `Ident`, `TokenTree::Term` -> `TokenTree::Ident`, `Op` -> `Punct`, `TokenTree::Op` -> `TokenTree::Punct`, `Op::op` -> `Punct::as_char`.
- Removed APIs: `Ident::as_str`, use `Display` impl for `Ident` instead.
- New APIs (not stabilized in 1.2): `Ident::new_raw` for creating a raw identifier (I'm not sure `new_x` is a very idiomatic name, though).
- Runtime changes:
    - `Punct::new` now ensures that the input `char` is a valid punctuation character in Rust.
    - `Ident::new` ensures that the input `str` is a valid identifier in Rust.
    - Lifetimes in proc macros are now represented as two joint tokens - `Punct('\'', Spacing::Joint)` and `Ident("lifetime_name_without_quote")` similarly to multi-character operators.
- Stabilized APIs: None yet.

A bit of motivation for renaming (although it was already stated in the review comments):
- With my compiler frontend glasses on, `Ident` is the single most appropriate name for this thing, *especially* if we are doing input validation on construction. `TokenTree::Ident` effectively wraps `token::Ident` or `ast::Ident + is_raw`; its meaning is "identifier" and it's already named `ident` in declarative macros.
- Regarding `Punct`, the motivation is that `Op` is actively misleading. The thing doesn't mean an operator: it's neither a subset of operators (there is non-operator punctuation in the language) nor a superset (operators can be multi-character while this thing is always a single character). So I named it `Punct` (first proposed in [the original RFC](rust-lang/rfcs#1566), then [by @SimonSapin](#38356 (comment))); together with input validation it's now a subset of the ASCII punctuation character category (`u8::is_ascii_punctuation`).
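As a runnable illustration of that validation category, the sketch below uses std's `char::is_ascii_punctuation` directly rather than `Punct::new` itself, since the `proc_macro` API can only be called during macro expansion; the exact check inside `Punct::new` may differ in detail.

```rust
// Characters a validated `Punct` would accept fall within ASCII punctuation;
// identifier characters, digits, and whitespace do not.
fn main() {
    for c in ['+', ';', '\'', 'a', '1', ' '].iter() {
        println!("{:?} is ASCII punctuation: {}", c, c.is_ascii_punctuation());
    }
}
```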
@Ekleog

Ekleog commented Aug 14, 2018

So, I've been looking a bit, and can't figure out: is there a way to call a function in the crate defining the procedural macro as part of the output of said procedural macro? The equivalent of $crate.

@eddyb

Member

eddyb commented Aug 16, 2018

@Ekleog No, the proc macro crate "lives in a different world" than the code it's used in. Imagine cross-compilation: the proc macro is compiled for, and runs on, the host, whereas the crate using it is compiled for the cross-compilation target.

What people do is have two crates, like serde_derive and serde, and the former generates code using the latter.
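The two-crate pattern eddyb describes can be sketched as a pair of Cargo manifests. This is a config sketch with hypothetical crate names (`mylib`/`mylib_derive`), not a real project layout:

```toml
# mylib_derive/Cargo.toml -- the macro crate; built for and run on the host
# at compile time, so it cannot export runtime functions to target code.
[package]
name = "mylib_derive"
version = "0.1.0"

[lib]
proc-macro = true

# mylib/Cargo.toml (separate manifest) -- the runtime crate users depend on.
# The macro crate emits tokens with absolute paths such as `::mylib::helper()`,
# which resolve in the user's crate because the user depends on `mylib`.
# [package]
# name = "mylib"
# version = "0.1.0"
```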

@Ekleog

Ekleog commented Aug 17, 2018

Oh sorry, I must have forgotten I had posted here when I searched a few days ago for whether I had already asked this question! The discussion appears to have happened on rust-lang/rust#38356 (comment), sorry for the noise.

Basically, my issue with your answer is that there is no way for the proc macro to generate code that refers to crates other than the one it's being expanded into, an issue serde also has open.
