Procedural macros #1566

Merged
merged 3 commits into from
Dec 13, 2016

nrc
Member

@nrc nrc commented Apr 1, 2016

This RFC proposes an evolution of Rust's procedural macro system (aka syntax
extensions, aka compiler plugins). This RFC specifies syntax for the definition
of procedural macros, a high-level view of their implementation in the compiler,
and outlines how they interact with the compilation process.

At the highest level, macros are defined by implementing functions marked with
a `#[macro]` attribute. Macros operate on a list of tokens provided by the
compiler and return a list of tokens that the macro use is replaced by. We
provide low-level facilities for operating on these tokens. Higher level
facilities (e.g., for parsing tokens to an AST) should exist as library crates.

When a `#[cfg(macro)]` crate is `extern crate`ed, its items (even public ones)
are not available to the importing crate; only macros declared in that crate.
The crate is dynamically linked with the compiler at compile-time, rather
than with the importing crate at runtime.
Contributor

This should come hand in hand with a warning/lint about adding public items in a `#[cfg(macro)]` crate

@jimmycuadra

Rendered

(I am SOOOOO excited about this. :D)

## Tokens

Procedural macros will primarily operate on tokens. There are two main benefits
to this principal: flexibility and future proofing. By operating on tokens, code
Member

nit: principle

@steveklabnik
Member

I am also psyched to see movement on this 😄

@aturon aturon added T-lang Relevant to the language team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. labels Apr 1, 2016

```
#[macro]
pub fn foo(TokenStream, &mut MacroContext) -> TokenStream;
```
Contributor

Alternative: use the macro keyword (it was reserved in 1.0). It would make it feel more part of the language, in my opinion.

pub macro foo(TokenStream, &mut MacroContext) -> TokenStream;

What happens to conflicting names of function/attribute-like macros?

#[macro]
pub fn foo(TokenStream, &mut MacroContext) -> TokenStream;
#[macro_attribute]
pub fn foo(Option<TokenStream>, TokenStream, &mut MacroContext) -> TokenStream;

One possibility is allowing the above, and another is providing access to all data through MacroContext:

#[macro]
#[macro_attribute]
pub fn foo(context: &mut MacroContext) -> TokenStream {
    let tokens = context.token_stream();
    if let Attribute(params) = context.macro_kind() {
        // ...
    } else {
        // ...
    }
}

Member Author

Using macro is appealing, but I'm not sure what we would do for attribute-like macros or macros with an extra ident. Should we depend just on the signature? That feels a bit fragile to me, but might work.

@Zoxc

Zoxc commented Apr 2, 2016

I am missing `Space`, `Tab` and `Newline` from the tokens. Without them, it's hard to tell the difference between `+ =` and `+=`. They would also allow you to embed languages which are newline sensitive, like assembly.

@thepowersgang
Contributor

They're not needed - `+=` should be a distinct token from `+ =`

@eddyb
Member

eddyb commented Apr 2, 2016

@Zoxc spaces are intentionally omitted, they complicate everything. We skip those tokens in the lexer (Reader::real_token) in the current parser.
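
A rough sketch of the two points above, assuming a toy lexer: the `Tok` enum and the `lex` function are made up for illustration and are not rustc's lexer or the RFC's token type. It drops whitespace while lexing and greedily matches `+=`, so `+=` and `+ =` still produce different token sequences without any whitespace tokens.

```rust
/// Toy token type for illustration only (not the compiler's token type).
#[derive(Debug, PartialEq)]
enum Tok {
    Plus,
    Eq,
    PlusEq,
    Ident(String),
}

/// Lex `src`, skipping whitespace and matching `+=` as a single token.
fn lex(src: &str) -> Vec<Tok> {
    let mut chars = src.chars().peekable();
    let mut out = Vec::new();
    while let Some(c) = chars.next() {
        match c {
            // Whitespace is consumed here and never becomes a token.
            c if c.is_whitespace() => {}
            '+' => {
                // Only a `=` that directly follows `+` forms `+=`.
                if chars.peek() == Some(&'=') {
                    chars.next();
                    out.push(Tok::PlusEq);
                } else {
                    out.push(Tok::Plus);
                }
            }
            '=' => out.push(Tok::Eq),
            c if c.is_alphabetic() || c == '_' => {
                // Collect the rest of the identifier.
                let mut name = c.to_string();
                while let Some(&n) = chars.peek() {
                    if n.is_alphanumeric() || n == '_' {
                        name.push(n);
                        chars.next();
                    } else {
                        break;
                    }
                }
                out.push(Tok::Ident(name));
            }
            other => panic!("unexpected character {:?}", other),
        }
    }
    out
}
```

With this sketch, `lex("a += b")` yields three tokens with `PlusEq` in the middle, while `lex("a + = b")` yields four, with `Plus` followed by `Eq`. Newline-sensitive embedded languages, the other use case raised above, are not covered by this approach.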

@eddyb
Member

eddyb commented Apr 3, 2016

@nrc I've been talking to @mrmonday about his pnet crate, which gives you the tools to define arbitrary packets - network is the main target, but I suspect IPC or serialization would also be valid use cases; I hear Dropbox is using it for something.

From what I understand, it would be real helpful if given an arbitrary type path (e.g. the type of a field), a procedural macro could query the MacroContext and get some information about the type definition (a struct or enum, AFAICT).

My reading of #1560 suggests this should be possible, and if nothing else, having a macro next to the type definition would also work, i.e. given field: path::to::Type, pnet could query path::to::Type__pnet_info (previously generated by pnet) or at least defer to it for decisions.

@Zoxc

Zoxc commented Apr 4, 2016

Are you supposed to be able to generate Spans dynamically for tokens you generate? For example, include! should make a new Span for the tokens in the external file. That means there is at least hope of inferring whitespace/newlines from Spans. It would make the fields of Comment and String seem redundant, since those could easily be inferred from the Span.

I wonder if it would be a good idea to include an Unknown token, which could contain an AST or newer tokens not supported by the current version of libmacro. Inline assembly seems to be a good candidate to put inside such a token, since asm! cannot currently return tokens.

@eddyb
Member

eddyb commented Apr 4, 2016

@Zoxc Well, one solution to make matches over tokens, from stable crates, non-exhaustive is to have a dummy variant which is unstable - InlineAsm is another good example of an unstable variant.
Not doing anything like this would force libmacro into a fixed set of tokens which cannot be extended.
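
To make the dummy-variant idea concrete, here is a minimal sketch. The `Token` enum, its variant names and the `describe` function are hypothetical and only stand in for whatever libmacro would expose. The placeholder variant plays the role of the unstable variant: if downstream code cannot name it, a `match` needs a wildcard arm, and new variants can be added later without breaking that code.

```rust
/// Hypothetical token enum; the variant names are illustrative only.
#[derive(Debug)]
pub enum Token {
    Ident(String),
    Literal(String),
    /// Placeholder for a variant that stable code is not meant to name
    /// (in the real proposal this would be an unstable variant).
    #[doc(hidden)]
    __Unstable,
}

/// What a downstream match looks like: the wildcard arm covers the
/// placeholder and any variant added in a later version.
pub fn describe(tok: &Token) -> &'static str {
    match tok {
        Token::Ident(_) => "identifier",
        Token::Literal(_) => "literal",
        _ => "a token kind this crate does not know about",
    }
}
```

In this plain sketch nothing stops code from naming `Token::__Unstable`; the point of making the variant unstable is that stable crates would get a compile error if they tried. Later versions of Rust also gained a `#[non_exhaustive]` attribute that expresses the same intent directly.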

@pnkfelix
Member

pnkfelix commented Nov 3, 2016

@rfcbot resolved multi-char-operators

@nrc
Member Author

nrc commented Nov 28, 2016

ping for approval - @aturon @withoutboats #1566 (comment)

I plan to update the RFC with the recent discussion about multi-char-operators.

@aturon
Member

aturon commented Nov 29, 2016

@nrc I've re-read the RFC and the thread. Like most everyone else, I'm broadly in favor of the overall direction here (operating on token streams, easing the declaration system). My sense is that there's a lot of room for iteration on the details, but I'm taking this RFC as largely about setting the overall direction of exploration. In particular, I imagine that libproc_macro itself is going to take significant iteration that will feed back into the design elements here.

I'm also in agreement with the various points of scope cutting (e.g., making proc_macro work at crate granularity, leaving out foo! bar (..) macros). There's going to be a lot of work to put this new system together, so anywhere we can punt extensions to the future, we should.

👍 from me!

@withoutboats
Contributor

👍 from me. I don't think this is impacted by the conversation on #1584, at least not in any way that should block the RFC from being accepted.

@rfcbot
Collaborator

rfcbot commented Nov 30, 2016

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. label Nov 30, 2016
@aturon aturon merged commit 1c2a50d into rust-lang:master Dec 13, 2016
@aturon
Member

aturon commented Dec 13, 2016

The RFC bot has gotten stuck, but almost two weeks have elapsed since FCP, and positive consensus around this feature remains. I'm merging the RFC! Thanks @nrc!

Tracking issue

two kinds exist today, and other than naming (see
[RFC 1561](https://github.com/rust-lang/rfcs/pull/1561)) the syntax for using
these macros remains unchanged. If the macro is called `foo`, then a function-
like macro is used with syntax `foo!(...)`, and an attribute-like macro with
Member

So this means you can't call procedural macros with [] syntax anymore, like vec! for example does?

No, it's talking about function-like (foo!(...), foo![...], foo!{...}) vs attribute macros (#[my_macro] ...)
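
A small sketch of the function-like half of that distinction. It uses a declarative `macro_rules!` macro as a stand-in, since a procedural macro has to live in its own crate; the `sum` macro and the `sums` function are invented for the example. The invocation syntax is the same either way, and all three delimiters are accepted.

```rust
// Declarative stand-in for a function-like macro (illustrative only).
macro_rules! sum {
    ($($x:expr),*) => { 0 $(+ $x)* };
}

/// Invoke the same macro with each of the three delimiter styles.
fn sums() -> (i32, i32, i32) {
    let a = sum!(1, 2, 3);
    let b = sum![1, 2, 3];
    let c = sum!{1, 2, 3};
    (a, b, c)
}
```

All three calls expand identically, which is why `vec!` can conventionally be written with `[]`. The attribute form (`#[my_macro] ...`) cannot be defined with `macro_rules!`, so it is not shown here.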

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jan 20, 2017
…seyfried

Implement `#[proc_macro_attribute]`

This implements `#[proc_macro_attribute]` as described in rust-lang/rfcs#1566

The following major (hopefully non-breaking) changes are included:

* Refactor `proc_macro::TokenStream` to use `syntax::tokenstream::TokenStream`.
    * `proc_macro::tokenstream::TokenStream` no longer emits newlines between items, this can be trivially restored if desired
    * `proc_macro::TokenStream::from_str` does not try to parse an item anymore, moved to `impl MultiItemModifier for CustomDerive` with more informative error message

* Implement `#[proc_macro_attribute]`, which expects functions of the kind `fn(TokenStream, TokenStream) -> TokenStream`
    * Reactivated `#![feature(proc_macro)]` and gated `#[proc_macro_attribute]` under it
    * `#![feature(proc_macro)]` and `#![feature(custom_attribute)]` are mutually exclusive
    * adding `#![feature(proc_macro)]` makes the expansion pass assume that any attributes that are not built-in, or introduced by existing syntax extensions, are proc-macro attributes

* Fix `feature_gate::find_lang_feature_issue()` to not use `unwrap()`

    * This change wasn't necessary for this PR, but it helped debugging a problem where I was using the wrong feature string.

* Move "completed feature gate checking" pass to after "name resolution" pass

    * This was necessary for proper feature-gating of `#[proc_macro_attribute]` invocations when the `proc_macro` feature flag isn't set.

Prototype/Litmus Test: [Implementation](https://github.com/abonander/anterofit/blob/proc_macro/service-attr/src/lib.rs#L13) -- [Usage](https://github.com/abonander/anterofit/blob/proc_macro/service-attr/examples/post_service.rs#L35)
bors added a commit to rust-lang/rust that referenced this pull request May 16, 2018
Review proc macro API 1.2

cc #38356

Summary of applied changes:
- Documentation for proc macro API 1.2 is expanded.
- Renamed APIs: `Term` -> `Ident`, `TokenTree::Term` -> `TokenTree::Ident`, `Op` -> `Punct`, `TokenTree::Op` -> `TokenTree::Punct`, `Op::op` -> `Punct::as_char`.
- Removed APIs: `Ident::as_str`, use `Display` impl for `Ident` instead.
- New APIs (not stabilized in 1.2): `Ident::new_raw` for creating a raw identifier (I'm not sure `new_x` is a very idiomatic name though).
- Runtime changes:
    - `Punct::new` now ensures that the input `char` is a valid punctuation character in Rust.
    - `Ident::new` ensures that the input `str` is a valid identifier in Rust.
    - Lifetimes in proc macros are now represented as two joint tokens - `Punct('\'', Spacing::Joint)` and `Ident("lifetime_name_without_quote")` similarly to multi-character operators.
- Stabilized APIs: None yet.

A bit of motivation for renaming (although it was already stated in the review comments):
- With my compiler frontend glasses on `Ident` is the single most appropriate name for this thing, *especially* if we are doing input validation on construction. `TokenTree::Ident` effectively wraps `token::Ident` or `ast::Ident + is_raw`, its meaning is "identifier" and it's already named `ident` in declarative macros.
- Regarding `Punct`, the motivation is that `Op` is actively misleading. The thing doesn't mean an operator, it's neither a subset of operators (there is non-operator punctuation in the language), nor a superset (operators can be multicharacter while this thing is always a single character). So I named it `Punct` (first proposed in [the original RFC](rust-lang/rfcs#1566), then [by @SimonSapin](#38356 (comment))); together with input validation it's now a subset of ASCII punctuation character category (`u8::is_ascii_punctuation`).
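
The "two joint tokens" representation from the runtime changes above can be sketched with simplified stand-in types. The `Spacing` and `TokenTree` definitions and the `lifetime` and `operator` helpers below only mirror the names in the commit message; they are not the real `proc_macro` API. The `operator` helper also assumes the operator is followed by something other than punctuation, so its last character is marked `Alone`.

```rust
/// Stand-ins mirroring the names in the commit message; NOT the real
/// `proc_macro` types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Spacing {
    /// The next token is not glued to this punctuation character.
    Alone,
    /// The next token follows immediately and belongs with this one.
    Joint,
}

#[derive(Debug, Clone, PartialEq)]
enum TokenTree {
    Punct(char, Spacing),
    Ident(String),
}

/// Encode a lifetime such as `'a` as a joint quote plus an identifier.
fn lifetime(name: &str) -> Vec<TokenTree> {
    vec![
        TokenTree::Punct('\'', Spacing::Joint),
        TokenTree::Ident(name.to_string()),
    ]
}

/// Encode a multi-character operator as single-character puncts, where
/// every character except the last is marked `Joint`.
fn operator(op: &str) -> Vec<TokenTree> {
    let n = op.chars().count();
    op.chars()
        .enumerate()
        .map(|(i, c)| {
            let spacing = if i + 1 < n { Spacing::Joint } else { Spacing::Alone };
            TokenTree::Punct(c, spacing)
        })
        .collect()
}
```

Under these assumptions `operator("+=")` gives a joint `+` followed by a lone `=`, whereas `+ =` written with a space would be two lone puncts. That keeps each `Punct` a single character while still letting a consumer tell the two spellings apart.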
@Ekleog

Ekleog commented Aug 14, 2018

So, I've been looking a bit, and can't figure out: is there a way to call a function in the crate defining the procedural macro as part of the output of said procedural macro? The equivalent of $crate.

@eddyb
Member

eddyb commented Aug 16, 2018

@Ekleog No, the proc macro crate "lives in a different world" than the code it's used in. Imagine cross-compilation: the proc macro is compiled for, and runs on, the host, whereas the crate using it is compiled for the cross-compilation target.

What people do is have two crates, like serde_derive and serde, and the former generates code using the latter.

@Ekleog

Ekleog commented Aug 17, 2018

Oh sorry, I must have forgotten I had posted here when I searched a few days ago for whether I had already asked this question! The discussion appears to have happened on rust-lang/rust#38356 (comment), sorry for the noise.

Basically, my issue with your answer is that there is no way for the proc macro to generate code that refers to crates other than the one it's being expanded into, which serde also has an open issue about.

@Centril Centril added A-macros Macro related proposals and issues A-proc-macros Proc macro related proposals & ideas A-attributes Proposals relating to attributes labels Nov 23, 2018