RFC: Procedural macros 1.1 #1681

Merged: 8 commits, Aug 22, 2016

@alexcrichton
Member

Extract a very small sliver of today's procedural macro system in the compiler,
just enough to get basic features like custom derive working, to have an
eventually stable API. Ensure that these features will not pose a maintenance
burden on the compiler but also don't try to provide enough features for the
"perfect macro system" at the same time. Overall, this should be considered an
incremental step towards an official "plugins 2.0".

Rendered

@alexcrichton alexcrichton RFC: Macros 1.1 (commit e2981fc)
@alexcrichton alexcrichton self-assigned this Jul 18, 2016
@aturon
Contributor
aturon commented Jul 18, 2016

Suggestion: this might want to be titled Plugins 1.1, as it doesn't really impact the "macro" system (if you think of that as macro_rules, which I think many do).

@eddyb
Member
eddyb commented Jul 18, 2016

@aturon I believe this is because @nrc wanted to avoid "plugins" for "procedural macros".

@alexcrichton
Member

@aturon yeah it's true that this only really affects what we call plugins today, but I believe that @nrc wanted to start avoiding the term "plugin" as everything will eventually be a macro in the "macros 2.0" world.

@eddyb eddyb commented on the diff Jul 18, 2016
text/0000-macros-1.1.md
+especially that both the compiler and the macro dynamically link to libraries
+like libsyntax, cannot be changed. This precludes, for example, a completely
+statically linked compiler (e.g. compiled for `x86_64-unknown-linux-musl`).
+Another goal of this RFC will also be to hide as many of these technical
+details as possible, allowing the compiler to flexibly change how it interfaces
+to macros.
+
+## Macros 1.1
+
+Ok, with the background knowledge of what procedural macros are today, let's
+take a look at how we can solve the major problems blocking its stabilization:
+
+* Sharing an API of the entire compiler
+* Frozen interface between the compiler and macros
+
+### `librustc_macro`
@eddyb
eddyb Jul 18, 2016 Member

This name is unfortunate, couldn't we use libmacro?

@alexcrichton
alexcrichton Jul 18, 2016 Member

This is currently done because macro is a keyword, and this doesn't propose changing that just yet. I was thinking that we could perhaps wholesale rename the crate one day as well if macro becomes an available name.

@MovingtoMars
MovingtoMars Jul 19, 2016 edited

How about libmacros? This follows the approach that Go takes, i.e., the strings package.

@sfackler sfackler commented on the diff Jul 18, 2016
text/0000-macros-1.1.md
+If a macro cannot process the input token stream, it is expected to panic for
+now, although eventually it will call methods on `Context` to provide more
+structured errors. The compiler will wrap up the panic message and display it
+to the user appropriately. Eventually, however, `librustc_macro` will provide
+more interesting methods of signaling errors to users.
+
+Customization of user-defined `#[derive]` modes can still be done through custom
+attributes, although it will be required for `rustc_macro_derive`
+implementations to remove these attributes when handing them back to the
+compiler. The compiler will still gate unknown attributes by default.
+
+### `rustc-macro` crates
+
+Like the rlib and dylib crate types, the `rustc-macro` crate
+type is intended to be an intermediate product. What it *actually* produces is
+not specified, but if a `-L` path is provided to it then the compiler will
@sfackler
sfackler Jul 18, 2016 Member

This feels like a bit of an abuse of "link" if we switch plugins to IPC executables - maybe a separate flag would be a good idea?

@alexcrichton
alexcrichton Jul 18, 2016 Member

I believe the intention is that --extern works to load macros (it's what Cargo will pass), so this was just a natural extension of that. We could perhaps add something like -L macro=..., but at this point it doesn't seem strictly necessary as the compiler can always distinguish whether an artifact is a macro or an rlib.

@eddyb
eddyb Jul 18, 2016 Member

I believe -L for rustc means "library search path" not "link search path".

@alexcrichton alexcrichton and 3 others commented on an outdated diff Jul 18, 2016
text/0000-macros-1.1.md
+
+pub struct TokenStream {
+ // ...
+}
+
+#[derive(Debug)]
+pub struct LexError {
+ // ...
+}
+
+pub struct Context {
+ // ...
+}
+
+impl TokenStream {
+ pub fn from_source(cx: &mut Context,
@alexcrichton
alexcrichton Jul 18, 2016 Member

@nrc could you comment on the Context argument here? I believe both @eddyb and @cgswords think it may not be necessary, in which case the pointer could be dropped everywhere and these methods could just be fmt::Display and FromStr (standard traits)

@nrc
nrc Jul 18, 2016 Contributor

I agree it may not be necessary. To be clear, you must have some handle to the compiler, let's call that MacroContext; then either you can pass that around in the various functions, or you can leave it in TLS. The former is less ergonomic, but the latter might have problems with multi-threaded macros and is arguably not as good engineering practice. I'm undecided as to which way we should swing in terms of the cost/benefit, but it would make a big difference to any libmacro APIs (methods vs free functions + the argument itself).

@eddyb
eddyb Jul 18, 2016 edited Member

Keep in mind that if you do interning with integer indices you can't even implement fmt::Debug on Token without putting something in TLS. And there is no meaningful difference between passing &mut MacroContext around and having it in TLS, except for needing to set up TLS in each thread, as opposed to passing the context to one child thread, if needed.

If any TLS is used (as is now needed to impl Debug for Token), then the threading problems are unavoidable and having the context usable cross-thread would let users write broken code.

@cgswords
cgswords Jul 18, 2016

In the discussion about parallel expansion, there needs to be a principled way to combine contexts / TLS: if a multi-threaded macro manages to manipulate its MacroContext in each of the threads (e.g. by interning things), combining these contexts in a principled way (e.g. unifying the interners and re-numbering some of the macro output) is basically the only way to play nice anyway. To this end, TLS (or, more accurately, atomically-accessed globals) will end up being the more ergonomic option.
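
The interning point above can be illustrated with a small sketch: keeping the interner in TLS lets API functions (and trait impls like `fmt::Debug`) work without threading a `&mut MacroContext` argument through every call. All names here are illustrative, not the RFC's API.

```rust
use std::cell::RefCell;

// A toy string interner living in thread-local storage, standing in for
// the compiler context being discussed. Real macro state would be richer.
thread_local! {
    static INTERNER: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Intern a string, returning a stable integer index for this thread.
fn intern(s: &str) -> usize {
    INTERNER.with(|i| {
        let mut i = i.borrow_mut();
        if let Some(pos) = i.iter().position(|x| x.as_str() == s) {
            pos
        } else {
            i.push(s.to_string());
            i.len() - 1
        }
    })
}

// Resolve an index back to its string - this is what a `Debug` impl on an
// interned `Token` would need, and why it requires TLS (or a context arg).
fn resolve(idx: usize) -> String {
    INTERNER.with(|i| i.borrow()[idx].clone())
}

fn main() {
    let a = intern("foo");
    let b = intern("bar");
    assert_eq!(a, intern("foo")); // same string, same index
    assert_ne!(a, b);
    assert_eq!(resolve(a), "foo");
}
```

The trade-off eddyb describes follows directly: a fresh thread starts with an empty `INTERNER`, so cross-thread use either needs explicit TLS setup per thread or a context value passed to the child thread.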

@eddyb eddyb commented on an outdated diff Jul 18, 2016
text/0000-macros-1.1.md
+now, although eventually it will call methods on `Context` to provide more
+structured errors. The compiler will wrap up the panic message and display it
+to the user appropriately. Eventually, however, `librustc_macro` will provide
+more interesting methods of signaling errors to users.
+
+Customization of user-defined `#[derive]` modes can still be done through custom
+attributes, although it will be required for `rustc_macro_derive`
+implementations to remove these attributes when handing them back to the
+compiler. The compiler will still gate unknown attributes by default.
+
+### `rustc-macro` crates
+
+Like the executable, staticlib, and cdylib crate types, the `rustc-macro` crate
+type is intended to be a final product. What it *actually* produces is not
+specified, but if a `-L` path is provided to it then the compiler will recognize
+the output artifacts as a macro and it can be loaded for a program.
@eddyb
eddyb Jul 18, 2016 Member

Cargo's usage of --extern would also work.

@aturon
Contributor
aturon commented Jul 18, 2016

Thanks @alexcrichton for the awesome writeup! We've been working for a while on some way to enable usage of Serde and similar libraries on stable in the near term, without massive stabilization. For example, there's a code generation proposal (which sits entirely on the side from the macro/plugin system), as well as attempts to find a very small increment toward "macros 2.0". But this is the first proposal that can be implemented immediately, has a clear path toward speedy stabilization, and imposes virtually no maintenance burden. I'm incredibly excited to be so close to a solution for the biggest cause of nightly usage today.

One thing that wasn't totally obvious to me, given that the macros 2.0 plan is a bit out of cache: what, if any, parts of what you propose to stabilize here would end up being deprecated once macros 2.0 has fully landed?

(To be clear, I have zero problem stabilizing the small slice of functionality you've outlined here in the near term even if in the long term we plan to deprecate it.)

@eddyb eddyb and 1 other commented on an outdated diff Jul 18, 2016
text/0000-macros-1.1.md
+* Macros eventually want to be imported from crates (e.g. `use foo::bar!`) and
+ limiting where `#[derive]` can be defined reduces the surface area for
+ possible conflict.
+* Macro crates eventually want to be compiled to be available both at runtime
+ and at compile time. That is, an `extern crate foo` annotation may load
+ *both* a `rustc-macro` crate and a crate to link against, if they are
+ available. Limiting the public exports for now to only custom-derive
+ annotations should allow for maximal flexibility here.
+
+### Using a procedural macro
+
+Using a procedural macro will be very similar to today's `extern crate` system,
+such as:
+
+```rust
+extern crate double;
@eddyb
eddyb Jul 18, 2016 Member

I would've expected at least #[macro_use] to be necessary.

@alexcrichton
alexcrichton Jul 18, 2016 Member

cc @nrc, I think that importing custom derive annotations was a somewhat unspecified extension of "macros 2.0" right now, so to me it seems somewhat up in the air about whether we'd require that or not.

I don't mind one way or another, but I'm curious what @nrc thinks as well.

Eventually, of course, custom-derive will be imported through the normal module system I believe, so this is just a temporary stopgap.

@eddyb
eddyb Jul 18, 2016 Member

The problem is that if you do anything by default now, you'll need to opt out to be able to use the normal module system.

@alexcrichton
alexcrichton Jul 18, 2016 Member

Aha, an excellent point! You've convinced me, I'll add it.

@eddyb eddyb and 1 other commented on an outdated diff Jul 18, 2016
text/0000-macros-1.1.md
+* Plugin authors will need to be quite careful about the code which they
+ generate as working with strings loses much of the expressiveness of macros in
+ Rust today. For example:
+
+ ```rust
+ macro_rules! foo {
+ ($x:expr) => {
+ #[derive(Serialize)]
+ enum Foo { Bar = $x, Baz = $x * 2 }
+ }
+ }
+ foo!(1 + 1);
+ ```
+
+ Plugin authors would have to ensure that this is not naively interpreted as
+ `Baz = 1 + 1 * 2` as this will cause incorrect results.
@eddyb
eddyb Jul 18, 2016 Member

Couldn't we add parentheses around all such fragments at stringification time to avoid the misinterpretation?
Plugin authors can't do anything about this if all they have is conversion to String.

@alexcrichton
alexcrichton Jul 18, 2016 Member

Sure! This is just an implementation detail for the compiler though, right?

@eddyb
eddyb Jul 18, 2016 Member

It's visible to the plugins (and stable code can observe it with stringify! right now), so adding extra parens should be specified somewhere.
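
The precedence hazard being discussed is easy to check directly. This is just a hedged illustration of why naively splicing the text `1 + 1` into `$x * 2` goes wrong, not part of any proposed API:

```rust
fn main() {
    // `foo!(1 + 1)` expands `Baz = $x * 2`; splicing the fragment as raw
    // text yields `1 + 1 * 2`, which parses as `1 + (1 * 2)`.
    let naive = 1 + 1 * 2;

    // Wrapping the fragment in parentheses at stringification time
    // preserves the intended grouping: `(1 + 1) * 2`.
    let parenthesized = (1 + 1) * 2;

    assert_eq!(naive, 3);
    assert_eq!(parenthesized, 4);
}
```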

@nrc nrc changed the title from RFC: Macros 1.1 to RFC: Procedural macros 1.1 Jul 18, 2016
@alexcrichton
Member

@aturon

what, if any, parts of what you propose to stabilize here would end up being deprecated once macros 2.0 has fully landed?

This is a good question! Of the pieces here, what I would imagine is:

  • If we decide to rename anything, we'd accumulate deprecations for old names.
  • The behavior of "go to a string, parse, then go back to a token stream" will likely be deprecated in favor of directly manipulating the token stream.
  • The definition form of custom #[derive] will likely change and the proposed form here would be deprecated. We likely eventually want one that's imported via use (somehow).

Other than that though I imagine these pieces to survive stabilization:

  • The concept of a "macro" crate type
  • The concept of a "macro crate" provided in the standard distribution, providing all details necessary to talk to the compiler.
  • The ability to inform Cargo of crates which are macro crates
  • Cargo's behavior of compiling macro crates for the host and passing them to the compiler for the next unit of compilation.
  • Transformers like #[derive] operating over token streams and allowing customization via attributes that are stripped during the expansion process
@ubsan
Contributor
ubsan commented Jul 18, 2016 edited

First, I don't think this RFC has gone through enough design, and fast-tracking it just to get people off of nightly is not a good idea, IMO. However, I am in favor of eventually (soon-ish) getting this sliver.

Some of my specific issues:

First: librustc_macro is a terrible stable name! My vote's in for libmacro. It's not a rustc-specific thing, it's an all-of-Rust thing!

Second: Context seems like an unnecessary implementation detail, and, if it's used the same way as in the original proposal ("Procedural macros should report errors, warnings, etc. via the MacroContext [Context]")... why? We already have an existing error reporting mechanism, with Results. This new way seems very un-Rustic.

Third: I don't like how macros are defined. I'd rather they be defined with something like macro instead, especially as we eventually are going to allow proc macros and normal stuff in the same library.

@eddyb
Member
eddyb commented Jul 18, 2016

@ubsan I don't like Context either, but Result won't cut it; macros 2.0 will eventually want a full-blown diagnostics API, where you can emit multiple warnings and non-fatal errors.
If the necessary information is placed in TLS, the API would work without a Context type.

@sfackler
Member

@ubsan Result is an all-or-nothing thing. In the compilation process, it is very common to be creating many diagnostics during one compilation pass. A compilation error is not necessarily fatal - you (within reason) want to continue as long as possible to avoid forcing the user to go through a recompilation cycle for every error when many could all be reported at once. In addition, warnings, notes, etc are certainly not fatal.
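
A minimal sketch of the difference: a diagnostics sink can collect several warnings and errors in one pass where a single `Result` would bail at the first failure. The `Diagnostics` type and its methods here are invented for illustration; they are not the RFC's API.

```rust
#[derive(Debug, PartialEq)]
enum Level { Warning, Error }

// A toy diagnostics sink: errors are recorded, not immediately fatal.
struct Diagnostics { items: Vec<(Level, String)> }

impl Diagnostics {
    fn new() -> Self { Diagnostics { items: Vec::new() } }
    fn warn(&mut self, msg: &str) { self.items.push((Level::Warning, msg.into())); }
    fn error(&mut self, msg: &str) { self.items.push((Level::Error, msg.into())); }
    fn has_errors(&self) -> bool { self.items.iter().any(|(l, _)| *l == Level::Error) }
}

// A toy "compilation pass" that keeps going after the first problem,
// so the user sees every diagnostic in one run.
fn check(fields: &[&str], diags: &mut Diagnostics) {
    for f in fields {
        if f.starts_with('_') { diags.warn(&format!("unused-looking field `{}`", f)); }
        if f.is_empty() { diags.error("empty field name"); }
    }
}

fn main() {
    let mut diags = Diagnostics::new();
    check(&["_a", "", "b"], &mut diags);
    // Both the warning and the error were collected; processing continued.
    assert_eq!(diags.items.len(), 2);
    assert!(diags.has_errors());
}
```

With a plain `Result`, the `""` field would have aborted the pass and the warning on `_a` would never be reported, forcing a recompile per error.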

@arielb1
Contributor
arielb1 commented Jul 18, 2016

I think librust_macro could be a fine name.

On the integration level: plugins imply some form of host ABI. Stable plugins mean a stable HOSTRUSTC. I think we can just leave HOSTRUSTC as an implementation-defined thing, but we must be careful not to change its definition on existing platforms.

@ubsan
Contributor
ubsan commented Jul 18, 2016

@sfackler makes sense. I still don't think Context is the best, though.

@sfackler
Member

Do you have any suggestions for a better replacement?

@ubsan
Contributor
ubsan commented Jul 18, 2016

@sfackler No, but I think @eddyb does :P

@arielb1
Contributor
arielb1 commented Jul 18, 2016

On naming: I see librust_macro as an ordinary plugin library, and rustc_macro_derive as an exposed proc macro that (unstable) compiler plugins should eventually be able to reproduce (BTW, what is the nomenclature wrt. proc macro vs. plugin vs. compiler extension?)

rustc_macro_derive is a bad name, but we still don't have modular macros. I think pseudo-modular #[WHATEVER_CRATE_NAME_WE_PICK_derive] would work.

As for the API: maybe have #[CRATE_derive_v0] for the Context-free version?

@sfackler
Member

How does shoving Context into TLS hide implementation details?

@eddyb
Member
eddyb commented Jul 18, 2016 edited

@sfackler The TLS slot wouldn't be exposed, the APIs would just not take any context.

@sfackler
Member

@ubsan's concern with Context seemed to be that it was an implementation detail, and I'm confused as to how moving all of the methods on Context to free functions hides any details of the implementation. It seems totally isomorphic in that regard.

@arielb1
Contributor
arielb1 commented Jul 18, 2016 edited

Yeah. A Context parameter is easy to ignore and hard to add later.

@ubsan
Contributor
ubsan commented Jul 18, 2016 edited

A Context parameter also seems unnecessary and ugly. Just because it's fine doesn't mean it's good...

For example, an IPC macro system wouldn't need a Context.

@tomaka
tomaka commented Jul 18, 2016

Eventually Cargo may also grow support to understand that a rustc-macro crate should be compiled twice, once for the host and once for the target, but this is intended to be a backwards-compatible extension to Cargo.

As I already said in the RFC about procedural macros, I think this point is really important in order to avoid semver hell where the macros-providing crate (eg. serde-macros) must be compatible with the crate that provides the trait (eg. serde).

I think it would be better to implement that functionality as part of the RFC, instead of "eventually".

@alexcrichton
Member

Thanks for all the comments everyone! I've pushed a number of minor updates, and some responses are inline below as well.

  • Loading custom-#[derive] now requires #[macro_use] on the extern crate declaration site, as macros do today.

@ubsan

fast-tracking it just to get people off of nightly

To be clear, this RFC is not proposing anything like being instantly stable; it's my intention that this will go through the normal design process as any other feature does. First we have an RFC, lots of discussion, an FCP, we merge the RFC, we implement, and after a number of cycles, when it's clear the feature is ready, we can stabilize. If new information comes up during this time which obsoletes the design (e.g. macros 2.0 is implemented), then we can just avoid stabilization.

librustc_macro is a terrible stable name

Right now we unfortunately can't use macro as a crate name because it's a keyword, so @nrc suggested a different name. My intention with the rustc_ prefix was that it mirrors rustc-serialize, and it's of course easily rename-able during stabilization if we like.

Context seems like an unnecessary implementation detail

Sounds like there's been a lot of discussion on this as well, but I'm inclined to agree with @arielb1 in that it's easy to ignore this parameter for now, and we can always remove it later!

We'll have plenty of time after this is implemented to work toward stabilization as well, and we've already got lots of experience migrating APIs with libstd, so it'd basically just be doing that again: a graceful transition period from the unstable APIs to a new set of stable APIs.

I don't like how macros are defined

The intention here is that this wouldn't be the final form in which custom derive annotations are written actually. These annotations don't fit in well with the module system, aren't hygienic, etc. If we were to stabilize in the near future, and then macros 2.0 is implemented much later, it may be the case that we want to deprecate this form of definition in favor of the macros 2.0 method.

Basically the idea is that we don't want to steal the "good syntax" for definitions just yet, but instead leave plenty of room for it to ensure it can entirely supplant this interim 1.1 state.


@arielb1

On the integration level: plugins imply some form of host ABI. Stable plugins mean a stable HOSTRUSTC.

Could you elaborate a bit on what you're thinking here? The intention is that all ABI details are hidden from procedural macros entirely, allowing the compiler to implement plugins in whatever way it sees fit.

Along those lines, though, I'm not entirely sure what you mean by HOSTRUSTC is here?

BTW, what is the nomenclature wrt. proc macro vs. plugin vs. compiler extension?

Right now I believe we're thinking "procedural macros" are those loaded by the compiler and run arbitrary code behind an invocation like foo!() or an attribute like #[foo]. A plugin on the other hand has been a bit of a confused term in the past, but I think we're keeping that nomenclature away from the macro system and instead reserving it for things which are far more intrusive to the compiler, such as lints, LLVM passes, MIR passes, etc. As for "compiler extension", I'm not sure!


@tomaka

I think this point is really important in order to avoid semver hell where the macros-providing crate (eg. serde-macros) must be compatible with the crate that provides the trait (eg. serde).

This is a good point! Unfortunately though this starts getting very quickly into the weeds of how to design macros 2.0, rather than remaining a very slim extension of what the compiler does today to hopefully stabilize quickly (due to its size).

In general I think that a solution here would also want to come from the Cargo side of things as well, not just rustc. I'll be sure to add this to the drawbacks though!

@eddyb
Member
eddyb commented Jul 18, 2016

From IRC: If re-exporting macro_rules macros works today, we could just make sure it also works with procedural custom derives, allowing serde to export both those and its usual public API, hiding the procedural macro crate from users.

@alexcrichton
Member

@eddyb oh right yes thanks for the reminder. @tomaka this was another solution to the problem you proposed I believe, which is that serde would simply reexport the macros from serde_macros so you'd only have to depend on one.

Macro reexport today is done in the standard library, but it is unfortunately unstable. Perhaps we could entertain a separate RFC for pushing it towards stabilization?

@eddyb
Member
eddyb commented Jul 18, 2016

@alexcrichton Oh, I didn't realize it was unstable! Because then I'd prefer @jseyfried's suggestion to just go with the new modular system from the start, which would include pub use for re-exports.

@alexcrichton
Member

@eddyb yeah if we've got all the hooks to plug in custom derive to the module system, and it all works when the rest of this RFC is ready to go, then I'm all for it! We'd want to make sure that the stabilization of it, however, would not hinder any future efforts towards macros 2.0.

Using #[macro_use] and maybe #[macro_reexport] to deal with it today just gives us a clear way to implement it now, but if more tools are available by then, we can of course stabilize towards using them instead.

@SimonSapin SimonSapin and 1 other commented on an outdated diff Jul 18, 2016
text/0000-macros-1.1.md
+that error messages related to struct definitions will get *worse* if they have
+a custom derive attribute placed on them, because the entire struct's span will
+get folded into the `#[derive]` annotation. Eventually, though, more span
+information will be stable on the `TokenStream` type, so this is just a
+temporary limitation.
+
+The `rustc_macro_derive` attribute requires the signature (similar to [macros
+2.0][mac20sig]):
+
+[mac20sig]: http://ncameron.org/blog/libmacro/#tokenisingandquasiquoting
+
+```rust
+fn(&mut Context, TokenStream) -> TokenStream
+```
+
+If a macro cannot process the input token stream, it is expected to panic for
@SimonSapin
SimonSapin Jul 18, 2016 Contributor

Why panic over returning Result<TokenStream, MacroError>? MacroError could initially only have a from(&str) constructor.

@alexcrichton
alexcrichton Jul 18, 2016 Member

I believe the intention here is that this is the signature we'll want in "macros 2.0" as well, and eventually diagnostics should be signaled through rustc_macro, not panics. That is, any Result type returned here wouldn't have enough information for diagnostics anyway.
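
Since the macros 1.1 workflow is "turn the item into a string, transform, parse back", the shape of such a derive can be sketched with plain strings. No real `TokenStream` or `Context` appears here; `derive_double` and the `Double` trait are hypothetical, and the "parsing" is a deliberately naive name extraction:

```rust
// Hedged sketch of the string round-trip a macros 1.1 derive performs:
// input item source in, generated impl source out. A real implementation
// would go through `TokenStream::from_source` / `to_source` and would
// panic (per the RFC) on malformed input.
fn derive_double(input: &str) -> String {
    // Naively pull out the type name following the `struct` keyword;
    // real code would parse the token stream properly.
    let name = input
        .split_whitespace()
        .skip_while(|w| *w != "struct")
        .nth(1)
        .expect("expected `struct <Name>`")
        .trim_end_matches(|c: char| c == '{' || c == ';');
    format!("impl Double for {} {{ fn double(&self) {{}} }}", name)
}

fn main() {
    let out = derive_double("struct Foo;");
    assert_eq!(out, "impl Double for Foo { fn double(&self) {} }");
}
```

The `expect` stands in for the panic-on-bad-input behavior described above: the compiler would catch the panic and surface its message as a (crude, interim) diagnostic.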

@alexcrichton
Member

Ok, I've now also pushed a commit to drop passing around &mut Context; instead it'll all be implicitly passed behind the scenes in a thread-local variable. Of course, we can change this implementation over time.

@comex
comex commented Jul 19, 2016 edited

Concrete question: What kind of guarantees, if any, apply to the state of globals or TLS (i.e., that the plugin accesses itself) between multiple invocations of custom derives? Are they guaranteed to be run from the same thread or process, in any particular order? Does it matter whether those invocations are of the same derive, derives defined in the same crate, or totally separate ones? Are they guaranteed to be run at all each time compilation is requested (as opposed to cached), which would impact plugins that base their results on reading files and such? Might multiple unrelated compilations use the same process?

(Well, the last one is probably an easy yes. But for the rest: a future rustc will probably someday want to cache the results of procedural macros for incremental compilation, and/or run them in parallel, and an IPC-based plugin system might or might not run each plugin crate in its own process, or even use multiple processes for a single plugin crate, at the same or different times. Of course, macros defined using this initial interface could always be special-cased.)

Also, I think it should be possible for plugins to define macro names (or in this case, derive(..) names) dynamically, either based on some sort of configuration or input file, or by defining a macro the user can use to define other macros (like macro_rules! itself). The latter I suppose would be pretty useless as long as this only supports derives, and in any case should arguably not be stabilized until the practical impact on incremental compilation can be considered. But the former is probably reasonable to support even in the initial iteration, at least if plugins are allowed to be non-pure in the first place: it just requires keeping the registration process explicit rather than requiring an attribute on each derive function.

More abstractly:

To the extent that the compiler plugin interface is considered ideally 'just another API' that happens to be consumed by the compiler, it's a bit odd that a single rustc_macro crate is expected to be sitting around in the environment, apparently as part of the standard library. After all, APIs can have multiple clients, and in theory there is no fundamental reason that the compiler that compiles the plugin must be the same one that will execute it. (Other than the magic attributes, but I'll get to that.) One simple use case for it being different is testing: a plugin author might want to use a mock implementation of the compiler interface to run tests without needing an actual Context, etc.

As a more elaborate example, one could imagine an alternative rustc implementation that only supports one target at a time: for some-triple-cross-funkyrustc to run compiler plugins, it would just shell out to rustc, which could be a different version or a completely different implementation, to compile the plugin, then run it, e.g. over IPC. This compiler would want to use its own implementation of the various structs, TokenStream, LexError, etc.: I suppose it could override the library search path for rustc_macro, but that's relatively ugly.

A more accommodating design could define the whole compiler-to-plugin interface in terms of a master trait: the trait's definition would only need a single implementation, which could just exist on crates.io. As a bonus, if someone actually writes an alternative Rust compiler, having a formal type signature for the compiler end of the interface would make it easier to ensure their implementation is compatible. More practically, with the rustc we have today, that should make it a bit easier to version the API and avoid accidental breaking changes - I suppose that's no bigger of a deal here than with the rest of the standard library, but still. Something like:

pub trait TokenStream<Context, LexError>: Sized {
    fn from_source(cx: &mut Context, source: &str) -> Result<Self, LexError>;
    fn to_source(...);
}
pub trait Compiler {
    type Context;
    type LexError: Debug + Error + ...;
    type TokenStream: TokenStream<Self::Context, Self::LexError>;
}

Though that leaves the question of how the user's functions should be declared. I have a vague idea that the use of magic attributes/macros in the RFC's design, which would probably be considered bad design if used unnecessarily in any other API, may be somewhat overlooked here because the use case is macros. If the compiler plugin API is 'just an API', then it should look like an idiomatic plugin API (by which I mean an API intended to have many implementations loaded dynamically).

...Except for the obvious point that there are no real idioms for plugin APIs today, since there aren't many plugin APIs written in Rust at all, since Rust doesn't support them very well (e.g. no way to load shared libraries at runtime without going through the C ABI). In particular, it's somewhat unclear how a future, more 'native' shared library loader will require plugins to declare their exports. But I propose that a reasonable design involves just having the crate export a specially named public symbol or symbols:

  • If someone wants to use a given fixed implementation, they can do it today just by importing the symbol(s) from the crate. As long as the exported symbol name(s) are fixed, the use declarations can remain the same, and only the extern crate declaration (or its path attribute, or -L, etc.) needs to change.
  • If someone (other than rustc) wants to create an actual plugin system that can access arbitrary implementations at runtime, they can't do it as neatly today as might someday be possible, but they can still do it: when the user specifies a source file, they can invoke rustc on it, then autogenerate and compile a wrapper crate that imports those fixed names from it, and re-exposes the interface via either a C API or IPC.
  • I'm not proposing that rustc actually do that; rather, it would just change the compiler magic a bit to identify a specially named symbol rather than an attribute.

It would make sense to define this too in terms of a trait; instead of having a series of magic symbols that should be implemented as fns to do different things, there would be just one symbol, which would be a type that implements the (external) trait. Among other things, using a trait would make type checking the plugin end of the interface more straightforward, and perhaps make lifecycle management a bit more self-evident/idiomatic (&mut vs. &). Something like:

pub trait Registrar<C: Compiler> {
    fn register_derive(&mut self, name: &str, func: Box<Fn(C::TokenStream) -> C::TokenStream>);
}
pub trait CompilerPlugin<C: Compiler> {
    fn new<R: Registrar<C>>(r: &mut R) -> Self;
}

That definition does have a bit of an issue, in that the user should implement their plugin generically for any C, but there is no way to express that requirement in the type system. Probably not a big deal though: in practice, C would be a private type on the compiler's end, so there would be no way to only implement it for given Compilers anyway.

Anyway, I don't know if any of this is really necessary for the kind of barebones initial interface this RFC proposes, or, for that matter, will ever be necessary. Maybe I'm way overthinking it. But I don't think it adds too much complexity, and to me it seems a bit more future-proof overall.

@Stebalien
Contributor
Stebalien commented Jul 19, 2016 edited

Has there been any thought given to sandboxing (e.g., banning all syscalls), both for security and general sanity? For one, it would be nice to be able to syntax check Rust code without running arbitrary cage-free code, and this is something that can't be turned on later.

Note: one can currently run arbitrary code using build.rs but I was hoping that uses of build.rs would become less common (and whitelistable) when we got stable procedural macros.

@kennytm kennytm and 1 other commented on an outdated diff Jul 19, 2016
text/0000-macros-1.1.md
+
+Eventually Cargo may also grow support to understand that a `rustc-macro` crate
+should be compiled twice, once for the host and once for the target, but this is
+intended to be a backwards-compatible extension to Cargo.
+
+## Pieces to stabilize
+
+Eventually this RFC is intended to be considered for stabilization (after it's
+implemented and proven out on nightly, of course). The summary of pieces that
+would become stable are:
+
+* The `rustc_macro` crate, and a small set of APIs within (skeleton above)
+* The `rustc-macro` crate type, in addition to its current limitations
+* The `#[rustc_macro_derive]` attribute
+* The signature of the `#![rustc_macro_derive]` functions
+* The `#![rustc_macro_crate]` attribute
@kennytm
kennytm Jul 19, 2016

This attribute is not specified anywhere in the RFC. Is it #![crate_type="rustc-macro"]?

@alexcrichton
alexcrichton Jul 19, 2016 Member

Oops, thanks for the sharp eye! These were actually remnants of a previous draft, I just forgot to remove them.

@kennytm kennytm commented on an outdated diff Jul 19, 2016
text/0000-macros-1.1.md
+requirements of the compiler change, and even amongst platforms these details
+may be subtly different.
+
+The compiler will essentially consider `rustc-macro` crates as `--crate-type
+dylib -C prefer-dynamic`. That is, compiled the same way they are today. This
+namely means that these macros will dynamically link to the same standard
+library as the compiler itself, therefore sharing resources like a global
+allocator, etc.
+
+The `librustc_macro` crate will be compiled as an rlib and a static copy of it
+will be included in each macro. This crate will provide a symbol known by the
+compiler that can be dynamically loaded. The compiler will `dlopen` a macro
+crate in the same way it does today, find this symbol in `librustc_macro`, and
+call it.
+
+The `rustc_macro_define` and `rustc_macro_derive` attributes will be encoded
@kennytm
kennytm Jul 19, 2016

The #[rustc_macro_define] attribute is not specified anywhere in the RFC.

@jimmycuadra

Thanks for the thoughtful design here, @alexcrichton. Great ideas. I'm not a compiler dev so the most I can contribute to the discussion is at a higher level, but generally my concerns are:

First: Minimizing or eliminating churn that would occur when macros 2.0 becomes available. That includes work for compiler devs, macro crate authors, and library consumers. For compiler devs, how much work would need to be done for this that wouldn't be done differently or replaced entirely by macros 2.0? For macro crate authors, are they going to have to rewrite everything now and then again for macros 2.0? For library consumers, are they going to need to make changes to Cargo.toml and/or the syntax for activating/importing the custom derives now and then again for macros 2.0?

Second: Needing to convert the token stream to a string in order to manipulate it seems very un-Rusty. This seems like it puts a really heavy burden on macro crate authors by foregoing the benefits of Rust's great type system. Having a string of Rust syntax as input will create a sort of ad-hoc API that macro crate authors will depend on. If the way the token stream gets converted into a string changes formatting at all, that could break assumptions their code will be making in order to manipulate it into the final form, e.g. variations in whitespace. In other words, losing all the type information means the macro crate authors will have to re-extract that meaning by parsing the string and essentially guessing at what it means.

In general, the need to convert back and forth to a string almost feels like being told "just shell out and do your work in a more capable language." Is there any other middle ground between string manipulation and full macros 2.0-level token manipulation? Would a subset of the macros 2.0 token manipulation API be feasible here, or would that blow out the scope too much?

Third: The perception of Rust not being stable because the nightly compiler is required for a better experience when you need serialization, as discussed in the motivation section of the RFC. Whether or not macros 1.1 would change that perception might depend a lot on my previous two concerns. The library consumer needing to distinguish between, for example, serde and serde_macros, and to know that they need the latter if they want custom derives, will still seem cumbersome and new users may not understand why it's necessary and have trouble keeping the versions in sync even if they do. To that end, I share @tomaka's concern.

As a library consumer, I think knowing that macro_use will eventually be replaced by importing macros like other items makes the whole system feel unfinished, too. That may be because I've been following this issue for a long time and am more familiar with the history and the plans for the future. In terms of a quick win for user perception, this may not be as important. But it does play into my first point: there is still going to be some churn in the future for everyone involved, and if people know that's coming, it might make those not already invested in Rust think, "I'll just wait until the final version is ready."

@aturon
Contributor
aturon commented Jul 19, 2016

@jimmycuadra

In general, the need to convert back and forth to a string almost feels like being told "just shell out and do your work in a more capable language." Is there any other middle ground between string manipulation and full macros 2.0-level token manipulation?

Just a small note about this: the expectation is that you'll use an external library (like a new version of syntex) to do parsing and pretty printing. In other words, you can continue to work with AST rather than strings. The key differences are that (1) there will be a stable string-based interface to the compiler and (2) the AST definitions you're working with live outside the compiler, hence don't need to match any particular nightly version etc. Since updates to Rust's syntax come very rarely, the external AST library will likewise change slowly. (Ideally, new additions to Rust's syntax would not even involve breaking changes).
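That workflow can be sketched concretely. In the toy below, `TokenStream` is modeled as a plain string and the "external AST library" is reduced to naive whitespace tokenizing; all names here are illustrative stand-ins, not the RFC's actual API:

```rust
// Stand-in for the opaque `rustc_macro::TokenStream`: only stringification
// and parsing cross the stable boundary.
struct TokenStream(String);

impl TokenStream {
    fn parse(source: &str) -> TokenStream {
        TokenStream(source.to_string())
    }
    fn to_source(&self) -> String {
        self.0.clone()
    }
}

// What a derive implementation would do: stringify the input, hand the
// string to an external parsing library, and pretty-print the expanded
// items back out.
fn derive_hello(input: TokenStream) -> TokenStream {
    let source = input.to_source();
    // A real implementation would use a proper external parser; grabbing
    // the identifier after `struct` is enough for this sketch.
    let name: String = source
        .split_whitespace()
        .skip_while(|word| *word != "struct")
        .nth(1)
        .expect("input should contain a struct")
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect();
    // Re-emit the item itself plus the generated impl.
    TokenStream::parse(&format!("{}\nimpl {} {{ fn hello() {{}} }}", source, name))
}

fn main() {
    let output = derive_hello(TokenStream::parse("struct Point { x: i32, y: i32 }"));
    println!("{}", output.to_source());
}
```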

@erickt
erickt commented Jul 19, 2016

@Stebalien: Sandboxing is a good argument for talking to a plugin over an IPC channel. Another argument that hasn't been brought up yet is that some targets, like statically linked Rust/musl, don't support loading dynamic libraries, but do support subprocesses.

@comex: Whether or not macros should be able to preserve state across invocations is interesting. The current plugin system allows it, but I'm not sure if it was ever really thought about if it should or not. The only library that I know of that does this though is clippy (cc @Manishearth and @llogiq) for analysis passes, but that's making lints, which is out of scope. I'm not even sure if something like stateful would need this since everything it does is still contained in one function.

I don't have a strong opinion either way on preserving state across invocations. It could lead to weird non-deterministic builds, but then it'd be up to the plugin author not to do that.

If we do actually allow for this, if we don't already have a formal evaluation order of the attributes, we should just explicitly define it as front-to-back so that a user can have some level of control if the plugins do in fact preserve (and maybe even share) state across uses.

@cgswords

@comex In this 'TokenStream as trait' approach, how will the expander enforce hygiene? The hygiene responsibility (anti-marking) currently falls on the invoker, and that means the invoker needs to know enough about the structure of the macro result to be able to ensure correct hygiene information. That means the trait, at minimum, will need the ability to manipulate hygiene over the structure in a way that interoperates with the expander (basically writing custom pieces of the marking algorithm), which seems messy at best and terrible at worst (e.g. misimplemented).

W/R/T global state, I think that (a) macros shouldn't preserve state across expansion using the thread-local storage available to the expander and syntax manipulation facilities, and (b) if a macro wants access to thread-local state, it should maintain its own (by defining some TLS near its definition site and using that). Moreover, giving rigid ordering guarantees at the level of, eg, attribute expansions on a particular item, seems baroquely fine-grained. If a macro author is truly that concerned about the order of expansion, I don't see why they can't track it and inspect it themselves in a TLS variable the macro maintains.

@eddyb
Member
eddyb commented Jul 19, 2016

@comex There are no plans to allow crates compiled with one compiler to be used with another.
That would require creating some form of standard for the crate metadata and that is not on the roadmap.
Moreover, libcore internally uses compiler details that may necessarily vary between compilers.
Every compiler is expected to have its own implementation of libmacro just as it would have its own libcore implementation, and the build artifacts will not be compatible, not in the foreseeable future.

@eddyb
Member
eddyb commented Jul 19, 2016

@erickt FWIW, the dynamic linker is just another library and with some hacks it could be incorporated into a statically-linked binary. We could write our own loaders, too, maybe involving the LLVM binary file format support, but that's a bit of an overkill (for now, anyway).

@alexcrichton
Member

I realize now I may not have stressed this in the RFC well enough, but the purpose here is to be maximally minimal, in the spirit of "perfect is the enemy of the good". The purpose of this RFC is to start along a path towards "macros 2.0", which will solve many of the problems mentioned here, but not to rethink the macro system at the same time.

To be clear, the only feature enabled here is custom derive which is a quite small feature that isn't impacted that much by features like hygiene. A full-blown procedural macro system is hopefully where this groundwork ends up eventually, but it's best discussing that on the relevant RFCs. So to clarify, I think it'll help to focus discussions on what would block this RFC in particular, not far-off features we'd like to see one day in a macro system.


@comex Thanks for the comments! About global state, threads, processes, etc, I think it's better to phrase the question "what's the maximally compatible thing we can do today?" In that sense, I think we just won't provide any guarantees about when/where procedural macros are run today. Eventually we may wish to guarantee something along these lines, and this may hinder implementing some macros today, but the purpose here is to start on the path towards enabling all procedural macros, not enable them all at once.

As @eddyb mentioned as well it's intended that all procedural macros are compiled by the crate loading them, so we have maximal flexibility in choosing a path later on. We can of course one day loosen this restriction if need be.

Additionally, the purpose here is to avoid abstracting too much by providing as thin an API as possible that can be implemented differently in the future. To that end I think it'd be best to avoid traits and extra layers of abstraction as it may be diverging both from macros 2.0 as well as where we'd like to end up.


@Stebalien I think that sandboxing plugins would be a great thing to explore! I don't believe it should hinder the initial implementation, however, especially because there's no sandboxing solution ready-to-go for the compiler today. In that sense I don't think it should block an initial implementation, and we can perhaps just document that "macros may be sandboxed in the future" to guard us in that respect.


@jimmycuadra

Minimizing or eliminating churn that would occur when macros 2.0 becomes available.

Indeed! That's driving some of the design in this RFC, but I also consider it ok if this whole system becomes deprecated when macros 2.0 becomes available. The idea is that the abstractions provided here give a nice transition story, but if that doesn't pan out, they're low-cost enough to just deprecate and keep around forever.

Needing to convert the token stream to a string in order to manipulate it seems very un-Rusty.

@aturon mentioned this as well, but the intention here is that this is the absolutely minimal API which we can stabilize for macro authors. This kind of functionality will always be available, so it's easy for the compiler to implement and maintain over time.

@cgswords has a great TokenStream abstraction building up in libsyntax, and my expectation is that much of it moves over to librustc_macro over time. Like the standard library we'll probably have a stabilization process for all APIs as well as a number of unstable APIs internally.

All in all I expect this to act much like the standard library. The compiler provides the bare bones abstractions, support is built on crates.io, and gradually the most popular support gets baked into the compiler itself.

Finally, the great part about stabilization is that it will allow authors to slowly transition code to more appropriate token stream APIs over time as they can. Macro authors can transition at their leisure without impacting users (beyond giving them better error messages and hygiene). Macro uses with this RFC are limited to custom derive, so they're quite restricted already, but eventually users would even get the nice ability to import custom-derive annotations, etc.

As a stable interface to the compiler (if stabilized), none of the support will ever break. We'll just have a gradual transition away towards more ergonomic, featureful APIs.

As a library consumer, I think knowing that macro_use will eventually be replaced by importing macros like other items makes the whole system feel unfinished, too.

It's true that the support for custom derive today isn't great, but as I mentioned above we don't want perfect to be the enemy of the good here. We're only talking about custom derive in this RFC, and #[derive] works pretty well today. Sure it can work better, but there's no reason we can't extend the current mode a bit and then pile on "all the features" once macros 2.0 is ready.
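For reference, a custom-derive crate under this RFC would look roughly like the following. The `#[rustc_macro_derive]` attribute and the `TokenStream -> TokenStream` signature come from the RFC; the `Double` derive and its body are illustrative, and this nightly-only fragment cannot compile outside a `rustc-macro` crate:

```rust
// Cargo.toml would declare the new crate type:
//
//     [lib]
//     rustc-macro = true

extern crate rustc_macro;

use rustc_macro::TokenStream;

#[rustc_macro_derive(Double)]
pub fn double(input: TokenStream) -> TokenStream {
    // Work happens on the stringified item; the output re-emits the item
    // itself plus whatever impls were generated.
    let source = input.to_string();
    let expanded = format!("{}\n/* generated impls here */", source);
    expanded.parse().unwrap()
}
```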

@matklad
matklad commented Jul 19, 2016

I have zero experience in this area, but I still want to ask a question. Is it true that compiler plugins are a better tool for serde/diesel than plain old code generation (RFC)? I think code gen has its own benefits:

  • it generates real code, which can be read, debugged and profiled without any special support.
  • it is simpler.
  • something like incremental compilation would work with generated code out of the box.

Can it be the case that compiler plugins are used in important projects just because there are no ergonomic code gen solutions?

@eddyb
Member
eddyb commented Jul 19, 2016

@matklad One huge problem with code generation is hygiene: procedural macros (although not the deriving support described here, but the full macros 2.0 deal) can interact with Rust's macro system hygienically, which uses information code generation just destroys.

Not to mention scoping and attributes: you need to re-implement macro expansion as a pre-processor (which is what the stable solution for serde is) to actually get anywhere close.

Btw, the libmacro plans are compatible with having alternative Rust compilers implement them and @ubsan has expressed the desire to have such a prototype before any part of libmacro is stabilized.
It's possible IntelliJ could have a working front-end with their own version of libmacro that allows procedural macros to be fully supported by the Rust plugin for IntelliJ.

@comex
comex commented Jul 20, 2016

@cgswords

@comex In this 'TokenStream as trait' approach, how will the expander enforce hygiene?

My impression is that the currently proposed design makes TokenStream fully opaque and does not support hygiene. I'm not sure exactly what you mean about marking, but in general I wouldn't expect problems translating any given design to involve traits...

@eddyb

@comex There are no plans to allow crates compiled with one compiler to be used with another.
That would require creating some form of standard for the crate metadata and that is not on the roadmap.

I don't know what you mean by "used with". In general, my hypothetical compiler that used a different compiler to compile its plugins would then load them using some generic interface for Rust programs to load plugins. Today, as I mentioned, that would probably involve autogenerating a wrapper crate that imported the symbol and re-exposed the functionality via either a C API or IPC, and running the second compiler on that. In the future, I hope that Rust will gain better support for plugins (which could take many forms) so it could work more directly, but the compiler's requirements would still be no different from any other program that uses plugins.

Which is my general point - that using magic attributes seems to sort of establish 'procedural macro implementations' as a distinct type of code from the compiler's point of view, whereas at least the current design proposals have no real need for them to be anything other than 'plugins that happen to be for a compiler'.

I don't really expect the alternative-compiler scenario to occur anytime soon; it's more of an issue of design elegance and future-proofing.

However, @alexcrichton's point about avoiding abstraction is well taken.

On another point, "macros may be sandboxed in the future" sounds like a pretty risky approach to me, but I guess it would be hard to do anything else in the short term.

@Ericson2314
Contributor

I think it's a bit silly to assume that macros 2 will take years, yet we can already predict that something will be forwards compatible (even if its quite minimal). I'd rather be conservative and namespace/prefix everything under v1_1 or something.

@nikomatsakis
Contributor

One thing I wanted to ask. Is the plan that the plugin gets the struct definition as input and then emits BOTH an impl and the structure definition, or that it only has to add the impl it is generating? The latter seems better as the effect of losing the span info is minimized.

@alexcrichton
Member

@nikomatsakis

ah yeah, to clarify and quote this section:

The input here is the entire struct that #[derive] was attached to, attributes and all. The output is expected to include the struct/enum itself as well as any number of items to be contextually "placed next to" the initial declaration.

It was opted to force regeneration of the struct itself for two reasons:

  • I believe @nrc wants to remove the distinction between decorators and modifiers, where decorators generate more things and modifiers replace the thing in question. This means we'd only have modifiers.
  • Second, custom derive modes sometimes want to be customized (e.g. #[serde(rename = "foo")]), and this allows custom derives access to this information. We'll schedule the lint for "unused and gated attributes" after expansion, so as long as custom derives strip their relevant attributes, they won't be gated.

Does that make sense?
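A runnable toy of that second point, with token streams modeled as plain strings (the names are illustrative): because the derive re-emits the item, it can strip its own helper attributes before the lint runs.

```rust
// Remove lines that begin with the derive's helper attribute, e.g.
// `#[serde(...)]`, so the "unused attribute" lint never sees them.
fn strip_helper_attrs(item_source: &str, helper: &str) -> String {
    let marker = format!("#[{}(", helper);
    item_source
        .lines()
        .filter(|line| !line.trim_start().starts_with(&marker))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let input = "#[serde(rename = \"foo\")]\nstruct Foo {\n    bar: i32,\n}";
    // The derive would emit this stripped item plus its generated impls.
    println!("{}", strip_helper_attrs(input, "serde"));
}
```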

@jseyfried
jseyfried commented Jul 21, 2016 edited

I believe @nrc wants to remove the distinction between decorators and modifiers, where decorators generate more things and modifiers replace the thing in question. This means we'd only have modifiers.

Already done -- we haven't deprecated decorators yet, but modifiers are now strictly more powerful (c.f. rust-lang/rust#34253) and decorators are treated as a special case of modifiers (c.f. rust-lang/rust#34446).

To improve the span information, perhaps we could add a method extend to TokenStream so that, for example, double could become:

#[rustc_macro_derive(Double)]
pub fn double(mut input: TokenStream) -> TokenStream {
    let source = input.to_string();
    let impls_source = derive_double_impls(&source); // Only returns the added `impl`s
    input.extend(impls_source.parse().unwrap());
    input
}
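The idea can be modeled with a toy token stream: only the added impls are parsed from a string, while the input item's tokens (which carry real spans in the compiler) pass through untouched. This stand-in is illustrative, not the proposed API:

```rust
// Token streams modeled as word lists; in the compiler each token would
// also carry a span, which `extend` leaves intact for the original item.
struct TokenStream(Vec<String>);

impl TokenStream {
    fn parse(source: &str) -> TokenStream {
        TokenStream(source.split_whitespace().map(str::to_string).collect())
    }
    fn extend(&mut self, other: TokenStream) {
        self.0.extend(other.0);
    }
    fn to_source(&self) -> String {
        self.0.join(" ")
    }
}

fn main() {
    let mut input = TokenStream::parse("struct Foo ;");
    // Only the appended impls come from a parsed string; the original
    // tokens (with their spans) are never round-tripped.
    input.extend(TokenStream::parse("impl Foo { }"));
    println!("{}", input.to_source());
}
```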
@nikomatsakis
Contributor

@alexcrichton

It was opted to force regeneration of the struct itself for two reasons

Well, one could imagine adding another helper method to TokenStream -- one like "concatenate", that takes a TokenStream input along with additional text to produce a compound result. This way, if you leave the struct unmodified, you could just append it.

Basically, my concern here is that if you have something like this:

#[derive(SomethingCustom)]
struct Foo {
    x: u33 // note typo
}

you will get an error like "undefined type u33" with the "derive" as the span, right?

Anyway, not the end of the world, but it seems like we can probably do better just by tweaking the API we offer in some small way.

@eddyb
Member
eddyb commented Jul 22, 2016

I agree to concatenation, but I'm not sure we should have it from the start. If the item definition needs to be modified, we might want some heuristics for re-applying spans on the output based on the input.

@nikomatsakis
Contributor

Yeah, that could work too. We don't have to work it out in the RFC thread; seems like an ideal issue to address later.

@alexcrichton
Member

@jseyfried aha good point! And it seems that @nikomatsakis ended up at the same conclusion. It seems reasonable to me that we'd have that eventually, but I agree with @eddyb that it's an API that can be added and stabilized after-the-fact. With custom attributes you also wouldn't be able to use concatenate unfortunately as you'd want to splice out the custom attributes in the token stream.

@nikomatsakis you're right though that the errors pointing at structs with custom derive annotations are going to get worse, although hopefully once we stabilize more of TokenStream that won't be the case!

@nrc
Contributor
nrc commented Jul 22, 2016

To jump in on the span bandwagon - I think there are a few things we could do in the medium term - I don't like concatenation because I think very minor changes to the annotated code will be fairly common and it would produce an 'ergonomics cliff'. Some kind of lightweight way to indicate preserved spans or a source-map-like approach would probably work. But I definitely think that should come later.

@eddyb
Member
eddyb commented Jul 22, 2016

@nrc Like I mentioned on IRC, I believe we can preserve all the important bits by just guessing where things could have come from, which would require no API additions, and would be obsoleted by tokens getting stabilized.
We just need to flag the TokenStream or individual tokens as having no discernible source and then act on that, in the derive case, matching the output to parts of the properly-spanned input.

@sanxiyn
Member
sanxiyn commented Jul 25, 2016 edited

Re: global namespace of derive modes, I propose Rust project runs central registry of derive modes to prevent name conflict. (Like crates.io.)

@sgrif sgrif added a commit to diesel-rs/diesel that referenced this pull request Jul 27, 2016
@sgrif sgrif Implement a stable version of `BelongsTo`
This requires a bit more information to be explicitly given in the
options, as we can no longer infer the table name or the foreign key
from the relevant struct names. This differs from the procedural form,
as you state the name of the struct that you belong to, not a lower
cased singular form of the table (I'm considering changing the
procedural form to match)

Since we need the table name potentially available in multiple places,
I've modified the only other place that is related and explicitly needs
it (Identifiable) to match with a new annotation. As usual, I've had to
do `#[table_name(value)]` instead of `#[table_name="value"]`, as there's
no way for me to match the second form as an ident.

Due to the proposed changes in [RFC 1681][rfc-1681], I'm considering
eventually having the "public API" form of this use annotations similar
to the current procedural macros, and then an `Associations!` macro
which walks over each of them, which would look something like this:

    #[derive(associations)]
    #[has_many(Comment, foreign_key = post_id)]
    #[belongs_to(User, foreign_key = user_id)]
    #[table_name(posts)]
    pub struct Comment {
        id: i32,
        user_id: i32,
    }

As I write this, I am realizing that the way I'm passing the foreign key
isn't actually a valid meta item once this goes to stable, and I have
no way to pattern match on it. I'll probably need to change it to
`foreign_key(post_id)` in the future...

Anyway the change to that form can still call this macro under the hood,
it would only need to be a documentation change, so I think we can move
forward with this interface for now.

[rfc-1681]: rust-lang/rfcs#1681
1a72382
@sgrif sgrif referenced this pull request in diesel-rs/diesel Jul 27, 2016
Merged

Implement a stable version of `BelongsTo` #383

@sgrif sgrif added a commit to diesel-rs/diesel that referenced this pull request Jul 27, 2016
@sgrif sgrif Implement a stable version of `BelongsTo`
f33d250
@aturon aturon added the I-nominated label Aug 11, 2016
@aturon
Contributor
aturon commented Aug 11, 2016

Nominating for FCP.

@ubsan
Contributor
ubsan commented Aug 11, 2016

I know this isn't actually in FCP yet, but the only issue I see is that this isn't called libmacro. I think this is a big enough step that we should name the library `macro`.

@Manishearth
Member

IIRC that's a keyword (which may be used in the Final Form of macros).

@nikomatsakis
Contributor
nikomatsakis commented Aug 12, 2016 edited

We discussed this in the @rust-lang/lang meeting and decided we would like to move to accept this RFC.

Here is a summary of the conversation thus far:

@nikomatsakis
Contributor

Hear ye, hear ye! This RFC is now entering final comment period regarding a move to accept.

@llogiq
Contributor
llogiq commented Aug 12, 2016

I still think that the stringy-ness of the API precludes having correct information on where a span of code comes from (if that is even one location – it could have been stitched together from several parts of the macro and the code generated within the macro invocation). Unfortunately, this is a vital part of having good error messages as well as linting accurately in the presence of macros.

I sincerely doubt heuristics will be sufficient for the general case.

@jseyfried

@llogiq Agreed -- I don't think heuristics are a good idea or will be sufficient.

I'd much prefer that we allow users to concatenate tokenstreams as @nikomatsakis and I simultaneously suggested here so that heuristics aren't needed when the decorated item doesn't need to be modified.

When the decorated item does need to be modified (e.g. #[serde(rename = "foo")]), we could still use heuristics, so I don't see a downside of allowing users to concatenate tokenstreams.

@alexcrichton
Member

@sfackler notes that it could be confusing to use the -L flag to find the librustc_macro, since that sounds like a linker flag? However, as @eddyb pointed out, -L is better thought of as a "library" path, in which case this makes sense.

@sfackler please correct me if I'm wrong, but I think concretely the worry here is that a procedural macro crate is discovered via -L. That is, crates compiled with the rustc-macro crate type are discovered via this flag. Normally the flag is only used for discovering libraries to link to.

I personally feel, though, that it's performing the same role as the rest of crate lookup: it's being used to find a particular crate on the filesystem; this one just happens to be in a different format. We already handle a bunch of this logic in the loader, so I'm not personally too worried about overloading the meaning here.

Plus it just means that this flag, which no one ever interacts with anyway, is easier for tools to pass to the compiler (as they just compile everything to one output directory).

@Ericson2314
Contributor

We already have -L kind=.... Let's just create a new kind for macros. This is important when cross-compiling to a very similar target, where rustc might otherwise get confused.

@Ericson2314
Contributor
Ericson2314 commented Aug 13, 2016 edited

For FCP, I suggested in #1681 (comment) that we preemptively namespace everything so we can deprecate it easily for 2.0, but I'm not sure anybody saw this.

@cgswords
cgswords commented Aug 15, 2016 edited

@ubsan @nikomatsakis I propose we call it libproc_macro, at least for now, because it's clear, it avoids ambiguity, and it doesn't tie it to rustc. Maybe it isn't great, but it's the obvious choice that avoids all the obvious problems.

On the topic of spans, the current TokenStream concat behavior already does the span handling that @jseyfried described (via combine_spans), so forcing it as part of 1.1 seems worse than just waiting for some stabilization. (That said, I suspect that heuristics will range from unreliable to inefficient, and that the push for at least this amount of stabilization should happen sooner rather than later.)

Sandboxing, in general, seems problematic. More precisely, some procedural macros rely on syscalls today (e.g. diesel, as I understand it, performs database operations during expansion to get structural information). If sandboxing is a thing, it needs to be toggleable (and making sandboxes a default with big scary warnings for non-sandbox wouldn't be the worst).

@nikomatsakis
Contributor

@jseyfried @llogiq

Agreed -- I don't think heuristics are a good idea or will be sufficient.

I sort of suspect this will prove to be true, but it seems like a minor point that can easily be addressed in stabilization to me.

@nikomatsakis
Contributor

@Ericson2314

For FCP, I suggested in #1681 (comment) that we preemptively namespace everything so we can deprecate it easily for 2.0, but I'm not sure anybody saw it.

Ah, yeah, sorry, I missed that. Will add to the summary.

@nikomatsakis
Contributor

@Ericson2314 Regarding namespacing, I see your point about predicting the future being considered difficult; that said, I think the API we are talking about here feels pretty minimal, and we could deprecate it easily enough and find alternative names if it came to that. Put another way, just because something will take a while to implement, doesn't necessarily mean that the big picture design is going to change that much. That said, I could easily see myself eating crow on this point.

A compromise I could imagine would be having the methods be attached to a trait, so we can deprecate the old trait and add a new one.

@nrc
Contributor
nrc commented Aug 18, 2016

I'm in favour of this. I'm not sure how I feel about the namespacing issue, I'd like to see how this looks when implemented.

@nikomatsakis
Contributor

Huzzah! The @rust-lang/lang team has decided to accept this RFC. We decided to ask @alexcrichton to move some of the detailed matters (e.g., prefixes, etc) into unresolved questions to be addressed during stabilization. Thanks everyone!

@nikomatsakis nikomatsakis referenced this pull request in rust-lang/rust Aug 22, 2016
Closed

Tracking issue for "Macros 1.1" (RFC #1681) #35900

50 of 53 tasks complete
@nikomatsakis nikomatsakis merged commit ac22573 into rust-lang:master Aug 22, 2016
@nikomatsakis
Contributor

Tracking issue: rust-lang/rust#35900

If you'd like to keep following the development of this feature, please subscribe to that issue, thanks! :)

@nikomatsakis nikomatsakis added a commit that referenced this pull request Aug 22, 2016
@nikomatsakis nikomatsakis Merge RFC #1681: Macros 1.1 8207010
@llogiq
Contributor
llogiq commented Aug 22, 2016

I still don't see my comment about the inferior traceability of code through macros 1.1, as compared to the current system, addressed anywhere. Could we at least add it to the drawbacks section?

@nikomatsakis
Contributor

@llogiq I pushed some text about spans

@llogiq
Contributor
llogiq commented Aug 22, 2016

Ok then. I'll keep an eye on the implementation. 😉

@untitaker untitaker added a commit to untitaker/rfcs that referenced this pull request Sep 11, 2016
@untitaker untitaker Fix a typo in RFC #1681 7bfa935
@hauleth hauleth added a commit to rust-num/num that referenced this pull request Sep 18, 2016
@hauleth hauleth Fix `num-macros` `FromPrimitive` implementation
Current solution follow syntax proposed in rust-lang/rfcs#1681.

Tracking issue rust-lang/rust#35900

Close #227
decfedb
@hauleth hauleth referenced this pull request in rust-num/num Sep 18, 2016
Merged

Fix `num-macros` `FromPrimitive` implementation #228

0 of 7 tasks complete
@hauleth hauleth added a commit to rust-num/num that referenced this pull request Sep 18, 2016
@hauleth hauleth Fix `num-macros` `FromPrimitive` implementation
Current solution follow syntax proposed in rust-lang/rfcs#1681.

Tracking issue rust-lang/rust#35900

Close #227
11f8289
@homu homu added a commit to rust-num/num that referenced this pull request Sep 30, 2016
@homu homu Auto merge of #228 - rust-num:fix/num-macros, r=cuviper 019c136
@phaazon phaazon pushed a commit to phaazon/rfcs that referenced this pull request Feb 10, 2017
@untitaker untitaker Fix a typo in RFC #1681 cd4e3f1