Proc macros: ability to refer to a specific crate/symbol (something similar to $crate) #54363

Open
LukasKalbertodt opened this issue Sep 19, 2018 · 19 comments
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) A-proc-macros Area: Procedural macros A-resolve Area: Path resolution C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@LukasKalbertodt
Member

LukasKalbertodt commented Sep 19, 2018

The problem

In macros-by-example we have $crate to refer to the crate the macro is defined in. This is very useful as the library author doesn't have to assume anything about how that crate is used in the user's crate (in particular, the user can rename the crate without breaking the world).
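As a small illustration of what $crate provides in macros-by-example, here is a sketch in a single file. The names mac and do_the_thing mirror the example used later in this post; user_code is a made-up module standing in for code at the call site.

```rust
/// Runtime function the macro wants to call.
pub fn do_the_thing() -> &'static str {
    "hello!"
}

/// `$crate` always resolves to the crate that defines the macro,
/// no matter what is in scope where the macro is invoked.
#[macro_export]
macro_rules! mac {
    () => {
        $crate::do_the_thing()
    };
}

mod user_code {
    // A same-named local item does not interfere with the expansion,
    // because the macro emits a `$crate`-rooted path.
    #[allow(dead_code)]
    pub fn do_the_thing() -> &'static str {
        "something else"
    }

    pub fn call() -> &'static str {
        mac!()
    }
}
```

Calling user_code::call() goes through the crate-root function, not the local one.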

In the new proc macro system we don't seem to have this ability. It's important to note that $crate alone won't be useful most of the time, though, because right now most crates using proc macros are structured like this:

  • foo-{macros/derive/codegen}: this crate is proc-macro = true and defines the actual proc macro.
  • foo: defines all runtime dependency stuff, has foo-{macros/derive/codegen} as dependency and reexports the proc macro.
  • The important part: the proc macro emits code that uses stuff from foo

An example:

foo-macros/src/lib.rs

#[proc_macro]
pub fn mac(_: TokenStream) -> TokenStream {
    quote! { ::foo::do_the_thing(); }
}

foo/src/lib.rs

pub fn do_the_thing() {
    println!("hello!");
}

If the macro instead emitted a relative path such as a bare do_the_thing(), the user would have to have do_the_thing in scope when calling mac!(); otherwise an error from inside the macro would occur. Not nice. Even worse: if the user has a do_the_thing in scope that is not from foo, strange things could happen.


So an equivalent of $crate would refer to the foo-{macros/derive/codegen} crate which is not all that useful, because we mostly want to refer to foo. The best way to solve this right now is to use absolute paths everywhere and hope that the user doesn't rename the crate foo to something else.

The proc macro needs to be defined in a separate crate and the main crate foo wants to reexport the macro. That means that foo-macros doesn't know anything about foo and thus blindly emits code (tokens) hoping that the crate foo is in scope.

But this doesn't sound like a very robust solution.

Furthermore, using the macro in foo itself (usually for testing) is not trivial. The macro assumes foo is an extern crate that can be referred to with ::foo. But that's not the case for foo itself. In one of my codebases I used a hacky solution: when the first token of the macro invocation is *, I emit paths starting with crate:: instead of ::foo::. But again, a better solution would be really appreciated.
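A rough model of the hack described above, using plain strings instead of a real TokenStream so that it runs outside a proc-macro crate. The function names root_path and expand are made up for illustration; only the `*` marker and the two path roots come from the description.

```rust
/// Splits off a leading `*` marker and picks the path root to emit.
/// Returns the root together with the remaining macro input.
fn root_path(macro_input: &str) -> (&'static str, &str) {
    let input = macro_input.trim_start();
    match input.strip_prefix('*') {
        // Invocation from inside `foo` itself: emit `crate::...` paths.
        Some(rest) => ("crate", rest.trim_start()),
        // Invocation from a downstream crate: emit `::foo::...` paths.
        None => ("::foo", input),
    }
}

/// Renders the code the macro would emit, as a string.
fn expand(macro_input: &str) -> String {
    let (root, _rest) = root_path(macro_input);
    format!("{root}::do_the_thing();")
}
```

This still hard-codes the name foo for downstream users, so it only addresses the "use the macro inside foo" half of the problem.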


How can we do better?

I'm really not sure, but I hope we can use this issue as a place for discussion (I hope I didn't miss any previous discussion on IRLO).

However, I have one idea: declaring dependencies of emitted code. One could add another kind of dependencies (apart from dependencies, dev-dependencies and build-dependencies) that defines what crates the emitted code depends on. (Let's call them emit-dependencies for now, although that name should probably be changed.) So those dependencies wouldn't be checked/downloaded/compiled when the proc macro crate is compiled, but the compiler could make sure that those dependencies are present in the crate using the proc macro.

I guess defining those dependencies globally per crate is not sufficient, since different proc macros could emit code with different dependencies. So maybe we could define the emit-dependencies per proc macro. But I'm not sure if that makes the check too complicated (because then Cargo would have to check which proc macros the user actually uses to collect a set of emit-dependencies).
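To make the per-macro variant concrete, here is a minimal sketch of the set that would have to be collected. Everything in it is hypothetical: emit-dependencies do not exist, and collect_emit_dependencies is an invented name, not something Cargo provides.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical: given the emit-dependencies declared per proc macro and
/// the macros a user crate actually invokes, return the union of crates
/// that must be present in the user crate.
fn collect_emit_dependencies<'a>(
    per_macro: &BTreeMap<&'a str, Vec<&'a str>>,
    used_macros: &[&str],
) -> BTreeSet<&'a str> {
    used_macros
        .iter()
        // Macros without declared emit-dependencies contribute nothing.
        .filter_map(|name| per_macro.get(*name))
        .flatten()
        .copied()
        .collect()
}
```

The hard part is not this union but knowing used_macros, which requires macro name resolution before dependency checking.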

That's just one idea I wanted to throw out there.


Related

@estebank estebank added the A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) label Jan 11, 2019
@bkchr
Contributor

bkchr commented Feb 11, 2019

Hey, I created proc-macro-crate, a crate that finds the name of a crate even if it was renamed in Cargo.toml. This helps with procedural macros and the extern crate trick.
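The general idea of looking up a possibly renamed dependency can be sketched with the standard library alone. This is a deliberately naive line-based scan, not the implementation or API of proc-macro-crate; find_crate_name is a made-up helper, and it ignores TOML sections, multi-line tables and workspaces.

```rust
/// Naive sketch: returns the identifier under which `package` can be
/// referred to, based on the dependency lines of a Cargo.toml.
fn find_crate_name(manifest: &str, package: &str) -> Option<String> {
    // A renamed dependency looks like: alias = { package = "foo", ... }
    let renamed = format!("package = \"{package}\"");
    for line in manifest.lines() {
        let Some((key, value)) = line.split_once('=') else {
            continue; // section headers, blank lines, etc.
        };
        let key = key.trim();
        if key == package || value.contains(&renamed) {
            // Hyphens in package names become underscores in Rust paths.
            return Some(key.replace('-', "_"));
        }
    }
    None
}
```

A proc macro could use the returned name to build the path it emits instead of hard-coding ::foo.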

@jonas-schievink jonas-schievink added C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Nov 21, 2019
@jonas-schievink jonas-schievink added the A-resolve Area: Path resolution label Feb 19, 2020
@Aaron1011 Aaron1011 added the A-proc-macros Area: Procedural macros label May 21, 2020
gdetal added a commit to gdetal/rust_cmd_lib that referenced this issue Oct 16, 2020
Programs using the cmd_lib crate requires to include the cmd_lib_core as
the macros are procedural macros which do not yet support $crate as for regular
macros (See rust-lang/rust#54363).

This commit re-export the procedural macros as well as core/macro libraries. It
also makes sure that the cmd_lib_core is not required by a depending crate.

Signed-off-by: Gregory Detal <gregory.detal@tessares.net>
@alercah
Contributor

alercah commented Jan 29, 2022

It's also just annoying to have to make a separate crate for macros, especially given that the predominant style seems to be to reexport them so that the user doesn't need to fuss with managing the dependencies.

No matter what, I suspect this is blocked on proper def-site hygiene support. But in light of the annoyance of making a separate crate, I'll propose another alternative: allow proc macros to be defined in the same crates as regular items.

Obviously this isn't immediately workable, because of the possibility of circular dependencies. So we would need some kind of multi phase compilation. But I think, without much knowledge of the details, this might be viable without being too intrusive to the compiler. Specifically I propose the following:

  1. There are two named phases to compiling a crate: for the sake of an argument, call them macro and final.
  2. There is a phase attribute available to enable conditional compilation. It accepts a comma-separated list of phases and is inherited through scopes unless overridden, with a default of #[phase(final)]. There can also be a phase key for cfg to cover use cases that the new attribute doesn't.
    1. Optionally, for ergonomics, items not compiled in the current phase participate in name resolution, but it is an error to refer to them.
  3. Proc macro declarations ignore the inherited phase attribute and can't have an explicit phase marking of their own. They always declare a macro into their immediate scope (in the macro namespace of course), but exact behaviour depends on phase.
    1. In the macro phase, the declared macro is simply an error to call, and the definition is used to compile the macro. The placeholder macro exists to produce better error messages than "name doesn't exist" and to avoid accidentally invoking a macro that you didn't realize was imported from a glob import and that would have been shadowed. (Note: this can probably be done with minimal compiler support by having the macro phase version be a normal macro that just expands into a compile_error! invocation?)
    2. In the final phase, the proc macro definition is ignored, and the declared macro name refers to the macro compiled during the macro phase, as if by a use. The visibility of the macro name is determined by the visibility of the function defining it.
  4. For hygiene, def-site spans in a proc macro are considered to be in the final version of the crate, and therefore can refer to other names in the crate.
    1. For convenience, the proc_macro crate provides a quote_local_path! macro, implemented via intrinsic, that accepts a local path and produces a hygienic, implicitly-delimited TokenStream referring to that path. Possibly the quote! macro could also have a syntax for this.
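The phase rules in the list above can be modelled in a few lines. This is only a toy model of the proposal: neither the phase attribute nor any of these types exist in rustc, and all names are invented for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Macro,
    Final,
}

/// Items without an explicit attribute default to `#[phase(final)]`.
const DEFAULT_PHASES: &[Phase] = &[Phase::Final];

/// Whether an ordinary item is compiled in the current phase.
fn is_compiled(explicit_phases: Option<&[Phase]>, current: Phase) -> bool {
    explicit_phases.unwrap_or(DEFAULT_PHASES).contains(&current)
}

/// What a proc macro's name is bound to in each phase.
#[derive(Debug, PartialEq, Eq)]
enum MacroBinding {
    /// Macro phase: a placeholder that is an error to call.
    ErrorPlaceholder,
    /// Final phase: the macro that was compiled during the macro phase.
    CompiledInMacroPhase,
}

fn proc_macro_binding(current: Phase) -> MacroBinding {
    match current {
        Phase::Macro => MacroBinding::ErrorPlaceholder,
        Phase::Final => MacroBinding::CompiledInMacroPhase,
    }
}
```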

@lovasoa
Contributor

lovasoa commented Sep 4, 2022

Another problem that arises from this issue that I think hasn't been mentioned here is when the proc-macro is re-exported.
If we have a crate a which contains a proc-macro, and a crate b that depends on a and re-exports the macro (with pub use a::my_macro), then code that depends on b will not have ::a in scope. This results in hard-to-troubleshoot issues, since users of b don't even know about crate a.

@jhpratt
Member

jhpratt commented Sep 5, 2022

Completely forgot about this issue. Basically there were two reasonably significant issues with the approach I wanted to take that would need to be addressed for any solution.

  • How are ambiguities between multiple versions resolved? It is perfectly legal to have multiple (incompatible) versions of a crate as dependencies.
  • Ditto for crates of the same name from different registries? While crates.io is by far the most common, different registries can have crates with identical names and versions but different underlying code.

My personal view is that the former is trivially solvable: allow an optional version to be specified. The latter is quite a bit more difficult, and I have no solutions to propose.

@lovasoa
Contributor

lovasoa commented Sep 5, 2022

@jhpratt
I think I don't understand where the problem is. If a proc macro could use $crate, it would refer to its accompanying library crate, which is unique.

@SimonSapin
Contributor

Is it unique? As far as I understand "accompanying library crate" is not a concept that really exists in rustc or even cargo. It’s just a convention.

@jhpratt
Member

jhpratt commented Sep 5, 2022

What if the proc macro is re-exported from a third crate? The problem is not as simple as you'd think. What if you want to reference some arbitrary crate? I have a proc macro that needs ::serde, but that's not my crate.

@Nemo157
Member

Nemo157 commented Sep 5, 2022

Something I've long thought about is the ability to explicitly declare runtime-dependencies for a proc-macro. That would forward-declare a crate that the macro will generate code to reference (and so avoid dependency loops), and somehow give it some TokenTree through which it can refer to the crate in the generated code.

EDIT: It is possible to support multiple crates even if there is a single $crate though, you just need to have your runtime crate re-export all the other dependencies you need.
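A sketch of that re-export pattern, written as a single file so it can be compiled on its own. The __private module name is a common convention rather than anything required, and std::collections::BTreeMap merely stands in for a third-party dependency such as serde.

```rust
/// Hidden module in the runtime crate that re-exports everything the
/// generated code needs, so one crate root is enough to reach it all.
#[doc(hidden)]
pub mod __private {
    // Stand-in for something like `pub use serde;`.
    pub use std::collections::BTreeMap;
}

/// Generated code reaches the dependency through the runtime crate,
/// so the user does not need to depend on it directly.
#[macro_export]
macro_rules! empty_map {
    () => {
        $crate::__private::BTreeMap::<u32, u32>::new()
    };
}
```

A proc macro would emit the same kind of path, with whatever token it has for the runtime crate in place of $crate.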

@alercah
Contributor

alercah commented Sep 5, 2022 via email

@Nemo157
Member

Nemo157 commented Sep 5, 2022

It's not just hygiene. The proc-macro is built for a different target, so it will have to do something similar to how a crate depending on a proc-macro crate implicitly creates a different kind of dependency edge. And having it just naïvely depend on the runtime crate creates dependency loops if the runtime crate then depends on the macro crate to re-export it. That's why I think it needs some way to do forward declaration of dependencies between cargo and rustc for these "dependencies, but not really".

@alercah
Contributor

alercah commented Sep 5, 2022

Yeah, that's the hard part. The name lookup is the part that is not hard.

@WaffleLapkin
Member

So, runtime-dependencies would create a pseudo-crate which depends on all the runtime-dependencies and make $crate output by the macro refer to it? That sounds nice, although the pseudo-crate will still need special handling all the way down, since nothing can depend on it (otherwise there will be a cycle if it depends on something).


Another way is to move macro definition to the normal crate. A hacky idea that comes to mind is something like

reexport_macro_setting_dollar_crate_to_here! { the_crate_macros::the_macro }

i.e. add a built-in macro that creates a new macro setting $crate to the current crate. This also solves

To some extent this can be simulated even on stable, I think:

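// main crate, which re-exports the macro crate as `the_crate_macros`
#[macro_export]
macro_rules! the_macro {
    // `$crate` is expanded early it seems, and can be reinterpreted as a path
    // I've tested with `macro_rules!` and this still works if this macro is used from outside
    ($($args:tt)*) => { $crate::the_crate_macros::the_macro!($crate; $($args)*) }
}

// macro crate
use proc_macro::{TokenStream, TokenTree};

#[proc_macro]
pub fn the_macro(ts: TokenStream) -> TokenStream {
    let mut tokens = ts.into_iter();
    // Everything before the first `;` is the path to the main crate
    // (`take_while` also consumes the `;` itself).
    let krate: TokenStream = tokens
        .by_ref()
        .take_while(|t| !matches!(t, TokenTree::Punct(p) if p.as_char() == ';'))
        .collect();
    // The remaining tokens are the user's actual arguments.
    let args: TokenStream = tokens.collect();

    // ...
}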

But, this is very limited -- for attribute and derive macros this won't work, as there is no syntax to define them in the normal crate.


A less hacky way would require defining macro sub-crates in-tree, which I think was discussed on Zulip. But that's a much bigger feature, I think.

@Nemo157
Member

Nemo157 commented Sep 5, 2022

To some extent this can be simulated even on stable, I think

I can confirm this works just fine for expression macros, I use it in stylish to get a re-exportable format_args! proc-macro. (It doesn't actually get expanded early, it gets passed in as a $crate ident which obeys its hygiene to determine which crate it refers to, you can get arbitrary crate access from one $crate token by changing its span).

@lovasoa
Contributor

lovasoa commented Sep 5, 2022

Yes, this is a solution for function-like proc macros, but it doesn't work for derive macros, does it?

@WaffleLapkin
Member

But, this is very limited -- for attribute and derive macros this won't work, as there is no syntax to define them in the normal crate.

But rustc could do a similar(-ish) trick for them, I think. We "just" need to design and implement it.

@dtolnay
Member

dtolnay commented Sep 6, 2022

#54363 (comment) is along the lines of what I would want in pretty much all of my macro libraries. Something like:

# Cargo.toml

[package]
name = "serde_derive"

[lib]
proc-macro = true

[dependencies]
proc-macro2 = "1"
quote = "1"
syn = "1"

[build-dependencies]
autocfg = "1"

[dev-dependencies]
serde = "1"
trybuild = "1"

[macro-dependencies]
serde = "1"

// src/lib.rs

use proc_macro::TokenStream;

#[proc_macro_derive(Serialize)]
pub fn derive_serialize(input: TokenStream) -> TokenStream {
    let serde /*: proc_macro::Ident */ = proc_macro::dependency("serde");
    quote! {
        impl #serde::Serialize for}
}

In terms of the Cargo build graph, this does not say serde needs to finish (or even start) building before serde_derive can start building, unlike ordinary dependencies. It says serde needs to finish building (the rmeta, not necessarily codegen) before anything that depends on serde_derive can begin building, except serde itself:

  • If serde depends on serde_derive and calls this macro (it doesn't, but let's pretend) then the Ident that gets returned by proc_macro::dependency("serde") inside that expansion needs to behave just like a $crate that came from a macro_rules inside serde would behave.

  • If some other crate depends on serde_derive (directly or transitively) and calls this macro, the Ident is as though the downstream crate had its own direct dependency on __unnameable = { package = "serde", version = "1" } and obtained a $crate from it.
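The build-ordering rule described above can be written down as a small function. This is only a model of the rule as stated in this comment, not how Cargo schedules anything; implied_build_edges and its parameters are invented names.

```rust
/// Hypothetical: for a proc-macro crate with the given macro-dependencies,
/// returns `(waiter, prerequisite)` pairs: each crate that depends on the
/// proc-macro crate (directly or transitively) must wait for every
/// macro-dependency, except that a macro-dependency never waits on itself.
fn implied_build_edges(
    macro_dependencies: &[&str],
    dependents_of_macro_crate: &[&str],
) -> Vec<(String, String)> {
    let mut edges = Vec::new();
    for dependent in dependents_of_macro_crate {
        for dep in macro_dependencies {
            // `serde` depending on `serde_derive` must not wait on `serde`.
            if dependent != dep {
                edges.push((dependent.to_string(), dep.to_string()));
            }
        }
    }
    edges
}
```

With macro-dependencies of ["serde"] and dependents ["serde", "my_app"], the only edge is that my_app waits for serde, so no cycle is introduced.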

The discussion above about "what if multiple versions" and "what if different registries" doesn't seem applicable to this solution. The macro-dependencies describes a particular version just as a dependency or dev-dependency would do, and implicitly or explicitly a registry, and integrates nicely with Cargo patch. For example [patch.crates-io] serde = { path = "…" } would apply to that macro-dependency exactly as it would apply to an ordinary dependency.

@jhpratt
Member

jhpratt commented Sep 6, 2022

@Nemo157 @dtolnay Love it. Both questions/problems I stated are inherently resolved by using Cargo.toml, which is something I honestly never considered.

@blueforesticarus

Based on this discussion over def_site in proc macros. #54724 (comment)

Would it be appropriate for @dtolnay's macro-dependencies suggestion to put the dependencies into the def_site namespace?

@Kixunil
Contributor

Kixunil commented Oct 25, 2023

Note that the same problem occurs in build script dependencies. prost/prost-build, tonic/tonic-build, configure_me/configure_me_codegen... I think both need to be solved and probably doing it the same way is the simplest option.
