RFC: Eager Macro Expansion #2320

Open
wants to merge 30 commits into
base: master
from

pierzchalski commented Feb 2, 2018

Rendered.

@pierzchalski pierzchalski referenced this pull request Feb 3, 2018

Open

Tracking issue: declarative macros 2.0 #39412

9 of 19 tasks complete

alexreg commented Feb 3, 2018

So the idea would be to implement the lift macro you mentioned in rust-lang/rust#39412 (comment) using this macro expansion API?

alexreg commented Feb 3, 2018

@pierzchalski Incidentally, you probably want to CC/assign @jseyfried to this PR.

Author

pierzchalski commented Feb 3, 2018

@alexreg Whoops! Done.

alexreg commented Feb 4, 2018

On second thought, maybe better to CC @petrochenkov given @jseyfried's long-term absence?

Contributor

petrochenkov commented Feb 5, 2018

maybe better to CC @petrochenkov

Sorry, can't say anything useful here, I haven't written a single procedural macro in my life and didn't touch their implementation in the compiler either.

Author

pierzchalski commented Feb 5, 2018

This is a language/compiler RFC, so I guess @nikomatsakis and @nrc are two other people to CC. Is there anyone else who would be interested?

alexreg commented Feb 5, 2018

@petrochenkov Oh, sorry. I gathered from your comments on the declarative macros 2.0 RFC that you knew something of the macros system in general. My bad.


* Greatly increases the potential for hairy interactions between macro calls. This opens up more of the implementation to be buggy (that is, by restricting how macros can be expanded, we might keep implementation complexity in check).

* Relies on proc macros being in a separate crate, as discussed in the reference level explanation [above](#reference-level-explanation). This makes it harder to implement any future plans of letting proc macros be defined and used in the same crate.

Centril Feb 5, 2018

Contributor

I'd like to highlight this drawback. Are the gains in this RFC enough to outweigh this drawback?

alexreg Feb 5, 2018

Indeed, why does it require a separate crate for proc macros? Can you elaborate?

pierzchalski Feb 5, 2018

Author

Thinking about it more, this expansion API doesn't add any extra constraints to where a proc macro can be defined, so I guess this shouldn't really be here.

Originally I was worried about macro name resolution (I thought having proc macros in a separate crate at the call site would make that easier but given that there are other issues involving macro paths this seems redundant to worry about), and collecting definitions in an 'executable' form.

Declarative macros can basically be run immediately after they're parsed because they're all compositions of pre-existing built-in purely-syntactic compiler magic. Same-crate procedural macros would need to be 'pre-compiled' like they're tiny little inline build.rs files scattered throughout your code. I thought this would interact poorly in situations like this:

#[macro_use]
extern crate some_crate;

#[proc_macro]
fn my_proc_macro(ts: TokenStream) -> TokenStream { ... }

fn main() {
    some_crate::a_macro!(my_proc_macro!(foo));
}

How does some_crate::a_macro! know how to expand my_proc_macro!?

In hindsight, this is just a roundabout way of hitting an existing problem with same-crate proc macros:

// Not a proc-macro.
fn helper(ts: TokenStream) -> TokenStream { ... }

#[proc_macro]
fn a_macro(ts: TokenStream) -> TokenStream {
    let helped_ts = helper(ts);
    ...
}

fn main() {
    a_macro!(foo);
}

Same question: how does a_macro! know how to evaluate helper? I think whatever answer we find there will translate to this macro expansion problem.

Anyway, I'm now slightly more confident that that particular drawback isn't introduced by this RFC. Should I remove it?

alexreg Feb 5, 2018

Yeah, I'd tend to agree with that assessment. Is there an RFC open for same-crate proc macros currently? If so, I'd be curious to read it over.

pierzchalski Feb 6, 2018

Author

I remember reading some fleeting comments about it, but I just had a quick look around and I can't find anything about plans for it.

Centril Feb 6, 2018

Contributor

I'm no expert wrt. proc macros. I'd also be interested in any resources wrt. same-crate macros.

Thanks for the detailed review and changes =)

alexreg Feb 6, 2018

@pierzchalski On a related note, my WIP PR can be found here: rust-lang/rust#47992 (comment). I'm going to make another big commit & push in an hour I think.

@pierzchalski pierzchalski changed the title from "Add macro expansion API to proc macros" to "RFC: Add macro expansion API to proc macros" Feb 5, 2018

Update 0000-proc-macro-expansion-api.md
Remove 'same crate proc macro' drawback and replace it with discussion under reference explanation, since it's an issue that isn't introduced by this RFC and will also probably share a solution.

@sgrif sgrif added the T-lang label Feb 8, 2018


Built-in macros already look more and more like proc macros (or at the very least could be massaged into acting like them), and so they can also be added to the definition map.

Since proc macros and `macro` definitions are relative-path-addressable, the proc macro call context needs to keep track of what the path was at the call site. I'm not sure if this information is available at expansion time, but are there any issues getting it?

jseyfried Feb 9, 2018

Yeah, this information is available at expansion time. Resolving the macro shouldn't be a problem.

Author

pierzchalski commented Feb 9, 2018

I just realised that one of the motivations for this feature (the lift! macro alluded to by @alexreg) wouldn't actually be made possible by this RFC. lift! needs to lift the contained macro up two levels:

#[proc_macro]
fn lift(ts: TokenStream) -> TokenStream {
    let mut mac_c = ...;
    mac_c.call_from(...);
    //              ^^^
    // This needs to be the span/scope/context of, in this
    // example, `main`: the caller of `m`, which is the caller of `lift!`.
    ...
}

macro m() {
    lift!(m_helper!()); // Should set the caller context of `m_helper!` to
                        // caller context of `m!`.
}

fn main() {
    m!();
}

But the current Span API doesn't allow such shenanigans. @jseyfried, does the RFC you mentioned here hold any hope? How exciting a change is it?

alexreg commented Feb 9, 2018

@pierzchalski Yeah, it looks like either we'd have to bake this lift macro into the compiler, or extend the proc macro API (ideally to provide a whole stack of syntax contexts for macro expansions).

Contributor

llogiq commented Mar 9, 2018

Good job! I've wanted a solution for this for some time. I see but two possible problems with the solution this RFC PR suggests:

  1. If we have multiple procedural macros, their order of execution may change the result. Consider proc_macro_a, which wants to ignore macros, just passing ExprMac nodes unchanged, whereas proc_macro_b will expand them. Now if proc_macro_a runs before proc_macro_b, all is well and the macro authors don't need to care about what could have led to the result.
    However, if proc_macro_b runs before proc_macro_a, the latter will only see the expansion of the expressions, and now proc_macro_a's author will have to worry about whether an expression comes from an expanded macro.
    A simple solution would be to extend the registry API so that proc macros can register themselves as pre-expansion or post-expansion. Pre-expansion macros won't be allowed to fold an Expr to something expanded (which would need a marker and detection visitor), while post-expansion macros will see the expressions after macro expansion (and could find out what led to this particular code via the expansion info).
    A possible extension would be to introduce a third during-expansion category, whose macros are allowed to expand macros, but may get the AST at any stage in the expansion chain.
  2. Compiler-internal macros may expand to something that is not allowable outside the compiler (see __unstable_column!() for example). Expanding it from within a macro could
    • fail, as is currently the case. The macro will abort with a panic. This is suboptimal for obvious reasons.
    • return a Result that may contain an error object of some sort. This is still not optimal since, for example, the vec![] macro contains such a thing, and it is one we likely want to have expanded. We can probably deal with that by making the error object return the expanded result up to the expansion which caused the error, which should suffice for most cases.
    • go through and allow proc macro authors to reach into the compiler internals. This is not something we want to stabilize, ever.

Author

pierzchalski commented Apr 3, 2018

@llogiq sorry for the late reply!

I'm not sure what point you're trying to make in (1) - if I change the order of two macro calls, I don't really expect the same result in general, similar to if I change the order of two function calls. Do you have a concrete example of a proc macro which wants to ignore/pass-through macro nodes but which also cares if an expression comes from a macro expansion?

Also re. (1), I'm not overly familiar with the expansion process but as far as I understand and recall, the current setup is recursive fixpoint expansion, which makes it hard to have cleanly delineated pre- and post-expansion phases for macros to register themselves for. Can you clarify how these would work in that context?

Regarding (2), one dodgy solution is to have the macro expansion utility functions be internals-aware by having a blacklist of "do not expand" macros, but that's pretty close to outright stabilising them.

Contributor

llogiq commented Apr 3, 2018

To answer (2), in mutagen, I'd like to avoid mutating assert! and similar macros, so I'm interested not only if code comes from a macro, but also which one. On the other hand, I'd like to mutate other macro calls, e.g. vec![..] or println!(..). This should also explain (1), because mutagen, as a procedural macro, may see a mixture of pre- and post-expansion macro calls, and cannot currently look into the former.

I'm OK with getting the resulting code if I also get expansion info, and also get a way of expanding macros so I can look into them.

Author

pierzchalski commented Apr 3, 2018

So I don't know what changes @jseyfried is making to how contexts and scopes are handled, but I agree that sounds like the right place to put this information (about how a particular token was created or expanded).

Putting it in spans definitely sounds more workable than trying to wrangle invocations to guarantee you see things pre- or post-expansion, but it also means doing a lot more design work to identify what information you need and in what form.

Contributor

llogiq commented Apr 5, 2018

One thing I think we need is a way for proc macros to mark what they changed (and for quote! to use it automatically).

@nrc nrc self-assigned this Apr 30, 2018

Member

nrc commented May 1, 2018

I just realised that one of the motivations for this feature (the lift! macro alluded to by @alexreg) wouldn't actually be made possible by this RFC. lift! needs to lift the contained macro up two levels:

iiuc, lift is eager expansion? That was covered by #1628 for declarative macros, which I still think is a nice thing to add. If we did add it for decl macros, then we should do something for proc macros too.

Member

nrc commented May 1, 2018

re compiler internals and inspection, I would expect that the results of expansion would be a TokenStream and that could be inspected to see what macro was expanded (one could also inspect the macro before expansion to get some details too). I would expect that 'stability hygiene' would handle access to compiler internals, and that the implementation of that would not allow macro authors to arbitrarily apply that to tokens.

Member

nrc commented May 1, 2018

Thanks for this RFC @pierzchalski! I agree that this is definitely a facility we want to provide for macro authors. My primary concern is that this is a surprisingly complex feature and it might be better to try and handle a more minimal version as a first iteration. It might be a good idea to try and avoid any hygiene stuff in a first pass (but keep the API future-compatible in this direction), that would work well with the macros 1.2 work.

It is worth considering how to handle expansion order (although it might be worth just making sure we are future-compatible, rather than spec'ing this completely). Consider the following macro uses:

foo!(baz!());
bar!(); // expands to `macro baz() {}`

If foo is expanded before bar, then baz won't be defined and building will fail. However, if baz! were written directly in the program it would succeed - https://play.rust-lang.org/?gist=32998f65348efbeffdfbe106b0063eeb&version=nightly&mode=debug

Then consider a macro that wants to expand two macros where one is defined by the other - it might be nice if the macro could try different expansion orders. I think all that is needed is for the compiler to tell the macro why expansion failed - is it due to a failed name lookup, or something going wrong during the actual expansion stage.

Which brings to mind another possible problem - what happens if the macro we're expanding panics? Should that be caught by the compiler or the macro requesting expansion?
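A toy model of the "try a different order" idea, in plain Rust rather than the real expander. `ExpandError`, `Call`, `expand`, and `expand_all` are hypothetical names invented for this sketch; nothing here is part of `proc_macro`. It assumes only that a failed expansion reports whether the cause was a name-lookup failure, so the requesting macro can retry that call later.

```rust
use std::collections::HashSet;

// Hypothetical failure reasons; not part of any real API.
#[derive(Debug, PartialEq)]
enum ExpandError {
    // The macro name could not be resolved (maybe it is not defined *yet*).
    UnresolvedName(&'static str),
    // The macro was found but failed while expanding.
    #[allow(dead_code)]
    Failed(&'static str),
}

// Toy macro call: its name, plus the macros its expansion defines.
struct Call {
    name: &'static str,
    defines: &'static [&'static str],
}

// Toy expander: succeeds only if the name is already defined.
fn expand(call: &Call, defined: &mut HashSet<&'static str>) -> Result<(), ExpandError> {
    if !defined.contains(call.name) {
        return Err(ExpandError::UnresolvedName(call.name));
    }
    defined.extend(call.defines.iter().copied());
    Ok(())
}

// Retry calls that failed name lookup until a full pass makes no progress.
fn expand_all(
    calls: &[Call],
    defined: &mut HashSet<&'static str>,
) -> Result<Vec<&'static str>, ExpandError> {
    let mut pending: Vec<&Call> = calls.iter().collect();
    let mut order = Vec::new();
    while !pending.is_empty() {
        let mut retry = Vec::new();
        let mut progressed = false;
        for call in pending {
            match expand(call, defined) {
                Ok(()) => {
                    order.push(call.name);
                    progressed = true;
                }
                Err(ExpandError::UnresolvedName(_)) => retry.push(call),
                Err(other) => return Err(other),
            }
        }
        if !progressed {
            // Nothing new was defined, so the lookup failure is final.
            return Err(ExpandError::UnresolvedName(retry[0].name));
        }
        pending = retry;
    }
    Ok(order)
}

fn main() {
    // `baz!()` is requested first, but only `bar!()` defines `baz`.
    let calls = [
        Call { name: "baz", defines: &[] },
        Call { name: "bar", defines: &["baz"] },
    ];
    let mut defined = HashSet::from(["bar"]);
    println!("{:?}", expand_all(&calls, &mut defined));
}
```

Without the distinction between the two error variants, the caller could not tell a "not defined yet" failure from a genuinely broken macro, and retrying would be guesswork.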

Member

nrc commented May 1, 2018

Is there prior art for this? What do the Scheme APIs for this look like?

The full API provided by `proc_macro` and used by `syn` is more flexible than suggested by the use of `parse_expand` and `parse_meta_expand` above. To begin, `proc_macro` defines a struct, `MacroCall`, with the following interface:

```rust
struct MacroCall {...};

nrc May 2, 2018

Member

Without getting too deep into a bikeshed, I think something like ExpansionBuilder would be a better name

fn new_attr(path: TokenStream, args: TokenStream, body: TokenStream) -> Self;
fn call_from(self, from: Span) -> Self;

nrc May 2, 2018

Member

I think we should leave this to a later iteration

fn call_from(self, from: Span) -> Self;
fn expand(self) -> Result<TokenStream, Diagnostic>;

nrc May 2, 2018

Member

The error type should probably be an enum of different ways things can go wrong, and where there are compile errors we probably want a Vec of Diagnostics, rather than just one.

```

The functions `new_proc` and `new_attr` create a procedural macro call and an attribute macro call, respectively. Both expect `path` to parse as a [path](https://docs.rs/syn/0.12/syn/struct.Path.html) like `println` or `::std::println`. The scope of the spans of `path` is used to resolve the macro definition. This is unlikely to work unless all the tokens have the same scope.

nrc May 2, 2018

Member

Overall, I really like the idea of using a Builder API - it keeps things simple and is future-proof
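To make the review suggestions above concrete, here is a rough mock of what a builder with an enum error type might look like. Everything in it is an assumption for illustration: `Tokens`, `Diagnostic`, `ExpansionError`, and the toy `join` macro are stand-ins written in plain Rust, not the real `proc_macro` types or the RFC's final API. `call_from` is left out, following the suggestion to defer it.

```rust
// Stand-ins for illustration only; the real `proc_macro` types are not used.
type Tokens = String;

#[derive(Debug, PartialEq)]
struct Diagnostic(String);

// Hypothetical error enum: one variant per way expansion can go wrong.
#[derive(Debug, PartialEq)]
enum ExpansionError {
    // The macro path did not resolve to a definition.
    UnresolvedPath(Tokens),
    // The macro ran but reported one or more compile errors.
    CompileErrors(Vec<Diagnostic>),
}

// Hypothetical builder, using the name suggested in the review.
struct ExpansionBuilder {
    path: Tokens,
    args: Tokens,
}

impl ExpansionBuilder {
    fn new_proc(path: &str, args: &str) -> Self {
        ExpansionBuilder { path: path.to_string(), args: args.to_string() }
    }

    // Toy expansion: only a made-up `join` macro is known, which glues
    // its comma-separated arguments together.
    fn expand(self) -> Result<Tokens, ExpansionError> {
        if self.path != "join" {
            return Err(ExpansionError::UnresolvedPath(self.path));
        }
        if self.args.trim().is_empty() {
            let diag = Diagnostic("join! needs at least one argument".to_string());
            return Err(ExpansionError::CompileErrors(vec![diag]));
        }
        Ok(self.args.split(',').map(str::trim).collect())
    }
}

fn main() {
    println!("{:?}", ExpansionBuilder::new_proc("join", "foo, _, bar").expand());
    println!("{:?}", ExpansionBuilder::new_proc("nope", "").expand());
}
```

Because each optional setting would be its own builder method, later additions (such as a deferred `call_from`) would not change the signatures of `new_proc` or `expand`.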

Author

pierzchalski commented Mar 10, 2019

Alright, after some long and fruitful discussions with @alexreg we've got an updated RFC (no longer an eRFC) that focuses on and justifies what we think is the "least exciting" proposal that still gets us useful eager expansion.

@pierzchalski pierzchalski changed the title from "eRFC: Macro Expansion for Macro Input" to "RFC: Macro Expansion for Macro Input" Mar 10, 2019

alexreg commented Mar 10, 2019

Indeed, @pierzchalski has worked long and hard over this proposal, and having reviewed some drafts and had some nice discussions with him about this, we've settled on what we think is a reasonably conservative but still powerful extension to the language in terms of eager expansion. While this RFC does not of course go into implementation details, I think the idea is that this can be implemented mainly within the proc_macro crate, with a few small extensions to rustc.

@pnkfelix Would you mind taking a look and FCPing this for merge, if you're reasonably happy with the state of things?

Contributor

llogiq commented Mar 10, 2019

I've in the meantime had discussions with @matklad who leads the rust-analyzer effort and he prefers a less powerful approach (basically just mark proc_macros with an attribute so they get their arguments already expanded) that is easier to implement and more cache-friendly.

Perhaps we should use that proposal as a stepping stone to a more full-fledged implementation?

(Caveat: I'm just a proc_macro author who wants expanded macros somehow)

@pierzchalski pierzchalski changed the title from "RFC: Macro Expansion for Macro Input" to "RFC: Eager Macro Expansion" Mar 10, 2019

alexreg commented Mar 10, 2019

@llogiq This is already quite toned-down I feel, and has several notable advantages over that proposal, above all the ability to have eager expansion in declarative macros, apart from greater flexibility. A lot of thought and planning has gone into this already, and several iterations (including inspiration from nrc's prior work). I personally think what we have here represents the best current state of work on the problem, though I of course encourage comments and feedback (now or at FCP stage), and I would imagine @pierzchalski would appreciate that too.

Author

pierzchalski commented Mar 11, 2019

On an entirely separate note, I'm currently hunting for prior art for Rust's 'macros 2.0' expansion order (there's plenty of prior art for the hygiene system, and from the right viewpoint Racket looks similar to macros 2.0, but I'm having trouble finding work on eager expansion). There's some discussion on the subreddit, if anyone wants to contribute there or here.

@pierzchalski pierzchalski force-pushed the pierzchalski:macro-expansion-api branch from e4f3d04 to 0669405 Mar 12, 2019

@pierzchalski pierzchalski force-pushed the pierzchalski:macro-expansion-api branch from 0669405 to 3dd9545 Mar 12, 2019

samth commented Mar 13, 2019

@pierzchalski (I think) asked me to comment on Reddit, so here are some thoughts as someone who's thought deeply about macros but is an outsider to Rust. (I also talked to @nikomatsakis about this a long time ago but I don't think I had as coherent thoughts then.)

Here's what I think the motivation is, phrased in a less-Rust-specific way, mostly to help me understand/talk about it.

  1. In some cases, you want to compute the argument to a macro. One way to do that is to write that argument as an expression that would macro-expand to something that's syntactically what you want, and then run the macro expander until you get that. Note that this is limited to things that can be represented as the result of macro expansion (ie, you can't compute an actual vector).
  2. It would be nice to be able to perform a little computation, often for generating identifiers, while staying entirely within pattern-based macros.
  3. Partial reuse of other macros. Sometimes another macro is helpful but you want to take apart the pieces of its expansion to re-arrange them. (There isn't actually an example of this in the RFC.)
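As a small illustration of point 1 on stable Rust: a user-defined macro receives the tokens of a nested macro call, not its expansion. The macro `show_input!` and the helper functions are made-up names for this sketch.

```rust
// A user macro that just reports the tokens it was given.
macro_rules! show_input {
    ($($t:tt)*) => { stringify!($($t)*) };
}

// The macro sees the unexpanded `concat!` call, not the string "foo_bar".
fn seen_by_macro() -> &'static str {
    show_input!(concat!("foo", "_", "bar"))
}

// Outside of a macro argument, the same call expands as usual.
fn expanded_normally() -> &'static str {
    concat!("foo", "_", "bar")
}

fn main() {
    println!("macro input: {}", seen_by_macro());
    println!("expanded:    {}", expanded_normally());
}
```

The exact spacing that `stringify!` produces is not guaranteed, so the sketch only relies on the fact that the text still contains the `concat` call.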

I would add some other use cases:

  1. Analyze the results of macro expansion. If you want to write a macro that performs some sort of analysis as part of its transformation, you need to be able to expand away macros. This is one of the most common uses of calling expand in Lisp macro systems. A system I built, Typed Racket, works this way, as described in these two papers.
  2. Allow macro clients to use different abstractions together. For example, say you have a macro which expects a struct declaration as an argument. But if your client has defined another macro which expands into a struct, they might want to use these two macros together. If your macro just looks literally for struct, then this won't work. Instead, what you want is to expand to find the struct form, and then process the results. This is the "work together" from the paper that describes Racket's local-expand form. Note that annotations in Rust make this use case less needed.
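The "looks literally for struct" problem in point 2 can be sketched with a declarative macro. `kind_of!` and `make_point_struct!` are hypothetical names; the second is never defined, because its tokens are only pattern-matched, never expanded.

```rust
// Classifies its input by looking literally for the `struct` keyword.
macro_rules! kind_of {
    (struct $name:ident $($rest:tt)*) => { "struct" };
    ($($other:tt)*) => { "unknown" };
}

// A struct written out directly is recognised.
fn literal_struct() -> &'static str {
    kind_of!(struct Point { x: u32, y: u32 })
}

// A client macro that *would* expand to a struct is not recognised,
// because `kind_of!` only ever sees the unexpanded call tokens.
fn via_client_macro() -> &'static str {
    kind_of!(make_point_struct!(Point))
}

fn main() {
    println!("{} / {}", literal_struct(), via_client_macro());
}
```

Expanding the argument first, as local-expand does in Racket, is what would let the first arm match in both cases.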

How do people solve these problems in other languages with macros

Here I'll mostly describe Racket, which has not only the fanciest macro system but is what I know best.

Computing arguments

I think there are two main solutions here. One is just write another macro that expands into the macro call you want. That is, for the env! example, write env_concat!. The second is to re-think the problem, and realize that probably what you want isn't so much macro expansion as regular evaluation, but at compile time -- concatenating two strings is just the beginning. Then define an abstraction that takes an expression computing the thing you want.

Here's an example macro showing how I'd do env allowing compile-time evaluation: http://pasterack.org/pastes/80014

Adding computation to rewriting-only macros

I think this is mostly a bad idea. The underlying problem is usually that procedural macros are too hard/complex/unwieldy, and that should be fixed directly. In Racket, the transition is very smooth and syntax-rules (the pattern-only macro definition form) is used only for the simplest macros. This is generally good for other reasons. A feature like this existed in the extend-syntax system which was the original pattern-based macros as proposed by Kohlbecker and Wand, but was eventually dropped and I don't know of any system that has it now. People (maybe even me) have sometimes suggested adding something like this to Racket but we've instead just made better libraries for writing procedural macros.

Partial re-use

This doesn’t seem like a good idea. useful! seems like it’s someone else’s abstraction, and messing with its internals is unlikely to end well. Is there an actual example where this came up?

Analyzing code with macros in it

This is done in Racket by fully expanding the relevant form using the local-expand function. The 2011 paper I linked to above describes how Typed Racket uses this. The basic idea is that local-expand takes an expression as an argument, and produces a fully expanded expression; one with no more macro uses in it. You can specify whether you want the argument treated as an expression or a top-level declaration form. For Racket there’s a simple grammar for what you get out, for Rust it would just be the Rust grammar.

Cooperating macros

This is the use local-expand was invented for. In addition to the use described above, you can also have it stop as soon as it expands all the “top level” macros in the argument, so you can tell what sort of thing the top-level portion is (a struct in my example above). The way it works is described pretty well in the “Macros that work together” paper (in the motivation section, you won’t have to read the formal model).

Having just covered the motivation section, I feel like I should stop here and put my thoughts on the actual proposal in the RFC in a separate comment. Hopefully this is helpful for someone. :)

alexreg commented Mar 13, 2019

@samth Thanks for sharing your perspective as a Racket developer. Thoughts on this specific RFC would also be appreciated, if you're reasonably familiar with the Rust macro system, but that's already useful for @pierzchalski to maybe amend the Prior Art section with, I think. :-)

samth commented Mar 13, 2019

Ok, some thoughts on the rest of the RFC.

First, another useful piece of related work is the concept of continuation-passing-style macros, described here by Oleg Kiselyov with application to exactly this problem. I also added a link to an example of how I'd do env in Racket to my earlier comment.

  1. You really need to think through the interaction with hygiene. The main reason why Racket has a separate local-expand function, in addition to the traditional Lisp expand, is that local-expand is sensitive to the local context, including binding information. There's a whole section (5.3) in this paper about how Racket's hygiene system works (this is known as set-of-scopes, you may have heard of that already) just about interaction with local-expand.

  2. Start with just the procedural API. People write a lot of macros in Racket, and there's been no suggestion ever that there should be an addition to rewriting-based macros for using local-expand. Plus the API to use expand! seems very tricky to use, since it effectively adds its own form of pattern matching.

  3. I wanted to try to write the examples in appendix B & C in Racket, but I found them very tricky to understand, particularly because they use the declarative macro API (and because they rely on modifying existing definitions, which is strange). My sense is that you could make examples that stress different corners of expansion order much more easily by (a) using procedural macros and (b) doing the print-outs at compile time so that you can see when things happen. Here's an example written out in Racket that shows some of the ordering issues, and I expect that you'd be able to express the issues for this RFC more easily with something like this.

alexreg commented Mar 14, 2019

Thanks a lot for your feedback, @samth.

  • You really need to think through the interaction with hygiene. The main reason why Racket has a separate local-expand function, in addition to the traditional Lisp expand, is that local-expand is sensitive to the local context, including binding information. There's a whole section (5.3) in this paper about how Racket's hygiene system works (this is known as set-of-scopes, you may have heard of that already) just about interaction with local-expand.

I don't quite see how there are really any special considerations about hygiene beyond what we already have with the Rust hygiene system (which uses set-of-scopes along with macro expansion "marks", as you may know). Well, that's not 100% true. The one thing that we should probably note is that whenever an eager expansion is performed (i.e., the macro expansion routine is invoked for a metavariable like #tokens in the RFC examples), then the scope of the eagerly-expanded token stream should be that of the substitution location. To be specific, if you have something like

eager! {
	#a = concat!("foo", "_", "bar");
	let #a = 42;
}

then the hygiene of all the tokens in the output (let foo_bar = 42;) should be the same (modulo expansion marks – i.e., the sets of scopes should be the same).
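For comparison, this is how existing `macro_rules!` hygiene treats local variables on stable Rust. It does not use the proposed `eager!` syntax; it only shows that which binding an identifier refers to depends on where its tokens were written, which is the property the eager-expansion rule above would need to respect. `add_one_hygienic!` is a made-up name.

```rust
macro_rules! add_one_hygienic {
    ($e:expr) => {{
        // This `x` is introduced by the macro, so it lives in the
        // macro's own syntax context.
        let x = 1;
        // `$e` keeps the context of the call site, so an `x` inside it
        // still refers to the caller's variable, not the one above.
        $e + x
    }};
}

fn demo() -> i32 {
    let x = 10;
    add_one_hygienic!(x)
}

fn main() {
    println!("{}", demo());
}
```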

  • Start with just the procedural API. People write a lot of macros in Racket, and there's been no suggestion ever that there should be an addition to rewriting-based macros for using local-expand. Plus the API to use expand! seems very tricky to use, since it effectively adds its own form of pattern matching.

The thing is, a major use case for this is being able to write ident-position macros, hence the declarative expand! macro. Only being able to use eager expansion within procedural macros would be a real shame. I'm sure @pierzchalski has a few other use-cases in mind.

  • I wanted to try to write the examples in appendix B & C in Racket, but I found them very tricky to understand, particularly because they use the declarative macro API (and because they rely on modifying existing definitions, which is strange). My sense is that you could make examples that stress different corners of expansion order much more easily by (a) using procedural macros and (b) doing the print-outs at compile time so that you can see when things happen. Here's an example written out in Racket that shows some of the ordering issues, and I expect that you'd be able to express the issues for this RFC more easily with something like this.

That may help explain; I'm not sure. We had a fair bit of discussion on how best to express this, and I was reasonably happy with how it ended up, but admittedly I do think of things in a slightly different way myself. Now, when it comes to "modifying existing definitions", I'm personally not a big fan of that terminology, even though @pierzchalski somewhat likes it. Personally I don't like to think of macros as "modifying" their token streams, but rather as functions on token streams -- black boxes, even, where there can be anything from an identity relation between (single-argument) input and output, to complete disconnect.

Author

pierzchalski commented Mar 14, 2019

@alexreg

The thing is, a major use case for this is being able to write ident-position macros, hence the declarative expand! macro.

This isn't strictly true. There's nothing stopping us from coming at the 'ident-position macros' problem with a procedural macro (for instance, we could implement something like expand! using fn please_expand and quote!).

The reason why I chose to focus on a declarative macro API is that I think declarative expansion semantics are better understood in Rust and easier to extend cleanly. In fact, for my original use-case (expanding arguments to attribute macros), expand! is less ergonomic, but it's ergonomic enough.

@samth, thanks for the feedback!

First, another useful piece of related work is the concept of continuation-passing-style macros,

Ah, in a previous draft of the RFC I referred to CPS at the start of this section, but ended up removing the term in favour of walking through an example (in fact, CPS was a pretty direct inspiration for expand!). I should probably mention it (and the paper you linked) in the prior art.

Plus the API to use expand! seems very tricky to use, since it effectively adds its own form of pattern matching.

I'm afraid I don't understand. What part of the description of expand! sounds like it's adding something like a new form of pattern matching?

Author

pierzchalski commented Mar 14, 2019

In my previous comment, I was about to say something along the lines of "expand! is the simplest basic combinator that allows us to turn a third-party non-CPS macro into one", but that's clearly not true! We could imagine a much simpler one (call it cps!), such that this:

cps!(foo!(<args to foo>); bar);

Would expand into this:

bar!(<result of expanding `foo!(<args to foo>)`>);

This sidesteps all the issues that expand! introduces with respect to expansion order, at the cost of being too fiddly to do anything useful with. For example, it's not obvious how to pass additional arguments to bar!; there's a solution, but it involves defining fresh macros to act as 'closures', which is a little heavyweight.
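For readers unfamiliar with the style, here is a rough sketch of a hand-written CPS macro in today's macro_rules!, including the 'closure' workaround for extra arguments. This is not the proposed cps! (which would need eager expansion to wrap a third-party macro); the names name_cps, define_fn, and define_fn_42 are hypothetical.

```rust
// CPS-style macro: instead of expanding to its result (the identifier
// `hello`), it passes that result to a continuation macro chosen by the caller.
macro_rules! name_cps {
    ($k:ident) => {
        $k! { hello }
    };
}

// The macro we actually want to run on the result. It needs an extra
// argument (`$value`) that `name_cps!` has no way to supply.
macro_rules! define_fn {
    ($value:expr, $name:ident) => {
        pub fn $name() -> i32 {
            $value
        }
    };
}

// A fresh macro acting as a "closure": it fixes the extra argument to 42
// and exposes the single-argument shape that `name_cps!` expects.
macro_rules! define_fn_42 {
    ($name:ident) => {
        define_fn! { 42, $name }
    };
}

// Expands step by step to `pub fn hello() -> i32 { 42 }`.
name_cps!(define_fn_42);
```

Every distinct extra argument needs its own wrapper macro like define_fn_42, which is the heavyweight part.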

In this light, one compromise would be to restrict expand! to only have immediate macro invocations in the right-hand sides of its arguments (but keeping the useful quoting output format), and to only expand those outermost invocations; some other utility macro expand_all! would perform the process of iteratively discovering and expanding macros in an expression.

This idea essentially 'punts' the expansion order question by making it a library problem. I'm not sure how I feel about that. Keeping expansion order under the control of the compiler means we get to say "there's no expansion order that you can rely on"; the alternative is an ecosystem of eager macros that occasionally do depend on expansion order, which sounds unappealingly fragile.

Edit: yeah, making expansion order a library choice is worse than I thought. For example: how does a library figure out the expansion order in the following?

foo!();
id(mk_macro!(foo));

@pierzchalski

Author

pierzchalski commented Mar 14, 2019

Regarding the examples in appendices B and C:

  1. For this RFC we're not really interested in cases where two procedural macros get expanded in different orders and have their side-effects interleaved. That's not an issue introduced by eager expansion, and the consensus is already that proc macro expansion order is unspecified.
  2. The only way for the expansion of one macro invocation foo! to affect an invocation of another macro bar! is for the result of the expansion of foo! to provide a definition for the invocation bar!.
  3. The only way for an invocation foo! to be sensitive to expansion order is if there are different steps or stages during expansion in which different definitions of foo! appear to be valid and resolvable.

Unfortunately, this narrows down the 'interesting' expansion order cases to the admittedly hard-to-motivate examples in the appendices. Fortunately, this is a pretty niche problem! Given the various hoops you have to go through to get to this point (you need your identifier to have the 'correct' hygiene, and you need nested eager macros that provide different definitions for that identifier as they expand), we could probably get away with ignoring the issue for a long time.

@alexreg

alexreg commented Mar 14, 2019

This isn't strictly true. There's nothing stopping us from coming at the 'ident-position macros' problem with a procedural macro (for instance, we could implement something like expand! using fn please_expand and quote!).

My point was it can be used in declarative macro code (and not just as a function within procedural macro libraries). I think there's a bit of confusion around terminology here. Sure, we could implement it like that, and we still might. The RFC doesn't dictate implementation.

@samth

samth commented Mar 14, 2019

A few responses:

  1. I thought the Rust macro hygiene story was still in flux, and set-of-scopes hadn't been implemented yet. Did that change?

  2. Regardless of the current state, hygiene + eager expansion is a complex topic, and needs detailed thinking. In particular, "the scope of an identifier" isn't well-defined except in so far as you say what the set-of-scopes (or marks/renaming) operations that are performed are, which this RFC doesn't. I don't know how detailed you need to be in the RFC, but there are plenty of tricky issues here. You may well also end up in a situation where existing macros depend on something that doesn't work with how you eventually want hygiene to work.

  3. I don't think the word "modifying" is important -- the issue is whether it's possible to express those examples with simpler macros, such as ones that don't expect a full sequence of top-level declarations as input (could they just take a name?).

  4. The reason that I think a procedural-only API is better is that (a) if you have procedural macros at all, then procedural macros are the easier-to-understand part, because they involve expansion but not necessarily fancy pattern matching; (b) you can then implement expand! as a library; and (c) if you find yourself wanting to add lots of fancy extensions to declarative macros, it's probably a sign that your procedural macros are too hard to write.

  5. expand! isn't really adding new pattern matching, but it is adding a new way to bind things and splice them into other things, like how declarative macros already work.

  6. Not defining macro expansion order seems like a very bad idea, for a lot of reasons. But I'll see if I can write a simpler Racket example for this particular expansion order issue.

  7. Note that iterating an expand_once! macro is likely to be different (particularly wrt hygiene) than having a built-in recursive expand! macro (or similar for procedural versions of the same).

@pierzchalski

Author

pierzchalski commented Mar 14, 2019

  1. I thought the Rust macro hygiene story was still in flux, and set-of-scopes hadn't been implemented yet. Did that change?

As far as I'm aware, both types of declarative macros (ones defined with macro foo and macro_rules! foo) are hygienic (although macro_rules! macros have some exceptions, I think?).
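A minimal sketch of what that looks like for macro_rules! in practice, assuming the usual description of its hygiene (local variables are hygienic, items are not); all names here are made up for illustration.

```rust
// The `x` bound inside the macro lives in the macro's own syntax context,
// so it neither captures nor is captured by the caller's `x`.
macro_rules! add_ten {
    ($e:expr) => {{
        let x = 10;
        $e + x
    }};
}

pub fn locals_are_hygienic() -> i32 {
    let x = 1;
    // `$e` is the caller's `x` (1); the macro adds its own `x` (10).
    add_ten!(x)
}

// Items are one of the exceptions: a function defined by a macro_rules!
// macro is visible by name at the invocation site.
macro_rules! define_helper {
    () => {
        pub fn helper() -> i32 {
            7
        }
    };
}

define_helper!();

pub fn items_are_not_hygienic() -> i32 {
    helper()
}
```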

One thing that might clear up the point of the B and C appendices: from one point of view, they're about hygiene. Each eager expansion introduces a new scope (the 'expansion contexts' mentioned here) in which discovered definitions are made available; we modify the hygiene resolution rules so that a prospective definition needs to be in the same hygiene scope and also be in a suitable expansion context.

  4. The reason that I think a procedural-only API is better is that ...

I agree that having a fully-featured procedural API is 'the' 'correct' long-term goal, but that's a long way off (for instance, the API for modifying hygiene information on token streams is currently in flux and far from stabilisation). We can implement something like expand! now, and something like expand! would be useful now, and ideally it should be forwards-compatible with being re-implemented in the future as a procedural macro, but I'm not convinced that replacing this RFC with a full proc macro API proposal is a good idea right now.

  6. Not defining macro expansion order seems like a very bad idea, for a lot of reasons.

I'm rather curious what those reasons are. I would be very surprised and annoyed if this code:

// foo.rs
makes_a_struct! {};

// bar.rs
makes_a_module! {};

depended in any way on the order in which those macros were expanded. Currently, the only expansion order guarantee that the compiler makes is that a macro gets expanded after its definition can be resolved, which is very minimal, but which also seems to have worked fine so far.
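A small sketch of that one guarantee with today's macro_rules! (mk_macro echoes the name used earlier in the thread; answer and get_answer are hypothetical): the invocation answer!() can't be expanded until mk_macro!(answer) has been expanded and has produced its definition.

```rust
// Expands to the definition of another macro named `$name`.
macro_rules! mk_macro {
    ($name:ident) => {
        macro_rules! $name {
            () => {
                42
            };
        }
    };
}

// Until this invocation is expanded, no macro named `answer` exists.
mk_macro!(answer);

// `answer!` resolves only once `mk_macro!(answer)` above has been expanded.
pub fn get_answer() -> i32 {
    answer!()
}
```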

  7. Note that iterating an expand_once! macro is likely to be different (particularly wrt hygiene) than having a built-in recursive expand! macro (or similar for procedural versions of the same).

I know that at this point I'm asking for a lot of examples for a lot of things, but... do you have an example of this difference?

@alexreg

alexreg commented Mar 14, 2019

  • I thought the Rust macro hygiene story was still in flux, and set-of-scopes hadn't been implemented yet. Did that change?

I'm pretty sure it's been in place for a while now, but I may be mistaken.

  • Regardless of the current state, hygiene + eager expansion is a complex topic, and needs detailed thinking. In particular, "the scope of an identifier" isn't well-defined except in so far as you say what the set-of-scopes (or marks/renaming) operations that are performed are, which this RFC doesn't. I don't know how detailed you need to be in the RFC, but there are plenty of tricky issues here. You may well also end up in a situation where existing macros depend on something that doesn't work with how you eventually want hygiene to work.

I'm yet to see justification for this assertion. I think you're expecting this RFC to cover everything about the macro system and hygiene, when that's far beyond its scope (har har). In fact, that's already been fleshed out before and has been working well in Rust for a while now.

  • The reason that I think a procedural-only API is better is that (a) if you have procedural macros at all, then procedural macros are the easier-to-understand part, because they involve expansion but not necessarily fancy pattern matching; (b) you can then implement expand! as a library; and (c) if you find yourself wanting to add lots of fancy extensions to declarative macros, it's probably a sign that your procedural macros are too hard to write.

Again, having an actual expand! macro that can be used declaratively is paramount for things like ident-position, among others. (Of course, one could allow normal macros in ident-position, but that idea has already been rightly discarded for several reasons, most of all that it could make a real mess of ordinary code.) If you have a concrete idea in mind for a simpler way to get the power of expand! in code with lower complexity (I don't think it's complex, personally, mind you), I'm all ears. :-)

@Ixrec

Contributor

Ixrec commented Mar 15, 2019

Has there ever been a detailed public explanation of why hygiene is a hard problem to solve at a design level, not just fiddly to implement? I feel like either we forgot to ever do a proper write-up, or there is one out there I've simply never seen, but either way only half of the people in this discussion seem to have this context.

To clarify what I'm asking for: For the hardness of specialization, aturon wrote this. For the hardness of NLL, nikomatsakis wrote this post and a dozen others and the RFC. For the hardness of self-referential structs, withoutboats wrote this post and its sequels. For the hardness of name resolution, RFC 1560 gets the point across. And so on. But for whatever reason, I've never seen any blog post or RFC or forum thread or other public explanation of why hygiene is harder than it seems. Did I simply miss it?

@pnkfelix

Member

pnkfelix commented Mar 15, 2019

reassigning to self, since I told @alexreg that I would try to look more carefully at this.

@pnkfelix pnkfelix assigned pnkfelix and unassigned nrc Mar 15, 2019

@pierzchalski

Author

pierzchalski commented Mar 18, 2019

@Ixrec: have you looked at @nrc's series on macros? By Rust's standards it's an old series (from 2015), but part 3 talks about some of the rough edges of hygiene at the time (it also mentions the 'macros that work together' paper that @samth referred to in one of their comments). I'm not sure how much of that series is still accurate.
