How to determine hygienic context for "non-atomic" code fragments #50122

petrochenkov · 2018-04-20T16:15:30Z

Reminder: Syntactic/hygienic context is formally a "chain of expansions" and informally "the place where something is actually written". For example, in this code

macro m($b: expr) {
    a + $b
}

let x = m!(b);

a's context is inside the macro m, and b's context is outside of the macro.
With macros 2.0 hygiene, names are resolved in the locations where they are "actually written".
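As a rough illustration on stable Rust, the same idea can be seen with `macro_rules!`, which is hygienic only for local variables and labels (so this is an approximation of the `macro` example above, not the macros 2.0 behavior itself). The names `add_to_ten` and `caller` are made up for this sketch.

```rust
// `a` in the macro body is written inside the macro;
// whatever is passed as `$b` is written by the caller.
macro_rules! add_to_ten {
    ($b:expr) => {{
        let a = 10; // this `a` lives in the macro's own context
        a + $b      // `$b` keeps the context of the call site
    }};
}

fn caller() -> i32 {
    let a = 1;
    // The argument `a` resolves to the caller's `a` (1), not the macro's `a` (10).
    add_to_ten!(a)
}
```

Here `caller()` evaluates to 11: the two identifiers spelled `a` are distinct because they were written in different places.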

For "atomic" tokens like identifiers or punctuation signs the context is unambiguous, but complex entities like expressions or types can be combined from tokens introduced in different contexts. Look, for example, at this code, which ultimately expands into println!("Hello world!"):

macro context_parens($name: tt, $bang: tt, $args: tt) {
    $name $bang ( $args )
}

macro context_hello($name: tt, $bang: tt) {
    context_parens!($name, $bang, "Hello world!")
}

macro context_bang($name: tt) {
    context_hello!($name, !)
}

macro context_println() {
    context_bang!(println)
}

fn main() {
    context_println!();
}

So, what is the "call site" context of the println macro in this case?
Where should we resolve identifiers with call-site hygiene for macros invoked like this?
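For readers who want to try this on stable Rust, here is a hedged `macro_rules!` approximation of the chain above. It uses `format!` instead of `println!` so the result can be inspected, and the wrapper `hello` is an invented name. Since `macro_rules!` does not apply macros 2.0 hygiene to macro names, this only demonstrates that the invocation really is assembled from tokens written in four different macros; it does not answer the call-site question.

```rust
// The parentheses are written here.
macro_rules! context_parens {
    ($name:tt, $bang:tt, $args:tt) => { $name $bang ( $args ) };
}

// The string literal is written here.
macro_rules! context_hello {
    ($name:tt, $bang:tt) => { context_parens!($name, $bang, "Hello world!") };
}

// The `!` token is written here.
macro_rules! context_bang {
    ($name:tt) => { context_hello!($name, !) };
}

// The macro name is written here.
macro_rules! context_format {
    () => { context_bang!(format) };
}

fn hello() -> String {
    // Ultimately expands to format!("Hello world!").
    context_format!()
}
```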

Contexts of "non-atomic" entities are important for several reasons other than determining call-site hygiene. For example, in Struct { field1, field2, ..rest } the fields fieldN where N > 2 are checked for privacy in the context of the ..rest fragment, but that fragment may also be a Frankenstein's monster combined from pieces with different contexts.
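To make the privacy point concrete, here is a small sketch in ordinary Rust without macros. The module `config`, the struct `Config` and its fields are hypothetical names chosen for illustration. The fields not listed explicitly are taken from the base expression, and the compiler checks that they are accessible at the place where the struct expression appears.

```rust
mod config {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Config {
        pub verbose: bool,
        pub level: u8,
        secret: u32, // private: only code inside `config` may name it
    }

    impl Config {
        pub fn new() -> Self {
            Config { verbose: false, level: 0, secret: 7 }
        }

        pub fn secret(&self) -> u32 {
            self.secret
        }

        pub fn with_level(self, level: u8) -> Self {
            // Allowed: `verbose` and `secret` come from `..self`, and this
            // code is inside the module that can see the private field.
            Config { level, ..self }
        }
    }
}

// Outside of `config`, the same expression is rejected, because the
// implicitly copied `secret` field is private there:
//
//     let c = config::Config { level: 3, ..config::Config::new() };
```

With macros in the picture, the question raised in this issue is which context counts as "the place where the struct expression appears" when the `..rest` fragment is assembled from several macros.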

Proposed solution:

For each complex entity figure out and document an atomic entity that is "essential" for that complex entity and that serves as a source of hygienic context for the complex entity.

For example, for binary operator expressions the context may be determined by the context of the operator: context($a + $b) = context(+), for the "remaining fields" fragment ..$rest mentioned above the context may be determined by the context of .., etc.

I'm... not sure what that essential atomic token would be for macro invocations, probably ! for bang macros and [] for attribute macros.
(Note that paired delimiters like (), [] and {} always have the same context in a pair).
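A toy model may help to pin down what the proposal says. The types below (`Ctxt`, `Token`, `Fragment`) and the function `characteristic_ctxt` are invented for this sketch and have nothing to do with the compiler's real data structures; the choice of `!` for bang macros follows the tentative suggestion above.

```rust
/// Hypothetical opaque id for "the place where a token was written".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Ctxt(u32);

/// An atomic token together with its context.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct Token {
    text: &'static str,
    ctxt: Ctxt,
}

/// A few "complex" fragments built from atomic tokens.
#[allow(dead_code)]
enum Fragment {
    Binary { lhs: Token, op: Token, rhs: Token },
    StructRest { dotdot: Token, base: Token },
    BangMacro { name: Token, bang: Token },
}

/// The proposed rule: each complex fragment borrows its context
/// from one designated "essential" atomic token.
fn characteristic_ctxt(fragment: &Fragment) -> Ctxt {
    match fragment {
        Fragment::Binary { op, .. } => op.ctxt,             // context(+)
        Fragment::StructRest { dotdot, .. } => dotdot.ctxt, // context(..)
        Fragment::BangMacro { bang, .. } => bang.ctxt,      // context(!)
    }
}

/// `a + b` where `a` and `b` come from the call site (1)
/// and `+` is written inside a macro (0).
fn example() -> Ctxt {
    let inside_macro = Ctxt(0);
    let call_site = Ctxt(1);
    let expr = Fragment::Binary {
        lhs: Token { text: "a", ctxt: call_site },
        op: Token { text: "+", ctxt: inside_macro },
        rhs: Token { text: "b", ctxt: call_site },
    };
    characteristic_ctxt(&expr)
}
```

Under this rule `example()` yields `Ctxt(0)`: the operands do not influence the context of the whole expression.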


CAD97 · 2018-04-20T19:45:52Z

mod a {
    const A: i32 = 3;
    macro a() { A }
}
mod b {
    const B: i32 = 2;
    macro b() { B }
}
mod sum {
    macro sum($lhs, $rhs) { $lhs!() + $rhs!() }
}
fn whatami() -> i32 {
    use a::a;
    use b::b;
    use sum::sum;
    sum!(a, b)
}

I think it's pretty clear that this should work and expand to

fn whatami() -> i32 {
    ::a::A + ::b::B
}

Does it do so with your proposed rule? Is my intuition wrong and this shouldn't make sense? Is there a different way of composing macro imports and call locations such that it would break?

My intuition is that if I name something at the call site and the macro uses it as an item, it resolves to what it is at the call site. If I name something at the macro def site, it resolves to whatever that name would mean if it had been used in a function at that source location.

petrochenkov · 2018-04-21T00:14:21Z

@CAD97
Yes, the example should work (after a couple of added pubs and : idents) and the intuition is correct.
The example never makes use of hygiene contexts of "complex" entities though, only contexts of "atomic" identifiers (e.g. a, b), so this issue doesn't apply.

arielb1 · 2018-04-25T22:45:34Z

@petrochenkov

So I thought the most natural solution would be to use the call site of the path that invoked the println. On the other hand, that would screw with users, because there would be no way to receive a macro from another syntax context and expand it in your own syntax context.

The second option, which I currently think is natural, is to use the "span of the expression": the syntax context in which the relevant expression was constructed (is this not a valid concept in some cases? This would be context_parens in your example), and to extend that idea to all cases (including e.g. the privacy of the misc fields in an ExprStruct).

The above might however create some confusion (e.g. privacy in method calls - if the method name and the method call come from different scopes: what is determined by the method name, and what is determined by the method call?).

The main difference between the "span of expression" and "span of characteristic token" is how you pass around expansion sites, which is by either CPS-passing a macro that constructs the expression, or using a "magic" characteristic token.

arielb1 · 2018-04-25T22:49:51Z

In "span of expression", you can use

// in mod A

macro foo($expand: ident) {
    $expand!(println!("foo"));
}

// in mod B

macro expand($macro:ident ! $args:tt) {
    $macro ! $args
}
foo!(expand); // this expands `println!("foo")` here.

While in "span of characteristic token", you use

// in mod A

macro foo($bang: tt) {
    println $bang ("foo")
}

// in mod B

foo!(!); // this expands `println!("foo")` here.
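Both styles can be written on stable Rust with `macro_rules!`, as a sketch of the mechanics only. Under `macro_rules!` the two variants resolve identically, so this shows how the "expansion site" is passed around, not a difference in hygiene. The names `expand_here`, `foo_cps`, `foo_token`, `via_callback` and `via_token` are made up, and `format!` replaces `println!` so the result can be checked.

```rust
// "Span of expression" style: the caller hands over a macro that
// constructs the invocation, so the expression is built by `expand_here`.
macro_rules! expand_here {
    ($mac:ident ! $args:tt) => { $mac ! $args };
}

macro_rules! foo_cps {
    ($expand:ident) => { $expand!(format!("foo")) };
}

// "Span of characteristic token" style: the caller hands over the `!`
// token, which would carry the caller's context.
macro_rules! foo_token {
    ($bang:tt) => { format $bang ("foo") };
}

fn via_callback() -> String {
    foo_cps!(expand_here)
}

fn via_token() -> String {
    foo_token!(!)
}
```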

nikomatsakis · 2018-05-30T13:15:15Z

Hmm, @petrochenkov's initial thought of a "span of characteristic token" is indeed what I initially expected, but I see the appeal of @arielb1's "span of the expression" as well. This obviously affects the question in #50376 as well (which concerns use paths).

I wanted to step back a second and try to establish what our goals are. From my perspective, we should be shooting for two things:

- Rules we can explain, of course.
- Rules that mean that "normal macros" behave as expected with respect to hygiene:
  - as a side effect, this should ensure that they work across editions, and that can be a useful guideline.

I was trying to think about the various things that might be tied ultimately to hygiene:

- Name resolution, of course
  - In the context of a use, this also affects the "global context" (see How to determine hygienic context for the "crate root" in absolute-by-default paths #50376)
- Overflow behavior of + etc. (checked or unchecked)
  - overflow checks can currently be enabled either globally or per crate; but what should happen for macros?
- Method name resolution
  - what traits are in scope?
- Are we in an unsafe section?
- Others?

I am wondering whether these distinct uses might introduce competing demands (e.g., perhaps one gives "more natural" results with the span-of-expr approach vs span-of-characteristic-token). I suppose we must also consider macro and macro_rules! somewhat distinctly.

petrochenkov · 2018-05-30T14:41:06Z

Yes, @arielb1's suggestion is the primary alternative, but I think the "characteristic token" is preferable.

@nikomatsakis

> I wanted to step back a second and try to establish what our goals are.

Simplicity, first of all! This covers both "can teach", "can specify" and also implementation complexity.

Unless we are trying to assign the context to something that is never actually written in the source code, like in #50376, our choice should almost never matter, because for a + b the context of + and the "concatenation context" are almost always the same.

In this sense the characteristic token context is simpler to implement and explain, because it's something "real" and visible rather than an abstract point during the expansion process.
"Normal macros" should rarely care about the distinction.

(Note that a + b has two "concatenation contexts" and we need to also decide which of them is the "primary concatenation context" and this is not so obvious for cases like

macro m($a_plus: tts) {
    $a_plus b
}

m!(a +)

)
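A stable-Rust rendering of that last case, as a hedged sketch: `macro_rules!` has no `tts` fragment, so a `$(...)*` repetition of `tt` stands in for it, and the right operand is a constant `B` (a made-up name) because a local variable written inside a `macro_rules!` body would not be visible to the caller.

```rust
const B: i32 = 2;

// `a +` is written at the call site, `B` is written inside the macro,
// and the two pieces are concatenated in the macro body.
macro_rules! m {
    ($($a_plus:tt)*) => { $($a_plus)* B };
}

fn sum() -> i32 {
    let a = 40;
    m!(a +) // expands to `a + B`
}
```

Here the `+` token travels together with the left operand, so "where the operator was written" and "where the expression was put together" are different places.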

> Others?

Privacy.
Currently, fields inside of ..rest in Struct { a, b, ..rest } should be checked in the context of ..rest.
In the future, Enhanced^TM Type Privacy should also use expression contexts to avoid checking "implementation details" of macros while still checking their "outputs".

Do not provide suggestions when the spans come from expanded code that doesn't point at user code Hide invalid proc-macro suggestions and track spans coming from proc-macros pointing at attribute. Effectively, unless the proc-macro keeps user spans, suggestions will not be produced for the code they produce. r? `@ghost` Fix rust-lang#107113, fix rust-lang#107976, fix rust-lang#107977, fix rust-lang#108748, fix rust-lang#106720, fix rust-lang#90557. Could potentially address rust-lang#50141, rust-lang#67373, rust-lang#55146, rust-lang#78862, rust-lang#74043, rust-lang#88514, rust-lang#83320, rust-lang#91520, rust-lang#104071. CC rust-lang#50122, rust-lang#76360.

petrochenkov · 2024-06-24T11:55:20Z

Update: the compiler eventually converged on the "span of expression" approach, because it's something natural to implement when you don't know anything about this issue.
It's not necessarily good, but that's something we'll have to live with, most likely.

So I'm going to close this issue in favor of #126763 which is supposed to give spans to "complex" code fragments in a more systematic way.

Without "characteristic tokens", contexts for fragments built from tokens with "heterogeneous" spans, coming from unrelated macros, will not be well defined.
In practice they will gravitate towards the context of the first span node (in terms of #126763).
Such fragments can be generated primarily by proc macros (or complex declarative macros).

petrochenkov added the A-macros-2.0 Area: Declarative macros 2.0 (#39412) label Apr 20, 2018

This was referenced May 1, 2018

How to determine hygienic context for the "crate root" in absolute-by-default paths #50376

Closed

Tracking issue for concat_idents #29599

Open

petrochenkov mentioned this issue May 23, 2018

don't ask what edition we are in; ask what edition a span is in #50999

Closed

nrc mentioned this issue Jul 3, 2018

Edition hygiene in lints #52038

Closed

petrochenkov mentioned this issue Sep 5, 2018

RFC: Or patterns, i.e Foo(Bar(x) | Baz(x)) rust-lang/rfcs#2535

Merged

XAMPPRocky added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 25, 2018

petrochenkov mentioned this issue Mar 16, 2019

Expand on lints with macro spans. rust-lang/reference#544

Open

petrochenkov mentioned this issue Jun 22, 2021

document and test the precise span that triggers edition-dependent behavior #86539

Open

estebank mentioned this issue Mar 15, 2023

Do not provide suggestions when the spans come from expanded code that doesn't point at user code #109082

Closed

petrochenkov closed this as completed Jun 24, 2024
