Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose implicit named arguments for formatting macros #2795

Open
wants to merge 9 commits into
base: master
from

Conversation

@davidhewitt
Copy link

davidhewitt commented Oct 28, 2019

Rendered

Proposes the ability to pass implicit named arguments to formatting macros, for example:

let person = "Charlie";
print!("Hello, {person}!");    // implicit named argument `person`

A follow up to internals discussion https://internals.rust-lang.org/t/println-use-named-arguments-from-scope/10633 and WIP PR rust-lang/rust#65338

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Oct 28, 2019

An alternative syntax is to clearly distinguish the interpolated variable from named arguments,

println!("Hello, {(person)}!");
//                ^      ^

and it allows all expressions can be naturally used

format_args!("Hello {(get_person())}")
//                   ^~~~~~~~~~~~~^

println!(r#"{("Hello"):~^33}\n{(self.person):~^33}"#);

println!("Hello, {(if self.foo { &self.person } else { &self.other_person })}");

println!(
    "variable-person = {(person1)}, named-person = {person2}, positional-person = {}",
    person3,
    person2=person2,
);
@CAD97

This comment has been minimized.

Copy link

CAD97 commented Oct 28, 2019

and it allows all expressions can be naturally used

First: this is backwards compatible to add on top of the "single Ident special case". Most languages with string interpolation have a special case for single Ident, so we wouldn't be out of line to make that easier. It's also by far the most common case.

Second: allowing arbitrary expressions opens a whole separate can of worms around nesting. Just as a few edge case examples: "{("a")}", "{(\"a\")}", r#"{(")}")}"#. You can make rules for how these resolve (it'd have to be "standard string token first, then work on the content of the string literal"), but this adds a huge number of edge cases that just plain don't exist for the ""{name}" is a shortcut for "{name}", name=name" simple case.

So we definitely should stick to the simple, fairly obvious case that eliminates that stutter, and worry about generalizing it to more expressions later if desired.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Oct 28, 2019

@CAD97

Most languages with string interpolation have a special case for single Ident

Half of the languages with string interpolation have special case for single ident.

Firstly, many of them do support interpolating arbitrary expressions, without any special case for single idents:

Furthermore, when the string interpolation syntax is special-cased for single idents, it is often because there is no closing sigil, which is not the case for Rust

  • Kotlin, Groovy, PHP: "$ident ${expr}"
  • Ruby: "#@xxx #@@yyy #$zzz #{expr}" (note: local variables (no sigils) won't work, must use "#{ident}")
  • Scala: s"$ident ${expr}"

Finally, string interpolation is a feature built into the language itself, which often live independently from the "format" facility, e.g. in Python you write f"{ident}" or "{}".format(ident), but "{ident}".format() is meaningless. This RFC fuses string formatting and interpolation, so the experiences from other languages are not reliable, e.g. they can't express format!("{a} {b}", b=123).

And hence I disagree with this criticism:

Second: allowing arbitrary expressions opens a whole separate can of worms around nesting. Just as a few edge case examples: "{("a")}", "{(\"a\")}", r#"{(")}")}"#. You can make rules for how these resolve (it'd have to be "standard string token first, then work on the content of the string literal"), but this adds a huge number of edge cases that just plain don't exist for the ""{name}" is a shortcut for "{name}", name=name" simple case.

"{("a")}" is simply syntax error. Remember we're talking about a macro format!("...") here, the format string literal is not special in the Rust tokenizer/parser, it is not an interpolated string. To the proc macro format_args!(), the format strings "{(\")}\")}" and r#"{(")}")}"# are equivalent. The format string parser could,

  • If we see {,
    • If we see {, unescape as {
    • If we see an integer, it's a positional argument
    • If we see an identifier, it's a named argument
    • If we see : or }, it's an (implicit) positional argument
    • If we see (, step back one character then parse a Rust token tree
    • Otherwise, emit an error

I don't see "huge number of edge cases" syntactically. IMO it has less edge cases than the current RFC because when you see an ident, you don't need to determine whether {ident} is a named argument or interpolated variable!

And to clarify, introducing the "stutter" is the primary purpose of my comment above even if we reject interpolating arbitrary expressions. It's to ensure {ident} is always a named argument, which leads to

this is backwards compatible to add on top of the "single Ident special case".

Indeed we accepted this RFC as-is, {ident:?} can also be forward extended to {(expr):?}. But you'll need to consider what to do with {expr:?} — either reject all non-ident expressions (thus {self.prop:?} is invalid), or accept some expressions (which leads to "why is my particular expression not accepted?" and true edge cases involving numbers and : and {). So I'd avoid even allowing that in the first place.


It's also by far the most common case.

Yes. But stuff like self.stuff or path.display() is also quite common. Grepping format!, panic! and (e)print(ln)! (including docs) from https://github.com/BurntSushi/ripgrep we get

Type Count
Single ident (e.g. format!("error: {}", err)) 147
Properties (e.g. format!("value {}", self.x.y.z) 22
Other expressions (e.g. format!("{}", xyz[0].replace("\n", "\n\n"))) 61
Expression with return branch (e.g. println!("{}", foo()?);) 2
Expression involving macros (e.g. format!("{} {}", $e, crate_version!())) 3

So non-single-ident covers 40% of all usages, which I won't brush them off simply as can-of-worms opener.

Also, when these expressions are interpolated, they are quite readable unlike the RFC's constructed example.

https://github.com/BurntSushi/ripgrep/blob/8892bf648cfec111e6e7ddd9f30e932b0371db68/tests/util.rs#L403-L414

// Current
            panic!("\n\n==========\n\
                    command failed but expected success!\
                    {}\
                    \n\ncommand: {:?}\
                    \ncwd: {}\
                    \n\nstatus: {}\
                    \n\nstdout: {}\
                    \n\nstderr: {}\
                    \n\n==========\n",
                   suggest, self.cmd, self.dir.dir.display(), o.status,
                   String::from_utf8_lossy(&o.stdout),
                   String::from_utf8_lossy(&o.stderr));

// Interpolated
            panic!("\n\n==========\n\
                    command failed but expected success!\
                    {(suggest)}\
                    \n\ncommand: {(self.cmd):?}\
                    \ncwd: {(self.dir.dir.display())}\
                    \n\nstatus: {(o.status)}\
                    \n\nstdout: {(String::from_utf8_lossy(&o.stdout))}\
                    \n\nstderr: {(String::from_utf8_lossy(&o.stdout))}\
                    \n\n==========\n");

https://github.com/BurntSushi/ripgrep/blob/8892bf648cfec111e6e7ddd9f30e932b0371db68/ignore/src/lib.rs#L287-L299

// Current
                write!(f, "{}", msgs.join("\n"))
...
                write!(f, "File system loop found: \
                           {} points to an ancestor {}",
                          child.display(), ancestor.display())

// Interpolated
                write!(f, r#"{(msgs.join("\n"))}"#)
...
                write!(f, "File system loop found: \
                           {(child.display())} points to an ancestor {(ancestor.display())}")

https://github.com/BurntSushi/ripgrep/blob/8892bf648cfec111e6e7ddd9f30e932b0371db68/src/search.rs#L244-L264

// Current
        write!(
            self.get_mut(),
            "
{matches} matches
{lines} matched lines
{searches_with_match} files contained matches
{searches} files searched
{bytes_printed} bytes printed
{bytes_searched} bytes searched
{search_time:0.6} seconds spent searching
{process_time:0.6} seconds
",
            matches = stats.matches(),
            lines = stats.matched_lines(),
            searches_with_match = stats.searches_with_match(),
            searches = stats.searches(),
            bytes_printed = stats.bytes_printed(),
            bytes_searched = stats.bytes_searched(),
            search_time = fractional_seconds(stats.elapsed()),
            process_time = fractional_seconds(total_duration)
        )

// Interpolated
        write!(
            self.get_mut(),
            "
{(stats.matches())} matches
{(stats.matched_lines())} matched lines
{(stats.searches_with_match())} files contained matches
{(stats.searches())} files searched
{(stats.bytes_printed())} bytes printed
{(stats.bytes_searched())} bytes searched
{(fractional_seconds(stats.elapsed())):0.6} seconds spent searching
{(fractional_seconds(total_duration)):0.6} seconds
")
@danielhenrymantilla

This comment has been minimized.

Copy link

danielhenrymantilla commented Oct 28, 2019

As a comparison point, there is a substantial difference between the mentioned ::fstrings crate, featuring deliberately limited interpolation (c.f., danielhenrymantilla/fstrings-rs#1 (comment)), and the ::ifmt crate, featuring "arbitrary" interpolation.

In the linked ::fstrings PR, extending fstrings interpolation to field access is being considered, but nothing more (e.g., no method calls a priori). Imho this leads to a sweet point in letting people use interpolation while preventing abuse of the feature.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Oct 28, 2019

I think @kennytm's interpolation proposal is reasonable, and I will add it to the alternatives section in the RFC shortly.

My initial observations about the proposed syntax for interpolation are that it's not actually any shorter than positional formatting arguments, although I would argue it reads easier (especially as the number of arguments scales):

println!("Hello, {(person)}!");
println!("Hello, {}!", person);

However, if the brackets are necessary to resolve parsing ambiguities then that is something we would have to accept. I do find iterpolation mechanisms convenient, though it can be difficult to judge how complex interpolations should be before they become hard-to-read / poor style.

I would like to contest this line though:

This RFC fuses string formatting and interpolation

The objective of this RFC is not to add interpolation to Rust's existing formatting macros. It is to add a shorthand to the macros to make them more ergonomic in a very common use case. I'd be glad to take a win on 60% of macro invocations even if this RFC doesn't improve the remaining 40%. This discussion appears to agree that interpolation can be added later in a backwards-compatible way (even if there are different arguments whether that would be desirable).

As pointed out by @petrochenkov in the WIP PR, we actually already have a similar shorthand in struct literal expressions for the special case of single identifiers:

struct Foo { bar: u8 }
let bar = 1u8;

let foo = Foo { bar: bar };
let foo = Foo { bar };        // This shorthand only accepts single identifiers

To make the original intention of the RFC stand out clearer, I will update the wording to state explicitly that adding interpolation to the macros was not the intended goal, as well as add the struct literal shorthand to the prior art section.

The RFC itself is not intended to be a way to sneak interpolation
into the formatting macros; the RFC author believes that they do
not need full interpolation support, although would not rule it
out if it was deemed desirable.

This update to the RFC text clarifies the distinction between
implicit named arguments and interpolation. It also adds a note
on prior art that Field Init Shorthand is an existing precedent
where language ergonomics have introduced a special case for
single identifiers.
Copy link

danielhenrymantilla left a comment

I like the nuances you've added.

Could you also state whether format!("{x}, {y}", y = 42) would be accepted or rejected? (it would avoid having let declarations used only for one "fstring" interpolation)

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Oct 28, 2019

Could you also state whether format!("{x}, {y}", y = 42) would be accepted or rejected? (it would avoid having let declarations used only for one "fstring" interpolation)

I would expect it would be accepted: it seems perfectly valid to me that one named parameter can be implicit and the other explicit.

@Centril

This comment has been minimized.

Copy link
Member

Centril commented Oct 28, 2019

Also, when these expressions are interpolated, they are quite readable unlike the RFC's constructed example.

I think there's value in separating the computing of results and how they are rendered (so that the former can be separated into functions eventually). With the {binding} case, I think a good balance is struck where things don't get too intermingled.

Also, given that 60% of the cases in the data in question were for single-ident cases, it seems to me that {binding} is right to optimize for those cases whereas {{binding}} optimizes for the minority.

The rustc compiler has a lot of diagnostics so it is probably a better data set to dig into if you want more of those.

All in all, I think this RFC finds a well-struck balance.

The examples provided had incorrectly merged Scala and PHP.
Scala's string interpolation is written `s"$foo"` whereas PHP is written `"$foo"`
@tomwhoiscontrary

This comment has been minimized.

Copy link

tomwhoiscontrary commented Oct 30, 2019

Would this involve any changes to the language syntax, or can this be implemented entirely in a macro?

I realise that format_args! is not currently implemented as a true macro, but i think it's a good idea to maintain the fiction that it is, or at least could be. format_args! shouldn't do anything a user-written macro can't.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Oct 30, 2019

Would this involve any changes to the language syntax, or can this be implemented entirely in a macro?

The fstrings crate implements a PoC of this functionality using proc_macro (and proc_macro_hack, I believe). So this can indeed be implemented in a user-written macro.

Also, the test PR I wrote needed just a tiny adjustment to the internals of format_args! without any changes anywhere else.

text/0000-format-args-implicit-identifiers.md Outdated Show resolved Hide resolved
text/0000-format-args-implicit-identifiers.md Outdated Show resolved Hide resolved
println!("{x:.precision$}");
The RFC author believes Rust users familiar with implicit named arguments may expect the above to compile (as long as `x` and `precision` were valid identifiers in the scope in question). However, feedback is requested during this RFC process as to whether the this should be indeed become acceptable as part of the RFC.

This comment has been minimized.

Copy link
@joshtriplett

joshtriplett Nov 1, 2019

Member

I would very much like implicit arguments to work anywhere explicitly provided named arguments currently work, for consistency.

One interesting interaction that seems worth noting: because those named arguments appear right next to the ., if we want to allow dotted.paths in the future, we could end up with something like {x:self.width$.self.precision$}. I don't know how easily we could parse that, and it certainly confuses my mental parser. Could you discuss this a bit, and suggest possible future solutions if we decide to extend this to dotted paths? (I agree with your rationale of not doing dotted paths in this RFC, but it does seem like the most obvious extension, and worth discussing future-proofing for.)

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 3, 2019

Author

It's a great question about dotted.paths, and one that I've been thinking hard about. I agree with you that {x:self.width$.self.precision$} also confuses me to read. This indicates to me that we probably don't want to end up with "implicit dotted path arguments" in formatting parameters.

(Sorry, prematurely hit comment, will continue in another comment below...)

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 3, 2019

Author

Even in just making the step to permit dotted.paths, I think that using the interpolation syntax suggested by @kennytm, i.e. {(expr}), is likely to be the best option.

My reasoning: It offers improved readability in the basic case for complex expressions, and seems to me to also help with this problem you highlight in the formatting parameters case. For example, we could have

   {x:(self.width).(self.precision)}        // seems okay to me
   {x:(self.width)$.(self.precision)$}      // or an alternative if we decide to keep the postfix dollars

Both of these read okay to me.

A good argument can be made that if we added interpolation we might aswell go all the way and allow permit any expression for interpolation in {(expr)}. But this doesn't have to be the case - we could, for example, test the waters with just {(dotted.paths)}.

While it's tempting to wish for {dotted.paths} without the brackets, I suspect in the long run we'll prefer {(expr)} instead of {expr} if we want to start adding any interpolation.

This comment has been minimized.

Copy link
@CAD97

CAD97 Nov 3, 2019

A problem with "dollar syntax" is that it's horribly hard to find if you don't know what you're looking for, and confusing if you don't know it already.

The solution of just using (expr) (assuming it's not ambiguous anywhere; be careful around the pad character which can be anything, including }(!)) makes it a lot more obvious what is going on, I think.

This actually would put some weight on always using the (expr) form, as {(x):(width).(precision)} is more internally consistent than {x:width$.precision$}.

That said, I agree that if we allow {x}, the rule should be just "if named argument isn't provided, look for it in scope", no matter the source of that named argument.

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 3, 2019

Author

This actually would put some weight on always using the (expr) form, as {(x):(width).(precision)} is more internally consistent than {x:width$.precision$}.

I do agree with this statement. Though {x:width$.precision$} is what we already have for named arguments, and I'd much rather avoid introducing new syntax when the existing works for the proposal at hand.

Seems most (all?) of us are arguing the {(expr)} form is better for forwards compatibility with interpolation. However, if we added {(expr)} interpolation in the future, the {(ident)} special case will just as equally be backwards compatible with {ident}. So I would argue to avoid introducing new {(expr)} syntax now if we're not sure we will want to add interpolation to these macros in the future.

This comment has been minimized.

Copy link
@joshtriplett

joshtriplett Nov 4, 2019

Member

@davidhewitt If we considering the possibility that future syntax extensions might potentially change the dollar syntax, I like the idea of just assuming that we'll need parentheses to put dotted.names in the current syntax, and then considering potential alternative syntaxes for width and precision that parse unambiguously without requiring parentheses. Would you consider listing that in the future work section?

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 4, 2019

Author

Yes, great idea, have now done so.

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Nov 1, 2019

I'd love to have this. I find myself writing name=name, another_name=another_name all the time.

I fully support the rationale in this RFC to only support single identifiers, and defer any more complex interpolation. I do think we'll want dotted fields in the future, but I added a comment about one bit of complexity even that level of extended interpolation would produce.

@tmccombs

This comment has been minimized.

Copy link

tmccombs commented Nov 2, 2019

I have a couple of reasons to not support arbitrary expressions without require parenthesis ( "{(expr)}"):

  1. There is ambiguity between an integer expression and number indicating a positional argument
  2. There could potentially be ambiguity, or at least confusing with format arguments, since the colon is used both in rust syntax and as the delimiter before format arguments.
With thanks to Lonami, joshtriplett and tmccombs
@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 3, 2019

I have a couple of reasons to not support arbitrary expressions without require parenthesis ("{(expr)}"):

Fully agree with your arguments. For anything beyond {ident}, I agree {(expr)} is much better than {expr}.

@danielhenrymantilla

This comment has been minimized.

Copy link

danielhenrymantilla commented Nov 3, 2019

A new version of ::fstrings has just been released, that supports interpolation of field accesses / dotted.paths (but not after the :, which is currently just left as is).

It can be useful for those wanting to get a feeling of what the ergonomic improvements this RFC and a future one could provide.

@tmccombs

This comment has been minimized.

Copy link

tmccombs commented Nov 3, 2019

Another possible alternative: No special case for identifiers, but allow arbitrary expressions within parenthesis. So, {ident} means, use the keyword argument named ident and {(ident)} means use the variable ident, and {(expr)} works for any expression.

@cramertj

This comment has been minimized.

Copy link
Member

cramertj commented Nov 4, 2019

However, afaik, there isn't a way to do that with macros by example. (It would be possible to just match a literal but that would also match literal numbers and characters, but I don't know if those are ever passed to panic! in practice.

Seems like we could solve this pretty easily by making panic! a built-in or adding an unstable :strlit matcher.

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Nov 5, 2019

change the behavior of panic! in a new edition

I'm not sure if this can be allowed by the edition mechanism, since a "2021" crate can use a macro exported from a 2018 crate, and vice-versa. So if the 2018 macro writes panic!($literal) it's not clear which behavior it should use.

adding an unstable :strlit matcher.

We already have :literal.

@CAD97

This comment has been minimized.

Copy link

CAD97 commented Nov 5, 2019

I'm not sure if this can be allowed by the edition mechanism, since a "2021" crate can use a macro exported from a 2018 crate, and vice-versa. So if the 2018 macro writes panic!($literal) it's not clear which behavior it should use.

If #[macro_export(local_inner_macros)], the local one, if not, then the one in scope. This could be slightly problematic, but given the relative rarity of the panic!(err) form (where err is not a string literal), this wouldn't be too big of an issue in practice. The more interesting case is 202X to 2018 or 2015, where panic!("{name}") would change behavior. But still, it's a standard "what's in scope" problem that macro authors have to be aware of anyway.

adding an unstable :strlit matcher.

We already have :literal.

The idea was to add a new matcher that only takes string literals (unlike :literal which takes any literal, e.g. string, number), so that panic! could "specialize" its single-arg form for string literals.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Nov 5, 2019

I like the restriction to simple idents here. With struct field init shorthand I often just make local variables for everything; I could easily see myself doing the same for formatting.

@rfcbot reviewed

@Centril

This comment has been minimized.

Copy link
Member

Centril commented Nov 5, 2019

Guess we can use ($lit:literal) => { ... } then and route that to a macro defined in libsyntax_ext which will, if given something other than a string literal expand to the input (identity), and otherwise do analysis on the format string to find {stuff}; if found, format_args! is used, otherwise it expands to the input (identity). Hopefully no one is relying on having the literal output with {stuff} here. We should be able to check that with crater.

@CAD97

This comment has been minimized.

Copy link

CAD97 commented Nov 5, 2019

Here's a way to handle it with the edition: prelude::v2!

I.e. edition 2018 or earlier imports prelude::v1::*, which includes today's panic!. Editions later than 2018 import prelude::v2::*, which includes a "fixed" panic! and panic_with!.

That's not to say it should be done that way, but it could be handled "simply" with a new prelude.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Nov 5, 2019

I haven't read the entire discussion but Ctrl+F hygiene finds nothing, so this is my take on it:

This feature would be fine, except for one fatal flaw: there's no Ident token nested in the string Literal token to allow macro hygiene to operate correctly.


If we had a general syntax for interpolating expressions, e.g. Swift's

"Hello \(capitalize("world"))"

which could desugar into something like this, perhaps:

format_args!("Hello {}", capitalize("world"))

then the macro system would already be aware of this and we could add a special-case for single identifiers without introducing any new issues.

However, that's not the case today - AFAIK, at least, maybe @petrochenkov has some clever ideas.

@danielhenrymantilla

This comment has been minimized.

Copy link

danielhenrymantilla commented Nov 5, 2019

Hygiene was not discussed given the fstrings PoC that works on stable Rust, thanks to @dtolnay 's [::proc_macro_hack::proc_macro_hack(fake_call_site)]. Perhaps they can provide more information about how they circumvent this issue.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 5, 2019

I haven't read the entire discussion but Ctrl+F hygiene finds nothing, so this is my take on it:

This feature would be fine, except for one fatal flaw: there's no Ident token nested in the string Literal token to allow macro hygiene to operate correctly.

There is an (admittedly short) discussion of hygiene in the RFC: https://github.com/davidhewitt/rfcs/blob/format-args-implicit-identifiers/text/0000-format-args-implicit-identifiers.md#macro-hygiene

Based on my test PR implementation, the hygiene is fine. I was able to create a new Ident which inherited the span from the format string literal. This was consistent with @petrochenkov's comments when reviewing the PR.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Nov 5, 2019

So "{foo}" would preserve the right hygiene but concat!("{", stringify!(foo), "}") wouldn't - I suppose that's an acceptable compromise?

The main downside I see then is that it might be possible to bypass hygiene using only builtin macros and no proc-macro, but maybe we can make strings that weren't written by the user verbatim, unusable with this feature (e.g. the concat!(...) example above would never work).

@CAD97

This comment has been minimized.

Copy link

CAD97 commented Nov 5, 2019

I was expecting that format_args!(concat!(..), foo=0) would never work, because the macro requires that the first argument is a string literal. In actuality, however, format_args! allows the first argument to be concat! if and only if all arguments to concat! are themselves string literals (or, recursively, concat! that meet this requirement). playground

This means concat!("{", stringify(foo), "}") isn't accepted (NB: note typo), but concat!("{", concat!("f", "o", "o"), "}") is. This actually breaks the "format_args! is a proc macro" illusion, as this behavior does rely on resolving the macro call to std::concat!/std::stringify! and even allows interceding user-defined macros-by-example!

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 6, 2019

Is this behaviour of format_args! resolving concat! calls intentional?

I haven't ever come across it before in code or documentation. Your nested-concat example concat!("{", concat!("f", "o", "o"), "}") is particularly suprising to me. My mind is totally blown that format_args!(concat!("{", concat!("f", "o", "o"), "}"), foo=3) compiles.

(I thought, possibly incorrectly, that macros are expanded from the outermost call first.)

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Nov 6, 2019

This means concat!("{", stringify!(foo), "}") isn't accepted,

How did you test this? This compiles and prints 123:

fn main() {
    println!(concat!("{", stringify!(foo), "}"), foo = 123);
}

AFAIK format_args! (and possibly asm! as well, can't think of any others) takes an expression and attempts to expand it to a string literal.

EDIT: looks like you have a typo in your playground link (missing ! after stringify).

@withoutboats

This comment has been minimized.

Copy link
Contributor

withoutboats commented Nov 6, 2019

I also think there are other libraries subject to similar issues as the panic macro. For example, I believe failure's macro for creating an error avoids formatting the string if it gets no additional arguments. It seems like this would be an idiom change not only for users of format_args but for people writing macros based on format_args, exactly the sort of change an edition would make sense for coordinating around.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 6, 2019

The main downside I see then is that it might be possible to bypass hygiene using only builtin macros and no proc-macro, but maybe we can make strings that weren't written by the user verbatim, unusable with this feature (e.g. the concat!(...) example above would never work).

Having just tested the concat! example on my experimental PR implementation, it actually already rejects it! This is the code I tried:

fn main() {
    let target = "world";
    println!("hello {target}");
    println!(concat!("hello {", "target", "}"));
    println!(concat!("hello {", "target", "}"), target=target);
}

Both the first and third println! invocations compile fine, but the second fails:

 error[E0425]: cannot find value `target` in this scope
  --> scratch/test.rs:4:14
  |
4 |     println!(concat!("hello {", "target", "}"));
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not found in this scope

I guess the hygienic context of the concat!-ed format string must be different to that of the literal. But of course this behavior is a result of concat!'s implementation, rather than defensive programming by me. A more robust solution would be for format_args! to only enable implicit named arguments if the format string was written by the user verbatim, as @eddyb suggests.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 6, 2019

I also think there are other libraries subject to similar issues as the panic macro. For example, I believe failure's macro for creating an error avoids formatting the string if it gets no additional arguments. It seems like this would be an idiom change not only for users of format_args but for people writing macros based on format_args, exactly the sort of change an edition would make sense for coordinating around.

Having just checked the failure docs, this is indeed the case (for the bail! and ensure! macros at least).

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Nov 7, 2019

Accepting implicit arguments it a pretty big change. It would be possible to eliminate foo=foo style repetitive named arguments by allowing to omit the foo= part from (still explicit) single-identifier arguments, like this: format!("{foo}", foo).

@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Nov 8, 2019

@H2CO3 that would be ambiguous. What should format!("{foo} {0}", foo, foo=1) do?

@H2CO3

This comment has been minimized.

Copy link

H2CO3 commented Nov 8, 2019

@H2CO3 that would be ambiguous. What should format!("{foo} {0}", foo, foo=1) do?

You're right, although that's not specific to the form I'm proposing. If foo is a variable in scope, then how should the same expression behave if single-identifier arguments are used implicitly?

I think by the way that the means of dealing with ambiguity should be to emit a compile error. Just like when type inference can't decide which type to choose from a set of possibilities, when formatting can't decide which arguments to use, it would be better to ask the user to rewrite their code in a manner that is unambiguous. Especially, since with the newly-introduced implicit state, it is harder to keep all the necessary information in one's head when reading the code, which exacerbates the problem with ambiguities. And I'd absolutely not like to see real-life code use the expression in question, I don't think there's a good reason for writing that.

@davidhewitt

This comment has been minimized.

Copy link
Author

davidhewitt commented Nov 8, 2019

You're right, although that's not specific to the form I'm proposing. If foo is a variable in scope, then how should the same expression behave if single-identifier arguments are used implicitly?

Explicit named arguments would always take precedence over implicit ones.

This allows users greater control: they can eliminate all implicit capture if they provide all arguments explicitly. It also ensures that this RFC won't break existing code, as no existing code will suddenly start capturing identifiers in scope which collide with existing named arguments.

So if this RFC were implemented format!("{foo} {0}", foo, foo=1) would still expand the same as it already does today: {foo} would be substituted for the explicit named argument foo=1.

(This precedence is implied in the RFC text as-is because the search for an implicit named argument won't occur if a named argument already exists. I will update the RFC text to declare this precedence explicitly.)

In RFC discussion it emerged that it was not clearly stated in the RFC
that implicit named arguments only apply if a corresponding explicit
named argument is not passed to the macro invocation.
@kennytm

This comment has been minimized.

Copy link
Member

kennytm commented Nov 8, 2019

@H2CO3

If foo is a variable in scope, then how should the same expression behave if single-identifier arguments are used implicitly?

The explicit named argument always take precedence, since it is "closer" to the format!() call. The problem with your proposal is that foo and foo=2 both appear at the same level, so there isn't a clear disambiguation rule.

I think by the way that the means of dealing with ambiguity should be to emit a compile error.

That means breaking change, which I don't think is justified (even if edition allowed that) since there exists alternative (this RFC) won't require the breakage.

davidhewitt added 2 commits Nov 8, 2019
When format_args! recieves an expression as the first argument, it
attempts to expand it to a string literal. If successful, this is
used as the format string and format_args! macro expansion continues
as usual.

This has subtle interactions with macro hygiene when implicit named
arguments come into play. This commit adds dicussion around this case.
The panic macro currently does not perform string formatting if
it only passed a single argument which is a string literal.

To be consistent with implicit named arguments and e.g.
`print!("{foo}")`, panic with a single string literal will need
to change to also use string formatting.
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Nov 8, 2019

@rfcbot resolved panic-may-not-work

Thanks for the updates to the text @davidhewitt!

Because `person` is only treated as an implicit named argument if no exisiting named argument can be found, this ensures that implicit named arguments have lower precedence than explicit named arguments.

## Macro Hygiene
[macro-hygiene]: #macro-hygiene

This comment has been minimized.

Copy link
@Centril

Centril Nov 8, 2019

Member

@petrochenkov Can you take a look at this? :)


While it can be argued that this example is very contrived, the RFC author believes that it is undesirable to add such subtle interactions to the `format_args!` family of macros.

These appear to give strong motivation to disable implicit argument capture when `format_args!` expands an expression instead of a verbatim string literal.

This comment has been minimized.

Copy link
@petrochenkov

petrochenkov Nov 8, 2019

Contributor

Let's make it an error instead, that's the most conservative variant.

Literal spans weren't significant for resolution previously (at least in built-in proc macros), so they were never really specified.
The literal returned by concat!("a", "b") could arguably have a call-site context, similarly to the result of concat_idents!(a, b), so we may want to make things like format!(concat!("foo", "{bar}")) work eventually.

This comment has been minimized.

Copy link
@CAD97

CAD97 Nov 8, 2019

The problem is that code compiles today, so making it an error is a breaking change.

This comment has been minimized.

Copy link
@petrochenkov

petrochenkov Nov 8, 2019

Contributor

Which code compiles?

fn main() {
    let x = format!(concat!("{fo", "o}")); // error: there is no argument named `foo`
}

This comment has been minimized.

Copy link
@CAD97

CAD97 Nov 8, 2019

Oh, whoops, sorry, mixed up which macros we were talking about.

This compiles in panic! due to the single-arg exception.

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 11, 2019

Author

Let's make it an error instead, that's the most conservative variant.

Can you clarify a little what you'd like to see the error say? Disabling implicit named arguments would already cause the error you show above to be emitted (e.g. "there is no argument named foo")

This comment has been minimized.

Copy link
@petrochenkov

petrochenkov Nov 11, 2019

Contributor

Well, something like "implicit named arguments are not supported in format strings expanded from macros"?

This comment has been minimized.

Copy link
@davidhewitt

davidhewitt Nov 12, 2019

Author

I see. I'd also like to leave the existing "no argument named foo" error untouched. So how about using a note? I'm thinking something like the following:

fn main() {
    let x = format!(concat!("{fo", "o}"));
    // ^~ error: there is no argument named `foo`
    // ^^~ note: `foo` cannot be captured as an implicit named argument because they 
    // ^^~            are not supported in format strings expanded from macros
    // ^^~ note: try passing `foo` as a positional or named argument instead
}
@therealprof

This comment has been minimized.

Copy link

therealprof commented Nov 10, 2019

Actually I noticed there's prior art with the same identifier: https://tools.ietf.org/html/rfc2795
Seems related...


## Generated Format Strings

`format_args!` can accept an expression instead of a string literal as its first argument. `format_args!` will attempt to expand any such expression to a string literal. If successful then the `format_args!` expansion will continue as if the user had passed that string literal verbatim.

This comment was marked as resolved.

Copy link
@SimonSapin

SimonSapin Nov 21, 2019

Contributor

Is this the description of an existing behavior, or a new feature proposed by this RFC? If the former, where can I read more about this?

This comment was marked as resolved.

Copy link
@SimonSapin

SimonSapin Nov 21, 2019

Contributor

Never mind :) I found the Macro Hygiene section which explains in more details, and https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e42db7da7dba251aab15ccfa8ead3b65 shows that this is existing behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.