String Interpolation #165

Open
wants to merge 1 commit into
base: master
from

Conversation

@WalterBright
Member

WalterBright commented Aug 14, 2019

No description provided.

@wilzbach

Member

wilzbach commented Aug 14, 2019

Did you consider @adamdruppe's proposal on this?
http://dpldocs.info/this-week-in-d/Blog.Posted_2019_05_13.html

tl;dr: custom tuple-like struct which allows functions to customize output (think writeln or SQL statements).

@andre2007


andre2007 commented Aug 14, 2019

@WalterBright is my understanding correct that this proposal only works for direct stdout but assigning to e.g. a string variable is not possible?
This would be a big limitation.

@WalterBright

Member Author

WalterBright commented Aug 14, 2019

> This would be a big limitation.

I'm not so sure about that. I use format strings all the time, and almost never use anything but a literal. If it does become a serious problem, this could be supported:

enum s = "hello %bar";
format(i"" ~ s);

But it doesn't really matter if interpolated strings don't fit every niche. They just have to fit the most used ones, as there's a fallback - use the current method.

@WalterBright

Member Author

WalterBright commented Aug 14, 2019

BTW, the goal for this design was to minimize the amount of typing a user has to do, and make it usable for printf as well as writef.

WalterBright force-pushed the WalterBright:4NNN-WGB.md branch from cade2a6 to 63e99b1 on Aug 14, 2019

@WalterBright

Member Author

WalterBright commented Aug 14, 2019

> Did you consider @adamdruppe's proposal on this?

Not specifically, though I knew people were working and thinking about it. Adam should do it as a proper DIP. I've amended the DIP to add references to Adam's and Jason's work.

@WalterBright

Member Author

WalterBright commented Aug 14, 2019

> This would be a big limitation.

Thinking about this a bit more, the end result of an InterpolatedString is a tuple expression. Anything you can do with a tuple expression you can do with an InterpolatedString.
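
Editorial sketch of the point above, in current D rather than the proposed syntax: since the interpolated string ends up as a tuple of a format string plus arguments, it can be handed to `format` and assigned to a string variable. The `tuple(...)` value here is a hand-written stand-in for what the compiler would produce, and `describe` is a hypothetical helper name.

```d
import std.format : format;
import std.typecons : tuple;

string describe(string name, int count)
{
    // Hand-written stand-in (an assumption) for the lowered form:
    // the format string first, then the arguments.
    auto lowered = tuple("%s has %d apples", name, count);

    // .expand turns the Tuple back into an argument list,
    // so this call is format("%s has %d apples", name, count).
    return format(lowered.expand);
}

void main()
{
    // Assigning to a string variable works; nothing is tied to stdout.
    string s = describe("Ann", 3);
    assert(s == "Ann has 3 apples");
}
```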

@andre2007


andre2007 commented Aug 14, 2019

Yes, none of my mid-size D applications write directly to stdout; they use library functions, e.g. for adding colors. String interpolation would also be really useful when building an HTTP server (for the HTTP response body).
A generic solution that works for stdout but also for assigning to a string variable or passing a function argument would be really nice.

Also, in the other languages I am aware of, string interpolation is not limited to stdout.

@WalterBright

Member Author

WalterBright commented Aug 14, 2019

I think there's a misunderstanding here. None of this DIP restricts it to stdout.

@andre2007


andre2007 commented Aug 15, 2019

Yes, it was a misunderstanding. You may add an example (string variable assignment, string argument assignment) to this DIP to make it clear for other readers. Thanks.

@atilaneves


atilaneves commented Aug 15, 2019

I don't understand how `// error, %d is not a valid element` is the case, given the rule in the DIP for when Element is Character.

@marler8997

Contributor

marler8997 commented Sep 26, 2019

My related PR: dlang/dmd#7988

mdparker referenced this pull request Sep 26, 2019

@marler8997

Contributor

marler8997 commented Sep 26, 2019

My comment here addresses my concerns with lowering interpolated strings to format string tuples. dlang/dmd#7988 (comment)

Lowering the interpolated string to a tuple of strings and expressions seems more versatile.
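
Editorial sketch of the alternative lowering named above, written by hand in current D: the string becomes alternating literal pieces and expressions, with no format specifiers. `fruitLine` is a hypothetical name, and the argument list is only an assumption about what such a lowering might look like.

```d
import std.conv : text;

string fruitLine(int apples, int bananas)
{
    // Assumed shape of the lowered tuple: literal pieces and expressions
    // alternate, so any variadic function such as text() can consume it.
    return text("I ate ", apples, " and ", bananas,
                " totalling ", apples + bananas, " fruit.");
}

void main()
{
    assert(fruitLine(2, 3) == "I ate 2 and 3 totalling 5 fruit.");
}
```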

@anon17


anon17 commented Sep 30, 2019

Pull 7988 mentions HTML generation as a use case for string interpolation; this article shows such a vulnerability being exploited in Firefox.

@adamdruppe


adamdruppe commented Sep 30, 2019

Tuple based interpolation can do HTML generation more sanely (though I think html strings are mistakes anyway, but that's another thing) because as tuples, the types are available and then the function could - in theory at least - do some introspection and proper encoding based on that information.

But indeed I would be generally skeptical; it's just that tuples with type information, as my proposal tweak discussed, make it possible to actually do it right.

@ntrel

Contributor

ntrel commented Oct 15, 2019

> No attempt is made to check that the format specification is compatible with the argument type. Making such checks would require that detailed knowledge of printf and writef be hardwired into the core language

This is false; the language could lower an interpolated string to a Phobos function that checks the format string at compile time, e.g. format!"format_string"(args).
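
Editorial sketch of the existing Phobos facility referred to above: when the format string is passed as a template argument, `std.format.format` validates it against the argument types during compilation. `bananaLine` is a hypothetical helper; how an interpolated string would be lowered to this call is not specified here.

```d
import std.format : format;

string bananaLine(int bananas)
{
    // Template-argument form: the format string is checked against
    // the argument types while compiling.
    return format!"I ate %d bananas"(bananas);
}

void main()
{
    assert(bananaLine(2) == "I ate 2 bananas");

    // A specifier/argument mismatch is rejected at compile time,
    // so this snippet does not compile.
    static assert(!__traits(compiles, { auto s = format!"%d"(4.03); }));
}
```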

@boxed


boxed commented Oct 15, 2019

Did you look at how Swift does string interpolation? The Wikipedia reference here points to a vastly simplified Swift example that doesn't mention how it handles escaping, which it does much more smoothly than any system I've seen before.

If the `Element` is:

* `Character`, it is written to the output string.
* `'%%'`, a '%' is written to the output string.


@schlupa

schlupa Oct 15, 2019

Contributor

> '%%', a '%' is written to the output string.

Wouldn't it be better for %% to stay %% in the resulting format string? Otherwise one would need to write %%%% in the interpolated string to get a % in the output of writef or printf.

@wilzbach

wilzbach Oct 15, 2019

Member

How would you write a % to the output otherwise?

@schlupa

schlupa Oct 15, 2019

Contributor

If the %% becomes a % during transformation of the interpolated string, then the format string will contain an isolated %, which is an error in a format string. A double percent in the interpolated string has to stay a double percent in the format string, or else you would have to write four % characters.
The transformation is
interpolated string => format string => output
Example:
writef(i"Percent %{d}value %%") becomes
writef("Percent %d %", value), which is an error (or at least undefined behaviour for printf).
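
Editorial sketch of the second stage of that pipeline (format string => output), using today's `std.format`: a doubled percent in the format string prints one `%`, while an isolated trailing `%` is rejected at run time. The variable name `value` follows the example above; `percentLine` is a hypothetical helper.

```d
import std.exception : assertThrown;
import std.format : format, FormatException;

string percentLine(int value)
{
    // "%%" in the format string produces a single '%' in the output.
    return format("Percent %d %%", value);
}

void main()
{
    assert(percentLine(42) == "Percent 42 %");

    // An isolated '%' is an unterminated format specifier.
    assertThrown!FormatException(format("Percent %d %", 42));
}
```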

becomes:
```
printf("I ate %s and %d totalling %s fruit.\n", apples, bananas, apples + bananas);
```

@TurkeyMan

TurkeyMan Oct 15, 2019

Contributor

should they all be d's?

@TurkeyMan

Contributor

TurkeyMan commented Oct 15, 2019

@WalterBright

> I'm not so sure about that. I use format strings all the time, and almost never use anything but a literal.

The cases where interpolated strings are useful are when the string is long... and in my experience, where I often generate shim or binding functions, those strings are always synthesised. They're rarely literals... so this feature is absolutely something I've wanted for 10 years, but it won't actually suit my one common use case.

mixin(i"" ~ generatedCode) doesn't feel super satisfying; it looks more like a workaround to me. Maybe there's a better suggestion?

@SMietzner


SMietzner commented Oct 15, 2019

I think the better way of interpolation rules would be:

  • a % is always followed directly by a format string argument OR opening curly brace
  • curly braces are mandatory for arguments

e.g.:
writefln(i"I ate %{apples} and %d{bananas} totalling %d{apples + bananas} fruit.");

This would

  • keep existing format rules / syntax as it is right now
  • eliminate parentheses since %( x + y) could be some code snippet, but %{ x + y } could not (with respect to the use of interpolated strings in mixins)
@SMietzner


SMietzner commented Oct 15, 2019

The much better way of interpolation rules would be:

  • a % is always followed directly by an opening curly brace
  • the format specifier is an optional second argument, separated by a comma

e.g.:

writefln(i"I ate %{apples} and %{bananas, "d"} totalling %{apples + bananas, "d"} fruit.");

This would

  • keep readability high
  • minimize ambiguity with respect to the use of interpolated strings in mixins:
    %( x + y) could be some code snippet, but %{ x + y } could not
  • make sense since the argument is actually required and the specifier is actually optional
@boxed


boxed commented Oct 15, 2019

Look at Swift! They looked at what other languages did and made something nicer. There is no special interpolated string: \(expr) is used. Standard escaping rules apply but one can also do #"\#(str)"#. The number of # before and after the string modifies the escape sequence inside. You can always write literals as the literal text you want and still be able to do string interpolation.

@SMietzner


SMietzner commented Oct 15, 2019

> Look at Swift! They looked at what other languages did and made something nicer. There is no special interpolated string: \(expr) is used. Standard escaping rules apply but one can also do #"\#(str)"#. The number of # before and after the string modifies the escape sequence inside. You can always write literals as the literal text you want and still be able to do string interpolation.

So how is formatting handled in Swift's case? Formatting specifiers?

@boxed


boxed commented Oct 15, 2019

Well, they punted a bit on that, I'm afraid, so people use function calls, operator overloading or string conversion operators. So here is a good opportunity to one-up Swift!

@adamdruppe


adamdruppe commented Oct 15, 2019

I still say we should just do the tuple of structs thing. That's a very simple rule and by far the most flexible - it doesn't even have to yield strings!

@AndrejMitrovic

Member

AndrejMitrovic commented Oct 16, 2019

Why use % in the string instead of something like $ which is more common in other languages and doesn't conflict with the existing usage of %?

It's very confusing to see something like %set in a format string and then wonder: is this %s followed by the string "et", or is it just trying to refer to the set variable? With $set, it's immediately obvious because it stands out.

@dejlek


dejlek commented Oct 16, 2019

Why the percent or dollar? If the string is interpolated, then the parser should know that everything inside braces should be evaluated, so writefln(i"I ate {apples} and {bananas} totalling {apples + bananas} fruit."); should work. If a developer wants braces in the interpolated string, then (s)he should double them. Python's formatted string literals are proof that this works, and they have been widely accepted by an enormous community. Reference: https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals

@marler8997

Contributor

marler8997 commented Oct 16, 2019

Having to double curly braces in an interpolated string would make it awkward to use for code generation. Curly braces work great for Python because it doesn't use curly braces to delimit code blocks. I also hardly ever see dynamic code generation in Python, so it wouldn't matter there anyway; in D, however, I think using string interpolation inside mixins would be a major use case.

@dejlek


dejlek commented Oct 18, 2019

Why would you use i-strings for code generation when you have q{} ones? I would also argue that code generation is not what a typical D programmer does. We need string interpolation for hundreds of other, different reasons. And if people still insist, then I would agree to use $ instead of %, as $ is used in other languages and we are more familiar with that style.

@adamdruppe


adamdruppe commented Oct 18, 2019

@dejlek


dejlek commented Oct 18, 2019

I think we should have a survey here, asking D developers on the forum to participate and tell us how often they generate complex blocks of D code. Yes, I am aware that developers want string mixins, but the real question is how often people do this compared to the "regular" use case where you just want to output some meaningful text... I see people constantly talking about D being easy to prototype in, and for this to be true, easy, simple string interpolation without the need to type extra special characters (% or $) all the time is very much needed.

@marler8997

Contributor

marler8997 commented Oct 18, 2019

@dejlek I implemented interpolated strings in this PR dlang/dmd#7988 and in the description I show an example of how it can be used in code generation.

string generateFunction(string attributes, string returnType, string name, string args, string body)
{
    import std.conv : text;
    return text(iq{
        $(attributes) $(returnType) $(name)($(args))
        {
            $(body)
        }
    });
}
mixin(generateFunction("pragma(inline)", "int", "add", "int a, int b", "return a + b;"));
assert(100 == add(25, 75));

Without interpolated strings, you can't insert dynamic content in the middle of a q{} string which virtually makes them useless for code generation. This is because if the content of the q{} string was all static, then you probably wouldn't need a mixin in the first place :)

Here's a real example that is in phobos as well: https://github.com/marler8997/interpolated_strings/blob/master/phobos_example.d
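
Editorial sketch for comparison with the iq{} example above: the same generator written against today's `std.format`, which compiles without the proposed feature. The inserted pieces are positional `%s` placeholders rather than named expressions, which is the readability cost under discussion. The parameter name `bodyCode` is a substitution made for this sketch.

```d
import std.format : format;

string generateFunction(string attributes, string returnType, string name,
                        string args, string bodyCode)
{
    // q{} token string used as a format string: each %s is filled in
    // by position, so the reader has to count arguments to follow it.
    return format(q{
        %s %s %s(%s)
        {
            %s
        }
    }, attributes, returnType, name, args, bodyCode);
}

// The string is built at compile time and mixed in at module scope.
mixin(generateFunction("pragma(inline)", "int", "add", "int a, int b", "return a + b;"));

void main()
{
    assert(add(25, 75) == 100);
}
```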

@cryptisk-grs


cryptisk-grs commented Nov 2, 2019

Might I suggest considering how C# does its string interpolation? It ends up being very clean and very quick to write.

function($"This string has {value} within it");
