-
-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String Interpolation #140
String Interpolation #140
Conversation
Added some links and implementation details
Expand abstract.
Need to support arbitrary expressions
Change title
The proposed "lazy alias" feature is unnecessary, and its presence in the DIP distracts readers from the main focus: string sequence literals. To achieve the same results without language changes, it suffices in almost all cases to wrap the string sequence literal in an explicit AliasSeq instantiation, or a call to std.typecons.tuple: alias ctSeq = AliasSeq!(i"hello, {name}"); auto rtSeq = tuple(i"a + b = ${a+b}"); If lazy evaluation is desired, it can be opted-into explicitly using existing language features (lambdas, lazy parameters).
Remove changes to alias
Remove "expressions" section
Why is it awkward? Examples are badly needed here. The dip needs an actual library to compare with instead of an hypothetical one.
This needed actual benchmark data to say that it has bad performance.
So what? Complex numbers depends on a library. Not a good con.
You need to explain why. An actual implementation of an library would be useful here in demonstrating this. |
Just wanted to let you know I've seen this. It will be some time yet before I get to it, but please take that time to encourage folks to look over it for draft review. |
Also, a precise description of exactly how the compiler is to interpolate the string is required. Even though we already have an implementation, this still needs to be spelled out exactly in detail (e.g., what characters may follow a I've rather limited time on my hands, but if I get some free time I could try to throw together a draft of such a description. But this needs some serious fleshing out if it's to stand any chance of being accepted by Walter & Andrei. |
Also, there should be more detailed examples as to what to do with the resulting tuple / sequence of tokens. A sample implementation of string interpolation is a good basic example (I see it's already implied in the call to But it would do wonders if we included a more powerful example, like Steven's database example, accompanied with a detail explanation of just why it's so awesome. Any other possible applications should also be listed -- otherwise it would sound weak ("fancy language feature with limited applications"). |
This should give an example of interpolated qstrings. |
In the examples 3 different styles are used: |
And moreover, there should be a section clearly describing these different syntaxes as different possibilities, rather than implying that all of them must be supported simultaneously. Or, if the latter was intended, it should be clearly stated and not merely left up to implication. |
@Jash11 Please edit the status field to "Draft". Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Presumably the user has to also
import std.conv : text
. This is a significant downside compared to other interpolated strings in other languages - creating a new string is the common case, not creating a sequence. It would be better if the interpolated string expression was actually a struct instance that implicitly converts to a value sequence instead of being just a value sequence, and also has built-in fields something like:
U u; V v;
// string s = i"u = $u, v = $v".text;
string s = struct {
alias expand this;
alias expand = __dmdseq("u = ", u, ", v = ", v);
alias text = {import std.conv : text; return text(expand);}`;
enum literalLength = "u = ".length + ", v = ".length;
}().text;
The text
import is only made if the alias is used. (Having a dependency on Phobos for a language feature has precedence - a ^^ b
lowers to std.math.pow
IIRC). It would also be useful if the struct had a name so template code can recognise that interpolated arguments have been passed rather than just variadic arguments.
Bonus points if you can do i"v = $v".text!myAllocator
.
Having literalLength
could help the implementation of e.g. text
to avoid having to sum the length of each string fragment to avoid reallocations when assembling a long query string. (With format("u = %s, v = %s", u, v)
, the first approximation for string length is just arg1.length
). In any case, using a struct instance over just a value sequence gives us more opportunity for enhancements.
- Re: Library implementation 'awkward syntax': A simpler more general feature than this DIP would be to have
$templateName(args)
meanmixin(templateName!args)
. Then$i("v = $v")
syntax reads OK. Thei
template would need to be in object.d.
|
||
`db.exec("UPDATE Foo SET a = ?, b = ?, c = ?, d = ? WHERE id = ?", aval, bval, cval, dval, id);`<br> | ||
Becomes:<br> | ||
`db.exec(i"UPDATE Foo SET a = $(aval), b = $(bval), c = $(cval), d = $(dval) WHERE id = $(id)");` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not equivalent, the implementation of exec
has to change to receive interleaved strings and variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, and actually, I would never, as a library author, write a library function that works with i"..."
as it is defined right now because it would be so easy to do wrong.
What would stop you from writing db.exec("UPDATE foo SET a = ", a)
? Well, nothing, and it would work fine, but just kinda coincidentally.
But what about db.exec(i"UPDATE $(tablePrefix)table SET a = ?", 10)
. OK, it is fair to say "don't do that", but there's no way to tell, in the library, that they did!
It translates to db.exec("UPDATE ",tablePrefix,"table SET a = ?", 10)
. At run time, you can issue a sql syntax error, but it is impossible to tell the difference at compile time.
I hate to be the downer, cuz I basically like this proposal, but I still think we should be making this thing give a new type. Instead of a naked tuple, do an anonymous struct. Change text
and writeln
etc. to recognize this new kind of struct, or at the usage site, use .tupleof
.
`i"foo $(a)".tupleof.whatever is still an option... but then library authors who actually want to work with the details have them available. (and we should probably change text and writeln so those just work first anyway.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should be making this thing give a new type. Instead of a naked tuple, do an anonymous struct
Amen. We need to be able to tell on the callee side if an interpolated sequence was passed vs just variadic args. And if it's a struct we can also have a text
method wrapping std.conv.text, so we can actually get a string easily without having to keep adding a local import every time we use interpolated strings. Getting a string is a major use case, however nice parsing a sequence of elements is for efficiency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I can live with the convenience .text
method. (i almost want to call it .toString
to leverage existing things that call that name, but whatever, i don't care that much about the name).
I am slightly disgusted at having a magic struct created by the compiler include such a reference, but I think it works and the compiler already magically references std.math in syntax, so I can live with it. I'd just make its magic .text
thing be a zero-arg template, so you don't pay for it if it isn't actually called.
I think I will write more about this in today's "this week in D"; I need to write that in a couple hours anyway and that gives me some content to dual-purpose since i missed most of dconf lol
I'll link it back here in a while. It might be a fairly different implementation though, since a magic struct cannot - I think anyway - be done with just a lexer hack. Even if you wrapped it in a call syntax to punt the details to the runtime library, you'd actually lose some functionality. (consider: foo!i"$(some_alias)"
. If that is passed to __d_is(some_alias)
, the aliasness is gone. On the other hand, foo(i"$(a + 4)")
being passed to __d_is!(a+4)
is liable to fail with "variable a is not accessible at compile time". So I don't think they both can work with a library function injected in the middle. Now, if I had to choose, I think the latter case is more important and I'd sacrifice the former case, but with the current PR, both work, and I think that is actually kinda cool. So I wanna preserve that. And a compiler-generated magic struct can do it.)
Anyway, I'll write more in a few hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, since we have alias this
, we can default to expanding the interpolated struct into a tuple while still giving access to methods to work with the interpolated data structure.
writeln(i"1 + 2 is $(1+2)");
// don't need to import std.conv.text because .text is just a member of the
// interpolated string type
audo s = "1 + 2 is $(1 + 2)".text;
I'd like to know where @WalterBright and @andralex stand on this. Since they are the ones to decide what design will be accepted, I'd like to get their input on this particular aspect. But I think I agree with you on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, I forgot that Tuple
doesn't automatically expand to a "tuple", that's kind of the whole reason why Tuple
exists :)
In any case I do see alot of versatility in using the structure. It's a very good idea and I'm on board with it. It does make a few use cases harder by sometimes requiring the user to use tupleof
and/or requiring a wrapper template, but a small cost compared to the benefits. There will likely only be a handful of functions that need to accept interpolated strings. Though, I would like to be able to do things like:
// contrived example, you wouldn't actually do it this way
writeln(i"foo is $(foo) and bar is ", bar, i" and baz is $(baz)");
This would mean I don't think we could just use a wrapper, but would need to add a static if
inside writeln
to handle interpolated strings to automatically expand them.
And yes, having a toString(sink)
would also be good.
Thanks for taking the time to write this up, I like it alot. I'll think about it and may update my implementation to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more comment on your interpolation proposal. I think a big use case for interpolated strings is to make mixins much easier to write/maintain. Using q{}
string literals along with interpolated strings is a big win. You mentioned in your article that you don't think they should be supported, and provided an example that you thought was ambiguous. However, it's currently well-defined in my implementation.
There's 2 ways you could go about string interpolation. You could either parse the interpolated string at the same time you are you are parsing the string literal, or you could first parse the string literal and then interpolate it afterwards. I opted for the second design. With this model, it's easier to determine what will happen in cases like yours because the string is processed by each part one at a time. So in your example:
mixin(iq{
string s = i"foo $(bar)";
});
- The token string is parsed. Token strings don't have escapes so it stays the same.
- Then the interpolater runs. It ignores everything except when it sees a '$' character. The rule is made intentionally simple so a developer will be able to determine what will happen. The the string above will become:
mixin(__d_interpolatedString(`
string s = i"foo `, bar, `";
`));
- Now mixin gets involved, of which interpolatedString support should be added, and it will convert
bar
to a string. Let's assume its string representation is"barvalue"
.
mixin(`
string s = i"foo barvalue";
`));
string s = i"foo barvalue";
- Now the string literal parser and interpolator run again which gives us:
string s = __d_interpolatedString("foo barvalue");
Now if you wanted the other behavior where the $(bar)
expression is expanded later, then you would escape the $
with $$
:
mixin(iq{
string s = i"foo $$(bar)";
});
Will become:
string s = i"foo $(bar)";
Which becomes:
string s = __d_interpolatedString("foo ", bar);
So by default, any $(...)
expressions are always expanded by the initial interpolation, and if you want to delay expansion then you can escape the dollar. The interpolation parser doesn't try to understand the characters in between the '$' expressions and escape quoted sections etc... that just ends up causing more confusion and will cause the developer headaches when they try to figure out how to get what they want from their interpolated strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, double quoted strings have the same issue. Your example doesn't only apply to token strings:
mixin("
string s = i\"foo $(bar)\";
");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think it is trivial with double-quoted strings because it is all syntax highlighted as a big string :) with token strings, the D lexer runs inside too so you might think it is different.
But that said, your reasoning is solid and simple to understand, so I could go with that too. (just gotta be sure we mention all these things - and explicitly mention what we decide to exclude - to get past the Walter barrier :) )
And also I guess mixin should probably be able to handle the interpolated thing too. The alias this
could handle that, so could an explicit call to toString
(or text
), but we should prolly document it regardless.
PS this is a kinda weird place to be having this conversation, as comments on a random line of example code :P but whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using alias this
with toString
would mean that using mixins with interpolated strings would require phobos.
enum SomeValue = 1234;
mixin(iq{enum AnotherValue = $(SomeValue + 1);});
Seems wrong to me....I'll have to think about this one.
Token: | ||
... | ||
StringLiteral | ||
i StringLiteral |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to have an interpolated WysiwygString, it isn't WYSIWYG. You also can't have escapes so you can't have a $
character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way the DIP should specify what happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string literal itself is still WYSIWYG. It's just that afterwards, it's processed by the interpolator. You can think of the interpolator like an operator and/or function call. For example, if you passed a WYSIWYG string to the "toUpper" function, the string wouldn't by WYSIWYG.
src/build.d:556:<br> | ||
`auto hostDMDURL = "http://downloads.dlang.org/releases/2.x/"~hostDMDVer~"/dmd."~hostDMDBase;`<br> | ||
Becomes:<br> | ||
`auto hostDMDURL = i"http://downloads.dlang.org/releases/2.x/$hostDMDVer/dmd.$hostDMDBase".text;`<br> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better example would be to use a non-string variable:
import std.conv : to;
string folder = "releases/" ~ dmd_major_version.to!string ~ ".x";
// no `to` import needed (but `text` might be)
string folder = "releases/$dmd_major_version.x".text;
The author seems to have deleted his GH account? |
I would take it over, but alas, it would be irresponsible for me to invest so much into yet another DIP when the process is so broken. Hopefully the leadership takes our feedback and fixes the process. I would love to create a good DIP and work with the community to fine tune this feature. |
...The hell!? Why am I not surprise by this? It quite clear that we need an issue tracker to track fundamental language issues that people have with the language as not everyone have the time/skill to write an DIP and wait through the 180+ response. There are two advantages to this approach:
This is no means a replacement for the DIP. This allow easy information gathering when writing a DIP. |
@12345swordy sounds like a fine idea. Maybe you should write a DIP? jk :) |
@marler8997 I suggest you reboot this as a new DIP to match your proposal. It seems to be the easiest way forward. |
On Tue, May 14, 2019 at 10:07:24AM -0700, Jonathan Marler wrote:
Using `alias this` with `toString` would mean that using mixins with interpolated strings would require phobos.
Yes, but only at compile time. And the conversions need to be done *somehow*. The compiler could easily be responsible for strings and ints, but when we get more complex it might mean running user-defined code anyway. But the CTFE engine can handle all that.
|
I don't really have an issue that it's using phobos necessarily. More an issue that it's hard coded to use // somewhere in object.d
// make sure this a template so it doesn't try to import anything unless it's used
template __d_valuesToString ()
{
import std.conv : text;
alias __d_valuesToString = std.conv.text;
} |
On Tue, May 14, 2019 at 11:05:54AM -0700, Jonathan Marler wrote:
I don't really have an issue that it's using phobos necessarily. More an issue that it's hard coded to use `std.conv.text`, or whatever. Maybe we could make that configurable in `object.d`?
It is - all the compiler does is call a function. The rest of the implementation is in that function, in object.d or whereever. It is a normal struct with a normal toString template that runs normal code.
|
Gotcha, I'm good with that. |
Given that the DIP author's account has been deleted and Walter has submitted an alternative proposal, I'm closing this. |
Can you link to his proposal? |
No description provided.