Multi-dollar interpolation #375

Open
serras opened this issue Apr 12, 2024 · 21 comments

Comments

@serras

serras commented Apr 12, 2024

This KEEP proposes multi-dollar interpolation. The current full text of the proposal can be found here.

We propose an extension of string literal syntax to improve the situation around $ in string literals. Literals may configure the amount of $ characters required for interpolation.
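As a rough illustration of the rule described above (a sketch, not text taken from the KEEP): with a $$ prefix, a single $ would be ordinary text and $$ would start interpolation. The helper name orderSchema and the JSON content are made up for this example, and the snippet assumes a compiler that accepts the proposed syntax.

```kotlin
// Hypothetical helper, for illustration only.
// Assumes the compiler supports the multi-dollar syntax proposed in this KEEP.
fun orderSchema(title: String): String =
    // Prefix is $$, so "$id" stays literal and "$$title" interpolates `title`.
    $$"""{"$id": "order", "title": "$$title"}"""

fun main() {
    println(orderSchema("Order"))
}
```

Under these assumptions the literal `$id` survives untouched, which is the situation the proposal aims to make easier for JSON-like content.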

@lukellmann

lukellmann commented Apr 12, 2024

Marking the end of the string is done using as many " symbols as those beginning the string.

Won't this change the meaning of code like this:

// 1)
// prints "foo" now
// would print foo in the future, the literal is starting with 4 "
println(""""foo"""")

// 2)
// prints "foo now
// would fail to compile in the future
println(""""foo""")

I have written code like 2) recently to create single line JSON literals:

val json = """{"id":"0","guild_id":"0","name":"rule","creator_id":"0","event_type":1,"trigger_type":3,""" +
    """"trigger_metadata":{},"actions":[],"enabled":false,"exempt_roles":[],"exempt_channels":[]}"""
//  ^^^^
//  notice the 4 " here

@OliverO2

I understood the proposal to imply that the number of leading/trailing " symbols can only be changed from the standard 1/3 for strings prefixed with at least one $ symbol. So it is backwards-compatible and the above examples would be interpreted in the same way as before.

@AarjavP

AarjavP commented Apr 12, 2024

also curious, in this example:

$$"$${order.product} costs $ $${order.price}"

will removing space after costs $ cause any issue or will it just print costs $150?

@zarechenskiy
Member

zarechenskiy commented Apr 12, 2024

also curious, in this example:

$$"$${order.product} costs $ $${order.price}"

will removing space after costs $ cause any issue or will it just print costs $150?

It will print costs $150. The rules for interpolation should naturally extend the existing ones, and right now it's possible to use two dollars consecutively, one as a sign and the last one to mark interpolation:

fun main() {
    val price = 150
    println("costs $$price") // costs $150
}

@zarechenskiy
Member

zarechenskiy commented Apr 12, 2024

Marking the end of the string is done using as many " symbols as those beginning the string.

Won't this change the meaning of code like this:

// 1)
// prints "foo" now
// would print foo in the future, the literal is starting with 4 "
println(""""foo"""")

// 2)
// prints "foo now
// would fail to compile in the future
println(""""foo""")

I have written code like 2) recently to create single line JSON literals:

val json = """{"id":"0","guild_id":"0","name":"rule","creator_id":"0","event_type":1,"trigger_type":3,""" +
    """"trigger_metadata":{},"actions":[],"enabled":false,"exempt_roles":[],"exempt_channels":[]}"""
//  ^^^^
//  notice the 4 " here

It won't, as we'd like to have backwards-compatible rules and require the dollar sign for the new rules. However, this approach implicitly introduces a new kind of string literals, and that is a pretty high price for this issue. We'll easily run into subtle differences because the $ in front of a string literal will have two meanings:

println(""""f"""") // "f"
println($""""f"""") // f
val id = "x"
println(""""$id"""") // "x"
// if I just wanted to use '$' sign:
println($$""""$id"""") // $id; quotes are missing

And there are a lot of single-line strings that start with more than three " symbols: https://github.com/search?q=%22%22%22%22+language%3AKotlin&type=code&ref=advsearch

@Peanuuutz

Peanuuutz commented Apr 12, 2024

Indeed, the dollar escaping rule looks good to me, but the subtle difference between two versions of """" bothers me a lot. I'd like to propose the following option:

A string literal, whether single line or multi-line, when starting with N $, requires exactly N $ after the 1 or 3 quotes to close. When this happens, interpolation should also start with the corresponding number of $.

val num = 0

val sinOne = $"$num"$
> 0

val sinTwo = $$"$num"$$
> $num

val sinEscape = $"\"$num\""$
> "0"

val rawOne = $"""
    {
        "key": $num
    }
"""$.trimIndent()
> {
>     "key": 0
> }

val rawTwo = $$"""
    {
        "$num": $$num
    }
"""$$.trimIndent()
> {
>     "$num": 0
> }

val rawPseudoEscape = $"""""""""$
> """

val rawPseudoEscapy = $$"""$$"""$"""$$
> $$"""$

serras changed the title from "Improve handling of $ and " in string literals" to "Multi-dollar interpolation" on Apr 12, 2024
@serras
Author

serras commented Apr 12, 2024

I've pushed a new version of the proposal, removing everything related to double quotes. That way the KEEP is only about multi-dollar interpolation.

Thanks @lukellmann for noticing the problems with the proposed approach so fast. 🚀

@serras
Author

serras commented Apr 12, 2024

I think the case of embedding """ inside a raw string literal happens way less often than the need of embedding $ in such a literal. Given that, I think that @Peanuuutz's proposal would add too much ceremony to most of the strings.

@sken130

sken130 commented Apr 13, 2024

But if we don't solve the triple double quotes problem, then the whole proposal is only a partial solution, and it wastes the syntax space.

Nowadays more and more languages are using triple double quotes, and we will have more and more chance to encounter the need to embed them. When we regret today's decision, we will have to introduce even more syntax to solve it.

I am not asking to solve every unknown problem, but please at least solve the problems we already know about.

@Peanuuutz

Peanuuutz commented Apr 13, 2024

How is adding just a trailing sequence of $ considered "too much ceremony"? For most multi-line literals the actual content is far more noticeable than the opening and closing characters. If $ is needed, it takes just a pair of $$, and the triple-quote part doesn't need any change because it is solved by the same rule. Two birds with one stone. Why put it off for later?

Is "short" better than "good"? I truly hope not.

@serras
Author

serras commented Apr 13, 2024

Sorry, I didn't mean that we shouldn't look for the best solution, and certainly using words like "ceremony" wasn't good on my side. My apologies.

The problem statement, as I see it, is the following. We would like to find a way to include """ in a multiline string, but without falling into the "trap" of then having a new sequence of characters we cannot include. The approach with an increasing number of " is one such solution: for any number of " you want to include, you just need to make the opening and closing markers one " longer.

Can we imagine another such solution? In particular, @Peanuuutz, does your proposed solution satisfy these requirements?

@Peanuuutz

Peanuuutz commented Apr 13, 2024

Can we imagine another such solution? In particular, @Peanuuutz, does your proposed solution satisfy these requirements?

After testing a few cases, the answer is no. There is only one, very narrow, edge case where my proposal fails.

val i = 1

$"""
   """
"""$ // Good

$"""
    """$i
"""$ // Compile error

$$"""
    """$i
"""$$ // Good

$$"""
    """$$i
"""$$ // Compile error

That is, now I can't have """ and a consecutive interpolation.

Please note that, even though this happens, we can't allow the surrounding quotes to grow as it then falls into the original situation where quotes are gone after a $ is prepended, see comments above me.

Actually, after reading the original ticket, I'm more in favor of a new representation: '$"Hello $name"$'. I've considered several configurations, such as changing the rules for $, for ", or for both, and each one ends up either with a single edge case that is impossible to write or with a very sneaky change (the quotes are gone with just a $), as raised above. I truly feel that a model with the $ between the starting and ending quotes is what we need, because that way we can safely end a literal: the closing $ is always followed by a single quote (meaning it can never be an interpolation within the string), and we can use multiple $ if the content includes $.

val i = 1

'""'
> (Empty)

'"\n"'
> \\n

'"$i"'
> 1

'$"$i"$'
> 1

'$$"$i"$$'
> $i

'$$"$$i"$$'
> 1

'$""'"$'
> "'

'$""$i'"$'
> "1'

@Amejonah1200

Firstly, thx serras for writing all those KEEPs 🙏

Secondly, I find the solution to the problem very nice, especially because I have wished for it in Kotlin ever since I saw it added to C# 11.


Concerning strings, well... in Rust, which I write, there are no "multiline" strings, as strings can contain newlines anyway. So there is only r##"..."## there. About indentation: C# 11 also introduced the ability to trim the indent by placing a specific amount of whitespace before the closing """. The only limitation I see there is that it is a compile error if any line has a smaller indent; I would have removed that limitation.

@OliverO2

Trimming the indent for multi-line strings at compile time sounds attractive. Why add bloat and bear the cost of invoking trimIndent() at runtime?

@JakeWharton

trimIndent and trimMargin already run at compile-time (and have for years) if the string is a constant. If the string captures variables, however, you cannot perform the operation and it is deferred to runtime.
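A small sketch of the distinction described above. The function names are hypothetical, and whether the first call is actually folded at compile time depends on the compiler, so the comments restate the claim from this comment rather than a guarantee.

```kotlin
// No templates inside: the literal is a constant, so (per the comment above)
// the compiler may apply trimIndent() at compile time.
fun staticHelp(): String = """
    usage: tool [options]
      -h  show help
""".trimIndent()

// The literal captures `name`, so the final text is only known at runtime
// and trimIndent() has to run then.
fun greeting(name: String): String = """
    Hello,
      $name!
""".trimIndent()

fun main() {
    println(staticHelp())
    println(greeting("Kotlin"))
}
```

Both calls return text with the common four-space indent removed; only the point in time at which the trimming can happen differs.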

@ephemient

I do think there may be value in having a way to "compile-time trim this multi-line string, but not the expressions interpolated into it", but we don't have a way to express that now.

@serras
Author

serras commented Apr 15, 2024

Trimming the indent for multi-line strings at compile time sounds attractive.

As discussed at the beginning of the KEEP, we've decided to split this concern to another KEEP which is in the works. The reason why trimming is harder is because of possible interactions with string templates.
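One way such an interaction can show up with today's runtime trimIndent(), sketched with a hypothetical render function: trimming runs after interpolation, so the interpolated text takes part in computing the common indent.

```kotlin
// Hypothetical helper, for illustration only.
fun render(body: String): String = """
    <div>
        $body
    </div>
""".trimIndent()

fun main() {
    // Single-line value: the common indent is 4 spaces and is removed.
    println(render("hello"))

    // Multi-line value: its second line starts at column 0, so the common
    // indent becomes 0 and the literal's own indentation is kept.
    println(render("line one\nline two"))
}
```

The second call shows why trimming only the literal part, and not the interpolated expressions, would behave differently from calling trimIndent() on the finished string.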

@Amejonah1200

Amejonah1200 commented Apr 15, 2024

Trimming the indent for multi-line strings at compile time sounds attractive.

As discussed at the beginning of the KEEP, we've decided to split this concern to another KEEP which is in the works. The reason why trimming is harder is because of possible interactions with string templates.

@serras did you see the C# trimming for """ strings? What's your take on that?

@serras
Author

serras commented Apr 16, 2024

did you see the C# trimming for """ strings? What's your take on that?

Yes, I've seen how they did it. However, whatever solution we come up with, we need to be careful not to break backwards compatibility, and we should try to keep some uniformity across the language. When our team discussed the issue, we came to two conclusions:

  1. Having multiline strings with $ and without it behave differently with respect to trimming is not good. We would like to provide a solution which allows "fixing" both kinds of literals.
  2. There are different ways people want trimming (in the Kotlin community, people use both trimIndent and trimMargin). We have to acknowledge that, and not force everybody into the same trimming behavior.

As hinted above and by this message, we are working on this. However, it may take longer to reach a solution.

@serras
Author

serras commented Apr 19, 2024

After some discussion, we've decided not to handle the problem of three double quotes in a multiline string.

We acknowledge that this solution does not solve the problem of escaping (three or more) " characters inside a multiline string. The workaround is using ${"\"\"\""}, or similar code which interpolates a single-line string with the three symbols.
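A minimal sketch of that workaround in current Kotlin. The function name pythonDocstring and the sample text are invented for the example.

```kotlin
// Hypothetical helper, for illustration only.
// Each ${"\"\"\""} interpolates a single-line string that holds three quotes,
// so the raw string itself never contains a literal """ in its content.
fun pythonDocstring(summary: String): String =
    """${"\"\"\""}$summary${"\"\"\""}"""

fun main() {
    println(pythonDocstring("Adds two numbers."))
}
```

The result is the summary wrapped in three double quotes on each side, at the cost of a fairly noisy literal.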


Our preliminary code search for usages of """" (that is, using a multiline string literal with double quotes inside) shows that this is a relevant pattern (7.2K usages), so we should not proceed with the change of making the closing block of double quotes match the opening one.

In contrast, code search for usages of ${"\"\"\""} reveals only 140 usages in GitHub. This shows that the need here is quite narrow, and we prefer a simpler extension of syntax (adding only $ at the front) instead of changing both begin and end markers.

@OliverO2

Do I understand correctly that the case for using a single-dollar prefix $"""...""" is now moot since the quote interpretation rule change has gone?

If so, wouldn't it be reasonable to change the proposed syntax so that a prefix now requires 2 or more consecutive $ symbols, and the single $ symbol is disallowed?
