New Unicode-related String Interpolation standard -- consider for Ada? #37
Comments
As much as it makes me cringe to imagine Ada with curly braces, I get the value of such format strings. However to me, it seems the usual way of concatenating strings is very close to what we'd get from any additional complex syntax. Maybe there is a way we can add a simple syntactic sugar approach that collapses the " & ... & " sequence, such as via the dollar sign. For example, what if we say that a '$' within a string with a matching '$' that is more than zero characters away is exactly equivalent to (using single quote to delineate) '" & ' for the first and ' & "' for the second.
This would be syntactically equivalent to
And could be formed through simple text replacement. Also we could follow the same convention for double-quote where two '$' in a row is replaced by a single $.
Which would become
I'm a bit wary of introducing the complexity given in the Unicode standard, particularly given that Ada 2022 has such rich user-defined image facilities. |
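The '$' rule described in this comment amounts to plain text replacement, so it can be sketched outside of any compiler. The following is a minimal illustration in Python, not part of the proposal itself; the function name `expand_dollar_literal` is hypothetical, and the sketch assumes the input is a single literal including its surrounding double quotes.

```python
def expand_dollar_literal(literal: str) -> str:
    """Rewrite a '$'-delimited literal into Ada-style concatenation text.

    Rule sketched here (as described in the comment above):
      * an opening '$' becomes  '" & '
      * its matching '$' becomes ' & "'
      * two '$' in a row (outside an interpolation) become a single '$'
    """
    body = literal[1:-1]          # strip the surrounding double quotes
    out = ['"']
    inside = False                # True between an opening and closing '$'
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "$":
            # '$$' outside an interpolation is an escaped dollar sign.
            if not inside and body[i + 1:i + 2] == "$":
                out.append("$")
                i += 2
                continue
            out.append(' & "' if inside else '" & ')
            inside = not inside
        else:
            out.append(ch)
        i += 1
    if inside:
        raise ValueError("unmatched '$' in literal")
    return "".join(out) + '"'


print(expand_dollar_literal('"Count: $N\'Image$ items"'))
print(expand_dollar_literal('"Cost: $$5"'))
```

Note that the expression between the two '$' characters is copied through untouched, so the 'Image call still has to be written by hand under this scheme.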
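Now that Ada has a universal 'Image, it makes it very annoying to have to specify it all over the place when producing textual output. So the goal is to replace " & X'Image & " (or worse, " & Integer'Image(X + Y) & "), with simply $X or $(X + Y) when appearing in the middle of an "interpolated" string literal. |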
How are you proposing to differentiate "interpolated" string literals from
regular ones? It would seem wildly incompatible to do it always (anything
that happened to contain the trigger strings would get clobbered, and that
would be a behavior change without the possibility of compile-time
detection).
Randy.
…_____
From: S. Tucker Taft ***@***.***
Sent: Wednesday, January 11, 2023 7:04 PM
To: Ada-Rapporteur-Group/User-Community-Input
Cc: Subscribed
Subject: Re: [Ada-Rapporteur-Group/User-Community-Input] New Unicode-related
String Interpolation standard -- consider for Ada? (Issue #37)
Now that Ada has a universal 'Image, it makes it very annoying to have to
specify it all over the place when producing textual output. So the goal is
to replace " & X'Image & " (or worse, " & Integer'Image(X + Y) & "), with
simply $X or $(X + Y) when appearing in the middle of an "interpolated"
string literal.
|
The string literal would start (and possibly end) with a unique sequence, such as:
or
or
as mentioned above in the original note. So a complete interpolated string literal might be:
or
|
Le 12/01/2023 à 02:04, S. Tucker Taft a écrit :
Now that Ada has a universal 'Image, it makes it very annoying to have
to specify it all over the place when producing textual output. So the
goal is to replace " & X'Image & " (or worse, " & Integer'Image(X + Y)
& "), with simply $X or $(X + Y) when appearing in the middle of an
"interpolated" string literal.
I see nothing annoying in having to type a few more characters. This
looks like another feature justified by ease-of-writing, and because
some other interpreted popular language has something like it.
This increases the complexity of the language, defeats orthogonality,
and I doubt it will have any effect on the popularity of the language...
--
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
https://www.adalog.fr https://www.adacontrol.fr
|
On 2023-01-13 10:29, Jean-Pierre Rosen wrote:
I see nothing annoying in having to type a few more characters. This
looks like another feature justified by ease-of-writing, and because
some other interpreted popular language has something like it.
This increases the complexity of the language, defeats orthogonality,
and I doubt it will have any effect on the popularity of the language...
One case where it might be useful, though, is when you have a user-facing
application and you want to be able to translate the output for instance.
The string to be translated would be something like:
"You have ${count} eggs"
Which could be translated to "Vous avez ${count} oeufs"
(likely talking about eggs is pretty rare in Ada applications...)
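As an illustration of the translation use case, Python's standard `string.Template` happens to accept the same `${count}` placeholder syntax, so the idea can be sketched directly. This is only an analogy in another language, not a proposal for Ada; the `MESSAGES` catalog and the `render` helper are hypothetical names introduced for the example.

```python
from string import Template

# Hypothetical message catalog: one template per locale. The translator
# edits only the template text, never the program that fills it in.
MESSAGES = {
    "en": Template("You have ${count} eggs"),
    "fr": Template("Vous avez ${count} oeufs"),
}


def render(locale: str, **values) -> str:
    """Fill the placeholders of the template chosen for `locale`."""
    return MESSAGES[locale].substitute(**values)


print(render("en", count=3))
print(render("fr", count=3))
```

Here the placeholder is resolved at run time from a named value, which is what makes the string translatable; a literal interpolated at compile time would not have that property by itself.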
One other advantage could be performance (though if you are dynamically
building strings in a performance-sensitive loop you are likely doing it
wrong, of course). With the proposed syntax, the compiler could possibly
first compute the overall string length, then allocate it once (heap or
stack), and finally build it in place.
I am a big fan of Python's f-strings (which are similar), though I must
admit I have never really missed that feature all that much in Ada. The
main place where we have to build strings is for logging, and we have a
much more efficient API there that delegates the building of the string
(and calling Image) to a background task.
Emmanuel
|
It really makes quite a difference in readability, and in the reduction of silly errors. An intern and I were writing a compiler in ParaSail that generated LLVM intermediate representation (which has a textual form), and at some point we realized all of the calls on ToString and the various concatenation operations were making the code unbelievably hard to read. Since we could, we added string interpolation to ParaSail, and the improvement was enormous. Yes it made it easier to write, but it also made it much easier to read, and hence much easier to notice mistakes. |
Le 13/01/2023 à 14:35, S. Tucker Taft a écrit :
It really makes quite a difference on readability, and reduction in
silly errors. An intern and I were writing a compiler in ParaSail
that generated LLVM intermediate representation (which has a textual
form), and at some point we realized all of the calls on ToString and
the various concatenation operations were making the code unbelievably
hard to read. Since we could, we added string interpolation to
ParaSail, and the improvement was enormous. Yes it made it easier to
write, but it also made it much easier to read, and hence much easier
to notice mistakes.
Fair enough. But couldn't you achieve the same thing with a couple of
subprograms?
--
J-P. Rosen
|
I don't see how. String interpolation requires the Ada lexer, parser, and semantic analysis to work together. For example, an interpolated string literal like:
I can't quite imagine how a couple of subprograms could handle that. The point is that we are interpolating the 'Image of the value of an arbitrary Ada expression into the middle of a string literal. The equivalent non-interpolated syntax would be:
Both readability and writability are improved in the interpolated version, I would claim. -Tuck |
My interpretation of JP's point, which is one I sympathize with, is that the "solution to your problem" should be produced by a function itself. Particularly if that function was nearer in scope, and would see Y, G, et al directly. In such a case you'd simply say:
Or even better, you could have another function expression that returned My_Time'Image of Compute_Solution at X,
I have used this kind of approach regularly, and I'm having a hard time seeing this proposal as being anything more than yet another lazy programmer feature, which alienates people (like myself) who really don't want to see Ada go that route, and does nothing to satisfy people using languages that are structurally faster to type, such as Rust. Sure it might be more readable in isolation, but I don't think you can as easily argue that it is any more readable than abstracting things out to more specialized subprograms. |
Writing here what I said in the ARG: One should bear in mind that MessageFormat 2.0 (like its ancestor, ICU MessageFormat) is about localized strings; as such, it comes with a rather fancy domain-specific language which is needed to handle the complexities of grammar in localized strings (the main example being pluralization; English is easy here, with just two plural cases—singular for 1, plural for everything else—but many languages are more interesting; consider Russian’s 4 plural cases or Arabic’s 6, depending on the last two digits). See https://unicode-org.github.io/icu/userguide/format_parse/messages/#complex-argument-types in the old MessageFormat and https://github.com/unicode-org/message-format-wg/blob/main/spec/syntax.md#complex-messages in the draft new one. String interpolation syntaxes in programming languages usually do not deal with that; as Tucker mentioned in the ARG meeting, it is common to see such things as the following Python: f"{n} cat{'' if n == 1 else 's'}" The reason why the MessageFormat syntax exists is that the above construct is impossible to localize: translators do not get to change the program, and no amount of playing with the |
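To make the pluralization point concrete, here is a minimal sketch in Python of selecting a message variant by plural category instead of hard-coding an English-only suffix. It is not MessageFormat syntax. The category rules are my reading of the CLDR plural rules for integers (English: one/other; Russian: one/few/many, with "other" reserved for fractions), and the Russian strings and all names (`CAT_MESSAGES`, `format_cats`, and so on) are illustrative assumptions.

```python
def plural_category_en(n: int) -> str:
    """English integer plural category: 'one' for 1, 'other' otherwise."""
    return "one" if n == 1 else "other"


def plural_category_ru(n: int) -> str:
    """Russian integer plural category, decided by the last two digits."""
    n = abs(n)
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


# Hypothetical catalog: each locale supplies its own selector and variants,
# so a translator can add cases without touching the calling code.
CAT_MESSAGES = {
    "en": (plural_category_en, {"one": "{n} cat", "other": "{n} cats"}),
    "ru": (plural_category_ru,
           {"one": "{n} кошка", "few": "{n} кошки", "many": "{n} кошек"}),
}


def format_cats(locale: str, n: int) -> str:
    """Pick the variant for n's plural category, then fill in n."""
    select, forms = CAT_MESSAGES[locale]
    return forms[select(n)].format(n=n)


for count in (1, 2, 5, 11, 21):
    print(format_cats("en", count), "|", format_cats("ru", count))
```

The contrast with the f-string quoted above is that the choice of variant lives in data supplied per locale rather than in an expression embedded in the program text.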
At AdaCore we have been discussing possible ways of supporting string "interpolation" where a special syntax for string literals allows direct "interpolation" of the values of variables and expressions into the string, such as:
"Name = {First_Name} {Last_Name}, Address = {Address}, and Age = {(Now - Birthday) / Year}."
Of course, we would need some way of distinguishing such strings from "normal" string literals, and we have considered various options, such as:
{"Name = {...}."}
or, with a Python-like prefix
F"Name = {...}."
or, using '$' consistently
$"Name = $First_Name $Last_Name, ... $((Now - Birthday) / Year)."$
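For comparison, the F"..." option mirrors Python's f-strings, so the example above can be written in Python today. The sketch below is only an analogy to show the readability difference between concatenation and interpolation; the function names and the integer day-count arguments are assumptions made for the example.

```python
def describe_concat(first, last, address, now, birthday, year):
    """Build the message with explicit conversion and concatenation."""
    return ("Name = " + first + " " + last
            + ", Address = " + address
            + ", and Age = " + str((now - birthday) // year) + ".")


def describe_fstring(first, last, address, now, birthday, year):
    """Build the same message with an interpolated (f-string) literal."""
    return (f"Name = {first} {last}, Address = {address}, "
            f"and Age = {(now - birthday) // year}.")


# Times are plain day counts here, purely to keep the arithmetic simple.
args = ("Ada", "Lovelace", "London", 13505, 0, 365)
print(describe_concat(*args))
print(describe_fstring(*args))
```

Both functions return the same string; the interpolated form keeps the literal text and the embedded expressions in reading order.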
Today I noticed that the Unicode consortium is working on a standard for something that approximates string interpolation, which they call "Message Format 2" (great name ;-):
Message Format 2.0 syntax
which is a follow-on to a relatively old existing standard "ICU MessageFormat", which had some "pain points":
ICU MessageFormat pain points
Here are a couple of simple examples (drawn from Message Format 2.0 syntax):
A message with an interpolated $date variable formatted with the :datetime function:
If we want to consider something like this for standardizing, it would make sense to look at the work the Unicode consortium is doing, as it seems to be based on significant experience, both bad and good, with the ICU MessageFormat.
-Tuck