Proposal: String interpolation via AppendFormat pattern #28945

Closed
stephentoub opened this issue Mar 12, 2019 · 5 comments · Fixed by #51086
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime

@stephentoub
Member

Copied from @stephentoub https://github.com/dotnet/corefx/issues/35806#issuecomment-470310193

One of the main use cases this is being proposed for is around string interpolation and string formatting.

I realize there are other use cases, so this is not necessarily instead of something Variant-like, but specifically to address the case of string interpolation, I had another thought on an approach...

Today, you can define a method like:

AppendFormat(FormattableString s);

and use that as the target of string interpolation, e.g.

AppendFormat($"My type is {GetType()}.  My value is {_value:x}.");

Imagine we had a pattern (or an interface, though that adds challenge for ref structs) the compiler could recognize where a type could expose a method of the form:

AppendFormat(object value, ReadOnlySpan<char> format);

The type could expose additional overloads as well, and the compiler would use normal overload resolution when determining which method to call, but the above would be sufficient to allow string interpolation to be used with the type in the new way. We could add this method to StringBuilder, for example, along with additional overloads for efficiency, e.g.

public class StringBuilder
{
    public void AppendFormat(object value, ReadOnlySpan<char> format);
    public void AppendFormat(int value, ReadOnlySpan<char> format);
    public void AppendFormat(long value, ReadOnlySpan<char> format);
    public void AppendFormat(ReadOnlySpan<char> value, ReadOnlySpan<char> format); // etc.
}

We could also define new types (as could anyone), as long as they implemented this pattern, e.g.

public ref struct ValueStringBuilder
{
    public ValueStringBuilder(Span<char> initialBuffer);
    
    public void AppendFormat(FormattableString s);
    public void AppendFormat(object value, ReadOnlySpan<char> format);
    public void AppendFormat(int value, ReadOnlySpan<char> format);
    public void AppendFormat(long value, ReadOnlySpan<char> format);
    public void AppendFormat(ReadOnlySpan<char> value, ReadOnlySpan<char> format); // etc.

    public Span<char> Value { get; }
}

Now, when you call:

ValueStringBuilder vsb = ...;
vsb.AppendFormat($"My type is {GetType()}.  My value is {_value:x}.");

rather than generating what it would generate today if this took a FormattableString:

vsb.AppendFormat(FormattableStringFactory.Create("My type is {0}. My value is {1:x}.", new object[] { GetType(), (object)_value }));

or if it took a string:

vsb.AppendFormat(string.Format("My type is {0}. My value is {1:x}.", GetType(), (object)_value));

it would instead generate:

vsb.AppendFormat("My type is ", default);
vsb.AppendFormat(GetType(), default);
vsb.AppendFormat(". My value is ", default);
vsb.AppendFormat(_value, "x");
vsb.AppendFormat(".", default);

There are more calls here, but most of the parsing is done at compile time rather than at run time, and a type can expose overloads to allow any type T to avoid boxing, including one that takes a generic T if so desired.
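
As a rough illustration of the lowering described above, here is a minimal sketch of a type that follows the pattern, together with the sequence of calls written out by hand (no compiler performs this rewrite in the sketch). `SketchBuilder` and `LoweringDemo` are hypothetical names, not BCL types. The `string` overload is an assumption added so that literal segments bind unambiguously, and invariant-culture formatting is likewise an assumption.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical type following the proposed pattern; not a real BCL API.
public sealed class SketchBuilder
{
    private readonly StringBuilder _sb = new StringBuilder();

    // Assumed overload for literal segments: copied as-is, format ignored.
    public void AppendFormat(string value, ReadOnlySpan<char> format)
        => _sb.Append(value);

    // The minimum overload the pattern requires; value types are boxed here.
    public void AppendFormat(object value, ReadOnlySpan<char> format)
    {
        if (value is IFormattable formattable)
        {
            string fmt = format.IsEmpty ? null : format.ToString();
            _sb.Append(formattable.ToString(fmt, CultureInfo.InvariantCulture));
        }
        else
        {
            _sb.Append(value?.ToString());
        }
    }

    // Specialized overload: avoids boxing the int.
    public void AppendFormat(int value, ReadOnlySpan<char> format)
    {
        string fmt = format.IsEmpty ? null : format.ToString();
        _sb.Append(value.ToString(fmt, CultureInfo.InvariantCulture));
    }

    public override string ToString() => _sb.ToString();
}

public static class LoweringDemo
{
    // Hand-written equivalent of:
    //   sb.AppendFormat($"My type is {type}. My value is {value:x}.");
    public static string Build(Type type, int value)
    {
        var sb = new SketchBuilder();
        sb.AppendFormat("My type is ", default);
        sb.AppendFormat(type, default);          // binds to the object overload
        sb.AppendFormat(". My value is ", default);
        sb.AppendFormat(value, "x");             // binds to the int overload
        sb.AppendFormat(".", default);
        return sb.ToString();
    }
}
```

Calling `LoweringDemo.Build(typeof(string), 255)` produces `My type is System.String. My value is ff.`. Note that the format string itself is never parsed at run time; only the per-hole format specifier (`"x"`) is passed through.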


Copied from @JeremyKuhne https://github.com/dotnet/corefx/issues/35806#issuecomment-470365126

@stephentoub Generally speaking I like the idea of moving parsing to compile time. I'll play around to see what sort of perf implications it has.

One thing I'd want to make sure we have an answer for is how do we fit ValueFormatableString (or something similar) into this picture? Ideally we can add just one overload to Console.WriteLine() that will magically suck $"" away from Console.WriteLine(string). Could we leverage ValueStringBuilder for this?

int count = 42;
Console.WriteLine($"The count is {count}.");

// And we have the following overload
void WriteLine(in ValueStringBuilder builder);

// Then C# generates:
ValueStringBuilder vsb = new ValueStringBuilder();
// ... the series of Appends() ...
WriteLine(vsb);
vsb.Dispose(); // Note that this isn't critical, it just returns any rented space to the ArrayPool

We could also add overloads that take IFormatProvider, ValueStringBuilder? Or possibly just add an optional IFormatProvider on ValueStringBuilder? Then something like this could happen:

Console.WriteLine(myFormatProvider, $"The count is {count}.");

// Creates the following
ValueStringBuilder vsb = new ValueStringBuilder(myFormatProvider);
// ... the series of Appends() ...
WriteLine(vsb);
vsb.Dispose();
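
To make the format-provider idea a little more concrete, the following is a minimal sketch of a builder that takes an `IFormatProvider` in its constructor and applies it to every formatted hole. `ProviderSketchBuilder` and `ProviderDemo` are hypothetical names, and the sketch is backed by a `StringBuilder` rather than pooled buffers, so it has nothing to dispose. A `double` is used instead of the `int` above only so that the provider visibly changes the output.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical builder that carries a format provider; not a real BCL API.
public sealed class ProviderSketchBuilder
{
    private readonly StringBuilder _sb = new StringBuilder();
    private readonly IFormatProvider _provider;

    public ProviderSketchBuilder(IFormatProvider provider)
    {
        _provider = provider;
    }

    // Literal segments are copied as-is.
    public void AppendFormat(string value, ReadOnlySpan<char> format)
        => _sb.Append(value);

    // Formatted holes use the provider supplied at construction time.
    public void AppendFormat(double value, ReadOnlySpan<char> format)
        => _sb.Append(value.ToString(format.IsEmpty ? null : format.ToString(), _provider));

    public override string ToString() => _sb.ToString();
}

public static class ProviderDemo
{
    // Hand-written equivalent of: WriteLine(provider, $"The count is {count}.")
    public static string Build(double count)
    {
        // A custom NumberFormatInfo keeps the example independent of installed culture data.
        var provider = new NumberFormatInfo { NumberDecimalSeparator = "," };
        var vsb = new ProviderSketchBuilder(provider);
        vsb.AppendFormat("The count is ", default);
        vsb.AppendFormat(count, default);
        vsb.AppendFormat(".", default);
        return vsb.ToString();
    }
}
```

With this sketch, `ProviderDemo.Build(1.5)` yields `The count is 1,5.`, since the decimal separator comes from the provider rather than from the current culture.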

Copied from @stephentoub https://github.com/dotnet/corefx/issues/35806#issuecomment-470392810

It would be really nice if the high-performance formatting supported consuming Span items.

This is one of the advantages I see to the aforementioned AppendFormat approach. In theory you just have another AppendFormat(ReadOnlySpan<char> value, ReadOnlySpan<char> format) overload, and then you could do $"This contains a {string.AsSpan(3, 7)}" and have that "just work".
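
A minimal sketch of what that span overload could look like on a `ref struct` builder that writes into a caller-supplied buffer. `SpanSketchBuilder` and `SpanDemo` are hypothetical names; the sketch does not grow the buffer, so a write that does not fit throws. The calls in `Build` are the hand-written equivalent of the interpolated string, with `.AsSpan()` written explicitly on the literal to make the overload choice obvious.

```csharp
using System;

// Hypothetical stack-friendly builder; not a real BCL API.
public ref struct SpanSketchBuilder
{
    private readonly Span<char> _buffer;
    private int _pos;

    public SpanSketchBuilder(Span<char> initialBuffer)
    {
        _buffer = initialBuffer;
        _pos = 0;
    }

    // Character data is copied directly; no intermediate string is allocated.
    public void AppendFormat(ReadOnlySpan<char> value, ReadOnlySpan<char> format)
    {
        value.CopyTo(_buffer.Slice(_pos)); // throws ArgumentException if it does not fit
        _pos += value.Length;
    }

    public ReadOnlySpan<char> Value => _buffer.Slice(0, _pos);
}

public static class SpanDemo
{
    // Hand-written equivalent of: $"This contains a {text.AsSpan(3, 7)}"
    public static string Build(string text)
    {
        Span<char> buffer = stackalloc char[64];
        var sb = new SpanSketchBuilder(buffer);
        sb.AppendFormat("This contains a ".AsSpan(), default);
        sb.AppendFormat(text.AsSpan(3, 7), default); // slice is never materialized as a string
        return sb.Value.ToString();                  // one allocation, only for the final result
    }
}
```

For example, `SpanDemo.Build("0123456789")` returns `This contains a 3456789`.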

@vancem
Contributor

vancem commented Mar 12, 2019

@svick
Contributor

svick commented Mar 13, 2019

I don't think translating a single call to method M with interpolated string to multiple calls to overloads of M is sufficient, at least not without some indication about which call is the last one. For example, for Console.WriteLine, the implementation would have no information about where to actually insert the newline.


Ideally we can add just one overload to Console.WriteLine() that will magically suck $"" away from Console.WriteLine(string).

Currently, interpolated strings convert to string first and to FormattableString only if that's not possible. I'm not sure what the exact rationale for that design decision was (compatibility?), but wouldn't the same rationale apply here? That would mean it would not be possible to "suck away" interpolated strings from the string overload.


Also, I think it would be nice if interpolated strings supported something like structured logging. So, if I call something like logger.Log(LogLevel.Info, $"The result of processing request {requestId} was {result}.");, it has all the data it needs to produce:

Field      Value
message    The result of processing request 42 was Success.
level      Info
requestId  42
result     Success

This is not possible today with FormattableString, because it does not give any information about the expression that was used in the formatting hole, only its value (see dotnet/csharplang#1949).

A new representation for interpolated strings might be a good opportunity to enhance them in this way, or at least to make sure the proposed design does not prevent doing this in the future.

@stephentoub
Member Author

For example, for Console.WriteLine, the implementation would have no information about where to actually insert the newline.

Console.WriteLine would need either to have an overload accepting the target (e.g. the ValueStringBuilder) and it would write the newline into it, or you would use Console.Write rather than Console.WriteLine and append the newline yourself (ideally with a Write overload accepting a span to avoid the trip through a string), or it could do two separate writes to the console.
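
The first option can be sketched roughly as follows: an overload that receives the already-built target knows the content is complete, so it can append the newline itself and then write the chunks as spans without going through a string. `WriterSketch` and `WriterDemo` are hypothetical names, a `StringBuilder` stands in for the builder type discussed here, and a `TextWriter` stands in for the console. `StringBuilder.GetChunks` requires .NET Core 3.0 or later.

```csharp
using System;
using System.IO;
using System.Text;

public static class WriterSketch
{
    // Hypothetical overload: the builder arrives complete, so the newline position is known.
    public static void WriteLine(TextWriter writer, StringBuilder builder)
    {
        builder.Append(writer.NewLine);

        // Write each chunk as a span to avoid materializing one large string.
        foreach (ReadOnlyMemory<char> chunk in builder.GetChunks())
        {
            writer.Write(chunk.Span);
        }
    }
}

public static class WriterDemo
{
    public static string Run()
    {
        // A fixed NewLine keeps the output identical across platforms.
        var writer = new StringWriter { NewLine = "\n" };

        // Stands in for the series of appends the compiler would generate.
        var builder = new StringBuilder();
        builder.Append("The count is ").Append(42).Append('.');

        WriterSketch.WriteLine(writer, builder);
        return writer.ToString();
    }
}
```

`WriterDemo.Run()` returns the text `The count is 42.` followed by a single `\n`.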

At the end of the day, this approach only gets you as far as someone using the thing that's building up the result... there will often be some point at which you convert that result into whatever the target is (e.g. a string, or a span), after which any additional manipulation/allocations are beyond your control.

but wouldn't the same rationale apply here?

Not necessarily. @jaredpar is already proceeding on a path here where the compiler could prefer a new type over string.

@jaredpar
Member

Not necessarily. @jaredpar is already proceeding on a path here where the compiler could prefer a new type over string.

Indeed. The proposal is being tracked here

dotnet/csharplang#2302

In summary though: if we can find a good win and an explainable back compat case then we should be able to move forward here with a change.

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 5.0 milestone Feb 1, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@stephentoub stephentoub removed the untriaged New issue has not been triaged by the area owner label Feb 25, 2020
@JeremyKuhne JeremyKuhne removed their assignment Mar 25, 2020
@stephentoub stephentoub modified the milestones: 5.0.0, Future Jun 17, 2020
@stephentoub
Member Author

Replaced by #50601

@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Apr 11, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 15, 2021
7 participants