Suggestion: optimize codegen for "simple" string interpolation #9212
Comments
Related: #194. |
Sorry about that, I did a search for issues about "interpolation", not sure how I missed this one. Apparently the idea was dismissed, I will read why later on. |
I read the other thread and @bbarry seems to have made the argument that led to closing the issue. I think not all arguments were convincing. Let me recap some of that thread's objections. First, keep in mind that we could choose to draw the line between what is optimized and what falls back to simpler codegen anywhere we want to. Most interpolations I see in code at work are basic: few parameters, no format, no alignment. Those could easily compile to specialized code.
(1) Thread #194 focuses on
(2) Alignment is perceived as a huge show stopper... Just add a helper method to align strings and we're done:
(3) Boxing can't be removed most of the time. Argument is: when there's a format cast to
I understand MS has finite resources and maybe there are more important problems.
Such examples are very common and the generated code could be optimal. (Micro 😦) benchmark: |
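The alignment helper mentioned in point (2) did not survive in this copy of the thread. As a rough illustration only, here is a minimal sketch of what such a helper might look like. The names `AlignSketch` and `Align` are hypothetical and do not come from the thread or the BCL. It assumes the same convention as composite formatting, where a positive width right-aligns and a negative width left-aligns.

```csharp
using System;

public static class AlignSketch
{
    // Hypothetical helper (not a BCL method): pads a string that has already
    // been formatted, mirroring the "{0,width}" convention of string.Format.
    // A positive width right-aligns the value, a negative width left-aligns it.
    public static string Align(string value, int width)
    {
        value = value ?? string.Empty;
        return width >= 0 ? value.PadLeft(width) : value.PadRight(-width);
    }
}
```

With a helper along these lines, an interpolation hole such as `{name,10}` could in principle be lowered to `AlignSketch.Align(name, 10)` before concatenation, without parsing a format string at run time.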
@jods4
Of course, since the compiler knows that
Is that what you're hoping we'll do? I don't see how to get it any simpler than that. Things are not much better for
|
@gafter I was unaware that all formatting operations look whether This is indeed a roadblock for generating efficient formatting code. Idea: given that almost all code runs with the built-in
string result = CultureInfo.CurrentCulture is ICustomFormatter ?
    string.Format("Hello {0}", name) :
    "Hello " + name;
I admit that it's getting complicated 😞 |
@jods4 A good example is using the currency formatter. |
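To make the `ICustomFormatter` concern from the comments above concrete, here is a small sketch of how a custom formatter can change the output of `string.Format` even for plain string arguments. It does not reproduce the exact scenario discussed in the thread: it passes the provider explicitly, and `ShoutingProvider` and `CustomFormatterSketch` are made-up names for illustration.

```csharp
using System;

// Hypothetical provider: hands out itself as the ICustomFormatter and
// upper-cases every argument it is asked to format.
public sealed class ShoutingProvider : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        return formatType == typeof(ICustomFormatter) ? this : null;
    }

    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        return arg == null ? string.Empty : arg.ToString().ToUpperInvariant();
    }
}

public static class CustomFormatterSketch
{
    // string.Format asks the provider for an ICustomFormatter and routes
    // every argument through it, including strings.
    public static string ViaFormat(string name)
    {
        return string.Format(new ShoutingProvider(), "Hello {0}", name);
    }

    // Plain concatenation never consults any formatter.
    public static string ViaConcat(string name)
    {
        return "Hello " + name;
    }
}
```

The two methods return different strings for the same input, which is why a compiler cannot blindly replace a `string.Format` call with concatenation when a custom formatter may be in play.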
@leppie can you elaborate? I'm curious and not sure what you refer to. |
@jods4 |
Oh sure. I use those all the time (numbers and dates mostly). The feature I didn't know about was that |
It could take a fast path for string args though if it does not do that already. |
@leppie I'm kind of wondering what a proper usage of that feature would be. I mean: if you want to override formatting for a class you don't have control of (so can't implement |
@jods4 |
@jods4 Also such optimizations might even be performed by the JIT, or at least it could hint that way. |
@leppie On the other hand, you'd have no idea there is a type check when you use string interpolation. Kind of wtf. |
In the pull request I initially submitted for this I reached the conclusion that a few helper methods added to the BCL would ultimately save a few instructions worth of time but that this would require an effort across multiple projects both open and closed all to make the compiler more complex. |
@bbarry Interesting to see the PR, as I can now see this is where most of the discussion about this issue has taken place. Like you (judging by your comments), I feel sad that C# introduces clean, attractive new features that devs quickly adopt (and that are sometimes suggested by refactoring tools such as ReSharper), only to find out later that you should avoid them in perf-critical code (e.g. the Kestrel case you pointed out). It makes me even sadder that the feature is not impossible to implement efficiently (look at Rust formatting), but of course hindsight is always 20/20. I would agree that trying to generate specialized code for all cases is too complicated. But by taking advantage of special cases, common code could be (near-)optimal. Undeniably this is more complexity in the compiler, but it's isolated (interpolation codegen only) and it would benefit all C# users. |
@gafter I didn't see all those duplicates! Maybe that's a sign that something should be done? Optimizing common cheap cases at runtime is possible yet complicated. Here's one idea:
I guess the complicated part will be to come up with a good syntax 😒 |
@jods4, this was suggested in dotnet/csharplang#177 using |
The simplicity and clarity of string interpolation make it very attractive for formatting text.
In fact, at my company I've seen it used for things as trivial as var greeting = $"Hello {name}", which admittedly is only slightly nicer than "Hello " + name.
The thing is: it doesn't have the same performance characteristics. Currently C# converts an interpolated string into a string.Format("Hello {0}", name) call. This has the benefit of being the simplest implementation (from a compiler perspective). But it has quite some overhead: boxing of value type placeholders (such as numbers), parsing of the format string, validation of the parameters, etc.
Most of the time you may not care. But sometimes in hot path code you want to be very careful with those allocations and computations.
I suggest that the compiler could generate more optimized code in at least some (or all?) situations.
It is evident that when you convert to FormattableString or IFormattable you'd need to generate a specialized class for each template, which may be excessive (or not?).
In the common case of interpolating strings, you could turn the parameters into strings at the interpolation site, with specialized code, so that value types don't require boxing.
Concatenation of the results could be done with string.Concat, which doesn't require allocating an array for up to 4 parts.
This means that code like string s = $"x: {x}, y: {y}", which is a common use of string interpolation, could generate the IL equivalent of string s = string.Concat("x: ", x.ToString(), ", y: ", y.ToString()), which is probably the most efficient code you could have to create that string.
If the compiler did that work, we would get both benefits: clean syntax and the most efficient specialized code.
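As a side-by-side sketch of the proposal, the two methods below write out by hand the lowering described above and the one suggested instead. The class and method names are made up for illustration, and the sketch assumes `x` and `y` are `int` values with no format or alignment in the holes.

```csharp
using System;

public static class InterpolationSketch
{
    // The string.Format lowering described above for $"x: {x}, y: {y}":
    // both ints are boxed and the format string is parsed at run time.
    public static string ViaFormat(int x, int y)
    {
        return string.Format("x: {0}, y: {1}", x, y);
    }

    // The suggested lowering: each value is converted to a string at the
    // call site (no boxing), then the four-argument string.Concat overload
    // joins the parts without allocating an array.
    public static string ViaConcat(int x, int y)
    {
        return string.Concat("x: ", x.ToString(), ", y: ", y.ToString());
    }
}
```

For plain `int` arguments, both methods produce the same string, since both conversions use the current culture. Whether the two forms stay equivalent for other types and for custom formatters is exactly what the comments in this thread debate.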