New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Capture names (or full code) of interpolated values if string interpolation is passed in as a FormattableString #142

Closed
ashmind opened this Issue Jan 29, 2015 · 26 comments

Comments

Projects
None yet
10 participants
@ashmind
Copy link

ashmind commented Jan 29, 2015

This proposal was already answered on CodePlex with "too late for C#6", but since C#7 is taking input at the moment, I think it is worth mentioning this again.

Problem:

Current design for FormattableString is:

public class FormattableString {
    public string Format { get; }
    public object[] Args { get; }
    // ...
}

Consider two following usages:

// Some future Serilog:
log.Information($"Processed {position} in {elapsedMs:000} ms.");

// Some future Dapper:
connection.Execute($"SELECT * FROM T1 WHERE Id = {id}");

In addition to saving the log string, Serilog saves format values as properties on the log entry object. However with current FormattableString the names of these values (position, elapsedMs) would not be available.

Dapper does not really need the names (it can generate parameters named @p1 and @p2), but proper names can improve readability of SQL traces and general SQL debugging.

And another (weird) use case for capturing names of the provided values:

Argument.NotNull($"{x}"); // doesn't actually need the string at all, but uses arg name/value

Compare this with current:

Argument.NotNull(nameof(x), x); // more verbose, not DRY

Potential solution

Add IReadOnlyList<string> Names { get; } to the FormattableString.

Those could be generated based on anonymous class property name algorithm:

1. $"{x}, {y}"  // Names = "x", "y"
2. $"{x.Y.Z}"   // Names = "Z"
3. $"{x}, {5}"  // Names = "x", null

Or just literally contain passed in code as a string, e.g. $"{x+5}" // Names = "x+5", though in this case something like Snippets would be a better name.

Since each call site will have a fixed set of names/snippets, each list can be cached per call site once and reused between calls. This means that even the names/snippets aren't used, the performance cost would be minimal.

Pros
  1. Allows frameworks to be more intelligent when generating a new string from FormattableString or recording the parameters somewhere.
  2. Gives some unexpected uses to the FormattableString by making it a light-weight way to pass identifier-named value -- e.g. validation.
  3. If code is passed as as string instead of generating a name, this can be used to create an easily cacheable analog of expression trees (by parsing it with Roslyn etc), and various other kinds of 'magic' APIs.
Cons
  1. If names are generated using anonymous class algorithm, some of them might be null. This might be unexpected -- frameworks can provide fallback, but they have to be careful of those cases.
  2. If full code is passed as a string, analysis has to be careful so that it is not broken by small variations within the code (e.g. as introduced by refactoring).
@paulomorgado

This comment has been minimized.

Copy link

paulomorgado commented Mar 8, 2015

@ashmind,

What would you expect to be the captured names here?

$"{OneCustomer.Id} = {AnotherCustomer.Id}"
@ashmind

This comment has been minimized.

Copy link

ashmind commented Mar 9, 2015

@paulomorgado As per above, there are two options:

  1. "Id", "Id"
  2. "OneCustomer.Id", "AnotherCustomer.Id"

I am pretty sure there are reasons why second one isn't a good idea -- waiting for more feedback.

@paulomorgado

This comment has been minimized.

Copy link

paulomorgado commented Mar 9, 2015

@ashmind, there can't be two. which one do you propose and why?

@ashmind

This comment has been minimized.

Copy link

ashmind commented Mar 9, 2015

there can't be two

Not when implemented, but within the discussion there can definitely be two options. I believe we'll probably end up with 1 if this is actually implemented, but 2 has some interesting scenarios of its own. So I'm waiting for more feedback to see what other people think.

@atifaziz

This comment has been minimized.

Copy link
Contributor

atifaziz commented Jul 27, 2015

It would be better to see string interpolation handling & packaging driven by user-code rather than hard-wiring into the compiler and expanding on FormattableString. For example, if the compiler sees an interpolated string being converted to a type other than FormattableString then it should invoke methods on that type to build the interpolated string.

Suppose there is a class FormattableStringBuilder. When the compiler sees a conversion as follows:

FormattableStringBuilder fsb = $"Processed {position} in {elapsedMs:000} ms.");

it could generate the following code to hand off the various bits of the interpolated string to the builder type/implementation:

    var fsb = new FormattableStringBuilder(2);
    fsb.AddText("Processed ");
    fsb.AddArgument(position, null, "position");
    fsb.AddText(" in ");
    fsb.AddArgument(elapsedMs, "000", "elapsedMs");
    fsb.AddText(" ms.");

The FormattableStringBuilder would be duck-typed such that no definition is required in the base class libraries. The compiler would simply work with the type using the following contract:

  • Instantiate the type and pass the number of expected arguments if such a constructor of a single int argument exists. This could be used internally by the builder, for example, to allocate an array or initialize the capacities of dynamic lists.
  • For each literal part of the interpolated string, look for a method called AddText taking a single string argument and call it.
  • For each argument ({…}) in the interpolated string, look for a method called AddArgument taking a single argument and pass the expression result into it. The argument can be typed more restrictively than object to allow only a subset of values or otherwise avoid boxing. If AddArgument has two arguments with the second argument being a string then it should receive the format specification for that argument and null if no specification was given. Finally, if there is a third argument that is typed string then the compiler would send the source of the argument expression (as found in the code) into that argument, which would cover this proposal (could also imagine capturing as an expression tree here if the argument is typed such). If AddArgument is overloaded then the one with the most arguments (3) should be chosen and one where the first argument most closely matches the type of the expression.

Here is an example of what FormattableStringBuilder could look like:

class FormattableStringBuilder
{
    public FormattableStringBuilder(int capacity) { /* ... */ }
    public void AddText(string text) { /* ... */ }
    public void AddArgument(object value) { /* ... */ }
    public void AddArgument(object value, string format) { /* ... */ }
    public void AddArgument(object value, string format, string source) { /* ... */ }
}

The details can vary based on further discussion and refinement but the idea I want float here with this approach is that it enables the implementation of the builder to decide everything from formatting and allocation to how the names would be captured in the $"{OneCustomer.Id} = {AnotherCustomer.Id}" case raised by @paulomorgado.

@paulomorgado

This comment has been minimized.

Copy link

paulomorgado commented Jul 27, 2015

@atifaziz, the names are never captured. Only the value of the expressions (that may or may not "have a name") is captured.

And you can do a lot with what's provided out of the box:

public static IDbCommand CreateCommand(this IDbConnection connection, FormattableString commandText)
{
    var command = connection.CreateCommand();
    command.CommandType = CommandType.Text;

    if (commandText.ArgumentCount > 0)
    {
        var commandTextArguments = new string[commandText.ArgumentCount];
        for (var i = 0; i < commandText.ArgumentCount; i++)
        {
            commandTextArguments[i] = "@p" + i.ToString();
            command.AddParameter(commandTextArguments[i], commandText.GetArgument(i));
        }

        command.CommandText = string.Format(CultureInfo.InvariantCulture, commandText.Format, commandTextArguments);
    }
    else
    {
        command.CommandText = commandText.Format;
    }

    return command;
}

When called by this code:

var id = 2;
var connection = new SqlConnection();
var command = connection.CreateCommand($"select column1, column2 from table where column1 = {id}");

The compiler will generate this caller code:

int num = 2;
SqlConnection connection = new SqlConnection();
object[] arguments = new object[] { num };
IDbCommand command = connection.CreateCommand(FormattableStringFactory.Create("select column1, column2 from table where column1 = {0}", arguments));
Extensions.Dump<string>(base.GetType().Assembly.Location);

So, what are you missing here? What's the use case?

@atifaziz

This comment has been minimized.

Copy link
Contributor

atifaziz commented Jul 27, 2015

@paulomorgado I'm aware that names are not captured and I'm not missing anything. In fact, I'm not sure how your comment is related to mine at all. I was proposing an alternative to the potential solution by @ashmind:

Add IReadOnlyList Names { get; } to the FormattableString.

@khellang

This comment has been minimized.

Copy link

khellang commented Aug 5, 2015

👍 Even having just the content of the holes as strings would be a nice improvement!

@nblumhardt

This comment has been minimized.

Copy link

nblumhardt commented Aug 5, 2015

There's already some precedent for this kind of functionality in Linq. Check out Where() in both of these cases:

new [] { 1, 2, 3 }.Where(n => n > 2)

dbContext.Cats.Where(cat => cat.Name == "Tiger")

Where() is syntactically equivalent in both expressions, but theIEnumerable<> parameter type in one case causes the lambda expression to be interpreted as a Func<>, while in the other the parameter type is IQueryable<> and thus the lambda is an Expression<Func<>> providing all of the original syntactic metadata.

Analogously, in this instance:

writer.Write($"Hello, {name}")

log.Write($"Hello, {name}")

...the first Write() might accept a string or FormattableString as per C#6, and the second an Expression<FormatExpression> (not sure if there's anything like FormatExpression today). The complier would emit either the direct or quoted call depending on the target type.

FormatExpression would expose the literals ("Hello, " etc) and the holes (name) via the usual expression machinery. This eliminates the need to choose a canonical string representation of the embedded expressions ("Id" vs "OneCustomer.Id" and so-on) and keeps pretty closely to existing language features.

Unfortunately, the runtime allocations involved might make it unappealing to use in the case of structured logging, which was the basis of @ashmind's original post. More investigation of the scenarios and costs would have to happen before a proposal could be built around this, but I thought I'd put it out there just in case it helps someone else to move the thread forwards.

@gafter gafter added the Discussion label Nov 20, 2015

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Dec 9, 2015

Why don't you just convert the interpolated string into the contents of an expression lambda?
Expression<Func<string>> e = () => $"etc";

@ashmind

This comment has been minimized.

Copy link

ashmind commented Dec 22, 2015

@gafter Was your question for @nblumhardt or me?

If it was to me, then my initial proposal has the following benefits over expressions:

  1. Significantly less allocations — only one FormattableString instead of multiple Expressions
  2. No need for (uncacheable?) processing of the expression tree
  3. No chance of getting a tree that has unexpected node types and can't be processed
  4. Easier to use — who wants to do all their logging using () => "..." ?
  5. Easier to understand — FormattableString is simpler than Expression<Func<string>>
@gafter

This comment has been minimized.

Copy link
Member

gafter commented Jan 21, 2016

We do not expect to do anything like this.

@gafter gafter closed this Jan 21, 2016

@urig

This comment has been minimized.

Copy link

urig commented Apr 8, 2016

Why was this issue closed?

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 9, 2016

Because "We do not expect to do anything like this."

@urig

This comment has been minimized.

Copy link

urig commented Apr 10, 2016

Repeating that sentence does not make it any more coherent.
Is there a justified reason why this issue was closed?

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 10, 2016

To avoid people spending time and energy getting their hopes up for something that is not going to happen.

@urig

This comment has been minimized.

Copy link

urig commented Apr 10, 2016

Granted closing the issue achieves that. At the same time leaving things without any reason or explanation of WHY the feature is declined works to increase frustration.

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 11, 2016

@urig There are billions of things we don't do for every one thing we do. We don't need a billion reasons. We need reasons for the things we choose to do.

But if I were to look for a reason: I don't think it is reasonable for every conversion from interpolated string to FormattableString to pay the cost to store data for those occasional users who might want it.

@ashmind

This comment has been minimized.

Copy link

ashmind commented Apr 11, 2016

@gafter The reason is important. The way your original decision was worded strongly discouraged me from providing any Roslyn feedback in the future. I don't mind the rejection, but why spend my free time doing well-structured proposals to get a single sentence? Which gives me no idea on what I did wrong -- or how to avoid wasting your (and my) time again.

I suppose it depends on your situation. If you are overloaded with ideas, then I understand why it doesn't matter if you lose a few people along the way.

However I'll still give my two cents on what I would like to see in a proper Roslyn decision:

  1. Why the idea does not work.
  2. Whether it's rejected from the language or from the team backlog -- can other people grab it?
  3. Whether it was discussed by Roslyn team and a link to the meeting notes if so.
@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 11, 2016

  1. I am not saying the idea "does not work". I am saying that there is nobody on the LDM who would champion such a feature request, and that would be necessary for it to even be discussed by the LDM. For reasons I previously presented, I would oppose the feature as proposed. (I would be less opposed if it were based on a conversion to a type newly introduced for this purpose).
  2. This was never on the backlog. Features that we're explicitly interesting in having the community take on are marked with the "Up For Grabs" label. Of course this is an open-source project, so you can fork it and produce any prototype you think will be helpful in selling the idea.
  3. There is nobody who would champion this to bring it to the LDM, so there are no meeting notes for any discussion. Given the backlog of features we are keenly interested in, we don't have any reason to put this on the agenda, or anyone on the team who has asked for it to appear on the agenda, or anyone on the LDM who is willing to kick more features out of C# 7 so we can have time to discuss this. There are hundreds of ideas that we don't discuss, and this is one of them.
@MadsTorgersen

This comment has been minimized.

Copy link
Contributor

MadsTorgersen commented Apr 11, 2016

Fortunately and unfortunately we get more proposals than we can discuss in any depth. Folks on the LDM spend time looking at proposals here on GitHub, and bring up ones that they feel should warrant a discussion. This may be because they agree with the proposal, or feel that it raises a scenario that is worth trying to address, or similar. We call this "championing" a feature. This is our primary filtering mechanism, and it helps us focus our limited time on proposals that have a chance.

I realize how annoying it is to spend time on a proposal and then have it ignored or summarily dismissed. I wish there was a realistic process by which we could formulate a "why not" for every well-worked-out proposal that we do not choose to adopt into the language. Neal closing this issue is the closest we can get - he realizes the issue won't make it to a discussion, and closes it as a signal to the participants on the thread to help prioritize their efforts.

I'm sorry if this comes off as non-appreciative: it is not! We deeply value the discussions and proposals here, and they have a big influence on where we take the language. This goes for feature proposals, and equally for the many posts pointing out flaws and inconsistencies in proposals - ours and others'. I want to thank everyone who spends time and creative energy here, enriching the discussion and helping drive the languages forward.

If you feel a proposal really deserves our attention and isn't getting it, I recommend submitting it on UserVoice as well:

https://visualstudio.uservoice.com/forums/121579-visual-studio-2015/category/30931-languages-c
https://visualstudio.uservoice.com/forums/121579-visual-studio-2015/category/30933-languages-visual-basic

Whereas the feedback here on GitHub is rich on technical commentary and insights, UserVoice has the crucial benefit of allowing voting. It also tends to reach a more representative set of developers. Top UserVoice suggestions are sure to grab our attention, whether or not we are able to do something about them. In this way, the two forums supplement each other well.

Again, thanks for all the great ideas! I love the intensity, quality and volume of discussion, and it breaks my heart that we (the language designers) cannot engage deeply in all of it.

@ashmind

This comment has been minimized.

Copy link

ashmind commented Apr 11, 2016

@gafter Thank you for the detailed answer. It really helped me to understand the situation. I didn't know about the importance of a champion, or that making an implementation is encouraged even before the idea is explicitly marked as Up For Grabs.

That's exactly why I wanted details -- things that you take for granted are not always obvious to people outside the team. For example based on your answer it seems that the proposals are closed if they aren't likely for the nearest future (C# 7, 8?), while I interpreted closed as "not suitable for C# at all".

@ashmind

This comment has been minimized.

Copy link

ashmind commented Apr 11, 2016

@MadsTorgersen Thank you for taking time to clarify this -- I really appreciate your sentiment and the quality and openness of C# design process. I will definitely consider User Voice for future proposals I find important.

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 12, 2016

@ashmind I generally close things that I perceive as never being able to make the bar for a future release (not just 7 and 8). For this particular issue, if it is really important to you I recommend prototyping it with a different target type than FormattableString so people would not be forced to pay the price for the feature if they don't want the benefits.

@atifaziz

This comment has been minimized.

Copy link
Contributor

atifaziz commented Apr 12, 2016

I would be less opposed if it were based on a conversion to a type newly introduced for this purpose

@gafter Isn't that what I proposed? Or did you mean something else altogether?

@gafter

This comment has been minimized.

Copy link
Member

gafter commented Apr 12, 2016

@atifaziz You suggested something more "general" using duck typing. We generally cannot add conversions to types that could exist before, as it would be a breaking change in overload resolution. Instead we could in theory add a new type with a new conversion from an interpolated string to that new type. Though I have to say I do not forsee it bubbling to the top of our wish list in the forseeable future.

PawelGerr added a commit to PawelGerr/Thinktecture.Extensions.Logging that referenced this issue Feb 10, 2018

PawelGerr added a commit to PawelGerr/Thinktecture.Extensions.Logging that referenced this issue Oct 8, 2018

PawelGerr added a commit to PawelGerr/Thinktecture.Extensions.Logging that referenced this issue Oct 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment