Proposal: Combine overload for IncrementalValuesProvider<T> #58127

Sergio0694 · 2021-12-06T11:18:05Z

Background and Motivation

I often end up in a situation where I'd like to combine two IncrementalValuesProvider<T> instances, essentially "zipping" them. There doesn't seem to be an API for doing this though, as the existing Combine overloads all require at least one of the two inputs to be a single-value IncrementalValueProvider<T> instance. Consider the following simplified scenario:

IncrementalValuesProvider<INamedTypeSymbol> symbols = context.SyntaxProvider.CreateSyntaxProvider(...);

IncrementalValuesProvider<string> left = symbols.Select(static (item, token) => GatherInfoA(item));
IncrementalValuesProvider<string> right = symbols.Select(static (item, token) => GatherInfoB(item));

context.RegisterSourceOutput(left, static (context, item) => { });

// This doesn't compile: no matching overload. I'd like to zip left and right together
// here as I need to access matching items from both when generating code. I don't want
// to have to recompute the information in left again in this right pipeline subtree.
context.RegisterSourceOutput(right.Combine(left), static (context, item) => { });

The rationale here is that:

- The intermediate information in left is used on its own in a first source production node
- That same information is also needed in the source production node taking right
- I would like not to have to call GatherInfoA() again for each item in right, as I already have that info
- Additionally, calling GatherInfo_() might be expensive, so I really just want to reuse the result I have

Proposed API

namespace Microsoft.CodeAnalysis
{
    public static class IncrementalValueProviderExtensions
    {
+        public static IncrementalValuesProvider<(TLeft Left, TRight Right)> Combine<TLeft, TRight>(this IncrementalValuesProvider<TLeft> provider1, IncrementalValuesProvider<TRight> provider2);
    }
}

Alternative solutions

Consider this scenario:

- SOURCE
   |
   | - Data A ---> Output
   | ---|--- Data B ---> Output
        |      |
        |------|--- Data C ---> Output

One possible workaround doable today is to do something like this:

dataA
.Collect()
.Combine(dataB.Collect())
.SelectMany(static (item, token) =>
    item.Left.Zip(item.Right, static (Left, Right) => (Left, Right)));

Which does yield back an IncrementalValuesProvider<(A, B)> sequence, but this doesn't seem efficient at all. The fact I'm doing Collect() on both means that every time a single item in the sequences is removed/added/updated, the entire collection will be reevaluated, instead of just that one item. What I'd like instead is to just have individual items that are changed to be queried for reevaluation, with the guarantee that if both source sequences have no incompatible filters on them (that is, either they have no Where calls, or if they do, they have one that applies the same filtering on both sequences), then I'll just get asked to recompute a single pair of items in this resulting values provider combining the two.

Notes

In order for this to work, Roslyn needs to guarantee that items in the same position across different IncrementalValuesProvider<T> instances will match and refer to the same source item. As in, this will only work if Roslyn can guarantee that transformations on the values providers are "stable": the two input sources will always have the same number of items when processed (if the user hasn't messed up filtering) and items will not be reordered in just one of the two providers. That is, if source item A is used to produce B and C in the transformed providers left and right, then calling Combine on them should guarantee that each resulting pair will correctly associate items B and C for each original source item A used to produce them.
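To make the alignment requirement concrete, here is a minimal sketch using plain LINQ over arrays rather than the Roslyn pipeline. The names ZipAlignment, InfoA, InfoB and AllPairsAligned are hypothetical stand-ins invented for this illustration, and the snippet says nothing about what ordering Roslyn actually guarantees; it only shows why a positional zip needs both sides to come from the same source items in the same order.

```csharp
using System;
using System.Linq;

public static class ZipAlignment
{
    // Hypothetical stand-ins for GatherInfoA / GatherInfoB: each tags the
    // source name so the origin of every transformed item stays visible.
    public static string InfoA(string symbol) => symbol + ":A";
    public static string InfoB(string symbol) => symbol + ":B";

    // Returns true only when both sequences have the same length and every
    // positional pair was produced from the same source item.
    public static bool AllPairsAligned(string[] left, string[] right) =>
        left.Length == right.Length &&
        left.Zip(right, (l, r) => l.Split(':')[0] == r.Split(':')[0]).All(ok => ok);
}
```

With the same source on both sides, every pair lines up. As soon as one side has a filter the other lacks, the lengths differ and the positional pairs after the dropped item no longer refer to the same source item.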

cc. @sharwell and @jkoritzinsky who were involved in this conversation on Discord 🙂


sharwell · 2021-12-06T17:00:27Z

I don't like the name Combine for this. It suggests the output will be combinations of the inputs, but it is not.

Sergio0694 · 2021-12-06T17:04:39Z

Would Zip work? I mean that's the same name LINQ uses for this operation, and other extensions for IncrementalValueProvider<TValue> and IncrementalValuesProvider<TValues> are mirroring the names of LINQ APIs as well.

sharwell · 2021-12-06T17:07:34Z

Potentially yes. Note that Razor identified a use for true Combine semantics, by having one input be zero-or-one, with zero items meaning "disable the source generator" and one item meaning "enable the source generator". This eliminates a large number of boolean checks, but it perhaps isn't easy to see that IncrementalValuesProvider<T> is being used as an IncrementalValueProvider<Optional<T>>.

chsienki · 2021-12-15T03:19:38Z

We deliberately left out the Combine which performs a cross join when designing the APIs. It is possible to manually perform one with the APIs today however:

IncrementalValuesProvider<INamedTypeSymbol> symbols = context.SyntaxProvider.CreateSyntaxProvider(...);

IncrementalValuesProvider<string> left = symbols.Select(static (item, token) => GatherInfoA(item));
IncrementalValuesProvider<string> right = symbols.Select(static (item, token) => GatherInfoB(item));

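// Requires 'using System.Linq;' for Select on the collected ImmutableArray<string>.
// The inner lambda captures 'pair', so it cannot be marked static.
var crossJoin = left.Combine(right.Collect()).SelectMany(static (pair, _) => pair.Right.Select(rightItem => (pair.Left, rightItem)));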

This isn't particularly inefficient. The Gather methods are called only as needed, and the Collect() will be considered cached if all items in it are. When the right-hand side changes, the SelectMany will always be called, but the resulting tuples will essentially be 'cached out', in that most of them won't be modified, so no downstream nodes will be executed for them. Given that the SelectMany is (comparatively) cheap to run, you shouldn't see any perf downsides to doing it this way.

Sergio0694 · 2022-01-06T21:06:51Z

Wait, I'm confused. While the name would be the same, this doesn't seem to be the semantics I'm proposing in this issue. I don't want a cross-join (MxN items), I want a zip of two sequences of equal length. I do agree with @sharwell that maybe a different name (e.g. Zip) would be better, if Combine would instead suggest a cross-join behavior like the one you mentioned 🙂
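The difference between the two semantics discussed in this thread can be sketched with plain LINQ over arrays. This is an illustration only, not the Roslyn API: JoinShapes and its two methods are hypothetical names, and ordinary arrays stand in for the incremental providers.

```csharp
using System;
using System.Linq;

public static class JoinShapes
{
    // Cross join: every left item is paired with every right item,
    // so M left items and N right items yield M x N pairs.
    public static (string Left, string Right)[] CrossJoin(string[] left, string[] right) =>
        left.SelectMany(l => right.Select(r => (Left: l, Right: r))).ToArray();

    // Zip: items are paired by position, so two sequences of
    // equal length N yield exactly N pairs.
    public static (string Left, string Right)[] Zip(string[] left, string[] right) =>
        left.Zip(right, (l, r) => (Left: l, Right: r)).ToArray();
}
```

For two inputs of three items each, CrossJoin produces nine pairs while Zip produces three, each matching the items at the same index.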

Sergio0694 added the Concept-API and Feature Request labels Dec 6, 2021

dotnet-issue-labeler bot added the Area-Language Design and untriaged labels Dec 6, 2021

jinujoseph added the Area-Compilers label Dec 9, 2021

jaredpar removed the untriaged label Dec 14, 2021

jaredpar added this to the 17.2 milestone Dec 14, 2021

jaredpar assigned chsienki Dec 14, 2021

jcouv modified the milestones: 17.2, 17.3 May 14, 2022

jaredpar modified the milestones: 17.3, Backlog Jun 27, 2022

Sergio0694 mentioned this issue Jul 30, 2022

Question: How can I combine multiple IncrementalValuesProvider<T> #63093

Closed

CyrusNajmabadi removed the Area-Language Design label Nov 8, 2022
