Skip to content

Latest commit

 

History

History
125 lines (80 loc) · 7.43 KB

File metadata and controls

125 lines (80 loc) · 7.43 KB

Collection literals working group meeting for June 26, 2023

Agenda

  • Collection expression lifetimes
  • BCL apis
  • Breaking changes
  • Extension method lookup
  • Syntactic disambiguation

Discussion

Collection expression lifetimes

Relevant link: https://gist.github.com/jaredpar/a784f8963ce73613a0a696ef89a00d9a

Discussed the rules about spans produced by collection literals, and whether those are 'safe to return' from the 'current method'.

Conclusion. We will accept the rules in the provided link. Specifically, spans from collection literals are not "Safe to return from current method". This has the benefit of allowing the compiler to potentially stackalloc these spans for very high efficiency code (including cases like G([1, 2, 3])).

While the compiler is free to use stackallocs for these spans, it is not required to do so. However, as some consumers may only want to allow collection literals when they will be stackalloc'ed we will ensure that the compiler emits a hidden-diagnostic if it heap allocs. These consumers can set that diagnostic to be an error, blocking unwanted span heap allocations. In those cases, they can manually stackalloc.

Users who want to be able to return those spans can do so by explicit conversion to an array. e.g. Span<int> s = (int[])[1, 2, 3];.

BCL Apis

BCL gave a status update on the work they have done. Specifically: dotnet/runtime#87945

This (merged) PR adds the pattern of an immutable collection type C<T> having a factory method public static C<T> C.Create<T>(ReadOnlySpan<T> values); to create the collection. A soon to be added attribute (CollectionBuilderAttribute) will be added that will allow the language/compiler to find this factory method from a given collection type.

We do not plan any more BCL apis beyond this. However, there are two key changes we will make in the compiler in terms of emit, to get even better codegen than what those above methods allow. Specifically:

  1. while there is an public static ImmutableArray<T> ImmutableArray.Create<T>(ReadOnlySpan<T> values); method, the compiler will not call it. Instead, it will sidestep it and use ImmutableArray<T> ImmutableCollectionsMarshal.ToImmutableArray<T>(T[] values); method. This method passes ownership of the array directly to ImmutableArray<T> ensuring no overhead (no excess allocations or copying).

  2. while List<T> will already work with collection literals (due to it supporting collection-initializers) the compiler will emit new collection literal List<T>'s as:

List<int> list = [a, b, c]; // translates to:

List<int> list = new List(capacity: 3);
CollectionsMarshal.SetCount(list, 3);
Span<int> __storage = CollectionsMarshal.AsSpan(list);
__storage[0] = a;
__storage[1] = b;
__storage[2] = c;

Rule '1' will always apply for immutable arrays. Rule '2' can apply depending on how complex it is for the compiler. For example, if the code were List<int> list] = [await a, await b, await c]; the compiler might choose to just emit this equivalently to new List<int> { await a, await b, await c } (i.e. equivalently to a collection initializer expression).

Breaking changes

We would like to update the language so that (A)[B] is always a cast of a collection literal, not an indexing expression of a trivial identifier expression. This would apply to (A)[B] (trivial identifier), or (A.B.C)[D] (dotted identifier). In both of these cases, if the user wanted indexing, it would be far more idiomatic to write this out as A[B] or A.B.C[D] (the parentheses are superfluous).

THe compelling case to want to make this a cast of a collection literal is:

using CrazyArray = ImmutableArray<List<HashSet<int>>>; // i.e. some complex collection type

// ...
var v1 = (CrazyArray)[a, b, c]; // really don't want to have to spell out the type in each location
var v2 = (CrazyArray)[x, y, z];

Currently the prototype impl makes no breaks here and keeps parsing as it exists prior to this feature.

As this is a breaking change, we might want to combine it with the work on "safe breaking changes" being looked into. However, we also think the chance of this impacting someone would be super small (though investigations would be necessary). We want to get a read from LDM on if we should explore this fruther.

Extension method lookup

WG believes it would be valuble for customers to be able to write:

static class Extensions
{
    public static ImmutableArray<T> AsImmutableArray<T>(this ImmutableArray<T> array) => array;
}

void M()
{
    var v = [complex, type, examples].AsImmutableArray();
}

This is because the alternative would be to have to write something like:

(ImmutableArray<Complex<Type<Examples[]>>)[complex, type, examples];

In other words, the extension method approach would allow for type inference to make this much simpler and cleaner than having to fully specify all the types. Working group believes (but needs to verify) that a simple extension (no pun intended) to 'extension method lookup' would make this work. Specifically, while the collection type has no natural type currently (like null, a lambda or a method group) it could still trivially participate in extension method lookup and would work in cases like this.

We do, however, believe the value here is diminished if, say, the language adopts features like "postfix cast" and "partial generic type inference". If those features make progress, you could write the above as: [complex, type, examples].(ImmutableArray<_>). This would be simple and clear.

WG also thinks that if we did add extension method support for the above, that we could consider doing the same for lambdas and method-groups.

Syntactic disambiguation update

The feature has some small syntactic ambiguities with both existing features, and features coming around at the same time. Specifically:

Case 1:

a ? [b] ...   // is that:

(a?[b])       // a conditional access expression, or:
a ? [b] : ... // a conditional expression

Conclusion: we will preserve existing parsing for these. So if it is a case where it would successfully parse as a conditional-access-expression, it will remain that way. However, if trying that approach would end up failing, we will try again as a conditional-expression and accept that if it succeeds. This means the following will work:

var v1 = a ? [b] : c;
var v2 = [a ? [b] : c];

While this involves complex lookahead, we only need to apply it when parsing fails and we can see that there was a ?[ involved. So costs should be minimal for mainline cases. A PR for this is here: dotnet/roslyn#68756

Case2: The collection literals feature also introduces its own ambiguity with: [a ? [b] : c] is that [(a?[b]): c] (a dictionary literal?), or [a ? ([b]) : (c)] a collection literal with a conditional-access in it.

Unlike the above case, there is no back compat issue here as none of this code was legal before. In this case, we choose an interpretation that prefers list-literals first, then dictionary literals. So the above is a list-literal containing a conditional-access-expression. If a dictionary-literal is preferred, the initial expression must be parenthesized.

Case3: In [range1, ..x, range2], ..x could be a spread-element or a range-expression.

WG affirms this is always a spread-element. This should be an extremely unlikely event. And, if the user wants a range-expression they can write 0..x or (..x).