[Proposal]: Collection Expressions Next (C#13 and beyond) #7913
Comments
I'd like to see some way of calling the constructor of a type and/or setting some property during initialization. The first thing that came to mind is the comparer of a dictionary, but I'm sure there are other use-cases. Either way, collection expressions are amazing, and support for natural types and inline expressions would make them that little bit better! |
@KennethHoff that's in the list, as part of the dictionary-expression exploration work. Thanks :) |
I'm not sure if the following proposal is too crazy, so I will describe it here quickly, as it's related to this topic: Imagine that I have a Ideally I would like to spread the array of ints into the array of strings while also calling If But since there is no such conversion, I would like to write something like Spread operator as a first-class citizen. So basically the spread operator The spread unary operator may be applied to any other expression, resulting in a spread_expression. When that happens, an implicit The spread_expression then behaves, for the purposes of member-lookup and overload-resolution, as an expression whose type is the element-type of the original enumerable. Because of that, one may invoke any members the element type might have, as well as use the spread_expression on a method that takes element-type as argument. These invocations will then be inserted inside the invisible The result of a member invocation performed on, or taking the spread_expression as an argument, is also itself a spread_expression, whose element-type is the return type of the invoked member, if not Finally, one will want to capture the result of all these method invocations on each element of the original collection. Therefore, any spread_expression can be used regularly as a spread_element in a collection expression, with the existing rules. Problems:
|
C# already has a query comprehension syntax, LINQ. |
True, but that's not really an argument. If you would argue against all proposals saying "it can already be done in some way", then none would ever be accepted. Lists could already be created with LINQ or with initializers, yet collection expressions were introduced. And they have the added benefits of duck typing/nice syntax/good performance. LINQ, on the other hand, is interface-based and makes use of delegates and anonymous objects. If one could perform some simple transformations through the use of the spread operator, everything would be inserted directly in the caller method, with no delegates or closures. There isn't any optimization better than that; LINQ would almost be obsolete. |
See: #7634 |
That's where supporting extension methods would help. As you could write:
|
I assume you could also do this? [ ..[1, 2, 3].Select(i => i.ToString()) ] |
@KennethHoff yes. |
@KennethHoff Yes, or |
I suggest widening support for collection builders to accept other buildable collections. For example I can choose to wrap a HashSet or a List or an Array or whatever. It feels so weird to accept a Span for such types. |
@En3Tho Can you give an example? |
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public class ArrayWrapperBuilder
{
    // This works
    public static ArrayWrapper<T> Create<T>(ReadOnlySpan<T> values)
    {
        return new(values.ToArray());
    }

    // I want this to work instead. Array is already a creatable collection, so just let compiler create it and use directly here
    // Imagine if it was a List<T> or HashSet<T> wrapper. Even more allocations while compiler is perfectly able to create this kind of collection directly.
    public static ArrayWrapper<T> Create<T>(T[] values)
    {
        return new(values);
    }
}

[CollectionBuilder(typeof(ArrayWrapperBuilder), nameof(ArrayWrapperBuilder.Create))]
public readonly struct ArrayWrapper<T>(T[] array) : IEnumerable<T>
{
    public T[] Array => array;

    // Enumeration is left unimplemented in this example
    public IEnumerator<T> GetEnumerator()
    {
        throw new NotImplementedException();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class ArrayWrapperCreator
{
    public static ArrayWrapper<T> GetWrapper<T>() => [default, default, default];
} |
I'm glad the lack of Something like foreach (int i in [1, 2, 3]) was literally the first thing I tried with collection literals and I was surprised why it was failing to compile with CS9176 saying there was no target type. It basically provides exactly the same amount of information as foreach (var i in (IEnumerable<int>)[1, 2, 3]) which happily compiles today (though I would rather use some old-school array initialization instead of the bulky cast). I think this could be added even independently from natural types. |
One addition that we would like to see is support for multi-dimensional collection literals. These are important when working with tensor libraries, e.g. here's how simple tensors are defined using the pytorch library:
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
The above form isn't feasible using today's C# collection literals, so supporting it without some kind of language support seems unlikely. One possibility is that we could reuse nested collection literal syntax to construct multi-dimensional collections. TL;DR it should be possible to extend the collection builder pattern to recognize factory methods such as
public static T[,] Create<T>(ReadOnlySpan<T> values, int nDim, int mDim);
and then be able to define 2-D arrays like so
T[,] values = [[1, 0], [0, 1]];
In principle, it should be possible for the compiler to infer the rank and dimensions and detect shape mismatches (e.g. something like What's more interesting though is that by reusing nested collection syntax there is no inherent limit on the supported number of dimensions, and the number of dimensions doesn't need to be fixed for a given type. We could for instance support builder methods of shape
public static Tensor<T> Create<T>(ReadOnlySpan<T> values, params ReadOnlySpan<int> dimensions);
Which should let you specify the following 2x2x2 tensor:
Tensor<int> tensor = [[[0, 0], [0, 0]], [[1, 0], [0, 1]]]; |
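As a rough illustration of the two-dimensional builder shape suggested above, the flattening step can already be written by hand. This is only a sketch under stated assumptions: `Array2DBuilder` is a hypothetical name, the method is an ordinary static helper that the compiler does not recognize for nested collection expressions, and row-major ordering of the flat values is assumed.

```csharp
using System;

// Hypothetical helper: nothing in the language wires this to nested
// collection expressions today. It only shows what such a factory might do.
public static class Array2DBuilder
{
    // Copies a flat span (assumed row-major) into an nDim x mDim rectangular array.
    public static T[,] Create<T>(ReadOnlySpan<T> values, int nDim, int mDim)
    {
        if (nDim < 0 || mDim < 0 || values.Length != nDim * mDim)
            throw new ArgumentException("Shape does not match the number of values.");

        var result = new T[nDim, mDim];
        for (int i = 0; i < nDim; i++)
        {
            for (int j = 0; j < mDim; j++)
            {
                result[i, j] = values[i * mDim + j];
            }
        }
        return result;
    }
}
```

With C# 12 the flat form can be called as `int[,] identity = Array2DBuilder.Create<int>([1, 0, 0, 1], 2, 2);`. The proposal would let the compiler derive the flat span and the dimensions from the nested literal instead.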
Thanks @eiriktsarpalis . The working group is discussing this. I'll add you to that. |
Having read the (hundreds) of comments on the original Collection Expression proposal, and noting the large number of natural type immutable versus mutable comments, what about if the type was immutable if the Collection Expression consisted only of immutable literals?
but
tt[1] = "z"; // fine |
@NetMage Why would |
Greetings to everyone. Sorry in advance if the topic I am asking about was already discussed somewhere. In that case I would be very grateful if you share the link to such discussion with me. I usually do not participate in the proposal discussions, this is a first one for me. So, please excuse me if I don't follow some workflow for proposal discussions. I decided to reach out after I worked a bit with collection expressions in C# 12, investigated the generated code for them, and found two cases that I personally found confusing. Here is the first one: Infinite generators. Suppose that you have some kind of a collection generator that can generate an infinite
private static IEnumerable<int> OddNumbers()
{
    int current = 1;
    while (true)
    {
        yield return current;
        current += 2;
    }
}
This can be used with LINQ:
var oddNumbers = OddNumbers();
IEnumerable<int> oddNumberWithMinus1 = [-1, .. oddNumbers];
// we will never get here
Unexpectedly, this program hangs. The investigation of the generated code shows that the compiler generates the code that internally fully materializes the collection in memory:
IEnumerable<int> ints1 = Program.OddNumbers();
List<int> items = new List<int>();
items.Add(-1);
foreach (int num in ints1)
    items.Add(num);
IEnumerable<int> ints2 = (IEnumerable<int>) new <>z__ReadOnlyList<int>(items);
Because of this the program hangs - it's impossible to materialize an infinite collection. So my problems are:
However, the term collection was used for I think, it will be difficult if even possible to write an analyzer to warn about such cases, but at least the documentation should be very clear about them.
Could this scenario receive some extra support in C# 13? Because the extra allocation here could be removed with standard LINQ methods like
var oddNumbers = OddNumbers();
IEnumerable<int> x = oddNumbers.Prepend(-1);
Console.WriteLine(string.Join(" ", x.Take(3))); // prints -1 1 3
Sorry if such proposal was already considered and thanks in advance. |
Collection expressions aren't a comprehension language or intended as an alternative to LINQ, they always fully materialize spreads. I do agree that the documentation should probably call this out more clearly. |
Yes. It's a core part of the design. Linq is already there for comprehensions. Collection expressions exist intentionally to produce fully materialized collections. |
Thanks for the feedback, but this was an intentional decision. We do not want these collections to be lazy (and then have expensive enumeration semantics, or have them redo computation each time you enumerate). We have query comprehensions for that already. These collections are intended to be fully materialized, so you know that the final collection you get is cheap, efficient and finite. |
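The difference described in these replies can be sketched as follows. This is a minimal illustration, not compiler output: `SpreadDemo` and its method names are hypothetical, and the only point is that a spread enumerates its source to completion, so an infinite source has to be bounded before it is spread, while the LINQ pipeline stays lazy.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SpreadDemo
{
    // Infinite generator, as in the comment above.
    public static IEnumerable<int> OddNumbers()
    {
        int current = 1;
        while (true)
        {
            yield return current;
            current += 2;
        }
    }

    // Lazy: Prepend and Take are deferred, so only three elements are ever produced.
    public static int[] FirstThreeLazy() =>
        OddNumbers().Prepend(-1).Take(3).ToArray();

    // Materialized: the spread is enumerated to completion when the expression
    // is evaluated, so the source is bounded with Take before spreading.
    public static int[] FirstThreeMaterialized()
    {
        IEnumerable<int> bounded = [-1, .. OddNumbers().Take(2)];
        return bounded.ToArray();
    }
}
```

Both methods yield the same three values; spreading `OddNumbers()` directly, without `Take`, would never finish.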
The second scenario I have encountered is somewhat artificial and definitely does not follow best design practices. Misuse of the "Count" property with duck typing. The scenario involves the collection expression like this:
public static IEnumerable<int> PrependOne(<Some integer collection type> s) => [1, ..s];
The generated code for the method looks like this:
int num1 = 1;
<collection type> myCollection = s;
int index1 = 0;
int[] items = new int[1 + myCollection.Count];
items[index1] = num1;
int index2 = index1 + 1;
foreach (int num2 in myCollection)
{
    items[index2] = num2;
    ++index2;
}
return (IEnumerable<int>) new <>z__ReadOnlyArray<int>(items);
The C# compiler is very clever, it attempts to generate the most optimized code relying on the However, while C# compiler cheats while using this size calculation. Everything is OK when the collection explicitly states that it can provide Now consider this example. Suppose I have the following custom collection:
public class MyCollection : IEnumerable<int>, IEnumerable<string>
{
    private readonly int[] _ints;
    private readonly string[] _strings;
    internal int Count => _strings.Length;
    public MyCollection(int[] ints, string[] strings) => (_ints, _strings) = (ints, strings);
    public IEnumerator<int> GetEnumerator() => ((IEnumerable<int>)_ints).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => _ints.GetEnumerator();
    IEnumerator<string> IEnumerable<string>.GetEnumerator() => ((IEnumerable<string>)_strings).GetEnumerator();
}
The example is artificial and it's clearly not the best design. But similar code appears sometimes, or can live in an old legacy code base. Now, suppose I'm using this custom collection in a collection expression:
public static IEnumerable<int> PrependOne(MyCollection s) => [1, ..s];
//...
var myCollection = new MyCollection([1, 2, 3, 4], ["string"]);
IEnumerable<int> modified = PrependOne(myCollection);
And unexpectedly I received This looks confusing to me; no written code has any access by index. It requires the developer to know what code is generated by the compiler behind the nice syntax sugar. I can't say that I really like this compiler trick even if it is legal and probably brings some performance benefits and fewer allocations in case of value types. The compiler assumes a contract in a place where it does not exist (the whole idea of duck typing). I know that C# already did this before, for example with |
@HaloFour , @CyrusNajmabadi thank you for your response! You both very clearly confirmed that the feature was designed to materialize collections and works only on finite collections. |
We made explicit choices with collection expressions to assume that people write well-behaved and sensible types. We want the optimizations to hit the broadest set of cases. It is understood that a non-well-behaved type may then have problems. But we're optimizing the lang design for the literal 99.999% case, at the cost of these strange outliers. Our recommendation if you do have types like these is to write analyzers to block them off with collection-exprs. |
@CyrusNajmabadi that is understandable. I do understand at least part of the reasons why the duck typing was used. For example, you just introduced a new way to classify types - well-behaved. Is there a formal definition for a well behaved type? I may consider the collection from my example to be well-behaved, why not? It does not break any contracts it explicitly states, only some implicit contracts that were later imposed by the new version of compiler. |
Sure. Feel free to file doc bugs. :)
Yes. If you are a collection type (defined in our spec), and you supply a .Count, then enumerating you should produce the same number of elements. Similarly, if you have an indexer, and you index into the type from |
Because the Count and GetEnumerator refer to totally different sequences. Again, this is vastly out of the normal case for collection use in reality. :) We are being pragmatic here. The 99.999% case is well behaved collections. Sacrificing the value we get on the normal case for the ecosystem for strange types like this would be cutting off our nose to spite our face. |
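The expectation stated here (a type that supplies a `Count` should produce that many elements when enumerated) can be checked at run time in a unit test. The helper below is a hypothetical sketch, not part of any spec or library; `CollectionContract` and `CountMatchesEnumeration` are illustrative names, and the caller passes in the count the type reports.

```csharp
using System.Collections.Generic;

public static class CollectionContract
{
    // Returns true when the reported count equals the number of elements
    // actually produced by enumerating the source. Stops early once the
    // enumeration exceeds the reported count.
    public static bool CountMatchesEnumeration<T>(IEnumerable<T> source, int reportedCount)
    {
        int seen = 0;
        foreach (T item in source)
        {
            seen++;
            if (seen > reportedCount)
            {
                return false;
            }
        }
        return seen == reportedCount;
    }
}
```

For a `List<int>` with three elements and a reported count of 3 the check passes. For a type whose `Count` and enumerator refer to different sequences, such as four enumerated ints with a reported count of 1, it fails.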
And there is nothing in the documentation that clarifies this and prevents the feature from being misused. Moreover, collection expressions definitely overlap with
I definitely will. I have read the documentation and feature specs more thoroughly, and I'm sorry to say this, but I feel that the current documentation is in an awful state. A lot of things like collection materialization or duck typing are implicit or not mentioned at all. Many are described in "feature specs" which contain too many details about implementation and at the same time explicitly state that the actual implementation may be different, which undermines their value. |
Feel free to contribute doc fixes or file issues. This is all open source :-) This is the repo for the design and specification of the feature. Docs are not done here. Feedback on that should go to the docs team. Thanks! |
They overlap. But there is nothing about IEnumerable that makes it lazy. We considered and made a conscious decision that the collection expression space was all about realized collections. Just like the comprehension space was not. This wasn't comprehensions 2.0. We're happy with comprehensions today. This was specifically about normal collections and the challenges there :-) |
This comment has been minimized.
@SENya1990 it's unclear to me what language request you are making there. It seems more to be runtime/api requests. If that's the case, then such requests would better go to dotnet/runtime.
This is unrelated to collection expressions. You'd get the same error if you didn't use collection expressions:
int[] arr = new int[] { 1, 2, 3 };
int[] y = arr[1..];
Rather, the error here is about you trying to index into an array using a range. |
Thanks for the feedback. This is out of the purview of the language. Runtime requests should go to dotnet/runtime. Compiler requests should go to dotnet/Roslyn. Each can make decisions based on their respective policies and priorities. |
@SENya1990 you are operating in unsupported ways. As halo mentioned, we're not going to do additional work to make these scenarios light up. We've already stated and published our positions here. You're free to do what you want here. But that may involve extra effort on your part. Thanks. I also mentioned that this is out of scope for the language discussion, and that this has nothing to do with collection expressions. Please make requests with the runtime or Roslyn teams if you want to continue this discussion. And please leave this issue for collection expressions only. Thank you. Due to this all being off topic, I've hidden all the messages. |
If among them there is one project, or a small number of projects, that are referenced by all the others in the solution (even if transitively), then you can use SharedProperties package to add anything (such as references or additional source files) to all of them at once. You just need to use
You can always use ILMerge/ILRepack to get rid of extra DLLs. And for the solution with 200 projects, just put the helper class in an existing one.
You can also use SharedProperties to create the
Sorry, this matter has gone off-topic. Too bad github doesn't have private messages. Could you please minimize this message as well? @SENya1990, feel free to email me if you wish to continue. I'm in the same scenario as you, I must stay on .NET Framework, and I'm bending the limits of the language to make cool stuff (check my repo too). |
I'm pleased to acknowledge the established clear understanding that collection literals are inherently materialized. Moving away from the traditional focus on natural types tied to materialization, I'd like to propose a new notation: This new notation could benefit scenarios such as:
|
@CrayonPastel On the final point, calling extension methods will not be enabled by this scenario any faster than they'll be enabled on today's collection expressions. We already have interest in For the other points, can you give examples that show the clearest syntax in today's world that gets the job done, and then contrast how the new proposed list comprehension syntax looks, and the motivation for providing the new alternative? The concern is that specifically |
Collection Expressions Next
Summary
This issue is intended to be the umbrella tracking item for all collection expression designs and work following the core design (#5354) that shipped in C#12.
As this is likely to be a large item with many constituent parts, it will link out to respective discussions and designs as they occur.
Roughly, here are the items we would like to consider, as well as early notes on the topic: https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2024-01-23.md
["a", .. b ? ["c"] : []] and foreach (bool? b in [true, false, null]). Collection expressions: inline collections in spreads and foreach #7864
Memory<T>, ArraySegment<T> etc.)
IEnumerable, etc.)
Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-01-08.md - Iteration types of collections
https://github.com/dotnet/csharplang/blob/main/meetings/working-groups/collection-literals/CL-2024-01-23.md - WG meetup
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-01-10.md - Conversions vs construction
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-02-05.md#collection-expressions-inline-collections