Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-enumerating Count Linq Method #27183

Closed
TylerBrinkley opened this issue Aug 21, 2018 · 16 comments · Fixed by #48239
Closed

Non-enumerating Count Linq Method #27183

TylerBrinkley opened this issue Aug 21, 2018 · 16 comments · Fixed by #48239
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Linq help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@TylerBrinkley
Copy link
Contributor

TylerBrinkley commented Aug 21, 2018

Rationale

When dealing with collections it isn't uncommon that one needs to be able to get the count of an IEnumerable<T> as a default value for a new collection's capacity. The naive thing to do would be to simply use the Enumerable.Count extension method however if the IEnumerable<T> isn't an ICollection<T> or any of the other interfaces or types with a Count or Length property that method results in enumerating the collection which isn't desirable when just wanting a default value for a new collection's capacity. Instead of manually checking for those various interfaces each time someone needs this functionality I would like a non-enumerating count linq method added.

Proposed API

 namespace System.Linq {
     public static class Enumerable {
+        public static bool TryGetNonEnumeratedCount(IEnumerable<T> source, out int count);
     }
 }

Details

It would essentially retrieve the count so long as it doesn't necessitate enumerating the collection.

This would be a somewhat fragile method in that it should be updated whenever the next interface or type with a Count or Length property is introduced and thus may exhibit different behavior between versions.

Use Cases

It could be used in collection constructors where initializing a collection with an IEnumerable<T> is common and a good initial capacity is important to reduce resizing costs such as in the Dictionary<TKey, TValue> constructor here or the HashSet<T> constructor here. Unfortunately, since this method will be implemented at a higher level than these corelib types we won't be able to use it there.

Updates

  • Switched to Count with an allowEnumeration boolean parameter.
  • Switched to TryGetCount.
  • Switched to TryGetNonEnumeratedCount.
@TylerBrinkley
Copy link
Contributor Author

TylerBrinkley commented Aug 21, 2018

Alternatively we could add a public static int? Count(this IEnumerable<TSource> source, bool onlyIfCheap) overload similar to the internal IIlistProvider.GetCount(bool onlyIfCheap) method but it would return null if onlyIfCheap is true and it can't be determined in a O(1) operation.

@jaredpar
Copy link
Member

Alternatively we could add a public static int? Count(this IEnumerable source, bool onlyIfCheap)

Don't like this approach. The onlyIfCheap is not a quantifiable term. What is cheap to me may not be cheap to you.

It would essentially have the same body as the current Count extension method but would return the defaultValue parameter if it would need to enumerate the collection to find the count.

There are plenty of examples today where there is a pair of methods Operation and OperationOrDefault and they have a well established pattern. When Operation fails do the OrDefault action. The problem is that Count cannot fail. This proposal would be both adding a new method and a failure mode for something that can't fail today.

If we were to add an API like this I tihnk it would need to be a pair of methods:

  • Count that can fail when the source doesn't implement one of the "fast count" interfaces
  • The OrDefault variant

@cston, @333fred

@TylerBrinkley
Copy link
Contributor Author

@jaredpar

So you would prefer something like the following?

 namespace System.Linq {
     public static class Enumerable {
+        public static int? NonEnumeratingCount(this IEnumerable<TSource> source);
+        public static int NonEnumeratingCountOrDefault(this IEnumerable<TSource> source, int defaultValue);
     }
 }

I'm not sure an OrDefault method is necessary as it's very simple to achieve without, at least in C#.

var capacity = source.NonEnumeratingCount() ?? 0;

@TylerBrinkley
Copy link
Contributor Author

TylerBrinkley commented Aug 30, 2018

Instead of onlyIfCheap we could rename it to allowEnumeration which would have the reverse connotation.

@Grauenwolf
Copy link

What's wrong with this?

int? QuickCount<T>(this IEnumerable<T> source ) => (source as IReadOnlyCollection<T>)?.Count;

The only purpose of IReadOnlyCollection<T> is to tell us when the Count property is available. So its pretty safe to rely on it.

@khellang
Copy link
Member

khellang commented Feb 8, 2019

Or

bool TryCount<T>(this IEnumerable<T> source, out int count)

🤔

@TylerBrinkley
Copy link
Contributor Author

@Grauenwolf I guess my main motivation is to support IEnumerables that are the result of LINQ methods which don't currently implement IReadOnlyCollection<T>.

I've created the new issue dotnet/corefx#35173 to address adding implicit support for IReadOnlyCollection<T> to the results of LINQ methods when possible. If that were implemented there'd be less motivation to me to add this non-enumerating count method.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Sep 2, 2020
@eiriktsarpalis
Copy link
Member

Ad-hoc implementations that try to calculate the count without enumerating are pretty common, so I think it would make sense to expose a method that does it in a correct and complete way. That being said, I don't think it should be called Count and it probably shouldn't be an extension method either. How about something like

public static class Enumerable
{
    public static bool TryGetCount(IEnumerable<T> source, out int count);
}

@TylerBrinkley
Copy link
Contributor Author

I agree it shouldn't be an extension method. I like the TryGetCount method name, works for me.

@eiriktsarpalis
Copy link
Member

@TylerBrinkley would you be able to update the original post? I'll see if I can submit this for review.

@TylerBrinkley
Copy link
Contributor Author

@eiriktsarpalis I've updated the original post.

@eiriktsarpalis eiriktsarpalis added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Dec 2, 2020
@eiriktsarpalis eiriktsarpalis self-assigned this Dec 2, 2020
@LeaFrock
Copy link
Contributor

LeaFrock commented Dec 4, 2020

I think it will cause developers confused when returning false...

Enumerable.TryGetCount(mySource, out int count)

...method results in enumerating the collection which isn't desirable when just wanting a default value for a new collection's capacity.

I know it's reasonable but I prefer to handle it by users themself. Or, public static int NonEnumeratingCountOrDefault(this IEnumerable<TSource> source, int defaultValue = default); should be better. At least it informs users what this method does and users won't have to understand the actual meaning of "Try" then deal with failure result.

@eiriktsarpalis
Copy link
Member

I think it will cause developers confused when returning false

How so? The point of this addition is to provide its user a clear signal as to whether a count can be obtained without enumerating. Roughly it would translate into the following code:

if (Enumerable.TryGetCount(source, out int count))
{
    T[] fixedBuffer = new T[count];
    ConsumeUsingFixedBuffer(source, fixedBuffer);
}
else
{
    List<T> resizingBuffer = new List<T>();
    ConsumeUsingResizingBuffer(source, resizingBuffer);
}

At least it informs users what this method does and users won't have to understand the actual meaning of "Try" then deal with failure result.

I think you are right in pointing out that the currently proposed name TryGetCount is ambiguous in what it is doing. How about TryGetNonEnumeratedCount?

@LeaFrock
Copy link
Contributor

LeaFrock commented Dec 4, 2020

Well, it's better.

@TylerBrinkley
Copy link
Contributor Author

I changed the proposal to TryGetNonEnumeratedCount and added a few use cases.

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jan 26, 2021
@terrajobst
Copy link
Member

terrajobst commented Jan 26, 2021

Video

  • The concept makes sense but it seems nicer as an extension method
namespace System.Linq
{
    public static class Enumerable
    {
        public static bool TryGetNonEnumeratedCount(this IEnumerable<T> source, out int count);
    }
}

@eiriktsarpalis eiriktsarpalis added the help wanted [up-for-grabs] Good issue for external contributors label Jan 26, 2021
@eiriktsarpalis eiriktsarpalis removed their assignment Jan 28, 2021
@eiriktsarpalis eiriktsarpalis removed this from the Future milestone Jan 28, 2021
@eiriktsarpalis eiriktsarpalis added this to the 6.0.0 milestone Jan 28, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 12, 2021
@eiriktsarpalis eiriktsarpalis self-assigned this Feb 13, 2021
eiriktsarpalis added a commit that referenced this issue Feb 15, 2021
* implement Enumerable.TryGetEnumeratingCount

* address feedback

* update consistency tests

* Replace EnumerableHelpers.TryGetCount with new method

* Rename to method name as approved

* make method is renamed in all projects
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 15, 2021
@dotnet dotnet locked as resolved and limited conversation to collaborators Mar 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Linq help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants