Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a non-generic IReadOnlyCollection interface #23337

Open
weitzhandler opened this issue Aug 26, 2017 · 36 comments
Open

Add a non-generic IReadOnlyCollection interface #23337

weitzhandler opened this issue Aug 26, 2017 · 36 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections
Milestone

Comments

@weitzhandler
Copy link
Contributor

weitzhandler commented Aug 26, 2017

Description

I often bump into scenarios where I have to evaluate an IEnumerable that might be either ICollection<T>, ICollection, IEnumerable<T> or IEnumerable, to get their size if available, so to avoid iterating on them to count them.

The thing with ICollection<T>, is that it doesn't inherit from ICollection, and there are types that inherit from ICollection<T> but not from ICollection (for instance HashSet<T>).
Additionally, ICollection<T> isn't co-variant, meaning not every ICollection<T> is an ICollection<object>, like IEnumerable<T> is.
I'm not arguing about this design, which is a good thing to avoid confusion and mis-adding of types.

Motives

The thing I do find annoying here, and I bump into it a lot, is the Count property, which is common to ICollection and ICollection<T> and IReadOnlyCollection<T>, but these interfaces don't overlap each other.
Obtain a value from a generic type when the type argument is unknown at compile time is only possible with reflection. So in order to get the known size of a collection you'd have to do the following (see discussion here). And with the branching of the TypeInfo class, this gets even worse:

//gets count if known
int? GetCountIfKnown(IEnumerable collection)
{
  if(collection is ICollection collection)
    return collection.Count;
  else
  {
    Type genericCol = value.GetType().GetTypeInfo().ImplementedInterfaces.Select(t =>
      t.GetTypeInfo()).FirstOrDefault(i => 
        i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));
    if (genericCol != null)
      retun (int)genericCol.GetTypeInfo().GetProperty("Count").GetValue(value);
  }  
  return null;
}

As pointed out later in comments by @NightOwl888, the IReadOnlyCollection<T> interface also exposes the Count property, but the reasons it doesn't address my issue is:

  • It's generic, so when the type is unknown at compile time, we can only evaluate it using reflection
  • ICollection<T> doesn't inherit from it.
  • Not all collections with known size inherit from it

There are many use cases of non-generic collections I bump into, the real reason we need non-generic support, is not because their not generic, but rather because their generic argument may not be known at runtime.
I can't remember all of the scenarios I've encountered this demand as they are many, but I'll name a few.

  • When developing general tools (i.e. XAML converters) that are supposed to eat a wide variety of value types and work differently when the value is a collection.
  • We need to store a collection of ICollection<T>s of a variety of types. When enumerating the main collection, the count of the items can only be achieved by acquiring the collection type with reflection.

Another reason is that ICollection has more methods to implement, and is more tedious.

Solution

My suggestion is then, to introduce a new interface ICountable (or IReadOnlyCollection - non generic) that its contract determines that a collection has a Count property and exposed the collection size regardless of its item type.

This interface should be implemented by (please comment if you're aware of others):

  • ICollection
  • ICollection<T>
  • IReadOnlyCollection<T>

The new interface should have one sole task: expose a known size of a collection via a Count property, with disregard to the collection type (T), editability (ICollection vs IReadOnlyCollection), or other features.

int? GetCountIfKnown(IEnumerable enumerable) =>
  (enumerable as IReadOnlyCollection)?.Count;

Concequences

  • Implicit implementations of ICollection, ICollection<T> and IReadOnlyCollection<T> will have to change the pattern to ICountable (or if we call it IReadOnlyCollection or anything else).
  • Implementations that call the property Count on ICollection (or the others) as "declared", will have to switch to either calling the runtime property, or call it on ICountable etc.

Related: https://github.com/dotnet/corefx/issues/23700.

@NightOwl888
Copy link

NightOwl888 commented Aug 28, 2017

Note that if we really want to go down this road that not all "countable" types have a property named Count. Files, arrays, and streams all use the property name Length rather than Count to determine their size. So by the same token, we would need an ILengthable ? or probably better ILength or ISize interface with a property Length.

I do see the advantage with covariance, but it doesn't seem right to force every IEnumerable<T> to implement a Count property that might not be practical to do. Perhaps there could be an ICountableEnumerable<T> (inheriting IEnumerable<T>) that exists between IEnumerable<T> and ICollection<T>, which would not break backward compatibility (except perhaps with LINQ).

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 28, 2017

@NightOwl888 Please allow me to disagree with you, because most of those types you mention implement ICollection<T> or ICollection, and return the Length.
It's just that some types implement ICollection<T> but not ICollection, which in case getting the Count property can only be achieved with reflection, hence my suggestion.
The need for the ICountable is specifically is to support an easy determination of ICollection<T> with a known count/length, but T is unknown at compile time, because really the Count property of ICollection<T> has nothing to do with T, the reason ICollection<T> isn't covariant (i.e. ICollection<out T> like IEnumerable<T>, and doesn't inherit from ICollection (not really sure), is to avoid mixing types, if I get it correctly, but with that we find the Count property bound to T when it shouldn't.

ICountableEnumerable is precisely what I meant. Name it whatever you like but separate it out.

@karelz
Copy link
Member

karelz commented Aug 29, 2017

What would be the real-world use of such interface? Interfaces don't come for free - they have runtime impact, size on disk impact and also maintenance cost associated.

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 29, 2017

@karelz
Currently, getting the Count property for a generic ICollection of unknown type is only possible with reflection:

int GetCountIfAvailable(IEnumerable obj)
{
  if(obj is ICollection collection) 
    return collection.Count;

  var gCollection = typeof(ICollection<>);
  var iCollection = obj.GetType().GetInterfaces().FirstOrDefault(i =>
    i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));

  if (iCollection != null)
    return (int)iCollection.GetProperty("Count").GetValue(obj);

  return -1; //count unavailable
}

If we branch out the Count property (which really has nothing to do with the collection type in first place), we'll be able to achieve the above by:

int GetCountIfAvailable(IEnumerable obj) =>
  return obj is ICountable countable ? countable.Count : -1;

Additionally, ICollection<T> doesn't inherit from ICollection, while their Count property should have been mutual. Having the intermediate interface shared between them will spare us from always having to check for ICollection and ICollection<T> separately when the collection size is available without iterating.
Besides, the ability to modify a collection and the ability to get its size are two different aspects and shouldn't be tied together in one interface.

@khellang
Copy link
Member

khellang commented Aug 29, 2017

Note that if we really want to go down this road that not all "countable" types have a property named Count.

That's no problem. You'd just implement the interface explicitly:

public class SomeCollectionWithLengthProperty : ICountable
{
    public int Length { get; }
    public int ICountable.Count => Length;
}

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 29, 2017

@khellang @NightOwl888 in fact, arrays and some other types actually do just this.

@NightOwl888
Copy link

NightOwl888 commented Aug 29, 2017

I do see the advantage with covariance, but it doesn't seem right to force every IEnumerable<T> to implement a Count property that might not be practical to do. Perhaps there could be an ICountableEnumerable<T> (inheriting IEnumerable<T>) that exists between IEnumerable<T> and ICollection<T>, which would not break backward compatibility (except perhaps with LINQ).

I am going to have to retract this statement. There already is a IReadOnlyCollection<T> type that is effectively the same thing as the ICountableEnumerable<T> type I envisioned, being a covariant alternative to ICollection<T>.

As for the ICollection vs ICollection<T> vs IReadOnlyCollection vs IReadOnlyCollection<T> issue, I agree there is a gap here. It would be nice if there were a way to work with any collection without having to handle several different interface types.

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 29, 2017

@NightOwl888

There already is a IReadOnlyCollection type that is effectively the same thing as the ICountableEnumerable type I envisioned, being a covariant alternative to ICollection.

Yes, but IReadOnlyCollection means something else in it's essence, and as you said for yourself, arrays, and other common collection types don't implement it.

What I am after is a centralized interface that tells that any collection (IEnumerable) implementing it, has a known size without having to iterate it.

@svick
Copy link
Contributor

svick commented Aug 29, 2017

@weitzhandler T[] does implement IReadOnlyCollection<T>. What common collection types do you think don't implement it?

@karelz
Copy link
Member

karelz commented Aug 29, 2017

While I understand the technical value on the caller side, I am still not sure how common such code is which needs to know the Count out of IEnumerable (if available). IMO it is usable mostly/only in Linq.
Adding new interface just to make Linq code easier, maybe a bit faster (would be nice to see the numbers) seems like overkill to me.

Or are there other real-world usages where this would provide significant value?

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 30, 2017

@NightOwl888 @svick The problem with IReadOnlyCollection<T> is again, it's generic, and I'm looking for a non generic interface that should expose Count and should be common to ICollection<T> and ICollection, so any collection with a known size will be available regardless to its generic definition.

@karelz it could be a bit faster, and it could be n times faster depending on the collection size. I bump into this scenario every day. Given an IEnumerable that can be either ICollection<T>, ICollection, add to this if, T is unknown.

  • My code gets splattered with ICollection and ICollection<T> because I'm dealing with large collections
  • If I don't know the collection type I can't rely on the Count property, even I know it'll be an ICollection<T> at runtime

Here's an example scenario, and that's when T is known. Add to this class plenty more check for when T is unavailable. Here's another example (lines 81-95).

@NightOwl888
Copy link

As someone who has recent experience dealing with custom .NET collection types and the challenges surrounding them, let me see if I can offer up a solution that would work not just for the Count property, but for better APIs around collections in general.

The main issue here is not that we are lacking an interface, it is that we are lacking any kind of relationship between the interfaces to make them generally usable within APIs. One of the biggest issues is there is no way to create APIs that work with both generic and non-generic interfaces without resorting to lots of reflection and casting. This is not limited to getting the Count property from an IEnumerable, but is a challenge for any API logic that needs to work with both generic and non-generic collection types.

Another issue exists when trying to implement custom collection types. We typically can't just pick an interface and go, we typically need to implement several interfaces, which involve time and research. And none of the existing collection types are very helpful. Although they are not sealed, none of their members are virtual, so we are always starting from scratch to make custom collection types in .NET because nothing was made reusable.

Back on point, let's say for the sake of argument we want to make a method that works on any IList or IList<T> (that is, any generic or non-generic Array or List). There isn't one interface in the box that can do this for us. The best we can do is use IList, since that is the only interface we have that meets all of our criteria.

void DoSomething(IList list)

Now, a problem with doing this is that IList<T> doesn't inherit IList. So, we have to check which type we are dealing with in order to get to any functionality, even if that functionality is common between the two interfaces. And as @weitzhandler pointed out, we are stuck with Reflection to deal with the generic type case since we have no idea what type T actually is at runtime.

void DoSomething(IList list)
{
    if (list.GetType().ImplementsGenericInterface(typeof(IList<>))
          // handle generic
    else
         // handle non-generic
}

private static bool ImplementsGenericInterface(this Type target, Type interfaceType)
{
    return target.GetTypeInfo().IsGenericType && target.GetGenericTypeDefinition().GetInterfaces().Any(
                x => x.GetTypeInfo().IsGenericType && interfaceType.IsAssignableFrom(x.GetGenericTypeDefinition())
    );
}

To further complicate things, although all of the built-in types implement all of the interfaces to ensure our API works, there are no guarantees that custom collection types will implement all of the interfaces that we may need (depending on the functionality we are after). So despite our best efforts to make APIs that "just work", there are no guarantees that they always will.

What's missing from the .NET toolbox is the fact there are no abstractions that allow us to design APIs that seamlessly work with both generic and non-generic types. So, I propose that we make abstract classes to fill that gap. This is not without precedence - this is exactly how they do it in Java.

Although in .NET, we need to explicitly handle both generic and non-generic types and we might also make abstractions for the covariant types.

  • AbstractReadOnlyCollection
  • AbstractReadOnlyCollection<T>
  • AbstractCollection
  • AbstractCollection<T>
  • AbstractReadOnlyList
  • AbstractReadOnlyList<T>
  • AbstractList
  • AbstractList<T>
  • AbstractReadOnlySet
  • AbstractReadOnlySet<T>
  • AbstractSet
  • AbstractSet<T>
  • AbstractReadOnlyDictionary
  • AbstractReadOnlyDictionary<T>
  • AbstractDictionary
  • AbstractDictionary<T>

Here is a basic example. The real advantage comes with the inheritance hierarchy which ensures our contract is shared between generic and non-generic implementations.

    // All read-only non-generic arrays, lists, sets, and dictionaries subclass this
    public abstract class AbstractReadOnlyCollection : IEnumerable
    {
        public abstract int Count { get; }

        public IEnumerator GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }

    // All read-only generic arrays, lists, sets, and dictionaries subclass this
    public abstract class AbstractReadOnlyCollection<T> : AbstractReadOnlyCollection, IReadOnlyCollection<T>, IEnumerable<T>, IEnumerable
    {
        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            throw new NotImplementedException();
        }
    }

    // All non-generic arrays, lists, sets, and dictionaries subclass this
    public abstract class AbstractCollection : AbstractReadOnlyCollection, ICollection, IEnumerable
    {
        public abstract bool IsSynchronized { get; }
        public abstract object SyncRoot { get; }

        public abstract void CopyTo(Array array, int index);
    }

    // All generic arrays, lists, sets, and dictionaries subclass this
    public abstract class AbstractCollection<T> : AbstractReadOnlyCollection<T>, ICollection<T>, IReadOnlyCollection<T>, ICollection, IEnumerable<T>, IEnumerable
    {
        public abstract bool IsReadOnly { get; }
        public abstract bool IsSynchronized { get; }
        public abstract object SyncRoot { get; }

        public abstract void Add(T item);
        public abstract void Clear();
        public abstract bool Contains(T item);
        public abstract void CopyTo(T[] array, int arrayIndex);
        public abstract void CopyTo(Array array, int index);
        public abstract bool Remove(T item);
    }

Note that the above is intended as an example of the public API contract, not as a finished implementation.

With that in place, it becomes much easier to implement custom collection types, since a lot of the boilerplate code can be moved into the abstract base types. For example, we could potentially move the List<T>.Sort() functionality into an abstract class so all IList<T> implementations don't need to create their own implementation. In addition, all of our interfaces are guaranteed to exist by all implementations.

Note also that none of this is breaking - it is completely backward compatible with existing APIs.

But more on point, @weitzhandler won't need to make a custom function to get the Count, he simply needs to accept the lowest level abstract collection and it will always work whether collection is generic or not (including Array types). In fact, all shared functionality is readily available without reflection or casting.

void DoSomething(AbstractReadOnlyCollection collection)
{
    var count = collection.Count;
}

@weitzhandler - do you think this proposal would address all of your concerns?

@karelz - As you can see, this proposal has a broader real-world impact, both around creating APIs that deal with collection types and with creating custom collection types. Both of these tasks are challenging to do with today's .NET APIs. LINQ is a great for working with generic collections, but doesn't bridge the gap between generic and non-generic collections and in some cases the performance cost is too high to be practical.

Collection Equality

One other thing I would like to chime in on is the fact that in .NET it is very difficult to compare 2 collections and their nested collections for equality. This is yet another common thing that is asked for that Enumerable.SequenceEqual only covers part of because dictionaries and sets require the comparison to be done regardless of order, and it also doesn't work recursively. This is a very common requirement during unit testing.

While I don't propose we change the default behavior, it would be nice if there were a constructor parameter that could be used to change the equality checking on built-in collection types from reference equality to deep value equality checking including all nested collection types. The actual default implementations of both of these strategies could be put into the abstract base classes, giving implementers a choice of which type of equality checking they prefer.

Perhaps there could also be a DeepEquals() and GetDeepHashCode() or something along those lines so APIs could have their choice of equality checking. Food for thought...more consideration required.

@svick
Copy link
Contributor

svick commented Aug 30, 2017

@NightOwl888

One of the biggest issues is there is no way to create APIs that work with both generic and non-generic interfaces without resorting to lots of reflection and casting.

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

And none of the existing collection types are very helpful. Although they are not sealed, none of their members are virtual, so we are always starting from scratch to make custom collection types in .NET because nothing was made reusable.

There is Collection<​T>, which exists exactly for this purpose and has virtual members.

I propose that we make abstract classes to fill that gap.

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

@NightOwl888
Copy link

NightOwl888 commented Aug 30, 2017

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

For one, to implement the aforementioned deep equality checking so it works on existing platform collection types. I have no control over what type of collections that a user might utilize in a generic type and since I opted to save time and only support nested generic collection types (without support for non-generic collections), there are gaps in my implementation.

But in a nutshell, I am running into the same challenges that @weitzhandler is.

There is Collection<​T>, which exists exactly for this purpose and has virtual members.

Thanks for pointing that out. Although, that still sounds too low level to be very useful, since lists, sets, and dictionaries have common behaviors that are generally very different from one another (for example, it only makes sense to sort lists in general, and most sets and dictionaries have undefined sort order).

It is also not commonly implemented by existing collection types (for example List<T> or Dictionary<T>), and therefore does not serve as a common way to work with all collections from APIs (which is the primary reason for this proposal).

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

Thanks again. I will do that when I get a chance and take into account the existing abstract Collection<T>.

@weitzhandler
Copy link
Contributor Author

I'm on mobile not able to text much.
Anyway I'm quite happy with how the collection system works.
I would appreciate if List<T> and Dictionary<TKey, TValue> can become virtual indeed, but for that I'll suggest opening a new issue.
I'm here only to request the count property to be extracted and implemented by all known size collections regardless of type.

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 31, 2017

@svick

Why do you need to work with non-generic interfaces in the first place? Is it because of some legacy API?

There are many cases, and I often bump into them. One is when you need to store a collection of ICollection<T> where T is of various types. Every generic type, when it needs to be worked on in a batched manner has to become non-generic, or else it has to be done via reflection. Also whenever T isn't supposed to be known, we'll want to use the common non-generic methods before going the reflection ugly path.


@svick

@NightOwl888 And none of the existing collection types are very helpful. Although they are not sealed, none of their >>members are virtual, so we are always starting from scratch to make custom collection types in .NET >>because nothing was made reusable.

There is Collection<​T>, which exists exactly for this purpose and has virtual members.

What I was just gonna say. Tho it would be nice having List<T>, Dictionary<TKey, TValue> and some other common implementations more virtual. Collection<T> is far from being as powerful as List<T>. But for this comes what you said:

I think you should open a separate proposal for that, it's far out of scope of this issue. (Which is also why I won't comment on the proposal itself here.)

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Aug 31, 2017

Really this post belongs on the coreclr repo. Migrated.

@svick
Copy link
Contributor

svick commented Aug 31, 2017

@weitzhandler

Really this post belongs on the coreclr repo.

It does not, all API proposals for .Net Core belong to this repo:

For all managed API addition proposals use the CoreFX Issues Page and follow the API Review Process.

That applies even for proposals whose implementation is going to be in the CoreCLR repo.

@weitzhandler weitzhandler reopened this Aug 31, 2017
@ianhays
Copy link
Contributor

ianhays commented Sep 11, 2017

I like the idea of this from a design perspective e.g. if we were starting anew.

However, my issue with this proposal is how many changes it would require devs to make i.e.

Implicit implementations of ICollection, ICollection and IReadOnlyCollection will have to change the pattern to ICountable (or if we call it IReadOnlyCollection or anything else).
Implementations that call the property Count on ICollection (or the others) as "declared", will have to switch to either calling the runtime property, or call it on ICountable etc.

I don't think that being able to have a shared interface with the Count property is enough of a convenience to justify those breaking changes.

Because of the required breaking changes, it would be very unlikely that we would be able to push a change this large through the API review process without a huge number of people that wanted it.

@zeldafreak
Copy link

zeldafreak commented Nov 29, 2017

EDIT: I've pivoted towards standardizing the implementation of collections rather than introducing another interface. See my later comment.


I, too, wish there was a go-to means for recognizing a collection as opposed to an iterable, whether that be an interface (ICountable, IReadOnlyCollection, IHasCount etc.) or something else. The prevalence of IEnumerable makes it a perfect lowest common denominator for something you iterate over. There's currently no lowest common denominator for a collection-- instead, there are 3 different interfaces with their own Count property: ICollection, ICollection<T>, and IReadOnlyCollection<T>. I don't know how Span<T>/Memory<T> will end up impacting the framework-- perhaps they'll make things easier, or perhaps it'll be one more "collection-like" type requiring special handling.

I did want to address @NightOwl888's proposed abstract base classes: you wouldn't be able to use them for collections that are value types, such as ImmutableArray<T>.

@karelz
Copy link
Member

karelz commented Oct 9, 2018

Personally, I am not convinced the value is worth the effort & overhead of yet another interface. But I let @safern make the final call here as area owner.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@eiriktsarpalis eiriktsarpalis removed this from the 5.0.0 milestone Jun 24, 2020
@eiriktsarpalis eiriktsarpalis added this to the Future milestone Jun 24, 2020
@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Jun 24, 2020
@eiriktsarpalis
Copy link
Member

Currently, getting the Count property for a generic ICollection of unknown type is only possible with reflection

We typically use ICollection to determine the count for non-generic enumerables. So perhaps this is an issue about having more collections implement this interface explicitly?

While I do acknowledge that ICollection is more complex than what is being proposed here, my concerns around introducing a new interface is summed up perfectly by this xkcd classic.

Assuming we did decide to add a non-generic IReadOnlyCollection interface, most likely it would not be inherited by IReadOnlyCollection<T> or ICollection<T>.

@zeldafreak
Copy link

I'm going to pivot from my 2017 position and agree with @eiriktsarpalis. If we had the ability to start over from scratch, I'd advocate for a non-generic IReadOnlyCollection hard. But you're right, if you added that interface today, you couldn't have the generic interfaces inherit from it without huge breaking changes.

I think the only workable solution is to pick one of the interfaces, such as ICollection, and make sure every collection (looking at you, HashSet<T>) in the framework implements it. Document the fact that for the sake of runtime performance, any custom collection type should implement that interface. Document a recommended way collections should implement the aspects of ICollection that are not always applicable (e.g. Add and Remove for read-only collections). Perhaps make use of Roslyn analyzers to enforce these patterns (e.g. look for types that implement ICollection<T>, perhaps IReadOnlyCollection<T> as well, and warn if they don't implement ICollection).

If the framework cannot provide a way to unify these interfaces, the next best thing is for its types and documentation to be consistent and clear that implementing ICollection is the standard practice. Clear documentation is equally (if not more) important as the code changes. Then there will be a clear path forward, a "standard" way to obtain the Count, and only those forced to work with non-standard collection types would have to resort to ugly reflection hacks. The goal is no longer elimination, but mitigation.

@weitzhandler
Copy link
Contributor Author

weitzhandler commented Nov 17, 2020

Related: #42254, #24793.

@eiriktsarpalis eiriktsarpalis changed the title Make a new interface ICountable Add a non-generic IReadOnlyCollection interface Oct 29, 2021
@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Oct 29, 2021

FWIW here's what a non-generic IReadOnlyCollection could look like:

namespace System.Collections
{
+    public interface IReadOnlyCollection : IEnumerable
+    {
+        public int Count { get; }
+    }

    public partial interface ICollection 
+        : IReadOnlyCollection
    {
-         public int Count { get; }
    }
}

namespace System.Collections.Generic
{
    public partial interface IReadOnlyCollection<T>
+        : IReadOnlyCollection
    {
-        public int Count { get; }
    }

    public partial interface ICollection<T>
+        : IReadOnlyCollection
    {
-        public int Count { get; }
    }
}

I'm not sure if this could be considered a breaking change (only scenario I can think of is it would change reflection metadata of the Count PropertyInfo). I don't know if we have precedent doing something similar in the past either.

cc @stephentoub

@jhudsoncedaron
Copy link

Hmm. Not seeing it. I think the use cases need to be expanded a bit. I can't come up with a use case where IReadOnlyCollection (no generic) would work where IReadOnlyCollection<object> wouldn't because of the covariant casting.

@mikernet
Copy link
Contributor

mikernet commented Oct 29, 2021

@jhudsoncedaron Try casting an IReadOnlyCollection<int> to IReadOnlyCollection<object> :)

@svick
Copy link
Contributor

svick commented Oct 30, 2021

@eiriktsarpalis I think it would be a breaking change for any type that implements ICollection.Count explicitly, though that could be probably fixed using a DIM:

public interface ICollection : IReadOnlyCollection
{
    int Count { get; }

    int IReadOnlyCollection.Count => Count;
}

@Timovzl
Copy link

Timovzl commented Nov 1, 2021

As a fellow library author, I, too, run into this issue over and over.

To name just one simple use case, JSON deserializers tend to need not just a generic Deserialize<T>(string json) overload, but also a Deserialize(Type type, string json) overload. To name another, earlier in this thread there was some talk about deep equality checking in collections. The initial collection may be known to implement IReadOnlyCollection<T>, but we do not know at compile time that T is going to be, say, an Order[], into which we would need to recurse.

There are plenty of cases, particularly in class libraries, where we need to support being unable to rely on the generic collection type.

I believe @zeldafreak summarized the issue perfectly:

The prevalence of IEnumerable makes it a perfect lowest common denominator for something you iterate over. There's currently no lowest common denominator for a collection-- instead, there are 3 different interfaces with their own Count property: ICollection, ICollection, and IReadOnlyCollection.

Remember, one key difference between just being an IEnumerable vs. being some IEnumerable with a Count is that the latter is materialized.

We have a single abstraction for any enumerable sequence, but we are missing one for any materialized sequence.

I think the only workable solution is to pick one of the interfaces, such as ICollection.

I would steer clear of this idea. If class libraries were to depend on ICollection as an indicator of something being a materialized sequence, they are asking for way more than they should be. Whenever a developer needs to create a custom type for some immutable collection, they are very likely to overlook this "requirement": ICollection is implicates mutability and thus isn't a natural fit. I don't think documentation cures this, since there is no direct cause for looking up documentation. As a result, the behavior of class libraries would needlessly degenerate to the "I'm not sure if this is materialized" path.

Instead, we should achieve a situation where implementing either IReadOnlyCollection<T>, ICollection<T>, or ICollection (directly or indirectly) is enough. That is the natural indicator of being materialized, as opposed to being just a plain IEnumerable<T>.

@Timovzl
Copy link

Timovzl commented Nov 1, 2021

As a note of interest, we might identify a third level of materialization/concreteness:

  1. Enumerable sequence
  2. Materialized sequence
  3. Indexable sequence

By indexable sequence I mean a sequence that supports T this[int index], like lists and arrays. This is another lacking abstraction at the moment: we would need to check for both IReadOnlyList<T> and IList<T> and implement the two separately, and we have no non-generic alternative.

Let's look at an example. Say a class library needs to call GetHashCode() on the elements of a given IEnumerable<T>, and return an IReadOnlyCollection<int> containing the resulting hash codes.

  • If the input sequence is not materialized, we need to use an enumerator, and a dynamically growing List<int> for the output.
  • If the input sequence is materialized (the subject of this thread), the output can simply be an int[] pre-initialized to the right capacity. This saves allocations and copies, and produces an output value that saves a layer of indirection.
  • If the input sequence is indexable, we can add a second optimization: we can avoid allocating an enumerator, and instead use a regular for loop and index directly to the input elements.

I do not mean to derail the thread. Clearly, having an abstraction for a materialized sequence is the first and primary concern. But it seems interesting to keep in mind that there might be another level of abstraction that is missing in the exact same way.

@eiriktsarpalis
Copy link
Member

@eiriktsarpalis Eirik Tsarpalis FTE I think it would be a breaking change for any type that implements ICollection.Count explicitly, though that could be probably fixed using a DIM:

@svick good point, I had not realized this doesn't work for inherited interface members. Your DIM suggestion seems interesting though, on quick inspection it doesn't seem to suffer from the usual diamond ambiguity issues since that signature is already taken by the derived interfaces.

namespace System.Collections
{
+    public interface IReadOnlyCollection : IEnumerable
+    {
+        int Count { get; }
+    }

    public partial interface ICollection 
+        : IReadOnlyCollection
    {
-         int Count { get; }
+        new int Count { get; }
+        int IReadOnlyCollection.Count => Count;
    }
}

namespace System.Collections.Generic
{
    public partial interface IReadOnlyCollection<T>
+        : IReadOnlyCollection
    {
-        int Count { get; }
+        new int Count { get; }
+        int IReadOnlyCollection.Count => Count;
    }

    public partial interface ICollection<T>
+        : IReadOnlyCollection
    {
-        int Count { get; }
+        new int Count { get; }
+        int IReadOnlyCollection.Count => Count;
    }
}

This is likely not a source breaking change but would need to double check that no runtime breaks occur because of it.

@NightOwl888
Copy link

earlier in this thread there was some talk about deep equality checking in collections.

An update on this. We ended up having to implement a (nearly) complete set of generic collections that are structurally comparable. They rely on a StructuralEqualityComparer implementation that has 2 modes:

  • Default - highly optimized path using collections and arrays that all implement IStructuralEquatable
  • Aggressive - same optimized path as Default that falls back to using Reflection/dynamic to determine what type of collection we are dealing with, for supporting BCL and custom collection types that do not implement IStructuralEquatable, but do implement IList<T>, ISet<T> or IDictionary<TKey, TValue>

The documentation describes this in more detail, and of course you can analyze the code to see how hairy this gets in Aggressive mode when we are dealing with BCL collection types.

https://github.com/NightOwl888/J2N/blob/b7f80ca424af2b077b5d42beff4d9fbad8b05ef5/src/J2N/Collections/StructuralEqualityComparer.cs#L37-L63

The initial collection may be known to implement IReadOnlyCollection<T>, but we do not know at compile time that T is going to be, say, an Order[], into which we would need to recurse.

Actually, arrays do implement IStructuralEquatable, but there are 2 issues with the implementation that makes it unsuitable for deep equality checking:

  1. Boxing if the array contains value types
  2. It doesn't cascade the check to see if the array type implements IStructuralEquatable

I did want to address @NightOwl888's proposed abstract base classes: you wouldn't be able to use them for collections that are value types, such as ImmutableArray<T>.

This is a valid argument. Do note that:

  1. Java has no structs, so this is why it works in Java.
  2. Java also considers array to be a different animal than a collection, where .NET uses common interfaces on both.

But this does point out another hole in our implementation where we didn't take this special case into account because it does not subclass Array.

@NightOwl888
Copy link

An alternative way around a non-generic IReadOnlyCollection would be to provide a similar wildcard generic feature that Java has:

int? GetCountIfKnown(IEnumerable enumerable) =>
  (enumerable as IReadOnlyCollection<?>)?.Count;

There are many cases we need to access the members of an instance cast to a type that requires a generic parameter. And we are passed a reference that doesn't include anything about the parameter (object, for example). But we don't even care what the generic closing type it is, we just care about calling members of the instance.

What to do about calls to instance members that include the generic type? A few options:

  1. Ideally the compiler could just resolve them implicitly.
  2. Allow them to be resolved with a cast.
  3. A compile error indicating the generic closing type is unresolved for that member.

This would solve the issue in a more widespread way and eliminate the constant need to go "back to non-generic" by creating new interfaces.

Of course, this doesn't solve the multiple identity disorder of having to have branching code to check which of many collection interfaces we are dealing with, but it does fix a lot of other problems with analyzing generics without having their closing types handy.

@mikernet
Copy link
Contributor

mikernet commented Nov 2, 2021

@NightOwl888 I was vouching for a .NET implementation of wildcard generics with a proposal here: dotnet/csharplang#1992

@terrajobst
Copy link
Member

I believe this need was addressed with Enumerable.TryGetNonEnumeratedCount().

@eiriktsarpalis
Copy link
Member

@terrajobst TryGetNonEnumeratedCount only works with generic enumerables. We should keep this open unless we feel we don't want to support the non-generic scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections
Projects
None yet
Development

No branches or pull requests