Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IEqualityComparer for sequences (T[], Memory<T>, IEnumerable<T>) #44796

Open
GSPP opened this issue Nov 17, 2020 · 14 comments
Open

IEqualityComparer for sequences (T[], Memory<T>, IEnumerable<T>) #44796

GSPP opened this issue Nov 17, 2020 · 14 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections
Milestone

Comments

@GSPP
Copy link

GSPP commented Nov 17, 2020

Problem:

Sometimes, it is useful to use arrays as keys in a map such as Dictionary. A motivating use case would be blockchain applications. Here, we often need to map a hash to some piece of data. For example, we might want to map a 32 byte block hash (byte[]) to block metadata.

Proposal:

There should be built-in equality comparers for sequences. By default, each element is default-compared. It should be possible to use a custom nested comparer.

I propose this API surface:

public static class EqualityComparer
{
    public static IEqualityComparer<T[]> Array<T>(IEqualityComparer<T> elementComparer = null);
    public static IEqualityComparer<Memory<T>> Memory<T>(IEqualityComparer<T> elementComparer = null);
    public static IEqualityComparer<IEnumerable<T>> Sequence<T>(IEqualityComparer<T> elementComparer = null);
}

Example usage:

var dict = new Dictionary<byte[], BlockHeader>(EqualityComparer.Array<byte>());

These factory methods can internally pick an optimal comparer for the given arguments. For example, for arrays of primitive types, we can use memcmp-style algorithms internally. If a custom comparer is provided, an element-by-element comparison algorithm must be used. A cached instance can be used if no custom comparer is provided.

Possible extension: The same thing could be done for IComparer<T[]> (see #33873).

@GSPP GSPP added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Nov 17, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Nov 17, 2020
@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@huoyaoyuan
Copy link
Member

I don't think adding excessive overloads for matrix of current collections and operations is a good idea.

I'm looking forward to improvement of type system (HKT, Type Classes). Then they can be written once for all kind of certain abstraction.

@ghost
Copy link

ghost commented Nov 17, 2020

Tagging subscribers to this area: @eiriktsarpalis, @jeffhandley
See info in area-owners.md if you want to be subscribed.

Issue Details
Description:

Problem:

Sometimes, it is useful to use arrays as keys in a map such as Dictionary. A motivating use case would be blockchain applications. Here, we often need to map a hash to some piece of data. For example, we might want to map a 32 byte block hash (byte[]) to block metadata.

Proposal:

There should be built-in equality comparers for sequences. By default, each element is default-compared. It should be possible to use a custom nested comparer.

I propose this API surface:

public static class EqualityComparer
{
    public static IEqualityComparer<T[]> Array<T>(IEqualityComparer<T> elementComparer = null);
    public static IEqualityComparer<Memory<T>> Memory<T>(IEqualityComparer<T> elementComparer = null);
    public static IEqualityComparer<IEnumerable<T>> Sequence<T>(IEqualityComparer<T> elementComparer = null);
}

Example usage:

var dict = new Dictionary<byte[], BlockHeader>(EqualityComparer.Array<byte>());

These factory methods can internally pick an optimal comparer for the given arguments. For example, for arrays of primitive types, we can use memcmp-style algorithms internally. If a custom comparer is provided, an element-by-element comparison algorithm must be used. A cached instance can be used if no custom comparer is provided.

Possible extension: The same thing could be done for IComparer<T[]> (see #33873).

Author: GSPP
Assignees: -
Labels:

api-suggestion, area-System.Collections, untriaged

Milestone: -

@eiriktsarpalis
Copy link
Member

I think the request seems reasonable. The API design should probably be made in unison with #33873.

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Nov 25, 2020
@eiriktsarpalis
Copy link
Member

Should we be also supporting ReadOnlyMemory<T> comparers?

@Joe4evr
Copy link
Contributor

Joe4evr commented Nov 26, 2020

Should we be also supporting ReadOnlyMemory<T> comparers?

Is there a case that Memory<T> needs to be treated differently from ReadOnlyMemory<T>? If no, then I'd say to only have ReadOnlyMemory<T> comparers, since converting from the writable version to ReadOnly is implicit.

@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Nov 26, 2020

Agreed, so perhaps something like this?

namespace System.Collections.Generic
{
+    public static class EqualityComparer
+    {
+        public static IEqualityComparer<T[]> Array<T>(IEqualityComparer<T>? elementComparer = null);
+        public static IEqualityComparer<ReadOnlyMemory<T>> ReadOnlyMemory<T>(IEqualityComparer<T>? elementComparer = null);
+        public static IEqualityComparer<IEnumerable<T>> Sequence<T>(IEqualityComparer<T>? elementComparer = null);
+    }
}

@huoyaoyuan
Copy link
Member

Any annotation to clarify it's sequence equality instead of set equality?

@eiriktsarpalis
Copy link
Member

Any annotation to clarify it's sequence equality instead of set equality?

Hmm, I can't think of a reason why this could be assumed to use set equality (or why would ever want to introduce set equality for these types to require some form of disambiguation). But perhaps you're right that we should be explicit about the semantics in the combinator names. But I can't think of a naming convention better than this:

namespace System.Collections.Generic
{
+    public static class EqualityComparer
+    {
+        public static IEqualityComparer<T[]> ArraySequenceEquality<T>(IEqualityComparer<T>? elementComparer = null);
+        public static IEqualityComparer<ReadOnlyMemory<T>> MemorySequenceEquality<T>(IEqualityComparer<T>? elementComparer = null);
+        public static IEqualityComparer<IEnumerable<T>> EnumerableSequenceEquality<T>(IEqualityComparer<T>? elementComparer = null);
+    }
}

@eiriktsarpalis eiriktsarpalis added this to the Future milestone Dec 9, 2020
@eiriktsarpalis eiriktsarpalis added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Dec 9, 2020
@eiriktsarpalis eiriktsarpalis added untriaged New issue has not been triaged by the area owner and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Apr 19, 2021
@eiriktsarpalis
Copy link
Member

Related to #48304.

@eiriktsarpalis eiriktsarpalis added api-needs-work API needs work before it is approved, it is NOT ready for implementation and removed untriaged New issue has not been triaged by the area owner labels Apr 19, 2021
@terrajobst terrajobst removed the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jun 25, 2021
@qwertie
Copy link

qwertie commented Jul 25, 2021

@Joe4evr A Dictionary with Memory<T> keys would need a Memory<T> equality comparer, since implicit coersion to ReadOnlyMemory doesn't work inside generic code. Arguably one should not use Memory<T> keys, though.

@Joe4evr
Copy link
Contributor

Joe4evr commented Jul 26, 2021

@qwertie If you absolutely need that for some ungodly reason, you can make a wrapper easily enough:

internal sealed class MemoryComparerWrapper<T> : IEqualityComparer<Memory<T>>
{
    private readonly IEqualityComparer<ReadOnlyMemory<T>> _wrappedComparer;

    public MemoryComparerWrapper(IEqualityComparer<T>? elementComparer = null)
        => _wrappedComparer = EqualityComparer.MemorySequenceEquality(elementComparer);

    public bool Equals(Memory<T> x, Memory<T> y) => _wrappedComparer.Equals(x, y);
    public int GetHashCode(Memory<T> obj) => _wrappedComparer.GetHashCode(obj);
}

@qwertie
Copy link

qwertie commented Jul 26, 2021

That would double the number of interface calls per comparison, reducing performance. Keep in mind: performance is what motivates the use of Memory<T> in the first place.

@eiriktsarpalis
Copy link
Member

Related to #27039.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Collections
Projects
None yet
Development

No branches or pull requests

8 participants