Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "Create" method to EqualityComparer<> class #24460

Closed
Tracked by #79053
jkotas opened this issue Dec 14, 2017 · 16 comments · Fixed by #75212
Closed
Tracked by #79053

Add "Create" method to EqualityComparer<> class #24460

jkotas opened this issue Dec 14, 2017 · 16 comments · Fixed by #75212
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime
Milestone

Comments

@jkotas
Copy link
Member

jkotas commented Dec 14, 2017

EDITED by @stephentoub on 4/26/2022 to update proposal:

namespace System.Collections.Generic
{
    public abstract class EqualityComparer<T> : IEqualityComparer<T>, IEqualityComparer
    {
+        public static EqualityComparer<T> Create(Func<T?, T?, bool> equals, Func<T, int> getHashCode);
    }
}

To discuss:

  • Making getHashCode optional such that the returned instance's GetHashCode would throw NotSupportedException if it's used.

From @ViIvanov on December 14, 2017 20:15

Now we have a Create method in Comparer<> class:

public static Comparer<T> Create(Comparison<T> comparison)

I think, it will be usable to have an analog in EqualityComparer<>:

public static EqualityComparer<T> Create(Func<T, T, bool> equals, Func<T, int> getHashCode)

Copied from original issue: dotnet/coreclr#15526

@JonHanna
Copy link
Contributor

Not quite as handy since we need two delegates, and there isn't as much of an existing use as there is with Comparison, but I could definitely see this as something I'd use while experimenting at least, even if I might end up replacing with a custom equality comparer.

@karelz
Copy link
Member

karelz commented Dec 15, 2017

What is the value? Example of usage today vs. in future? How commonly is it used?
Ideally please follow formal API proposal (see the example there).

From a brief look it seems today's state - inheritance + overriding 2 methods is not too much extra code compared to the API proposed above.
The case of Comparer is more needed IMO to avoid adding another class with boilerplate.

@weitzhandler
Copy link
Contributor

weitzhandler commented Feb 20, 2018

I also find that very useful.
It's annoying when you always have to implement an interface to be able to compare two property values.

Though I'd prefer the method to be in a shorter version:

public static IEqualityComparer<T> Create(params Expression<Func<T, object>>[] properties);

Which will compare the two objects if the specified properties are equal, and will generate hash-codes based on the provided lambda properties.

Usage:

var eqComparer1 = EqualityComparer<Contact>.Create(c => c.FullName);
var eqComparer2 = EqualityComparer<Contact>.Create(c => c.FirstName, c => c.LastName);

Here's some untested prototype:

public class EqualityComparerImpl<T> : IEqualityComparer<T>
{
  public static EqualityComparerImpl<T> Create(
    params Expression<Func<T, object>>[] properties) =>
    new EqualityComparerImpl<T>(properties);

  PropertyInfo[] _properties;
  EqualityComparerImpl(Expression<Func<T, object>>[] properties)
  {
    if (properties == null)
      throw new ArgumentNullException(nameof(properties));

    if (properties.Length == 0)
      throw new ArgumentOutOfRangeException(nameof(properties));

    var length = properties.Length;
    var extractions = new PropertyInfo[length];
    for (int i = 0; i < length; i++)
    {
      var property = properties[i];
      extractions[i] = ExtractProperty(property);
    }
    _properties = extractions;
  }

  public bool Equals(T x, T y)
  {
    if (ReferenceEquals(x, y))
      //covers both are null
      return true;
    if (x == null || y == null)
      return false;
    var len = _properties.Length;
    for (int i = 0; i < _properties.Length; i++)
    {
      var property = _properties[i];
      if (!Equals(property.GetValue(x), property.GetValue(y)))
        return false;
    }
    return true;
  }

  public int GetHashCode(T obj)
  {
    if (obj == null)
      return 0;

    var hashes = _properties.Select(pi => pi.GetValue(obj)?.GetHashCode() ?? 0).ToArray();
    return Combine(hashes);
  }

  static int Combine(int[] hashes)
  {
    int result = 0;
    foreach (var hash in hashes)
    {
      uint rol5 = ((uint)result << 5) | ((uint)result >> 27);
      result = ((int)rol5 + result) ^ hash;
    }
    return result;
  }

  static PropertyInfo ExtractProperty(Expression<Func<T, object>> property)
  {
    if (property.NodeType != ExpressionType.Lambda)
      throwEx();

    var body = property.Body;
    if (body.NodeType == ExpressionType.Convert)
      if (body is UnaryExpression unary)
        body = unary.Operand;
      else
        throwEx();

    if (!(body is MemberExpression member))
      throwEx();

    if (!(member.Member is PropertyInfo pi))
      throwEx();

    return pi;

    void throwEx() =>
      throw new NotSupportedException($"The expression '{property}' isn't supported.");
  }
}

And BTW, there IS demand for this.

@peterduniho
Copy link

Is it hard to write a class implementing IEqualityComparer<T>? Nope. But it's tedious, and wordy.

That's why in every private "util" library I use, I eventually wind up with a delegate-based implementation of IEqualityComparer<T> that has a Create() method to accept the appropriate lambda expression delegates for the Equals() and GetHashCode() implementations. I get to write (copy/paste) that class in once, and then when I need an equality comparer, it's a one-liner in the actual code.

And to be clear: writing a new IEqualityComparer<T> for each use case in a non-starter. It's way too much extra code, and it hides the implementation details away in some other class, when what you really want is to be able to see those right where the code is used.

Does it come up a ton? Again, nope. I may need this maybe two or three times a year, tops. But then, so what? It's such a small thing to add, and yet a glaring omission, from the otherwise very helpful and convenient .NET API.

Is there a place to vote for this? As you can see, my vote would be a solid "yes".

@karelz
Copy link
Member

karelz commented Jul 18, 2018

@peterduniho upvote top post. It is sortable on GitHub by number of votes there ...

@peterduniho
Copy link

@karelz thanks. I don't see an "upvote" option per se, but I assume you mean the "thumbs up reaction" option. "Vote" submitted. :)

@stimpy77
Copy link

stimpy77 commented Aug 10, 2018

Can't even write a .Distinct(..) without also writing up a whole new IEqualityComparer class implementation, just for one LINQ clause, this is ridiculous.

Maybe Distinct(..) itself should tuck this away with its own overload(s) with a single lambda or a growable (params style) set of lambdas listing all the properties to find distinction on for just this one filter statement. The output of this being a generated implementation of IEqualityComparer. In which case I suppose it would be handled by a different team / group with a different github work item.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@GrabYourPitchforks GrabYourPitchforks added api-ready-for-review and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Feb 15, 2020
@terrajobst
Copy link
Member

terrajobst commented Jun 4, 2020

Video

Can someone update the sample to make a compelling case for the currently proposed API? The sample code was for a different proposal.

If Distinct is the only use case, then maybe we should fix Distinct itself.

@terrajobst terrajobst added api-needs-work API needs work before it is approved, it is NOT ready for implementation and removed api-ready-for-review labels Jun 4, 2020
@peterduniho
Copy link

Can someone update the sample to make a compelling case for the currently proposed API? The sample code was for a different proposal.
If Distinct is the only use case, then maybe we should fix Distinct itself.

I'm not sure what sample you mean, so I'm probably missing part of the point of the request. My apologies about that.. But this isn't just an issue with Distinct(). Another common scenario would be various dictionary types (e.g. Dictionary<TKey, TValue> and ConcurrentDictionary<TKey, TValue>). Pretty much any place you'd want an instance of IEqualityComparer<T>, it would be much nicer to be able to quickly define one using lambdas, rather than having to implement a whole new class every time.

The fact that utility libriaries used internally in numerous projects have defined this functionality over and over strongly suggests it'd be a useful addition to the API (for what it's worth, this has come up in pretty much every non-trivial project I've been involved with over the last ten years or so).

Yes, it's true that this could be addressed in other ways. For example, just forget about it and let everyone keep doing what they've been doing (i.e. create the helper methods themselves). Or each of the places in .NET where IEqualityComparer<T> can be used can add an overload where instead of passing IEqualityComparer<T>, one can pass the necessary lambdas to define the comparer. But putting this in a single centralized place where it can be used for any place one needs IEqualityComparer<T> seems to me to be a lot more useful approach.

@ericstj ericstj modified the milestones: 5.0.0, 6.0.0 Aug 5, 2020
@ericstj
Copy link
Member

ericstj commented Aug 5, 2020

Given where we are at in 5.0.0 I don't this this will make it. I do support the scenario, I've come across this before and have felt it jarring that I need to stop coding and create a new type, just when I want to tweak the comparison behavior of a collection.

@daviburg
Copy link

Needed that today to avoid repeatedly writing classes for Distinct in LINQ


    using System;
    using System.Collections.Generic;
    using System.Diagnostics.CodeAnalysis;

    /// <summary>
    /// Inline capable generic equality comparer
    /// </summary>
    /// <typeparam name="T">The type to compare for equality</typeparam>
    public class InlineableEqualityComparer<T> : EqualityComparer<T>
    {
        /// <summary>
        /// Type specific equality function
        /// </summary>
        private readonly Func<T, T, bool> equalityFunction;

        /// <summary>
        /// Type specific hash function
        /// </summary>
        private readonly Func<T, int> hashFunction;

        /// <summary>
        /// Prevents the creation of default instance, as it would be invalid
        /// </summary>
        private InlineableEqualityComparer()
        {
        }

        /// <summary>
        /// Creates a new instance of an equality comparer using the provided functions
        /// </summary>
        /// <param name="equalityFunction">The equality function</param>
        /// <param name="hashFunction">The hash function</param>
        public InlineableEqualityComparer(Func<T, T, bool> equalityFunction, Func<T, int> hashFunction)
        {
            this.equalityFunction = equalityFunction ?? throw new ArgumentNullException(nameof(equalityFunction));
            this.hashFunction = hashFunction ?? throw new ArgumentNullException(nameof(hashFunction));
        }

        /// <inheritdoc/>
        public override bool Equals(T x, T y) => this.equalityFunction(x, y);

        /// <inheritdoc/>
        public override int GetHashCode(T obj) => this.hashFunction(obj);


        /// <summary>
        /// Creates an equality comparer using the provided functions
        /// </summary>
        /// <param name="equalityFunction">The equality function</param>
        /// <param name="hashFunction">The hash function</param>
        /// <returns>The new comparer</returns>
        [SuppressMessage("Design", "CA1000:Do not declare static members on generic types", Justification = "Same pattern as .NET's Comparer<T>.Create(Comparison<T>) Method")]
        public static IEqualityComparer<T> Create(Func<T, T, bool> equalityFunction, Func<T, int> hashFunction) =>
            new InlineableEqualityComparer<T>(equalityFunction, hashFunction);
    }

Usage (with dummy lambdas for the sake of argument):
.Distinct(InlineableEqualityComparer<MetadataRetrievalNode>.Create((x, y) => true, x => 0))

@tannergooding tannergooding modified the milestones: 6.0.0, Future Jul 12, 2021
@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Apr 21, 2022

Came across the need for such a method today while working with Roslyn incremental values. I suspect there might be value in making the getHashCode delegate optional, which would help accelerate scenaria like the following

nodes1.SequenceEqual(nodes2, EqualityComparer<SyntaxNode>.Create((n1, n2) => n1.IsEquivalentTo(n2)));

Calling GetHashCode on a comparer that doesn't specify a getHashCode delegate would throw NotSupportedException. It is fairly common to construct comparers that only require the Equals method.

@tannergooding @terrajobst thoughts? It's fairly pervasive pattern that should eliminate the need for boilerplate classes for most APIs that accept IEqualityComparer<T>. If you agree I can update the OP and mark it ready for review.

@ViIvanov
Copy link
Contributor

Not sure, that NotSupportedException is a part of equality comparer (IEqualityComparer<> or EqualityComparer<>) contract 🤔
Someone can expect NotimplementedException or even some ArgumentException in that case.

Maybe, for your cases it is better to use explicit delegate?

nodes1.SequenceEqual(nodes2, EqualityComparer<SyntaxNode>.Create((n1, n2) => n1.IsEquivalentTo(n2), _ => throw new NotSupportedException()));

@stephentoub stephentoub added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-needs-work API needs work before it is approved, it is NOT ready for implementation labels Apr 26, 2022
@bartonjs
Copy link
Member

bartonjs commented Sep 6, 2022

Video

Looks good as proposed

namespace System.Collections.Generic
{
    public partial class EqualityComparer<T> : IEqualityComparer<T>, IEqualityComparer
    {
        public static EqualityComparer<T> Create(Func<T?, T?, bool> equals, Func<T, int> getHashCode);
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Sep 6, 2022
@stephentoub stephentoub self-assigned this Sep 6, 2022
@stephentoub stephentoub removed this from the Future milestone Sep 6, 2022
@stephentoub stephentoub added this to the 8.0.0 milestone Sep 6, 2022
@stephentoub
Copy link
Member

@bartonjs, in implementing this and using it in various places, I think we should make getHashCode optional. It's common to only need Equals and not need GetHashCode (e.g. passing a comparer to SequenceEquals), and I expect a significant number of the consumers for this API will be tests, in which case it's actually beneficial for GetHashCode to throw if it's not supplied in order to validate GetHashCode isn't being used in situations you don't expect it to be used. Since in the discussion yesterday the room was largely ambivalent about whether to require getHashCode or not and we effectively flipped a coin, I'm going to proceed under the assumption we're ok with it.

public static EqualityComparer<T> Create(Func<T?, T?, bool> equals, Func<T, int>? getHashCode = null);

The only real downside is if GetHashCode was actually needed for a use and someone neglected to supply it because it's optional, but in such cases, they'll likely get an exception almost immediately.

@bartonjs
Copy link
Member

bartonjs commented Sep 7, 2022

I'm not sure that I agree that "null-accepting" should immediately jump to "null-defaulting", but I guess I agree with the logic that the exception is fast-coming.

But, yeah, if anyone had said "oh, it's super common in writing tests to not specify an implementation for GetHashCode" then we'd've slapped on the question mark.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 7, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 11, 2022
@dotnet dotnet locked as resolved and limited conversation to collaborators Oct 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime
Projects
None yet
Development

Successfully merging a pull request may close this issue.