Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash code of string is broken in .NET Core 2.1, but works in 2.0 #27778

Closed
karelz opened this issue Oct 31, 2018 · 6 comments
Closed

Hash code of string is broken in .NET Core 2.1, but works in 2.0 #27778

karelz opened this issue Oct 31, 2018 · 6 comments
Labels
area-System.Globalization bug tenet-compatibility Incompatibility with previous versions or .NET Framework
Milestone

Comments

@karelz
Copy link
Member

karelz commented Oct 31, 2018

From @mwikstrom on October 30, 2018 5:23

I recently upgraded one of my projects from .NET Core 2.0 to .NET Core 2.1. After doing so several of my tests started to fail.

After narrowing this down I've found that in .NET Core 2.1 it is not possible to compute the hash code of a string using a culture aware comparer with the string sort compare option.

I've created a test that reproduce my problem:

    [TestMethod]
    public void Can_compute_hash_code_using_invariant_string_sort_comparer()
    {
        var compareInfo = CultureInfo.InvariantCulture.CompareInfo;
        var stringComparer = compareInfo.GetStringComparer(CompareOptions.StringSort);
        stringComparer.GetHashCode("test"); // should not throw!
    }

I've tested it on a couple of frameworks with the following results:

  • .NET Core 2.0: ✔ PASS
  • .NET Core 2.1: ✖ FAIL
  • .NET Framework 4.7: ✖ FAIL
  • .NET Framework 4.6.2: ✖ FAIL

When failing an ArgumentException is thrown from CompareInfo.GetHashCodeOfString saying:

Value of flags is invalid

Now, to my questions:

  1. Why is it not allowed to use CompareOptions.StringSort when computing a hash code?

  2. Why was it allowed in .NET Core 2.0?`

As far as I understand CompareOptions.StringSort only affects the relative sort order of strings and should not affect hash code computation. MSDN says:

StringSort Indicates that the string comparison must use the string sort algorithm. In a string sort, the hyphen and the apostrophe, as well as other nonalphanumeric symbols, come before alphanumeric characters.

Copied from original issue: dotnet/core#2030

@karelz
Copy link
Member Author

karelz commented Oct 31, 2018

@karelz
Copy link
Member Author

karelz commented Oct 31, 2018

@tarekgh is that in your area of expertise?

@tarekgh
Copy link
Member

tarekgh commented Oct 31, 2018

Yes, this looks a regression from @atsushikan change in the commit: dotnet/corert@c6c0125#diff-63aa554c73fc39ae34377cf46321a8b2R208

The original code used to do the following: https://github.com/dotnet/corefx/blob/release/2.0.0/src/System.Globalization.Extensions/src/System/Globalization/Extensions.cs#L80

@mwikstrom

Thanks for reporting that. Considering this is the behavior on .NET Full framework from version 4.6, it'll be hard to change this behavior there and I am inclining not changing it on .NET Core 2.1 either for the sake of consistency with the full framework.
How important this issue to you especially working around it should be trivial?

@mwikstrom
Copy link

mwikstrom commented Oct 31, 2018

The workaround I've come up with is to use the following class:

internal sealed class CultureAwareStringSortComparer : StringComparer
{
    public CultureAwareStringSortComparer(CompareInfo compareInfo)
    {
        this.SortComparer = compareInfo.GetStringComparer(CompareOptions.StringSort);
        this.HashCodeComparer = compareInfo.GetStringComparer(CompareOptions.None);
    }

    internal StringComparer SortComparer { get; }

    internal StringComparer HashCodeComparer { get; }

    public override int Compare(string x, string y) => this.SortComparer.Compare(x, y);

    public override bool Equals(string x, string y) => this.HashCodeComparer.Equals(x, y);

    public override int GetHashCode(string obj) => this.HashCodeComparer.GetHashCode(obj);
}

The idea is to not use the StringSort option for GetHashCode.

All my tests are passing using this workaround. However I'm concerned with how the Equals method should be implemented.

As you can see I've chosen to use the hash code comparer for that. My reasoning behind that is that code that rely on hash codes also rely on equality, while code that rely on sort comparison should rely on a zero return value instead of equals.

I don't know how the implementation of Equals would differ whether I use StringSort or not. Maybe there is no difference? Then this workaround would be just fine.

Maybe you (@tarekgh) could ease my mind on this?

@tarekgh
Copy link
Member

tarekgh commented Oct 31, 2018

You should implement Equals as follow:

public override bool Equals(string x, string y) => this.SortComparer.Equals(x, y);

StringSort can affect the result of the string comparisons. other than that it looks OK to me.

also you may write it in more generic way like

    internal sealed class CultureAwareStringSortComparer : StringComparer
    {
        public CultureAwareStringSortComparer(CompareInfo compareInfo, CompareOptions options)
        {
            this.SortComparer = compareInfo.GetStringComparer(options);
            this.HashCodeComparer = compareInfo.GetStringComparer(options & ~CompareOptions.StringSort);
        }

        internal StringComparer SortComparer { get; }

        internal StringComparer HashCodeComparer { get; }

        public override int Compare(string x, string y) => this.SortComparer.Compare(x, y);

        public override bool Equals(string x, string y) => this.SortComparer.Equals(x, y);

        public override int GetHashCode(string obj) => this.HashCodeComparer.GetHashCode(obj);
    }

I am closing this issue but feel free to reply back with any question.

@tarekgh tarekgh closed this as completed Oct 31, 2018
@mwikstrom
Copy link

Excellent. Thank you!

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Globalization bug tenet-compatibility Incompatibility with previous versions or .NET Framework
Projects
None yet
Development

No branches or pull requests

4 participants