Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Avoid string allocations in Cookie.GetHashCode #8163

Closed
wants to merge 3 commits into from

Conversation

jamesqo
Copy link
Contributor

@jamesqo jamesqo commented Apr 29, 2016

Previously, Cookie.GetHashCode implicitly allocated a new string (from string.Concat) every time it was called, in addition to a params object[] array and boxing the version. This PR changes its override of GetHashCode to match the one used by Tuple (here), and adds tests for the change. (Note that this could be considered a breaking change.)

This leads to an about 8x speedup in the method.

cc @stephentoub @davidsh

hash2 = ((hash2 << 5) + hash2) ^ Domain.GetHashCode();

int hash3 = ((hash1 << 5) + hash1) ^ hash2;
return ((hash3 << 5) + hash3) ^ Version; // Version is already an int, no need to call GetHashCode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes also need to be made to the copy of this code here:
https://github.com/dotnet/corefx/blob/master/src/System.Net.Http/src/netcore50/System/Net/cookie.cs#L831

Eventually, there won't be a copy of Cookie.cs but for now there is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified with a helper:

public override int GetHashCode() => 
    CombineHashCodes(
        CombineHashCodes(
            CombineHashCodes(Name.GetHashCode, Value.GetHashCode(), 
            CombineHashCodes(Path.GetHashCode, Domain.GetHashCode)), 
        Version));

private static int CombineHashCode(int hash1, int hash2) => ((hash1 << 5) + hash1) ^ hash2;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub True; maybe that function could be moved into a new file in Common, similar to what I proposed in #8034? I think it'd also be useful in places like ValueTuple.

@davidsh
Copy link
Contributor

davidsh commented Apr 29, 2016

cc: @CIPop

@davidsh
Copy link
Contributor

davidsh commented Apr 29, 2016

This PR changes its override of GetHashCode to match the one used by Tuple (here), and adds tests for the change. (Note that this could be considered a breaking change.)

I'll need to check whether this is an acceptable change for CoreFx.


[Theory]
[MemberData(nameof(HashCodesTestData))]
public static void HashCodes_ShouldWorkLikeTuple(string name, string value, string path, string domain, int version)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, this feels like a weird thing to codify with a test. If we are going to make the change from the old behavior to a new one, why do we have a test that ensures this always behaves like Tuple in the future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
We should have two tests: one for the common code (there's a common test project) and one that exercises the hashcode for Cookie. The expected hash code can be hard-coded instead of creating the impression that we'll always be compatible with what Tuple does.

Copy link
Contributor Author

@jamesqo jamesqo May 16, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CIPop Unfortunately, it's not possible to hard-code these values as .NET Core uses randomized string hashing: dotnet/coreclr#229 meaning that "string".GetHashCode() will be different every time you run the app. (Verified this when I re-ran the tests with hardcoded values.)

Would it be ok to assert that we use the same algorithm as the Hash helper class I'm going to introduce, e.g.

Assert.Equal(Hash.CombineHashCodes(name, value, path, domain, version), cookie.GetHashCode());

or just leave out the test altogether? (As @ellismg points out, if we change the algorithm in the future these tests will fail.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"string".GetHashCode() will be different every time you run the app.

@jamesqo Thanks for explaining! In that case, I think it's fine to leave it the way it's written now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just delete the test (or modify it to call GetHashCode but ignore the result to ensure it doesn't throw). If we don't want to codify the behavior of GetHashCode, let's not write the test that asserts stuff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ellismg Ok, sure; there are already tests in there calling GetHashCode already, so I'm going to go ahead and delete it.

@jamesqo
Copy link
Contributor Author

jamesqo commented Apr 30, 2016

@davidsh

I'll need to check whether this is an acceptable change for CoreFx.

OK. I'll wait until you do so before responding to PR feedback.

return (Name + "=" + Value + ";" + Path + "; " + Domain + "; " + Version).GetHashCode();
// Copied from Tuple.CombineHashCodes
int hash1 = Name.GetHashCode();
hash1 = ((hash1 << 5) + hash1) ^ Value.GetHashCode();
Copy link
Member

@krwq krwq May 3, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this has a lot of bitshifts and stuff but I'm not convinced that (33*hash1)^hash2 wouldn't be faster (<<5 is equivalent to multiplying by 32) - you should leave optimizations like that for compiler as it may be platform dependant

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jamesqo In terms of app-compat issues regarding changing the implementation of GetHashCode(), it's ok to change it.

However, you should make sure that you have added tests to verify the principles of GetHashCode:

https://msdn.microsoft.com/en-us/library/system.object.gethashcode%28v=vs.110%29.aspx

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.

@CIPop
Copy link
Member

CIPop commented May 3, 2016

Thanks @jamesqo for the enhancement!

This leads to an about 8x speedup in the method.

If you already have the code, why not add the test that verifies the execution speed? You can decorate the test with Trait("Perf", "true")].

@davidsh If the concern is regarding app-compat, it appears that we've already documented that GetHashCode comes with no guarantees:

From https://msdn.microsoft.com/en-us/library/system.object.gethashcode(v=vs.110).aspx :

You should never persist or use a hash code outside the application domain in which it was created, because the same object may hash across application domains, processes, and platforms.

@davidsh
Copy link
Contributor

davidsh commented May 5, 2016

@davidsh If the concern is regarding app-compat, it appears that we've already documented that GetHashCode comes with no guarantees:

Yes, you're right about that.

@davidsh
Copy link
Contributor

davidsh commented May 16, 2016

@jamesqo

Where do we stand with this PR?

There is still feedback to be incorporated:

We should have two tests: one for the common code (there's a common test project) and one that exercises the hashcode for Cookie. The expected hash code can be hard-coded instead of creating the impression that we'll always be compatible with what Tuple does.

@jamesqo
Copy link
Contributor Author

jamesqo commented May 16, 2016

@davidsh You're correct. I apologize since I had forgotten about this PR until now; I'll incorporate the feedback/changes shortly so this can be merged. Thanks.

edit: Sorry everyone for the delay, I don't have much time to work on pull requests here right now but I will notify you when I get the chance. Thanks.

edit 2: As I don't want to hold any of you up waiting on this, I'm closing this PR temporarily 😄

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
8 participants