Avoid string allocations in Cookie.GetHashCode #8163
Conversation
hash2 = ((hash2 << 5) + hash2) ^ Domain.GetHashCode(); | ||
|
||
int hash3 = ((hash1 << 5) + hash1) ^ hash2; | ||
return ((hash3 << 5) + hash3) ^ Version; // Version is already an int, no need to call GetHashCode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes also need to be made to the copy of this code here:
https://github.com/dotnet/corefx/blob/master/src/System.Net.Http/src/netcore50/System/Net/cookie.cs#L831
Eventually, there won't be a copy of Cookie.cs but for now there is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be simplified with a helper:
public override int GetHashCode() =>
CombineHashCodes(
CombineHashCodes(
CombineHashCodes(Name.GetHashCode, Value.GetHashCode(),
CombineHashCodes(Path.GetHashCode, Domain.GetHashCode)),
Version));
private static int CombineHashCode(int hash1, int hash2) => ((hash1 << 5) + hash1) ^ hash2;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub True; maybe that function could be moved into a new file in Common
, similar to what I proposed in #8034? I think it'd also be useful in places like ValueTuple
.
cc: @CIPop |
I'll need to check whether this is an acceptable change for CoreFx. |
|
||
[Theory] | ||
[MemberData(nameof(HashCodesTestData))] | ||
public static void HashCodes_ShouldWorkLikeTuple(string name, string value, string path, string domain, int version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, this feels like a weird thing to codify with a test. If we are going to make the change from the old behavior to a new one, why do we have a test that ensures this always behaves like Tuple in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
We should have two tests: one for the common code (there's a common test project) and one that exercises the hashcode for Cookie. The expected hash code can be hard-coded instead of creating the impression that we'll always be compatible with what Tuple does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CIPop Unfortunately, it's not possible to hard-code these values as .NET Core uses randomized string hashing: dotnet/coreclr#229 meaning that "string".GetHashCode()
will be different every time you run the app. (Verified this when I re-ran the tests with hardcoded values.)
Would it be ok to assert that we use the same algorithm as the Hash
helper class I'm going to introduce, e.g.
Assert.Equal(Hash.CombineHashCodes(name, value, path, domain, version), cookie.GetHashCode());
or just leave out the test altogether? (As @ellismg points out, if we change the algorithm in the future these tests will fail.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"string".GetHashCode() will be different every time you run the app.
@jamesqo Thanks for explaining! In that case, I think it's fine to leave it the way it's written now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just delete the test (or modify it to call GetHashCode but ignore the result to ensure it doesn't throw). If we don't want to codify the behavior of GetHashCode, let's not write the test that asserts stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ellismg Ok, sure; there are already tests in there calling GetHashCode already, so I'm going to go ahead and delete it.
OK. I'll wait until you do so before responding to PR feedback. |
return (Name + "=" + Value + ";" + Path + "; " + Domain + "; " + Version).GetHashCode(); | ||
// Copied from Tuple.CombineHashCodes | ||
int hash1 = Name.GetHashCode(); | ||
hash1 = ((hash1 << 5) + hash1) ^ Value.GetHashCode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this has a lot of bitshifts and stuff but I'm not convinced that (33*hash1)^hash2 wouldn't be faster (<<5
is equivalent to multiplying by 32) - you should leave optimizations like that for compiler as it may be platform dependant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jamesqo In terms of app-compat issues regarding changing the implementation of GetHashCode(), it's ok to change it.
However, you should make sure that you have added tests to verify the principles of GetHashCode:
https://msdn.microsoft.com/en-us/library/system.object.gethashcode%28v=vs.110%29.aspx
Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes.
Thanks @jamesqo for the enhancement!
If you already have the code, why not add the test that verifies the execution speed? You can decorate the test with @davidsh If the concern is regarding app-compat, it appears that we've already documented that GetHashCode comes with no guarantees: From https://msdn.microsoft.com/en-us/library/system.object.gethashcode(v=vs.110).aspx :
|
Yes, you're right about that. |
Where do we stand with this PR? There is still feedback to be incorporated:
|
@davidsh You're correct. I apologize since I had forgotten about this PR until now; I'll incorporate the feedback/changes shortly so this can be merged. Thanks. edit: Sorry everyone for the delay, I don't have much time to work on pull requests here right now but I will notify you when I get the chance. Thanks. edit 2: As I don't want to hold any of you up waiting on this, I'm closing this PR temporarily 😄 |
Previously,
Cookie.GetHashCode
implicitly allocated a new string (fromstring.Concat
) every time it was called, in addition to aparams object[]
array and boxing the version. This PR changes its override ofGetHashCode
to match the one used byTuple
(here), and adds tests for the change. (Note that this could be considered a breaking change.)This leads to an about 8x speedup in the method.
cc @stephentoub @davidsh