Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HashCode based on xxHash32 #14863

Merged
merged 10 commits into from Nov 16, 2017

Conversation

@jcdickinson
Copy link
Contributor

commented Nov 4, 2017

Migrated Pull Request

See the original PR for full discussion on the code.

Fixes "Proposal: Add System.HashCode to make it easier to generate good hash codes." in corefx.

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Tests against known xxHash32 vectors are provided in the original pull request.

Performance

BenchmarkDotNet=v0.10.9, OS=Windows 10 Redstone 2 (10.0.15063)
Processor=Intel Core i7-4800MQ CPU 2.70GHz (Haswell), ProcessorCount=8
Frequency=2630627 Hz, Resolution=380.1375 ns, Timer=TSC
.NET Core SDK=2.0.2
  [Host]     : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT

Method Mean Error StdDev Scaled
Unsafe 79.98 ns 1.4723 ns 1.3772 ns 1.00
'Calls, No Inlining' 40.34 ns 0.1778 ns 0.1576 ns 0.50
'Unrolled, No Inlining' 12.19 ns 0.0433 ns 0.0384 ns 0.15
'Calls, Inlining' 34.55 ns 0.1372 ns 0.1216 ns 0.43
'Unrolled, Inlining' 12.22 ns 0.0885 ns 0.0785 ns 0.15

Deviations from xxHash32

  • Length is stored as the number of hashed fields (not bytes). It is multiplied at the end by 4 so that it behaves exactly like xxHash32.
  • The structure won't accept more than uint.MaxValue values, failing with an OverflowException. This is to prevent the possibility of it re-initializing.
  • The code that mixes in individual bytes is omitted - this struct only accepts int.

3rd Party Code

Submission containing materials of a third party:

Author Project License Comments
Cyan4973 xxHash BSD Reference, not copied code

cc: @morganbr

HashCode based on xxHash32
Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: dotnet/corefx#25013

public struct HashCode
{
private static readonly uint s_seed = GenerateGlobalSeed();

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 4, 2017

Author Contributor

NB

private uint _queue1, _queue2, _queue3;
private uint _length;

private static unsafe uint GenerateGlobalSeed()

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 4, 2017

Author Contributor

NB

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

@benaadams In the future, maybe you would be interested in making sure the JIT makes this a constant?

return (int)hash;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 4, 2017

Member

AggressiveInlining on methods that are this big does not sound right. It will lead to pretty significant code bloat that costs performance as well.

@AndyAyersMS What do you think?

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 4, 2017

Author Contributor

To help make a decision here are some scientific results (LONG). Micro-benchmarks indicate that there is no good reason to inline these methods.

This comment has been minimized.

Copy link
@AndyAyersMS

AndyAyersMS Nov 4, 2017

Member

Agree that force inlining these is not worth it -- there's not much per call site specialization that can happen and the computations have long dependence chains.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

@jkotas Are you talking about Rol? It seems like a pretty small method to me.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 6, 2017

Author Contributor

It wasn't being inlined when I tested it on an old runtime - I need to re-test this specific method as it does contain a JIT intrinsic.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

@jkotas @jamesqo incredible; Rol is not inlined.

INLINER: during 'fgInline' result 'failed this call site' reason 'unprofitable inline' for 'System.HashCode:Test_1():int' calling 'System.HashCode:Rol(int,int):int'

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 7, 2017

Member

@jkotas Are you talking about Rol?

I was talking about Combine ... I think that the commented shifted.

Rol is not inlined.

Marking Rol with AggressiveInlining should be fine. You may want to check that the JIT recognizes the pattern and actually produces Rol instruction for it.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

Yep, it does.

This comment has been minimized.

Copy link
@AndyAyersMS

AndyAyersMS Nov 7, 2017

Member

The method is just a bit too big to fit into the jit's "always inline" category, and the inliner can't easily anticipate that the IL sequence in this method is going to reduce to something that nicely fits a machine instruction. So the jit falls back to its profitability heuristics and doesn't see much else of merit, and so it decides not to inline.

The jit's inliner is less aggressive than inliners in C/C++ compilers, as it needs to balance jit time and code size growth vs optimization potential. Heuristics like those the inliner uses are often mistaken on specific cases; the hope is that they are generally decent on average.

Currently there's no easy automatic way for the jit to know that a particular inline is more important than others. Hence the aggressive inlining attribute exists, though best used sparingly.

@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 4, 2017

@jkotas after quite a bit of investigation about whether or not to inline the public-facing methods I've found that they shouldn't be. Changes are staged and ready to hit this PR assuming that you agree.

This has lead me down a rabit hole of a missed value range optimization opportunity in RyuJIT (the gist in the review comment). I'm honestly not sure if I should log an item for RyuJIT - thoughts?

Pasted rant to save you a click:


I've noticed that jumps are still present in the Add() code. They shouldn't be: there is enough information to eliminate them entirely. These results might look very different if RyuJIT is improved - possibly exactly like the manually optimized Combine() code. Changing _length & 0x3 to _length % 3 doesn't change the code (so it's not bit twiddling sorcery tripping up the jitter).

This is the offending C#:

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static int Test_8()
        {
            var hc = new HashCode();
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            hc.Add(_test);
            return hc.ToHashCode();
        }

If hc were a field on a heap object I'd totally understand why no optimization can occur (another thread might access it); however, in this case it is absolutely known that all the code exists 100% in the stack (apart from a static readonly read). At the very least, the jumps could be eliminated. I'm guessing some constraint must be loosened?

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Add(int value)
{
// Note that x & 0x3 is like mod 3, but faster.

This comment has been minimized.

Copy link
@jakobbotsch

jakobbotsch Nov 4, 2017

Member

You probably meant "mod 4" here?

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 4, 2017

Author Contributor

Good catch.

@mikedn

This comment has been minimized.

Copy link
Contributor

commented Nov 4, 2017

If hc were a field on a heap object I'd totally understand why no optimization can occur (another thread might access it); however, in this case it is absolutely known that all the code exists 100% in the stack

Yeah but ToHashCode is not inlined and that means that the address of the local variable hc is taken. This tends to block optimizations.

@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 4, 2017

Yeah but ToHashCode is not inlined and that means that the address of the local variable hc is taken. This tends to block optimizations.

The address of hc is on the stack, though, it's a value type. I tested that theory and the size did come down, but only to 1789b (from 1814b). Here's my test bench. I might just open an issue anyway, it can always be closed - I'm now worried about flooding this PR with off-topic discussion.

@jkotas

This comment has been minimized.

Copy link
Member

commented Nov 4, 2017

Changes are staged and ready to hit this PR assuming that you agree.

Yes, I agree. Thanks!

PR Feedback
* Removing AgressiveInlining from all public facing methods.
* Fixing erroneous comment
@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 4, 2017

Ubuntu x64 Innerloop Formatting

Present in the previous build.

return result;
}

public static int Combine<T1>(T1 value1)

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 5, 2017

Contributor

/cc @terrajobst. I thought the only work this method should do was

=> value1?.GetHashCode() ?? 0;

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 5, 2017

Author Contributor

In the proposal:

we might also want public static int Combine(T1 value);. I know it looks a little funny, but it would provide a way of diffusing bits from something with a limited input hash space. For example, many enums only have a few possible hashes, only using the bottom few bits of the code. Some collections are built on the assumption that hashes are spread over a larger space, so diffusing the bits may help the collection work more efficiently.

Added a comment to explain the intent.


[Obsolete("Use ToHashCode to retrieve the computed hash code.", error: true)]
[EditorBrowsable(EditorBrowsableState.Never)]
[MethodImpl(MethodImplOptions.NoInlining)]

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 5, 2017

Contributor

Why this attribute? It can't be called anyway, so not necessary.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 5, 2017

Author Contributor

You can indirectly HashSet<HashCode>() but I will remove it regardless.

{
if (comparer is null)
comparer = EqualityComparer<T>.Default;
Add(comparer.GetHashCode(value));

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 5, 2017

Contributor

For the default comparer, it'd be faster to:

if (comparer is null)
{
    Add(value);
}
else
{
    Add(comparer.GetHashCode(value));
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Add(int value)
{
// Note that x & 0x3 is like mod 4, but faster.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 5, 2017

Contributor

The JIT should optimize % 4 to & 3. Change it to % 4 for better readability.

{
v1 = s_seed + Prime1 + Prime2;
v2 = s_seed + Prime2;
v3 = s_seed + 0;

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 5, 2017

Contributor

Nit: Why not just s_seed?

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 5, 2017

Author Contributor

The original xxHash32 code did this, not sure if it's a form of self-documentation. Will remove.

PR Feedback
* Adding comment explaining why Combine(value1) performs a hash.
* Removing superfluous code and attributes.
* Improving perf of Add(..., null).
* Moving two lines of code below the giant comment to improve reading locality.
* Use mod instead of bit twiddling.
private static unsafe uint GenerateGlobalSeed()
{
uint result;
Interop.GetRandomBytes((byte*)&result, sizeof(int));

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Nit: sizeof(uint) to better signal intent

return (int)hash;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

@jkotas Are you talking about Rol? It seems like a pretty small method to me.

}

// Throw for more than uint.MaxValue fields.
_length = checked(_length + 1);

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

I don't think this is necessary. It's untestable without spending a lot of CPU time, will never come up in a real scenario, and is adding costs to a hot path (if Add is used instead of the Combine methods, it is probably being called in a loop). Do you have a good reason for adding this?

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 6, 2017

Author Contributor

I don't think it's super necessary either, if you are hashing uint.MaxValue fields you probably have far bigger DOS issues than HashCode re-initializing state. The strange (_length ^ position) == 0 check came from the original code, which was more robust without leaning on checked:

(_v1 | _v2 | _v3 | _v4 | (_length ^ position) == 0)

It further decreases the chances of re-initialization to probably some crazy probability (_v1 | (_position... would probably be good enough). The comment about the bloom filter is also erroneous for this reason (and should be updated).

Should I restore the original bloom filter, restore the shorter option or simply drop the checked?

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

You should drop the checked. I am not familiar with the details of the hashing algorithm, so I will leave the rest to your discretion. However, if you decide not to include the _v1 | _v2 | _v3 | _v4 then it should just be _length == position, not (_length ^ position) == 0.

This comment has been minimized.

Copy link
@morganbr

morganbr Nov 7, 2017

Contributor

If checked is taking a meaningful amount of time, you could probably just try

if(_length == Uint32.Max) 
{
    throw new InvalidOperationException(); // Or some other appropriate exception
}

Hopefully branch prediction can make it incredibly cheap since it'll never get taken.

I'm not specifically concerned about a DoS, rather just that this could start yielding incorrect results at a certain point. We shouldn't silently start producing incorrect data since it's incredibly difficult to debug or recover from.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

@morganbr if we're throwing an exception, I assume we should expose the length so that users don't have try/catch? If CLI is still a thing, it will have to be long.

Edit: throw behaves exactly like checked, trying some other tricks.

Edit: Amazing. If I put the check into it's own method, it does get removed. It doesn't matter; Add (which is the only method that invoked Add(int)) isn't inlined anyway. This is becoming an exercise in academia.

This comment has been minimized.

Copy link
@morganbr

morganbr Nov 7, 2017

Contributor

I don't have strong feelings either way on exposing it. I also don't know the exact CLS rules, but yeah we probably would have to make it an int or a long so that all languages can reference it. If it's a long, I don't mind skipping an overflow check since 2^63 bytes is impossibly large, but I don't know how much that would slow down this code.

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 7, 2017

Member

I do not think we should be throwing here.

I think this should uint, and we should let it overflow silently.

// 2. Accumulate remaining blocks of length 4 (1 uint) into the hash.
// 3. Accumulate remaining blocks of length 1 into the hash.

// There is no need for *3 as this type only accepts ints. _queue1,

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Nit: I was confused for a second by *3. Maybe you should say #3? Same for other occurrences.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Add(int value)
{
// xxHash works as follows:

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

It would be clearer to say "The original xxHash works as follows" here. Otherwise, the reader might be confused when you bring #3 up and then say "we don't do #3."

// If the length is less than 3, _v1 -> _v4 don't contain
// anything yet. xxHash32 treats this differently.

uint hash = (_length ^ position) == 0

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Why not just _length ^ position?

else // == 3
{
// length smaller than 4?
if ((_length ^ position) == 0)

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Why not just _length == position?

? MixEmptyState()
: MixState(_v1, _v2, _v3, _v4);

// Multiply by 4 because we've been counting in ints, not

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Nit: I read that first as "we're counting in ints, not bytes". I know I misread, but it would be clearer to say "we're counting in bytes, not ints."

hash += _length * 4;

// Mix what remains in the queue
// Switch can't be inlined right now, so use as few

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

If switch can't be inlined, have you tried using goto? Messier, but it would provide the same fallthrough behavior without branches.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

No significant perf improvement, but the machine code is shorter. I'll be using this one.

return (int)hash;
}

# pragma warning disable 0809

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 6, 2017

Contributor

Nit: unindent the pragma per .NET Core coding conventions. (same for below pragma)


[Obsolete("Use ToHashCode to retrieve the computed hash code.", error: true)]
[EditorBrowsable(EditorBrowsableState.Never)]
public override int GetHashCode() => throw new NotSupportedException();

This comment has been minimized.

Copy link
@justinvp

justinvp Nov 6, 2017

Contributor

It might be helpful to include a message with the NotSupportedException similar to (or the same as) the obsolete message, added to https://github.com/dotnet/coreclr/blob/master/src/mscorlib/Resources/Strings.resx

PR Feedback
* Improving some comments. Ensuring comments wrap at column 80.
* Fixing GenerateGlobalSeed().
* Reading _length once in methods.
* Ignoring overflow.
* Simplifying some conditions.
* Using goto for fallthrough in ToHashCode().
* Adding Equals as a disallowed method.
* Adding exception message string.

var val = (uint)value;
uint position = _length % 4;
uint previousLength = _length++;

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

Storing this locally shaves off many bytes in the final code.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

If that's important, then please add a comment about it.

private uint _queue1, _queue2, _queue3;
private uint _length;

private static unsafe uint GenerateGlobalSeed()

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

@benaadams In the future, maybe you would be interested in making sure the JIT makes this a constant?

private static uint Rol(uint value, int count)
=> (value << count) | (value >> (32 - count));

[MethodImpl(MethodImplOptions.AggressiveInlining)]

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

@benaadams Here too, as s_seed is a static readonly and the Prime* values are constants, so this should just JIT to assigning a bunch of constants.

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

@jamesqo @benaadams the problem is that this is possible, even with readonly fields:

typeof(HashCode).GetField("s_seed", ...).SetValue(null, 1234)

It would be a breaking change; SetValue would have to be updated to throw for readonly fields.

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 7, 2017

Member

@jcdickinson Setting readonly field using reflection after it has been initialized has undefined behavior. The JITed code may or may not reflect the updated value.

This comment has been minimized.

Copy link
@mikedn

mikedn Nov 7, 2017

Contributor

This whole thing with s_seed being static readonly was mentioned in the original PR. It's not really relevant because this code ends up being crossgened and thus those values are not actually compile time constants.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 8, 2017

Contributor

@mikedn Oh I forgot that this code is crossgened, that's a shame. @jkotas What happened to the BypassNGenAttribute that you mentioned earlier, in my PR to Memmove? It seems like it would be useful here.


private static uint MixEmptyState()
{
return s_seed + Prime5;

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

@benaadams Here too


var val = (uint)value;
uint position = _length % 4;
uint previousLength = _length++;

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

If that's important, then please add a comment about it.


public static int Combine<T1>(T1 value1)
{
// Provide a way of diffusing bits from something with a limited

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

Nit: double space a limited

// implementation has to change in the future we don't want to worry
// about people who might have incorrectly used this type.

[Obsolete("HashCode is a mutable struct and should not be compared with other HashCodes.", error: true)]

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

The full error message should be "HashCode is a mutable struct and should not be compared with other HashCodes. Use ToHashCode to retrieve the computed hash code."

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

For the exception message too? Or just the obsolete message?

This comment has been minimized.

Copy link
@justinvp

justinvp Nov 7, 2017

Contributor

I think some developers are going to mistakenly try to call GetHashCode instead of ToHashCode. It'd be worth including "Use ToHashCode to retrieve the computed hash code." in both the obsolete and exception message for GetHashCode as a hint of what to do.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

@jcdickinson I would suggest making leaving off the Use ToHashCode ... for the Equals obsolete message and its exception message.


if (position == 0) goto mixFinal;

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

This has the same effect as the nested-ifs you had earlier, actually. I had something different in mind for goto, but now I don't think it will let us get the same perf as if we could use switch. You can revert this part to how it was before.

Have you tried marking ToHashCode with AggressiveInlining and using switch? Does switch generate more code than ifs?

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 7, 2017

Author Contributor

According to my tests, the JIT will never inline a method with switch, regardless of AggressiveInlining.

This comment has been minimized.

Copy link
@AndyAyersMS

AndyAyersMS Nov 7, 2017

Member

Yes, this is a limitation in the current inliner.


uint hash = (_length ^ position) == 0

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

Nit: uint hash = length < 4 ? MixEmptyState() : MixState(_v1, _v2, _v3, _v4)

PR Feedback
* Adding further detail to obsolete and exception message.
 * Fix spacing issues.
* Restore nested if for ToHashCode.
@@ -395,14 +396,17 @@ public int ToHashCode()
// position == 1 means that _queue1 is populated; _queue2 would have
// been populated on the next call to Add.

if (position == 0) goto mixFinal;

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 7, 2017

Contributor

Now you can get rid of the goto comment you put in earlier

PR Feedback
* Correcting goto comment.
* Using different strings for errors.
* Re-ordering a comment so that it makes more sense.
* Fixing comment that was still difficult to understand.
@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 9, 2017

Perf Build and Test

Tizen armel Cross Checked Innerloop Build and Test

Anything to do with me?

@4creators

This comment has been minimized.

Copy link
Collaborator

commented Nov 9, 2017

@dotnet-bot test Tizen armel Cross Checked Innerloop Build and Test

@4creators

This comment has been minimized.

Copy link
Collaborator

commented Nov 9, 2017

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test

@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 12, 2017

Anything else you need from me, or are we waiting for an internal review?

@morganbr

This comment has been minimized.

Copy link
Contributor

commented Nov 14, 2017

@dotnet-bot test Perf Build and Test please

@morganbr

This comment has been minimized.

Copy link
Contributor

commented Nov 14, 2017

@jcdickinson , I've reviewed it and and it looks good. As long as the Perf run comes back clean, I'll sign off. @jkotas, any more comments?

}
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 14, 2017

Member

This is pretty big method - in particular with all the force-inlining on the smaller methods that it calls. ForceInlining is very likely going to hurt performance of real workloads.

}

public void Add<T>(T value, IEqualityComparer<T> comparer)
{

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 14, 2017

Member

It may be better to manually inline the other Add here - Add((comparer != null) ? comparer.GetHashCode(value) : (value?.GetHashCode() ?? 0)) - to save on number of generic method instantiations that have to be created.

This comment has been minimized.

Copy link
@jamesqo

jamesqo Nov 14, 2017

Contributor

Maybe add a comment to this effect?

This comment has been minimized.

Copy link
@jkotas

jkotas Nov 14, 2017

Member

Nit: Having the single call to Add like I have suggested would result into a bit smaller/faster code because of the call will be compiled just one.

(In theory, the JIT can merge the two calls into one - but such optimization is not done today.)

This comment has been minimized.

Copy link
@jcdickinson

jcdickinson Nov 15, 2017

Author Contributor

Weird, I'd expect the CIL to be the same, but I supposed csc would need to use IR for that to happen. TIL.

PR Feedback
* Removing AggressiveInlining on big method.
* Reducing number of generic instantiations for Add<T>.
@jkotas
jkotas approved these changes Nov 14, 2017
jcdickinson added 2 commits Nov 14, 2017
PR Feedback
* Comment.
@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2017

@dotnet-bot test Ubuntu armlb Cross Debug Innerloop Build please

PR Feedback
Slightly optimize Add<T>(,).
@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2017

@dotnet-bot test Ubuntu16.04 armlb Cross Debug Innerloop Build please

@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2017

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test please

@morganbr
Copy link
Contributor

left a comment

Thanks, @jcdickinson! This is a big improvement and you did a great job of handling all of the performance questions. I'm really looking forward to starting adoption.

@morganbr morganbr merged commit 55f6bec into dotnet:master Nov 16, 2017

16 checks passed

CROSS Check Build finished.
Details
CentOS7.1 x64 Checked Innerloop Build and Test Build finished.
Details
CentOS7.1 x64 Debug Innerloop Build Build finished.
Details
OSX10.12 x64 Checked Innerloop Build and Test Build finished.
Details
Perf Build and Test Build finished.
Details
Tizen armel Cross Checked Innerloop Build and Test Build finished.
Details
Ubuntu arm64 Cross Debug Innerloop Build Build finished.
Details
Ubuntu armlb Cross Debug Innerloop Build Build finished.
Details
Ubuntu x64 Checked Innerloop Build and Test Build finished.
Details
Ubuntu x64 Innerloop Formatting Build finished.
Details
Ubuntu16.04 armlb Cross Debug Innerloop Build Build finished.
Details
WIP ready for review
Details
Windows_NT x64 Checked Innerloop Build and Test Build finished.
Details
Windows_NT x64 Innerloop Formatting Build finished.
Details
Windows_NT x86 Checked Innerloop Build and Test Build finished.
Details
license/cla All CLA requirements met.
Details
dotnet-bot added a commit to dotnet/corert that referenced this pull request Nov 16, 2017
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: dotnet/corefx#25013

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
jkotas added a commit to dotnet/corert that referenced this pull request Nov 16, 2017
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: dotnet/corefx#25013

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@jamesqo

This comment has been minimized.

Copy link
Contributor

commented Nov 16, 2017

🎉 🎉 🎉 It's been a year since this was proposed. So satisfying to finally see this merged!

@jcdickinson

This comment has been minimized.

Copy link
Contributor Author

commented Nov 17, 2017

Pleasure! Looking forward to my next PR, learned a lot from this one.

A-And added a commit to A-And/corert that referenced this pull request Nov 20, 2017
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: dotnet/corefx#25013

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
dotnet-bot added a commit to dotnet/corefx that referenced this pull request Jan 13, 2018
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: #25013

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
dotnet-bot added a commit to dotnet/corefx that referenced this pull request Jan 13, 2018
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: #25013

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
safern added a commit to dotnet/corefx that referenced this pull request Jan 16, 2018
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: #25013

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
safern added a commit to dotnet/corefx that referenced this pull request Jan 16, 2018
HashCode based on xxHash32 (dotnet/coreclr#14863)
* HashCode based on xxHash32

Works by maintaining the xxHash32 state variables (v1 -> v4, length) as well as a queue of values that fall outside of the block size (16 bytes/4 ints). The seed is initialized to random bytes.

Further details, unit tests and history: #25013

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.