[Runtime] Improve the metadata hashing function.
The inputs to the hash function are pointers that have a predictable pattern. The
hashes that we were generating and using for the metadata caches were not very
good, and as a result we generated very deep search trees. Two small changes
improved the quality of the hash function: better utilization of the 'length'
field and an extra bit-rotate round.

I am going to attach two pictures to the GitHub commit. The first picture shows
the binary with the old hash: the tree is very deep and sparse. The second
picture shows the new hash function: the tree is very wide and uniform. I used
the benchmark 'TypeFlood' in debug mode to generate huge amounts of metadata.
nadavrot committed Feb 10, 2016
1 parent 866b44c commit 4227645
Showing 1 changed file with 3 additions and 2 deletions.
stdlib/public/runtime/MetadataCache.h (3 additions, 2 deletions)
```diff
@@ -66,12 +66,13 @@ class EntryRef {
   }
 
   size_t hash() {
-    size_t H = 0x56ba80d1 ^ length;
+    size_t H = 0x56ba80d1 * length;
     for (unsigned i = 0; i < length; i++) {
       H = (H >> 10) | (H << ((sizeof(size_t) * 8) - 10));
       H ^= ((size_t)args[i]) ^ ((size_t)args[i] >> 19);
     }
-    return H * 0x27d4eb2d;
+    H *= 0x27d4eb2d;
+    return (H >> 10) | (H << ((sizeof(size_t) * 8) - 10));
   }
 
   const void * const *begin() const { return args; }
```
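For reference, here is a self-contained sketch of the hash after this change, reconstructed from the diff above. The free function and the small driver are hypothetical stand-ins; in the runtime this is a method on EntryRef that reads its own args and length members.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical standalone version of EntryRef::hash() after this commit.
static size_t hashPointers(const void *const *args, unsigned length) {
  // Multiplying by the length (instead of XOR-ing it in) spreads the
  // length's entropy across more bits of the initial state.
  size_t H = 0x56ba80d1 * (size_t)length;
  for (unsigned i = 0; i < length; i++) {
    // Rotate right by 10 so successive inputs land on fresh bit positions.
    H = (H >> 10) | (H << ((sizeof(size_t) * 8) - 10));
    // Fold each pointer in twice, once shifted, to break up the
    // predictable low-bit patterns of aligned pointers.
    H ^= ((size_t)args[i]) ^ ((size_t)args[i] >> 19);
  }
  // The extra multiply-and-rotate round pushes entropy into the high
  // bits, which is what flattens the search tree.
  H *= 0x27d4eb2d;
  return (H >> 10) | (H << ((sizeof(size_t) * 8) - 10));
}

int main() {
  int a = 0, b = 0;
  const void *args[] = {&a, &b};
  printf("%zx\n", hashPointers(args, 2));
  return 0;
}
```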

15 comments on commit 4227645

@nadavrot
Contributor Author

The two pictures below show the shape of the internal data structure used by the metadata caches.

Before: [image: deep, sparse search tree with the old hash]

After: [image: wide, uniform search tree with the new hash]

@dduan
Contributor

@dduan dduan commented on 4227645 Feb 10, 2016

Sweeeeeet

@kostiakoval

Amazing, it would be interesting to see the speed improvements related to this change.

@jckarter
Contributor

Awesome!

@jckarter
Contributor

ER: playgrounds should display the bucket graph for Dictionarys

@riking

@riking riking commented on 4227645 Feb 11, 2016

@nadavrot Can you thumbnail those images? My browser struggles on the image decode when loading this page...

@Cyan4973

You might also consider replacing the hash function entirely with one of the proven ones that have excellent dispersion and speed. See for example this excellent article, which does a great job of comparing several hash algorithms for their speed and dispersion.

Better dispersion == smaller tree depth.

A Primer on Repeatable Random Numbers, by Rune Skovbo Johansen
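A quick way to see the dispersion/depth connection: a hypothetical probe that feeds pointer-like inputs through a hash and counts how many distinct top bytes come out. The weakHash below is a deliberately bad identity hash used only for contrast; a well-dispersed hash should touch close to min(N, 256) distinct values.

```cpp
#include <cstddef>
#include <cstdio>
#include <set>

// Deliberately bad hash (identity) to illustrate poor dispersion.
static size_t weakHash(size_t p) { return p; }

int main() {
  std::set<size_t> topBytes;
  for (size_t i = 0; i < 1000; i++) {
    size_t p = 0x10000000 + i * 16;  // synthetic aligned-pointer pattern
    topBytes.insert(weakHash(p) >> (sizeof(size_t) * 8 - 8));
  }
  // The identity hash leaves the top byte constant: one bucket for all
  // 1000 keys, which is exactly what produces deep, sparse trees.
  printf("distinct top bytes: %zu\n", topBytes.size());
  return 0;
}
```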

@gribozavr
Contributor

It would be even better to not roll our own, and just use LLVM's Hashing.h or std::hash (if all of our platforms are new enough). In fact, the issue of bad balancing in the tree was caused by us writing our own hash function in the first place.
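For concreteness, a minimal sketch of the std::hash route, with the usual hash_combine-style mixing step; the function name and mixing constant are illustrative, not the runtime's actual code:

```cpp
#include <cstddef>
#include <functional>

// Hypothetical: hash each pointer with std::hash and fold the results
// together, boost::hash_combine style.
static size_t hashWithStdHash(const void *const *args, unsigned length) {
  size_t H = std::hash<unsigned>{}(length);
  for (unsigned i = 0; i < length; i++)
    H ^= std::hash<const void *>{}(args[i]) + 0x9e3779b9 + (H << 6) + (H >> 2);
  return H;
}
```

One caveat: some standard libraries implement std::hash for pointers as the identity function, so this alone would not necessarily fix the dispersion problem.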

@nadavrot
Contributor Author

@gribozavr Using LLVM's hash is not a very good idea. We hash our metadata pointers very frequently, and LLVM's hash is way too slow. Most Swift programs have a small number of generic types at runtime, which means that we have a small number of entries in the tree, but we calculate the hash and perform the lookup very frequently. This means that a low-quality hash that runs very fast outperforms a slow hash that does a very good job. Another problem is that the number of inputs to our hash function is typically very small (1 to 3 pointers), and we need as much entropy as possible in the high bits, even for small inputs. A non-specialized hash, like LLVM's, can only do this at the cost of a high runtime (by doing lots of operations that blend the bits, such as multiplication and rotation). Another requirement is support for both 32-bit and 64-bit targets. We need the hash to be fast and effective on 32-bit platforms, and LLVM's is 64-bit only.

@gribozavr
Contributor

@nadavrot OK, here's another hash with known good properties:

https://131002.net/siphash/

The performance is discussed in https://131002.net/siphash/siphash.pdf on page 13.

@Cyan4973

SipHash has very poor performance in 32-bit mode unfortunately, which doesn't meet @nadavrot's requirement.

Most modern fast hashes are tailored for 64-bit CPUs, since they run faster in that mode. That leaves only a handful which remain fast on 32-bit targets.

This benchmark was specifically run on 32-bit.

Now it also seems that another requirement is to work well on short keys, which isn't uncommon.
That could also be tested with a benchmark.
The SMHasher test set for example includes a benchmark module for small keys.

edit: @nadavrot: sorry, crossed posts by a few seconds ...

@nadavrot
Contributor Author

@gribozavr SipHash is a bad candidate for two reasons. First, it is slow (optimistically 15 cycles per byte), which is an order of magnitude slower than what we use today. And second, it requires at least 16 bytes of input, which we don't always have.

@gribozavr
Contributor

We could use different functions for 32-bit and 64-bit targets.
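A sketch of what that could look like: select the word-size-appropriate finalizer at compile time. The constants here are MurmurHash3's well-known fmix32/fmix64 finalizers, chosen purely for illustration, not anything the Swift runtime uses.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical: pick a finalizer suited to the native word size.
// The branch folds away at compile time because sizeof is a constant.
static size_t mixWord(size_t x) {
  if (sizeof(size_t) == 8) {
    uint64_t h = x;  // 64-bit variant: MurmurHash3's fmix64
    h ^= h >> 33; h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33; h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;
    return (size_t)h;
  } else {
    uint32_t h = (uint32_t)x;  // 32-bit variant: MurmurHash3's fmix32
    h ^= h >> 16; h *= 0x85ebca6bU;
    h ^= h >> 13; h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return (size_t)h;
  }
}
```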

@Cyan4973

Sure.
In general, the gains from 32 to 64 bits make it worthwhile.
Two caveats though:

  • It increases maintenance complexity (which can be mitigated by using both versions from a single library).
  • For short keys specifically, a 64-bit version may not have enough room to prove superior.
    Once again, only benchmarking will tell ...

@rudkx
Contributor

@rudkx rudkx commented on 4227645 Feb 13, 2016

👍 💯
