This repository has been archived by the owner on Feb 14, 2023. It is now read-only.

Attempt to make consistent hashing more uniform #7

Closed
wants to merge 1 commit

Conversation

paulmach
Contributor

This code runs random strings against the consistent hash function and checks uniformity. Unfortunately, the results are not even close to uniform:

  58856 10.0.16.69:11211
  18389 10.0.22.188:11211
  22755 10.0.26.214:11211

This matches what we're seeing in production (memcache load is not uniform and matches the distribution above). The issue is due to the non-uniformity of the crc32 hash function. :(

So I updated the code to use a different hash table and to default to more replicas on the consistent hash "ring". This gave better results, but I was unable to find something that would be better for all inputs:

  29161 10.0.16.69:11211
  37910 10.0.22.188:11211
  32929 10.0.26.214:11211
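
For reference, a minimal sketch of the kind of uniformity check described above, assuming zlib's crc32 as the hash; the random key generation and the hash % N bucketing are simplifications of the client's actual replica ring, so only the counting idea carries over.

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>   /* crc32(); link with -lz */

int main(void) {
    const char *servers[] = {
        "10.0.16.69:11211", "10.0.22.188:11211", "10.0.26.214:11211"
    };
    int counts[3] = {0, 0, 0};
    char key[32];

    srand(42);
    for (int i = 0; i < 100000; i++) {
        /* generate a short random lowercase key */
        int len = 8 + rand() % 8;
        for (int j = 0; j < len; j++)
            key[j] = 'a' + rand() % 26;

        /* hash the key and count which server it lands on
           (hash % N stands in for the replica ring lookup) */
        unsigned long h = crc32(0L, (const unsigned char *) key, (unsigned int) len);
        counts[h % 3]++;
    }

    for (int i = 0; i < 3; i++)
        printf("%7d %s\n", counts[i], servers[i]);
    return 0;
}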

@mlerner

@mrdmnd

mrdmnd commented Aug 18, 2015

I left some comments on the snippet, but:

I've run into this issue before, when building the randomizer for Airbnb's experiment assignment framework. It has to do with how the CRC algorithm treats the low-order bits of the input.

We solved this by moving to the Murmur3 hash:

#include <stdint.h>

uint32_t murmur3_32(const char *key, uint32_t len, uint32_t seed) {
    static const uint32_t c1 = 0xcc9e2d51;
    static const uint32_t c2 = 0x1b873593;
    static const uint32_t r1 = 15;
    static const uint32_t r2 = 13;
    static const uint32_t m = 5;
    static const uint32_t n = 0xe6546b64;

    uint32_t hash = seed;

    /* body: mix the key four bytes at a time
       (assumes the platform tolerates unaligned 32-bit loads) */
    const int nblocks = len / 4;
    const uint32_t *blocks = (const uint32_t *) key;
    int i;
    for (i = 0; i < nblocks; i++) {
        uint32_t k = blocks[i];
        k *= c1;
        k = (k << r1) | (k >> (32 - r1));
        k *= c2;

        hash ^= k;
        hash = ((hash << r2) | (hash >> (32 - r2))) * m + n;
    }

    /* tail: mix the remaining 0-3 bytes; the cases intentionally fall through */
    const uint8_t *tail = (const uint8_t *) (key + nblocks * 4);
    uint32_t k1 = 0;

    switch (len & 3) {
    case 3:
        k1 ^= tail[2] << 16;
        /* fall through */
    case 2:
        k1 ^= tail[1] << 8;
        /* fall through */
    case 1:
        k1 ^= tail[0];

        k1 *= c1;
        k1 = (k1 << r1) | (k1 >> (32 - r1));
        k1 *= c2;
        hash ^= k1;
    }

    /* finalization: force the bits to avalanche */
    hash ^= len;
    hash ^= (hash >> 16);
    hash *= 0x85ebca6b;
    hash ^= (hash >> 13);
    hash *= 0xc2b2ae35;
    hash ^= (hash >> 16);

    return hash;
}

See this article for more: http://michiel.buddingh.eu/distribution-of-hash-values
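
As a rough illustration of how murmur3_32 would slot in, here is a hypothetical call site; the server list and the modulo bucketing are placeholders for the client's replica-ring lookup.

#include <string.h>

/* Hypothetical call site: pick a server for a key using murmur3_32 (defined
 * above) instead of crc32. The modulo bucketing is a simplification of the
 * client's replica ring. */
const char *choose_server(const char *key) {
    static const char *servers[] = {
        "10.0.16.69:11211", "10.0.22.188:11211", "10.0.26.214:11211"
    };
    uint32_t h = murmur3_32(key, (uint32_t) strlen(key), 0 /* seed */);
    return servers[h % 3];
}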

@mrdmnd

mrdmnd commented Aug 18, 2015

@paulmach
Contributor Author

Thanks for the tip. I reversed the bytes as suggested here and got results similar to this fix.
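
The linked suggestion isn't visible in this thread, so the exact change is unclear; one reading of "reversed the bytes" is feeding the key to crc32 back-to-front. A rough sketch of that interpretation, using zlib's crc32:

#include <string.h>
#include <zlib.h>   /* crc32() */

/* One possible reading of "reversed the bytes": hash the key back-to-front.
 * This is an illustration only, not necessarily the change that was tested. */
static uLong crc32_of_reversed_key(const char *key) {
    unsigned char buf[256];
    size_t len = strlen(key);
    if (len > sizeof(buf))
        len = sizeof(buf);
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char) key[len - 1 - i];   /* reverse byte order */
    return crc32(0L, buf, (uInt) len);
}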

@mrdmnd

mrdmnd commented Aug 18, 2015

Awesome. Glad that got sorted.

@paulmach
Contributor Author

Yeah, but then I tweaked some other things and it got bad again. I'm working on murmur3 now.

@paulmach
Contributor Author

Closed in favor of #8.

@paulmach paulmach closed this Aug 18, 2015
@paulmach paulmach deleted the pm/uniform branch August 18, 2015 20:17
@andyxning

Maybe we can choose an existing hash algorithm that works better than crc32 for consistent hashing. Implementing a hash algorithm ourselves is not the optimal solution; using an existing library is. :)
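
For example, the reference MurmurHash3 implementation from the smhasher project could be used directly rather than copying the function by hand; a hypothetical call, assuming that project's MurmurHash3.h is available:

#include <stdint.h>
#include "MurmurHash3.h"   /* reference implementation from the smhasher project */

/* Hash a key with the existing reference implementation instead of a
 * hand-rolled copy. */
uint32_t hash_key(const char *key, int len) {
    uint32_t out;
    MurmurHash3_x86_32(key, len, 0 /* seed */, &out);
    return out;
}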
