Poor server distribution #104

Closed
vchekan opened this Issue Jun 5, 2012 · 5 comments

@vchekan
vchekan commented Jun 5, 2012

I noticed that the client "favors" certain servers when making requests with keys that differ only slightly. For example, with keys of the form "a_:stat:number", where number is 0-19, and with 9 servers, the distribution is:

a_:stat:0 10.7.1.12
a_:stat:1 10.7.1.12
a_:stat:2 10.7.1.12
a_:stat:3 10.7.1.12
a_:stat:4 10.7.1.12
a_:stat:5 10.7.1.12
a_:stat:6 10.7.1.10
a_:stat:7 10.7.1.12
a_:stat:8 10.7.1.10
a_:stat:9 10.7.1.10
a_:stat:10 10.7.1.12
a_:stat:11 10.7.1.12
a_:stat:12 10.7.1.12
a_:stat:13 10.7.1.12
a_:stat:14 10.7.1.12
a_:stat:15 10.7.1.12
a_:stat:16 10.7.1.12
a_:stat:17 10.7.1.12
a_:stat:18 10.7.1.10
a_:stat:19 10.7.1.10

As you can see, server 10.7.1.12 is heavily favored: it receives 15 of the 20 keys, or 75% of requests, while with 9 servers I would expect roughly 1/9 ≈ 11% of requests to go to a single server.
The client configuration is the default one, without any key transformers, etc. defined.

@jkpindahbc

I would like to know whether anyone has addressed this, or whether any configuration changes exist that would address the issue. We are using the client in a production environment and would like to balance the storage load better. There are 12 servers in the cluster; 2 to 3 of them are overloaded, while 5 of them have little or no data stored on them. Thanks.

@vchekan
vchekan commented Jan 2, 2013

@jkpindahbc, I solved the problem by using the Ketama node locator factory. It is part of the memcached client, but it is not the default factory.

var conf = new MemcachedClientConfiguration();
...
conf.NodeLocatorFactory = new KetamaNodeLocatorFactory();
return new MemcachedClient(conf);

The distribution is still far from even, but it has improved a lot:

Default: (FNV1a hash)
15 10.7.1.12
13 10.7.1.1
8 10.7.1.10
3 10.7.1.11
1 10.7.1.12

Ketama:
7 10.7.1.10
7 10.7.1.1
6 10.7.1.12
5 10.7.1.2
4 10.7.1.11
3 10.7.1.9
3 10.7.1.3
3 10.7.1.13
1 10.7.1.4
1 10.7.1.13

@yilo
yilo commented Feb 12, 2014

Do you think Ketama is better than Default? I ran into the same problem: with two memcached servers, one gets a much heavier load than the other. I am not sure whether I need to switch to Ketama. And what about the other options, such as TigerHashKeyTransformer and SHA1KeyTransformer?
Thanks.

@vchekan
vchekan commented Feb 13, 2014

@yilo According to my test above, it is better. The production servers also behaved much better once I changed the hash. You need to test the distribution of hashes yourself to find out which one works best.
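For anyone who wants to run that kind of test, here is a minimal sketch of a tally helper. It uses only the base class library. The `locate` delegate is a hypothetical stand-in for whatever maps a key to a server name in your setup; it is not part of the memcached client, and wiring it to the client's node locator is left to you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistributionCheck
{
    // Counts how many of the given keys land on each server and returns the
    // counts sorted from the busiest server to the least busy one.
    // `locate` is an assumption: any function that maps a key to a server name.
    public static List<KeyValuePair<string, int>> Tally(
        IEnumerable<string> keys, Func<string, string> locate)
    {
        return keys
            .GroupBy(locate)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)
            .ToList();
    }
}
```

Feeding it keys that resemble your real traffic (for example `Enumerable.Range(0, 20).Select(i => "a_:stat:" + i)`) and printing `p.Value + " " + p.Key` for each entry gives output in the same "count server" shape as the lists above. Servers that receive no keys at all do not appear in the result, so compare the number of entries against the number of servers as well.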

@jamey-taylor

We've observed the same thing.

The KetamaNodeLocator consistently provides a more even distribution across nodes, but we didn't want to switch to it because of issue #113.

However, if you compare how the KetamaNodeLocator and the DefaultNodeLocator work, you'll notice that the DefaultNodeLocator is pretty much a simplified version of Ketama. There are 3 major differences in how the two build the index of hash keys to servers.

  1. Ketama creates 160 index mutations or slices per server, and Default creates only 100.
  2. Ketama reverses the order of bytes from hashes, and Default does not.
  3. Ketama's default hash is the built-in .NET MD5 hash, and Default uses an internal implementation of the FNV-1a hash.

After tweaking all 3 parameters above, I concluded that the hash algorithm contributes to Default having a poor distribution across nodes.

If you make a copy of DefaultNodeLocator and replace all occurrences of the FNV-1a hash with an MD5 hash, you'll most likely see an improved distribution. For a small cluster you might see a worse distribution, as the 2-server example below indicates. But as the number of nodes in your memcached cluster increases, switching to MD5 will probably help you out.
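To experiment with the hash swap outside the client, a toy hash ring with a pluggable hash function may be enough. The sketch below is not the client's code: the `HashRing` class, the `server + "-" + i` naming of ring points, and the choice to take the first 4 bytes of the MD5 digest are all assumptions made for illustration, and the byte reversal mentioned above is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public sealed class HashRing
{
    private readonly uint[] points;    // sorted ring positions
    private readonly string[] owners;  // owners[i] is the server at points[i]
    private readonly Func<string, uint> hash;

    public HashRing(IEnumerable<string> servers, int pointsPerServer,
                    Func<string, uint> hash)
    {
        this.hash = hash;
        var ring = new SortedDictionary<uint, string>();
        foreach (var server in servers)
            for (int i = 0; i < pointsPerServer; i++)
                ring[hash(server + "-" + i)] = server; // on a collision the last writer wins

        points = new uint[ring.Count];
        owners = new string[ring.Count];
        ring.Keys.CopyTo(points, 0);   // keys and values are both in key order
        ring.Values.CopyTo(owners, 0);
    }

    // Returns the server that owns the first ring point at or after the
    // key's hash, wrapping around to the lowest point if there is none.
    public string Locate(string key)
    {
        int i = Array.BinarySearch(points, hash(key));
        if (i < 0) i = ~i;
        if (i == points.Length) i = 0;
        return owners[i];
    }

    // 32-bit FNV-1a over the UTF-8 bytes of the string.
    public static uint Fnv1a(string s)
    {
        uint h = 2166136261u;
        unchecked
        {
            foreach (byte b in Encoding.UTF8.GetBytes(s))
            {
                h ^= b;
                h *= 16777619u;
            }
        }
        return h;
    }

    // First 4 bytes of the MD5 digest, read in the platform's byte order.
    public static uint Md5(string s)
    {
        using (var md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(s));
            return BitConverter.ToUInt32(digest, 0);
        }
    }
}
```

Building one ring as `new HashRing(servers, 100, HashRing.Fnv1a)` and another as `new HashRing(servers, 160, HashRing.Md5)`, then calling `Locate` on the same set of keys, lets you vary the hash and the points-per-server count independently. Because the point naming differs from the real locators, the absolute numbers will not match the client's behavior; only the relative effect of each parameter is meaningful.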

I used reflection to peek at the dictionary of key-to-server values in the node locators to determine how many of the possible key hashes (uint.MinValue to uint.MaxValue) would be allocated to each server, and compared the results for various configurations. I pasted some of the results below. Keep in mind that your results will vary, because the distribution changes based on the IP addresses of the nodes.
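The counting step can be sketched roughly as follows, assuming the ring has already been extracted into a sorted map from ring position to server name. How you obtain that map is outside this sketch, and the ownership rule used here (a hash belongs to the first position at or after it, wrapping around at the top) is an assumption that may differ in detail from the locators' lookup.

```csharp
using System;
using System.Collections.Generic;

public static class RingShare
{
    // For each server, counts how many of the 2^32 possible hash values map
    // to it. A hash belongs to the first ring position at or after it;
    // hashes above the highest position wrap to the lowest position.
    public static Dictionary<string, ulong> KeysPerServer(
        SortedDictionary<uint, string> ring)
    {
        var totals = new Dictionary<string, ulong>();
        string firstOwner = null;
        ulong previous = 0;
        bool first = true;

        foreach (var entry in ring)
        {
            ulong span;
            if (first)
            {
                firstOwner = entry.Value;
                span = (ulong)entry.Key + 1;       // hashes 0..position inclusive
                first = false;
            }
            else
            {
                span = (ulong)entry.Key - previous; // (previous, position]
            }
            Add(totals, entry.Value, span);
            previous = entry.Key;
        }

        if (!first)
            Add(totals, firstOwner, uint.MaxValue - previous); // wrap-around arc
        return totals;
    }

    private static void Add(Dictionary<string, ulong> totals, string server, ulong n)
    {
        ulong current;
        totals.TryGetValue(server, out current);
        totals[server] = current + n;
    }
}
```

Dividing each total by 4294967296.0 gives the percentage share. With this ownership rule the totals always add up to 2^32, which is a handy sanity check.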

2 servers: 10.0.0.1 - 10.0.0.2

DefaultNodeLocator
Node Keys Share
10.0.0.1:11211 1933710048 45.02 %
10.0.0.2:11211 2361257247 54.98 %

KetamaNodeLocator
Node Keys Share
10.0.0.1:11211 2102353687 48.95 %
10.0.0.2:11211 2192613608 51.05 %

DefaultNodeLocatorWithMD5
Node Keys Share
10.0.0.1:11211 1846857399 43.00 %
10.0.0.2:11211 2448109896 57.00 %

3 servers: 10.0.0.1 - 10.0.0.3

DefaultNodeLocator
Node Keys Share
10.0.0.1:11211 1538860362 35.83 %
10.0.0.2:11211 2182144864 50.81 %
10.0.0.3:11211 573962069 13.36 %

KetamaNodeLocator
Node Keys Share
10.0.0.1:11211 1368297246 31.86 %
10.0.0.2:11211 1486717250 34.62 %
10.0.0.3:11211 1439952799 33.53 %

DefaultNodeLocatorWithMD5
Node Keys Share
10.0.0.1:11211 1220214900 28.41 %
10.0.0.2:11211 1705981095 39.72 %
10.0.0.3:11211 1368771300 31.87 %

12 servers: 10.0.0.1 - 10.0.0.12

DefaultNodeLocator
Node Keys Share
10.0.0.1:11211 251176749 5.85 %
10.0.0.10:11211 243426577 5.67 %
10.0.0.11:11211 300083499 6.99 %
10.0.0.12:11211 330384367 7.69 %
10.0.0.2:11211 716357460 16.68 %
10.0.0.3:11211 399253606 9.30 %
10.0.0.4:11211 187559973 4.37 %
10.0.0.5:11211 838456974 19.52 %
10.0.0.6:11211 225633515 5.25 %
10.0.0.7:11211 356156731 8.29 %
10.0.0.8:11211 263221002 6.13 %
10.0.0.9:11211 183256842 4.27 %

KetamaNodeLocator
Node Keys Share
10.0.0.1:11211 308872529 7.19 %
10.0.0.10:11211 397353016 9.25 %
10.0.0.11:11211 359410151 8.37 %
10.0.0.12:11211 334112090 7.78 %
10.0.0.2:11211 363841247 8.47 %
10.0.0.3:11211 361749691 8.42 %
10.0.0.4:11211 382162451 8.90 %
10.0.0.5:11211 306283767 7.13 %
10.0.0.6:11211 375643565 8.75 %
10.0.0.7:11211 396653969 9.24 %
10.0.0.8:11211 343676679 8.00 %
10.0.0.9:11211 365208140 8.50 %

DefaultNodeLocatorWithMD5
Node Keys Share
10.0.0.1:11211 352131379 8.20 %
10.0.0.10:11211 360082131 8.38 %
10.0.0.11:11211 366372356 8.53 %
10.0.0.12:11211 349456483 8.14 %
10.0.0.2:11211 438330925 10.21 %
10.0.0.3:11211 355439076 8.28 %
10.0.0.4:11211 365394346 8.51 %
10.0.0.5:11211 322431280 7.51 %
10.0.0.6:11211 353893462 8.24 %
10.0.0.7:11211 308521292 7.18 %
10.0.0.8:11211 352873634 8.22 %
10.0.0.9:11211 370040931 8.62 %

@enyim enyim closed this Apr 24, 2016