Allow an optional instance name, use it for consistent hashing #25

Closed
antirez opened this issue Dec 5, 2012 · 11 comments

antirez commented Dec 5, 2012

The problem

Twemproxy can be configured to avoid auto ejecting nodes, and when it is configured this way the user can rely on the fact that a given key will always be mapped to the same server, as long as the list of hosts remains the same.

This is very useful when using the proxy with Redis, especially when Redis is not used as a cache but as a data store, because we are sure keys are never moved to other instances, never leaked, and so forth, so the cluster is consistent.

However, since Twemproxy adds a given host to the hash ring by hashing the ip:port:priority string directly, users cannot relocate instances without changing the key-instance mapping as a side effect. This little detail makes it very hard to work with Twemproxy and Redis in production environments where network addresses can change.

Actually this is a problem with Memcached as well. For instance, if our memcached cluster changes subnet, the consistent hashing will completely shuffle the map, and this will result in many cache misses after the reconfiguration.

Proposed solution

The proposed solution is to change the configuration so that instead of a list of instances like:

servers:
   - 127.0.0.1:6379:1
   - 127.0.0.1:6380:1
   - 127.0.0.1:6381:1
   - 127.0.0.1:6382:1

It is (optionally) possible to specify a host / name pair for every instance:

servers:
   - 127.0.0.1:6379:1 server1
   - 127.0.0.1:6380:1 server2
   - 127.0.0.1:6381:1 server3
   - 127.0.0.1:6382:1 server4

When an instance name is specified, the name is hashed to insert the node into the hash ring, instead of the ip:port:priority string.
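
The selection rule above can be sketched in C as follows. This is only an illustration of the proposal, not twemproxy code: the `server_conf` struct, its field names, and the `ring_key` function are all hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-server config entry; field names are illustrative only. */
struct server_conf {
    const char *host;   /* e.g. "127.0.0.1" */
    unsigned    port;   /* e.g. 6379 */
    unsigned    weight; /* the "priority" field of the config line */
    const char *name;   /* optional instance name, NULL when not given */
};

/*
 * Write into buf the string that would be hashed onto the ring.
 * With a name: the name alone, so host and port can change freely.
 * Without a name: the "ip:port:priority" string, as before.
 * Returns what snprintf returns (length of the full key).
 */
int ring_key(const struct server_conf *s, char *buf, size_t len)
{
    if (s->name != NULL && s->name[0] != '\0') {
        return snprintf(buf, len, "%s", s->name);
    }
    return snprintf(buf, len, "%s:%u:%u", s->host, s->port, s->weight);
}
```

With this rule, "127.0.0.1:6379:1 server1" and "10.0.0.5:7000:1 server1" produce the same ring key, while an unnamed entry still produces "127.0.0.1:6379:1".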

Open problems

One open problem with this solution is that modifying the priority will still mess with the mapping.
There are several solutions to this problem:

  • Simply ignore the problem and warn the user in the documentation.
  • Ignore the priority when an instance name is specified.
  • Ignore the priority when an instance name is specified, but read it instead from the name. For instance an instance name like "myserver:100" has priority 100. In this way it is obvious that to change the priority the user is forced to change the name, and hence the map.
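
The third option (reading the priority from the name) could look roughly like the sketch below. The function name `weight_from_name` is hypothetical, and rejecting a priority of 0 is an assumption made here, not something the proposal states.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the priority from a name of the form "<label>:<number>",
 * e.g. "myserver:100" -> 100.
 * Returns 0 and stores the value in *weight on success.
 * Returns -1 when there is no ":<number>" suffix, when the suffix is
 * not a plain decimal number, or when it is 0 (assumed invalid here).
 */
int weight_from_name(const char *name, unsigned long *weight)
{
    const char *colon = strrchr(name, ':');
    char *end;
    unsigned long v;

    if (colon == NULL || colon[1] == '\0') {
        return -1;
    }
    /* strtoul would accept a sign or leading whitespace; we do not. */
    if (colon[1] < '0' || colon[1] > '9') {
        return -1;
    }
    errno = 0;
    v = strtoul(colon + 1, &end, 10);
    if (errno != 0 || *end != '\0' || v == 0) {
        return -1;
    }
    *weight = v;
    return 0;
}
```

Because the number is part of the hashed name, changing it from "myserver:100" to "myserver:200" visibly changes the name, and therefore the map.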

jzawodn commented Dec 5, 2012

+1

This is pretty much exactly how we do it at craigslist with our sharding setup. We hash to a "node name" rather than directly to an IP:PORT pair, so it's possible to move data without losing any keys.

http://blog.zawodny.com/2011/02/26/redis-sharding-at-craigslist/

Author

antirez commented Dec 5, 2012

Thanks for the ACK Jeremy! I also did the same when trying to implement Dynamo concepts on top of Redis.

Collaborator

manjuraj commented Dec 5, 2012

I like this idea of using the "node name" (when specified) instead of the "host:port" pair as input to consistent hashing. I also believe that this should be fairly easy to implement.

Regarding the open problem of priority, we can just use the priority from the "host:port:priority" triplet. For example, for an input like "127.0.0.1:6382:1 server4" we will use "1" as the priority of server4.

Author

antirez commented Dec 5, 2012

@manjuraj doesn't the priority affect the way the hash ring is populated (more replicas of the same node if the priority is higher)? If not, I was addressing a non-existent problem (that just changing the priority would change the map).

Contributor

charsyam commented Dec 5, 2012

@manjuraj I have a question. If server->port is 11211, why isn't the port number included in the hash string? Is there a special reason?

    if (server->port == KETAMA_DEFAULT_PORT) {
        hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s-%u",
                           server->name.len, server->name.data,
                           pointer_index - 1);
    } else {
        hostlen = snprintf(host, KETAMA_MAX_HOSTLEN, "%.*s:%u-%u",
                           server->name.len, server->name.data,
                           server->port, pointer_index - 1);
    }

Collaborator

manjuraj commented Dec 6, 2012

@charsyam This code exists for backward compatibility reasons.

When we deployed twemproxy inside Twitter for the memcached protocol, for a while we would do dual reads: read data through the proxy, read data directly from the backend server cluster, and ensure that we read the same data from both code paths. Since the client was using libmemcached, we had to use the same consistent hashing algorithm as the libmemcached library to ensure that keys get mapped to the same server.

I guess we can now update this code so that the port number is omitted only when the server pool is a memcache server pool.
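
A rough sketch of that change, written as a standalone function rather than against the real twemproxy structs. The `is_memcache_pool` flag, the function name, and the plain C-string parameters are assumptions for illustration; 11211 is the default port mentioned in the question above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEFAULT_MEMCACHE_PORT 11211u /* the port discussed above */

/*
 * Build the per-point string that is hashed onto the ketama ring.
 * The port is left out only for a memcache pool on the default port,
 * which keeps the libmemcached-compatible form for that case.
 * Every other case, including all redis pools, includes the port.
 */
int ketama_host_key(char *buf, size_t len, const char *name, unsigned port,
                    unsigned point_index, bool is_memcache_pool)
{
    if (is_memcache_pool && port == DEFAULT_MEMCACHE_PORT) {
        return snprintf(buf, len, "%s-%u", name, point_index);
    }
    return snprintf(buf, len, "%s:%u-%u", name, port, point_index);
}
```

Note that changing the key format for existing redis pools on port 11211 would itself remap their keys, so a change like this would need the same care as any other ring change.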

Collaborator

manjuraj commented Dec 6, 2012

@antirez priority refers to the weight of a server. For example, if I am running redis on server1 with 4G and another redis on server2 with 8G, I would want to give server2 twice the weight given to server1 so that keys are distributed in proportion to each server's memory.

So, if a server migrates from "127.0.0.1:6379:X server1" to "1.2.3.4:8888:Y server1", we must keep the weights X and Y the same to keep the key mapping stable.
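
To make the stability condition concrete, here is a small sketch. It assumes a simplified model in which each server contributes a number of ring points proportional to its weight and each point is labelled from the instance name; the struct, the constant, and both functions are hypothetical and do not reproduce twemproxy's actual ketama arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative constant, not taken from twemproxy. */
#define POINTS_PER_WEIGHT_UNIT 40u

/* Hypothetical named server entry. */
struct named_server {
    const char *host;
    unsigned    port;
    unsigned    weight;
    const char *name;
};

/* Number of ring points a server contributes in this simplified model. */
size_t ring_point_count(const struct named_server *s)
{
    return (size_t)s->weight * POINTS_PER_WEIGHT_UNIT;
}

/*
 * True when relocating `before` to `after` leaves the ring untouched:
 * same name (same point labels) and same weight (same point count).
 * Host and port are deliberately ignored.
 */
bool same_ring_footprint(const struct named_server *before,
                         const struct named_server *after)
{
    return strcmp(before->name, after->name) == 0 &&
           ring_point_count(before) == ring_point_count(after);
}
```

In this model, moving server1 to a new address with the same weight keeps its footprint, while changing the weight adds or removes points and so remaps some keys.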

Author

antirez commented Dec 6, 2012

@manjuraj yes, you and I understand this, but IMHO this is how it goes with a typical user:

"Hey we got this new fast box with BIG RAM! Holy Shit let's move one of our instances there"

- 192.168.1.3:6379:10 server1
+ 192.168.1.5:6379:99 server1

"Look, I updated the priority because this box is so much bigger!"

And the user ends up with data shuffled around instances in a way that is very hard to recover from.

So back to my proposals, honestly, both ignoring priority and putting it into the name sound wrong to me. For the following reasons:

  • Ignoring priority is a surprising behavior.
  • Forcing it to be part of the name could work but there is a numerical part anyway, like "myserver:1000", users may still think that the numerical part can be changed without problems.

It's probably better just to use warnings inside the documentation to make sure people understand that changing priority OR instance name will result in different mapping of keys.

Contributor

charsyam commented Dec 6, 2012

@antirez @manjuraj it is a complicated problem. I also think ignoring priority is a good approach when redis is true, but it can also cause some confusion because twemproxy has to support memcache as well.

Like craigslist, some users may also run a setup like the one below:

192.168.1.3:2000:1 server1-1
192.168.1.3:2001:1 server1-2
192.168.1.3:2002:1 server1-3
192.168.1.3:2003:1 server1-4

But no one can deny that users will easily make a mistake.

Author

antirez commented Dec 6, 2012

Maybe the ultimate solution is this:

  • If node ejection is false.
  • If redis is true.
  • If the user specified a node name for every node.

THEN -> Exit with an error if the specified priority is not always "1", with an error message that makes sense, like:
"You are proxying Redis protocol with node ejection disabled and explicit names for all the nodes. In this setup usually a static map between keys and hosts is needed, so all the instances must be configured with priority 1 (otherwise changing the priority may change how keys are mapped to servers)."

Optionally one may support an option to still allow non-1 priority with Redis server in this setup.

Ok I think so far this is absolutely the best option we have.
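
A minimal sketch of that startup check in C. The `pool_conf` and `pool_server` structs and the `allow_weights` opt-out flag are hypothetical; they only model the three conditions and the optional override described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical config model; not twemproxy's real structs. */
struct pool_server {
    const char *name;   /* NULL when no instance name was given */
    unsigned    weight; /* the "priority" */
};

struct pool_conf {
    bool redis;                        /* proxying the Redis protocol */
    bool auto_eject_hosts;             /* node ejection enabled */
    bool allow_weights;                /* hypothetical explicit opt-out */
    const struct pool_server *servers;
    size_t nservers;
};

/*
 * Returns NULL when the config is acceptable, or an error message when
 * a static key-to-host map is implied (Redis, no ejection, every node
 * named) but some priority is not 1.
 */
const char *check_static_map(const struct pool_conf *p)
{
    size_t i;

    if (!p->redis || p->auto_eject_hosts || p->allow_weights) {
        return NULL;
    }
    for (i = 0; i < p->nservers; i++) {
        if (p->servers[i].name == NULL) {
            return NULL; /* not every node is named: rule does not apply */
        }
    }
    for (i = 0; i < p->nservers; i++) {
        if (p->servers[i].weight != 1) {
            return "Redis pool with node ejection disabled and explicit "
                   "names: all instances must use priority 1";
        }
    }
    return NULL;
}
```

The proxy would call this once after parsing the configuration and exit with the returned message when it is not NULL.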
