
LatestRoute always picks the same node in cluster #127

Closed
nathanwelch opened this issue Jun 7, 2016 · 4 comments

@nathanwelch

Goal

I want to handle situations where a node in a pool fails and is marked TKO. I'd like this to work for both reads and writes so that operations can continue against the remaining healthy nodes. TKO'd nodes should be skipped until they are usable again, and load should stay roughly even across the other nodes in the meantime. Based on trial and error, reading the docs, and reading through other issues, this seems like the best way to achieve that for reads/writes:

"route": {
    "type": "OperationSelectorRoute",
    "operation_policies": {
        "add": "AllFastestRoute|Pool|A",
        "delete": "AllFastestRoute|Pool|A",
        "get": "LatestRoute|Pool|A",
        "set": "AllFastestRoute|Pool|A"
    }
}

AllFastestRoute makes sets to failed nodes in Pool A not return an error while still replicating, so future get commands will hit. LatestRoute should try a node and then fail over to another one on error. I tried FailoverRoute, but it seems to always go in order through the pool, so it always chooses the first available node; that sends all traffic to one server rather than doing any sort of hashing to spread it out.
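
For reference, this is roughly the LatestRoute variant I was experimenting with for get. It's only a sketch: the failover_count field is my reading of the route handle docs (how many children to try before giving up), so treat it as an assumption.

"get": {
    "type": "LatestRoute",
    "children": "Pool|A",
    "failover_count": 3
}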

Issue

I'm running an mcrouter client in a Docker container on every API host that talks to memcached. Even across different API hosts, LatestRoute seems to be choosing the same memcached node in my pool. Does the host ID that mcrouter uses to generate the list of nodes for LatestRoute come from the mcrouter Docker container? If so, I guess it's possible that it's always the same, even on different API hosts?

Questions

  1. Based on my goal, is this the right way to go about it? Is there some combination of HashRoute and FailoverRoute that might work better?
  2. Assuming the above config is correct, what's the right way to configure LatestRoute to make sure it talks to more than one node?
  3. Am I right in thinking that mcrouter does not automatically remove TKO'd nodes unless the route config explicitly allows it? I originally thought node failures were handled out of the box somehow without needing to make a specific FailoverRoute config.
@nathanwelch (Author)

I did find #35 (comment) as well. I implemented that and I think it mostly solves my issues. However, I still have some questions:

  1. If you're replicating to all nodes in a cluster and a set fails on a node, what's the best way to recover from that? For some caches I'm OK with the set failing, so I can use something like AllFastestRoute, but for others a failed set might mean that a subsequent get returns stale data. I can think of some application-level solutions to this, but I'm curious what the best practice is here.
  2. Is the approach from #35 (comment) ("feature: Random MissFailoverRoute?") actually the recommended way to achieve my goal?
  3. From my original questions: Am I right in thinking that mcrouter does not automatically remove TKO'd nodes unless the route config explicitly allows it?

Thanks!

@pavlo-fb (Contributor)

Hi @nathanwelch! First of all, thanks for the detailed explanation of your problem.

Is there some combination of HashRoute and FailoverRoute that might work better?

Yes. To achieve your goal I would go with:

"route": {
  "type": "OperationSelectorRoute",
  "operation_policies": {
    "get": {
      "type": "FailoverRoute",
      "children": [
        "HashRoute|Pool|A",
        "Pool|A"
      ]
    }
  },
  "default_policy": "AllAsyncRoute|Pool|A"
}

Full example with logs and small test: http://pastebin.com/GBFhmmvb

What does it do? For get it routes based on the key hash and, if the request fails, fails the request over to the other nodes one by one. The only drawback is that the request may be sent twice to the same node, but in the TKO case (the most common one) that doesn't matter; it's basically free.
For all other operations (add, delete, set, etc.) it will asynchronously send requests to all nodes. Mcrouter will immediately return a default reply (e.g. NOT_STORED, NOT_FOUND) and will never return an error. Feel free to replace AllAsyncRoute with AllFastestRoute/AllMajorityRoute/AllSyncRoute if you want to wait for some nodes to reply.
NOTE: this config doesn't work for gets/cas, since those requests have to deal with cas tokens.
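
If you want failed writes to be reported back to the client instead of silently getting a default reply, one option is to swap the default policy. This is only a sketch (same structure as the config above, with AllSyncRoute substituted); if I'm reading the docs right, AllSyncRoute waits for all children and returns the worst reply, so a failed set surfaces as an error:

"route": {
  "type": "OperationSelectorRoute",
  "operation_policies": {
    "get": {
      "type": "FailoverRoute",
      "children": [
        "HashRoute|Pool|A",
        "Pool|A"
      ]
    }
  },
  "default_policy": "AllSyncRoute|Pool|A"
}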

Would the host ID that mcrouter uses to generate the list of nodes in the pool for LatestRoute come from the mcrouter docker container?

hostid is calculated based on the local IP address, so if all mcrouter instances see the same local IP (for example, when running on the same host, or in Docker containers with identical internal addresses), they will use the same hostid. The logic for calculating hostid is here: https://github.com/facebook/mcrouter/blob/master/mcrouter/lib/fbi/cpp/globals.cpp#L61

Am I right in thinking that mcrouter does not automatically remove TKO'd nodes unless the route config explicitly allows it? I originally thought node failures were handled out of the box somehow without needing to make a specific FailoverRoute config.

Mcrouter will not remove TKO'd nodes by default. You should explicitly add FailoverRoute or a similar route (e.g. MissFailoverRoute) to enable retrying requests on failure.
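
For example, a MissFailoverRoute policy for gets would look roughly like this. It's only a sketch (not taken from the pastebin above); if I'm reading the route handle docs right, MissFailoverRoute tries each child in order until one returns a hit:

"get": {
  "type": "MissFailoverRoute",
  "children": [
    "HashRoute|Pool|A",
    "Pool|A"
  ]
}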

@jmswen (Member) commented Jul 14, 2016

Hi @nathanwelch, did @pavlo-fb's response resolve your issue?

@nathanwelch (Author)

@jmswen Yes, it did answer my question, but I haven't had a chance to test it out yet. Sorry I forgot to close the issue afterwards. Thanks for the help!

For the record, for future readers: the solution in #35 (comment) gave me some weird results in testing. I can elaborate if someone needs it, but basically, after sustained use I saw a dramatic increase in latency on all cache reads/writes. I never fully tracked down why, though.
