Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine map default with cmdmap? #7

Open
sevrinrobstad opened this issue Jun 18, 2024 · 15 comments
Open

Combine map default with cmdmap? #7

sevrinrobstad opened this issue Jun 18, 2024 · 15 comments

Comments

@sevrinrobstad
Copy link

Hi.

I'm trying to test memcached proxy to substitute mcrouter.

Copying our routes setup from mcrouter, I have hit a snag...
What I want to achieve is to map commands to different pools (and route logics) for both special case prefix, AND for default (but to different pools),
for example to send
SET's to route-allfastest, and
GET's to route_failover on another pool.

The code block below shows a non-complete example:

    conf = {
        mode = "prefix",
        stop = "/"
    },
    map = {
        -- route handler for path "foo/*"
        specialcase = cmdmap{
            [mcp.CMD_SET] = route_allfastest{
                children = { "foo" },
            },
        },
    },
    default = cmdmap{
            -- only handle SET commands for path "bar/*"
            [mcp.CMD_SET] = route_allfastest{
                children = { "foo2" },
            },
        },
--    default = route_allfastest{
--        children = { "foo" }
--    },
}

But this gives
ERROR: Failed to execute mcp_config_pools: /files/routelib.lua:304: bad argument #1 to 'for iterator' (table expected, got nil)

With the commented out default block (without cmdmap) it works.

Am I doing something wrong, or is there any other way to achieve this config?

@dormando
Copy link
Member

Hey!

Somehow I'd not seen a mcrouter config like that before. I might have to do a little C work to support this, let me think it through. Do you have a more complete mcrouter config I can reference?

Right now the default route can only point to a single route handler and I don't think routelib can work around that. I'll double check today though.

@dormando
Copy link
Member

dormando commented Jun 18, 2024

Yeah this'll definitely take a little C but it looks easy. might take a day or two to get it written/tested/merged.

Fwiw I've been trying to not "kitchen sink" the proxy and add some features as-needed. most mcrouter users I've talked to barely used its featureset. So requests like this a very valuable to me, thank you!

@sevrob
Copy link

sevrob commented Jun 18, 2024

Hi again.
Our complete mcrouter setup looks something like this:

"route": {
   "type": "PrefixSelectorRoute",
   "policies": {
     "largekeys": {
        "type": "OperationSelectorRoute",
        "operation_policies": {
        "add": "AllFastestRoute|Pool|large_keys",
        "cas": "AllFastestRoute|Pool|large_keys",
        "delete": "AllFastestRoute|Pool|large_keys",
        "get": "MissFailoverRoute|Pool|large_keys_read",
        "gets": "MissFailoverRoute|Pool|large_keys_read",
        "incr": "AllFastestRoute|Pool|large_keys",
        "decr": "AllFastestRoute|Pool|large_keys",
        "set": "AllFastestRoute|Pool|large_keys"
        }
      },
   },
   "wildcard": { /* all other key/prefixes */
    "type": "OperationSelectorRoute",
      "operation_policies": {
          "add": "AllFastestRoute|Pool|write_all",
          "cas": "AllFastestRoute|Pool|write_all",
          "delete": "AllFastestRoute|Pool|write_all",
          "get": "MissFailoverRoute|Pool|read",
          "gets": "MissFailoverRoute|Pool|read",
          "touch": "AllFastestRoute|Pool|write_all",
          "set": "AllFastestRoute|Pool|write_all",
          "incr": "AllFastestRoute|Pool|write_all",
          "decr": "AllFastestRoute|Pool|write_all"
      }
    }
  }

It's mission is two parts - differ between a "write to all"/"read only from some" (nearest), and also do the same for "large keys" but with a different pools.
There might be better ways to do this, but this have worked fine for us, except for the fact that mcrouter have been acting up with strange memleaks and freezes lately, and it's a pain to build it.

And thanks to you! I'll really appreciate the fix whenever it's ready.

@dormando
Copy link
Member

That looks totally reasonable to me. Will update here when it's ready.

If you run into any other issues (memory/performance/etc) please just let us know asap. The proxy gets a lot of usage now but the total number of users is low so I expect some small undiscovered issues in specific use cases (like this one). Been mostly focusing on polish and usability this year.

@dormando
Copy link
Member

@sevrob any chance you could try out some branches?

Run this memcached PR: memcached/memcached#1148
with this routelib PR: #8

... and let me know if this solves your problem?

I'll probably merge the proxy PR in a day or two (I self-review after letting things sit), but will merge the routelib change after the next tarball release so I can bump the minimum version requirement at the same time. Either way if you try it and give a thumbs up that helps a lot in my confidence of the change :)

@dormando
Copy link
Member

Fwiw with this change your config could be a tiny bit shorter. Set all three of map, cmap and default, but only override the ones you need to. So something like:

    conf = {
        mode = "prefix",
        stop = "/"
    },
    map = {
        -- route handler for path "foo/*"
        specialcase = cmdmap{
            -- this covers all the set commands
            [mcp.CMD_ANY_STORAGE] = route_allfastest{
                children = { "foo" },
            },
            -- specifically override gets
            [mcp.CMD_GET] = route_etc{},
        },
    },
    cmap = {
        [mcp.CMD_GET] = route_etc{},
    },
    -- cover all the set commands with the bottom default
    default = route_allfastest{
        children = { "foo" }
    },
}

@dormando
Copy link
Member

dormando commented Jun 19, 2024

Found a couple bugs just now. force pushed the routelib branch. I also added some aliases to make the command maps less gross:

    conf = {
        mode = "prefix",
        stop = "/"
    },
    map = {
        -- route handler for path "foo/*"
        specialcase = cmdmap{
            -- this covers all the set commands
            all = route_allfastest{
                children = { "foo" },
            },
            -- specifically override gets
            get = route_etc{},
            gets = route_etc{},
        },
    },
    cmap = {
        get = route_etc{},
    },
    -- cover all the set commands with the bottom default
    default = route_allfastest{
        children = { "foo" }
    },
}

@sevrinrobstad
Copy link
Author

Absolutety, I'll have a try and report back later today!

@sevrinrobstad
Copy link
Author

sevrinrobstad commented Jun 19, 2024

So, I have done a bit of testing. The new config is so much cleaner!
But it only "sort of" works for me.
The map "largekeys"-part works fine, it writes and reads to :11250.
But writing any other key seem to hit either the :11240 OR the :11241 backend, not both.
Isn't the "route_allfastest" supposed to "copy request to all children, but return first response" (found in a comment in routelib) ?
So a SET should go to all children, but return OK after the first so the client setting the key doesn't have to wait for all to finish?
Well It doesn't, at least not with my config below.
I might also have misunderstood something :)

The cmap part seem to work fine, as when I try to GET values I have SET through proxy, it only returns the ones that are stored on the "read" backend.

verbose(true)
debug(true)

settings{
    active_request_limit = 100,
    backend_connect_timeout = 3,
}

pools{
    write_all = {
        backends =  {
            "127.0.0.1:11240",
            "127.0.0.1:11241",
        }
    },
    read = {
        backends = {
            "127.0.0.1:11241",
        }
    },
    large_keys = {
        backends = {
            "127.0.0.1:11250",
        }
    },
}

routes{
    conf = {
        mode = "prefix",
        stop = "/"
    },
    map = {
        -- route handler for path "foo/*"
        largekeys = cmdmap{
            -- This covers all the set commands
            all = route_allfastest{
                children = { "large_keys" },
            },
            get = route_failover{
                children = { "large_keys" },
                stats = true,
                miss = true,
                shuffle = true,
            },
            gets = route_failover{
                children = { "large_keys" },
                stats = true,
                miss = true,
                shuffle = true,
            },
        },
    },
-- default handling
    default = route_allfastest{
        children = { "write_all" },
    },
-- override get/gets
    cmap = {
         get = route_failover{
             children = { "read" },
             stats = true,
             miss = true,
             shuffle = true,
         },
         gets = route_failover{
             children = { "read" },
             stats = true,
             miss = true,
             shuffle = true,
         },
    },
}

@dormando
Copy link
Member

The proxy thinks in “pools”, not backends, which is a little different than mcrouter.

if you want to write to two servers, you have to define two pools with one backend in each, then specify both of those named pools as children in the route arguments.

in the most common use cases a set would be copied to multiple pools, so that’s the shorthand here. ie: if you have two pools of 3 backend each in different datacenters, it will copy that key once in each datacenter.

might be nice to get shorthand for “use this list of backends to iterate over” but it’s not planned right now

@sevrinrobstad
Copy link
Author

sevrinrobstad commented Jun 19, 2024

Hmm OK, I see.

Changing the write_all pool to two pools like this:

pools{
    write1 = {
        backends =  {
            "...:11240",
        }
    },
    write2 = {
        backends =  {
            "...:11241",
        }
    },
...

and changing the default like this

    default = route_allfastest{
        children = { "write1",
                     "write2" },
    },

seems to work perfectly!

For my setup the "pools" part is actually redundant then.
A " shorthand for “use this list of backends to iterate over” would solve this.
But it's not a blocker for going forward with memcached proxy as a replacement for mcrouter.

Thank you so much!
I'll report back if we see anything strange.

@dormando
Copy link
Member

Cool!

Yeah; fundamentally memcached's value is adding servers to the cluster increases your memory storage capacity. So just having single servers with everything copied to all of them is usually an antipattern. That said there are still definitely perfectly valid use cases to do so.

Though I've been seeing some mcrouter users copy everything to 20+ servers which is pretty bad for performance :) Splitting the pool up into like 5 servers per pool would cut CPU/bandwidth for sets way down and improve hit ratio.

There is one more trick you can use to keep your config simplified in this case. Also noting this since you have to do the same change for your route_failover configs: you need to specify multiple pools

The trick is pool sets:

pools{
    set_all = {
        zfoo = {
            backends = {
                "127.0.0.1:11322",
            }
        },
        zbar = {
            backends = {
                "127.0.0.1:11323",
            }
        },
        zbaz = {
            backends = {
                "127.0.0.1:11324",
            }
        },
    }
}

-- Then down in routes{}:
        allfastest = route_allfastest{
            -- instead of this...
            -- children = { "foo", "bar", "baz" }
            -- This will be automatically replaced with the (unordered) set of pools
            children = "set_all"
        },

It's not quite perfect since it loses the order. The feature is mostly meant for route_zfailover and the likes, where you replicate sets across N "zones" and want to first try a particular "zone" during reads.

Since you're doing a shuffled failover it'll probably work just fine and help keep the config size down. You could also build the pool_set with a for loop over an array of backends and pass it in; or load from json and do that or etc. depending on how much lua you want to do and how often your config changes.

@sevrinrobstad
Copy link
Author

Just to report back - I deployed memcached proxy with this setup in one of our dev environment last week as a drop in for mcrouter. Noone have noticed yet, that must be a good sign.

@dormando
Copy link
Member

@sevrinrobstad awesome! Thanks for the update. Fwiw I haven't done a performance head-to-head with mcrouter in a long time. Let me know if you notice anything there.

I'll get a new release out with the cmap changes this week.

@dormando
Copy link
Member

Fwiw: 1.6.29 is out and I've merged the routelib branch so you shouldn't need to deploy branches anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants