
backup nodes in subscriptions #179

Closed
prymitive opened this Issue Mar 9, 2013 · 15 comments

Contributor

prymitive commented Mar 9, 2013

Could we have backup subscriptions in (raw|http|fast)router that would work like the haproxy backup option?

We would need to add a new flag to --subscribe2 with the syntax backup=[0,1] (0 by default):

uwsgi [...] --subscribe2 server=192.168.0.100:2626,key=uwsgi.it,addr=192.168.0.10:3031,backup=1

All subscribed nodes with backup=1 would not get any traffic as long as there are other subscribed nodes with backup=0. Once only backup nodes are subscribed, we load balance using all of them.

This way one could set up a vassal with static files and use it as a backup for the main ruby/python/whatever app; if that app goes down due to an error or maintenance, we would fall back to the backup vassal instead of serving a lot of 502s.

Will old subscription servers (without support for the backup flag) still handle such subscriptions? Will they just ignore this flag, or will they raise an error and reject our subscription request?

Owner

unbit commented Mar 9, 2013

It is something I have been thinking about for some time (like subscribing to specific paths). Definitely needed. Currently you can only define backup nodes statically with the --fallback-to option.

Contributor

prymitive commented Mar 9, 2013

It shouldn't be much work; we wouldn't even need to alter the load-balancing algos much. Just do a normal node-picking run (including only backup=0 nodes), and if it doesn't return any node, retry with use_backup=1, in which case only backup nodes are included.
I can make a patch for review.
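
The two-pass picking described above can be sketched roughly like this (a hypothetical node layout and a trivial first-match pick standing in for the real lb algos, not the actual uWSGI corerouter code):

```python
def pick_node(nodes, use_backup=0):
    """Return the first alive node whose backup flag matches use_backup.

    A real router would run the configured lb algo (wrr, lrc, ...) over
    the filtered set instead of taking the first match.
    """
    for node in nodes:
        if node["backup"] == use_backup and node["alive"]:
            return node
    return None

def pick(nodes):
    # normal run: consider only primary (backup=0) nodes
    node = pick_node(nodes, use_backup=0)
    if node is None:
        # retry, this time considering only backup nodes
        node = pick_node(nodes, use_backup=1)
    return node
```

So the backup nodes only ever see traffic when the primary run comes up empty.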

Contributor

prymitive commented Mar 9, 2013

I forgot about retries and some other cases; I would need to check how and where they are handled.

Contributor

prymitive commented Mar 10, 2013

I did a little bit here: prymitive/uwsgi@6a6f3fa

I'll play with it tomorrow and see if it works

Contributor

prymitive commented Mar 10, 2013

note to self - check if the backup field types are right (uint8_t vs int) in the subscription code

Contributor

prymitive commented Mar 10, 2013

Maybe instead of backup we could have priorities in routers?
0 - highest priority
biggest value - lowest priority

The code would essentially be the same as for backup, just instead of [0-1] values we could use any number.
We first try to find a node with prio=0; if no such node can be picked and we have nodes with a higher prio value, then we retry with them.

But this would have limited use; the backup feature is plain and clear - all primary nodes are down, so we use a backup.
With priorities we could only have more levels of backup: since all requests would go to the highest-priority nodes unless they are all dead, this would not be a way to load balance among a group of servers.
It would just be a way to implement how haproxy behaves with multiple backup nodes and the allbackups option disabled.

I wouldn't go this way since it complicates the logic and adds little value, but it felt worth mentioning.

Contributor

prymitive commented Mar 10, 2013

I'll get back to it tomorrow

Contributor

prymitive commented Mar 10, 2013

If we stick with it as a way to have multiple levels of backup, then it won't complicate things much; it will be the same code with just a few additional ifs. I'll add that.

Owner

unbit commented Mar 10, 2013

So, each node will have a "prio" field in addition to "weight". The load-balancing algos would need to take it into account. Maybe we can pass a "retry" argument to their function, so each algo can implement its own usage of the "prio" field.

The basic case (wrr) would be to take into account only nodes with priority == retry.

Contributor

prymitive commented Mar 10, 2013

It depends on what you are aiming for. If we stop at a backup field allowing any numeric value, then no; the lb algo would pick a working node from the group of nodes with the lowest backup field value, so that you have:

nodeA - backup: 0
nodeB - backup: 1
nodeC - backup: 2

if nodeA is working we pick A
if nodeA is down but nodeB is working we pick B
if nodeA and nodeB are down but nodeC is working we pick C

So you have more than one level of fallback, for example:

nodeA - fully working ruby app using the db
nodeB - minimal static version of the app, generated once in a while from the db
nodeC - very minimal static page saying that the site is down or in maintenance

This would work the same as haproxy with multiple backup nodes and allbackups disabled
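
The multi-level fallback above can be sketched like this (hypothetical data layout; in the actual patch this lives inside the lb algos):

```python
def pick_lowest_level(nodes):
    """Pick from the alive nodes carrying the lowest backup value.

    nodeA (backup=0) wins while alive; when it dies, nodeB (backup=1)
    takes over, then nodeC (backup=2) - mirroring haproxy with multiple
    backup servers and allbackups disabled.
    """
    alive = [n for n in nodes if n["alive"]]
    if not alive:
        return None
    lowest = min(n["backup"] for n in alive)
    candidates = [n for n in alive if n["backup"] == lowest]
    # a real router would apply the configured lb algo over candidates;
    # for this sketch just take the first one
    return candidates[0]
```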

Contributor

prymitive commented Mar 10, 2013

ps. each nodeN in the examples above might actually be a few nodes, and we just pick one of them based on the lb algo

Owner

unbit commented Mar 10, 2013

Ok, taking again the default case of wrr:

    if (retry == 0) {
        check only for nodes with prio = 0
    }
    else {
        check for nodes with prio > last_check+1
        ... and increment last_check if the node failed
    }

the retry argument of the algo could be increased every time to simplify things

Contributor

prymitive commented Mar 10, 2013

the retry argument of the algo could be increased every time to simplify things

Yes, but when do we stop retrying?

Look at prymitive/uwsgi@328bc02

In have_backup I store the lowest backup field value that is also bigger than the current backup level we are checking.
If we don't pick any node but have_backup > 0, then we know there are nodes we can fall back to,
so we retry, looking for nodes with backup level == have_backup.

This way we don't need to retry a bunch of times if we have a node with backup=0 and another with backup=99.
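
In pseudocode, that have_backup bookkeeping works roughly like this (a Python sketch for illustration, not the actual C from the commit):

```python
def pick_with_have_backup(nodes, level=0):
    """One picking pass at a given backup level, tracking the next level.

    have_backup records the smallest backup value greater than the current
    level seen among the nodes, so a failed run can jump straight from
    backup=0 to backup=99 without iterating through levels 1..98.
    """
    have_backup = 0
    for node in nodes:
        if node["backup"] == level:
            if node["alive"]:
                return node
        elif node["backup"] > level:
            if have_backup == 0 or node["backup"] < have_backup:
                have_backup = node["backup"]
    if have_backup > 0:
        # nodes exist at a deeper backup level: retry there
        return pick_with_have_backup(nodes, level=have_backup)
    return None
```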

We could move this outside of the lb algos, but wouldn't that require more logic and more math to do the same thing?
On one hand lb algos should be clean and simple, so NOT doing backup logic there is good, and it means more DRY code.
On the other hand, doing this inside the lb algos doesn't add much complexity; the backup logic blends in pretty nicely, and as a result there is less overhead.

I leave it up to you

Contributor

prymitive commented Mar 10, 2013

My patch is done and working:

router:

./uwsgi -M --http :8080 --http-subscription-server :2500 --http-stats :4444 --subscription-algo lrc
[...]
[uwsgi-subscription for pid 11226] new pool: localhost (hash key: 6666)
[uwsgi-subscription for pid 11226] localhost => new node: 127.0.0.1:1111 (backup: 0)
[uwsgi-subscription for pid 11226] localhost => new node: 127.0.0.1:1112 (backup: 99)

primary node:

./uwsgi -s 127.0.0.1:1111 -M --subscribe2 server=127.0.0.1:2500,key=localhost,addr=127.0.0.1:1111 --wsgi-file primary.py

primary.py:

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['primary\n']

secondary node:

./uwsgi -s 127.0.0.1:1112 -M --subscribe2 server=127.0.0.1:2500,key=localhost,addr=127.0.0.1:1112,backup=99 --wsgi-file secondary.py

secondary.py:

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['secondary\n']

Then just call curl -H "Host: localhost" localhost:8080; it will always return "primary", and once the primary node is stopped it will return "secondary".

I tested all the algos and multiple backup combinations, and they all fall back properly.
For your consideration

Contributor

prymitive commented Mar 11, 2013

ps. I'm not sure if the naming is right

@prymitive prymitive added a commit to prymitive/uwsgi that referenced this issue Apr 19, 2013

@prymitive prymitive backup subscriptions in routers, fixes #179 ec489b5

@unbit unbit closed this Jul 12, 2014
