MEDIUM: server: Implement bounded-load hash algorithm

The consistent hash lookup is done as normal, then if balancing is
enabled, we progress through the hash ring until we find a server that
doesn't have "too much" load. In the case of equal weights for all
servers, the allowed number of requests for a server is either the
floor or the ceil of (num_requests * hash-balance-factor / 100 / num_servers),
where hash-balance-factor is a percentage;
with unequal weights things are somewhat more complicated, but the
spirit is the same -- a server should not be able to go too far above
(its relative weight times) the average load. Using the hash ring to
make the second/third/etc. choice maintains as much locality as
possible given the load limit.

Signed-off-by: Andrew Rodland <andrewr@vimeo.com>
arodland authored and Willy Tarreau committed Oct 25, 2016
1 parent 13d5ebb commit 4f88c636097bf5f7651c790700a8bf3fb82e5f67
Showing with 46 additions and 1 deletion.
  1. +46 −1 src/lb_chash.c
@@ -241,6 +241,34 @@ static void chash_update_server_weight(struct server *srv)
 	srv_lb_commit_status(srv);
 }
 
+/*
+ * This function implements the "Consistent Hashing with Bounded Loads" algorithm
+ * of Mirrokni, Thorup, and Zadimoghaddam (arxiv:1608.01350), adapted for use with
+ * unequal server weights.
+ */
+int chash_server_is_eligible(struct server *s)
+{
+	/* The total number of slots to allocate is the total number of outstanding requests
+	 * (including the one we're about to make) times the load-balance-factor, rounded up.
+	 */
+	unsigned tot_slots = ((s->proxy->served + 1) * s->proxy->lbprm.chash.balance_factor + 99) / 100;
+	unsigned slots_per_weight = tot_slots / s->proxy->lbprm.tot_weight;
+	unsigned remainder = tot_slots % s->proxy->lbprm.tot_weight;
+
+	/* Allocate a whole number of slots per weight unit... */
+	unsigned slots = s->eweight * slots_per_weight;
+
+	/* And then distribute the rest among servers proportionally to their weight. */
+	slots += ((s->cumulative_weight + s->eweight) * remainder) / s->proxy->lbprm.tot_weight
+		- (s->cumulative_weight * remainder) / s->proxy->lbprm.tot_weight;
+
+	/* But never leave a server with 0. */
+	if (slots == 0)
+		slots = 1;
+
+	return s->served < slots;
+}
+
 /*
  * This function returns the running server from the CHASH tree, which is at
  * the closest distance from the value of <hash>. Doing so ensures that even
@@ -254,6 +282,7 @@ struct server *chash_get_server_hash(struct proxy *p, unsigned int hash)
 	struct server *nsrv, *psrv;
 	struct eb_root *root;
 	unsigned int dn, dp;
+	int loop;
 
 	if (p->srv_act)
 		root = &p->lbprm.chash.act;
@@ -287,7 +316,23 @@ struct server *chash_get_server_hash(struct proxy *p, unsigned int hash)
 	dp = hash - prev->key;
 	dn = next->key - hash;
 
-	return (dp <= dn) ? psrv : nsrv;
+	if (dp <= dn) {
+		next = prev;
+		nsrv = psrv;
+	}
+
+	loop = 0;
+	while (p->lbprm.chash.balance_factor && !chash_server_is_eligible(nsrv)) {
+		next = eb32_next(next);
+		if (!next) {
+			next = eb32_first(root);
+			if (++loop > 1) // protection against accidental loop
+				break;
+		}
+		nsrv = eb32_entry(next, struct tree_occ, node)->server;
+	}
+
+	return nsrv;
 }
 
 /* Return next server from the CHASH tree in backend <p>. If the tree is empty,

2 comments on commit 4f88c63

@SolbiatiAlessandro

replied Jan 28, 2019

this commit is mentioned by arodland in his talk at Google Cloud: https://www.youtube.com/watch?v=jk6oiBJxcaA

@arodland

Contributor Author

replied Jan 28, 2019

Also in a later talk at Velocity NYC 2017, captured at https://vimeo.com/240695782 if anyone is keeping count.
