router: introduce router.callbro and callbre
Load-balancing is a common need for cluster-wide tasks.
Sometimes heavy read-only operations are required, for example
with Vinyl, and nobody wants to send them all to one instance,
whether it is the master or the nearest replica.

This commit introduces automatic load-balancing at the user's
request. The router can't balance read-write requests because it
officially does not support master-master. Read-only requests
can be balanced in two ways: over the whole replicaset, or over
slaves only if at least one is available.

Closes #168

@TarantoolBot document
Title: [vshard] document router.callbro, callbre, and new router.call option

VShard router now supports load-balancing. Only read-only requests
can be balanced because VShard officially does not support
master-master. It is possible to balance requests over the whole
replicaset, or prefer slaves. The load-balancing is a simple
round-robin.

To balance over the whole replicaset, use vshard.router.callbro.
To prefer slaves, use vshard.router.callbre.

Callbre balances over slaves only, but if there are no active
slaves, then it uses master.
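
The selection rules described above can be modelled in plain Lua. This is only a sketch under stated assumptions, not the vshard implementation: `new_replicaset`, `next_replica`, `pick_bro`, `pick_bre` and the `is_master` / `is_alive` fields are hypothetical names invented for illustration.

```lua
-- Hypothetical model of a replicaset. `replicas` is an array of
-- tables {name = string, is_master = boolean, is_alive = boolean}.
local function new_replicaset(replicas)
    return {replicas = replicas, cursor = 0}
end

-- Advance the round-robin cursor and return the next alive replica
-- accepted by `accept`, or nil if a full circle finds none.
local function next_replica(rs, accept)
    local n = #rs.replicas
    for _ = 1, n do
        rs.cursor = rs.cursor % n + 1
        local r = rs.replicas[rs.cursor]
        if r.is_alive and accept(r) then
            return r
        end
    end
    return nil
end

local function any(_) return true end
local function is_slave(r) return not r.is_master end

-- callbro-like choice: balance over every alive node, master included.
local function pick_bro(rs)
    return next_replica(rs, any)
end

-- callbre-like choice: balance over alive slaves, fall back to the
-- master only when no slave is alive.
local function pick_bre(rs)
    return next_replica(rs, is_slave) or next_replica(rs, any)
end

-- Usage: one master and two slaves, as in the test cluster.
local rs = new_replicaset({
    {name = 'box_1_a', is_master = true,  is_alive = true},
    {name = 'box_1_b', is_master = false, is_alive = true},
    {name = 'box_1_c', is_master = false, is_alive = true},
})
print(pick_bro(rs).name, pick_bre(rs).name)
```

A failed full circle leaves the cursor where it started, so the fallback to the master does not disturb the rotation. When no node is alive both functions return nil, which the caller would have to turn into an error.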

Also, a new option is added to the second version of
vshard.router.call: balance = boolean. Example:

    vshard.router.call(bucket_id, {mode = 'read',
                                   balance = true,
                                   prefer_replica = true},
                       func, args, opts)
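
One way to read the option table is as a selector for the existing call flavours. The sketch below is a guess at that mapping, not code from the router: `select_call` is a hypothetical helper, and it assumes the preference option is spelled `prefer_replica`, as in the test file of this commit, and that any mode other than 'read' means a read-write call.

```lua
-- Hypothetical helper: map router.call options to the name of the
-- call flavour that would serve the request.
local function select_call(opts)
    if type(opts) == 'string' then
        -- Assumption: a bare mode string is shorthand for {mode = ...}.
        opts = {mode = opts}
    end
    opts = opts or {}
    if opts.mode ~= 'read' then
        -- Read-write requests always go to the master, never balanced.
        return 'callrw'
    end
    if opts.prefer_replica then
        return opts.balance and 'callbre' or 'callre'
    end
    return opts.balance and 'callbro' or 'callro'
end

print(select_call({mode = 'read', balance = true, prefer_replica = true}))
```

Under these assumptions the example above would behave like vshard.router.callbre, and `{mode = 'read', balance = true}` like vshard.router.callbro.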
Gerold103 committed Mar 11, 2019
1 parent 79e6507 commit e3f9891
Showing 5 changed files with 335 additions and 17 deletions.
190 changes: 190 additions & 0 deletions test/router/complex_call.result
@@ -32,6 +32,7 @@ vshard.router.bootstrap()
--
-- gh-157: introduce vshard.router.callre to prefer slaves to
-- execute a user defined function.
-- gh-168: load-balancing.
--
vshard.router.callre(1, 'echo', {'ok'})
---
@@ -61,12 +62,89 @@ echo_count
echo_count = 0
---
...
-- Basic load-balancing.
_ = test_run:switch('router_2')
---
...
for i = 1, 12 do vshard.router.callbro(1, 'echo', {'ok'}) end
---
...
for i = 1, 3 do rs:callbro('echo', {'ok'}) end
---
...
for i = 1, 3 do vshard.router.call(1, {mode = 'read', balance = true}, 'echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_a')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('box_1_b')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('box_1_c')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('router_2')
---
...
-- Prefer slaves, but still balance.
for i = 1, 10 do vshard.router.callbre(1, 'echo', {'ok'}) end
---
...
for i = 1, 2 do rs:callbre('echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_b')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('box_1_c')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
-- Now turn down some of the nodes - balancers and slave-lovers
-- should not try to visit them.
_ = test_run:switch('router_2')
---
...
_ = test_run:cmd('stop server box_1_b')
---
...
-- One slave and one master are alive. This call should visit only
-- the slave.
vshard.router.callre(1, 'echo', {'ok'})
---
- ok
@@ -84,9 +162,56 @@ echo_count = 0
_ = test_run:switch('router_2')
---
...
-- Just balance over two nodes. It does not matter who is slave,
-- and who is not.
for i = 1, 12 do vshard.router.callbro(1, 'echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_a')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('box_1_c')
---
...
echo_count
---
- 6
...
echo_count = 0
---
...
_ = test_run:switch('router_2')
---
...
-- Only one slave is alive - not much space to balance.
for i = 1, 10 do vshard.router.callbre(1, 'echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_c')
---
...
echo_count
---
- 10
...
echo_count = 0
---
...
_ = test_run:switch('router_2')
---
...
_ = test_run:cmd('stop server box_1_c')
---
...
-- When all the slaves are down, only master can be used. Even if
-- a caller prefers slaves.
vshard.router.callre(1, 'echo', {'ok'})
---
- ok
@@ -101,6 +226,71 @@ echo_count
echo_count = 0
---
...
_ = test_run:switch('router_2')
---
...
for i = 1, 3 do vshard.router.callbro(1, 'echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_a')
---
...
echo_count
---
- 3
...
echo_count = 0
---
...
_ = test_run:switch('router_2')
---
...
for i = 1, 3 do vshard.router.callbre(1, 'echo', {'ok'}) end
---
...
_ = test_run:switch('box_1_a')
---
...
echo_count
---
- 3
...
echo_count = 0
---
...
--
-- What if everything is down? Router should not hang at least.
--
_ = test_run:switch('router_2')
---
...
_ = test_run:cmd('stop server box_1_a')
---
...
_, err = vshard.router.callre(1, 'echo', {'ok'})
---
...
err ~= nil
---
- true
...
_, err = vshard.router.callbre(1, 'echo', {'ok'})
---
...
err ~= nil
---
- true
...
_, err = vshard.router.callbro(1, 'echo', {'ok'})
---
...
err ~= nil
---
- true
...
_ = test_run:cmd('start server box_1_a')
---
...
_ = test_run:cmd('start server box_1_b')
---
...
77 changes: 77 additions & 0 deletions test/router/complex_call.test.lua
@@ -12,6 +12,7 @@ vshard.router.bootstrap()
--
-- gh-157: introduce vshard.router.callre to prefer slaves to
-- execute a user defined function.
-- gh-168: load-balancing.
--
vshard.router.callre(1, 'echo', {'ok'})
vshard.router.call(1, {mode = 'read', prefer_replica = true}, 'echo', {'ok'})
@@ -22,20 +23,96 @@ _ = test_run:switch('box_1_b')
echo_count
echo_count = 0

-- Basic load-balancing.
_ = test_run:switch('router_2')
for i = 1, 12 do vshard.router.callbro(1, 'echo', {'ok'}) end
for i = 1, 3 do rs:callbro('echo', {'ok'}) end
for i = 1, 3 do vshard.router.call(1, {mode = 'read', balance = true}, 'echo', {'ok'}) end
_ = test_run:switch('box_1_a')
echo_count
echo_count = 0
_ = test_run:switch('box_1_b')
echo_count
echo_count = 0
_ = test_run:switch('box_1_c')
echo_count
echo_count = 0
_ = test_run:switch('router_2')

-- Prefer slaves, but still balance.
for i = 1, 10 do vshard.router.callbre(1, 'echo', {'ok'}) end
for i = 1, 2 do rs:callbre('echo', {'ok'}) end
_ = test_run:switch('box_1_b')
echo_count
echo_count = 0
_ = test_run:switch('box_1_c')
echo_count
echo_count = 0

-- Now turn down some of the nodes - balancers and slave-lovers
-- should not try to visit them.

_ = test_run:switch('router_2')
_ = test_run:cmd('stop server box_1_b')
-- One slave and one master are alive. This call should visit only
-- the slave.
vshard.router.callre(1, 'echo', {'ok'})
_ = test_run:switch('box_1_c')
echo_count
echo_count = 0

_ = test_run:switch('router_2')
-- Just balance over two nodes. It does not matter who is slave,
-- and who is not.
for i = 1, 12 do vshard.router.callbro(1, 'echo', {'ok'}) end
_ = test_run:switch('box_1_a')
echo_count
echo_count = 0
_ = test_run:switch('box_1_c')
echo_count
echo_count = 0

_ = test_run:switch('router_2')
-- Only one slave is alive - not much space to balance.
for i = 1, 10 do vshard.router.callbre(1, 'echo', {'ok'}) end
_ = test_run:switch('box_1_c')
echo_count
echo_count = 0

_ = test_run:switch('router_2')
_ = test_run:cmd('stop server box_1_c')
-- When all the slaves are down, only master can be used. Even if
-- a caller prefers slaves.
vshard.router.callre(1, 'echo', {'ok'})
_ = test_run:switch('box_1_a')
echo_count
echo_count = 0

_ = test_run:switch('router_2')
for i = 1, 3 do vshard.router.callbro(1, 'echo', {'ok'}) end
_ = test_run:switch('box_1_a')
echo_count
echo_count = 0

_ = test_run:switch('router_2')
for i = 1, 3 do vshard.router.callbre(1, 'echo', {'ok'}) end
_ = test_run:switch('box_1_a')
echo_count
echo_count = 0

--
-- What if everything is down? Router should not hang at least.
--
_ = test_run:switch('router_2')
_ = test_run:cmd('stop server box_1_a')
_, err = vshard.router.callre(1, 'echo', {'ok'})
err ~= nil
_, err = vshard.router.callbre(1, 'echo', {'ok'})
err ~= nil
_, err = vshard.router.callbro(1, 'echo', {'ok'})
err ~= nil

_ = test_run:cmd('start server box_1_a')
_ = test_run:cmd('start server box_1_b')
_ = test_run:cmd('start server box_1_c')

6 changes: 4 additions & 2 deletions test/router/router.result
@@ -1102,11 +1102,13 @@ error_messages
- Use replicaset:callre(...) instead of replicaset.callre(...)
- Use replicaset:connect_replica(...) instead of replicaset.connect_replica(...)
- Use replicaset:down_replica_priority(...) instead of replicaset.down_replica_priority(...)
- Use replicaset:call(...) instead of replicaset.call(...)
- Use replicaset:callrw(...) instead of replicaset.callrw(...)
- Use replicaset:callbro(...) instead of replicaset.callbro(...)
- Use replicaset:connect_all(...) instead of replicaset.connect_all(...)
- Use replicaset:call(...) instead of replicaset.call(...)
- Use replicaset:connect(...) instead of replicaset.connect(...)
- Use replicaset:up_replica_priority(...) instead of replicaset.up_replica_priority(...)
- Use replicaset:connect_all(...) instead of replicaset.connect_all(...)
- Use replicaset:callbre(...) instead of replicaset.callbre(...)
...
_, replica = next(replicaset.replicas)
---