auto-388: connect to other nodes if one node fails #17

Merged
merged 10 commits into master on Aug 1, 2013

Conversation

manishtomar (Contributor)

When a node fails to connect, it tries to connect to all nodes in the cluster and fails only after trying the whole cluster x number of times.

- Tests not yet completed
- Need advice on the code marked REVIEW below:
if client_i >= num_clients:
    if tries >= self.MAX_TRIES:
        return failure
# REVIEW: would like to take reactor as arg but that will change signature of this
Review comment (Contributor) on the REVIEW line above: make the reactor an instance variable passed into __init__; also take max_tries and interval as arguments.
@manishtomar (Contributor Author)

retest this please

Manish Tomar added 2 commits July 5, 2013 00:54
The recent commit 706bab about round robin broke the tests; adjusted them.
Also added docstrings in other places.
dreid (Contributor) commented Jul 10, 2013

  1. I would not conflate retrying to connect to the whole cluster with trying to connect to each seed node in the cluster. I would delegate retrying to a separate wrapper that can be used around the RoundRobinCassandraCluster object or around a regular CQLClient.

  2. This will not connect to each seed; rather, it will make len(seeds) connection attempts, and those attempts will not necessarily cover each seed when multiple requests are in flight.

Here is a sequence diagram for the simple case of a single actor (named bob) talking to the cluster: [Single Actor diagram]

Here is a diagram for the case of two actors (named bob and alice): [Two Actors diagram]

In the second example, bob got cass0 out of the pool, and alice got cass1 while bob was still trying to connect to cass0. When cass0 failed, bob skipped to cass1, which was the only node in this scenario that was serving requests.

This happens because the 'pool' is just a simple counter. Initially, counter=0.

  1. bob got 0 and did (0 + 1) % 3; counter=1
  2. alice got 1 and did (1 + 1) % 3; counter=2
  3. bob got 2 and did (2 + 1) % 3; counter=0
  4. bob got 0 and did (0 + 1) % 3; counter=1

Bob has now tried to connect 3 times, but only to two unique servers.

Maybe in practice, under load with many requests in flight, this comes out in the wash. I don't know, but this algorithm doesn't do exactly what you'd think it would.
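
To make the walkthrough above concrete, here is a minimal, hypothetical sketch of that counter behaviour (an illustration only, not the silverberg code):

class NaiveRoundRobinPool(object):
    # The 'pool' is just a shared counter over the seed indexes.
    def __init__(self, num_seeds):
        self._num_seeds = num_seeds
        self._counter = 0

    def next_index(self):
        index = self._counter
        self._counter = (index + 1) % self._num_seeds
        return index


pool = NaiveRoundRobinPool(3)
bob_1 = pool.next_index()    # bob gets 0; counter=1
alice_1 = pool.next_index()  # alice gets 1; counter=2
bob_2 = pool.next_index()    # bob retries and gets 2; counter=0
bob_3 = pool.next_index()    # bob retries again and gets 0 -- a repeat
assert [bob_1, bob_2, bob_3] == [0, 2, 0]  # three tries, only two unique seeds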

@manishtomar (Contributor Author)

OMG, brilliant catch @dreid. Thanks for catching a bug that would've been super irritating if it had gone through. I haven't gone through the whole comment yet; will read it and let you know. Thanks again.

Manish Tomar added 2 commits July 15, 2013 11:29
Instead, internally increments the client index so as not to accidentally allow some other caller to use the index
while it is hopping between cass nodes. While moving between cass nodes, it sets the index.
dreid (Contributor) commented Jul 15, 2013

OK, so this does two things: it retries requests (if a request can't be fulfilled on clusterA, it retries on clusterA) and it redistributes requests (if a request can't be fulfilled on nodeA, it tries it on nodeB).

I would like to see the request-retrying functionality broken out into a separate wrapper class that can be wrapped around either a CQLClient or a RoundRobinCassandraCluster. That should also make the functionality easier to test. I mentioned this as #1 in my last comment.

In addition, you should add a test case that actually exercises the previous race condition. Ideally you would do this by writing a test that fails with the old implementation at 9c4513c and succeeds with the implementation added in 8d37996.

@manishtomar (Contributor Author)

Would it help to have connecting to the cluster as a separate function that gets retried when the whole-cluster connection fails?

dreid (Contributor) commented Jul 18, 2013

On Composition

from twisted.internet import task
from twisted.internet.error import ConnectError


class IntervalRetryingCQLClient(object):
    def __init__(self, reactor, client, interval, max_retries):
        self._reactor = reactor
        self._client = client
        self._interval = interval
        self._max_retries = max_retries

    def execute(self, query, params, consistency):
        def _maybe_retry(failure, retries):
            # Only retry connection failures; anything else propagates.
            failure.trap(ConnectError)
            retries += 1
            d2 = task.deferLater(self._reactor, self._interval,
                                 self._client.execute, query, params, consistency)
            if retries <= self._max_retries:
                d2.addErrback(_maybe_retry, retries)
            return d2

        d = self._client.execute(query, params, consistency)
        d.addErrback(_maybe_retry, 0)
        return d

What I'm advocating is that we break the functionality of retrying a CQL query out into a separate class that has only that responsibility, something like the above.

This has a few advantages:

  1. It works with CQLClient and RoundRobinCassandraCluster: IntervalRetryingCQLClient(reactor, CQLClient(…), …) and IntervalRetryingCQLClient(reactor, RoundRobinCassandraCluster(…), …)
  2. It's very clear how to disable retrying. (Simply do not instantiate an IntervalRetryingCQLClient.)
  3. It's possible to selectively use retrying for only certain queries. (Simply wrap a long-lived object that has an execute method in an IntervalRetryingCQLClient at the point where you know you want to retry.)
  4. It allows other implementations of retry strategies, such as exponential backoff, to work with the existing classes (see the sketch below).

It's no accident that CQLClient and RoundRobinCassandraCluster have the exact same interface (primarily an execute method.) This was an intentional choice to make it easy to add features (like logging/timing, and retrying) via composition.
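
As an illustration of point 4, a hypothetical exponential-backoff wrapper (the class name and details here are illustrative only, not something in this PR) could expose the same execute method and drop in anywhere IntervalRetryingCQLClient does:

from twisted.internet import task
from twisted.internet.error import ConnectError


class ExponentialBackoffRetryingCQLClient(object):
    def __init__(self, reactor, client, initial_interval, max_retries):
        self._reactor = reactor
        self._client = client
        self._initial_interval = initial_interval
        self._max_retries = max_retries

    def execute(self, query, params, consistency):
        def _maybe_retry(failure, attempt):
            # Only retry connection failures; give up after max_retries.
            failure.trap(ConnectError)
            if attempt > self._max_retries:
                return failure
            delay = self._initial_interval * (2 ** (attempt - 1))  # 1x, 2x, 4x, ...
            d2 = task.deferLater(self._reactor, delay,
                                 self._client.execute, query, params, consistency)
            d2.addErrback(_maybe_retry, attempt + 1)
            return d2

        d = self._client.execute(query, params, consistency)
        d.addErrback(_maybe_retry, 1)
        return d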

There is a pretty good talk from pycon2013 about composition vs inheritance which you may find interesting: http://pyvideo.org/video/1684/the-end-of-object-inheritance-the-beginning-of

If you do not want to do this work in this PR feel free to simply remove the retry functionality you've added to RoundRobinCassandraCluster and ask for another review.

On trying the next server

There is a reasonable concern that simply calling execute on the next CQLClient could cause a non-idempotent query (such as an UPDATE that increments a counter or appends an entry to a list) to be executed multiple times.

A different approach to this problem rather than retrying would be to let the initial query fail and simply blacklist the bad client node for a period of time.

In this way the application would be able to decide if it should retry the query.
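
A rough, hypothetical sketch of that blacklisting approach (the class name and details are illustrative only, not part of this PR): the query that hit the bad node still fails, but that node is skipped for a while, and the application decides whether to retry.

from twisted.internet.error import ConnectError


class BlacklistingCassandraCluster(object):
    def __init__(self, reactor, clients, blacklist_period):
        self._reactor = reactor                # provides seconds()
        self._clients = clients                # list of CQLClient-like objects
        self._blacklist_period = blacklist_period
        self._blacklisted_until = {}           # client -> timestamp
        self._index = 0

    def _next_client(self):
        # Round-robin over clients, skipping any that are currently blacklisted.
        # Assumes at least one client.
        now = self._reactor.seconds()
        client = None
        for _ in range(len(self._clients)):
            client = self._clients[self._index]
            self._index = (self._index + 1) % len(self._clients)
            if self._blacklisted_until.get(client, 0) <= now:
                return client
        return client  # everything is blacklisted; fall back to the last one

    def execute(self, query, params, consistency):
        client = self._next_client()

        def _blacklist(failure):
            failure.trap(ConnectError)
            self._blacklisted_until[client] = (
                self._reactor.seconds() + self._blacklist_period)
            return failure  # the query still fails; the caller may retry it

        d = client.execute(query, params, consistency)
        d.addErrback(_blacklist)
        return d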

Manish Tomar added 4 commits July 31, 2013 17:23
- removed clock and other retry args.
- updated tests accordingly

Yet to add a test for the race condition @dreid mentioned.
Conflicts:
	silverberg/cluster.py
	silverberg/test/test_cluster.py
dreid (Contributor) commented Aug 1, 2013

+1

manishtomar added a commit that referenced this pull request Aug 1, 2013
AUTO-388: connect to other nodes if one node fails
manishtomar merged commit 94a63c0 into master Aug 1, 2013