Commit 953c322 by daschl

SPY-176: Enhance redistribution logic and avoid possible deadlocks.

Motivation
----------
Users have reported that redistribution of operations does not work as
expected, especially in authentication scenarios. The causes have been
tracked down and the following changes have been made:

Modifications
-------------

	- With the old redistribute logic, subsequent ops in the retry
	  queue could accidentally get deleted. The ops are now copied
	  first, so this cannot happen anymore.
	- On redistribute, if the handling node is still not set, just
	  clone the operation to avoid NPEs. An op without a node set
	  can occur if it is enqueued for retry because the target node
	  is not yet authenticated.
	- Do not try to add operations to a node which is not yet
	  authenticated. Doing so can lead to costly blocking during
	  redistribution, since redistributions run from the IO thread.
	  Without the change, the IO thread can end up waiting for an auth
	  latch while also being responsible for telling listeners when
	  auth has completed, which locks everything up until the auth
	  latch wait times out.
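
The three modifications above can be sketched roughly as follows. This is
a minimal illustration, not the code from this change: the names
RedistributeSketch, Node, Op, cloneOp, inputQueue and retryQueue are
hypothetical stand-ins, and the real client classes are assumed to be
considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-ins for illustration; these are not the client's classes.
class RedistributeSketch {

  static final class Node {
    final String name;
    volatile boolean authenticated; // set to true once auth has completed
    final Queue<Op> inputQueue = new ConcurrentLinkedQueue<>();

    Node(String name) {
      this.name = name;
    }
  }

  static final class Op {
    final String key;
    final Node handlingNode; // may be null if the op was never assigned

    Op(String key, Node handlingNode) {
      this.key = key;
      this.handlingNode = handlingNode;
    }

    // A clone starts out without a handling node.
    Op cloneOp() {
      return new Op(key, null);
    }
  }

  final Queue<Op> retryQueue = new ConcurrentLinkedQueue<>();

  // Assumed to be called from the IO thread, so it must never block.
  void redistribute(Node target) {
    // Copy first: drain the retry queue into a private list, so that
    // re-enqueueing below cannot disturb the ops still to be processed.
    List<Op> snapshot = new ArrayList<>();
    for (Op op = retryQueue.poll(); op != null; op = retryQueue.poll()) {
      snapshot.add(op);
    }

    for (Op op : snapshot) {
      // Only touch the handling node if one is set; otherwise just clone.
      if (op.handlingNode != null) {
        op.handlingNode.inputQueue.remove(op);
      }
      Op fresh = op.cloneOp();

      if (target.authenticated) {
        target.inputQueue.add(fresh);
      } else {
        // Do not wait for authentication here; park the op for a later retry.
        retryQueue.add(fresh);
      }
    }
  }
}
```

Because the sketch re-enqueues into the same queue it reads from, polling
and re-adding in a single loop would never terminate while the target is
unauthenticated; draining into a snapshot first avoids that.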

Result
------
Much better resilience and performance with redistributions, especially
if authentication takes longer than expected and in scenarios where
operations get redistributed/moved around from within the IO thread.

Change-Id: Icbc5f9e4f568ea885500e8d2baedfa989c8ef801
Reviewed-on: http://review.couchbase.org/38669
Reviewed-by: Matt Ingenthron <matt@couchbase.com>
Reviewed-by: Michael Nitschinger <michael.nitschinger@couchbase.com>
Tested-by: Michael Nitschinger <michael.nitschinger@couchbase.com>