Skip to content
This repository has been archived by the owner on Feb 17, 2022. It is now read-only.

Commit

Permalink
SPY-176: Enhance redistribution logic and avoid possible deadlocks.
Browse files Browse the repository at this point in the history
Motivation
----------
There have been issues reported that redistribution of operations does
not work as expected, especially with authentication scenarios. This
has been tracked down and the following changes have been made:

Modifications
-------------

	- With the old redistribute logic, it could happen that subsequent
	  ops in the retry queue got accidentally deleted. With the copy
	  first, this cannot happen anymore.
	- On redistribute, if the handling node is still not set, just
	  clone the operation to avoid NPEs. A op without a node set
	  can happen if it is enqueued to retry because the target node
	  is not yet authenticated.
	- Do not try to add operations to a node which is not yet authen
	  ticated. This can lead to costly locks with redistributions since
	  they are run from the IO thread. Without the change, it can happen
	  that the IO thread waits for an auth latch, but is also responsible
	  for telling listeners when auth has completed, therefore
	  locking everything up until the auth latch wait runs out of time.

Result
------
Much better resilience and performance with redistributions, especially
if authentication takes longer than expected and from scenarios where the
operations get redistributed/moved around from within the IO thread.

Change-Id: Icbc5f9e4f568ea885500e8d2baedfa989c8ef801
Reviewed-on: http://review.couchbase.org/38669
Reviewed-by: Matt Ingenthron <matt@couchbase.com>
Reviewed-by: Michael Nitschinger <michael.nitschinger@couchbase.com>
Tested-by: Michael Nitschinger <michael.nitschinger@couchbase.com>
  • Loading branch information
daschl authored and Michael Nitschinger committed Jun 24, 2014
1 parent ca3c8d4 commit 953c322
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions src/main/java/net/spy/memcached/MemcachedConnection.java
Expand Up @@ -473,8 +473,9 @@ private void handleOperationalTasks() throws IOException {
}

if (!retryOps.isEmpty()) {
redistributeOperations(new ArrayList<Operation>(retryOps));
ArrayList<Operation> operations = new ArrayList<Operation>(retryOps);
retryOps.clear();
redistributeOperations(operations);
}

handleShutdownQueue();
Expand Down Expand Up @@ -1042,7 +1043,7 @@ public void redistributeOperation(Operation op) {

// The operation gets redistributed but has never been actually written,
// it we just straight re-add it without cloning.
if (op.getState() == OperationState.WRITE_QUEUED) {
if (op.getState() == OperationState.WRITE_QUEUED && op.getHandlingNode() != null) {
addOperation(op.getHandlingNode(), op);
return;
}
Expand Down Expand Up @@ -1257,6 +1258,10 @@ public void insertOperation(final MemcachedNode node, final Operation o) {
* @param o the operation to add.
*/
protected void addOperation(final MemcachedNode node, final Operation o) {
if (!node.isAuthenticated()) {
retryOperation(o);
return;
}
o.setHandlingNode(node);
o.initialize();
node.addOp(o);
Expand Down

0 comments on commit 953c322

Please sign in to comment.