Commit
JCBC-457: Force CCCP config fetching on node reconnect.
Motivation
----------
If a node needs to be reconnected, there is a strong indication that the socket has been closed and this could be because of a topology change.

Modification
------------
If a reconnect is scheduled, make sure it forces a config update. Also, the method is added for memcache buckets to keep the behavior consistent.

Result
------
Quicker detection of topology changes, eventually getting quicker to a valid config state.

Change-Id: I5244dfc6d6f19288977ef98745d47efe25773093
Reviewed-on: http://review.couchbase.org/36779
Reviewed-by: Matt Ingenthron <matt@couchbase.com>
Tested-by: Michael Nitschinger <michael.nitschinger@couchbase.com>
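The diff itself is not shown here, so the following is only a minimal sketch of the idea described in the commit message: scheduling a reconnect also forces a config refresh. All names (`ReconnectSketch`, `ConfigProvider`, `Connection`, `queueReconnect`, `forceConfigRefresh`) are hypothetical and are not taken from the client's source.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReconnectSketch {
    /** Hypothetical stand-in for whatever component fetches the cluster config. */
    interface ConfigProvider {
        void forceConfigRefresh();
    }

    /** Hypothetical connection manager; not the client's real class. */
    static class Connection {
        private final ConfigProvider provider;

        Connection(ConfigProvider provider) {
            this.provider = provider;
        }

        void queueReconnect(String node) {
            // A closed socket hints at a topology change, so ask for a
            // fresh config before anything else.
            provider.forceConfigRefresh();
            // ... the actual reconnect attempt for `node` would be scheduled here ...
        }
    }

    public static void main(String[] args) {
        // Count how many refreshes a single scheduled reconnect triggers.
        AtomicInteger refreshes = new AtomicInteger();
        Connection conn = new Connection(refreshes::incrementAndGet);
        conn.queueReconnect("node-a");
        System.out.println("config refreshes: " + refreshes.get());
    }
}
```

The point of the sketch is only the ordering: the refresh is requested as part of scheduling the reconnect, rather than waiting for a later operation to notice a stale config.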
Showing 2 changed files with 42 additions and 18 deletions.
919ff00
Does this fix the following issue: http://www.couchbase.com/communities/q-and-a/java-client-not-aware-about-failed-over-node#comment-1940 ?
919ff00
@alexo hey. not sure if it does - are you performing load while the node goes down or is there no traffic going through at this point?
919ff00
@daschl there is no traffic going through. The problem is not temporary. Once the node is down and failed over, the client is unable to reconnect, probably because it is not aware of the topology change.
919ff00
@alexo yes, because we are not notified of the socket change if no traffic goes through. You either need to run traffic across it, or you can go back to the non-pull-based config approach (http config), as opposed to the current pull-based one (carrier publication), which was added in 1.4.0.
Once you apply load again, does a new configuration get picked up properly, or does it stay in an unstable state forever? You can try with 1.4.1 as well now: http://search.maven.org/#artifactdetails%7Ccom.couchbase.client%7Ccouchbase-client%7C1.4.1%7Cjar
919ff00
@daschl after the node is down and failed over, the client is trying to get a document (so there is traffic running across it). Still, it fails with a timeout exception.
Could you point out how to use the non-pull-based config approach? I will try the 1.4.1 version, but in case the issue is not fixed, I would like to have a plan B.
919ff00
@alexo try
System.setProperty("cbclient.disableCarrierBootstrap", "true");
before new CouchbaseClient(..)
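A minimal sketch of the ordering suggested above: the property has to be set before the client is constructed. The class name and helper are hypothetical, and the constructor call is left as a comment with placeholder arguments (`nodes`, `"default"`, `""`), so the snippet runs without the client library on the classpath.

```java
public class DisableCarrierBootstrap {
    // Property name as given in the comment above.
    static final String PROPERTY = "cbclient.disableCarrierBootstrap";

    /** Hypothetical helper: call this before creating any CouchbaseClient. */
    static void disableCarrierBootstrap() {
        System.setProperty(PROPERTY, "true");
    }

    public static void main(String[] args) {
        disableCarrierBootstrap();

        // Only now construct the client, for example (placeholder arguments):
        // CouchbaseClient client = new CouchbaseClient(nodes, "default", "");

        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
    }
}
```

Assuming the JVM accepts system properties on the command line as usual, passing `-Dcbclient.disableCarrierBootstrap=true` at startup should have the same effect without a code change.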
919ff00
@daschl Good news. I have tested with the 1.4.1 and the problem seems to be fixed. Thanks!
919ff00
oh great, let me know if you run into troubles :)
919ff00
Is this maybe related to http://www.couchbase.com/communities/q-and-a/java-client-not-aware-failed-over-node-under-certain-circumstances ? The extra detail is that I am reading from a replica while the node is inaccessible. It does not seem to reproduce if I fail over a living node. My test uses iptables to block communication with the node. I then read from the replica and perform the failover when the admin console shows the node as "down". If I continue to read from the replica, it starts to throw an exception once the failover is complete (i.e. I don't get a value from the prior master or the newly promoted replica).
919ff00
Sorry.. just realized where the JIRA for the java client was... made a ticket:
http://www.couchbase.com/issues/browse/JCBC-467