Improve availability by caching system.peers #864

Open
aphyr opened this Issue Feb 13, 2019 · 0 comments

Comments

3 participants
@aphyr

commented Feb 13, 2019

When JVM Cassandra clients connect to YugaByte's CQL interface, they make an immediate, synchronous request to discover other nodes in the cluster by querying the system.peers table. The client caches this information so that it can automatically route requests to other peers, both for load balancing and for fault tolerance.

In YugaByte DB 1.1.13.0-b2 CE, the system.peers table is stored on the masters. As a result, when a given node n cannot reach a master that is connected to a majority of the masters, no client can connect to n. This can happen because n is partitioned away from the masters, because masters are partitioned from each other, because master nodes have crashed, or through some combination of these faults.

While it may be advantageous to perform linearizable reads of the system.peers table in some circumstances, the common case (what clients do automatically at the start of every connection) does not require up-to-date information. Clients already cache the results of their system.peers queries, and even if they receive only a partial list of peers (including the trivial list of only the node they're talking to), that's still enough to begin doing work.

By caching the system.peers table, and/or returning a trivial row for the current node only when no master is available, YugaByte DB could let clients continue to connect and do work during some fault conditions.
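
To make the proposal concrete, here is a minimal sketch of the fallback order described above: try the masters, fall back to the last cached result, and finally fall back to a single row for the current node. It is an illustration only, not YugaByte DB code. The names `PeersCache`, `MasterUnavailable`, and `fetch_from_master`, and the dict used as a row, are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical row shape: a plain dict standing in for one system.peers row.
Peer = Dict[str, str]


class MasterUnavailable(Exception):
    """Hypothetical error: no majority-connected master could be reached."""


class PeersCache:
    """Serve system.peers from the masters when possible, else from a cache.

    Assumption: `fetch_from_master` is a stand-in for whatever call the node
    uses to read system.peers from the masters; it raises MasterUnavailable
    when that read cannot be served.
    """

    def __init__(self, local_peer: Peer,
                 fetch_from_master: Callable[[], List[Peer]]) -> None:
        self._local_peer = local_peer
        self._fetch = fetch_from_master
        self._cached: Optional[List[Peer]] = None

    def peers(self) -> List[Peer]:
        try:
            rows = self._fetch()
        except MasterUnavailable:
            if self._cached is not None:
                # Possibly stale, but good enough to let clients connect.
                return list(self._cached)
            # Nothing cached yet: return the trivial single-node list.
            return [dict(self._local_peer)]
        # Successful read: refresh the cache and return fresh rows.
        self._cached = list(rows)
        return list(rows)
```

A node using something like this would answer a client's initial system.peers query even while partitioned from the masters, at the cost of possibly returning a stale or partial peer list. A real implementation would also need to decide how long a cached result stays usable, which this sketch leaves out.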

@kmuthukk kmuthukk added the enhancement label Mar 1, 2019

@kmuthukk kmuthukk added this to To Do in YBase features via automation Mar 1, 2019

@amitanandaiyer amitanandaiyer added this to To do in Jepsen Testing via automation Mar 6, 2019
