Astyanax client accessing cross datacenter on CL_LOCAL_QUORUM for write & read. #268

Closed
repanag opened this issue Apr 4, 2013 · 5 comments


@repanag

repanag commented Apr 4, 2013


I have a multi-datacenter setup with 2 nodes in DC1 and 1 node in DC2, with the keyspace and AstyanaxContext configured as shown below. When the client runs from DC1, the request sometimes goes to DC2 (based on the host from result.getHost()). I expected CL_LOCAL_QUORUM to always stay in the local datacenter. Am I configuring something wrong? Does this have something to do with RING_DESCRIBE?

CREATE KEYSPACE grd WITH replication = {
'class': 'NetworkTopologyStrategy',
'DC1': '1',
'DC2': '1'
};

        context = new AstyanaxContext.Builder()
                .forCluster(clusterName)
                .forKeyspace(keySpaceName)

                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()      
                    .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                    .setCqlVersion(cqlVersion)
                    .setTargetCassandraVersion(cassandraVersion)
                    .setDefaultReadConsistencyLevel(ConsistencyLevel.CL_LOCAL_QUORUM)
                    .setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_LOCAL_QUORUM)
                )
                .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
                    .setPort(cassandraRpcPort)
                    .setMaxConnsPerHost(10)
                    .setSeeds(cassSeedHosts)
                    .setConnectTimeout(connectTimeout)
                    .setSocketTimeout(readTimeout)
                )
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());

The host the request went to is captured using:
OperationResult<CqlResult<UUID, String>> result = pcqlQuery.execute();
logInfo.append("{ host= " + result.getHost());

thanks,
srrepaka

@elandau
Contributor

elandau commented Apr 4, 2013

You need to tell the client which DC it is in. Try adding this to the ConnectionPoolConfigurationImpl:

.setLocalDatacenter("DC1")

This will cause the client to filter out the nodes from the other DC.
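
For reference, here is the connection pool block from the snippet above with that call added (everything else unchanged); a sketch, not a full configuration:

                .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
                    .setPort(cassandraRpcPort)
                    .setMaxConnsPerHost(10)
                    .setSeeds(cassSeedHosts)
                    .setConnectTimeout(connectTimeout)
                    .setSocketTimeout(readTimeout)
                    // Filter discovered hosts down to the client's own datacenter
                    .setLocalDatacenter("DC1")
                )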

@repanag
Author

repanag commented Apr 4, 2013

Thanks for the quick response.

If we set the DC, will it fail if both nodes in that DC are down?

Also, looking into this: with the following settings, will the client mostly connect to the local DC but still reach the other DC if the local DC's nodes are down? (A sketch of how they'd be wired in follows the list.)
.setLatencyAwareUpdateInterval(10000) // Will resort hosts per token partition every 10 seconds
.setLatencyAwareResetInterval(10000) // Will clear the latency every 10 seconds
.setLatencyAwareBadnessThreshold(0.50) // Will re-sort hosts when a host is more than 50% slower than the best and always assign connections to the fastest host, otherwise will use round robin
.setLatencyAwareWindowSize(100) // Uses last 100 latency samples
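
For context, a sketch of how those settings would sit alongside the datacenter filter on the same ConnectionPoolConfigurationImpl (values copied from the list above; the thresholds are illustrative, not recommendations):

                new ConnectionPoolConfigurationImpl("MyConnectionPool")
                    .setSeeds(cassSeedHosts)
                    .setLocalDatacenter("DC1")              // restrict discovery to the local DC
                    .setLatencyAwareUpdateInterval(10000)   // re-sort hosts per token partition every 10 seconds
                    .setLatencyAwareResetInterval(10000)    // clear accumulated latency scores every 10 seconds
                    .setLatencyAwareBadnessThreshold(0.50)  // re-sort when a host is more than 50% slower than the best
                    .setLatencyAwareWindowSize(100)         // score over the last 100 latency samples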

@elandau
Contributor

elandau commented Apr 4, 2013

Yes, if all nodes in that DC are down the client will fail. I'm curious why you chose such a configuration; it's not very fault tolerant. Our most basic deployment is a minimum of 3 nodes per DC with a replication factor of 3. That setup lets you lose 1 node without any impact on quorum. If you're OK with consistency level ONE, then you could have 2 nodes per DC.

@repanag
Author

repanag commented Apr 8, 2013

Sorry for the late response; I had to move to some other high-priority work.

Let me make sure I understand what you're saying. We currently have this configuration:

Config: replication factor 1, consistency level LOCAL_QUORUM, 2 nodes per DC
replication_factor: 1 (per DC) --> one replica is maintained per DC.
nodes: 2 per DC.
local_quorum: 1 (a quorum is calculated as (replication_factor / 2) + 1, rounded down to a whole number)
Fault tolerance: if one node is down we might have inconsistent data, but reads/writes would still go through since the local quorum is 1.

Config: replication factor 2, consistency level LOCAL_QUORUM, 2 nodes per DC
replication_factor: 2 (per DC) --> two replicas are maintained per DC.
nodes: 2 per DC.
local_quorum: 2 (same formula as above)
Fault tolerance: if one node is down the data is still replicated, but reads/writes will have an issue since the local quorum is 2.

Config: replication factor 3, consistency level LOCAL_QUORUM, 3 nodes per DC
replication_factor: 3 (per DC) --> three replicas are maintained per DC.
nodes: 3 per DC.
local_quorum: 2 (same formula as above)
Fault tolerance: if one node is down we still have consistent data, and reads/writes should also go through fine.
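
As a sanity check on the arithmetic above, a small Java sketch of the quorum formula (the class and method names are just for illustration; integer division does the rounding down):

    public class QuorumMath {
        // quorum = (replication_factor / 2) + 1, rounded down to a whole number
        static int localQuorum(int replicationFactorPerDc) {
            return (replicationFactorPerDc / 2) + 1; // integer division already floors
        }

        public static void main(String[] args) {
            System.out.println(localQuorum(1)); // 1 -> RF 1 per DC
            System.out.println(localQuorum(2)); // 2 -> RF 2 per DC, so one node down blocks LOCAL_QUORUM
            System.out.println(localQuorum(3)); // 2 -> RF 3 per DC, so one node can be down
        }
    }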

Correct me if my understanding is wrong (and sorry for going off topic here). Is this what you're saying: that we should go with LOCAL_QUORUM and a replication factor of 3?

-srrepaka

@sagarl
Contributor

sagarl commented Feb 13, 2014

Looks like the issue has already been answered. Please reopen if you have any questions.

@sagarl sagarl closed this as completed Feb 13, 2014