Demo: implementing speculative execution policy in Azure Managed Instance for Apache Cassandra.

Sometimes a Cassandra node experiences difficulties (for example, a long GC pause or a reboot) and takes longer than usual to reply, so queries sent to that node see high latency. To reduce the impact, the driver can pre-emptively start a second execution of the query against another node before the first node has replied or errored out. If the second node replies faster, its response is returned to the client and the first execution is cancelled. Note that “cancelling” in this context simply means discarding the response when it arrives later; Cassandra does not support cancellation of in-flight requests.
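The idea can be illustrated without a cluster. The following is a minimal sketch using only the Java standard library, not the driver's implementation: `queryNode`, `execute`, and the node names are hypothetical stand-ins, and node latency is simulated with a timer.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SpeculativeExecutionSketch {

    // Hypothetical stand-in for a node: replies with its own name after latencyMs.
    static CompletableFuture<String> queryNode(ScheduledExecutorService timer, String node, long latencyMs) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        timer.schedule(() -> reply.complete(node), latencyMs, TimeUnit.MILLISECONDS);
        return reply;
    }

    // Queries node1 immediately; if no reply has arrived after delayMs, also queries node2.
    // Returns the name of whichever node replies first.
    static String execute(long node1LatencyMs, long node2LatencyMs, long delayMs) {
        ScheduledExecutorService timer = Executors.newScheduledThreadPool(2);
        try {
            CompletableFuture<String> result = new CompletableFuture<>();
            // Initial execution. A late reply is ignored because result is already complete.
            queryNode(timer, "node1", node1LatencyMs).thenAccept(result::complete);
            // Speculative execution, started only if the first node is still silent.
            timer.schedule(() -> {
                if (!result.isDone()) {
                    queryNode(timer, "node2", node2LatencyMs).thenAccept(result::complete);
                }
            }, delayMs, TimeUnit.MILLISECONDS);
            return result.join();
        } finally {
            timer.shutdownNow(); // drop any simulated replies still pending
        }
    }

    public static void main(String[] args) {
        // Degraded first node: the speculative execution is expected to win.
        System.out.println(execute(2000, 20, 200));
        // Healthy first node: it replies before the delay, so node2 is never queried.
        System.out.println(execute(10, 20, 200));
    }
}
```

In the first call the client waits roughly 220 ms (the delay plus the second node's latency) rather than 2 seconds, which is the effect the sample measures against a real cluster.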

This sample loads data into a Cassandra table and artificially degrades the performance of a single node in the cluster to demonstrate the benefits of configuring speculative-execution-policy in version 4 of the Cassandra Java driver.

IMPORTANT - In this sample all requests are explicitly flagged as idempotent using setIdempotent(true) (see read and write methods in UserRepository.java). If a query is not explicitly defined as idempotent, the driver will never schedule speculative executions for it, even if the policy is configured, because there is no way to guarantee that only one node will apply the mutation (since in-flight requests are never cancelled). Consider query idempotency carefully in your applications, and ensure the setting is applied where appropriate.
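To see why idempotency matters here, consider what happens when both the initial and the speculative execution end up being applied. The sketch below is a simplified illustration using an in-memory map rather than Cassandra or the driver API; `applySet`, `applyIncrement`, and the row key are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotenceSketch {

    // Hypothetical row key in an in-memory stand-in for a table.
    private static final String ROW_ID = "user-1";

    // Models an assignment such as "SET score = 42": repeating it leaves the same value.
    // Assumes executions >= 1.
    static int applySet(int executions) {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; i < executions; i++) {
            table.put(ROW_ID, 42);
        }
        return table.get(ROW_ID);
    }

    // Models a counter update such as "SET visits = visits + 1": every application
    // changes the stored value. Assumes executions >= 1.
    static int applyIncrement(int executions) {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; i < executions; i++) {
            table.merge(ROW_ID, 1, Integer::sum);
        }
        return table.get(ROW_ID);
    }

    public static void main(String[] args) {
        System.out.println("set applied twice:       " + applySet(2));       // 42
        System.out.println("increment applied once:  " + applyIncrement(1)); // 1
        System.out.println("increment applied twice: " + applyIncrement(2)); // 2
    }
}
```

The assignment gives the same result however many times it is applied, so it is safe to mark as idempotent. The increment does not, so a speculative execution could leave the data in a state the application never asked for.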

In a real application that implements a speculative execution policy, you should not have a CustomLoadBalancingPolicy.java as shown in this sample; it is only used here to artificially degrade the performance of one node from the client side. If you use this sample as the basis for an app, remove CustomLoadBalancingPolicy.java from the project, along with the reference to it in java-examples/src/main/resources/application.conf.

Prerequisites

  • Before you can run this sample, you must have the following:
    • An Apache Cassandra cluster and network access to it. See the portal quickstart for Azure Managed Instance for Apache Cassandra.
    • Java Development Kit (JDK) 1.8+
      • On Ubuntu, run apt-get install default-jdk to install the JDK.
    • Be sure to set the JAVA_HOME environment variable to point to the folder where the JDK is installed.
    • Download and install a Maven binary archive
      • On Ubuntu, you can run apt-get install maven to install Maven.
    • Git
      • On Ubuntu, you can run sudo apt-get install git to install Git.

Running this sample

  1. Clone this repository using git clone https://github.com/Azure-Samples/azure-cassandra-mi-java-v4-speculative-execution

  2. Update parameters in java-examples/src/main/resources/application.conf:

    1. Enter the datacenter name in the DC field.
    2. Enter username and password in datastax-java-driver.advanced.auth-provider section, and the IP addresses of your cluster seed nodes in datastax-java-driver.basic.contact-points.
    3. Choose one node whose performance will be artificially degraded by the app, and enter the IP address of that node in nodeToDegrade.
  3. Run mvn clean package from the java-examples folder to build the project. This will generate cassandra-mi-load-tester-1.0.0-SNAPSHOT.jar under the target folder.

  4. Run java -jar target/cassandra-mi-load-tester-1.0.0-SNAPSHOT.jar in a terminal to start your Java application. Initially this will run without a speculative execution policy. It will create a keyspace and user table, load 50 records, and then read those records, measuring the p50, p99, and min/max latencies. You should see quite high latencies for p99 and max (along with messages that the selected node is degraded):

    Run 1

  5. Next, review the content of the speculative-execution-policy section in java-examples/src/main/resources/application.conf. Notice that the line class = ConstantSpeculativeExecutionPolicy is commented out. While this line is commented out, the default class NoSpeculativeExecutionPolicy is used. Uncomment class = ConstantSpeculativeExecutionPolicy to enable speculative execution.

  6. Build and run the application again. You should see significantly lower p99 and max latencies, because when the initial node does not respond within a set delay, the driver speculatively queries other nodes while still waiting for the first response (see below). The maximum number of executions (including the initial one) and the time to wait before starting each additional execution are set by max-executions and delay respectively.

    Run 2
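
For reference, the speculative-execution-policy section discussed in steps 5 and 6 might look roughly like the fragment below once the class line is uncommented. This is a sketch based on the driver's standard configuration keys; the values shown are illustrative assumptions and may differ from those in this sample's application.conf.

```hocon
datastax-java-driver {
  advanced.speculative-execution-policy {
    class = ConstantSpeculativeExecutionPolicy

    # Maximum number of executions, including the initial one (illustrative value).
    max-executions = 3

    # Time to wait before starting each additional execution (illustrative value).
    delay = 100 milliseconds
  }
}
```

With these values, a query whose first node stays silent would be sent to a second node after 100 ms and to a third after 200 ms, and the first response to arrive is used.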
