cassandra-stress is modeled after the stress.py script in Apache Cassandra's source distribution.
cassandra-stress is built on top of Hector, a well-tested and widely deployed Java client for Apache Cassandra. The benefits of using Hector for a tool like this are many:
- JMX hooks into Hector for visibility into runtime statistics
- Extensive configurability of which node(s) to target
- Verbose logging output available
- Hooks into performance counters to correllate client and server performance
cassandra-stress is a currently just a command line app which supports the following commands (syntactically identical to stress.py where possible):
usage: stress [options]... url1,[[url2],[url3],...] -b,--batch-size <arg> The number of rows in the batch_mutate call -c,--columns <arg> The number of columsn to create per key -m,--unframed Disable use of TFramedTransport -n,--num-keys <arg> The number of keys to create -o,--operation <arg> One of insert, read, rangeslice, multiget -t,--threads <arg> The number of client threads to create -h,--help Print this help message and exit
This project now runs as a stand-alone binary. If you have a binary version, use the command line arguments directly:
sh bin/stress -o read -b 50 -n 12000 localhost:9160
This will perform the operation then drop you into a shell in which you can perform additional operations. The command syntax is similar to the initial call, except that the operation is the only argument:
[cassandra-stress] usage: operation [options] operation can be one of: insert, read, rangeslice, multiget, replay [N] -b,--batch-size <arg> The number of rows in the batch_mutate call -c,--columns <arg> The number of columsn to create per key -h,--help Print this help message and exit -m,--unframed Disable use of TFramedTransport -n,--num-keys <arg> The number of keys to create -t,--threads <arg> The number of client threads we will create
To re-run the operation again, just type the operation name. So from the initial example above, re-running the read operation with the same parameters would be:
The main benefit of going into a shell is that it keeps the JVM and connection pool machinery warm, so additional test runs will not have the overhead of spinning everything up. This has the added benefit of providing conitnuous access to Hector's JMX stats as well.
Todo (in rough priority)
- Create ColumnFamily specifically for test
- Add JMX innards for MBean driven command/control
- Add replay N times functionality from shell
- Add (much) better key distribution (gaussian, stdev argument)
- SuperColumn support
Offered under an MIT license (see LICENSE in the top level directory). As with any halfway decent load testing tool, you can generate a lot of load and put a system under duress or even cause it to fail. You are solely responsible for any issues experienced the execution of this tool may cause. Use at your own risk.