A complete data analysis and management environment for Apache Cassandra




VERSION 0.0.18a

Thu Apr 6 12:21:46 BST 2012

This project initially aims to provide a management interface to Cassandra and to delve deep into the JMX infrastructure, providing real-time health monitoring of the Cassandra/JVM internals by polling Python 'bean-like' attributes. A benchmarking system is also envisioned, using statistical methods for schema/data optimisation, together with real-time graphing of individual node structures and benchmarks. In other words: real-time analysis of incoming data and of actual performance, with future integration of a Hadoop-based MapReduce system and Storm.

Looking ahead, the system is intended to allow deep configuration of Apache Cassandra, similar to, say, MS SQL Server Management Studio, including node management, keyspace management and real-time analysis.

It will be based around a per-node daemon and a tornado/django (client + server) design; however, we are still experimenting with different technologies. (At the moment we are experimenting with web2py.)

tornado + django or web2py + flot + ganglia + sshpt + jpype + pyjmx + pyYAML (configuration)

tornado : non-blocking host for rendering django (mycassandramanager) [tornado <- django(site/dir)]

mycassandramanager : combining ClusterSSH for actual node management, e.g. running nodetool on all hosts to compact: mycassandramanager.nodetool(ring0live, compact)
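The fan-out behind a call like mycassandramanager.nodetool(ring0live, compact) could look roughly like the sketch below. It only builds the per-host ssh command lines and does not run anything. The function name `build_nodetool_commands`, the `ssh_user` parameter and the host names are hypothetical and not part of this project; the only real commands assumed are `ssh` and `nodetool compact`.

```python
import shlex


def build_nodetool_commands(hosts, action, ssh_user=None):
    """Return one ssh argument list per host that runs `nodetool <action>` remotely.

    Hypothetical helper for illustration; not the project's actual implementation.
    """
    commands = []
    for host in hosts:
        # Prefix the login name only when one was supplied.
        target = f"{ssh_user}@{host}" if ssh_user else host
        commands.append(["ssh", target, "nodetool", action])
    return commands


# Hypothetical ring definition: the host names are placeholders.
ring0live = ["cass-node1", "cass-node2", "cass-node3"]

for argv in build_nodetool_commands(ring0live, "compact"):
    # Print the command that would be executed on each node.
    print(shlex.join(argv))
```

Each argument list can be handed to `subprocess.run` as-is, which avoids shell quoting problems when host names or actions come from configuration.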

tornado + httpclient + myjmxhandler : 'daemon'-like tool for monitoring the local node and presenting that data to mycassandramanager: tornado.httpclient(localnode) -> mycassandramanager

my{jmx,cassandra,mysql,json}handler : using jpype and pyjmx to report in real time what a node is doing: mydatagatherer.jmx.watchvariables(mytornado.httpclient())
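As a rough illustration of polling 'bean-like' attributes, the sketch below samples a fixed set of named attributes through a caller-supplied fetch function. Everything in it is an assumption for illustration: the class `AttributeWatcher`, the attribute names and the in-memory stand-in for a node are hypothetical, and no real jpype or pyjmx calls are shown. In a real handler the fetch function would wrap the JMX lookup.

```python
import time


class AttributeWatcher:
    """Poll a fixed set of named attributes through a caller-supplied fetch function.

    Hypothetical sketch; the real handlers would fetch values over JMX.
    """

    def __init__(self, fetch, names):
        self.fetch = fetch        # callable: attribute name -> current value
        self.names = list(names)  # attributes to sample on every poll

    def poll_once(self):
        # Take one snapshot of every watched attribute.
        return {name: self.fetch(name) for name in self.names}

    def watch(self, samples, interval=0.0):
        # Collect `samples` snapshots, sleeping `interval` seconds after each one.
        history = []
        for _ in range(samples):
            history.append(self.poll_once())
            time.sleep(interval)
        return history


# Stand-in for a node: placeholder attribute names and values, not real MBean names.
_fake_node = {"HeapMemoryUsed": 512, "PendingTasks": 0}


def fake_fetch(name):
    return _fake_node[name]


watcher = AttributeWatcher(fake_fetch, ["HeapMemoryUsed", "PendingTasks"])
print(watcher.watch(samples=2))
```

Keeping the fetch function separate from the polling loop means the same watcher could be pointed at JMX, Cassandra or MySQL sources without changes to the loop itself.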

mystresshandler : using custom pystress code to stress test a new (or newly upgraded) cluster: mycassandramanager.stresstest(ring0live)

flot : real-time graphing, used with tornado + django as a 'web administrator' tool: mycassandramanager(sites/flot)

ganglia : long-term node monitoring; possibly using cacti instead (as I have Cassandra cacti templates in src/templates/graphing that are compatible with the 1.0 series): mycassandramanager(sites/ganglia)

Contributors welcome. This is a brand-new pre-alpha project in a state of flux.


Instigated by Chris Cheyne 01NOV2011:1306