This project implements the online k-clustering algorithm described in this paper (http://cseweb.ucsd.edu/~dasgupta/291/lec6.pdf). It produces a REALTIME k-clustering on an infinite stream of data. It is implemented on top of Twitter Storm and uses Cassandra as its database. It deals with 2-dimensional matrices and clusters in Euclidean space.
Java
src/clustering small fix
README updated readme

README

This project implements the online k-clustering algorithm described in this paper (http://cseweb.ucsd.edu/~dasgupta/291/lec6.pdf). It produces a REALTIME, DISTRIBUTED k-clustering on an infinite stream of data (Yes! You heard it right, it's realtime :-)). It is implemented on top of Twitter Storm and uses Cassandra as its distributed database. It deals with 2-dimensional matrices and clusters in Euclidean space.
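To give a feel for what clustering a stream one item at a time looks like, here is a minimal sketch of a sequential k-means style update on 2-dimensional points. It is an illustration only: the class name OnlineKClusteringSketch, the method names, the seeding from the first k points, and the choice of this particular update rule are all assumptions, and may differ from what the code under src/clustering does. It also leaves out Storm and Cassandra entirely and keeps all state in memory.

```java
// Hypothetical sketch: names and update rule are assumptions, not the project's code.
final class OnlineKClusteringSketch {
    private final int k;
    private final double[][] centers; // k centers, each a point (x, y)
    private final long[] counts;      // number of points absorbed by each center
    private int seeded = 0;           // how many centers have been initialised

    OnlineKClusteringSketch(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
        this.centers = new double[k][2];
        this.counts = new long[k];
    }

    /** Consumes one point from the stream and returns the index of its cluster. */
    int observe(double x, double y) {
        // Assumption: the first k points of the stream seed the k centers.
        if (seeded < k) {
            centers[seeded][0] = x;
            centers[seeded][1] = y;
            counts[seeded] = 1;
            return seeded++;
        }

        // Find the nearest center by squared Euclidean distance.
        int nearest = 0;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < k; i++) {
            double dx = centers[i][0] - x;
            double dy = centers[i][1] - y;
            double dist = dx * dx + dy * dy;
            if (dist < best) {
                best = dist;
                nearest = i;
            }
        }

        // Move that center toward the point so it stays the running mean
        // of every point assigned to it so far.
        counts[nearest]++;
        double step = 1.0 / counts[nearest];
        centers[nearest][0] += step * (x - centers[nearest][0]);
        centers[nearest][1] += step * (y - centers[nearest][1]);
        return nearest;
    }

    /** Returns a copy of the current position of center i. */
    double[] center(int i) {
        return centers[i].clone();
    }
}
```

Each call to observe touches only the k centers and their counters, so memory use does not grow with the length of the stream, which is what makes this kind of update usable on an infinite stream.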
Note: You can read more about Twitter Storm here (https://github.com/nathanmarz/storm/). This project implements the algorithm in local mode and not on an actual cluster, but the same implementation can be ported to an actual cluster with very few changes.