Community detection for the Twitter follower network of 40 million users using MapReduce


Scalable Community Detection using Label Propagation and Map Reduce

Author: Akshay Bhat 
Contact: akshaybhat [at]

Please visit for more information

Folder				Description
lp				Code for community detection

pre				Code for preprocessing the edge list file

twitter			Code for automating everything for the Twitter dataset
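
For readers unfamiliar with label propagation, the idea behind the lp folder can be sketched locally in plain Python. This is only an illustrative sketch, not the code in this repository: the function names (map_phase, reduce_phase, label_propagation), the treatment of the follower graph as undirected, the synchronous update, and the smallest-label tie-break are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def map_phase(edges, labels):
    """Map step: every node sends its current label to each neighbour.

    Assumption: edges are treated as undirected for this sketch.
    """
    for u, v in edges:
        yield v, labels[u]
        yield u, labels[v]


def reduce_phase(messages, labels):
    """Reduce step: every node adopts the most frequent label it received."""
    grouped = defaultdict(list)
    for node, label in messages:
        grouped[node].append(label)

    new_labels = dict(labels)
    for node, received in grouped.items():
        counts = Counter(received)
        best = max(counts.values())
        # Assumed tie-break: pick the smallest label among the most frequent.
        new_labels[node] = min(l for l, c in counts.items() if c == best)
    return new_labels


def label_propagation(edges, max_iters=10):
    """Repeat map/reduce rounds until labels stop changing or max_iters is hit."""
    nodes = {n for edge in edges for n in edge}
    labels = {n: n for n in nodes}  # every node starts in its own community
    for _ in range(max_iters):
        updated = reduce_phase(map_phase(edges, labels), labels)
        if updated == labels:
            break
        labels = updated
    return labels


if __name__ == "__main__":
    # Two disconnected triangles; nodes sharing a label form one community.
    toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]
    print(label_propagation(toy_edges))
```

On a real cluster each round would be one MapReduce job over the edge list. Synchronous updates like the ones above can oscillate on some graphs, which is why the sketch caps the number of iterations.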

Note that this is experimental code, not a library, so it contains several hacks.

You will need a working Hadoop installation. This code was tested on a cluster running Hadoop 0.19, so it should work with versions 0.19 and later.
You will still need to change the path to the Hadoop streaming jar file to match your installation.
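
If you are unsure where the streaming jar lives on your cluster, a small helper like the one below can search for it. This is a sketch under assumptions: find_streaming_jar is a hypothetical helper (not part of this repository), it assumes the jar file name contains "streaming", and it falls back to /usr/lib/hadoop as a placeholder when the HADOOP_HOME environment variable is not set.

```python
import glob
import os


def find_streaming_jar(hadoop_home):
    """Return the first jar under hadoop_home whose name contains 'streaming'.

    Returns None when no such jar is found. The name pattern is an
    assumption; adjust it if your distribution names the jar differently.
    """
    pattern = os.path.join(hadoop_home, "**", "*streaming*.jar")
    matches = sorted(glob.glob(pattern, recursive=True))
    return matches[0] if matches else None


if __name__ == "__main__":
    # Placeholder default; replace with your actual Hadoop directory.
    home = os.environ.get("HADOOP_HOME", "/usr/lib/hadoop")
    print(find_streaming_jar(home))
```

The printed path is the value to substitute wherever the scripts refer to the streaming jar.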
Download from

Download numeric2users.tar.gz from the above website, extract it, rename the extracted file to Users.txt, and put it outside the LPMR folder. (Sorry if this sounds weird; this will be fixed soon.)

cd into the twitter directory and execute

[you will most likely get errors due to hadoop not being ]

License: Research purpose only