An implementation and example of Partial Key Grouping for Apache Storm. Partial Key Grouping is a load balancing strategy for distributed stream processing systems.
Hadoop code for "Document Similarity Self-Join With Mapreduce" (ICDM'10)
Crawler for the social network of Twitter
Forked from s4/s4
Streaming Similarity Self Join
Tools of the trade