Note: this is a straight fork of https://github.com/tuulos/disco with the Disco Python libraries moved to the repository root so that the package can be installed with pip.
Disco - Massive data, Minimal code
Disco is an implementation of the Map-Reduce framework for distributed computing. Like the original framework, which was publicized by Google, Disco supports parallel computations over large data sets on an unreliable cluster of computers. This makes it well suited to analyzing and processing large data sets without having to deal with the hard technical problems of distributed computing, such as communication protocols, load balancing, locking, job scheduling, and fault tolerance, all of which are taken care of by Disco.
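To make the Map-Reduce model concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Disco's API: the names map_words, reduce_counts and run_local are hypothetical, chosen for illustration only. It only shows the shape of the two functions a Map-Reduce job supplies. Everything a framework like Disco adds on top (distribution across machines, scheduling, fault tolerance) is deliberately left out.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Reduce phase: combine all counts emitted for a single word."""
    return word, sum(counts)


def run_local(lines):
    """Run both phases in one process (illustrative stand-in for a cluster)."""
    # Group the mapped pairs by key, as a framework would do between phases.
    grouped = defaultdict(list)
    for line in lines:
        for word, count in map_words(line):
            grouped[word].append(count)
    # Apply the reduce function once per distinct key.
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_local(["massive data minimal code", "Massive data"]))
```

In this sketch the map and reduce functions never share state, which is the property that lets a framework run them in parallel on different machines and rerun them after a failure.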
See discoproject.org for more information.