- elephant-bird 2 Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
- cascading 1 Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on a Hadoop cluster.
- DataflowPythonSDK 1 Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
- zkclient 1 A ZooKeeper client that makes life a little easier.
- hadoop-lzo 1 Patched, refactored version of code.google.com/hadoop-gpl-compression for Hadoop 0.20.