Fork of Hustle, originally developed at Chango.
A column oriented, embarrassingly distributed, relational event database.


  • column oriented - super fast queries
  • events - write only semantics
  • distributed insert - designed for petabyte scale distributed datasets with massive write loads
  • compressed - bitmap indexes, lz4, and prefix trie compression
  • relational - join gigantic data sets
  • partitioned - smart shards
  • embarrassingly distributed (based on Disco)
  • embarrassingly fast (uses LMDB)
  • NoSQL - Python DSL
  • bulk append only semantics
  • highly available, horizontally scalable
  • REPL/CLI query interface
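
To give a feel for why bitmap indexes pair well with column oriented event data, here is a minimal sketch in plain Python. It is not Hustle's implementation: the `BitmapIndex` class, the `rows` helper, the `date` column name, and the sample values are all hypothetical and exist only for illustration. The idea is that each distinct column value gets a bitmask of the rows holding it, so a compound filter reduces to bitwise operations.

```python
from collections import defaultdict


class BitmapIndex(object):
    """Toy bitmap index (hypothetical): one integer bitmask per distinct value."""

    def __init__(self, values):
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(values):
            # Set bit `row` in the bitmap that belongs to `value`.
            self.bitmaps[value] |= 1 << row

    def eq(self, value):
        """Bitmap of rows where column == value."""
        return self.bitmaps.get(value, 0)

    def lt(self, bound):
        """Bitmap of rows where column < bound (OR of the matching bitmaps)."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if value < bound:
                result |= bitmap
        return result


def rows(bitmap):
    """Expand a bitmap into an ascending list of row numbers."""
    out = []
    row = 0
    while bitmap:
        if bitmap & 1:
            out.append(row)
        bitmap >>= 1
        row += 1
    return out


# Two columns of a made-up events table, stored column by column.
ad_id = [30010, 30011, 30010, 30010, 30012]
date = ['2014-01-10', '2014-01-11', '2014-01-12', '2014-01-13', '2014-01-09']

ad_index = BitmapIndex(ad_id)
date_index = BitmapIndex(date)

# (date < '2014-01-13') & (ad_id == 30010) becomes a single bitwise AND.
matches = rows(date_index.lt('2014-01-13') & ad_index.eq(30010))
print(matches)
```

A real index would compress the bitmaps and keep them on disk; the sketch only shows why combining predicates is cheap once each column has its own index.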

Example Query

select(impressions.ad_id, ..., h_sum(pix.amount), h_count(),
       where=((... < '2014-01-13') & (impressions.ad_id == 30010),
              ... < '2014-01-13'),
       join=(impressions.site_id, pix.site_id),
       ...)
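
One way to read a query of this shape is as a SQL-style filter, join, and aggregate. The plain-Python sketch below follows that reading over tiny in-memory tables. It is an interpretation for illustration, not a description of how Hustle executes the query: the sample rows are invented, the `date` field name is an assumption, and the result is grouped by `ad_id` only.

```python
from collections import defaultdict

# Invented sample rows; the `date` field name is an assumption.
impressions = [
    {'ad_id': 30010, 'site_id': 1, 'date': '2014-01-10'},
    {'ad_id': 30010, 'site_id': 2, 'date': '2014-01-12'},
    {'ad_id': 30011, 'site_id': 1, 'date': '2014-01-10'},
    {'ad_id': 30010, 'site_id': 1, 'date': '2014-01-14'},
]
pix = [
    {'site_id': 1, 'amount': 5, 'date': '2014-01-11'},
    {'site_id': 1, 'amount': 7, 'date': '2014-01-12'},
    {'site_id': 2, 'amount': 3, 'date': '2014-01-15'},
]

cutoff = '2014-01-13'

# where: each table is filtered by its own clause.
left = [r for r in impressions
        if r['date'] < cutoff and r['ad_id'] == 30010]
right = [r for r in pix if r['date'] < cutoff]

# join: match the filtered rows on site_id.
by_site = defaultdict(list)
for r in right:
    by_site[r['site_id']].append(r)

# Aggregate per ad_id: [sum of pix.amount, count of joined rows].
totals = defaultdict(lambda: [0, 0])
for imp in left:
    for p in by_site.get(imp['site_id'], []):
        totals[imp['ad_id']][0] += p['amount']
        totals[imp['ad_id']][1] += 1

result = sorted((k, v[0], v[1]) for k, v in totals.items())
print(result)
```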


After cloning this repo, here are some considerations:

  • you will need Python 2.7 or higher - note that it probably won't work on 2.6 (has to do with pickling lambdas...)
  • you need to install Disco 0.5 and its dependencies - get that working first
  • you need to install Hustle and its 'deps' thusly:
cd hustle
sudo ./

Please refer to the Installation Guide for more details.


Hustle User Guide

Hustle Mailing List


Special thanks to the following open-source projects:

