klbostee edited this page Sep 14, 2010 · 55 revisions

Dumbo is a Python module that makes it easy to write and run Hadoop streaming programs. It is named after Disney’s flying circus elephant: Hadoop’s logo is an elephant, and Python was named after the BBC series “Monty Python’s Flying Circus”.
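
To give a feel for what such a program looks like, here is a minimal word-count sketch. The `mapper` and `reducer` generators follow the key/value shape commonly used in Dumbo examples. The `run_locally` helper is hypothetical and not part of Dumbo: it only imitates the map, sort, and reduce phases in plain Python so the sketch can be tried without a Hadoop cluster. On a real installation the two functions would instead be handed to Dumbo (typically via `dumbo.run(mapper, reducer)`), which is assumed here rather than shown.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    # key: position of the line in the input (unused here); value: the line text.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # key: a word; values: an iterable of the counts emitted for that word.
    yield key, sum(values)


def run_locally(lines):
    # Hypothetical helper, NOT part of Dumbo: imitates map -> sort -> reduce
    # in-process so the mapper and reducer can be tried without Hadoop.
    mapped = [pair
              for offset, line in enumerate(lines)
              for pair in mapper(offset, line)]
    mapped.sort(key=itemgetter(0))  # stands in for Hadoop's shuffle/sort step
    results = []
    for word, group in groupby(mapped, key=itemgetter(0)):
        results.extend(reducer(word, (count for _, count in group)))
    return results


if __name__ == "__main__":
    for word, count in run_locally(["a b a", "b c"]):
        print("%s\t%d" % (word, count))
```

Keeping the mapper and reducer as ordinary generator functions means they can be unit-tested on a laptop before being submitted as a streaming job.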

Quick Installation

wget http://github.com/klbostee/dumbo/tarball/release-0.16
tar zxvf klbostee-dumbo*
cd klbostee-dumbo*
sudo ant install_pymod

Or, with git:

git clone git://github.com/klbostee/dumbo.git
cd dumbo
sudo ant install_pymod

Documentation