Klaas Bosteels (author)
Mon Dec 08 01:02:24 -0800 2008
commit e465a2f88be763539eb723d133962977692e9c91
tree 8e6e7998187ea8ecf57a721f5e5465729a182ef2
parent e4af69153a2109ecaae6db67c5eb5d586ce2c30c
dumbo /
| name | age | message |
|---|---|---|
| README | | |
| build-pymod.xml | Tue Nov 04 03:28:22 -0800 2008 | |
| build.xml | | |
| examples/ | | |
| src/ | | |
README
DESCRIPTION
"""""""""""

Originally, Dumbo was just a simple Python module that made writing and
running Streaming programs very easy, but now it also consists of some
helper code in Java. More generally, Dumbo can be considered to be a
convenient Python API for writing MapReduce programs.

INSTALLATION
""""""""""""

The Java code gets built together with the rest of Hadoop when the "dumbo/"
directory is put in Hadoop's "src/contrib/", and the Python module can be
installed by running

    sudo ant -f build-pymod.xml install_pymod

in the "src/contrib/dumbo" directory. If the dir "dumbo/" is a subdir of
Hadoop's "src/contrib/", then the -f option can be omitted:

    sudo ant install_pymod

USAGE
"""""

    /usr/local/hadoop/bin/hadoop dfs -put examples/brian.txt brian.txt
    python examples/wordcount.py -hadoop /path/to/hadoop \
        -file excludes.txt -input brian.txt -output brian-wc
    python -m dumbo cat brian-wc > brian-wc.txt

MORE INFO
"""""""""

http://github.com/klbostee/dumbo/wikis
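
The USAGE commands above run a word count job, but the README does not show what such a program looks like. The following is a minimal sketch of the mapper/reducer idea only. It is not the repository's examples/wordcount.py, it ignores the excludes.txt file, and `run_locally` is a hypothetical helper that imitates Hadoop's group-by-key step in memory so the sketch runs without Hadoop or Dumbo installed.

```python
from collections import defaultdict


def mapper(key, value):
    # key: position of the input line (unused here); value: one line of text.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # values: every count emitted by the mapper for this word.
    yield key, sum(values)


def run_locally(lines, mapper, reducer):
    """Hypothetical stand-in for the MapReduce shuffle: map every line,
    group the emitted pairs by key, then reduce each group in key order."""
    grouped = defaultdict(list)
    for offset, line in enumerate(lines):
        for k, v in mapper(offset, line):
            grouped[k].append(v)
    results = []
    for k in sorted(grouped):
        results.extend(reducer(k, grouped[k]))
    return results


if __name__ == "__main__":
    # Assumption: in a real Dumbo script the two functions would be handed
    # to the Dumbo module instead of to this local driver.
    sample = ["always look on the bright side", "the bright side of life"]
    for word, count in run_locally(sample, mapper, reducer):
        print(word, count)
```

Writing the mapper and reducer as plain generator functions keeps them easy to test on a few lines of text before submitting a job to a cluster.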








