bench_mapr

Benchmark Python MapReduce implementations, following A Guide to Python Frameworks and the HiBench Hadoop benchmark suite.

Current status

As of 20140814T030000Z:

Only the wordcount and sort benchmarks have been implemented, and only the Disco and Hadoop Streaming Python examples have been completed.
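For readers unfamiliar with the wordcount benchmark, the following is a minimal, framework-free sketch of its map and reduce steps. It is not the code in this repository: the function names `map_words`, `reduce_counts`, and `wordcount` are hypothetical, and the in-memory `sorted()` call only stands in for the shuffle and sort that Disco or Hadoop Streaming would perform between the two steps.

```python
from itertools import groupby
from operator import itemgetter


def map_words(lines):
    """Map step: emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts per word. Pairs must arrive sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def wordcount(lines):
    """Run map, then sort (a stand-in for the framework's shuffle), then reduce."""
    return list(reduce_counts(sorted(map_words(lines))))


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    for word, count in wordcount(["a b a", "b c"]):
        print("%s\t%d" % (word, count))
```

In a real Hadoop Streaming job the mapper and reducer would typically be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output, so the sketch above only illustrates the logic being benchmarked.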

This repository is a follow-up to a post on the Disco user group: https://groups.google.com/forum/#!topic/disco-dev/u3EsnGgLOPM

Versions used:

  • Python v2.7 from the ContinuumIO Anaconda Python distribution
  • Disco v0.4.4 from the ContinuumIO Anaconda Python distribution
  • Hadoop v2.3.0-cdh5.0.3 from the Cloudera distribution