MapReduce Design Patterns code examples
Latest commit ebfd679 @cfeduke updated README formatting

Mr. D. Patterns

Err uh... MapReduce Design Patterns.

As I work through the MapReduce Design Patterns book I need a place to stash my source code. This is it.

I stayed moderately true to the book's examples, with some rearrangement here and there. Most notably, MRDPUtils#transformXmlToMap calls StringEscapeUtils#unescapeHtml itself, so mappers that need unescaped text don't have to do it separately.
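To give a feel for that change, here is a minimal sketch of a row parser that unescapes values as it builds the map. It is not the code from this repository: the class name XmlRowSketch and the helper unescapeBasic are hypothetical, the input is assumed to be a single-line `<row ... />` element as found in the StackOverflow dump, and unescapeBasic is a hand-rolled stand-in that only covers the five predefined XML entities, whereas the real StringEscapeUtils#unescapeHtml handles the full HTML entity set.

```java
import java.util.HashMap;
import java.util.Map;

class XmlRowSketch {

    // Hypothetical stand-in for StringEscapeUtils#unescapeHtml.
    // Only the five predefined XML entities are handled. "&amp;" is
    // replaced last so that "&amp;lt;" becomes "&lt;" rather than "<".
    static String unescapeBasic(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    // Turns one line such as <row Id="7" Body="..." /> into a map of
    // attribute names to unescaped attribute values.
    static Map<String, String> transformXmlToMap(String xml) {
        Map<String, String> map = new HashMap<>();
        String row = xml.trim();
        if (!row.startsWith("<row ") || !row.endsWith("/>")) {
            return map; // not a data row (e.g. the XML header), so skip it
        }
        // Drop the leading "<row " and trailing "/>", then split on quotes:
        // even tokens are names like ` Id=`, odd tokens are the values.
        String[] tokens = row.substring(5, row.length() - 2).split("\"");
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            String key = tokens[i].trim();
            if (key.endsWith("=")) {
                key = key.substring(0, key.length() - 1);
            }
            // Unescaping happens here, once, instead of in every mapper.
            map.put(key, unescapeBasic(tokens[i + 1]));
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = transformXmlToMap(
                "  <row Id=\"7\" Body=\"&lt;p&gt;Hello&lt;/p&gt;\" />");
        System.out.println(parsed.get("Id"));   // prints 7
        System.out.println(parsed.get("Body")); // prints <p>Hello</p>
    }
}
```

With the unescaping inside the parser, a mapper can read parsed.get("Body") directly and get usable text.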

To use

$ mvn package

I've placed a bunch of scripts in the ./bin/ directory. These make a few terrible assumptions about your environment; you can change the scripts in ./bin/ to be more accommodating.

  1. There is a $HADOOP_HOME, even though it's deprecated
  2. $DATADIR is set to /Users/$USER/Downloads/stack-overflow-dump-2009-09
  3. You have the CC data dump from StackOverflow (I used 2009 because it's smallish; you should be able to use any year)
  4. The launch scripts assume single node mode

Make sure Hadoop is running ($HADOOP_HOME/bin/) and execute the script of your choice.
