Hadoopy Examples
Brandyn White <bwhite@dappervision.com>

The examples are grouped into 'levels' by their requirements and background. The background indicates what is recommended; don't let it scare you off. Each exercise is listed below with one of 'TODO' (incomplete or not started), 'Untested' (almost done but still rough), or 'Tested' (completed and integrated into the automated tests in test_examples.py).

L0 - Basic examples without Hadoop (everything is run locally)
Requirements: Hadoopy, Python 2.6+, Linux/OS X
Background: Basic knowledge of MapReduce (TODO link to resources), Syntactic understanding of Python
    ex0[Tested]: Wordcount
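
To give a feel for what a locally run wordcount involves, here is a minimal sketch in plain Python. It is not the code shipped in ex0 and it does not import Hadoopy. It assumes the generator-style mapper(key, value) / reducer(key, values) convention, and run_local is a hypothetical helper that imitates the map, shuffle/sort, and reduce phases in one process.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts collected for one word."""
    yield key, sum(values)


def run_local(records):
    """Hypothetical helper: map -> shuffle/sort -> reduce in one process."""
    mapped = [kv for k, v in records for kv in mapper(k, v)]
    mapped.sort(key=itemgetter(0))  # the 'shuffle': bring equal keys together
    output = []
    for word, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reducer(word, (count for _, count in group)))
    return output


if __name__ == "__main__":
    lines = ["a b a", "b c"]
    # Keys are line numbers here; the mapper ignores them.
    for word, count in run_local(enumerate(lines)):
        print("%s\t%d" % (word, count))
```

Sorting before grouping matters: itertools.groupby only merges adjacent equal keys, which mirrors the guarantee a reducer gets from the shuffle phase on a real cluster.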

L1 - Basic examples with Hadoop (interacts with HDFS and Hadoop)
Requirements: L0 + CDH2/3
Background: L0 + basic understanding of Hadoop job execution and HDFS
    ex0[Tested]: Wordcount
    ex1[Tested]: Direct write to HDFS + Wordcount

L2 - Basic examples with Hadoop using a Whirr cluster
Requirements: L0 + Whirr, Amazon AWS Account, a few $
Background: L0 + familiarity with Amazon AWS

L3 - Intermediate examples with Hadoop
Requirements: L1 or L2
Background: L0 + understanding of Hadoop design patterns (TODO link to Jimmy's Book)

L4 - Image processing + Computer Vision with Hadoop
Requirements: (L1 or L2) + Python Imaging Library (PIL), OpenCV
Background: L0 + familiarity with PIL and OpenCV

L5 - Automated parallel job execution with Hadoopy Flow
Requirements: (L1 or L2) + Hadoopy Flow, gevent
Background: L0 + familiarity with 'greenlets', dataflow

L6 - Mixing Java Hadoop code with Hadoopy
Requirements: (L1 or L2) + JDK
Background: L0 + familiarity with Java

L7 - Using Hadoopy with the Oozie job execution engine
Requirements: (L1 or L2) + Oozie
Background: L0 + familiarity with Oozie

L8 - Using Hadoopy with the Avro serialization format
Requirements: (L1 or L2) + Avro
Background: L0 + familiarity with Avro

L9 - Using Hadoopy with the Cassandra database
Requirements: (L1 or L2) + Cassandra
Background: L0 + familiarity with Cassandra