Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

app3/src/main/sh
avro
book/src/test
ch02/src/main
ch03/src/main
ch04/src/main
ch05/src/main
ch06/src/main
ch07/src/main/examples
ch08/src/main
ch09/src/main
ch11/src/main
ch12/src/main
ch13/src/main/java
ch14/src/main
ch15/src/main/java
ch16/src/main/java
common/src
input
snippet
.gitignore
CHANGES
README
build-examples.xml
build.xml
findbugs-exclude.xml
ivy.xml

README

Example code for "Hadoop: The Definitive Guide, Second Edition" by Tom White.
Copyright (C) 2010 Tom White, 978-1-449-38973-4

http://www.hadoopbook.com/
http://oreilly.com/catalog/9781449389734/

The code is hosted at http://github.com/tomwhite/hadoop-book/. You can find code
for the first edition at http://github.com/tomwhite/hadoop-book/tree/1e.

This version of the code has been tested with:
 * Hadoop 0.20.2
 * Pig 0.7.0
 * Hive 0.5.0-dev (compiled from SVN trunk)
 * HBase 0.89.0-SNAPSHOT (compiled from SVN trunk)
 * ZooKeeper 3.3.1

Before running the examples you need to install Hadoop, Pig, Hive, HBase,
and ZooKeeper as explained in the book.

You also need to install Ivy (http://ant.apache.org/ivy/).

Then you can compile the code:

ant jar pig hive hbase

You should then be able to run the examples from the book.
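
If the compile step fails early, a quick check like the sketch below can show
which command-line tools are missing from your PATH. The executable names
(ant, hadoop, pig, hive, hbase) are assumptions about a typical installation,
and missing_tools is a hypothetical helper, not part of this repository.
Adjust the names to match your setup. The script only reports what it finds;
it does not run the build.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# (Hypothetical helper for illustration; not part of the book's code.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed executable names; change them if your installation differs.
missing=$(missing_tools ant hadoop pig hive hbase)

if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "All tools found; you can now run: ant jar pig hive hbase"
fi
```

Ivy and ZooKeeper are not checked here, because how they are installed (for
example, as a jar in Ant's lib directory) varies between setups.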

Chapter names for "Hadoop: The Definitive Guide", Second Edition

ch01 - Meet Hadoop
ch02 - MapReduce
ch03 - The Hadoop Distributed Filesystem
ch04 - Hadoop I/O
ch05 - Developing a MapReduce Application
ch06 - How MapReduce Works
ch07 - MapReduce Types and Formats
ch08 - MapReduce Features
ch09 - Setting Up a Hadoop Cluster
ch10 - Administering Hadoop
ch11 - Pig
ch12 - Hive
ch13 - HBase
ch14 - ZooKeeper
ch15 - Sqoop
ch16 - Case Studies

app1 - Installing Apache Hadoop
app2 - Cloudera's Distribution for Hadoop
app3 - Preparing the NCDC Weather Data