Skip to content
This repository

Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

branch: master
Octocat-spinner-32 app3 Change example code directory structure to be more maven-like: src/ma… July 23, 2010
Octocat-spinner-32 book Upgrade from HBase 0.90 to 0.94. January 12, 2014
Octocat-spinner-32 ch02 Allow snippet version overrides to be done by major Hadoop version. January 10, 2014
Octocat-spinner-32 ch03 Upgrade from Hadoop 2.0.3-alpha to 2.2.0. January 09, 2014
Octocat-spinner-32 ch04-avro Remove imports from snippet. February 08, 2012
Octocat-spinner-32 ch04 Get builds working and tests passing. February 01, 2012
Octocat-spinner-32 ch05 Allow different Hadoop versions to be specified. January 09, 2014
Octocat-spinner-32 ch07 Allow snippet version overrides to be done by major Hadoop version. January 10, 2014
Octocat-spinner-32 ch08 Allow snippet version overrides to be done by major Hadoop version. January 10, 2014
Octocat-spinner-32 ch09 Add some missing files. January 11, 2012
Octocat-spinner-32 ch11 Updated minor grunt output changes in the tests for Pig 0.9.1. January 18, 2012
Octocat-spinner-32 ch12 Add Hive code for conversions and indexes. January 28, 2012
Octocat-spinner-32 ch13 Upgrade from HBase 0.90 to 0.94. January 12, 2014
Octocat-spinner-32 ch14 Fix bug in ZK program. February 05, 2012
Octocat-spinner-32 ch15 Move creation of examples jar into chapter modules themselves. (Excep… January 12, 2012
Octocat-spinner-32 ch16 Support Hadoop 0.20.x, 0.23.0-SNAPSHOT August 22, 2011
Octocat-spinner-32 common Get builds working and tests passing. February 01, 2012
Octocat-spinner-32 experimental Add more experimental code. January 11, 2012
Octocat-spinner-32 hadoop-examples Change a few Oozie paths. January 24, 2012
Octocat-spinner-32 hadoop-meta Introduce hadoop.distro parameter to avoid problems with profile acti… January 10, 2014
Octocat-spinner-32 input Add Hive code for conversions and indexes. January 28, 2012
Octocat-spinner-32 snippet Allow snippet version overrides to be done by major Hadoop version. January 10, 2014
Octocat-spinner-32 .gitignore Add Hive code for conversions and indexes. January 28, 2012
Octocat-spinner-32 README Upgrade from HBase 0.90 to 0.94. January 12, 2014
Octocat-spinner-32 pom.xml Move creation of examples jar into chapter modules themselves. (Excep… January 12, 2012
README
Example code for "Hadoop: The Definitive Guide, Third Edition" by Tom White.
Copyright (C) 2011 Tom White, 978-1-449-31152-0

http://www.hadoopbook.com/
http://oreilly.com/catalog/9781449311520/

The code is hosted at http://github.com/tomwhite/hadoop-book/. You can find code
for the first edition at http://github.com/tomwhite/hadoop-book/tree/1e, and
for the second edition at http://github.com/tomwhite/hadoop-book/tree/2e.

This version of the code has been tested with:
 * Hadoop 1.2.1/0.22.0/0.23.x/2.2.0
 * Avro 1.5.4
 * Pig 0.9.1
 * Hive 0.8.0
 * HBase 0.90.4/0.94.15
 * ZooKeeper 3.4.2
 * Sqoop 1.4.0-incubating
 * MRUnit 0.8.0-incubating

Before running the examples you need to install Hadoop, Pig, Hive, HBase,
ZooKeeper, and Sqoop (as appropriate) as explained in the book.

You also need to install Maven.

Then you can build the code with:

% mvn package -DskipTests

By default Hadoop 1.2.1 is used. This can be changed by specifying the
hadoop.version property, e.g.

% mvn package -DskipTests -Dhadoop.version=1.2.0

There are profiles for different Hadoop major versions and distributions,
specified in hadoop-meta/pom.xml, and they are specified using the hadoop.distro
property. For example, to use the default version of Hadoop 2:

% mvn package -DskipTests -Dhadoop.distro=apache-2

Again, you can specify hadoop.version to use a particular Hadoop 2 version:

% mvn package -DskipTests -Dhadoop.distro=apache-2 -Dhadoop.version=2.1.1-beta

You should then be able to run the examples from the book.

Chapter names for "Hadoop: The Definitive Guide", Third Edition

ch01 - Meet Hadoop
ch02 - MapReduce
ch03 - The Hadoop Distributed Filesystem
ch04 - Hadoop I/O
ch05 - Developing a MapReduce Application
ch06 - How MapReduce Works
ch07 - MapReduce Types and Formats
ch08 - MapReduce Features
ch09 - Setting Up a Hadoop Cluster
ch10 - Administering Hadoop
ch11 - Pig
ch12 - Hive
ch13 - HBase
ch14 - ZooKeeper
ch15 - Sqoop
ch16 - Case Studies

app1 - Installing Apache Hadoop
app2 - Cloudera's Distribution for Hadoop
app3 - Preparing the NCDC Weather Data
Something went wrong with that request. Please try again.