Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
appc/src/main/sh Rename app3 to appc to reflect the name in the book ('Appendix C'). Nov 27, 2014
book Put Hadoop version number in same POM as other components. Nov 27, 2014
ch02-mr-intro Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch03-hdfs Change version to 4.0. Nov 27, 2014
ch04-yarn Renamed chxx-* to actual chapter numbers. Sep 18, 2014
ch05-io Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch06-mr-dev Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch08-mr-types Change version to 4.0. Nov 27, 2014
ch09-mr-features Change version to 4.0. Nov 27, 2014
ch10-setup/src/main Rename ch12-setup to ch10-setup Nov 13, 2014
ch12-avro Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch13-parquet Change version to 4.0. Nov 27, 2014
ch14-flume Renamed chxx-* to actual chapter numbers. Sep 18, 2014
ch15-sqoop Change version to 4.0. Nov 27, 2014
ch16-pig Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch17-hive Change version to 4.0. Nov 27, 2014
ch18-crunch Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch19-spark Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch20-hbase Ensure that code doesn't exceed maximum line length. Feb 4, 2015
ch21-zk Change version to 4.0. Nov 27, 2014
ch22-case-studies Change version to 4.0. Nov 27, 2014
common Change version to 4.0. Nov 27, 2014
conf Convert site files for Important Hadoop Daemon Properties section Sep 24, 2014
hadoop-examples Change version to 4.0. Nov 27, 2014
hadoop-meta Put Hadoop version number in same POM as other components. Nov 27, 2014
input Add union type to Hive test. Oct 18, 2014
snippet Change chapter names of xml files (O'Reilly prod change). Feb 4, 2015
.gitignore Add Hive code for conversions and indexes. Jan 28, 2012
README.md Put Hadoop version number in same POM as other components. Nov 27, 2014
pom.xml Change version to 4.0. Nov 27, 2014

README.md

Hadoop Book Example Code

This repository contains the example code for Hadoop: The Definitive Guide, Fourth Edition by Tom White (O'Reilly, 2014).

Code for the First, Second, and Third Editions is also available.

Note that the chapter names and numbering have changed between editions; see Chapter Numbers By Edition.

Building and Running

To build the code, first install Maven and Java. Then type:

% mvn package -DskipTests

This will do a full build and create example JAR files in the top-level directory (e.g. hadoop-examples.jar).
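As a minimal sketch of the build step above, the following shell functions check that Java and Maven are on the PATH before running the same build command. The function names `require_tool` and `build_examples` are hypothetical helpers introduced for illustration; they are not part of the repository.

```shell
# Sketch: verify the build prerequisites, then run the README's build command.
# require_tool and build_examples are hypothetical names, not repository scripts.

require_tool() {
  # Succeed if the named command is on PATH; otherwise report it and fail.
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing prerequisite: $1" >&2
  return 1
}

build_examples() {
  # Both tools are required by the build; stop early if either is absent.
  require_tool java || return 1
  require_tool mvn || return 1
  # Same command as in the README: full build, tests skipped.
  mvn package -DskipTests
}
```

Running `build_examples` from the top-level directory of the checkout should then leave the example JAR files there, as described above.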

To run the examples from a particular chapter, first install the component that the chapter needs (e.g. Hadoop, Pig, or Hive), then run the command lines shown in the chapter.
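For illustration, here is roughly how the MaxTemperature example from the MapReduce introduction chapter might be run against the bundled sample data. This is a sketch that assumes Hadoop 2 is installed, the build has produced hadoop-examples.jar in the current directory, and the sample file lives at input/ncdc/sample.txt; the wrapper function `run_max_temperature` is a hypothetical name. The command lines printed in the book remain the authoritative ones.

```shell
# Sketch: run the MaxTemperature job on the bundled sample data.
# Assumes Hadoop 2 is installed and hadoop-examples.jar has been built;
# run_max_temperature is a hypothetical wrapper, not a repository script.

run_max_temperature() {
  if ! command -v hadoop >/dev/null 2>&1; then
    echo "hadoop not found on PATH; install Hadoop first" >&2
    return 1
  fi
  # Make the example classes visible to the hadoop launcher.
  export HADOOP_CLASSPATH=hadoop-examples.jar
  # Arguments: input path, then an output directory that must not already exist.
  hadoop MaxTemperature input/ncdc/sample.txt output
}
```

Other chapters need their own component installed (Pig, Hive, and so on) and use different commands, so treat this only as a pattern for the Hadoop-based examples.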

Sample datasets are provided in the input directory, but the full weather dataset is not contained there due to size restrictions. You can find information about how to obtain the full weather dataset on the book's website at http://www.hadoopbook.com/.

Hadoop Component Versions

This edition of the book works with Hadoop 2. It has not been tested extensively with Hadoop 1, although most of it should work.

For the precise versions of each component that the code has been tested with, see book/pom.xml.

Copyright

Copyright (C) 2014 Tom White
