
$5 Tech Unlocked 2021!

If you have read this book, please leave a review on Amazon.com. Potential readers can then use your unbiased opinion to help them make purchase decisions. Thank you. The $5 campaign runs from December 15th 2020 to January 13th 2021.

Hadoop 2.x Administration Cookbook

This is the code repository for Hadoop 2.x Administration Cookbook, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.

About the Book

Hadoop enables the distributed storage and processing of large data sets across clusters of computers. Learning to administer Hadoop is crucial to exploiting its unique features. With this book, you will be able to overcome common problems encountered in Hadoop administration.

The book begins by laying the foundation, showing you the steps needed to set up a Hadoop cluster and its various nodes. You will then learn how to maintain a Hadoop cluster, especially at the HDFS layer and with YARN and MapReduce. Further on, you will explore the durability and high availability of a Hadoop cluster.

You’ll learn about the schedulers in Hadoop and how to configure and use them for your tasks. You will also get hands-on experience with the backup and recovery options and the performance tuning aspects of Hadoop. Finally, you will cover troubleshooting, diagnostics, and best practices in Hadoop administration.

By the end of this book, you will have a proper understanding of working with Hadoop clusters and will also be able to secure your Hadoop clusters, encrypt them, and configure auditing for them.

Instructions and Navigation

All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.

Chapter 10 does not contain any code files.

The code will look like the following:

<property>
    <name>dfs.hosts.exclude</name>
    <value>/home/hadoop/excludes</value>
    <final>true</final>
</property>
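
This property, typically set in hdfs-site.xml, points the NameNode at an excludes file used when decommissioning DataNodes. As a minimal sketch, assuming the /home/hadoop/excludes path above, a DataNode can be taken out of service by adding its hostname to that file and asking the NameNode to re-read it (the hostname datanode1 is just a placeholder):

# Add the node to be decommissioned (datanode1 is a placeholder hostname)
echo "datanode1" >> /home/hadoop/excludes

# Ask the NameNode to re-read its include/exclude files and begin decommissioning
hdfs dfsadmin -refreshNodes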

To go through the recipes in this book, users need any Linux distribution, such as Ubuntu, CentOS, or any other flavor, as long as it supports running a JVM. We use CentOS in our recipes, as it is the most commonly used operating system for Hadoop clusters.

Hadoop runs on both virtualized and physical servers, so it is recommended to have at least 8 GB of RAM for the base system, on which about three virtual hosts can be set up. Users do not need to set up everything covered in this book all at once; they can run only those daemons that are necessary for a particular recipe, as shown in the sketch below, and thus keep the resource requirements to a bare minimum. It is good to have at least four hosts, virtual or physical, to practice all the recipes in this book.
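
As a minimal sketch, the daemon scripts shipped with Hadoop 2.x can start services individually rather than the whole stack; the commands below assume HADOOP_HOME points at your Hadoop installation:

# Start only the HDFS daemons for an HDFS-focused recipe
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

# Start the YARN daemons only when a recipe needs YARN or MapReduce
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager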

In terms of software, users need JDK 1.7 at a minimum, and an SSH client, such as PuTTY on Windows or a terminal on Linux, to connect to the Hadoop nodes.
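
A quick check that both prerequisites are in place might look like the following; node1 and the hadoop user are placeholders for your own hosts and accounts:

# Verify that the JDK is version 1.7 or later
java -version

# Verify SSH connectivity to a Hadoop node (node1 and hadoop are placeholders)
ssh hadoop@node1 hostname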

Related Products
