Data Science with Hadoop by Packt Publishing
## What You Will Learn
- Apply best practices for setting up and configuring Hadoop clusters, tailoring the system to the problem at hand
- Integrate with relational databases, using Hive for SQL queries and Sqoop for data transfer
- Install and maintain a Hadoop 2.x cluster and its ecosystem
- Perform advanced data analysis using Hive, Pig, and MapReduce programs
- Learn machine learning principles with libraries such as Mahout, and batch and stream data processing with Apache Spark
- Understand the changes involved in moving from Hadoop 1.0 to Hadoop 2.0
- Dive into YARN and Storm and use YARN to integrate Storm with Hadoop
- Deploy Hadoop on Amazon Elastic MapReduce, discover HDFS replacements, and learn about HDFS Federation