PacktPublishing/Data-Science-with-Hadoop

Data-Science-with-Hadoop

Data Science with Hadoop by Packt Publishing

## What You Will Learn

  • Best practices for setup and configuration of Hadoop clusters, tailoring the system to the problem at hand
  • Integration with relational databases, using Hive for SQL queries and Sqoop for data transfer
  • Installing and maintaining a Hadoop 2.x cluster and its ecosystem
  • Advanced data analysis using Hive, Pig, and MapReduce programs
  • Machine learning principles with libraries such as Mahout, plus batch and stream data processing using Apache Spark
  • Understand the changes involved in moving from Hadoop 1.0 to Hadoop 2.0
  • Dive into YARN and Storm and use YARN to integrate Storm with Hadoop
  • Deploy Hadoop on Amazon Elastic MapReduce, discover HDFS replacements, and learn about HDFS Federation

About

Code repository for Data Science with Hadoop, published by Packt
