Building a Big Data Analytics Stack [Video], by Packt Publishing

PacktPublishing/Building-a-Big-Data-Analytics-Stack

Building a Big Data Analytics Stack [Video]

This is the code repository for Building a Big Data Analytics Stack [Video], published by Packt. It contains all the supporting project files necessary to work through the video course from start to finish.

About the Video Course

Building a Big Data ecosystem is hard. A variety of technologies are available, and each has its own pros and cons. When building a big data pipeline, software engineers need to use lower-level tools and APIs such as HBase and Apache Spark. In this course, we’ll check out HBase, a database built on top of HDFS. Moving on, we’ll have a bit of fun with Spark MLlib. Finally, you’ll get an understanding of ETL and deploy a Hadoop project to the cloud. By the end of the course, you’ll also be able to use higher-level tools with more user-friendly, declarative APIs, such as Pig and Hive.

What You Will Learn

  • Use Pig and Hive to tap the power of Hadoop without writing Java
  • Explore Spark and use it for stream and batch processing
  • Use the HBase database from a Java application
  • Find out more about the MLlib machine learning toolkit and its use with Spark
  • Learn how to leverage the strengths of Big Data tools

Instructions and Navigation

Assumed Knowledge

This course is for big data developers and big data engineers who work with and analyze data in clusters. It’s also ideal for developers who work with raw structured and unstructured data sets, and for data analysts who work with Hadoop clusters.

Technical Requirements

This course has the following hardware requirements:

  • Intel Core 2 Duo/Quad/Hexa/Octa or higher-end 64-bit processor, PC or laptop (minimum operating frequency of 2.5 GHz)
  • Hard disk capacity of 1-4 TB
  • 64-512 GB RAM
  • 10 Gigabit Ethernet or Bonded Gigabit Ethernet
