Course Content

# Big Data Course Syllabus #

This syllabus was compiled from sources on the internet and serves only as a skeleton for what we are going to learn (no copyright infringement is intended).

Introduction to Hadoop and Big Data:

What is Big Data?
What are the challenges for processing big data?
What technologies support big data?
What is Hadoop?
Why Hadoop?
History of Hadoop
Use cases of Hadoop
RDBMS vs Hadoop
When to use and when not to use Hadoop
Ecosystem tour
Vendor comparison
Hardware Recommendations & Statistics

HDFS (Hadoop Distributed File System):

Significance of HDFS in Hadoop
Features of HDFS
5 daemons of Hadoop
1. Name Node and its functionality
2. Data Node and its functionality
3. Secondary Name Node and its functionality
4. Job Tracker and its functionality
5. Task Tracker and its functionality
Data Storage in HDFS
1. Introduction about Blocks
2. Data replication
Accessing HDFS
1. CLI (Command Line Interface) and admin commands
2. Java Based Approach
Fault tolerance
Download Hadoop
Installation and set-up of Hadoop
1. Start-up & Shut down process
HDFS Federation
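
As a warm-up for the "Blocks" and "Data replication" topics above, here is a minimal sketch of the arithmetic involved. It assumes a 128 MB block size and a replication factor of 3, which are the usual defaults in Hadoop 2 and later; both are configurable, and older releases used 64 MB blocks. The function name `hdfs_footprint` is hypothetical and exists only for this illustration.

```python
MB = 1024 * 1024

def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Return (number of blocks, raw bytes stored across the cluster).

    Assumed defaults: 128 MB blocks and 3 replicas. Check the values of
    dfs.blocksize and dfs.replication on a real cluster.
    """
    # Ceiling division: a partially filled last block still counts as a block.
    num_blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # The last block only occupies as much disk as the data it holds,
    # so raw usage is the file size multiplied by the replication factor.
    raw_bytes = file_size_bytes * replication
    return num_blocks, raw_bytes

blocks, raw = hdfs_footprint(300 * MB)
print(blocks, raw // MB)  # 3 900
```

A 300 MB file is split into two full 128 MB blocks plus one 44 MB block, and each block is stored three times, which is why the raw footprint is 900 MB.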

YARN

Map Reduce:

Map Reduce Story
Map Reduce Architecture
How Map Reduce works
Developing Map Reduce
Map Reduce Programming Model
1. Different phases of Map Reduce Algorithm.
2. Different Data types in Map Reduce.
3. How to write a basic Map Reduce program.
  1. Driver Code
  2. Mapper
  3. Reducer
Creating Input and Output Formats in Map Reduce Jobs
1. Text Input Format
2. Key Value Input Format
3. Sequence File Input Format
Data localization in Map Reduce
Combiner (Mini Reducer) and Partitioner
Hadoop I/O
Distributed cache
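
To make the mapper, combiner, partitioner, and reducer topics above concrete, here is a minimal single-process sketch of a word count job. It is plain Python, not the Hadoop API: all function names are hypothetical, each input line stands in for one map task, and the partitioner uses a simple byte-sum hash as a stand-in for Hadoop's hash-based default.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def combiner(pairs):
    """Combiner ("mini reducer"): pre-sum counts on the map side."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return local.items()

def partitioner(key, num_reducers):
    """Choose a reducer for a key (simplified, deterministic stand-in hash)."""
    return sum(key.encode("utf-8")) % num_reducers

def reducer(word, counts):
    """Reduce phase: sum all partial counts for one word."""
    return word, sum(counts)

def run_job(lines, num_reducers=2):
    # One dict per reducer, mapping each key to its list of partial counts.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line plays the role of one map task
        for word, count in combiner(mapper(line)):
            partitions[partitioner(word, num_reducers)][word].append(count)
    result = {}
    for part in partitions:
        for word in sorted(part):  # sort step: keys reach the reducer in order
            key, total = reducer(word, part[word])
            result[key] = total
    return result

counts = run_job(["the quick brown fox", "the lazy dog the end"])
print(counts["the"])  # 3
```

The combiner here reuses the same summing logic as the reducer, which works because addition is associative and commutative; the final counts do not depend on how many reducers are used.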

PIG

Introduction to Apache Pig
Map Reduce vs. Apache Pig
SQL vs. Apache Pig
Different data types in Pig
Modes of Execution in Pig
Grunt shell
Loading data
Exploring Pig Latin commands

HIVE

Hive introduction
Hive architecture
Hive vs RDBMS
HiveQL and the shell
Managing tables (external vs managed)
Data types and schemas
Partitions and buckets

HBASE

Architecture and schema design
HBase vs. RDBMS
HMaster and Region Servers
Column Families and Regions
Write pipeline
Read pipeline
HBase commands

FLUME

SQOOP
