mayank-17/Project-Hadoop

Project-Hadoop

The project consists of five sub-projects, each using a different technology.

Part 1: Map-Reduce: Finding words present in different files

Objective: For each word, find how many files it appears in and list those file names, using the MapReduce framework.
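The Part 1 objective can be illustrated with a small, single-process sketch of the map, shuffle, and reduce steps. This is not the project's Hadoop job (that is described in the report); it is plain Python, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_file_index`, as well as the sample file names, are hypothetical. It assumes words are separated by whitespace and compared case-insensitively.

```python
from collections import defaultdict


def map_phase(filename, text):
    """Emit one (word, filename) pair per distinct word in a file."""
    words = {w.lower() for w in text.split()}
    return [(w, filename) for w in words]


def shuffle(pairs):
    """Group the emitted file names by word, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for word, filename in pairs:
        grouped[word].append(filename)
    return grouped


def reduce_phase(word, filenames):
    """Return the word, the number of files containing it, and the sorted file names."""
    files = sorted(set(filenames))
    return word, len(files), files


def word_file_index(documents):
    """documents: dict mapping file name -> file contents (hypothetical input shape)."""
    pairs = []
    for filename, text in documents.items():
        pairs.extend(map_phase(filename, text))
    grouped = shuffle(pairs)
    return [reduce_phase(w, fs) for w, fs in sorted(grouped.items())]


if __name__ == "__main__":
    docs = {"a.txt": "hadoop stores data", "b.txt": "Hadoop runs jobs"}
    for word, count, files in word_file_index(docs):
        print(word, count, files)
```

In a real MapReduce job the shuffle step is performed by the framework, and the mapper would obtain the file name from the input split rather than from a dictionary key.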

Part 2: Sqoop Task: Loading Data from RDBMS to HDFS

Objective: Import data from an RDBMS (MySQL) into HDFS, then load it from HDFS into Hive tables, using Apache Sqoop.

Part 3: Stocks Analysis using Hive

Objective: Run queries on the stocks dataset loaded with Sqoop to explore it, then perform analytics on it.

Part 4: Pig Analytics

Objective: Run basic queries on two datasets, stocks and dividends, then join them using Pig Latin to understand how the Apache Pig framework works.
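To show what joining the two datasets amounts to, here is a rough Python sketch of an inner join, the kind of operation a Pig Latin `JOIN` expresses. The README does not list the dataset columns, so the field names `symbol`, `date`, `close`, and `dividend`, the join key, and the sample records are all assumptions made for illustration; the function name `join_on_key` is hypothetical as well.

```python
from collections import defaultdict


def join_on_key(stocks, dividends, key_fields=("symbol", "date")):
    """Inner-join two lists of dict records on key_fields (assumed column names)."""
    # Index the dividends by join key so each stock row is matched in one lookup.
    index = defaultdict(list)
    for d in dividends:
        index[tuple(d[f] for f in key_fields)].append(d)

    joined = []
    for s in stocks:
        for d in index.get(tuple(s[f] for f in key_fields), []):
            joined.append({**s, **d})  # merge the matching rows into one record
    return joined


if __name__ == "__main__":
    # Made-up sample rows, not taken from the project's datasets.
    stocks = [
        {"symbol": "AAA", "date": "2010-01-04", "close": 10.5},
        {"symbol": "BBB", "date": "2010-01-04", "close": 22.0},
    ]
    dividends = [{"symbol": "AAA", "date": "2010-01-04", "dividend": 0.2}]
    print(join_on_key(stocks, dividends))
```

Stock rows without a matching dividend row are dropped, which is the behavior of an inner join; an outer join would keep them with empty dividend fields.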

Part 5: Twitter’s Top 10 Popular Hashtags, Streamed per Second, using Apache Spark

Objective: Find the top 10 most popular hashtags on Twitter and perform web scraping, using Spark and Scala to stream the data on a per-second basis.
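The per-second ranking in Part 5 can be sketched without Spark as counting hashtags in one-second buckets and keeping the most frequent ones in each bucket. The project itself uses Spark and Scala; the plain Python below only illustrates the counting logic. The function name `top_hashtags_per_second`, the `(timestamp, text)` input shape, and the choice to treat hashtags case-insensitively are assumptions.

```python
import re
from collections import Counter, defaultdict

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG = re.compile(r"#\w+")


def top_hashtags_per_second(tweets, n=10):
    """tweets: iterable of (timestamp_in_seconds, text) pairs.

    Returns {second: [(hashtag, count), ...]} holding the n most frequent
    hashtags seen in each one-second bucket.
    """
    windows = defaultdict(Counter)
    for ts, text in tweets:
        # int(ts) assigns each tweet to its one-second bucket.
        windows[int(ts)].update(tag.lower() for tag in HASHTAG.findall(text))
    return {sec: counts.most_common(n) for sec, counts in sorted(windows.items())}


if __name__ == "__main__":
    # Made-up sample tweets for illustration.
    sample = [
        (0.1, "#Spark is fast #bigdata"),
        (0.7, "loving #spark"),
        (1.2, "#hadoop"),
    ]
    for second, ranking in top_hashtags_per_second(sample).items():
        print(second, ranking)
```

A streaming engine would compute the same counts incrementally per batch interval instead of holding every tweet in memory.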

To view the project, please download the file Mini Project Report.pdf.
