Project-Hadoop

The project consists of five sub-projects, each built on a different technology.

Part 1: MapReduce: Finding words present in different files. Objective: to find, for each word, how many files contain it and which files those are, using the MapReduce framework.
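The Part 1 objective can be illustrated with a minimal sketch of the map, shuffle, and reduce steps in plain Python. This is not the project's Hadoop job; the function names and the sample file contents are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(filename, text):
    """Mapper: emit one (word, filename) pair per word occurrence."""
    for word in text.lower().split():
        yield word, filename


def reduce_phase(word, filenames):
    """Reducer: drop duplicate file names, return the count and sorted names."""
    unique = sorted(set(filenames))
    return word, len(unique), unique


def word_file_index(files):
    """Run map, shuffle (group by word), and reduce over {filename: text}."""
    grouped = defaultdict(list)
    for filename, text in files.items():
        for word, name in map_phase(filename, text):
            grouped[word].append(name)
    return [reduce_phase(word, names) for word, names in sorted(grouped.items())]


if __name__ == "__main__":
    # Hypothetical input files, used only to exercise the sketch.
    sample = {"a.txt": "hadoop hive pig", "b.txt": "hadoop spark hadoop"}
    for word, count, names in word_file_index(sample):
        print(word, count, names)
```

In a real MapReduce job the grouping step is done by the framework's shuffle between mappers and reducers; here a dictionary stands in for it.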

Part 2: Sqoop Task: Loading data from an RDBMS to HDFS. Objective: to load data from an RDBMS (MySQL) into HDFS, and then from HDFS into Hive tables, using Apache Sqoop.
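As a rough sketch of what a Part 2 import might look like, the helper below assembles the argument list for a `sqoop import` call without running it. The helper name, JDBC URL, user, table, and paths are all hypothetical; only the Sqoop flags themselves are standard `sqoop import` options.

```python
def sqoop_import_args(jdbc_url, username, table, target_dir, hive_table=None):
    """Build the argument list for a `sqoop import` call (nothing is executed)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                        # prompt for the password at run time
        "--table", table,
        "--target-dir", target_dir,  # HDFS directory that receives the rows
        "--num-mappers", "1",
    ]
    if hive_table is not None:
        # Ask Sqoop to load the imported data into a Hive table as well.
        args += ["--hive-import", "--hive-table", hive_table]
    return args


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    print(" ".join(sqoop_import_args(
        "jdbc:mysql://localhost/stocksdb", "hadoop_user",
        "stocks", "/user/hadoop/stocks", hive_table="stocks",
    )))
```

The resulting list could be passed to `subprocess.run` on a machine where Sqoop is installed; the project report may use different options.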

Part 3: Stocks analysis using Hive. Objective: to run queries on the stocks dataset loaded with Sqoop, first to explore it and then to perform analytics on it.

Part 4: Pig analytics. Objective: to run basic queries on two datasets, stocks and dividends, and then join them using Pig Latin, to understand how the Apache Pig framework works.
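The join in Part 4 can be pictured as an inner join on a shared key. Below is a minimal hash-join sketch in plain Python rather than Pig Latin. The field names (`symbol`, `date`, `close`, `dividend`) and the sample rows are assumptions, since the actual schemas are only given in the project report.

```python
from collections import defaultdict


def inner_join(stocks, dividends, keys=("symbol", "date")):
    """Hash join: index dividends by key, then probe with each stock row."""
    index = defaultdict(list)
    for div in dividends:
        index[tuple(div[k] for k in keys)].append(div)

    joined = []
    for stock in stocks:
        # Stock rows without a matching dividend are dropped (inner join).
        for div in index.get(tuple(stock[k] for k in keys), []):
            joined.append({**stock, **div})
    return joined


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    stocks = [
        {"symbol": "AAA", "date": "2009-01-02", "close": 10.5},
        {"symbol": "BBB", "date": "2009-01-02", "close": 20.0},
    ]
    dividends = [{"symbol": "AAA", "date": "2009-01-02", "dividend": 0.25}]
    print(inner_join(stocks, dividends))
```

Pig's `JOIN` operator performs an inner join by default, which is why unmatched rows are discarded in this sketch.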

Part 5: Twitter’s top 10 popular hashtags, streamed per second with Apache Spark. Objective: to find the top 10 most popular hashtags on Twitter, and to perform web scraping using Spark and Scala to stream the data on a per-second basis.
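The per-second ranking in Part 5 can be sketched without Spark: group tweets into one-second batches, count hashtags in each batch, and keep the ten most frequent. The project itself uses Spark and Scala; the plain-Python version below, including its function names and sample tweets, is an illustrative assumption only.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")


def top_hashtags(tweets, n=10):
    """Count hashtags case-insensitively and return the n most common."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG.findall(text)
    )
    return counts.most_common(n)


def top_per_second(timestamped_tweets, n=10):
    """Group (epoch_seconds, text) pairs into one-second batches and rank each."""
    batches = {}
    for ts, text in timestamped_tweets:
        batches.setdefault(int(ts), []).append(text)
    return {second: top_hashtags(texts, n) for second, texts in sorted(batches.items())}


if __name__ == "__main__":
    # Hypothetical tweets with timestamps in seconds, for illustration only.
    stream = [(0.1, "#Spark is fast #scala"), (0.7, "learning #spark"), (1.2, "#hadoop")]
    for second, ranking in top_per_second(stream).items():
        print(second, ranking)
```

A streaming engine would produce these batches continuously from a live source instead of from an in-memory list.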

To view the full project, please download the file Mini Project Report.pdf.