BigDataTechnologies

Spark Streaming sentiment analysis of tweets from the Twitter API

Prerequisites

Make sure the following software is already installed on your Cloudera CentOS 6 machine:

- Kafka
- kafka-server
- Anaconda3 (Python 3.5.1)
- All libraries (see below for how to install them)

Platform: Cloudera CentOS 6

Check the services and confirm that hbase-master and hbase-regionserver are up:

service --status-all

If HBase is down, run the following commands

sudo service hbase-regionserver start
sudo service hbase-master start
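The check-and-start steps above could also be scripted. The following is a minimal sketch, not part of the project: it assumes the HBase init scripts follow the usual convention of exiting with code 0 from `service <name> status` when the service is running, and the function names are hypothetical.

```python
import subprocess

# Start order follows the commands above: region server first, then master.
HBASE_SERVICES = ["hbase-regionserver", "hbase-master"]


def is_running(service, run=subprocess.call):
    """Return True when `service <name> status` exits with code 0.

    Assumption: the init script reports a running service with exit code 0.
    """
    return run(["service", service, "status"]) == 0


def services_down(services=HBASE_SERVICES, run=subprocess.call):
    """List the services whose status check did not exit with code 0."""
    return [name for name in services if not is_running(name, run)]


def start_services(services, run=subprocess.call):
    """Run `sudo service <name> start` for each service, in the given order."""
    for name in services:
        run(["sudo", "service", name, "start"])
```

Calling start_services(services_down()) would then start only the services that are not already up. The `run` parameter exists so the logic can be tried out without touching real services.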

Install Kafka and run the following command

kafka-console-producer --broker-list localhost:9092 --topic projectTweets
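The console producer above sends whatever is typed to the projectTweets topic. A Python process could publish to the same topic along the following lines. This is only a sketch: it assumes the kafka-python package is installed (it may or may not be listed in requirements.txt), and the JSON message format and the helper names are illustrative assumptions, not the project's actual code.

```python
import json


def encode_tweet(tweet):
    """Serialize a tweet dict to UTF-8 JSON bytes (assumed message format)."""
    return json.dumps(tweet, sort_keys=True).encode("utf-8")


def send_tweet(tweet, topic="projectTweets", servers="localhost:9092"):
    """Publish one tweet to Kafka; requires the kafka-python package."""
    # Imported here so encode_tweet stays usable without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=servers)
    producer.send(topic, encode_tweet(tweet))
    producer.flush()  # wait until buffered messages have been sent
    producer.close()
```

The topic name and broker address are taken from the command above; everything else is an assumption.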

Create the environment

conda create -n py351 python=3.5.1

To activate the environment, run the following command

source activate /home/cloudera/anaconda3/envs/py351

To install the libraries, run the following command

pip install -r requirements.txt

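Create the HBase table

1) source activate /home/cloudera/anaconda3/envs/py351
2) python hbasetable.py
3) from hbasetable import create_table
4) create_table()

Steps 3 and 4 are Python statements; enter them in a Python interpreter started inside the activated environment.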
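The contents of hbasetable.py are not shown here. As an illustration of what an idempotent table-creation helper can look like, here is a sketch written against the happybase client API (`Connection.tables()` and `Connection.create_table()`). The helper name `ensure_table`, the table name and the column family in the usage comment are hypothetical, and happybase itself is an assumption about the project's dependencies.

```python
def ensure_table(connection, name, families):
    """Create the table unless it already exists; return True if created."""
    # happybase returns table names as bytes; normalize to str to compare.
    existing = [t.decode() if isinstance(t, bytes) else t
                for t in connection.tables()]
    if name in existing:
        return False
    connection.create_table(name, families)
    return True


# Hypothetical usage (needs the HBase Thrift server to be running):
#   import happybase
#   connection = happybase.Connection("localhost")
#   ensure_table(connection, "socialmedia", {"tweet": dict()})
```

Checking for the table first means the helper can be run again without failing on a table that already exists.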

Create the Hive table

Run the following Hive file:

hive -f socialmediatable.hive

SparkSQL

Run the following command in a console on your Cloudera virtual machine:

spark-shell -i '/home/cloudera/Desktop/BigDataTechnologiesProject/sparkSQL.scala'

IMPORTANT: you must put your Twitter API keys in twitter.py so that it can connect to Twitter.
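One way to avoid committing the keys to the repository is to read them from environment variables instead of hardcoding them. The sketch below is a suggestion, not how twitter.py currently works: the environment variable names and the function name are hypothetical.

```python
import os

# Hypothetical variable names for the four Twitter OAuth credentials.
KEY_NAMES = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_keys(env=os.environ):
    """Return the four credentials as a dict, or fail listing what is missing."""
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing Twitter credentials: {}".format(", ".join(missing)))
    return {name: env[name] for name in KEY_NAMES}
```

Failing early with the list of missing names makes a misconfigured environment obvious before any connection attempt is made.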

Run the daemons

1) python sparkstreaming.py
2) python twitter.py
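Both scripts are long-running, so each needs its own console, or they can be launched from a small wrapper. The following sketch starts them in the order given above; the wrapper and its function name are hypothetical, and it assumes both scripts are in the current directory.

```python
import subprocess
import sys

# Launch order from the steps above: the streaming job first, then the feed.
DAEMONS = ["sparkstreaming.py", "twitter.py"]


def start_daemons(scripts=DAEMONS, popen=subprocess.Popen):
    """Start each script with the current interpreter; return the handles."""
    return [popen([sys.executable, script]) for script in scripts]
```

Note that this sketch does not wait for the first script to finish initializing before starting the second one.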

Check the socialmedia table in Hue, and that's all: it now holds the latest tweets about these 5 companies :)

VISUALIZATION WITH PowerBI

Follow the guide "20181214_ConnectingMSPowerBI2Hive.docx"
Open the file "DashboardTweets.pbix" with PowerBI Desktop
Press the "Refresh" button, and you are ready to present to your boss :)
