Make sure the following software is already installed on your Cloudera CentOS 6 machine:
- Kafka
- kafka-server
- Anaconda3 (Python 3.5.1)
- All libraries (see below for how to install them)
Check whether the hbase-master and hbase-regionserver services are up:
service --status-all
If they are not running, start them:
sudo service hbase-regionserver start
sudo service hbase-master start
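The service check above can also be scripted. The following is a minimal sketch, not part of the project: it assumes the HBase init scripts follow the usual LSB convention where `service <name> status` exits with code 0 when the service is running. The function names `service_is_up` and `services_down` are hypothetical helpers introduced here for illustration.

```python
import subprocess

SERVICES = ("hbase-master", "hbase-regionserver")


def service_is_up(name, run=subprocess.call):
    """Return True if `service <name> status` exits with code 0.

    Assumption: the init script follows the LSB convention, where exit
    code 0 means "running" and any other code means stopped or unknown.
    """
    return run(["service", name, "status"]) == 0


def services_down(names=SERVICES, run=subprocess.call):
    """Return the names of the services that are not reported as running."""
    return [name for name in names if not service_is_up(name, run)]
```

On the virtual machine, calling `services_down()` with no arguments should return an empty list once both services have been started. The `run` parameter exists only so the check can be exercised without touching real services.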
kafka-console-producer --broker-list localhost:9092 --topic projectTweets
conda create -n py351 python=3.5.1
source activate /home/cloudera/anaconda3/envs/py351
pip install -r requirements.txt
1) source activate /home/cloudera/anaconda3/envs/py351
2) python hbasetable.py
3) from hbasetable import create_table
4) create_table()
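The contents of hbasetable.py are not shown here, so the following is only a sketch of how a table-creation helper could be made safe to run more than once. It assumes a connection object that behaves like `happybase.Connection` (with `tables()` and `create_table()`); whether the project actually uses happybase is an assumption, and `create_table_if_missing` is a hypothetical name, not the project's `create_table`.

```python
def create_table_if_missing(connection, name, families):
    """Create an HBase table only if it does not exist yet.

    `connection` is assumed to behave like happybase.Connection.
    Returns True if the table was created, False if it already existed.
    """
    # happybase returns table names as bytes under Python 3.
    existing = [t.decode() if isinstance(t, bytes) else t
                for t in connection.tables()]
    if name in existing:
        return False
    connection.create_table(name, families)
    return True
```

With happybase installed and the HBase Thrift server running, this could be called as `create_table_if_missing(happybase.Connection("localhost"), "demo_table", {"cf": dict()})`, where the table and column family names are placeholders rather than the ones the project uses.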
Run the following Hive file:
hive -f socialmediatable.hive
Run the following command in a console on your Cloudera virtual machine:
spark-shell -i '/home/cloudera/Desktop/BigDataTechnologiesProject/sparkSQL.scala'
#IMPORTANT: before running the scripts, you must put your own KEYS for connecting to Twitter in twitter.py
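One way to avoid hard-coding the keys is to read them from environment variables and fail early if any is missing. This is a sketch under assumptions: the four variable names below are hypothetical (twitter.py may name its keys differently), and `load_twitter_keys` is not a function from the project.

```python
import os

# Hypothetical variable names; twitter.py may call its keys something else.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_keys(environ=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    if environ is None:
        environ = os.environ
    # Treat unset and empty values the same way: both count as missing.
    missing = [k for k in REQUIRED_KEYS if not environ.get(k)]
    if missing:
        raise RuntimeError("Missing Twitter keys: " + ", ".join(missing))
    return {k: environ[k] for k in REQUIRED_KEYS}
```

Failing before any connection is attempted makes a forgotten key show up as a clear error message instead of an authentication failure later in the run.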
1) python sparkstreaming.py
2) python twitter.py
Check the socialmedia table in HUE. That's all: you now have the latest tweets about these 5 companies :)
- Follow the guide "20181214_ConnectingMSPowerBI2Hive.docx"
- Open the file "DashboardTweets.pbix" with PowerBI Desktop
- Press the "refresh" button, and you are ready to present to your boss :)