- In Progress
This project streams Yelp data, stores it in an HBase table, and derives some insights from it.
Start the ZooKeeper server and make sure it is running.
Start the Kafka server with your port configuration, then create a topic:
./createKafkaTopic.sh <zookeeper-server> <replication-factor> <partitions> <TopicName>
e.g.
./createKafkaTopic.sh localhost:2181 1 1 businessTopic
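The contents of createKafkaTopic.sh are not shown here. As a rough sketch only, assuming it wraps Kafka's stock `kafka-topics.sh` tool from a ZooKeeper-based release, the equivalent command could be built as below. `create_topic_cmd`, the `KAFKA_HOME` variable, and the `/opt/kafka` fallback are hypothetical names used for illustration. Kafka 3.0 and later removed the `--zookeeper` flag in favor of `--bootstrap-server`, so adjust for newer installs.

```shell
#!/usr/bin/env bash
# Hypothetical equivalent of createKafkaTopic.sh; the real script may differ.
# Assumes KAFKA_HOME points at a Kafka install that still accepts --zookeeper.
create_topic_cmd() {
  if [ "$#" -ne 4 ]; then
    echo "usage: create_topic_cmd <zookeeper-server> <replication-factor> <partitions> <topic-name>" >&2
    return 1
  fi
  # Print the command instead of running it, so it can be reviewed first.
  echo "${KAFKA_HOME:-/opt/kafka}/bin/kafka-topics.sh --create --zookeeper $1 --replication-factor $2 --partitions $3 --topic $4"
}

# Same arguments as the example above.
create_topic_cmd localhost:2181 1 1 businessTopic
```

Piping the printed line to `bash` would run it once the arguments look right.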
- Go to the streaming module
- Build it with mvn clean install
- Run the streaming app using spark-submit
bin/spark-submit --class <class-name> --master <yarn-cluster> <jar file path>
e.g.
bin/spark-submit --class StreamData --master local[4] /media/manu/Coding/Coding/Yelp-Image-Analytics/streaming/target/streaming-1.0-SNAPSHOT.jar
- Go to the kafka module
- Provide the Kafka parameters in the kafkaConfig.properties file in the resource directory
- Build it with mvn clean install
- Run using spark-submit
* Producer app using spark-submit
spark-submit --class Producer <jar-file-path>
* Producing data using the producer.sh script
./producer.sh <file-path> <kafka-broker-list> <topic-name>
e.g.
./producer.sh /data/json/user.json localhost:9092 userTopic
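What producer.sh does internally is not documented here. One plausible reading, offered only as a sketch, is that it feeds the file line by line to Kafka's stock `kafka-console-producer.sh`. The function name `producer_cmd`, the `KAFKA_HOME` variable, and the `/opt/kafka` fallback are hypothetical, and the sketch assumes the input file holds one JSON record per line. Recent Kafka releases prefer `--bootstrap-server` over `--broker-list`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for producer.sh; the real script may work differently.
# Assumes KAFKA_HOME points at a Kafka install and the input is newline-delimited JSON.
producer_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: producer_cmd <file-path> <kafka-broker-list> <topic-name>" >&2
    return 1
  fi
  local file="$1" brokers="$2" topic="$3"
  if [ ! -r "$file" ]; then
    echo "cannot read input file: $file" >&2
    return 1
  fi
  # Each line of the file would become one Kafka message.
  echo "${KAFKA_HOME:-/opt/kafka}/bin/kafka-console-producer.sh --broker-list $brokers --topic $topic < $file"
}
```

Calling `producer_cmd /data/json/user.json localhost:9092 userTopic` prints the command for the example above, provided the file exists and is readable.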