A [Giter8][g8] template showcasing the integration of Kafka 0.10 with Spark Streaming: tweets are pushed into a Kafka cluster and then consumed using Spark Streaming.
This template uses:

- Kafka Client for the Kafka API.
- Twitter4J Streaming as the source of tweets.
- Jackson ObjectMapper for byte stream conversion.
- Typesafe Config to read the configuration file.
- Spark Streaming to read from the Kafka cluster.
###Steps to Install and Run ZooKeeper and Kafka on Your System
Step 1: Download Kafka
Download Kafka from here
Step 2: Extract downloaded file
tar -xzvf kafka_2.11-0.10.1.1.tgz
cd kafka_2.11-0.10.1.1
Step 3: Start Servers
Start Zookeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties
Start Kafka server:
bin/kafka-server-start.sh config/server.properties
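Once both servers are started, it can help to confirm that they are accepting connections before moving on. The sketch below assumes the default ports from the stock config files (2181 for ZooKeeper, 9092 for Kafka) and that `bash` and the coreutils `timeout` command are available. The function names `port_open` and `check_servers` are illustrative and not part of Kafka.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report whether the assumed default ports are reachable on localhost.
check_servers() {
  for entry in "ZooKeeper:2181" "Kafka:9092"; do
    name="${entry%%:*}"
    port="${entry##*:}"
    if port_open localhost "$port"; then
      echo "$name is listening on port $port"
    else
      echo "$name is NOT reachable on port $port"
    fi
  done
}

check_servers
```

If either server was configured with a different port, adjust the entries in `check_servers` accordingly.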
###Clone Project
git clone git@github.com:knoldus/activator-kafka-spark-streaming.git
cd activator-kafka-spark-streaming
bin/activator clean compile
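The download, extract, clone, and compile steps above rely on `tar`, a JDK (`java`), and `git` being on the `PATH`. As a minimal sketch, the hypothetical helper `missing_tools` below lists any of them that cannot be found; it is not part of the template.

```shell
# Print each named command that is missing from PATH.
# Returns 1 if at least one is missing, 0 otherwise.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}

missing_tools tar java git || echo "install the tools listed above before continuing"
```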
###Start Tweet Producer
bin/activator "run-main com.knoldus.demo.ProducerDemo"
This starts fetching tweets and pushes each one into the Kafka cluster.
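To check that tweets are actually arriving, one option is Kafka's bundled console consumer. The sketch below makes two assumptions that are not in this README: `KAFKA_HOME` is a hypothetical environment variable pointing at the extracted Kafka directory, and `tweets` is a placeholder topic name (use whichever topic the template's configuration file names). The helper name `peek_topic` is illustrative.

```shell
# Print the first few messages from a topic using Kafka's console consumer.
# Arguments: topic name (placeholder default "tweets"), message count (default 5).
# KAFKA_HOME is an assumed variable; it falls back to the current directory.
peek_topic() {
  topic="${1:-tweets}"
  count="${2:-5}"
  consumer="${KAFKA_HOME:-.}/bin/kafka-console-consumer.sh"
  if [ ! -x "$consumer" ]; then
    echo "console consumer not found at $consumer" >&2
    return 1
  fi
  "$consumer" --bootstrap-server localhost:9092 \
    --topic "$topic" --from-beginning --max-messages "$count"
}
```

With `KAFKA_HOME` set to the extracted `kafka_2.11-0.10.1.1` directory, `peek_topic tweets 5` should read up to five messages from the start of the topic and then exit.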
###Start Streaming
bin/activator "run-main com.knoldus.demo.SparkStreamingDemo"
This starts the Spark Streaming job, which consumes the tweets from the Kafka cluster.
For any issue, please raise a ticket on GitHub Issues.
Written in 2017 by Knoldus Software LLP
[other author/contributor lines as appropriate]

To the extent possible under law, the author(s) have dedicated all copyright and related and neighboring rights to this template to the public domain worldwide. This template is distributed without any warranty. See http://creativecommons.org/publicdomain/zero/1.0/.

[g8]: http://www.foundweekends.org/giter8/