Realtime Dashboard

Real-time report dashboard with Apache Kafka, Apache Spark Streaming and Node.js

Getting started

1. Set up the environment

Clone this project

git clone https://github.com/duyetdev/realtime-dashboard.git
cd realtime-dashboard/

# Setup env
./bin/env.sh

Download Apache Spark 2.2.0

cd $RRD_HOME
wget http://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
tar -xzf spark-2.2.0-bin-hadoop2.7.tgz
export SPARK_HOME=$RRD_HOME/spark-2.2.0-bin-hadoop2.7

Download Kafka

cd $RRD_HOME
wget http://mirrors.viethosting.com/apache/kafka/1.0.0/kafka_2.11-1.0.0.tgz
tar -xzf kafka_2.11-1.0.0.tgz
export KAFKA_HOME=$RRD_HOME/kafka_2.11-1.0.0

Install Node.js packages

npm install

2. Start Kafka Server

Start Zookeeper and Kafka

cd $RRD_HOME/kafka_2.11-1.0.0

# Start zookeeper
bin/zookeeper-server-start.sh config/zookeeper.properties &

# Start Kafka
bin/kafka-server-start.sh config/server.properties &

Create Topics

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic website-collect
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic website-report

We can't access Kafka directly via HTTP, so we start the Kafka proxy:

node nodejs-kafka-proxy/server.js

# [2017-11-16 14:24:03,008] INFO Accepted socket connection from /127.0.0.1:42984 (org.apache.zookeeper.server.NIOServerCnxnFactory)
# [2017-11-16 14:24:03,010] WARN Connection request from old client /127.0.0.1:42984; will be dropped if server is in r-o mode (org.apache.zookeeper.server.ZooKeeperServer)
# [2017-11-16 14:24:03,010] INFO Client attempting to establish new session at /127.0.0.1:42984 (org.apache.zookeeper.server.ZooKeeperServer)
# [2017-11-16 14:24:03,025] INFO Established session 0x15fc38ffab40011 with negotiated timeout 30000 for client /127.0.0.1:42984 (org.apache.zookeeper.server.ZooKeeperServer)
# Example app listening on port 3000!
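
For reference, the proxy simply turns an HTTP request into a Kafka produce call on the topic named in the URL. The real implementation lives in nodejs-kafka-proxy/server.js (Node.js, not shown here); the snippet below is only a rough sketch of the same idea in Python, assuming Flask and kafka-python, to illustrate what the proxy does conceptually.

# Illustrative sketch only -- the repo's real proxy is nodejs-kafka-proxy/server.js (Node.js).
# This hypothetical Python version (Flask + kafka-python) shows the same idea:
# an HTTP GET on /proxy/<topic>?message=... becomes a Kafka produce call on <topic>.
from flask import Flask, request
from kafka import KafkaProducer

app = Flask(__name__)
producer = KafkaProducer(bootstrap_servers="localhost:9092")

@app.route("/proxy/<topic>")
def proxy(topic):
    message = request.args.get("message", "")
    producer.send(topic, message.encode("utf-8"))  # value is the raw message body
    producer.flush()                               # block until the broker acknowledges
    return "OK"

if __name__ == "__main__":
    app.run(port=3000)  # same port the Node proxy listens on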

Test Kafka Producer and Consumer (Optional)

Open two terminals:

# Terminal 1
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic website-collect
This is a message
This is another message
{"client_id": "blog.duyet.net", "time": "1510736940", "event": "view", "ip":"1.2.3.4", "UA": "Chrome"}
{"client_id": "blog.duyet.net", "time": "1510736940", "event": "click", "ip":"1.2.3.5", "UA": "Firefox"}
# Terminal 2
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic website-collect --from-beginning
This is a message
This is another message
{"client_id": "blog.duyet.net", "time": "1510736940", "event": "view", "ip":"1.2.3.4", "UA": "Chrome"}
{"client_id": "blog.duyet.net", "time": "1510736940", "event": "click", "ip":"1.2.3.5", "UA": "Firefox"}

Test the proxy server:

http://localhost:3000/proxy/website-collect?message=hello

You should see the message appear in the Kafka console consumer from the previous step.
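
The same check can be scripted. This is a small sketch assuming the requests package is installed; it calls the proxy route shown above, relying only on the /proxy/<topic>?message=... convention from that URL.

# Scripted equivalent of opening the URL above (assumes the requests package).
# The proxy forwards the message to the website-collect topic, so it should show up
# in the console consumer started in the previous step.
import requests

resp = requests.get(
    "http://localhost:3000/proxy/website-collect",
    params={"message": '{"client_id": "blog.duyet.net", "event": "view", "ip": "1.2.3.4", "UA": "Chrome"}'},
)
print(resp.status_code, resp.text)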

3. Apache Spark Streaming

Submit the Spark Streaming script

# Usage: spark_server.py <zk> <input_topic> <output_topic>

$SPARK_HOME/bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.2 \
    $RRD_HOME/spark/spark_server.py \
    localhost:2181 website-collect website-report
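
The actual streaming logic lives in spark/spark_server.py, which is the reference. The sketch below is only an illustration of the kind of job that matches the usage line above; it assumes the events are the JSON documents produced earlier, that per-event-type counts are what the dashboard reports, and that kafka-python is available to write results back to the output topic.

# Illustrative sketch only -- the reference implementation is spark/spark_server.py.
# Assumed behaviour: consume JSON events from <input_topic> through the 0-8 Kafka
# receiver, count events per type in 5-second micro-batches, and publish the counts
# to <output_topic> with kafka-python for the dashboard to read.
import json
import sys

from kafka import KafkaProducer
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils


def parse_event(raw):
    # Best-effort JSON parse; plain-text test messages are counted as "unknown".
    try:
        return json.loads(raw).get("event", "unknown")
    except (ValueError, AttributeError):
        return "unknown"


def publish_counts(rdd, output_topic):
    # Runs on the driver for each batch; ship the (event, count) pairs to Kafka.
    counts = dict(rdd.collect())
    if not counts:
        return
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send(output_topic, json.dumps(counts).encode("utf-8"))
    producer.flush()


if __name__ == "__main__":
    zk, input_topic, output_topic = sys.argv[1:4]

    sc = SparkContext(appName="realtime-dashboard")
    ssc = StreamingContext(sc, 5)  # 5-second micro-batches

    # DStream of (key, value) pairs; the value is the raw message sent via the proxy
    stream = KafkaUtils.createStream(ssc, zk, "rrd-consumer", {input_topic: 1})

    counts = (stream.map(lambda kv: parse_event(kv[1]))
                    .map(lambda event: (event, 1))
                    .reduceByKey(lambda a, b: a + b))

    counts.foreachRDD(lambda rdd: publish_counts(rdd, output_topic))

    ssc.start()
    ssc.awaitTermination()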