Streaming Benchmark measures the performance of stream processing systems such as Apache Flink and Apache Spark. Three use cases are simulated: User Visit Session Analysis, Evaluation of Real-time Advertising, and Shopping Record Analysis. Raw data is generated and stored in Kafka; each stream is mapped to a streaming table, and the benchmark queries run against these tables.
Build the project:

mvn clean package
You should have Apache Kafka, Apache ZooKeeper, Apache Spark, and Blink installed on your cluster.
- Clone the project onto your master node.
- Update conf/benchmarkConf.yaml (Kafka, ZooKeeper, and benchmark properties):
streambench.zkHost ip1:2181,ip2:2181,ip3:2181...
streambench.kafka.brokerList ip1:port1,ip1:port2...
streambench.kafka.consumerGroup benchmark (default)
- Update flink/conf/benchmarkConf.yaml (Flink properties):
streambench.flink.checkpointDuration 5000
streambench.flink.timeType EventTime (use EventTime or ProcessTime)
- Update conf/dataGenHosts (the hosts on which data will be generated; generating data on the Kafka nodes is recommended):
ip1
ip2
...
- Update conf/queriesToRun (the queries to run):
q1.sql
q2.sql
q3.sql
...
- Update conf/env:
export DATAGEN_TIME=100       # running time for each query
export THREAD_PER_NODE=10     # number of threads per node used to generate data
export FLINK_HOME={FLINK_HOME}
export SPARK_HOME={SPARK_HOME}
- Copy the project to every node that will generate data (the same hosts as in conf/dataGenHosts) and make sure the master node can log in to these hosts without a password (a sketch follows below).
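For the passwordless login above, a minimal sketch (assuming the master's key pair is ~/.ssh/id_rsa and the project directory is ~/streaming-benchmark; both paths are illustrative):

# Create a key on the master if none exists yet.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Install the key on, and copy the project to, every data-generation host.
while read host; do
    ssh-copy-id "$host"
    scp -r ~/streaming-benchmark "$host":~/
done < conf/dataGenHosts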
Start ZooKeeper, Kafka, Spark, and Blink first.
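How you start these depends on your installation; with a stock Kafka distribution and the standalone cluster scripts that ship with Spark and Flink (Blink is assumed here to use the standard Flink scripts), it looks roughly like this:

# On the ZooKeeper/Kafka nodes, using the scripts bundled with Kafka:
$KAFKA_HOME/bin/zookeeper-server-start.sh -daemon $KAFKA_HOME/config/zookeeper.properties
$KAFKA_HOME/bin/kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties

# On the master, start the standalone Spark and Flink/Blink clusters:
$SPARK_HOME/sbin/start-all.sh
$FLINK_HOME/bin/start-cluster.sh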
Run the Flink benchmark: sh bin/runFlinkBenchmark.sh
Run the Spark benchmark: sh bin/runSparkBenchmark.sh
Run both the Flink and Spark benchmarks: sh bin/runAll.sh
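While a benchmark is running, you can check that the generators are actually feeding Kafka with the console tools bundled with it (a sketch; <topic> is a placeholder, and the real topic names depend on your configuration):

# List the topics on the cluster.
$KAFKA_HOME/bin/kafka-topics.sh --list --zookeeper ip1:2181

# Peek at a few raw records; substitute a topic name from the listing above.
$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server ip1:port1 \
    --topic <topic> --max-messages 5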
The results are saved to flink/result/result.log and spark/result/result.log, in a format like the following:
Finished time: 2019-10-30 19:07:26; q1.sql Runtime: 58s TPS:10709265
Finished time: 2019-10-30 19:08:37; q2.sql Runtime: 57s TPS:8061793
Finished time: 2019-10-30 19:09:51; q5.sql Runtime: 57s TPS:4979921
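Given that format, a quick sketch for pulling the query name and TPS out of a result log (assuming the whitespace-separated layout shown above):

# Fields 5 and 8 are the query file and the TPS figure.
awk '{ print $5, $8 }' flink/result/result.log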