cynthia1wang/jdbcsink

jdbcsink

This test program uses Spark Structured Streaming to receive JSON messages from Kafka, then counts the DNS messages for each person in one-minute windows. (Each source IP address is treated as one person.) Finally, the results are inserted into a MySQL database.
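
The pipeline described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the object name DNSstatJobSketch, the JSON field srcIp, the topic dns, the table dns_counts, and all host names and credentials are hypothetical placeholders, and the Kafka record timestamp is assumed to serve as the event time.

```scala
import java.sql.{Connection, DriverManager, PreparedStatement, Timestamp}

import org.apache.spark.sql.{ForeachWriter, Row, SparkSession}
import org.apache.spark.sql.functions.{col, from_json, window}
import org.apache.spark.sql.types.{StringType, StructType}

object DNSstatJobSketch {
  // Hypothetical JSON layout: {"srcIp": "10.0.0.1", ...}; only srcIp is read.
  val dnsSchema: StructType = new StructType().add("srcIp", StringType)

  // Writes each (window_start, srcIp, count) row to MySQL over plain JDBC.
  class MySqlSink(url: String, user: String, password: String)
      extends ForeachWriter[Row] {
    private var conn: Connection = _
    private var stmt: PreparedStatement = _

    override def open(partitionId: Long, version: Long): Boolean = {
      conn = DriverManager.getConnection(url, user, password)
      // REPLACE overwrites an earlier count for the same key; this assumes a
      // primary key on (window_start, src_ip) in the hypothetical table.
      stmt = conn.prepareStatement(
        "REPLACE INTO dns_counts (window_start, src_ip, cnt) VALUES (?, ?, ?)")
      true
    }

    override def process(row: Row): Unit = {
      stmt.setTimestamp(1, row.getAs[Timestamp]("window_start"))
      stmt.setString(2, row.getAs[String]("srcIp"))
      stmt.setLong(3, row.getAs[Long]("count"))
      stmt.executeUpdate()
    }

    override def close(errorOrNull: Throwable): Unit = {
      if (stmt != null) stmt.close()
      if (conn != null) conn.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("DNSstatJobSketch").getOrCreate()

    // Kafka source: each record carries a binary value and a timestamp column.
    val messages = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder
      .option("subscribe", "dns")                          // placeholder topic
      .load()

    // Parse the JSON payload, then count records per source IP per minute.
    val counts = messages
      .select(
        from_json(col("value").cast("string"), dnsSchema).as("dns"),
        col("timestamp"))
      .groupBy(window(col("timestamp"), "1 minute"), col("dns.srcIp").as("srcIp"))
      .count()
      .select(col("window.start").as("window_start"), col("srcIp"), col("count"))

    // "update" emits only the rows whose count changed in the latest trigger.
    val query = counts.writeStream
      .outputMode("update")
      .foreach(new MySqlSink("jdbc:mysql://localhost:3306/test", "user", "password"))
      .start()

    query.awaitTermination()
  }
}
```

The sketch opens one JDBC connection per partition and trigger, which keeps the writer simple; a production job would more likely batch the statements or reuse connections.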

Building on Unix with sbt

The project is built with Scala 2.11.8. If you use a different version, edit build.sbt and change:

scalaVersion := "your version"
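
For orientation, a build.sbt consistent with the versions and jar name used in this README might look like the fragment below. It is an assumption-based sketch, not a copy of the repository's file: the name and version are inferred from the jar path dnsjob-project_2.11-1.0.jar, and the dependency list and "provided" scopes are guesses based on the jars passed to spark-submit.

```scala
// Hypothetical build.sbt; values inferred from this README, not from the repo.
name := "dnsjob-project"

version := "1.0"

// Change this line if your Scala version differs.
scalaVersion := "2.11.8"

// Marked "provided" on the assumption that the cluster and --jars supply them.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"            % "2.1.1" % "provided",
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.1.1" % "provided"
)
```

Because %% appends the Scala binary version to each artifact name, changing scalaVersion to another binary version also changes the _2.11 suffix of the packaged jar and of the dependencies that are resolved.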

Prerequisites:

spark-2.1.1-bin-hadoop2.6

KAFKA-2.1.1-1.2.1.1.po.18

MySQL version: 14.14

Linux: CentOS 6.5 (Final)

Build instructions:

$ git clone https://github.com/cynthia1wang/jdbcsink.git

$ cd jdbcsink/

$ sbt clean package

Run the test program:

$ spark-submit --deploy-mode cluster --master yarn --supervise --driver-class-path mysql-connector-java-5.1.42-bin.jar --class DNSstatJob --jars spark-sql-kafka-0-10_2.11-2.1.1.jar,spark-streaming-kafka-0-10-assembly_2.11-2.1.1.jar,mysql-connector-java-5.1.42-bin.jar target/scala-2.11/dnsjob-project_2.11-1.0.jar
