This example shows how to use components such as Kafka and Vertx for real-time analysis of data streams.

Additional Information

Example Setup (idea)

  • Various components send Avro-serialized log or data messages to Kafka
  • Camus, as a Kafka consumer, stores the messages in HDFS for further MapReduce processing
  • This Vertx project, also a Kafka consumer, processes the same messages and acts as the stream processing component
  • The processed data is shown on a D3.js-based dashboard

Stream Processing

Stream processing is the process of extracting knowledge structures from continuous, rapid, and unbounded data.

This project uses:

  • StreamLib: Great set of implemented stream processing algorithms
  • Top-k: Efficient Computation of Frequent and Top-k Elements in Data Streams (paper)
  • Adaptive Counting: Fast and accurate counting of unique elements (paper)
  • Sliding Window: Only considering data in a certain time window
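
To make the Top-k entry above more concrete, here is a minimal sketch of the Space-Saving idea described in the cited paper: keep a fixed number of counters, and when a new item arrives while all counters are taken, replace the item with the smallest count and let the newcomer inherit that count as its error bound. This is not StreamLib's implementation and not code from this project. The class name SpaceSavingTopK and its methods are hypothetical, and the linear scan for the minimum is a simplification (the paper uses a dedicated structure for constant-time updates).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical Space-Saving sketch: tracks at most `capacity` items with approximate counts. */
final class SpaceSavingTopK<T> {

    /** A monitored item, its estimated count, and the maximum possible overestimation. */
    static final class Counter<T> {
        final T item;
        long count;
        final long error;

        Counter(T item, long count, long error) {
            this.item = item;
            this.count = count;
            this.error = error;
        }
    }

    private final int capacity;
    private final Map<T, Counter<T>> counters = new HashMap<>();

    SpaceSavingTopK(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    /** Records one occurrence of the given item. */
    void offer(T item) {
        Counter<T> existing = counters.get(item);
        if (existing != null) {
            existing.count++;
        } else if (counters.size() < capacity) {
            counters.put(item, new Counter<>(item, 1, 0));
        } else {
            // All counters are in use: evict the item with the smallest count.
            // The newcomer starts at min + 1 and may be overestimated by up to min.
            Counter<T> min = null;
            for (Counter<T> c : counters.values()) {
                if (min == null || c.count < min.count) {
                    min = c;
                }
            }
            counters.remove(min.item);
            counters.put(item, new Counter<>(item, min.count + 1, min.count));
        }
    }

    /** Returns up to k monitored items, highest estimated count first. */
    List<Counter<T>> topK(int k) {
        List<Counter<T>> sorted = new ArrayList<>(counters.values());
        sorted.sort(Comparator.comparingLong((Counter<T> c) -> c.count).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }
}
```

Memory stays bounded by the capacity no matter how many distinct items the stream contains. The price is that counts of evicted-and-replaced items are upper bounds, which is why each counter carries its error value.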


Verticles

  • MainVerticle: Starts the following Verticles
  • KafkaVerticle: Kafka consumer Verticle, consumes the configured set of Topics
  • WebSocketVerticle: Manages frontend WebSocket connections
  • DataVerticle: Processes received messages and computes the messages sent to the frontend
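
A processing step such as the one the DataVerticle performs has to keep its state bounded while messages keep arriving. One common way to do that is a sliding window, as listed under Stream Processing. The following is only a sketch under assumptions: it uses plain Java without any Vertx or Kafka API, it is not taken from this project, the class name SlidingWindowCounter is hypothetical, and it assumes timestamps arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: counts events that fall within the last windowMillis milliseconds. */
final class SlidingWindowCounter {
    private final long windowMillis;
    // Timestamps of events still inside the window, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowCounter(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("windowMillis must be > 0");
        }
        this.windowMillis = windowMillis;
    }

    /** Records an event; callers are assumed to pass non-decreasing timestamps. */
    void record(long timestampMillis) {
        timestamps.addLast(timestampMillis);
        evictExpired(timestampMillis);
    }

    /** Returns the number of events in the interval (nowMillis - windowMillis, nowMillis]. */
    int countAt(long nowMillis) {
        evictExpired(nowMillis);
        return timestamps.size();
    }

    // Drops every timestamp that is no longer inside the window ending at nowMillis.
    private void evictExpired(long nowMillis) {
        long cutoff = nowMillis - windowMillis;
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= cutoff) {
            timestamps.removeFirst();
        }
    }
}
```

A component could call record for each consumed message and publish countAt on a timer, so the dashboard only ever reflects recent traffic.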

Starting the module

vertx uninstall eu.fakod~my-vertx-realtime-module~1.0-SNAPSHOT

vertx runmod eu.fakod~my-vertx-realtime-module~1.0-SNAPSHOT -conf config.json