aceiro/kafka-tutorial

Kafka Tutorial

This is a step-by-step Kafka tutorial that takes you from beginner to advanced level in Kafka. It was inspired by the Apache Kafka course, which I highly recommend.

The example demonstrates some important features of Apache Kafka. Here, a producer and a consumer are integrated to move data from the Twitter API to Elasticsearch, using filters, streams and other "kafkaesque" features.

This repository contains the structure below:

  • kafka-basics
  • kafka-producer-twitter
  • kafka-consumer-elasticsearch
  • kafka-streams-filter-twitter

To build this repository you will need:

To make sure that you have all the dependencies, run this command in the root folder:

mvn clean package
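
The build presumably needs a JDK and Apache Maven on the PATH. This is an assumption inferred from the `mvn` command, since the list of requirements is not shown here. A minimal sketch of a pre-flight check, where `require_tool` is a hypothetical helper name:

```shell
# require_tool NAME: report whether NAME is available on PATH.
# Hypothetical helper, not part of this repository.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed prerequisites for `mvn clean package`: a JDK and Maven.
require_tool java || echo "install a JDK before building"
require_tool mvn || echo "install Apache Maven before building"
```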

Learning Kafka from this tutorial

Kafka-basics

Twitter + Kafka-consumer-elasticsearch

Basic CLI to start in Kafka

  1. Installing Apache Kafka with wget
 $ wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.12-3.0.0.tgz
 $ tar xzvf kafka_2.12-3.0.0.tgz
 $ cd kafka_2.12-3.0.0
 $ export PATH="$PWD/bin:$PATH" # the Kafka scripts live in bin/; put them on PATH for this shell session
 $ kafka-topics.sh # checking the installation:
                   # it prints the help text of the kafka-topics.sh command
  1. Starting ZooKeeper and the Kafka broker
 $ zookeeper-server-start.sh ~/kafka_2.12-3.0.0/config/zookeeper.properties # start ZooKeeper first; it runs in the foreground
 $ kafka-server-start.sh ~/kafka_2.12-3.0.0/config/server.properties # then start the broker in a second terminal
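
Both start commands stay in the foreground, so each needs its own terminal. As an alternative sketch, the start scripts accept a `-daemon` flag that runs them in the background, and the distribution ships matching stop scripts. `KAFKA_HOME`, `ZK_WAIT_SECONDS`, `start_kafka` and `stop_kafka` are names chosen here for illustration, and the pause between the two starts is an arbitrary guess rather than a tuned value.

```shell
# Assumed install location, matching the paths used in this tutorial.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka_2.12-3.0.0}"

# Start ZooKeeper first, then the broker, both in the background.
start_kafka() {
  zookeeper-server-start.sh -daemon "$KAFKA_HOME/config/zookeeper.properties"
  sleep "${ZK_WAIT_SECONDS:-5}"   # arbitrary pause to let ZooKeeper come up
  kafka-server-start.sh -daemon "$KAFKA_HOME/config/server.properties"
}

# Stop in the reverse order: broker first, then ZooKeeper.
stop_kafka() {
  kafka-server-stop.sh
  zookeeper-server-stop.sh
}

# Usage (the Kafka bin/ directory must be on PATH):
#   start_kafka
#   stop_kafka
```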
  1. Creating a first topic
 $ kafka-topics.sh --bootstrap-server localhost:9092 --topic first_topic --create --partitions 3 --replication-factor 1
                                      # this command creates a new topic with 3 partitions and a replication factor of 1
                                      # (the replication factor cannot exceed the number of brokers, so 1 is the maximum with a single broker)
                                      # The ISR (in-sync replicas) is the set of replicas of a partition that are "in sync" with the partition leader.
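
To inspect the partitions, leaders, replicas and ISR of a topic, `kafka-topics.sh` also has a `--describe` action. A small sketch, where `describe_topic` and `BOOTSTRAP_SERVER` are illustrative names and the broker address is assumed to be the one used throughout this tutorial:

```shell
# Assumed default; override BOOTSTRAP_SERVER for a different broker address.
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# describe_topic TOPIC: print the leader, replicas and ISR of each partition.
describe_topic() {
  kafka-topics.sh --bootstrap-server "$BOOTSTRAP_SERVER" --describe --topic "$1"
}

# Usage (requires a running broker):
#   describe_topic first_topic
```

With a single broker and a replication factor of 1, each partition should list the same broker as its leader, its only replica and its only ISR member.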
  1. Listing topics
 $ kafka-topics.sh --bootstrap-server localhost:9092 --list
  1. Starting a consumer from the CLI
 $ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic first_topic
  1. Starting a producer from the CLI
 $ kafka-console-producer.sh --bootstrap-server localhost:9092 --topic first_topic
  1. Replaying data from the beginning of the topic
 $ kafka-console-consumer.sh --topic first_topic --bootstrap-server localhost:9092 --from-beginning # starts at the earliest offset when the consumer has no committed offset yet
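
With `--from-beginning` the consumer keeps running after it has caught up. To replay only the first few records and then exit, the console consumer also accepts `--max-messages`. The sketch below wraps that in a hypothetical `peek_topic` helper; the default of 10 records is an arbitrary choice.

```shell
# Assumed default; override BOOTSTRAP_SERVER for a different broker address.
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# peek_topic TOPIC [COUNT]: print the first COUNT records (default 10), then exit.
peek_topic() {
  kafka-console-consumer.sh --bootstrap-server "$BOOTSTRAP_SERVER" \
    --topic "$1" --from-beginning --max-messages "${2:-10}"
}

# Usage (requires a running broker):
#   peek_topic first_topic 5
```

If the topic holds fewer records than requested, the consumer waits for more until it is interrupted with Ctrl+C.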
  1. Describing a consumer group (topics, partitions, offsets and lag)
 $ kafka-consumer-groups.sh --bootstrap-server 127.0.0.1:9092 --group kafka-demo-elasticsearch --describe
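
The `--describe` output reports a lag per partition, that is, the difference between the log end offset and the offset the group has committed. As a rough sketch, the per-partition values can be summed with `awk`. `total_lag` is a hypothetical helper, and it assumes the output has a header row containing a `LAG` column, which may differ between Kafka versions.

```shell
# total_lag: read `kafka-consumer-groups.sh --describe` output on stdin
# and print the sum of the LAG column over all partitions.
total_lag() {
  awk '
    # Locate the LAG column by name in the header row.
    !col && /LAG/ { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
    # Add up numeric lag values; rows showing "-" (no committed offset) are skipped.
    col && $col ~ /^[0-9]+$/ { sum += $col }
    END { print sum + 0 }
  '
}

# Usage (requires a running broker):
#   kafka-consumer-groups.sh --bootstrap-server 127.0.0.1:9092 \
#     --group kafka-demo-elasticsearch --describe | total_lag
```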
  1. Resetting a group's offsets to replay data
 $ kafka-consumer-groups.sh --bootstrap-server 127.0.0.1:9092 --group kafka-demo-elasticsearch --reset-offsets --to-earliest --topic tweets_topic --execute # stop the group's consumers first: offsets can only be reset while the group is inactive
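
Because `--execute` rewrites the committed offsets immediately, it can help to preview the change first. `kafka-consumer-groups.sh` supports `--dry-run` for that, which only prints the offsets that would be set. A minimal sketch, where `reset_to_earliest` and `BOOTSTRAP_SERVER` are illustrative names:

```shell
# Assumed default; override BOOTSTRAP_SERVER for a different broker address.
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-127.0.0.1:9092}"

# reset_to_earliest GROUP TOPIC [MODE]
# MODE is --dry-run (the default here, preview only) or --execute (apply).
reset_to_earliest() {
  kafka-consumer-groups.sh --bootstrap-server "$BOOTSTRAP_SERVER" \
    --group "$1" --topic "$2" --reset-offsets --to-earliest "${3:---dry-run}"
}

# Usage (requires a running broker and an inactive consumer group):
#   reset_to_earliest kafka-demo-elasticsearch tweets_topic            # preview
#   reset_to_earliest kafka-demo-elasticsearch tweets_topic --execute  # apply
```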