Java Client Application (Producer, Consumer) and Setting Up Apache Kafka on Linux Ubuntu
Inside each Kafka cluster, there are several key components:
- Brokers are responsible for storing and serving the data.
- Inside brokers, there are topics: named feeds to which records are written.
- Records are the objects that represent our messages (key-value pairs).
- A partition is a subset of a topic's data and is stored on a single broker.
- A topic can have multiple partitions to allow for parallel processing of the data.
- Outside the broker, we have the applications that communicate with the broker. These applications are producers, which write data to Kafka topics, and consumers, which read data from them.
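To make the relationship between records, keys, and partitions concrete, here is a minimal sketch in plain Java. It is a toy model, not the Kafka client API: the class and method names are hypothetical, and it uses String.hashCode as a stand-in for the murmur2 hash that Kafka's default partitioner applies to the serialized key. The point it illustrates is that records with the same key land in the same partition and keep their order there.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a topic split into partitions (hypothetical, not the Kafka API). */
final class PartitionSketch {

    /** Maps a record key to a partition index in the range [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        // Assumption: String.hashCode stands in for Kafka's murmur2 hash.
        // Math.floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Appends each {key, value} record to the list of its partition, in arrival order. */
    static List<List<String>> distribute(String[][] records, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] record : records) {
            int p = partitionFor(record[0], numPartitions);
            partitions.get(p).add(record[0] + "=" + record[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"user-1", "login"}, {"user-2", "login"}, {"user-1", "logout"}
        };
        List<List<String>> partitions = distribute(records, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + ": " + partitions.get(i));
        }
    }
}
```

Both user-1 records end up in the same partition, in the order they were written, which is what lets consumers process partitions in parallel without losing per-key ordering.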
Apache Kafka's main components:
- Topics: categories to which messages are published by producers and from which they are consumed by consumers.
- Producers
- Consumers
This guide covers:
- Setting up an Apache Kafka server on AWS Ubuntu Linux (free tier). Reference: https://www.linkedin.com/pulse/kafka-aws-free-tier-steven-aranibar/
- Java client application (consumer), built as a Maven project in the NetBeans IDE. Reference: https://www.conduktor.io/kafka/complete-kafka-consumer-with-java/
- Java client application (producer), built as a Maven project in the NetBeans IDE. Reference: https://www.conduktor.io/kafka/complete-kafka-producer-with-java/
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install openjdk-8-jdk
ubuntu@ip-172-31-1-82:~$ java -version
openjdk version "1.8.0_382"
OpenJDK Runtime Environment (build 1.8.0_382-8u382-ga-122.04.1-b05)
OpenJDK 64-Bit Server VM (build 25.382-b05, mixed mode)
What is ZooKeeper?
ZooKeeper plays a critical role in a Kafka cluster by providing distributed coordination and synchronization services. It maintains the cluster's metadata and supports controller and leader elections. In current Kafka versions, consumers track their consumption progress (offsets) in an internal Kafka topic, __consumer_offsets, rather than in ZooKeeper.
Kafka installation: go to https://kafka.apache.org/downloads

We're using the binary release kafka_2.12-3.5.1.tgz. The Java client should also be compatible with other recent Kafka versions.
$ wget https://downloads.apache.org/kafka/3.5.1/kafka_2.12-3.5.1.tgz
$ tar -zxvf kafka_2.12-3.5.1.tgz
Altering the .bashrc file
ubuntu@ip-172-31-1-82:~$ vi .bashrc
At the end of the file, add the following configuration:
# adding Kafka to PATH
export PATH=/home/ubuntu/kafka_2.12-3.5.1/bin:$PATH
# Kafka and ZooKeeper environment variables
export KAFKA_HEAP_OPTS=-Xms32M
# no comma between assignments: a comma would become part of the first value
export ZK_CLIENT_HEAP=128 ZK_SERVER_HEAP=128
Validate: reload the shell configuration (for example with source ~/.bashrc), then run kafka-topics.sh from outside the Kafka directory. If its options and descriptions appear, the PATH change was successful.
ubuntu@ip-172-31-1-82:~$ kafka-topics.sh
Create the data/zookeeper/ directory from the Kafka directory:
$ mkdir -p data/zookeeper
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ ls data/
kafka zookeeper
Go to config/zookeeper.properties in the Kafka directory. Find the dataDir configuration and replace it with your newly created data folder.
dataDir=/home/ubuntu/kafka_2.12-3.5.1/data/zookeeper
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ vi bin/zookeeper-server-start.sh
Replace the default heap setting with a smaller value:
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
#export KAFKA_HEAP_OPTS="-Xmx512M -Xms512M"
export KAFKA_HEAP_OPTS="-Xms32M -Xmx64M"
fi

ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ zookeeper-server-start.sh config/zookeeper.properties
If everything goes well, you will see a message like this:
INFO binding to port 0.0.0.0/0.0.0.0:2181

Remember to note the full path of the new directory before opening the config file in a text editor.
ubuntu@ip-172-31-1-82:~$ cd kafka_2.12-3.5.1/
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ mkdir data/kafka
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ vi config/server.properties
Scroll to log.dirs and change it to your new Kafka data directory:
log.dirs=/home/ubuntu/kafka_2.12-3.5.1/data/kafka
We'll edit the kafka-server-start.sh file found in the Kafka bin/ directory.
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ vi bin/kafka-server-start.sh
Update the heap setting. As the commented-out line below shows, the default heap size is 1 GB, which our t2.micro free-tier instance (1 GiB of RAM) cannot support. Change it to export KAFKA_HEAP_OPTS="-Xms32M -Xmx64M", so that the block reads:
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
#export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
export KAFKA_HEAP_OPTS="-Xms32M -Xmx64M"
fi
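Note that the if block in the start script only takes effect when KAFKA_HEAP_OPTS is not already set in the environment. To see what a given -Xmx value means from inside a JVM, a small check like the following can help. It is only a sketch: the class name HeapCheck is hypothetical and it is not part of Kafka.

```java
/** Prints the maximum heap the current JVM will use (hypothetical helper, not part of Kafka). */
final class HeapCheck {

    /** Returns the JVM's maximum heap size in MiB, as reported by the runtime. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Compile it with javac HeapCheck.java and run it with java -Xmx64M HeapCheck. The reported value should be close to 64, although the exact number can differ slightly depending on the garbage collector.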

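ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ kafka-server-start.sh config/server.properties
Run this command from the Kafka directory. If it is run from another folder, the relative path config/server.properties cannot be resolved and Kafka exits with an error like this: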
[2024-02-28 01:31:20,460] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2024-02-28 01:31:20,847] ERROR Exiting Kafka due to fatal exception (kafka.Kafka$)
java.nio.file.NoSuchFileException: config/server.properties
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:361)
at java.nio.file.Files.newByteChannel(Files.java:407)
at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
at java.nio.file.Files.newInputStream(Files.java:152)
at org.apache.kafka.common.utils.Utils.loadProps(Utils.java:683)
at org.apache.kafka.common.utils.Utils.loadProps(Utils.java:670)
at kafka.Kafka$.getPropsFromArgs(Kafka.scala:51)
at kafka.Kafka$.main(Kafka.scala:90)
at kafka.Kafka.main(Kafka.scala)
To fix this, change into the Kafka directory and start the server again:
ubuntu@ip-172-31-1-82:~$ ls
kafka_2.12-3.5.1 kafka_2.12-3.5.1.tgz
ubuntu@ip-172-31-1-82:~$ cd kafka_2.12-3.5.1/
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ ls
LICENSE NOTICE bin config data libs licenses logs site-docs
ubuntu@ip-172-31-1-82:~/kafka_2.12-3.5.1$ kafka-server-start.sh config/server.properties

Both ZooKeeper and Kafka are now running. You might not be able to connect from the Java client because of the following setting in config/server.properties:
# Listener name, hostname and port the broker will advertise to clients.
# If not set, it uses the value for "listeners".
#advertised.listeners=PLAINTEXT://your.host.name:9092
advertised.listeners=PLAINTEXT://54.206.54.16:9092
Replace the IP address with your server's public IP address, and check in the AWS Security Group that port 9092 is open.
One indication of this problem: creating a topic times out and the topic is not created.
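Before debugging the Kafka client itself, it can help to confirm that the broker port is reachable at all from the client machine. The following is a minimal sketch using only the Java standard library. The class name BrokerPortCheck and the 3-second timeout are assumptions for illustration. It only tests whether a TCP connection can be opened; it does not verify that advertised.listeners is set correctly.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Checks whether a TCP port accepts connections (hypothetical helper, not part of Kafka). */
final class BrokerPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions; pass your server's public IP and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9092;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If the check fails from your workstation but succeeds on the server itself, the Security Group rule for port 9092 is the first thing to inspect.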

Create the demo_java topic with three partitions:
kafka-topics.sh --bootstrap-server localhost:9092 --topic demo_java --create --partitions 3 --replication-factor 1
To observe the output of our Java producer application, open the Kafka consumer CLI, kafka-console-consumer using the command:
ubuntu@ip-172-31-1-82:~$ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic demo_java
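For the Java producer to reach this broker, its configuration has to point at the advertised address. The sketch below builds such a configuration with only the standard library, assuming the producer sends String keys and values; the actual project may set more options. The class and method names are hypothetical. The property keys and the StringSerializer class name are the standard Kafka client settings.

```java
import java.util.Properties;

/** Builds a minimal producer configuration (hypothetical helper; no Kafka dependency needed). */
final class ProducerConfigSketch {

    /** Returns producer settings for String keys and values against the given broker address. */
    static Properties producerProperties(String bootstrapServers) {
        Properties props = new Properties();
        // Address of the broker, matching advertised.listeners on the server.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Assumption: keys and values are plain strings.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // In a real client these properties would be passed to the KafkaProducer
        // constructor from the kafka-clients library, which is not used here.
        Properties props = producerProperties("54.206.54.16:9092");
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```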
- STEP 1. Create a workspace
- STEP 2. Import existing Maven project from the cloned repo. 

