# [Kafka](https://kafka.apache.org/)

## Install

In [1]:
!pip install kafka-python



In [2]:
!curl -sSOL https://archive.apache.org/dist/kafka/3.0.0/kafka_2.13-3.0.0.tgz
!tar -xzf kafka_2.13-3.0.0.tgz

## Launch

Start the ZooKeeper server and the Kafka broker

In [3]:
!./kafka_2.13-3.0.0/bin/zookeeper-server-start.sh -daemon ./kafka_2.13-3.0.0/config/zookeeper.properties
!./kafka_2.13-3.0.0/bin/kafka-server-start.sh -daemon ./kafka_2.13-3.0.0/config/server.properties
!echo "Waiting for 10 secs until kafka and zookeeper services are up and running"
!sleep 10

Waiting for 10 secs until kafka and zookeeper services are up and running
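
A fixed 10-second sleep can be too short on a slow machine and longer than needed on a fast one. As a rough alternative, the sketch below polls a TCP port until something accepts connections. The helper name `wait_for_port` and its parameters are hypothetical, not part of Kafka or kafka-python, and an open port only suggests the service is up; it does not prove the broker is ready to serve requests.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the connect timed out); retry shortly.
            time.sleep(interval)
    return False
```

In this notebook it could be called as `wait_for_port('localhost', 9092)` for the broker, and with port 2181 for ZooKeeper, assuming the default `clientPort` in `zookeeper.properties` is unchanged.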


Check that both processes are running

In [4]:
!ps -ef | grep kafka

root        8989       1  0 06:12 ?        00:00:07 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=true -Xlog:gc*:file=/content/kafka_2.13-3.0.0/bin/../logs/zookeeper-gc.log:time,tags:filecount=10,filesize=100M -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/content/kafka_2.13-3.0.0/bin/../logs -Dlog4j.configuration=file:./kafka_2.13-3.0.0/bin/../config/log4j.properties -cp /content/kafka_2.13-3.0.0/bin/../libs/activation-1.1.1.jar:/content/kafka_2.13-3.0.0/bin/../libs/aopalliance-repackaged-2.6.1.jar:/content/kafka_2.13-3.0.0/bin/../libs/argparse4j-0.7.0.jar:/content/kafka_2.13-3.0.0/bin/../libs/audience-annotations-0.5.0.jar:/content/kafka_2.13-3.0.0/bin/../libs/commons-cli-1.4.jar:/content/kafka_2.13-3.0.0/bin/../libs/commons-lang3-3.8.1.jar:/content/kafka_2.13

## Create Topic

In [5]:
!./kafka_2.13-3.0.0/bin/kafka-topics.sh --create --bootstrap-server 127.0.0.1:9092 --replication-factor 1 --partitions 1 --topic numberstream

Error while executing topic command : Topic 'numberstream' already exists.
[2021-11-05 07:15:59,070] ERROR org.apache.kafka.common.errors.TopicExistsException: Topic 'numberstream' already exists.
 (kafka.admin.TopicCommand$)

This error only means that `numberstream` was already created by an earlier run of the cell; the existing topic can be used as-is.


## Create Producer

In [6]:
from time import sleep
from json import dumps
from kafka import KafkaProducer

In [7]:
# Serialize each message value to UTF-8 encoded JSON before sending.
producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         value_serializer=lambda x: dumps(x).encode('utf-8'))

In [8]:
# Send the numbers 0-9 as JSON messages, one every two seconds.
for e in range(10):
  data = {'number': e}
  producer.send('numberstream', value=data)
  sleep(2)
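
`producer.send()` is asynchronous: it returns a future and the record may still be sitting in a buffer when the loop ends. The sketch below shows one way to wait for acknowledgements instead of relying on `sleep`. The function name `send_numbers` is hypothetical; it only assumes a producer object with the `send()` and `flush()` methods that `KafkaProducer` provides.

```python
def send_numbers(producer, topic, count, timeout=10):
    """Send {'number': i} for i in range(count) and wait for each acknowledgement."""
    acks = []
    for i in range(count):
        future = producer.send(topic, value={'number': i})
        # get() blocks until the broker acknowledges the record,
        # or raises if delivery failed or the timeout expired.
        acks.append(future.get(timeout=timeout))
    # flush() blocks until every buffered record has been sent.
    producer.flush()
    return acks
```

With the producer created above, `send_numbers(producer, 'numberstream', 10)` would send the same ten messages as the loop, without the two-second pauses.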

## Create Consumer

In [9]:
from kafka import KafkaConsumer
from json import loads

In [10]:
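# Subscribe to 'numberstream' as part of the consumer group 'my-group'.
consumer = KafkaConsumer('numberstream',
                         bootstrap_servers=['localhost:9092'],
                         # With no committed offset for the group, start from the oldest record.
                         auto_offset_reset='earliest',
                         # Commit consumed offsets automatically every second.
                         auto_commit_interval_ms=1000,
                         enable_auto_commit=True,
                         group_id='my-group',
                         # Decode each message value from UTF-8 JSON bytes.
                         value_deserializer=lambda x: loads(x.decode('utf-8')))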
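
The producer's `value_serializer` and the consumer's `value_deserializer` must be inverses of each other, since Kafka itself only transports bytes. The sketch below checks that round trip without a broker; the names `serialize` and `deserialize` are illustrative stand-ins for the two lambdas used in this notebook.

```python
from json import dumps, loads


def serialize(value):
    # Same logic as the producer's value_serializer.
    return dumps(value).encode('utf-8')


def deserialize(raw):
    # Same logic as the consumer's value_deserializer.
    return loads(raw.decode('utf-8'))


payload = serialize({'number': 7})
print(payload)               # b'{"number": 7}'
print(deserialize(payload))  # {'number': 7}
```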

In [11]:
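# Iterating over the consumer blocks while waiting for new messages,
# so this loop runs until the cell is interrupted.
for message in consumer:
  print(f'Received {message.value}')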

Received {'number': 7}
Received {'number': 8}
Received {'number': 9}
Received {'number': 0}
Received {'number': 1}
Received {'number': 2}
Received {'number': 3}
Received {'number': 4}
Received {'number': 5}
Received {'number': 6}
Received {'number': 7}
Received {'number': 8}
Received {'number': 9}


KeyboardInterrupt: ignored
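
The loop above never ends on its own, which is why the cell had to be interrupted. One way to read a bounded number of messages is to slice the iterator, as in the sketch below. The helper name `take_values` is hypothetical, and it works with any iterable of objects that have a `.value` attribute. With a real `KafkaConsumer` it still blocks until `n` messages have arrived, unless the consumer was created with `consumer_timeout_ms`, which makes iteration stop after that many milliseconds without a message.

```python
from itertools import islice


def take_values(consumer, n):
    """Return the values of at most the next n messages from an iterable consumer."""
    # islice stops after n items, so the loop ends without a manual interrupt.
    return [message.value for message in islice(consumer, n)]
```

For example, `take_values(consumer, 10)` would collect the ten numbers sent earlier into a list.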

## Run Producer & Consumer

In [12]:
for e in range(10):
  data = {'number': e}
  producer.send('numberstream', value=data)
  sleep(2)

  # Read a single message, then leave the consumer loop so the next number can be sent.
  for message in consumer:
    print(f'Received {message.value}')
    break

Received {'number': 0}
Received {'number': 1}
Received {'number': 2}
Received {'number': 3}
Received {'number': 4}
Received {'number': 5}
Received {'number': 6}
Received {'number': 7}
Received {'number': 8}
Received {'number': 9}
