# Kafi: Kafka Superpowers for Your Jupyter Notebook and Python
<img src="pix/kafka.jpg" style="width: 30%; height: 30%"/>
<img src="pix/jupyter.jpg" style="width: 30%; height: 30%"/>

### Ralph Debusmann
##### `ralph.debusmann@mgb.ch`

<img src="pix/migros.png" style="width: 20%; height: 20%"/>


# Agenda

* Part I: The Birth of Kafi

* Part II: Three Paradigms for Using Kafi
  * Shell/Python interpreter
  * Jupyter Notebooks
  * Code (Microservices, FaaS, Agents...)

* Part III: Use Cases for Kafi
  * Kafka Administration
  * Schema Registry Administration
  * Kafka Backups incl. Kafka Emulation
  * Simple Stream Processing
  * Building a Bridge from Kafka to Pandas Dataframes and Files

# Part I: The Birth of Kafi


<img src="pix/birth.jpg" style="width: 35%; height: 35%"/>


What do you reach for if you just want to create a topic on Kafka, list topics, produce some messages, consume some messages, or search for messages?

The answer is often:
* kafkacat/kcat
* standard Kafka commandline tools (kafka-console-producer, kafka-console-consumer...)


It works, and it has for a long time. But how?

## Still the State-of-the-Art Developer Experience

### List Topics

```
kcat -b localhost:9092 -L
```


```
kafka-topics --bootstrap-server localhost:9092 --list
```

### Create Topics

(not possible with kcat)


```
kafka-topics --bootstrap-server localhost:9092 --topic topic_json --create
```


### Produce Messages

```
kcat -b localhost:9092 -t topic_json -P -K ,

123,{"bla":123}
456,{"bla":456}
789,{"bla":789}
```


```
kafka-console-producer --bootstrap-server localhost:9092 --topic topic_json --property parse.key=true --property key.separator=','

123,{"bla":123}
456,{"bla":456}
789,{"bla":789}
```
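
Both commands above read each input line as `key<separator>value`. As a rough sketch of that convention in plain Python (the helper `split_keyed_line` is hypothetical and not part of kcat, Kafka, or Kafi; it assumes the line is split at the first occurrence of the separator, so commas inside the JSON value are left alone):

```python
import json

def split_keyed_line(line, separator=","):
    """Split one input line into (key, value) at the first separator."""
    key, found, value = line.partition(separator)
    if not found:
        raise ValueError(f"no {separator!r} in line: {line!r}")
    return key, value

lines = ['123,{"bla":123}', '456,{"bla":456}', '789,{"bla":789}']
records = [split_keyed_line(line) for line in lines]

# The value part stays a plain string; here it also happens to be valid JSON.
values = [json.loads(value) for _, value in records]
print(records[0], values[0])
```

Splitting only once matters as soon as a value has more than one field: `split_keyed_line('1,{"a":1,"b":2}')` keeps the whole JSON object as the value.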


### Produce Messages Using a Schema

(not even possible with kcat...)

```
kafka-avro-console-producer --bootstrap-server localhost:9092 --topic topic_avro --property schema.registry.url=http://localhost:8081 --property key.serializer=org.apache.kafka.common.serialization.StringSerializer --property value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"bla","type":"int"}]}' --property parse.key=true --property key.separator=','

123,{"bla": 123}
456,{"bla": 456}
789,{"bla": 789}
```


### Consume Messages

```
kcat -b localhost:9092 -t topic_json -C -o beginning
```


```
kafka-console-consumer --bootstrap-server localhost:9092 --topic topic_json --property print.key=true --from-beginning
```

### Search Messages

```
kcat -b localhost:9092 -t topic_json -C -o beginning -e | grep 456
```


```
kafka-console-consumer --bootstrap-server localhost:9092 --topic topic_json --from-beginning | grep 456
```

## Can't We Do Better?

I developed Kafi because I was frustrated with kcat and the standard Kafka commandline tools. My answer was not yet another commandline tool, but a Python module (=library).

Regardless of whether you use Kafi in your shell or in a Jupyter notebook, you have a similar experience. And your life gets so much better. I promise.


This is how you can list topics, create topics, produce messages, consume messages or search for messages with Kafi.

Because Kafi is a Python module, you first need to import it. Then you create a `Cluster` object `c`, which reads its settings from a configuration file:

```
from kafi.kafi import *
c = Cluster("local")
```

Then...

### List Topics

```
c.ls()
```

### Create Topics

```
c.touch("topic_json")
```

### Produce Messages

```
pr = c.producer("topic_json")
pr.produce({"bla": 123}, key="123")
pr.produce({"bla": 456}, key="456")
pr.produce({"bla": 789}, key="789")
pr.close()
```

### Produce Messages Using a Schema

```
t = "topic_avro"
s = '{"type":"record","name":"myrecord","fields":[{"name":"bla","type":"int"}]}'

p = c.producer(t, value_type="avro", value_schema=s)
p.produce({"bla": 123}, key="123")
p.produce({"bla": 456}, key="456")
p.produce({"bla": 789}, key="789")
p.close()
```


### Consume Messages

```
c.cat("topic_json")
```

# Part II: Three Paradigms for Using Kafi

<img src="pix/paradigms.jpg" style="width: 35%; height: 35%"/>


Wait, this talk is titled "Kafka Superpowers for Your Jupyter Notebook and Python". So where is the Jupyter notebook? OK, you are looking at one right now, but that is probably not what you are asking yourselves...

There are three main paradigms for using Kafi.

## Shell/Python Interpreter

The first is in your shell using the Python interpreter, as we did in Part I above. That gives you a user/developer experience similar to bash/zsh + kcat or the standard Kafka commandline tools.

## Code (Microservices, FaaS, Agents...)

Because Kafi is just a Python module, it is also very handy inside your own Python code: for small scripts, or even for building microservices, FaaS functions, or agents (add a pinch of LlamaIndex agents, for example).
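
As a small sketch of the script style, the function below wraps the producer calls from Part I. It assumes only `producer`, `produce`, and `close` as they appear earlier in this talk; `produce_all` and the `FakeCluster`/`FakeProducer` stand-ins are hypothetical and exist only so the snippet runs without a broker.

```python
def produce_all(cluster, topic, records):
    """Send (key, value) pairs to a topic and return how many were sent."""
    producer = cluster.producer(topic)
    count = 0
    try:
        for key, value in records:
            producer.produce(value, key=key)
            count += 1
    finally:
        # Close the producer even if one of the produce calls fails.
        producer.close()
    return count


class FakeProducer:
    """Hypothetical stand-in that records messages instead of sending them."""
    def __init__(self, topic, log):
        self.topic = topic
        self.log = log
        self.closed = False

    def produce(self, value, key=None):
        self.log.append((self.topic, key, value))

    def close(self):
        self.closed = True


class FakeCluster:
    """Hypothetical stand-in for a cluster object; not part of Kafi."""
    def __init__(self):
        self.log = []

    def producer(self, topic):
        return FakeProducer(topic, self.log)


fake = FakeCluster()
sent = produce_all(fake, "topic_json", [("123", {"bla": 123}), ("456", {"bla": 456})])
print(sent, fake.log)
```

With Kafi installed and a cluster configured, the idea would be to pass the `c = Cluster("local")` object from Part I in place of `fake`.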

## Jupyter Notebooks

Now, finally, to the notebooks. You will soon see in Part III that Jupyter notebooks are a very convenient and powerful way of using Kafi, especially for Python/Jupyter aficionados such as data scientists.

But... using Kafi in a Jupyter notebook is actually also very convenient and powerful for Kafka administrators or developers! You'll see.

# Part III: Use Cases for Kafi


<img src="pix/use_cases.jpg" style="width: 35%; height: 35%"/>


## Kafka Administration

## Schema Registry Administration

## Simple Stream Processing

## Kafka Backups incl. Kafka Emulation

## Building a Bridge from Kafka to Pandas Dataframes and Files