# FastKafka

This notebook demonstrates the capabilities of the FastKafka project.


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/airtai/fastkafka/blob/64-colab-based-tutorial/nbs/guides/Guide_00_FastKafka_Demo.ipynb)

## Installing fastkafka library

To install fastkafka, run `pip install fastkafka` in your terminal.

In [None]:
try:
    import fastkafka
except ImportError:
    #!pip install fastkafka==0.1.0
    !pip install "fastkafka @ git+https://github.com/airtai/fastkafka@8929c430ef058103390cc59e2864b195890784e7"

## LocalKafkaBroker

To test and demonstrate FastKafka, we have developed a Python wrapper for Zookeeper and the Kafka broker. It is demonstrated here and used later in the notebook.

In [None]:
from fastkafka.testing import LocalKafkaBroker

First, start the LocalKafkaBroker.

When LocalKafkaBroker starts, it checks whether Java and Kafka are installed on the system. If they are not, it installs them and adds them to the path, since the broker needs both to function.

Note: We pass `apply_nest_asyncio=True` when creating the broker in the notebook so that it can run inside a nested async loop.

In [None]:
local_broker = LocalKafkaBroker(apply_nest_asyncio=True)
bootstrap_server = local_broker.start()
print(bootstrap_server)
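
The note about nested async loops matters because a notebook kernel already runs an event loop, and plain asyncio refuses to start a second one inside it. The stdlib-only sketch below reproduces that refusal. It does not use FastKafka, and it assumes (from the parameter name, not from anything this guide confirms) that `apply_nest_asyncio=True` applies a patch that lifts this restriction.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    # We are already inside a running loop here, like code in a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # tries to start a second, nested loop
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "nested run succeeded"


message = asyncio.run(outer())
print(message)
```

Without a patch of this kind, any synchronous helper that internally drives a coroutine to completion would hit the same error when called from a notebook cell.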

Let's see if there are any topics in our fresh Kafka broker. If everything is okay, there should be none.

In [None]:
! kafka-topics.sh --list --bootstrap-server {bootstrap_server}

Let's now create a topic, list it, and describe it to confirm that our LocalKafkaBroker is really running.

In [None]:
! kafka-topics.sh --create --topic quickstart-events --bootstrap-server {bootstrap_server}

In [None]:
! kafka-topics.sh --list --bootstrap-server {bootstrap_server}

In [None]:
! kafka-topics.sh --describe --topic quickstart-events --bootstrap-server {bootstrap_server}

Now we can stop the broker, as it is no longer needed.

In [None]:
local_broker.stop()

LocalKafkaBroker can also be used as a context manager, which stops the broker automatically when the block exits.

In [None]:
with LocalKafkaBroker(apply_nest_asyncio=True) as bootstrap_server:
    print(bootstrap_server)
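
As a rough illustration of the context-manager pattern, here is a minimal sketch of how a start/stop wrapper can be exposed through `with`. `ToyBroker` and its address are hypothetical and only mimic the shape of the calls above; this is not how LocalKafkaBroker is implemented.

```python
class ToyBroker:
    """Hypothetical stand-in for a broker wrapper; not FastKafka code."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> str:
        self.running = True
        return "127.0.0.1:9092"  # made-up address, for illustration only

    def stop(self) -> None:
        self.running = False

    def __enter__(self) -> str:
        # Entering the block starts the broker and yields its address.
        return self.start()

    def __exit__(self, exc_type, exc, tb) -> None:
        # Leaving the block stops the broker, even if an exception was raised.
        self.stop()


broker = ToyBroker()
with broker as bootstrap_server:
    running_inside = broker.running
running_after = broker.running
print(bootstrap_server, running_inside, running_after)
```

The benefit of this form is that the stop call cannot be forgotten, including when the body of the block fails.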

## FastKafka minimal demo

We will first create one simple fastkafka producer and one consumer to demonstrate their capability to communicate over Kafka topics.

First, model the data that will be exchanged between the producer and the consumer.

In [None]:
from pydantic import BaseModel, Field, NonNegativeInt

In [None]:
class Data(BaseModel):
    data: NonNegativeInt = Field(..., example=202020, description="Sample data")

Now we define our FastKafka consumer app and wrap it in a coroutine so that we can run it in the next steps.

In [None]:
import asyncio
from tqdm import tqdm, trange

from fastkafka.application import FastKafka

In [None]:
async def fastkafka_consumer(msgs_count: int, topic: str, bootstrap_servers: str):
    with tqdm(total=msgs_count, desc=f"consuming from '{topic}'") as consume_pbar:
        kafka_app = FastKafka(bootstrap_servers=bootstrap_servers)

        @kafka_app.consumes(topic=topic)
        async def consume(msg: Data):
            consume_pbar.update(1)

        async with kafka_app:
            while True:
                await asyncio.sleep(0.1)
                if consume_pbar.n >= consume_pbar.total:
                    break

We do the same preparation for the FastKafka producer.

In [None]:
from typing import List

In [None]:
async def fastkafka_producer(msgs: List[Data], topic: str, bootstrap_servers: str):
    with tqdm(total=len(msgs), desc=f"producing to '{topic}'") as produce_pbar:
        kafka_app = FastKafka(bootstrap_servers=bootstrap_servers)

        @kafka_app.produces(topic=topic)
        def produce(msg: Data):
            return msg

        async with kafka_app:
            for msg in msgs:
                produce(msg)
                produce_pbar.update(1)

Finally, let's run the demo.

In [None]:
import asyncer
from time import sleep

In [None]:
# Prepare the messages for the producer to send
msgs = [Data(data=i) for i in trange(1_000, desc="generating messages")]

# Start the broker
async with LocalKafkaBroker(topics=["test_data"], apply_nest_asyncio=True) as bootstrap_server:
    async with asyncer.create_task_group() as tg:
        # Start the consumer FastKafka app
        tg.soonify(fastkafka_consumer)(
            msgs_count=len(msgs), topic="test_data", bootstrap_servers=bootstrap_server
        )

        # Give the consumer some time to connect to Kafka topic before we start sending
        await asyncio.sleep(5)

        # Start the producer FastKafka app
        tg.soonify(fastkafka_producer)(
            msgs=msgs, topic="test_data", bootstrap_servers=bootstrap_server
        )

### Recap

We have created a simple FastKafka consumer that updates a progress bar whenever a message is received, and a producer that sends the messages to the Kafka topic.

In the previous cell we started a LocalKafkaBroker instance and connected our producer and consumer to it.
Finally, once the producer started producing messages, the consumer consumed them. When all of the messages had been consumed, the consumer and producer stopped and the Kafka broker exited.
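
The flow recapped above can be sketched without Kafka at all. In the stdlib-only example below, an `asyncio.Queue` stands in for the topic, and `toy_consumer`, `toy_producer`, and `run_demo` are hypothetical names. It illustrates only the control flow (the consumer stops once it has seen the expected number of messages), not FastKafka's behaviour.

```python
import asyncio
from typing import List


async def toy_consumer(queue: asyncio.Queue, msgs_count: int) -> List[int]:
    # Keep reading until the expected number of messages has arrived.
    received: List[int] = []
    while len(received) < msgs_count:
        received.append(await queue.get())
    return received


async def toy_producer(queue: asyncio.Queue, msgs: List[int]) -> None:
    for msg in msgs:
        await queue.put(msg)


async def run_demo(msgs: List[int]) -> List[int]:
    queue: asyncio.Queue = asyncio.Queue()  # stands in for the Kafka topic
    # Start the consumer first, then produce, mirroring the order used above.
    consumer_task = asyncio.create_task(toy_consumer(queue, len(msgs)))
    await toy_producer(queue, msgs)
    return await consumer_task


received = asyncio.run(run_demo(list(range(10))))
print(received)
```

Unlike this in-memory queue, a real consumer needs time to connect to the broker, which is why the demo above sleeps before starting the producer.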

## FastKafka model predictions demo

Now we will create a more fleshed-out application containing a model that ingests data samples from one Kafka topic (`input_data`) and produces predictions to another Kafka topic (`predictions`).

### Preparing the demo model

First, we will train our model on the Iris dataset so that we can demonstrate predictions using FastKafka.

In [None]:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(random_state=0, max_iter=500).fit(X, y)
model.predict(X[0].reshape(1, -1))

Now we need to model the input and prediction messages that will be sent to the Kafka broker.

In [None]:
from pydantic import NonNegativeFloat

In [None]:
class IrisInputData(BaseModel):
    # Each default must be the Field itself; wrapping it in parentheses with a
    # trailing comma would turn it into a one-element tuple.
    sepal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal length in cm"
    )
    sepal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal width in cm"
    )
    petal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal length in cm"
    )
    petal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal width in cm"
    )


class IrisPredictionData(BaseModel):
    species: str = Field(..., example="Iris-setosa", description="Predicted species")

Now, let's prepare our prediction FastKafka app.

In [None]:
def create_app():
    kafka_app = FastKafka()

    iris_species = {
        0: "Iris-setosa",
        1: "Iris-versicolor",
        2: "Iris-virginica"
    }

    @kafka_app.consumes(topic="input_data", auto_offset_reset="latest", group_id="my_group")
    async def on_input_data(msg: IrisInputData):
        global model
        species_class = model.predict([
              [msg.sepal_length, msg.sepal_width, msg.petal_length, msg.petal_width]
            ])[0]

        to_predictions(species_class)


    @kafka_app.produces(topic="predictions")
    def to_predictions(species_class: int) -> IrisPredictionData:
        prediction = IrisPredictionData(species=iris_species[species_class])
        return prediction
    return kafka_app
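
To make the routing in `create_app` easier to follow, here is a toy, in-memory sketch of the same decorator style. `ToyApp`, its `dispatch` method, and its `sent` attribute are hypothetical and do not reflect FastKafka's internals; the handler also receives a class index directly, so the model call is skipped.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable, Dict, List


class ToyApp:
    """Hypothetical in-memory router; it only mimics the decorator style."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[Any], Awaitable[None]]] = {}
        self.sent: Dict[str, List[Any]] = defaultdict(list)

    def consumes(self, topic: str):
        def register(func):
            self.handlers[topic] = func  # remember which handler owns the topic
            return func
        return register

    def produces(self, topic: str):
        def wrap(func):
            def wrapper(*args, **kwargs):
                result = func(*args, **kwargs)
                self.sent[topic].append(result)  # "publish" the return value
                return result
            return wrapper
        return wrap

    async def dispatch(self, topic: str, msg: Any) -> None:
        await self.handlers[topic](msg)


toy_app = ToyApp()
species_names = {0: "Iris-setosa", 1: "Iris-versicolor", 2: "Iris-virginica"}


@toy_app.consumes(topic="input_data")
async def on_input_data(species_class: int) -> None:
    to_predictions(species_class)


@toy_app.produces(topic="predictions")
def to_predictions(species_class: int) -> str:
    return species_names[species_class]


asyncio.run(toy_app.dispatch("input_data", 0))
print(toy_app.sent["predictions"])
```

The point of the sketch is the chaining: the consuming handler calls the producing function, and whatever that function returns is what ends up on the output topic.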

Finally, let's run the test by sending a message to the running app, which now encapsulates the Iris classification model:

In [None]:
from fastkafka.application import Tester
from sklearn.utils import shuffle

1. Create app

In [None]:
app = create_app()

2. Start the broker

In [None]:
broker = LocalKafkaBroker(topics=["input_data", "predictions"], apply_nest_asyncio=True)
bootstrap_server = broker.start()
app.set_bootstrap_servers(bootstrap_server)

3. Start the Tester class, which mirrors the developed app's topics for testing purposes

In [None]:
tester = Tester(app)
await tester.__aenter__()

4. Send a message and see what we get at the predictions topic

In [None]:
msg = IrisInputData(
    sepal_length=X[0][0],
    sepal_width=X[0][1],
    petal_length=X[0][2],
    petal_width=X[0][3],
)

await tester.to_input_data(msg)
await tester.awaited_mocks.on_predictions.assert_awaited(timeout=2)
print(f"Received prediction: {tester.mocks.on_predictions.call_args}")

5. To keep everything clean, close the tester and stop the broker

In [None]:
await tester.__aexit__(None, None, None)
broker.stop()

When condensed into one cell, the test looks like this:

In [None]:
app = create_app()
msg = IrisInputData(
    sepal_length=X[0][0],
    sepal_width=X[0][1],
    petal_length=X[0][2],
    petal_width=X[0][3],
)
with LocalKafkaBroker(
    topics=["input_data", "predictions"], apply_nest_asyncio=True
) as bootstrap_servers:
    app.set_bootstrap_servers(bootstrap_servers=bootstrap_servers)
    tester = Tester(app)
    async with tester:
        await tester.to_input_data(msg)
        await tester.awaited_mocks.on_predictions.assert_awaited(timeout=2)
        prediction = tester.mocks.on_predictions.call_args

print("*"*50)
print(f"Sent data: {msg}")
print(f"Received prediction: {prediction}")
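
The assertion and `call_args` lookup used in the test follow the general mock-recording pattern. The sketch below shows that pattern with the standard library's `unittest.mock.AsyncMock` only. It is not the Tester class: the stdlib `assert_awaited` takes no `timeout` argument and checks immediately, and the dictionary payload is a made-up stand-in for a prediction message.

```python
import asyncio
from unittest.mock import AsyncMock


async def main() -> AsyncMock:
    # Stand-in for a mirrored handler that records what it receives.
    on_predictions = AsyncMock()
    await on_predictions({"species": "Iris-setosa"})
    on_predictions.assert_awaited()  # raises AssertionError if never awaited
    return on_predictions


mock = asyncio.run(main())
recorded = mock.call_args.args[0]  # first positional argument of the last call
print(recorded)
```

Reading `call_args` after the assertion is what lets a test inspect the exact message that reached the handler.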

### Recap

We have created an Iris classification model and encapsulated it in our fastkafka application.
The app consumes IrisInputData messages from the `input_data` topic and produces predictions to the `predictions` topic.

To test the app, we have:
1. Created the app
2. Started the LocalKafkaBroker
3. Started our Tester class, which mirrors the developed app's topics for testing purposes
4. Sent an IrisInputData message to the `input_data` topic
5. Asserted and checked that the developed Iris classification service reacted to the IrisInputData message