# Sending and Receiving Avro Messages with FastKafka

## What is Avro?

Avro is a row-oriented remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. To learn more about Apache Avro, please check out the [docs](https://avro.apache.org/docs/).
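To make "compact binary format" concrete, here is a minimal standard-library sketch of how the Avro specification encodes a record made of primitive fields: a `double` is 8 little-endian IEEE 754 bytes, a `string` is a zigzag varint length followed by UTF-8 bytes, and a record is its fields concatenated in schema order. The helper names (`encode_long`, `encode_double`, `encode_string`) are illustrative only and are not part of FastKafka or any Avro library; in practice a library does this for you.

```python
import json
import struct


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag-encode, then emit 7 bits per byte, low group first."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_double(x: float) -> bytes:
    """Avro double: 8 bytes, little-endian IEEE 754."""
    return struct.pack("<d", x)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, then the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


# A record is its fields encoded one after another, in schema order.
measurement = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}
avro_bytes = b"".join(encode_double(v) for v in measurement.values())
json_bytes = json.dumps(measurement).encode("utf-8")

# Compare the size of the Avro payload with the JSON payload.
print(len(avro_bytes), len(json_bytes))
```

Field names never appear in the binary payload, which is why the producer and the consumer both need the schema.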

## Prerequisites


1. A basic knowledge of `FastKafka` is needed to proceed with this guide. If you are not familiar with `FastKafka`, please go through the [tutorial](/docs#tutorial) first.
2. `FastKafka` must be installed with its Apache Avro dependencies. You can install them with `pip install FastKafka[avro]`.

## Defining Avro Schema Using Pydantic Models


By default, you can use Pydantic models to define your message schemas. FastKafka internally takes care of encoding and decoding Avro messages based on the Pydantic models.

So, similar to the [tutorial](/docs#tutorial), the message schema will remain as it is.

```python
# Define Pydantic models for Avro messages
from pydantic import BaseModel, NonNegativeFloat, Field

class IrisInputData(BaseModel):
    sepal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal length in cm"
    )
    sepal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Sepal width in cm"
    )
    petal_length: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal length in cm"
    )
    petal_width: NonNegativeFloat = Field(
        ..., example=0.5, description="Petal width in cm"
    )


class IrisPrediction(BaseModel):
    species: str = Field(..., example="setosa", description="Predicted species")
```

Nothing needs to change to support Avro. You can use your existing Pydantic models as they are.

## Reusing existing avro schema


If you are using some other library to send and receive avro encoded messages, it is highly likely that you already have an Avro schema defined.

### Building pydantic models from avro schema dictionary


Let's modify the above example and assume we already have Avro schemas for `IrisInputData` and `IrisPrediction`, which look like this:

```python
iris_input_data_schema = {
    "type": "record",
    "namespace": "IrisInputData",
    "name": "IrisInputData",
    "fields": [
        {"doc": "Sepal length in cm", "type": "double", "name": "sepal_length"},
        {"doc": "Sepal width in cm", "type": "double", "name": "sepal_width"},
        {"doc": "Petal length in cm", "type": "double", "name": "petal_length"},
        {"doc": "Petal width in cm", "type": "double", "name": "petal_width"},
    ],
}
iris_prediction_schema = {
    "type": "record",
    "namespace": "IrisPrediction",
    "name": "IrisPrediction",
    "fields": [{"doc": "Predicted species", "type": "string", "name": "species"}],
}
```

We can easily construct Pydantic models from an Avro schema using the `avsc_to_pydantic` function, which is included in `FastKafka` itself.

```python
from fastkafka._components.encoder.avro import avsc_to_pydantic

IrisInputData = avsc_to_pydantic(iris_input_data_schema)
print(IrisInputData.__fields__)

IrisPrediction = avsc_to_pydantic(iris_prediction_schema)
print(IrisPrediction.__fields__)
```

The above code converts the Avro schemas to Pydantic models and prints the fields of each model. The output is:

```txt
{'sepal_length': ModelField(name='sepal_length', type=float, required=True),
 'sepal_width': ModelField(name='sepal_width', type=float, required=True),
 'petal_length': ModelField(name='petal_length', type=float, required=True),
 'petal_width': ModelField(name='petal_width', type=float, required=True)}

{'species': ModelField(name='species', type=str, required=True)}
```

This is exactly the same as defining the Pydantic models manually. You don't have to worry about making mistakes while converting an Avro schema to Pydantic models by hand; the `avsc_to_pydantic` function does it automatically, as demonstrated above.
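To give a rough idea of what such a conversion involves, the sketch below maps the primitive Avro types of a record schema to Python types using only the standard library. It is an illustration, not the implementation of `avsc_to_pydantic`: the `AVRO_TO_PYTHON` table and the `record_field_types` helper are hypothetical names, and only primitive field types are handled.

```python
from typing import Any, Dict

# Avro primitive type name -> closest Python type (illustrative mapping).
AVRO_TO_PYTHON: Dict[str, type] = {
    "null": type(None),
    "boolean": bool,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "bytes": bytes,
    "string": str,
}


def record_field_types(schema: Dict[str, Any]) -> Dict[str, type]:
    """Return {field name: Python type} for a record schema with primitive fields."""
    if schema.get("type") != "record":
        raise ValueError("expected an Avro schema of type 'record'")
    result: Dict[str, type] = {}
    for field in schema["fields"]:
        avro_type = field["type"]
        # Unions, arrays, maps, and nested records are out of scope for this sketch.
        if not isinstance(avro_type, str) or avro_type not in AVRO_TO_PYTHON:
            raise ValueError(f"unsupported field type in this sketch: {avro_type!r}")
        result[field["name"]] = AVRO_TO_PYTHON[avro_type]
    return result


iris_prediction_schema = {
    "type": "record",
    "namespace": "IrisPrediction",
    "name": "IrisPrediction",
    "fields": [{"doc": "Predicted species", "type": "string", "name": "species"}],
}

print(record_field_types(iris_prediction_schema))
```

Both `float` and `double` map to Python's `float`, which matches the `type=float` fields shown in the output above.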

### Building pydantic models from `.avsc` file

The Avro schema will not always be conveniently defined as a Python dictionary. It may instead be stored in `.avsc` files, the standard JSON-based file format for Avro schemas. Let's see how to convert those `.avsc` files to Pydantic models.

Let's assume our Avro schemas are stored in files called `iris_input_data_schema.avsc` and `iris_prediction_schema.avsc`. In that case, the following code converts the schemas to Pydantic models:

```python
import json
from fastkafka._components.encoder.avro import avsc_to_pydantic


with open("iris_input_data_schema.avsc", "rb") as f:
    iris_input_data_schema = json.load(f)
    
with open("iris_prediction_schema.avsc", "rb") as f:
    iris_prediction_schema = json.load(f)
    

IrisInputData = avsc_to_pydantic(iris_input_data_schema)
print(IrisInputData.__fields__)

IrisPrediction = avsc_to_pydantic(iris_prediction_schema)
print(IrisPrediction.__fields__)
```
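Because an `.avsc` file is plain JSON, it can be handy to sanity-check it before converting. The sketch below is an optional helper that uses only the standard library; `load_avsc` is a hypothetical name, not a FastKafka function, and the check covers only the top-level keys that a record schema requires. The demo writes a temporary file so that the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


def load_avsc(path: Union[str, Path]) -> Dict[str, Any]:
    """Read an .avsc file and check that it describes a record."""
    schema = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError(f"{path}: expected a top-level Avro record schema")
    missing = {"name", "fields"} - schema.keys()
    if missing:
        raise ValueError(f"{path}: missing required keys {sorted(missing)}")
    return schema


# Demo: write a schema to a temporary .avsc file, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    avsc_path = Path(tmp) / "iris_prediction_schema.avsc"
    avsc_path.write_text(
        json.dumps(
            {
                "type": "record",
                "namespace": "IrisPrediction",
                "name": "IrisPrediction",
                "fields": [{"doc": "Predicted species", "type": "string", "name": "species"}],
            }
        ),
        encoding="utf-8",
    )
    loaded = load_avsc(avsc_path)

print(loaded["name"], [field["name"] for field in loaded["fields"]])
```

The returned dictionary has the same shape as the schema dictionaries used earlier, so it can be passed to `avsc_to_pydantic` in the same way.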

## Consume/Produce avro messages with FastKafka


`FastKafka` provides the `@consumes` and `@produces` decorators to consume messages from and produce messages to a `Kafka` topic. This is explained in the [tutorial](/docs#function-decorators).

`@consumes` accepts a `decoder` parameter and `@produces` accepts an `encoder` parameter; setting them to `"avro"` makes FastKafka decode and encode the messages as Avro.

```python
@kafka_app.consumes(topic="input_data", decoder="avro")
async def on_input_data(msg: IrisInputData):
    global model
    species_class = model.predict(
        [[msg.sepal_length, msg.sepal_width, msg.petal_length, msg.petal_width]]
    )[0]

    await to_predictions(species_class)


@kafka_app.produces(topic="predictions", encoder="avro")
async def to_predictions(species_class: int) -> IrisPrediction:
    iris_species = ["setosa", "versicolor", "virginica"]

    prediction = IrisPrediction(species=iris_species[species_class])
    return prediction
```

In the above example, in `@consumes` and `@produces` methods, we explicitly instruct FastKafka to `decode` and `encode` messages using the `avro` `decoder`/`encoder` instead of the default `json` `decoder`/`encoder`.
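To illustrate why the consuming side needs the schema, the following standard-library sketch decodes a bare Avro record by walking the schema's fields in order, as described in the Avro specification's binary encoding. The `read_long` and `read_record` helpers are hypothetical and handle only `double` and `string` fields; whether FastKafka frames its messages exactly this way is not stated in this guide, so treat it as a conceptual picture rather than a description of the library's internals.

```python
import io
import struct
from typing import Any, BinaryIO, Dict


def read_long(buf: BinaryIO) -> int:
    """Read an Avro long: base-128 varint, then undo the zigzag encoding."""
    shift = 0
    z = 0
    while True:
        byte = buf.read(1)[0]
        z |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)


def read_record(buf: BinaryIO, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Decode one record; the schema supplies field names, order, and types."""
    record: Dict[str, Any] = {}
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if avro_type == "double":
            record[name] = struct.unpack("<d", buf.read(8))[0]
        elif avro_type == "string":
            length = read_long(buf)
            record[name] = buf.read(length).decode("utf-8")
        else:
            raise ValueError(f"unsupported field type in this sketch: {avro_type!r}")
    return record


iris_prediction_schema = {
    "type": "record",
    "namespace": "IrisPrediction",
    "name": "IrisPrediction",
    "fields": [{"doc": "Predicted species", "type": "string", "name": "species"}],
}

# 0x0c is the zigzag varint for length 6, followed by the UTF-8 bytes of "setosa".
payload = b"\x0csetosa"
print(read_record(io.BytesIO(payload), iris_prediction_schema))
# prints {'species': 'setosa'}
```

The payload carries no field names, so decoding it with a different schema would yield wrong values or fail; this is why the type annotation on the decorated function matters.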

## Assembling it all together

Let's rewrite the sample code found in the [tutorial](/docs#running-the-service) to use `avro` to `decode` and `encode` messages:

```python
# content of the "application.py" file

iris_input_data_schema = {
    "type": "record",
    "namespace": "IrisInputData",
    "name": "IrisInputData",
    "fields": [
        {"doc": "Sepal length in cm", "type": "double", "name": "sepal_length"},
        {"doc": "Sepal width in cm", "type": "double", "name": "sepal_width"},
        {"doc": "Petal length in cm", "type": "double", "name": "petal_length"},
        {"doc": "Petal width in cm", "type": "double", "name": "petal_width"},
    ],
}
iris_prediction_schema = {
    "type": "record",
    "namespace": "IrisPrediction",
    "name": "IrisPrediction",
    "fields": [{"doc": "Predicted species", "type": "string", "name": "species"}],
}
# Or load schema from avsc files

from fastkafka._components.encoder.avro import avsc_to_pydantic

IrisInputData = avsc_to_pydantic(iris_input_data_schema)
IrisPrediction = avsc_to_pydantic(iris_prediction_schema)

    
from fastkafka import FastKafka

kafka_brokers = {
    "localhost": {
        "url": "localhost",
        "description": "local development kafka broker",
        "port": 9092,
    },
    "production": {
        "url": "kafka.airt.ai",
        "description": "production kafka broker",
        "port": 9092,
        "protocol": "kafka-secure",
        "security": {"type": "plain"},
    },
}

kafka_app = FastKafka(
    title="Iris predictions",
    kafka_brokers=kafka_brokers,
)

iris_species = ["setosa", "versicolor", "virginica"]

@kafka_app.consumes(topic="input_data", decoder="avro")
async def on_input_data(msg: IrisInputData):
    global model  # assumed to be a trained classifier loaded elsewhere in the application
    species_class = model.predict(
        [[msg.sepal_length, msg.sepal_width, msg.petal_length, msg.petal_width]]
    )[0]

    await to_predictions(species_class)


@kafka_app.produces(topic="predictions", encoder="avro")
async def to_predictions(species_class: int) -> IrisPrediction:
    iris_species = ["setosa", "versicolor", "virginica"]

    prediction = IrisPrediction(species=iris_species[species_class])
    return prediction
```


The above code is a sample implementation of using FastKafka to consume and produce Avro-encoded messages from/to a Kafka topic. The code defines two Avro schemas for the input data and the prediction result. It then uses the `avsc_to_pydantic` function from the FastKafka library to convert the Avro schema into Pydantic models, which will be used to decode and encode Avro messages.

The `FastKafka` class is then instantiated with the broker details, and two functions decorated with `@kafka_app.consumes` and `@kafka_app.produces` are defined to consume messages from the "input_data" topic and produce messages to the "predictions" topic, respectively. The functions use the `decoder="avro"` and `encoder="avro"` parameters, respectively, to decode and encode the Avro messages.

In summary, the above code demonstrates a straightforward way to use Avro-encoded messages with FastKafka to build a message processing pipeline.