# Kafka to Kafka pipeline
This pipeline reads events from a Kafka topic, processes them, and writes the results back to Kafka.

The pipeline is triggered (started) when it receives an event from the source Kafka topic.

The pipeline immediately processes the event and then sends it to the sink topic.

### Events
It is recommended to transform events into JSON format before processing them, as explained in the section **Access and work with received events**.

Once transformed, an event can be used like a standard Python dictionary.

Example event transformed into the JSON format:
```JSON
{
    "foo": "bap"
}
```
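
As an illustration of working with a transformed event as a dictionary, here is a minimal sketch in plain Python. The `event` value and the added `length` field are hypothetical and only serve the example.

```python
# Hypothetical event, already transformed into a Python dictionary
event = {"foo": "bap"}

# Read a value by key
print(event["foo"])

# Read with a default value when the key may be missing
print(event.get("missing", "n/a"))

# Check that a key exists, then add a new (hypothetical) field
if "foo" in event:
    event["length"] = len(event["foo"])

print(event)
```

Using `.get()` with a default avoids a `KeyError` when an incoming event lacks an expected key.
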
### Configuration
The configuration is stored in the `pipelines.conf` file in the pipeline folder.

Example configuration:
```
# Specification of Kafka connection
[connection:KafkaConnection]
bootstrap_servers=<broker1>:9092,<broker2>:9092

# Specification of source topic
[pipeline:Kafka2KafkaPipeline:KafkaSource]
topic=bs-kafka2kafka-source

# Specification of sink topic, where processed data will be sent
[pipeline:Kafka2KafkaPipeline:KafkaSink]
topic=bs-kafka2kafka-sink
```
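
The example file uses INI syntax, so one way to sanity-check section names and values before starting the pipeline is to load it with Python's standard `configparser`. This is only a sketch: it assumes the file is plain INI, it uses hypothetical broker host names in place of the `<broker1>` and `<broker2>` placeholders, and it does not show how BitSwan itself reads the configuration.

```python
import configparser

# Same sections as the example configuration; broker host names are hypothetical
CONFIG_TEXT = """
[connection:KafkaConnection]
bootstrap_servers=broker1:9092,broker2:9092

[pipeline:Kafka2KafkaPipeline:KafkaSource]
topic=bs-kafka2kafka-source

[pipeline:Kafka2KafkaPipeline:KafkaSink]
topic=bs-kafka2kafka-sink
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# Split the comma-separated broker list into individual addresses
brokers = config["connection:KafkaConnection"]["bootstrap_servers"].split(",")
source_topic = config["pipeline:Kafka2KafkaPipeline:KafkaSource"]["topic"]
sink_topic = config["pipeline:Kafka2KafkaPipeline:KafkaSink"]["topic"]

print(brokers)
print(source_topic, sink_topic)
```

A check like this catches a misspelled section name early, because a missing section raises `KeyError`.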

## Import BitSwan 
Import the BitSwan modules needed for the Kafka to Kafka pipeline.

In [1]:
from bspump.jupyter import *
import bspump.kafka
import json

## Register connection to Kafka
The connection details are specified in `pipelines.conf`.

In [2]:
@register_connection
def connection(app):
  return bspump.kafka.KafkaConnection(app, "KafkaConnection")

## Create test events
To test the pipeline with sample events before running it, call `sample_events` with your events as a parameter, as shown below.

The events passed to `sample_events` are processed one after another.

This step is not necessary to run the pipeline.

```Python
sample_events([
    b"""{"foo":"bap"}""",
    b"""{"foo":"baz"}"""
])
```

In [None]:
# [Optional] Enter sample events here

## Define AutoPipeline
AutoPipeline specifies the source, the sink and the name of the pipeline. These names are referred to in the configuration file `pipelines.conf`. See the AutoPipeline definition below; the code comments explain each argument.

In [None]:
auto_pipeline(
    # Specification of source topic, referred to in config file section [pipeline:Kafka2KafkaPipeline:KafkaSource]
    source=lambda app, pipeline: bspump.kafka.KafkaSource(app, pipeline, connection="KafkaConnection"),
    # Specification of sink topic, referred to in config file section [pipeline:Kafka2KafkaPipeline:KafkaSink]
    sink=lambda app, pipeline: bspump.kafka.KafkaSink(app, pipeline, connection="KafkaConnection"),
    # Name of the pipeline, referred to in config file section [pipeline:Kafka2KafkaPipeline]
    name="Kafka2KafkaPipeline",
)

## Access and work with received events 
When working with events from Kafka, it is recommended to transform them into standard JSON format, which you can then access as a dictionary.

In [None]:
# Transform received event to JSON format
event = json.loads(event.decode("utf8"))
event

{'foo': 'bap'}
{'foo': 'baz'}


### Modification of event content
You can access the content of the event by selecting the relevant key.
Below, you can create multiple cells with code working with events.

Example event modification:
```Python
event["foo"] = event["foo"].upper()
event
```

Example output:
```JSON
{'foo': 'BAP'}
```
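
The direct assignment above raises an error if the key is missing or the value is not a string. A more defensive variant might look like the sketch below. The helper name `uppercase_field` is hypothetical and is not part of BitSwan.

```python
def uppercase_field(event, key):
    """Upper-case event[key] in place when it exists and is a string."""
    value = event.get(key)
    if isinstance(value, str):
        event[key] = value.upper()
    # Events without the key, or with a non-string value, pass through unchanged
    return event


print(uppercase_field({"foo": "bap"}, "foo"))
print(uppercase_field({"bar": 1}, "foo"))
```

Passing unexpected events through unchanged keeps a single malformed message from stopping the processing; whether that is the right policy depends on the use case.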

In [None]:
# Enter your code here

In [None]:
# Enter your code here

## Transform events to Kafka format before sending them out
Use `json.dumps(event).encode()`, as below, to transform an event into bytes before it is sent to Kafka.

In [None]:
event = json.dumps(event).encode()
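
To see what this step produces, the following sketch runs the same call on a hypothetical event outside the pipeline and then decodes the result again. The variable names `payload` and `restored` are illustrative only.

```python
import json

# Hypothetical processed event
event = {"foo": "BAP"}

# Serialize the dictionary to JSON text, then encode it to bytes
# (str.encode() uses UTF-8 by default)
payload = json.dumps(event).encode()
print(type(payload).__name__, payload)

# Decoding the bytes again yields the original dictionary
restored = json.loads(payload.decode("utf8"))
print(restored == event)
```

The round trip confirms that the bytes sent to the sink topic can be read back with the same `json.loads(event.decode("utf8"))` call used at the start of the pipeline.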

## Send event to Kafka
All events are sent **automatically** to the Kafka sink topic specified in `pipelines.conf` at the end of the pipeline.