<div style="text-align: center; line-height: 0; padding-top: 2px;">
  <img src="https://www.quantiaconsulting.com/logos/quantia_logo_orizz.png" alt="Quantia Consulting" style="width: 600px; height: 250px">
</div>

# Advanced Synchronous and Asynchronous write

## Performance Tuning
In the previous notebook we saw how to produce messages both synchronously and asynchronously.

However, calling `produce()` does not by itself trigger the sending of the message: it only adds the message to the producer's internal queue.

To control the sending process and, consequently, to boost performance, you can adjust the batching parameters at producer creation time.

* Size of the batch of messages:
    * `batch.num.messages`: maximum number of messages in a batch - the producer sends the queued messages once the number of messages in a batch reaches this value - Default: 10000.
    OR
    * `batch.size`: maximum batch size in bytes - the producer sends the queued messages once the total size of the messages in a batch reaches this value - Default: 16 KB.
* Time to wait for messages:
    * `linger.ms`: time to wait for messages to batch together - the producer sends the queued messages after this interval even if the batch is still smaller than the specified `batch.size` (or `batch.num.messages`) - Default: 0.5 ms (send almost immediately).

The values of the two parameters above can vary based on your needs:
* High throughput scenario: large `batch.size` (or `batch.num.messages`) and `linger.ms`, or flush manually
* Low Latency scenario: small `batch.size` (or `batch.num.messages`) and `linger.ms`
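
The two scenarios above can be sketched as a pair of configuration dictionaries. This is only an illustration: the helper name `make_producer_conf`, the profile names, the `localhost:9092` address and the concrete numbers are assumptions made for this example, not recommended values. Only the keys `bootstrap.servers`, `batch.num.messages` and `linger.ms` come from the text above.

```python
def make_producer_conf(servers, profile):
    """Return a producer configuration dict for a hypothetical tuning profile."""
    conf = {'bootstrap.servers': servers}
    if profile == 'high-throughput':
        # Large batches and a longer wait: fewer, bigger requests.
        conf.update({'batch.num.messages': 50000, 'linger.ms': 100})
    elif profile == 'low-latency':
        # Tiny batches and no wait: each message leaves as soon as possible.
        conf.update({'batch.num.messages': 1, 'linger.ms': 0})
    else:
        raise ValueError("Unknown profile: {}".format(profile))
    return conf


throughput_conf = make_producer_conf('localhost:9092', 'high-throughput')
latency_conf = make_producer_conf('localhost:9092', 'low-latency')
print(throughput_conf)
print(latency_conf)
```

Either dictionary could then be passed to `Producer(...)` in the same way as the `producerconf` dictionaries used later in this notebook.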

**NOTE** Generally, the `batch.size` configuration is not directly exposed by the client library.
The Python client that we are currently using exposes `batch.num.messages`.

[More Info](https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance) and [Configuration Value List](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md)

## Initialization

Let's initialize a producer and test some advanced configurations.

We will use JSON messages; open the [JSON consumer](../../kafka-python/plain/json-consumer.ipynb) next to this notebook in order to see the differences.


In [None]:
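from confluent_kafka import Producer, KafkaError
# AdminClient and NewTopic are needed below to create the topic
from confluent_kafka.admin import AdminClient, NewTopic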
import json
import qcutils
import time

servers=qcutils.read_config_value("kafka.server") + ":" + str(qcutils.read_config_value("kafka.port"))

topic = ''

assert len(topic) > 0, "In order to avoid conflicts during write operation, please name the topic as <surname>-topic"

admin_conf = {'bootstrap.servers': servers}
a = AdminClient(admin_conf)
fs = a.create_topics([NewTopic(topic, num_partitions=1, replication_factor=1)])

for t, f in fs.items():
    try:
        f.result()  # The result itself is None
        print("Topic {} created".format(t))
    except Exception as e:
        print("Failed to create topic {}: {}".format(t, e))

## Sending triggered by time (`linger.ms`)

Create a new producer for a high throughput scenario, focusing on the `linger.ms` configuration.

Increase `linger.ms` to 5 seconds, produce a single message, and see what happens and... **when** it happens.

In [None]:
producerconf = {
    'bootstrap.servers': servers,
    'linger.ms': 5000,
}


p = Producer(producerconf)

record_key = "qc-key"
record_value = json.dumps({'count': 1})
print("Producing record: {}\t{}".format(record_key, record_value))
p.produce(topic, key=record_key, value=record_value)


### What happened here?

When does the message appear in your consumer?

... after 5 seconds!
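
If no consumer is at hand, the delay can also be observed from the producer side. The following is a minimal sketch, assuming a delivery callback is passed to `produce()` through its `on_delivery` parameter and served with `poll()`. The helper `make_delivery_timer` and the `results` list are hypothetical names introduced for this example.

```python
import time


def make_delivery_timer(start_time, results):
    """Build a delivery callback that records how long the delivery took.

    `results` is a plain list that receives (elapsed_seconds, error) tuples.
    """
    def on_delivery(err, msg):
        # Delivery callbacks are invoked with (err, msg);
        # err is None when the message was delivered successfully.
        elapsed = time.time() - start_time
        results.append((elapsed, err))
        if err is None:
            print("Delivered after {:.1f} s".format(elapsed))
        else:
            print("Delivery failed after {:.1f} s: {}".format(elapsed, err))
    return on_delivery


# Usage with the producer above (needs a reachable broker, so it is commented out):
# results = []
# p.produce(topic, key=record_key, value=record_value,
#           on_delivery=make_delivery_timer(time.time(), results))
# while not results:
#     p.poll(0.5)  # serve delivery callbacks until the report arrives
```

With `linger.ms` set to 5000, the printed delay should be close to the 5 seconds observed in the consumer, plus network time.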

## Sending triggered by number of messages (`batch.num.messages`)

In order to understand the `batch.num.messages` configuration, let's set it to 10 and raise `linger.ms` to a higher value (e.g. 10 s).

In [None]:
producerconf = {
    'bootstrap.servers': servers,
    'batch.num.messages': 10,
    'linger.ms': 10000
}


p = Producer(producerconf)

for n in range(15):
    record_key = "qc-key"
    record_value = json.dumps({'count': n})
    print("Producing record: {}\t{}".format(record_key, record_value))
    p.produce(topic, key=record_key, value=record_value)


### And now... what happened here?

Again: when do the messages appear in your consumer?

... the first 10 messages appear as soon as they are produced (remember the `batch.num.messages` config), but the last 5 messages appear after 10 seconds! (remember the `linger.ms` config)
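
The behaviour described above can be reproduced with a toy model in plain Python. This is a simplified sketch and not how the client is actually implemented: it assumes that all messages are produced at time 0, that a full batch is sent at once, and that a partial batch waits for the whole `linger.ms` interval. The function name `simulate_sends` is hypothetical.

```python
def simulate_sends(n_messages, batch_num_messages, linger_ms):
    """Return a list of (send_time_ms, batch_length) under a simplified model.

    Assumptions: every message is produced at t = 0 ms, a full batch is sent
    immediately, and a partial batch is sent when linger_ms expires.
    """
    if batch_num_messages < 1:
        raise ValueError("batch_num_messages must be at least 1")
    sends = []
    remaining = n_messages
    # Full batches leave as soon as they are complete.
    while remaining >= batch_num_messages:
        sends.append((0, batch_num_messages))
        remaining -= batch_num_messages
    # Whatever is left waits for the linger timer.
    if remaining > 0:
        sends.append((linger_ms, remaining))
    return sends


print(simulate_sends(15, 10, 10000))  # [(0, 10), (10000, 5)]
```

The output mirrors the experiment: one batch of 10 messages sent immediately, and the remaining 5 sent after 10000 ms.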

## Force the Sending Operation

How can we force the Sending Operation?

**We can't!**

It is important to properly configure the producer for our use case. 

The default values are ok for playing :D

Calling `flush()` or `poll()` **does not force sending**: these calls are needed to check the production result, in a synchronous or asynchronous way!
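
The role of `poll()` and `flush()` in checking results can be sketched as follows. This is a minimal example under assumptions: the helper `produce_all` is a hypothetical name, and `producer` is expected to be a `confluent_kafka` `Producer` (or any object offering the same `produce()`, `poll()` and `flush()` methods).

```python
def produce_all(producer, topic, values):
    """Produce every value and return the collected delivery errors.

    Each entry of the returned list is None for a successful delivery,
    or the error reported for that message.
    """
    reports = []

    def on_delivery(err, msg):
        # Called once per message when its delivery report is served.
        reports.append(err)

    for value in values:
        producer.produce(topic, value=value, on_delivery=on_delivery)
        # Asynchronous check: serve pending delivery callbacks without blocking.
        producer.poll(0)
    # Synchronous check: block until all outstanding reports have been served.
    producer.flush()
    return reports
```

A call such as `produce_all(p, topic, [json.dumps({'count': n}) for n in range(15)])` would return a list of 15 entries, all `None` when every delivery succeeded.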

##### ![Quantia Tiny Logo](https://www.quantiaconsulting.com/logos/quantia_logo_tiny.png) 2020 Quantia Consulting, srl. All rights reserved.