Kafka Connect provides prebuilt connectors that integrate Kafka with other systems, acting either as sources or as targets (sources or sinks in Kafka terms). Let's create a PostgreSQL one.

In [1]:
from kafka import KafkaProducer
import json

In [5]:
bootstrap_servers="localhost:9092"
topic_name="kafka-localhost-python"

In [3]:
producer = KafkaProducer(
 bootstrap_servers=bootstrap_servers,
 value_serializer=lambda v: json.dumps(v).encode('ascii'),
 key_serializer=lambda v: json.dumps(v).encode('ascii')
)
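
The serializers above call `.encode('ascii')`, which is safe only because `json.dumps` escapes non-ASCII characters by default (`ensure_ascii=True`). A minimal sketch of what ends up on the wire, using a hypothetical `serialize` function that mirrors the lambdas:

```python
import json


def serialize(v):
    # Same logic as the lambdas passed to KafkaProducer above.
    return json.dumps(v).encode('ascii')


# The emoji is written as \uXXXX escapes, so the bytes are pure ASCII.
raw = serialize({"pizza": "Margherita 🍕"})
print(raw)

# Decoding on the consumer side restores the original string.
decoded = json.loads(raw.decode('ascii'))
print(decoded["pizza"])
```

If `ensure_ascii=False` were passed to `json.dumps`, the `.encode('ascii')` call would raise `UnicodeEncodeError` on the emoji; encoding as UTF-8 would be the usual alternative.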

Let's create a new stream, adding the schema to it.

The Kafka Connect JDBC sink requires a schema attached to the stream that defines its fields in detail. We have two choices:

- Attach the schema to each JSON message
- Use a schema registry with the Avro format

For the sake of this example, we'll include the schema definition in each JSON message. Let's define the schema:

In [4]:
key_schema = {
    "type": "struct",
    "fields": [
        {
            "type": "int32",
            "optional": False,
            "field": "id"
        }
    ]
}

value_schema = {
    "type": "struct",
    "fields": [
        {
            "type": "string",
            "optional": False,
            "field": "name"
        },
        {
            "type": "string",
            "optional": False,
            "field": "pizza"
        }
    ]
}
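
When schemas are enabled, Kafka Connect's JSON converter expects each message to be an envelope with two top-level fields, `schema` and `payload`. The sketch below builds that envelope and checks the payload against the declared fields first. `with_schema` and `example_key_schema` are hypothetical names introduced for illustration, and the check is a simplification: it looks only at field names, not at field types.

```python
def with_schema(schema, payload):
    """Return the {"schema": ..., "payload": ...} envelope for one message.

    Hypothetical helper: only field names are checked, not field types.
    """
    declared = {f["field"] for f in schema["fields"]}
    required = {f["field"] for f in schema["fields"] if not f.get("optional", False)}
    present = set(payload)
    missing = required - present
    unknown = present - declared
    if missing or unknown:
        raise ValueError(
            f"missing fields: {sorted(missing)}, unknown fields: {sorted(unknown)}"
        )
    return {"schema": schema, "payload": payload}


# Same shape as key_schema above, repeated so the sketch runs on its own.
example_key_schema = {
    "type": "struct",
    "fields": [{"type": "int32", "optional": False, "field": "id"}],
}

envelope = with_schema(example_key_schema, {"id": 1})
print(envelope["payload"])

# A payload whose field name does not match the schema is rejected.
try:
    with_schema(example_key_schema, {"identifier": 1})
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```

Catching a mismatch here is cheaper than discovering it later as a failed sink task.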

Now let's send some data:

In [6]:
producer.send(
    topic_name+"-schema", 
    key={"schema": key_schema, "payload": {"id":1}},
    value={"schema": value_schema, 
           "payload": {"name":"👨 Frank", "pizza":"Margherita 🍕"}}
)

producer.send(
    topic_name+"-schema",
    key={"schema": key_schema, "payload": {"id":2}},
    value={"schema": value_schema, 
           "payload": {"name":"👨 Dan", "pizza":"Fries 🍕+🍟"}}
)

producer.send(
    topic_name+"-schema",
    key={"schema": key_schema, "payload": {"id":3}},
    value={"schema": value_schema,
           "payload": {"name":"👨 Jan", "pizza":"Mushrooms 🍕+🍄"}}
)

producer.flush()
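
`producer.send()` is asynchronous and returns a future, so `flush()` alone does not tell us whether each record was accepted. In kafka-python, calling `.get(timeout=...)` on that future returns the record metadata or raises the send error. A sketch of a delivery check follows. `confirm_delivery` is a hypothetical helper, and the `_FakeFuture` stand-in exists only so the block runs without a broker.

```python
from collections import namedtuple


def confirm_delivery(futures, timeout=10):
    """Block on each send() future and return (topic, partition, offset) tuples."""
    results = []
    for future in futures:
        # .get() raises if the broker rejected the record or the wait timed out.
        metadata = future.get(timeout=timeout)
        results.append((metadata.topic, metadata.partition, metadata.offset))
    return results


# Stand-ins for the objects kafka-python returns (assumption: offline demo only).
_FakeMetadata = namedtuple("_FakeMetadata", ["topic", "partition", "offset"])


class _FakeFuture:
    def __init__(self, metadata):
        self._metadata = metadata

    def get(self, timeout=None):
        return self._metadata


fake_futures = [
    _FakeFuture(_FakeMetadata("kafka-localhost-python-schema", 0, offset))
    for offset in range(3)
]
delivered = confirm_delivery(fake_futures)
print(delivered)
```

With a live broker, collect the return values of the three `producer.send(...)` calls into a list and pass that list instead of `fake_futures`.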

Let's start the Kafka Connect Postgres Connector

---

In [7]:
!ls ~/kafka_2.12-3.2.0/config/

connect-console-sink.properties   consumer.properties
connect-console-source.properties kraft
connect-distributed.properties    log4j.properties
connect-file-sink.properties      producer.properties
connect-file-source.properties    server.properties
connect-log4j.properties          tools-log4j.properties
connect-mirror-maker.properties   trogdor.conf
connect-standalone.properties     zookeeper.properties


In [8]:
%%writefile ~/kafka_2.12-3.2.0/config/connect-distributed-local.properties
bootstrap.servers=localhost:9092
group.id=local-connect-cluster

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true

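# Note: the internal.* converter properties were removed in Kafka 3.0 (KIP-738)
# and have no effect on a 3.2.0 worker.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false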

offset.storage.topic=connect-local-stg-offsets
config.storage.topic=connect-local-stg-configs
status.storage.topic=connect-local-stg-status

consumer.max.poll.records=1
consumer.enable.auto.commit=false
consumer.auto.offset.reset=latest

Writing /Users/sparshagarwal/kafka_2.12-3.2.0/config/connect-distributed-local.properties
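
The three `*.storage.topic` entries name the internal topics a distributed worker uses for offsets, connector configs, and status. A small sketch that reads them back out of a properties string can help confirm that the topics created afterwards match the file. `parse_properties` is a hypothetical helper that handles only simple `key=value` lines, not the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Excerpt of the worker file written above, inlined so the sketch runs on its own.
worker_properties = """
bootstrap.servers=localhost:9092
group.id=local-connect-cluster
offset.storage.topic=connect-local-stg-offsets
config.storage.topic=connect-local-stg-configs
status.storage.topic=connect-local-stg-status
"""

props = parse_properties(worker_properties)
storage_topics = sorted(
    value for key, value in props.items() if key.endswith(".storage.topic")
)
print(storage_topics)
```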


In [14]:
!ls ~/kafka_2.12-3.2.0/bin

connect-distributed.sh             kafka-mirror-maker.sh
connect-mirror-maker.sh            kafka-producer-perf-test.sh
connect-standalone.sh              kafka-reassign-partitions.sh
kafka-acls.sh                      kafka-replica-verification.sh
kafka-broker-api-versions.sh       kafka-run-class.sh
kafka-cluster.sh                   kafka-server-start.sh
kafka-configs.sh                   kafka-server-stop.sh
kafka-console-consumer.sh          kafka-storage.sh
kafka-console-producer.sh          kafka-streams-application-reset.sh
kafka-consumer-groups.sh           kafka-topics.sh
kafka-consumer-perf-test.sh        kafka-transactions.sh
kafka-delegation-tokens.sh         kafka-verifiable-consumer.sh
kafka-delete-records.

In [12]:
!kafka-topics.sh --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1 --topic connect-local-stg-offsets
!kafka-topics.sh --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1 --topic connect-local-stg-configs
!kafka-topics.sh --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1 --topic connect-local-stg-status
!kafka-topics.sh --bootstrap-server localhost:9092 --create --replication-factor 1 --partitions 1 --topic kctopic_for_sink

Created topic connect-local-stg-offsets.
Created topic connect-local-stg-configs.
Created topic connect-local-stg-status.
Created topic kctopic_for_sink.
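
Once a distributed worker is running, connectors are registered through its REST API by POSTing a JSON body with `name` and `config` to `/connectors`. The sketch below builds such a body for a PostgreSQL sink. It assumes the Confluent JDBC sink connector (`io.confluent.connect.jdbc.JdbcSinkConnector`) is installed on the worker's plugin path, which the steps here do not cover, and that the worker listens on the default REST port 8083. The connector name, database URL, and credentials are placeholders, and `jdbc_sink_body` and `register_connector` are hypothetical helpers.

```python
import json
import urllib.request


def jdbc_sink_body(name, topic, connection_url, user, password):
    """Build the REST body for a JDBC sink (assumes the Confluent JDBC connector)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "connection.url": connection_url,
            "connection.user": user,
            "connection.password": password,
            "auto.create": "true",      # let the connector create the table
            "insert.mode": "upsert",    # requires a primary key, set below
            "pk.mode": "record_key",    # take the key from the message key struct
            "pk.fields": "id",
        },
    }


def register_connector(body, base_url="http://localhost:8083"):
    """POST the body to the Connect REST API (needs a running worker)."""
    request = urllib.request.Request(
        f"{base_url}/connectors",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder values: adjust to the local PostgreSQL instance.
body = jdbc_sink_body(
    name="pg-sink-example",
    topic="kafka-localhost-python-schema",
    connection_url="jdbc:postgresql://localhost:5432/postgres",
    user="postgres",
    password="change-me",
)
print(json.dumps(body, indent=2))
# register_connector(body)  # uncomment once the worker is up
```

Using `pk.mode=record_key` with `pk.fields=id` lines up with the `id` field declared in `key_schema`, so repeated sends with the same key update the row rather than inserting a duplicate.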


In [16]:
!mkdir -p /usr/local/share/java

In [17]:
!cp ../assets/jars/*.jar /usr/local/share/java

In [18]:
!ls -al /usr/local/share/java

total 26912
drwxr-xr-x  10 sparshagarwal  admin      320 Jul 16 16:06 .
drwxrwxr-x  45 sparshagarwal  admin     1440 Jul 16 16:04 ..
-rw-r--r--@  1 sparshagarwal  admin    99087 Jul 16 16:06 connect-api-3.2.0.jar
-rw-r--r--@  1 sparshagarwal  admin    15340 Jul 16 16:06 connect-file-3.2.0.jar
-rw-r--r--@  1 sparshagarwal  admin   126898 Jul 16 16:06 javax.ws.rs-api-2.1.1.jar
-rw-r--r--@  1 sparshagarwal  admin  4941003 Jul 16 16:06 kafka-clients-3.2.0.jar
-rw-r--r--@  1 sparshagarwal  admin   682804 Jul 16 16:06 lz4-java-1.8.0.jar
-rw-r--r--@  1 sparshagarwal  admin    41125 Jul 16 16:06 slf4j-api-1.7.36.jar
-rw-r--r--@  1 sparshagarwal  admin  1970939 Jul 16 16:06 snappy-java-1.1.8.4.jar
-rw-r--r--@  1 sparshagarwal  admin  5885445 Jul 16 16:06 zstd-jni-1.5.2-1.jar


In [20]:
%%writefile ../test.txt
Hello
This is first message to kafka connect

Writing ../test.txt


In [22]:
%%writefile ../test.txt
This is the third message to kafka connect

Overwriting ../test.txt
