<div id="singlestore-header" style="display: flex; background-color: rgba(235, 249, 245, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/database.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">Ingest data from Confluent Cloud (Kafka)</h1>
    </div>
</div>

<img src=https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/confluent-cloud-integration/images/confluent-kafka-integration.png width="100%">

### Confluent cluster (Kafka) setup

Before starting the integration, set up a Confluent Cloud Kafka cluster. The <a href="https://docs.confluent.io/cloud/current/get-started/index.html">Confluent Cloud quick start guide</a> walks through the process.

- Once the cluster is created, create a topic (for example, <b>'s2-topic'</b>) and configure an Avro schema for the message value. The proposed default schema works for this example:

<img src=https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/confluent-cloud-integration/images/kafka-value-schema.png width="1000">

- Create API keys and save them for later use:

<img src=https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/notebooks/confluent-cloud-integration/images/confluent-api-key.png width="1000">

- Go to 'Connectors' and create a sample producer <b>'datagen'</b> for the Kafka topic <b>'s2-topic'</b>, selecting the 'Use an existing API key' option (<a href="https://docs.confluent.io/cloud/current/get-started/index.html#step-3-create-a-sample-producer">Step 3</a> of the quick start guide). Configure the producer to use the same schema as the topic.
- Launch the <b>'datagen'</b> producer and check that <b>'s2-topic'</b> receives new messages.

### Set up variables

Choose values for <b>S2_DATABASE_NAME</b>, <b>S2_TABLE_NAME</b> and <b>S2_PIPELINE_NAME</b> to use for the integration.

### Copy data from Confluent Cloud
- Set <b>CONFLUENT_KAFKA_TOPIC_NAME</b> to the name of the topic you created ('s2-topic').
- Set <b>CONFLUENT_API_KEY</b> and <b>CONFLUENT_API_SECRET</b> to the API key and secret you saved earlier.

Go to 'Clients', choose a language (for example, Java), and set the following variables from the displayed client configuration:
- <b>CONFLUENT_CLUSTER_BOOTSTRAP_SERVER</b> from <b>bootstrap.servers</b>
- <b>CONFLUENT_SCHEMA_REGISTRY_URL</b> from <b>schema.registry.url</b>

Click 'Create Schema Registry API key' to create a Schema Registry API key, then set:
- <b>CONFLUENT_SCHEMA_REGISTRY_KEY</b> and <b>CONFLUENT_SCHEMA_REGISTRY_SECRET</b>

In [1]:
S2_DATABASE_NAME = 'confluent_cloud_integration'
S2_TABLE_NAME = 'kafka_events'
S2_PIPELINE_NAME = 'kafka_consumer_pipeline'
CONFLUENT_KAFKA_TOPIC_NAME = 's2-topic'
CONFLUENT_CLUSTER_BOOTSTRAP_SERVER = 'pkc-xmzwx.europe-central2.gcp.confluent.cloud:9092'
CONFLUENT_API_KEY = 'EAPEIJZDU5KY26X5'
CONFLUENT_API_SECRET = '***************************************'

CONFLUENT_SCHEMA_REGISTRY_URL = 'https://psrc-9zg5y.europe-west3.gcp.confluent.cloud'
CONFLUENT_SCHEMA_REGISTRY_KEY = '7ALNJUEMWMBIMAQL'
CONFLUENT_SCHEMA_REGISTRY_SECRET = '***************************************'
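
Before running the SQL cells, it can help to check that every variable was actually filled in. The following is a minimal sketch, not part of the original notebook: `check_settings` is a hypothetical helper, and the rules it applies (a `host:port` bootstrap server, an `https` Schema Registry URL, no all-asterisk placeholders) are assumptions based on the example values above.

```python
from urllib.parse import urlparse


def check_settings(settings):
    """Return a list of problems; an empty list means the values look usable."""
    problems = []
    for name, value in settings.items():
        # A value made only of '*' is treated as a masked placeholder (assumption)
        if not value or set(value) == {"*"}:
            problems.append(f"{name} is empty or still a placeholder")

    # The bootstrap server is expected in host:port form
    server = settings.get("CONFLUENT_CLUSTER_BOOTSTRAP_SERVER", "")
    host, _, port = server.rpartition(":")
    if not host or not port.isdigit():
        problems.append("CONFLUENT_CLUSTER_BOOTSTRAP_SERVER should look like host:port")

    # The Schema Registry endpoint is expected to be an https URL
    url = urlparse(settings.get("CONFLUENT_SCHEMA_REGISTRY_URL", ""))
    if url.scheme != "https" or not url.netloc:
        problems.append("CONFLUENT_SCHEMA_REGISTRY_URL should be an https URL")
    return problems


example = {
    "CONFLUENT_CLUSTER_BOOTSTRAP_SERVER": "pkc-xmzwx.europe-central2.gcp.confluent.cloud:9092",
    "CONFLUENT_SCHEMA_REGISTRY_URL": "https://psrc-9zg5y.europe-west3.gcp.confluent.cloud",
    "CONFLUENT_API_SECRET": "***",
}
print(check_settings(example))
```

In the notebook you would pass a dictionary built from the variables defined in the cell above and fix anything the function reports before creating the pipeline.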

### Create database

In [2]:
%%sql

DROP DATABASE IF EXISTS {{S2_DATABASE_NAME}};
CREATE DATABASE {{S2_DATABASE_NAME}};

<div class="alert alert-block alert-warning">    <b class="fa fa-solid fa-exclamation-circle"></b>    <div>        <p><b>Action Required</b></p>        <p>Make sure to select the <tt>{{S2_DATABASE_NAME}}</tt> database from the drop-down menu at the top of this notebook.        It updates the <tt>connection_url</tt> which is used by the <tt>%%sql</tt> magic command and SQLAlchemy to make connections to the selected database.</p>    </div></div>

### Create a table that matches the Kafka Avro schema

In [3]:
%%sql

DROP PIPELINE IF EXISTS {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}};
DROP TABLE IF EXISTS {{S2_DATABASE_NAME}}.{{S2_TABLE_NAME}};
CREATE TABLE IF NOT EXISTS {{S2_DATABASE_NAME}}.{{S2_TABLE_NAME}} (
`field1` int,
`field2` double,
`field3` text
);
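
If your topic uses a different schema, the column list has to change with it. The sketch below shows one way to derive column definitions from an Avro value schema. It is an illustration only: `columns_from_avro` and the `AVRO_TO_SQL` type mapping are assumptions rather than an official conversion table, and the sample schema is hypothetical, modeled on the field names used in the pipeline mapping later in this notebook.

```python
import json

# Assumed mapping from Avro primitive types to SingleStore column types;
# adjust to your data (for example, a VARCHAR instead of TEXT).
AVRO_TO_SQL = {
    "boolean": "BOOL",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "string": "TEXT",
    "bytes": "BLOB",
}


def columns_from_avro(schema_json):
    """Return one '`name` TYPE' column definition per top-level Avro field."""
    schema = json.loads(schema_json)
    columns = []
    for field in schema["fields"]:
        avro_type = field["type"]
        # A nullable field is usually a union such as ["null", "int"]
        if isinstance(avro_type, list):
            avro_type = [t for t in avro_type if t != "null"][0]
        # Logical types arrive as dicts; fall back to their base "type"
        if isinstance(avro_type, dict):
            avro_type = avro_type["type"]
        columns.append(f"`{field['name']}` {AVRO_TO_SQL[avro_type]}")
    return columns


# Hypothetical sample schema for illustration
sample_schema = """
{"type": "record", "name": "sampleRecord", "fields": [
  {"name": "my_field1", "type": "int"},
  {"name": "my_field2", "type": "double"},
  {"name": "my_field3", "type": "string"}
]}
"""
print(",\n".join(columns_from_avro(sample_schema)))
```

The helper reuses the Avro field names as column names, whereas the table above renames them to `field1`, `field2` and `field3`. Nested records, arrays and maps are not handled and raise a `KeyError`.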

### Create the Kafka pipeline

Update the mapping section at the end of the statement to match your schema, using the format <i>'table column name'</i>  <-  <i>'schema registry field name'</i>.

In [4]:
%%sql

DROP PIPELINE IF EXISTS {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}};
CREATE PIPELINE {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}}
AS LOAD DATA KAFKA '{{CONFLUENT_CLUSTER_BOOTSTRAP_SERVER}}/{{CONFLUENT_KAFKA_TOPIC_NAME}}'
CONFIG '{"sasl.username": "{{CONFLUENT_API_KEY}}",
         "sasl.mechanism": "PLAIN",
         "security.protocol": "SASL_SSL",
         "ssl.ca.location": "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem",
         "schema.registry.username": "{{CONFLUENT_SCHEMA_REGISTRY_KEY}}"}'
CREDENTIALS '{"sasl.password": "{{CONFLUENT_API_SECRET}}",
              "schema.registry.password": "{{CONFLUENT_SCHEMA_REGISTRY_SECRET}}"}'
BATCH_INTERVAL 20
DISABLE OFFSETS METADATA GC
INTO TABLE {{S2_TABLE_NAME}}
FORMAT AVRO
SCHEMA REGISTRY '{{CONFLUENT_SCHEMA_REGISTRY_URL}}'
(
field1  <-  my_field1,
field2  <-  my_field2,
field3  <-  my_field3
);
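
The statement above has three parts that vary per setup: the `CONFIG` JSON, the `CREDENTIALS` JSON, and the column mapping. As a rough sketch, they could be generated in Python so that quoting is handled by `json.dumps` instead of by hand. `pipeline_clauses` is a hypothetical helper, not part of SingleStore or this notebook, and the key names are copied from the statement above.

```python
import json


def pipeline_clauses(api_key, api_secret, registry_key, registry_secret, mapping):
    """Return the CONFIG JSON, CREDENTIALS JSON and mapping clause as strings."""
    # Non-secret connection settings go into CONFIG
    config = json.dumps({
        "sasl.username": api_key,
        "sasl.mechanism": "PLAIN",
        "security.protocol": "SASL_SSL",
        "ssl.ca.location": "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem",
        "schema.registry.username": registry_key,
    })
    # Secrets go into CREDENTIALS, as in the statement above
    credentials = json.dumps({
        "sasl.password": api_secret,
        "schema.registry.password": registry_secret,
    })
    # Each entry becomes 'table column <- schema registry field'
    lines = [f"{column} <- {field}" for column, field in mapping.items()]
    mapping_clause = "(\n" + ",\n".join(lines) + "\n)"
    return config, credentials, mapping_clause


config, credentials, clause = pipeline_clauses(
    "KEY", "SECRET", "SR_KEY", "SR_SECRET",
    {"field1": "my_field1", "field2": "my_field2"},
)
print(clause)
```

The returned strings still have to be wrapped in single quotes inside the SQL statement, so this sketch assumes that none of the values contains a single quote.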

### Test created pipeline

In [5]:
%%sql
TEST PIPELINE {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}} LIMIT 1;

### Start pipeline

In [6]:
%%sql

START PIPELINE {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}};
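
A started pipeline loads data in the background, so the table may still be empty if it is queried or the pipeline is stopped right away. One option is to poll the row count before moving on. The helper below is a generic sketch, not a SingleStore API: `wait_for_rows` and its parameters are hypothetical, and `count_rows` stands for any function you supply that returns the current number of rows in the table.

```python
import time


def wait_for_rows(count_rows, min_rows=1, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll count_rows() until it reports at least min_rows or the timeout expires.

    Returns the last observed row count.
    """
    deadline = time.monotonic() + timeout
    rows = count_rows()
    while rows < min_rows and time.monotonic() < deadline:
        sleep(interval)  # wait before asking again
        rows = count_rows()
    return rows


# Simulated counter standing in for a real 'SELECT COUNT(*)' query
fake_counts = iter([0, 0, 5])
print(wait_for_rows(lambda: next(fake_counts), interval=0))
```

In the notebook, `count_rows` could run `SELECT COUNT(*)` against the target table, for instance through a SQLAlchemy engine created from `connection_url`.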

### Stop pipeline

In [7]:
%%sql

STOP PIPELINE {{S2_DATABASE_NAME}}.{{S2_PIPELINE_NAME}};

### Select consumed events

In [8]:
%%sql

SELECT * FROM {{S2_DATABASE_NAME}}.{{S2_TABLE_NAME}};

<div id="singlestore-footer" style="background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px"></div>
<div><img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png" style="padding: 0px; margin: 0px; height: 24px"/></div>