
springboot-kafka-connect-jdbc-streams

The main goal of this project is to play with Kafka, Kafka Connect and Kafka Streams. For this, we have: store-api, which inserts/updates records in MySQL; Source Connectors, which monitor inserted/updated records in MySQL and push messages related to those changes to Kafka; Sink Connectors, which listen to messages from Kafka and insert/update documents in Elasticsearch; and finally, store-streams, which listens to messages from Kafka, processes them using Kafka Streams and pushes new messages back to Kafka.

Proof-of-Concepts & Articles

On ivangfr.github.io, I have compiled my Proof-of-Concepts (PoCs) and articles. You can easily search for the technology you are interested in by using the filter. Who knows, perhaps I have already implemented a PoC or written an article about what you are looking for.

Additional Readings

Project Diagram

project-diagram

Applications

  • store-api

    Monolithic Spring Boot application that exposes a REST API to manage Customers, Products and Orders. The data is stored in MySQL.

    store-api-swagger

  • store-streams

    Spring Boot application that connects to Kafka and uses Kafka Streams API to transform some "input" topics into a new "output" topic in Kafka.

Prerequisites

(De)Serialization formats

In order to run this project, you can use JSON or Avro format to serialize/deserialize data to/from the binary format used by Kafka. The default format is JSON. Throughout this document, I will point out what to do if you want to use Avro.

Start Environment

  • Open a terminal and, inside the springboot-kafka-connect-jdbc-streams root folder, run:

    docker compose up -d
    

    Note: During the first run, an image for kafka-connect will be built, named springboot-kafka-connect-jdbc-streams_kafka-connect. To rebuild it, run the command below:

    docker compose build
    
  • Wait for all Docker containers to be up and running. To check it, run:

    docker compose ps
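
    Tip: to follow a specific container while it starts (the kafka-connect image build makes it the slowest on the first run), you can tail its logs, e.g.:

    docker compose logs -f kafka-connect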
    

Create Kafka Topics

In order to have topics in Kafka with more than one partition, we have to create them manually instead of letting the connectors create them for us. To do so:

  • Open a new terminal and make sure you are in springboot-kafka-connect-jdbc-streams root folder;

  • Run the script below:

    ./create-kafka-topics.sh
    

    It will create the topics mysql.storedb.customers, mysql.storedb.products, mysql.storedb.orders and mysql.storedb.orders_products, each with 5 partitions.
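
    If you prefer to create or inspect a topic by hand, a rough equivalent using the kafka-topics CLI inside the broker container would look like the sketch below (the container name kafka and the internal listener localhost:9092 are assumptions; check docker-compose.yml for the actual values):

    docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 \
      --create --topic mysql.storedb.customers --partitions 5 --replication-factor 1
    docker exec -it kafka kafka-topics --bootstrap-server localhost:9092 \
      --describe --topic mysql.storedb.customers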

Create connectors

Connectors use Converters for data serialization and deserialization. For JSON (de)serialization, the converter used is JsonConverter; for Avro (de)serialization, the converter used is AvroConverter.

Important: if the Source Connector Converter serializes data, for instance, from JSON to bytes (using JsonConverter), then the Sink Connector Converter must also use JsonConverter to deserialize the bytes, otherwise an error will be thrown. The document Kafka Connect Deep Dive – Converters and Serialization Explained explains it very well.
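
In practice, the converter choice boils down to a couple of properties in each connector's configuration (the actual configurations used by this project live in the create-connectors-*.sh scripts). A hypothetical fragment for a JSON-configured source connector:

{
  "name": "mysql-source-customers",
  "config": {
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "...": "remaining connector-specific properties"
  }
}

For Avro, the converter class is io.confluent.connect.avro.AvroConverter, and an additional value.converter.schema.registry.url property points the converter at the Schema Registry.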

Steps to create the connectors:

  • In a terminal, navigate to springboot-kafka-connect-jdbc-streams root folder

  • Run the following script to create the connectors on kafka-connect:

    • For JSON (de)serialization

      ./create-connectors-jsonconverter.sh
      
    • For Avro (de)serialization

      ./create-connectors-avroconverter.sh
      
  • You can check the state of the connectors and their tasks on Kafka Connect UI or by running the following script:

    ./check-connectors-state.sh
    
  • Once the connectors and their tasks are ready (RUNNING state), you should see something like:

    {"name":"mysql-source-customers","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"source"}
    {"name":"mysql-source-products","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"source"}
    {"name":"mysql-source-orders","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"source"}
    {"name":"mysql-source-orders_products","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"source"}
    {"name":"elasticsearch-sink-customers","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"sink"}
    {"name":"elasticsearch-sink-products","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"sink"}
    {"name":"elasticsearch-sink-orders","connector":{"state":"RUNNING","worker_id":"kafka-connect:8083"},"tasks":[{"id":0,"state":"RUNNING","worker_id":"kafka-connect:8083"}],"type":"sink"}
    
  • On Kafka Connect UI (http://localhost:8086), you should see:

    kafka-connect-ui

  • If there is any problem, you can check kafka-connect container logs:

    docker logs kafka-connect
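
    You can also query the Kafka Connect REST API directly for a single connector, which helps when a task ends up in FAILED state (assuming the worker's REST port 8083 is mapped to the host, as the worker_id values above suggest):

    curl localhost:8083/connectors/mysql-source-customers/status
    # restart task 0 of that connector, if needed
    curl -X POST localhost:8083/connectors/mysql-source-customers/tasks/0/restart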
    

Running Applications with Maven

  • store-api

    • Open a new terminal and make sure you are in springboot-kafka-connect-jdbc-streams root folder.

    • Run the command below to start the application:

      ./mvnw clean spring-boot:run --projects store-api -Dspring-boot.run.jvmArguments="-Dserver.port=9080"
      

      Note

      It will create all tables, such as customers, products, orders and orders_products, since we are using the spring.jpa.hibernate.ddl-auto=update configuration.

      It will also insert some sample customers and products. If you don't want that, set the properties load-samples.customers.enabled and load-samples.products.enabled to false in application.yml.

  • store-streams

    • Open a new terminal and make sure you are in the springboot-kafka-connect-jdbc-streams root folder.

    • To start the application, run:

      • For JSON (de)serialization

        ./mvnw clean spring-boot:run --projects store-streams -Dspring-boot.run.jvmArguments="-Dserver.port=9081"
        
      • For Avro (de)serialization

        Warning: I was unable to run in this mode on my machine! The application starts fine when using the avro profile but, when the first event arrives, org.apache.kafka.common.errors.SerializationException: Unknown magic byte! is thrown. The problem doesn't happen when Running Applications as Docker containers. (See the sketch at the end of this section for a way to inspect what is actually on the topics.)

        ./mvnw clean spring-boot:run --projects store-streams -Dspring-boot.run.jvmArguments="-Dserver.port=9081" -Dspring-boot.run.profiles=avro
        

        The command below generates Java classes from Avro files present in src/main/resources/avro

        ./mvnw generate-sources --projects store-streams
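
        When debugging the error above, it can help to consume one of the source topics and see how the records are encoded. A minimal sketch using the Avro console consumer shipped with the Schema Registry image (the container name schema-registry and the broker address kafka:9092 are assumptions; adjust them to your docker-compose.yml):

        docker exec -it schema-registry kafka-avro-console-consumer \
          --bootstrap-server kafka:9092 --topic mysql.storedb.customers \
          --from-beginning --property schema.registry.url=http://localhost:8081

        If the records are not Avro-encoded (for example, because the connectors were created with the JSON converter), the Unknown magic byte! error is expected.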
        

Running Applications as Docker containers

Build Application’s Docker Image

  • In a terminal, make sure you are inside springboot-kafka-connect-jdbc-streams root folder;

  • Run the following script to build the application's docker image:

    ./docker-build.sh
    

Application’s Environment Variables

  • store-api

    Environment Variable | Description
    MYSQL_HOST | Specify host of the MySQL database to use (default localhost)
    MYSQL_PORT | Specify port of the MySQL database to use (default 3306)
  • store-streams

    Environment Variable | Description
    KAFKA_HOST | Specify host of the Kafka message broker to use (default localhost)
    KAFKA_PORT | Specify port of the Kafka message broker to use (default 29092)
    SCHEMA_REGISTRY_HOST | Specify host of the Schema Registry to use (default localhost)
    SCHEMA_REGISTRY_PORT | Specify port of the Schema Registry to use (default 8081)
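
    Purely as an illustration of how these variables are consumed, a store-streams container could be started by hand along the lines of the sketch below (the image name, network, ports and addresses are assumptions; start-apps.sh and docker compose use the real ones):

    docker run -d --name store-streams \
      --network springboot-kafka-connect-jdbc-streams_default \
      -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
      -e SCHEMA_REGISTRY_HOST=schema-registry -e SCHEMA_REGISTRY_PORT=8081 \
      -p 9081:9081 \
      store-streams:1.0.0  # hypothetical image name; check docker-build.sh for the real one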

Run Application’s Docker Container

  • In a terminal, make sure you are inside springboot-kafka-connect-jdbc-streams root folder;

  • In order to run the applications' Docker containers, you can pick between JSON and Avro:

    • For JSON (de)serialization
      ./start-apps.sh
      
    • For Avro (de)serialization
      ./start-apps.sh avro
      

Application's URL

Application | URL
store-api | http://localhost:9080/swagger-ui.html
store-streams | http://localhost:9081/actuator/health

Testing

  1. Let's simulate an order creation. In this example, the customer with id 1

    {"id":1, "name":"John Gates", "email":"john.gates@test.com", "address":"street 1", "phone":"112233"}
    

    will order one unit of the product with id 15

    {"id":15, "name":"iPhone Xr", "price":900.00}
    

    In a terminal, run the following curl command:

    curl -i -X POST localhost:9080/api/orders \
      -H 'Content-Type: application/json' \
      -d '{"customerId": 1, "paymentType": "BITCOIN", "status": "OPEN", "products": [{"id": 15, "unit": 1}]}'
    

    The response should be:

    HTTP/1.1 201
    {
      "id": "47675629-4f0d-440d-b6df-c829874ee2a6",
      "customerId": 1,
      "paymentType": "BITCOIN",
      "status": "OPEN",
      "products": [{"id": 15, "unit": 1}]
    }
    
  2. Checking Elasticsearch:

    curl "localhost:9200/store.streams.orders/_search?pretty"
    

    We should have one order, enriched with the customer's name and the products' names:

    {
      "took" : 844,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "store.streams.orders",
            "_type" : "order",
            "_id" : "47675629-4f0d-440d-b6df-c829874ee2a6",
            "_score" : 1.0,
            "_source" : {
              "payment_type" : "BITCOIN",
              "created_at" : 1606821792360,
              "id" : "47675629-4f0d-440d-b6df-c829874ee2a6",
              "customer_name" : "John Gates",
              "customer_id" : 1,
              "status" : "OPEN",
              "products" : [
                {
                  "unit" : 1,
                  "price" : 900,
                  "name" : "iPhone Xr",
                  "id" : 15
                }
              ]
            }
          }
        ]
      }
    }
    
  3. In order to create random orders, we can also use the simulation endpoint:

    curl -i -X POST localhost:9080/api/simulation/orders \
      -H 'Content-Type: application/json' \
      -d '{"total": 10, "sleep": 100}'
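
Besides checking Elasticsearch, you can also peek at the enriched orders directly on the output topic produced by store-streams. A minimal sketch using the console consumer inside the broker container, assuming the JSON format, a container named kafka, an internal listener on localhost:9092, and that the topic name matches the Elasticsearch index used above:

docker exec -it kafka kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic store.streams.orders --from-beginning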
    

Useful Links/Commands

  • Kafka Topics UI

    Kafka Topics UI can be accessed at http://localhost:8085

  • Kafka Connect UI

    Kafka Connect UI can be accessed at http://localhost:8086

  • Schema Registry UI

    Schema Registry UI can be accessed at http://localhost:8001

  • Schema Registry

    You can use curl to check the subjects in Schema Registry

    • Get the list of subjects
      curl localhost:8081/subjects
      
    • Get the latest version of the subject mysql.storedb.customers-value
      curl localhost:8081/subjects/mysql.storedb.customers-value/versions/latest
      
  • Kafka Manager

    Kafka Manager can be accessed at http://localhost:9000

    Configuration

    • First, you must create a new cluster. Click on Cluster (dropdown on the header) and then on Add Cluster;
    • Type the name of your cluster in Cluster Name field, for example: MyCluster;
    • Type zookeeper:2181 in Cluster Zookeeper Hosts field;
    • Enable checkbox Poll consumer information (Not recommended for large # of consumers if ZK is used for offsets tracking on older Kafka versions);
    • Click on Save button at the bottom of the page.
  • Elasticsearch

    Elasticsearch can be accessed at http://localhost:9200

    • Get all indices:
      curl "localhost:9200/_cat/indices?v"
      
    • Search for documents:
      curl "localhost:9200/mysql.storedb.customers/_search?pretty"
      curl "localhost:9200/mysql.storedb.products/_search?pretty"
      curl "localhost:9200/store.streams.orders/_search?pretty"
      
  • MySQL

    docker exec -it -e MYSQL_PWD=secret mysql mysql -uroot --database storedb
    select * from orders;
    

Shutdown

  • To stop applications:
    • If they were started with Maven, go to the terminals where they are running and press Ctrl+C;
    • If they were started as Docker containers, go to a terminal and, inside springboot-kafka-connect-jdbc-streams root folder, run the script below:
      ./stop-apps.sh
      
  • To stop and remove docker compose containers, network and volumes, go to a terminal and, inside springboot-kafka-connect-jdbc-streams root folder, run the following command:
    docker compose down -v
    

Cleanup

To remove the Docker images created by this project, go to a terminal and, inside springboot-kafka-connect-jdbc-streams root folder, run the script below:

./remove-docker-images.sh

Issues

References
