Change Data Capture

In this post I'll show and describe how to achieve change data capture (aka CDC) using some of the most reliable open source software. Here they are:

Docker
MS SQL
Debezium
Kafka
Teiid
And of course some Java code.

Docker

Docker builds on Linux kernel capabilities to ship an application as a Linux container image (using the Docker format), which gives consistency and reliability across environments.

I would call it the real "write once, run everywhere... as long as it's Linux".

More info on docker.io

Microsoft SQL Server

A database platform...

I know, I said open source software, but I'm running the official Docker image, so at least it is free!

They are getting there...

Debezium

No better explanation than the one taken from its site:

Debezium is an open source distributed platform for change data capture. Debezium records in a transaction log all row-level changes committed to each database table. Each application simply reads the transaction logs they're interested in, and they see all of the events in the same order in which they occurred.

More info on debezium.io

Kafka

No better explanation than the one taken from its site:

Kafka® is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Teiid

No better explanation than the one taken from its site:

Teiid offers a relational abstraction of all information sources that is highly performant and allows for integration with your existing relational tools. Teiid has an accompanying easy-to-use design tool that enables data architects to integrate disparate information in minutes.
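
Because Teiid presents its sources as relational tables, a client can reach them with plain JDBC. The sketch below is only an illustration of that idea: the VDB name `demo-vdb`, the host, and the credentials are hypothetical placeholders, the port 31000 is assumed to be Teiid's default JDBC port, and the Teiid JDBC driver is assumed to be on the classpath when a query is actually run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TeiidQuerySketch {

    // Builds a Teiid JDBC URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>.
    static String teiidUrl(String vdb, String host, int port) {
        return "jdbc:teiid:" + vdb + "@mm://" + host + ":" + port;
    }

    // Runs a query that returns a single numeric value (for example a COUNT)
    // and returns that value. Needs a reachable Teiid server and its driver.
    static long queryForLong(String url, String user, String password, String sql)
            throws SQLException {
        try (Connection connection = DriverManager.getConnection(url, user, password);
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(sql)) {
            if (!resultSet.next()) {
                throw new SQLException("Query returned no rows: " + sql);
            }
            return resultSet.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        // "demo-vdb" and "localhost" are placeholders, not names from the repo.
        String url = teiidUrl("demo-vdb", "localhost", 31000);
        System.out.println("Teiid JDBC URL: " + url);
        // Only attempt a real query when user, password and SQL are supplied.
        if (args.length == 3) {
            System.out.println(queryForLong(url, args[0], args[1], args[2]));
        }
    }
}
```

Run without arguments it only prints the URL; pass a user, a password, and a SQL statement to try it against a running Teiid instance.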

Java code

You can find everything in this repo:

https://github.com/foogaro/change-data-capture/tree/master/infinispan-teiid

Architecture

There:

So basically, this is the flow:

some external software adds new records to the database;
the database stores the data and updates its transaction log;
Debezium reads the transaction log and picks up the changes from the database (insert, update, or delete);
Debezium sends each change as a single event to a Kafka topic;
the InfinispanSinkConnector receives those events from the topic and sends them to Infinispan;
a cache listener on Infinispan processes the new key and converts the message into a POJO (with ProtoBuf annotations);
asynchronously, Teiid exposes the Infinispan caches, providing JDBC, ODBC and OData4 access for client applications;
applications can query and aggregate the data they need by connecting directly to Teiid, and proudly show their dashboards.
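
To make the middle of that flow more concrete, here is a minimal sketch of how a stream of change events can be replayed, in order, into a key/value cache. It is not the code from the repo: `ChangeEvent` is a hypothetical, heavily simplified stand-in for a Debezium event (the real envelope carries much more), and a plain `HashMap` stands in for an Infinispan cache. The only thing taken from Debezium is the operation code convention: `c` for create, `u` for update, `d` for delete, and `r` for a snapshot read.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ChangeEventApplier {

    // Hypothetical simplified change event: operation code, row key, new row state.
    public static final class ChangeEvent {
        final String op;
        final String key;
        final String after; // null for deletes

        public ChangeEvent(String op, String key, String after) {
            this.op = op;
            this.key = key;
            this.after = after;
        }
    }

    // Stand-in for the Infinispan cache used in the real pipeline.
    private final Map<String, String> cache = new HashMap<>();

    // Applies one event; events must arrive in the order they were committed.
    public void apply(ChangeEvent event) {
        switch (event.op) {
            case "c": // row inserted
            case "r": // row read during the initial snapshot
            case "u": // row updated
                cache.put(event.key, event.after);
                break;
            case "d": // row deleted
                cache.remove(event.key);
                break;
            default:
                throw new IllegalArgumentException("Unknown operation: " + event.op);
        }
    }

    // Read-only copy of the current cache content.
    public Map<String, String> snapshot() {
        return Collections.unmodifiableMap(new HashMap<>(cache));
    }

    public static void main(String[] args) {
        ChangeEventApplier applier = new ChangeEventApplier();
        applier.apply(new ChangeEvent("c", "1", "Alice"));
        applier.apply(new ChangeEvent("c", "2", "Bob"));
        applier.apply(new ChangeEvent("u", "1", "Alicia"));
        applier.apply(new ChangeEvent("d", "2", null));
        System.out.println(applier.snapshot());
    }
}
```

Since every event is applied in commit order, the cache ends up mirroring the table. This ordering is what the Kafka topic preserves between Debezium and the sink connector.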

Now, all this work is implemented for you as an MVPoC. It would be really cool if you could contribute by fixing the following issues:

Fix issue #1 for the Infinispan custom cache-store and get rid of the @ClientListener running as while (true) {} in a Java class;
Fix issue #2 to run it all on OpenShift;
Fix issue #3 to show cache metrics in Grafana.

Running the whole thing

As easy as

./scripts/run-them-all

Grafana dashboard

Here is the Grafana dashboard to monitor the Infinispan caches:

Demo

You can find a playback demo here.

The playback demo can be downloaded as either a GIF or a WEBM file.

Implementation Details

Now I'm really tired.

If you need, drop me an email.

Ciao,

Luigi
