Jack Humphrey edited this page May 16, 2018 · 21 revisions


A listing of projects to get data streams out of MySQL

List of projects that will let you do replication from MySQL to Kafka.

| Project name | Site | Description |
| --- | --- | --- |
| aesop | https://github.com/Flipkart/aesop | Built on top of Databus. In production use at http://www.flipkart.com/. Allows you to plug in your own code to transform/process the MySQL events. |
| databus | https://github.com/linkedin/databus | Precursor to Kafka. Reads from MySQL and Oracle, and replicates to its own log structure. In production use at LinkedIn. No Kafka integration. Uses Open Replicator. |
| FlexCDC | http://github.com/greenlion/swanhart-tools/ | A daemon that reads a MySQL replication stream and sends records to log tables or plugins. Supports transactions and ALTER TABLE to keep the log tables in DDL sync with the MySQL server. |
| Lapidus | https://github.com/JarvusInnovations/lapidus | Streams data from MySQL, PostgreSQL and MongoDB as newline-delimited JSON. Can be run as a daemon or included as a Node.js module. |
| Maxwell | https://github.com/zendesk/maxwell | Reads the MySQL event stream and outputs events as JSON. Parses ALTER/CREATE TABLE/etc. statements to keep the schema in sync. Written in Java. Well maintained. |
| mypipe | https://github.com/mardambey/mypipe | Reads the MySQL event stream and emits events corresponding to INSERTs, DELETEs, and UPDATEs. Written in Scala. Emits Avro to Kafka. |
| mysql-binlog-connector-java | https://github.com/shyiko/mysql-binlog-connector-java | Library that parses MySQL binary logs and calls your code to process them. Fork/rewrite of Open Replicator. Has tests. |
| mysql_streamer | https://github.com/Yelp/mysql_streamer | MySQLStreamer is a database change data capture and publish system. It captures each individual database change, envelopes it into a message, and publishes it to Kafka. |
| oltp-cdc-olap | https://github.com/xmlking/nifi-examples/tree/master/oltp-cdc-olap | Uses Maxwell to replicate to Apache NiFi. |
| Open Replicator | https://code.google.com/p/open-replicator/ | Library that parses MySQL binary logs and calls your code to process them. Does not seem to be maintained. |
| python-mysql-replication | https://github.com/noplay/python-mysql-replication | Pure Python library that parses MySQL binary logs and lets you process the replication events. Basically the Python equivalent of mysql-binlog-connector-java. |
| recordbus | https://github.com/pyr/recordbus | Directly maps MySQL events to JSON, with no interpretation. Written in Java. Replicates to Kafka. |
| Tungsten Replicator | https://github.com/continuent/tungsten-replicator | Reads from MySQL and replicates to its own log structure. Allows plugging in your own code to process the events however you want. Open source component of Continuent, a commercial company that does database replication. |
| wombat | https://github.com/TiVo/wombat | Uses mysql-binlog-connector-java, outputs JSON to Kafka. |
| kafka-mysql-connector | https://github.com/wushujames/kafka-mysql-connector | A plugin for Kafka Connect. Uses Maxwell to replicate MySQL to Kafka. |
| Debezium | http://debezium.io | Replicates from MySQL to Kafka. Uses mysql-binlog-connector-java. Kafka Connector. A funded project supported by Red Hat, with employees working on it full time. |
| php-mysql-replication | https://github.com/krowinski/php-mysql-replication | Pure PHP implementation of the MySQL replication protocol. Lets you receive events such as inserts, updates, and deletes, with their data and raw SQL queries. |
| StreamSets Data Collector | https://streamsets.com/products/sdc/ | Pipelines that can be configured to continuously ingest data from any number of tables in a relational database (using JDBC). |
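
Several of the projects above turn row-level changes (INSERTs, UPDATEs, DELETEs) into newline-delimited JSON. As a rough illustration of that idea only, here is a minimal Python sketch. The function name `to_json_line` and the field names (`database`, `table`, `type`, `data`, `old`) are assumptions made for this example; they are not the actual output format of any project listed here, and the sample rows are made up.

```python
import json


def to_json_line(database, table, kind, row, old=None):
    """Serialize one row change as a single line of JSON.

    Hypothetical event shape, for illustration only.
    """
    if kind not in ("insert", "update", "delete"):
        raise ValueError(f"unsupported change type: {kind}")
    event = {"database": database, "table": table, "type": kind, "data": row}
    # For updates, optionally carry the previous values of the changed columns.
    if kind == "update" and old is not None:
        event["old"] = old
    return json.dumps(event, sort_keys=True)


# Made-up sample changes standing in for events read from a replication stream.
changes = [
    ("shop", "users", "insert", {"id": 1, "name": "alice"}, None),
    ("shop", "users", "update", {"id": 1, "name": "bob"}, {"name": "alice"}),
    ("shop", "users", "delete", {"id": 1, "name": "bob"}, None),
]

lines = [to_json_line(*change) for change in changes]
print("\n".join(lines))
```

Each output line is an independent JSON document, so a downstream consumer (for example a Kafka producer) could send one message per line. Reading the real replication stream is left to the libraries in the table.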