This pipeline demonstrates how to read change data capture (CDC) data from a MySQL database and replicate the changes to Delta Lake tables on Databricks.
For more information, see Loading Data into Databricks Delta Lake in StreamSets Data Collector documentation.
- StreamSets Data Collector 3.15.0 or higher. You can run Data Collector on your cloud provider of choice, or download it for local use.
- Complete the prerequisites for the Databricks Delta Lake destination
- MySQL server with the binary log enabled
- MySQL Connector/J JDBC Driver
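The binary-log prerequisite above means row-based logging, since CDC needs the full row images of each change. A minimal `my.cnf` sketch (the `server-id` value is an arbitrary example; any unique, non-zero id works):

```ini
[mysqld]
server-id        = 223344     # must be unique and non-zero for replication clients
log_bin          = mysql-bin  # enable the binary log
binlog_format    = ROW        # CDC requires row-based events, not statement-based
binlog_row_image = FULL       # log complete before/after row images
```

After restarting MySQL, `SHOW VARIABLES LIKE 'log_bin';` should return `ON`.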
- Download the pipeline and import it into Data Collector or Control Hub
- Configure all the pipeline parameters for your MySQL Database and Databricks connections
- If necessary, update the MySQL Binary Log origin to replicate only specific tables
- By default, the Databricks Delta Lake destination is configured to auto-create each table replicated from MySQL and to write the data to DBFS. Update the destination configuration as needed.
- Configure the Databricks Delta Lake destination to add a key column for each Delta Lake table being replicated. This is required to ensure the MERGE command runs with the right conditional logic for inserts, updates, and deletes.
- Start your Databricks cluster.
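To limit which tables are replicated, one option is the origin's table include/ignore filters, which take comma-separated `database.table` patterns. The field names follow the MySQL Binary Log origin's configuration, and the database and table names below are illustrative, not values from the sample pipeline:

```text
Include Tables: retail.orders,retail.customers
Ignore Tables:  retail.%_staging
```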
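With a key column configured, the destination's MERGE behaves conceptually like the statement sketched by this helper. This is illustrative Python, not StreamSets code; the table name (`customers`), key column (`id`), staging table (`staged_updates`), and operation column (`op_type`) are all assumptions for the example:

```python
def build_merge_sql(table: str, key: str, columns: list[str]) -> str:
    """Sketch the Delta Lake MERGE run for one replicated table.

    Delete events are matched first so they remove the target row;
    remaining matches become updates, and non-matches become inserts.
    """
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in columns)
    cols = ", ".join(columns)
    vals = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {table} t USING staged_updates s ON t.{key} = s.{key} "
        f"WHEN MATCHED AND s.op_type = 'DELETE' THEN DELETE "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals})"
    )

print(build_merge_sql("customers", "id", ["id", "name", "email"]))
```

Without a key column, the `ON` condition cannot be built, which is why the destination requires one per table.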
Start the pipeline. It takes a few seconds to establish a connection to Databricks. Once the connection is established, you should see records replicated from MySQL and sent to Delta Lake. Insert, update, and delete records in MySQL to see how the changes are replicated in Delta Lake.
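For a quick end-to-end check, statements like the following (the table and column names are examples, not values from the sample pipeline) should surface in the corresponding Delta Lake table as an insert, then an update, then a delete:

```sql
INSERT INTO retail.customers (id, name, email) VALUES (1, 'Ada', 'ada@example.com');
UPDATE retail.customers SET email = 'ada@lovelace.dev' WHERE id = 1;
DELETE FROM retail.customers WHERE id = 1;
```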