DataStream-SQLServer provides real-time data streaming from SQL Server using Zookeeper, Kafka, and Debezium. This repository contains the necessary configurations, Docker setups, and sample code to get you started.
- Requirements
- Commands and Usage
- Managing Kafka Cluster
- Customization
- License
- Docker Desktop
- Python
- SQL Server
Run the SQL commands in commands/cdc-enable.sql to enable Change Data Capture (CDC) on your SQL Server database and table.
sql
-- Enable CDC on the database
EXEC sys.sp_cdc_enable_db;
-- Enable CDC on the table
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'customers', @role_name = NULL;
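If you capture more than one table, the per-table statement has to be repeated for each of them. The following is a minimal sketch, not part of this repository, that generates the same T-SQL as above for a list of tables. The function name `cdc_enable_statements` and the `orders` table are hypothetical and used only for illustration.

```python
def cdc_enable_statements(tables, schema="dbo"):
    """Return T-SQL statements that enable CDC on the database and each table."""

    def quote(value):
        # Double any single quote so the value is safe inside an N'...' literal.
        return "N'" + value.replace("'", "''") + "'"

    # CDC must be enabled on the database before any table can be tracked.
    statements = ["EXEC sys.sp_cdc_enable_db;"]
    for table in tables:
        statements.append(
            "EXEC sys.sp_cdc_enable_table "
            f"@source_schema = {quote(schema)}, "
            f"@source_name = {quote(table)}, "
            "@role_name = NULL;"
        )
    return statements


if __name__ == "__main__":
    # "orders" is a hypothetical second table, not one defined by this project.
    print("\n".join(cdc_enable_statements(["customers", "orders"])))
```

The output can be pasted into your SQL client or saved next to commands/cdc-enable.sql.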
Adjust docker-workspace/config.json to suit your SQL Server and Kafka configurations. This file is used by Debezium to connect to your SQL Server database and Kafka cluster.
json
{ "name": "demo-db-cdc", "config": { ... } }
docker-compose -f zookeeper-kafka.yml up -d
kafka-topics --create --topic test-topic --bootstrap-server localhost:9092 --partitions 10 --replication-factor 1
kafka-topics --list --bootstrap-server localhost:9092
kafka-topics --delete --topic test-topic --bootstrap-server localhost:9092
curl -X POST -H "Content-Type: application/json" --data @config.json localhost:8083/connectors/
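If you would rather register the connector from Python than from curl, the sketch below sends the same POST request using only the standard library. It assumes config.json has the top-level `name` and `config` keys shown earlier and that Kafka Connect listens on localhost:8083 as in the curl command. The function names are hypothetical and not part of this repository.

```python
import json
import urllib.request


def load_connector_config(path="config.json"):
    """Read the connector definition and check its top-level shape."""
    with open(path, encoding="utf-8") as handle:
        definition = json.load(handle)
    # Only the two top-level keys shown in this README are checked here.
    for key in ("name", "config"):
        if key not in definition:
            raise ValueError(f"missing top-level key: {key}")
    return definition


def build_register_request(definition, url="http://localhost:8083/connectors/"):
    """Build the same POST request that the curl command sends."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(path="config.json"):
    """Send the request and return the HTTP status and response body."""
    request = build_register_request(load_connector_config(path))
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `register_connector()` from the directory that contains config.json should behave like the curl command. An HTTP error response, for example when the connector name is already registered, is raised as `urllib.error.HTTPError`.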
Use Kafka Manager to manage your Kafka cluster, topics, and consumers:
- Access via http://localhost:9000/
- Zookeeper hosts: zookeeper:2181
Feel free to modify configurations, add more tables, or adjust Kafka settings to suit your requirements. Remember to adjust the consumer.py script to handle new schemas or tables.
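When adapting consumer.py to new tables, it can help to keep the parsing of the Debezium change event separate from any table-specific logic. The sketch below is an assumption about how that could look, not the code in consumer.py. It relies on the standard Debezium envelope fields `op`, `before`, and `after`, and assumes the messages use the JSON converter. The function name `parse_change_event` and the `id` column in the usage example are hypothetical.

```python
import json

# Debezium's one-letter operation codes.
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def parse_change_event(raw_value):
    """Turn one Kafka message value into (operation, before, after)."""
    if raw_value is None:
        # Tombstone record that can follow a delete; there is nothing to parse.
        return None
    event = json.loads(raw_value)
    # With schemas enabled in the JSON converter the row data sits under
    # "payload"; without them the envelope is the top-level object.
    payload = event.get("payload", event)
    operation = OPERATIONS.get(payload.get("op"), "unknown")
    return operation, payload.get("before"), payload.get("after")


if __name__ == "__main__":
    # A hand-written sample event, not real output from this project.
    sample = json.dumps({"payload": {"op": "c", "before": None, "after": {"id": 1}}})
    print(parse_change_event(sample))
```

Because `before` and `after` are returned as plain dictionaries keyed by column name, a new table only needs new handling for its columns, not a new parser.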
This project is licensed under the MIT License - see the LICENSE.md file for details.