(3.0): reimplement cdc base on the existing design #22801
Merged
zhangxu19830126
approved these changes
Nov 10, 2025
jiangxinmeng1
approved these changes
Nov 10, 2025
daviszhen
approved these changes
Nov 10, 2025
xzxiong
approved these changes
Nov 10, 2025
Labels
kind/bug
Something isn't working
kind/enhancement
kind/refactor
Code refactor
Review effort 5/5
size/XXL
Denotes a PR that changes 2000+ lines
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #22718
What this PR does / why we need it:
The new implementation has undergone systemic upgrades in architecture, testing, monitoring, and error handling, bringing it much closer to production readiness: issues can be detected faster, errors are corrected or downgraded automatically, diagnosis and recovery are easier, and there is room for future feature expansion and performance tuning.
PR Type
Bug fix, Enhancement, Tests
Description
Complete CDC v2 architecture reimplementation with systemic upgrades in architecture, testing, monitoring, and error handling for production readiness
New state machine-based executor (ExecutorStateMachine) replacing simple boolean flags with explicit state transitions (Starting, Running, Paused, Failed, Cancelled) and comprehensive error handling
V2 MySQL sinker (mysqlSinker2) implementing producer-consumer pattern with structured commands, transaction state machine (IDLE, ACTIVE, COMMITTED, ROLLED_BACK), and atomic error handling
Table change stream pipeline (TableChangeStream) with dual-layer transaction safety via TransactionManager and DataProcessor, handling StaleRead errors gracefully with watermark reset
Enhanced watermark updater with three-tier cache architecture, circuit breaker pattern for commit failures, error metadata caching, and retry logic with automatic circuit breaker
Comprehensive observability framework (ProgressTracker, ProgressMonitor) with detailed metrics tracking, watermark lag measurement, and stuck table detection
Executor with transaction management implementing circuit breaker pattern, retry mechanism with exponential backoff, and automatic reconnection logic
Extensive test coverage including regression tests for critical bug fixes, comprehensive integration tests, and end-to-end scenarios with mocked dependencies
Improved error handling with unified error metadata parsing, retry decision making, and proper resource cleanup with sync.Once pattern
Enhanced observability with structured logging, state information tracking, and comprehensive CDC metrics collection
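To illustrate what replacing a boolean flag with explicit state transitions can look like, here is a minimal sketch. The state names come from the description above; the `stateMachine` type, its methods, and the `allowed` transition table are assumptions made for illustration and are not taken from the PR's `ExecutorStateMachine`.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a hypothetical executor lifecycle state; the names follow the PR description.
type State int

const (
	Starting State = iota
	Running
	Paused
	Failed
	Cancelled
)

func (s State) String() string {
	return [...]string{"Starting", "Running", "Paused", "Failed", "Cancelled"}[s]
}

// allowed is an assumed transition table, not the one used in the PR.
var allowed = map[State][]State{
	Starting:  {Running, Failed, Cancelled},
	Running:   {Paused, Failed, Cancelled},
	Paused:    {Starting, Cancelled},
	Failed:    {Starting, Cancelled},
	Cancelled: {},
}

// stateMachine guards the current state with a mutex and rejects
// transitions that are not listed in allowed.
type stateMachine struct {
	mu    sync.Mutex
	state State
}

func newStateMachine() *stateMachine { return &stateMachine{state: Starting} }

// Current returns the state the machine is in right now.
func (m *stateMachine) Current() State {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.state
}

// Transition moves to the target state or returns an error if the move is not allowed.
func (m *stateMachine) Transition(to State) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	for _, next := range allowed[m.state] {
		if next == to {
			m.state = to
			return nil
		}
	}
	return fmt.Errorf("invalid transition %s -> %s", m.state, to)
}

func main() {
	m := newStateMachine()
	fmt.Println(m.Transition(Running)) // allowed
	fmt.Println(m.Transition(Paused))  // allowed
	fmt.Println(m.Transition(Running)) // rejected in this sketch: Paused must go through Starting
}
```

Compared with a single `isRunning` flag, an explicit table makes illegal moves (for example resuming a cancelled task) fail loudly instead of silently flipping a boolean.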
Diagram Walkthrough
File Walkthrough
9 files
sinker_v2_sql_builder_test.go
New comprehensive SQL builder test suite for CDC v2
pkg/cdc/sinker_v2_sql_builder_test.go
CDCStatementBuilder covering INSERT and DELETE SQL generation
support
various data types
table_change_stream_test.go
New comprehensive integration tests for table change stream
pkg/cdc/table_change_stream_test.go
TableChangeStream covering initialization and lifecycle
sinker_test.go
Simplified sinker tests with v2 architecture mocking
pkg/cdc/sinker_test.go
CreateMysqlSinker2 instead of low-level sink operations
SQL execution details
bug_fix_tests_test.go
New regression tests for critical bug fixes
pkg/cdc/bug_fix_tests_test.go
TransactionTracker.UpdateToTs() ensuring watermark advances correctly
CommitTransaction uses latest toTs after multiple updates
AtomicBatch.Close() behavior and fail-fast principles
watermark_updater_retry_test.go
Tests for watermark retry count tracking and error metadata
pkg/cdc/watermark_updater_retry_test.go
watermark updater
to non-retryable when exceeding
MaxRetryCount
changes reset the count
waitForErrorMetadata() to poll database for error metadata with timeout
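The retry-count behaviour these tests describe (an error becoming non-retryable once the count exceeds `MaxRetryCount`) can be sketched roughly as follows. The `errorMeta` struct, the `recordFailure` helper, and the value `3` are placeholders chosen for illustration; the PR's actual metadata format and threshold are not shown on this page.

```go
package main

import "fmt"

// maxRetryCount is a placeholder value; the PR's real MaxRetryCount may differ.
const maxRetryCount = 3

// errorMeta is a hypothetical stand-in for the per-table error metadata.
type errorMeta struct {
	Retryable  bool
	RetryCount int
	Message    string
}

// recordFailure increments the retry count carried over from prev and
// downgrades the error to non-retryable once the count exceeds maxRetryCount.
func recordFailure(prev *errorMeta, msg string) *errorMeta {
	next := &errorMeta{Retryable: true, RetryCount: 1, Message: msg}
	if prev != nil {
		next.RetryCount = prev.RetryCount + 1
	}
	if next.RetryCount > maxRetryCount {
		next.Retryable = false
	}
	return next
}

func main() {
	var meta *errorMeta
	for i := 0; i < 4; i++ {
		meta = recordFailure(meta, "commit failed")
		fmt.Printf("count=%d retryable=%v\n", meta.RetryCount, meta.Retryable)
	}
}
```

In this sketch the first three failures stay retryable and the fourth one flips the flag, which is the kind of boundary a regression test would want to pin down.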
cdc_test.go
Comprehensive CDC test refactoring with state machine and mock improvements
pkg/frontend/cdc_test.go
GetTableDetector to prevent panics across all tests
createMockTableDetectorForTest() factory function
isRunning flag
setupCDCTestStubs() helper to stub CDC operations for table stream tests
mockTableReader to mockChangeReader with updated interface methods
stubs
sinker_v2_test.go
New MySQL sinker v2 comprehensive test coverage
pkg/cdc/sinker_v2_test.go
mysqlSinker2 error handling and transaction lifecycle
(begin/commit/rollback)
sinker_v2_executor_test.go
New executor transaction and retry mechanism test suite
pkg/cdc/sinker_v2_executor_test.go
Executor transaction management and SQL execution
idempotent operations
patterns
reader_v2_data_processor_test.go
New data processor change handling test suite
pkg/cdc/reader_v2_data_processor_test.go
DataProcessor covering change processing workflows
8 files
watermark_updater.go
Enhanced watermark updater with circuit breaker and error handling
pkg/cdc/watermark_updater.go
(lag-acceptable, advance-forbidden)
for commit failures
error updates
for watermark operations
for persistent failures
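As a rough picture of a circuit breaker for commit failures, the sketch below trips after a number of consecutive failures and rejects further calls until a cooldown has passed. The `breaker` type, its fields, and both thresholds are hypothetical; the watermark updater's real breaker is not reproduced here, and this version is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errOpen   = errors.New("circuit breaker open")
	errCommit = errors.New("commit failed") // sample error used by the demo
)

// breaker is a hypothetical consecutive-failure circuit breaker.
type breaker struct {
	threshold int           // failures in a row that trip the breaker
	cooldown  time.Duration // how long calls are rejected once tripped
	failures  int
	openedAt  time.Time
}

func newBreaker(threshold int, cooldown time.Duration) *breaker {
	return &breaker{threshold: threshold, cooldown: cooldown}
}

// Do runs commit unless the breaker is open and still cooling down.
// After the cooldown one trial call is let through; a success closes
// the breaker, a failure restarts the cooldown.
func (b *breaker) Do(commit func() error) error {
	if b.failures >= b.threshold && time.Since(b.openedAt) < b.cooldown {
		return errOpen
	}
	if err := commit(); err != nil {
		b.failures++
		if b.failures >= b.threshold {
			b.openedAt = time.Now()
		}
		return err
	}
	b.failures = 0
	return nil
}

func main() {
	b := newBreaker(2, time.Minute)
	fail := func() error { return errCommit }
	fmt.Println(b.Do(fail)) // first failure
	fmt.Println(b.Do(fail)) // second failure trips the breaker
	fmt.Println(b.Do(fail)) // rejected without calling commit
}
```

The point of the pattern is that a persistently failing commit path stops being hammered, while a single success after the cooldown restores normal operation.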
cdc_exector.go
CDC executor state machine and error handling improvements
pkg/frontend/cdc_exector.go
isRunning boolean flag with ExecutorStateMachine for robust state management with explicit transitions (Starting, Running,
Paused, Failed, Cancelled)
metrics recording, and proper cleanup of resources (readers, routines,
table detector registrations)
stopAllReaders() method to synchronously stop all running readers with timeout protection before Pause/Cancel operations
log message format from "CDC-Task-*" to "cdc.frontend.task.*" pattern
handleNewTables() to continue processing other tables on individual failures instead of returning immediately, with detailed
error tracking and metrics
clearAllTableErrors() method to reset error messages during Resume, allowing retry of tables with non-retryable errors after user
fixes
ParseErrorMetadata() and
ShouldRetry() functions with improved retry decision making
NewTableReader to NewTableChangeStream (V2 architecture) with frequency parameter support
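The handleNewTables() change (keep processing the remaining tables when one of them fails) follows a common collect-and-continue pattern. A minimal sketch, assuming a hypothetical `startAll` helper and a caller-supplied `start` callback, neither of which is taken from the PR; `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a sample error used by the demo below.
var errUnavailable = errors.New("table unavailable")

// startAll tries every table, records each failure with the table name,
// and keeps going instead of returning on the first error.
func startAll(tables []string, start func(table string) error) (started []string, err error) {
	var errs []error
	for _, t := range tables {
		if e := start(t); e != nil {
			errs = append(errs, fmt.Errorf("table %s: %w", t, e))
			continue
		}
		started = append(started, t)
	}
	// errors.Join returns nil when no errors were collected.
	return started, errors.Join(errs...)
}

func main() {
	tables := []string{"db.t1", "db.t2", "db.t3"}
	started, err := startAll(tables, func(table string) error {
		if table == "db.t2" {
			return errUnavailable
		}
		return nil
	})
	fmt.Println(started)
	fmt.Println(err)
}
```

Here `db.t1` and `db.t3` are started even though `db.t2` fails, and the caller still receives an error that names the failing table.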
sinker_v2.go
New V2 MySQL sinker with structured commands and state machine
pkg/cdc/sinker_v2.go
mysqlSinker2 with improved architecture: structured commands, explicit transaction state machine, and atomic
error handling
communication between reader and sinker goroutines
ROLLED_BACK) ensuring no transaction leaks
panics and ensuring thread-safe error propagation
FLUSH operations with detailed logging and metrics
of consumer goroutine
ProgressTracker for observability and metrics recording (SQL execution, duration, throughput)
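One way to picture the explicit transaction state machine (IDLE, ACTIVE, COMMITTED, ROLLED_BACK) is a small guard that refuses operations that would leak or double-finish a transaction. The `txTracker` type and its rules below are assumptions for illustration only; the sinker's real state handling is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

type txState int

const (
	txIdle txState = iota
	txActive
	txCommitted
	txRolledBack
)

var errBadState = errors.New("invalid transaction state")

// txTracker is a hypothetical guard around begin/commit/rollback.
type txTracker struct{ state txState }

// Begin refuses to start a transaction while one is still active,
// because the open transaction would otherwise leak.
func (t *txTracker) Begin() error {
	if t.state == txActive {
		return errBadState
	}
	t.state = txActive
	return nil
}

// Commit is only valid for an active transaction.
func (t *txTracker) Commit() error {
	if t.state != txActive {
		return errBadState
	}
	t.state = txCommitted
	return nil
}

// Rollback is a no-op unless a transaction is active, so cleanup
// paths can call it unconditionally (an assumption of this sketch).
func (t *txTracker) Rollback() {
	if t.state == txActive {
		t.state = txRolledBack
	}
}

func main() {
	var tx txTracker
	fmt.Println(tx.Begin())  // starts a transaction
	fmt.Println(tx.Begin())  // rejected: already active
	fmt.Println(tx.Commit()) // commits
	fmt.Println(tx.Commit()) // rejected: nothing active
}
```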
table_change_stream.go
New table change stream V2 implementation for CDC pipeline
pkg/cdc/table_change_stream.go
TableChangeStream as complete CDC data pipeline replacing old
TableReader
safety via
TransactionManager and DataProcessor
with detailed progress tracking
state with retryable flag
monitoring active runners
eventual consistency design
messages, and resource deregistration
observability.go
New observability framework for CDC progress tracking
pkg/cdc/observability.go
ProgressTracker for detailed observability of CDC processing per table with state tracking, watermark management,
and round-based metrics
detailed metrics (rows, bytes, SQL count, duration)
calculation, and watermark lag measurement
ProgressMonitor for monitoring multiple trackers and detecting stuck tables based on state age and watermark update lag
logging with configurable intervals
CdcInitialSyncStatusGauge, CdcThroughputRowsPerSecond, CdcWatermarkLagSeconds, etc.)
sinker_v2_executor.go
New executor with transaction and retry management
pkg/cdc/sinker_v2_executor.go
Executor struct managing database connections, transactions, and SQL execution
configurable thresholds
classification
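The description mentions a retry mechanism with exponential backoff for this executor. A minimal sketch of that idea follows; the `backoff` and `retry` helpers, their parameters, and the doubling-with-cap policy are assumptions and do not reflect the executor's actual configuration or error classification.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a sample error used by the demo below.
var errTransient = errors.New("transient failure")

// backoff returns the delay before retry number attempt (0-based),
// doubling from base and capping at max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

// retry calls op up to attempts times, sleeping with exponential
// backoff between failures, and wraps the last error if all attempts fail.
func retry(attempts int, base, max time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(backoff(i, base, max))
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, time.Millisecond, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

A real executor would normally also decide per error whether a retry makes sense at all; this sketch retries every failure for brevity.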
cdc_metrics.go
New comprehensive CDC monitoring and observability metrics
pkg/util/metric/v2/cdc_metrics.go
table streams, and data processing
events
retry attempts
buckets
cdc_handle.go
Enhanced CDC request logging and observability
pkg/frontend/cdc_handle.go
(drop/pause/resume/restart)
target status, and task name
information
1 files
sinker.go
Refactored sinker to use new v2 architecture
pkg/cdc/sinker.go
logic
NewSinker to delegate to new CreateMysqlSinker2 implementation
46 files