ddl: Fix a race condition in DDL logic leading to inconsistency checked by syncpoint #3802
Conversation
/test help
Summary of Changes

Hello @hongyunyan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a potential race condition in the DDL event dispatching mechanism by introducing a buffering system within the DDL dispatcher.
/test pull-cdc-mysql-integration-heavy
Code Review
This pull request introduces logic to hold certain DDL events to prevent a race condition, along with a new unit test to verify this behavior. The core logic change seems correct in its intent. However, I've identified a critical race condition in the implementation that could lead to events being stuck indefinitely. I've provided suggestions to fix this race. Additionally, there are some temporary changes in an integration test script that should be reverted before merging.
if d.IsTableTriggerEventDispatcher() &&
	d.resendTaskMap.Len() > 0 &&
	blockedTables != nil &&
	blockedTables.InfluenceType != commonEvent.InfluenceTypeNormal {
	d.blockEventStatus.setBlockEvent(event, heartbeatpb.BlockStage_WAITING)
	d.HoldBlockEvent(event)
	return
}
Following my suggestion to change HoldBlockEvent to fix a race condition, this part needs to be updated to use the new return value.
Current:

if d.IsTableTriggerEventDispatcher() &&
	d.resendTaskMap.Len() > 0 &&
	blockedTables != nil &&
	blockedTables.InfluenceType != commonEvent.InfluenceTypeNormal {
	d.blockEventStatus.setBlockEvent(event, heartbeatpb.BlockStage_WAITING)
	d.HoldBlockEvent(event)
	return
}

Suggested:

if d.IsTableTriggerEventDispatcher() &&
	blockedTables != nil &&
	blockedTables.InfluenceType != commonEvent.InfluenceTypeNormal {
	if d.HoldBlockEvent(event) {
		d.blockEventStatus.setBlockEvent(event, heartbeatpb.BlockStage_WAITING)
		return
	}
}
func (d *BasicDispatcher) HoldBlockEvent(event commonEvent.BlockEvent) {
	d.holdingBlockEventMu.Lock()
	defer d.holdingBlockEventMu.Unlock()
	// The event stream is blocked by this block event, so at most one such event can be pending here.
	if d.holdingBlockEvent != nil {
		d.HandleError(errors.ErrDispatcherFailed.GenWithStackByArgs(
			"hold non-normal block event failed: holdingBlockEvent is already occupied",
		))
		return
	}
	d.holdingBlockEvent = event
}
This function is part of a sequence that has a critical race condition. The check d.resendTaskMap.Len() > 0 in DealWithBlockEvent and this function call are not atomic. An ACK could be processed between the check and the hold, leading to a stuck event.
To fix this, the check should be moved inside this function to make the check-and-hold atomic. This function should also return a boolean to indicate whether the event was held. I will also add a comment in DealWithBlockEvent to adapt to this change.
func (d *BasicDispatcher) HoldBlockEvent(event commonEvent.BlockEvent) bool {
	d.holdingBlockEventMu.Lock()
	defer d.holdingBlockEventMu.Unlock()
	if d.resendTaskMap.Len() == 0 {
		return false
	}
	// The event stream is blocked by this block event, so at most one such event can be pending here.
	if d.holdingBlockEvent != nil {
		d.HandleError(errors.ErrDispatcherFailed.GenWithStackByArgs(
			"hold non-normal block event failed: holdingBlockEvent is already occupied",
		))
	} else {
		d.holdingBlockEvent = event
	}
	return true
}
/test pull-cdc-mysql-integration-light

/gemini review
Code Review
This pull request introduces a mechanism to prevent race conditions during DDL processing, specifically between DDLs that alter table schemas (like adding or dropping tables) and subsequent database-wide or global events (like DROP DATABASE or syncpoints). The solution involves two main parts: the DDL dispatcher now holds these global events if it has un-acknowledged DDLs, and the maintainer will discard such events if it has pending scheduling tasks. This ensures that range checkers for global events are built with a complete and up-to-date view of all tables. The changes are well-reasoned and accompanied by thorough unit tests. My review has identified a minor inaccuracy in a comment and a significant but likely temporary change in the CI test configuration that should be reverted before merging.
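As a rough illustration of the maintainer-side safeguard described in this summary, the sketch below defers (does not ACK) a DB/All-scope block event while schedule-required tasks are still outstanding. All type, field, and method names here (barrier, pendingScheduleTasks, handleBlockEvent) are illustrative stand-ins, not the repository's actual API.

package main

import "fmt"

// Illustrative stand-ins for the real maintainer-side types.
type InfluenceType int

const (
	InfluenceTypeNormal InfluenceType = iota
	InfluenceTypeDB
	InfluenceTypeAll
)

type BlockEvent struct {
	CommitTs      uint64
	InfluenceType InfluenceType
}

type barrier struct {
	// Number of add/drop-table scheduling tasks not yet finished (assumed field).
	pendingScheduleTasks int
}

// handleBlockEvent defers a DB/All-scope block event while scheduling work is
// still pending, so the range checker is only built from a complete table set.
// Returning false means "no ACK"; the DDL dispatcher keeps resending the event.
func (b *barrier) handleBlockEvent(e BlockEvent) (acked bool) {
	if e.InfluenceType != InfluenceTypeNormal && b.pendingScheduleTasks > 0 {
		return false // discard for now; the event will arrive again via resend
	}
	// ... build the range checker and ACK as usual ...
	return true
}

func main() {
	b := &barrier{pendingScheduleTasks: 1}
	syncpoint := BlockEvent{CommitTs: 100, InfluenceType: InfluenceTypeAll}
	fmt.Println(b.handleBlockEvent(syncpoint)) // false: deferred until scheduling finishes
	b.pendingScheduleTasks = 0
	fmt.Println(b.handleBlockEvent(syncpoint)) // true: safe to process now
}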
mysql_groups=(
    # G00
    'api_v2 generate_column many_pk_or_uk multi_source large_txn'
    'syncpoint syncpoint syncpoint syncpoint'
    # G01
    'ddl_for_split_tables_with_random_move_table'
    'syncpoint syncpoint syncpoint syncpoint'
    # G02
    'ddl_for_split_tables_with_failover'
    'syncpoint syncpoint syncpoint syncpoint'
    # G03
    'cdc move_table checkpoint_race_ddl_crash'
    'syncpoint syncpoint syncpoint syncpoint'
    # G04
    'complex_transaction syncpoint syncpoint_check_ts random_drop_message'
    'syncpoint syncpoint syncpoint syncpoint'
    # G05
    'ddl_for_split_tables_with_merge_and_split'
    'syncpoint syncpoint syncpoint syncpoint'
    # G06
    'ddl_for_split_tables_with_random_merge_and_split'
    # G07
    'consistent_compatibility consistent_partition_table consistent_replicate_gbk consistent_replicate_ddl consistent_replicate_basic'
    # G08
    'default_value http_proxies bank ddl_for_split_tables_random_schedule'
    # G09
    'availability resolve_lock merge_table drop_many_tables ddl_for_split_tables'
    # G10
    'consistent_replicate_nfs consistent_replicate_storage_file consistent_replicate_storage_file_large_value consistent_replicate_storage_s3'
    # G11
    'multi_changefeeds ddl_wait ddl_reentrant force_replicate_table multi_source'
    'syncpoint syncpoint syncpoint syncpoint'
    # G01
    'syncpoint syncpoint syncpoint syncpoint'
    # G02
    'syncpoint syncpoint syncpoint syncpoint'
    # G03
    'syncpoint syncpoint syncpoint syncpoint'
    # G04
    'syncpoint syncpoint syncpoint syncpoint'
    # G05
    'syncpoint syncpoint syncpoint syncpoint'
    # G12
    'ddl_with_random_move_table'
    # G13
    'tidb_mysql_test fail_over region_merge fail_over_ddl_mix_random_delay'
    # G14
    'fail_over_ddl_mix'
    # G15
    'fail_over_ddl_mix_with_syncpoint'
    'syncpoint syncpoint syncpoint syncpoint'
    # G01
    'syncpoint syncpoint syncpoint syncpoint'
    # G02
    'syncpoint syncpoint syncpoint syncpoint'
    # G03
    'syncpoint syncpoint syncpoint syncpoint'
    # G04
    'syncpoint syncpoint syncpoint syncpoint'
    # G05
    'syncpoint syncpoint syncpoint syncpoint'
)
The mysql_groups array has been significantly modified, with most of the original integration tests replaced by repeated syncpoint tests. This appears to be a temporary change for debugging purposes. Please ensure this is reverted before the final merge to avoid disabling a large number of important integration tests.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
/gemini review

/gemini summary
Code Review
This pull request introduces a mechanism to prevent a race condition between DDLs that modify table schedules (like CREATE, DROP, TRUNCATE) and global DDLs or syncpoints. This is achieved by holding DB/All level events in the DDL dispatcher until all pending scheduling tasks are acknowledged by the maintainer, ensuring correct serialization. A safeguard is also added on the maintainer side to discard such events if scheduling tasks are still pending. The core logic appears sound and is supported by new unit tests. However, there is a critical issue in a test script that needs to be addressed before merging.
// pendingACKCount is only used by the table trigger dispatcher.
//
// It tracks the number of DDLs that need to report to maintainer (i.e. DDLs that add/drop tables),
// but have not been ACKed by maintainer (i.e. maintainer has not finished scheduling add/drop tasks).
pendingACKCount atomic.Int64
The comment for pendingACKCount states that it tracks DDLs that add/drop tables. However, the implementation has been expanded to track all events reported to the maintainer by the DDL dispatcher that require an ACK, including syncpoints. To improve clarity and maintainability, please update the comment to reflect this broader scope.
For example:
// It tracks the number of events reported to the maintainer by the table trigger dispatcher that are awaiting an ACK.
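Applied to the field quoted above, the broader comment might read as follows (a suggested rendering only, not the PR's final wording):

// pendingACKCount is only used by the table trigger dispatcher.
//
// It tracks the number of events reported to the maintainer by the table
// trigger dispatcher that are awaiting an ACK, covering both schedule-related
// DDLs and DB/All block events such as syncpoints.
pendingACKCount atomic.Int64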
Summary of Changes

This pull request addresses a critical race condition in the DDL event processing pipeline within TiCDC. It introduces a robust buffering and deferral mechanism to ensure that global block events, such as syncpoints or database-wide DDLs, are not acted upon by the maintainer until all preceding DDLs that involve adding or dropping tables have been fully scheduled and acknowledged. This enhancement prevents the maintainer from operating on an outdated or incomplete view of the system's schema, thereby improving data consistency and the reliability of crash recovery.
/test all

/test all

/retest

/test pull-cdc-kafka-integration-heavy

/gemini review
Code Review
This pull request introduces a buffering mechanism to fix a race condition in DDL processing, which is a solid approach. The changes on both the dispatcher and maintainer sides aim to ensure correctness. However, I've identified a critical race condition in the new dispatcher logic that could undermine the entire fix. The logic for deciding when to dispatch a held event is not atomic, which could lead to out-of-order processing under concurrent DDL operations, re-introducing the original bug. My review focuses on this critical issue.
if d.IsTableTriggerEventDispatcher() {
	d.pendingACKCount.Add(-1)
	// If there is a held DB/All block event, report it as soon as all resend tasks are ACKed.
	// For schedule-related non-blocking DDLs, the maintainer only ACKs after scheduling is done.
	// For schedule-related blocking DDLs, the maintainer only begins handling them once there are no pending scheduling tasks.
	// Thus, we ensure DB/All block events can generate correct range checkers.
	if d.pendingACKCount.Load() == 0 {
		if holding := d.popHoldingBlockEvent(); holding != nil {
			d.reportBlockedEventToMaintainer(holding)
		}
	}
}
There's a critical race condition here that could undermine the fix this PR introduces. The logic to check pendingACKCount and dispatch the held event is not atomic.
- Stale Read: The pendingACKCount is decremented, and then its value is loaded in a separate step (d.pendingACKCount.Load() == 0). Another goroutine could increment pendingACKCount between the Add(-1) and Load() calls. This would cause the Load() to see a non-zero value and fail to dispatch the held event, leaving it stuck.
- Check-Then-Act Race: Even if you fix the stale read by using the result of Add(-1) (i.e., if d.pendingACKCount.Add(-1) == 0), a more subtle race exists. Another DDL event could be processed concurrently and increment pendingACKCount after the check evaluates to true but before the held event is dispatched by reportBlockedEventToMaintainer. This would violate the core invariant that global block events should only be dispatched when no other DDLs are pending.
To fix this robustly, the check for pendingACKCount == 0 and the subsequent pop-and-dispatch of the holding event must be performed atomically. I recommend using a sync.Mutex to protect these critical sections across all related methods (cancelResendTask, holdBlockEvent, and modifications to pendingACKCount).
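A minimal sketch of that mutex approach, using simplified stand-in types (BlockEvent, dispatcher) rather than the actual ticdc structures; the point is that the count update, the zero check, and the pop of the held event all happen under one lock:

package main

import (
	"fmt"
	"sync"
)

type BlockEvent struct{ CommitTs uint64 }

type dispatcher struct {
	mu                sync.Mutex // guards pendingACKCount and holdingBlockEvent together
	pendingACKCount   int64
	holdingBlockEvent *BlockEvent
}

// holdBlockEvent atomically checks whether any reported event is still awaiting
// an ACK and, only if so, parks the DB/All block event. It returns whether the
// event was held; if not, the caller can report it to the maintainer directly.
func (d *dispatcher) holdBlockEvent(e *BlockEvent) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.pendingACKCount == 0 {
		return false
	}
	d.holdingBlockEvent = e
	return true
}

// onACK decrements the pending count and, under the same lock, pops the held
// event when the count reaches zero, so no concurrent holdBlockEvent can
// interleave between the check and the dispatch.
func (d *dispatcher) onACK(report func(*BlockEvent)) {
	d.mu.Lock()
	d.pendingACKCount--
	var held *BlockEvent
	if d.pendingACKCount == 0 && d.holdingBlockEvent != nil {
		held = d.holdingBlockEvent
		d.holdingBlockEvent = nil
	}
	d.mu.Unlock()
	if held != nil {
		report(held)
	}
}

func main() {
	d := &dispatcher{pendingACKCount: 1}
	fmt.Println(d.holdBlockEvent(&BlockEvent{CommitTs: 42})) // true: held until the ACK arrives
	d.onACK(func(e *BlockEvent) { fmt.Println("report held event at commitTs", e.CommitTs) })
}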
// Double-check here to avoid pendingACKCount becoming zero before we hold the event.
if d.pendingACKCount.Load() == 0 {
	holding := d.popHoldingBlockEvent()
	if holding != nil {
		d.reportBlockedEventToMaintainer(holding)
	}
}
This double-check has the same critical race condition as the one in cancelResendTask. The pendingACKCount can change between the Load() and the popHoldingBlockEvent() call. This could lead to the held event being dispatched while another event is still pending acknowledgement, violating the ordering guarantees this PR aims to enforce.
This critical section should also be protected by the same mutex proposed for cancelResendTask to ensure the decision to dispatch and the dispatch action itself are atomic with respect to other pending operations.
/test all
if d.pendingACKCount.Load() == 0 {
	if holding := d.popHoldingBlockEvent(); holding != nil {
		d.reportBlockedEventToMaintainer(holding)
	}
}
Is it possible for pendingACKCount to become less than zero? It would be better to add a check to cover that case.
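A hypothetical guard for this concern might look like the sketch below (using a plain panic for illustration; the real code would presumably use the project's logger):

package main

import (
	"fmt"
	"sync/atomic"
)

type dispatcher struct {
	pendingACKCount atomic.Int64
}

// ackOne decrements the pending count and fails loudly if it ever goes
// negative, which would mean more ACKs were processed than events reported.
func (d *dispatcher) ackOne() {
	if cnt := d.pendingACKCount.Add(-1); cnt < 0 {
		panic(fmt.Sprintf("pendingACKCount dropped below zero: %d", cnt))
	}
}

func main() {
	d := &dispatcher{}
	d.pendingACKCount.Store(1)
	d.ackOne() // fine: count reaches zero
	fmt.Println("remaining:", d.pendingACKCount.Load())
}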
if d.pendingACKCount.Load() == 0 {
	holding := d.popHoldingBlockEvent()
	if holding != nil {
		d.reportBlockedEventToMaintainer(holding)
	}
}
Ditto
/retest
[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: lidezhu, wk989898

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/retest

/retest

1 similar comment

/retest
What problem does this PR solve?
Issue Number: close #3799
What is changed and how it works?
This pull request addresses a critical race condition in the DDL event processing pipeline within TiCDC. It introduces a robust buffering and deferral mechanism to ensure that global block events, such as syncpoints or database-wide DDLs, are not acted upon by the maintainer until all preceding DDLs that involve adding or dropping tables have been fully scheduled and acknowledged. This enhancement prevents the maintainer from operating on an outdated or incomplete view of the system's schema, thereby improving data consistency and the reliability of crash recovery.
Highlights
- Introduced a mechanism in the BasicDispatcher to temporarily hold database-level or all-table-level block events, ensuring they are processed only after all preceding DDL tasks that modify tables have been acknowledged.
- Added pendingACKCount and holdingBlockEvent to prevent a race condition where the maintainer might create an incomplete spanController task snapshot if global block events are processed before table-modifying DDLs are fully scheduled.
- Updated the Barrier in the maintainer to defer acknowledging DB/All block events from the DDL dispatcher if there are still pending schedule-required events, ensuring an accurate range checker is built.
- Added new unit tests (TestHoldBlockEventUntilNoResendTasks, TestDeferAllDBBlockEventFromDDLDispatcherWhilePendingSchedule) to validate the new deferral and buffering logic for DDL block events.

Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note