ddl, maintainer: fix data loss and dead lock in corner case #3445
ti-chi-bot[bot] merged 22 commits into pingcap:master
Conversation
/test all
Summary of Changes (Gemini Code Assist)

This pull request refactors the internal mechanism for scheduling barrier block events within the system's maintainer component, likely to improve synchronization or state management. Concurrently, it adjusts the suite of heavy integration tests by removing a specific DDL-related test, which may streamline the CI process or reflect changes in DDL handling strategy.
Code Review
This pull request refactors the DDL barrier logic by moving the scheduleBlockEvent call. Previously, this was called when all dispatchers reported being blocked, but now it's correctly called after the writer dispatcher confirms the DDL execution is complete. This improves the correctness of the state management. However, the PR description is missing, which makes it difficult to understand the full context, especially regarding the removal of the ddl_wait integration test. I've added comments to address this and suggest some minor code cleanups. Additionally, an obsolete todo comment on line 366 of maintainer/barrier.go could be removed as part of this change.
  'consistent_replicate_nfs consistent_replicate_storage_file consistent_replicate_storage_file_large_value consistent_replicate_storage_s3'
  # G11
- 'multi_changefeeds ddl_wait ddl_reentrant force_replicate_table multi_source'
+ 'multi_changefeeds ddl_reentrant force_replicate_table multi_source'
maintainer/barrier_event.go (outdated)
  	zap.Uint64("commitTs", be.commitTs),
  	zap.String("barrierType", be.blockedDispatchers.InfluenceType.String()))
- be.scheduleBlockEvent()
+ // be.scheduleBlockEvent()
/test --help
@hongyunyan: The specified target(s) for /test were not found. The following commands are available to trigger optional jobs. In response to this: /test --help
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test pull-cdc-mysql-integration-light
/test pull-cdc-mysql-integration-heavy
/test pull-cdc-storage-integration-heavy
/test pull-cdc-kafka-integration-heavy
/test all
/gemini review
Summary of Changes

This pull request focuses on enhancing the robustness and correctness of DDL handling and scheduling within the system. It introduces a new ordered scheduling mechanism for DDL events to prevent data loss in complex scenarios, refactors core components for better maintainability and deadlock prevention, and updates the integration test suite with more flexible failpoints and a new comprehensive test case to validate these improvements under stress.
Code Review
This pull request introduces significant improvements to the DDL handling mechanism, enhancing its robustness and correctness, particularly in complex scenarios involving failovers and table topology changes. The key changes include:

- Ordered DDL scheduling: a new pending event queue (pendingScheduleEventMap) has been implemented using a min-heap. This ensures that DDL events that modify table structures are scheduled in the correct order based on their commit timestamps, preventing potential race conditions and data loss (e.g., a RECOVER TABLE executing before its corresponding TRUNCATE TABLE is finished).
- Deadlock fix: a critical deadlock between the operatorController and spanController has been resolved. The logic for removing tasks has been refactored to break a circular lock dependency, improving system stability.
- Enhanced testing: a new comprehensive integration test (fail_over_ddl_mix_random_delay) has been added to stress-test DDL handling under chaotic conditions, including random DDLs, DMLs, and server failovers with injected delays. The MySQLSinkExecDDLDelay failpoint has also been made more flexible.
Overall, this is a high-quality contribution that significantly strengthens the DDL processing pipeline. I have a couple of minor suggestions for further improvement.
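The ordered-scheduling idea in the first bullet can be sketched with Go's container/heap: pending block events are keyed by (blockTs, isSyncPoint) and popped in commit-ts order, so a later DDL can never be scheduled ahead of an earlier one. The eventKey fields follow the review; the heap type and helper names are illustrative, not the actual TiCDC implementation.

```go
package main

import (
	"container/heap"
	"fmt"
)

// eventKey orders pending block events: first by commit ts, then a
// plain DDL before a sync point at the same ts.
type eventKey struct {
	blockTs     uint64
	isSyncPoint bool
}

func compareEventKey(a, b eventKey) int {
	switch {
	case a.blockTs < b.blockTs:
		return -1
	case a.blockTs > b.blockTs:
		return 1
	case a.isSyncPoint == b.isSyncPoint:
		return 0
	case !a.isSyncPoint: // plain DDL schedules first
		return -1
	default:
		return 1
	}
}

// eventHeap is a min-heap of pending block events.
type eventHeap []eventKey

func (h eventHeap) Len() int           { return len(h) }
func (h eventHeap) Less(i, j int) bool { return compareEventKey(h[i], h[j]) < 0 }
func (h eventHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *eventHeap) Push(x any)        { *h = append(*h, x.(eventKey)) }
func (h *eventHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// drainOrder pushes the events and pops them back, returning commit
// timestamps in scheduling order.
func drainOrder(events []eventKey) []uint64 {
	h := &eventHeap{}
	for _, e := range events {
		heap.Push(h, e)
	}
	var order []uint64
	for h.Len() > 0 {
		order = append(order, heap.Pop(h).(eventKey).blockTs)
	}
	return order
}

func main() {
	// A RECOVER TABLE (ts 120) may be reported before the TRUNCATE
	// TABLE (ts 100) it depends on; the heap restores commit-ts order.
	fmt.Println(drainOrder([]eventKey{{blockTs: 120}, {blockTs: 100}}))
	// → [100 120]
}
```

Deduplication (the map part of pendingScheduleEventMap) would sit on top of this heap; the sketch only shows the ordering invariant.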
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}
The compareEventKey function is not a proper 3-way comparison function as it doesn't handle equality cases correctly. For example, if a and b are equal, it returns 1 instead of 0. While this might not cause issues with the current heap implementation due to deduplication in pendingScheduleEventMap, it's misleading and brittle. A future change might rely on it being a correct comparator.
I suggest refactoring the function to correctly return 0 for equal keys for clarity and robustness.
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if a.isSyncPoint == b.isSyncPoint {
		return 0
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}
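To show why a proper 3-way comparator matters, here is a self-contained sketch (eventKey reproduced from the diff; the usage itself is illustrative) demonstrating that the suggested version returns 0 for equal keys and plugs directly into the standard library's sorting:

```go
package main

import (
	"fmt"
	"slices"
)

type eventKey struct {
	blockTs     uint64
	isSyncPoint bool
}

// compareEventKey is the corrected 3-way comparator from the review
// suggestion: equal keys now compare as 0.
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if a.isSyncPoint == b.isSyncPoint {
		return 0
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}

func main() {
	keys := []eventKey{
		{blockTs: 200, isSyncPoint: true},
		{blockTs: 100, isSyncPoint: true},
		{blockTs: 100, isSyncPoint: false},
	}
	// A valid comparator works with slices.SortFunc out of the box.
	slices.SortFunc(keys, compareEventKey)
	fmt.Println(keys)
	// Equal keys compare as 0, so the comparator is now symmetric.
	fmt.Println(compareEventKey(keys[0], keys[0]))
}
```

With the original version, compareEventKey(a, a) returned 1, so both compareEventKey(a, b) and compareEventKey(b, a) could be positive for equal keys, which violates the contract sort-based callers expect.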
maintainer/span/span_controller.go (outdated)
	c.mu.Lock()
	defer c.mu.Unlock()
The function GetRemoveTasksBySchemaID uses a write lock (c.mu.Lock()), but it only performs read operations on the controller's state. This is inconsistent with GetRemoveTasksByTableIDs which correctly uses a read lock (c.mu.RLock()). Using a write lock here can unnecessarily block other concurrent read operations, impacting performance.
To improve concurrency, please use a read lock instead.
- c.mu.Lock()
- defer c.mu.Unlock()
+ c.mu.RLock()
+ defer c.mu.RUnlock()
	zap.String("dispatcher", be.writerDispatcher.String()),
	zap.Uint64("commitTs", be.commitTs),
	zap.String("barrierType", be.blockedDispatchers.InfluenceType.String()))
be.scheduleBlockEvent()
This is the core reason why TRUNCATE TABLE could lose data. If we schedule as soon as all dispatchers have reported their status (here), the schedule action may happen before the DDL has flushed successfully. The new table's dispatcher would then be created and receive DML before the TRUNCATE TABLE is executed. So I removed the call here and instead schedule when the dispatcher reports that the DDL has executed successfully.
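The fix can be pictured as a toy state machine (names are illustrative, not the real BarrierEvent API): scheduling moves from the "all dispatchers blocked" transition to the "writer reported DDL done" transition.

```go
package main

import "fmt"

// barrierEvent sketches the fix: scheduling of the block event is
// deferred until the writer dispatcher reports the DDL as executed,
// not when all dispatchers merely report being blocked.
type barrierEvent struct {
	allBlocked bool
	ddlDone    bool
	scheduled  bool
}

func (b *barrierEvent) onAllDispatchersBlocked() {
	b.allBlocked = true
	// Before the fix, scheduleBlockEvent() was called here: the new
	// table's dispatcher could start and receive DML before the
	// TRUNCATE TABLE had actually flushed.
}

func (b *barrierEvent) onWriterReportedDone() {
	b.ddlDone = true
	b.scheduleBlockEvent() // after the fix: schedule only once the DDL is flushed
}

func (b *barrierEvent) scheduleBlockEvent() { b.scheduled = true }

func main() {
	be := &barrierEvent{}
	be.onAllDispatchersBlocked()
	fmt.Println(be.scheduled) // still false: nothing scheduled yet
	be.onWriterReportedDone()
	fmt.Println(be.scheduled) // true
}
```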
}

trap 'stop_tidb_cluster; collect_logs $WORK_DIR' EXIT
main
Please also add a redo test.
/hold
/test pull-cdc-mysql-integration-heavy
(3 similar comments)
/merge
@hongyunyan: We have migrated to builtin 👉 Please use
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
/tide
/help
/test help
@hongyunyan: The specified target(s) for /test were not found. The following commands are available to trigger optional jobs. In response to this: /test help
/test pull-error-log-review
[APPROVAL NOTIFIER] This PR is APPROVED.

This pull request has been approved by: flowbehappy, lidezhu, wk989898. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Approvers can indicate their approval by writing /approve in a comment.
What problem does this PR solve?
Issue Number: close #3443
What is changed and how it works?
Summary of Changes
This pull request focuses on enhancing the robustness and correctness of DDL handling and scheduling within the system. It introduces a new ordered scheduling mechanism for DDL events to prevent data loss in complex scenarios, refactors core components for better maintainability and deadlock prevention, and updates the integration test suite with more flexible failpoints and a new comprehensive test case to validate these improvements under stress.
Highlights
- Ordered DDL event scheduling: the barrier logic moves BlockedEventMap and eventKey to a new helper file (barrier_helper.go) and introduces a pendingScheduleEventMap to manage DDL events that require strict scheduling order. The Barrier and BarrierEvent structs have been updated to leverage this new ordered scheduling, particularly for DDLs that modify tables (e.g., truncate, add/drop tables), to prevent data consistency issues during concurrent operations.
- Deadlock prevention: operator_controller and span_controller have been modified to separate the logic of identifying tasks for removal from the actual removal process. New methods like GetRemoveTasksByTableIDs and GetRemoveTasksBySchemaID in span_controller now retrieve tasks without holding locks, allowing operator_controller to process these tasks and remove replica sets before the spans are removed, thereby preventing potential deadlocks.
- Configurable failpoint: the MySQLSinkExecDDLDelay failpoint in pkg/sink/mysql/mysql_writer_ddl.go has been enhanced to accept a configurable delay duration. This allows for more flexible and precise testing of DDL execution delays in integration tests.
- Test updates: the ensure timeouts for redo log checks have been reduced in multiple tests, and the MySQLSinkExecDDLDelay failpoint is now used with a specific long delay (3600 seconds) in several consistent replication tests. A new integration test, fail_over_ddl_mix_random_delay, has been added to simulate mixed DML traffic, random DDLs with varying delays, and CDC failovers to improve robustness testing.

Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note