ddl, maintainer: fix data loss and dead lock in corner case #3445
ti-chi-bot[bot] merged 22 commits into pingcap:master
Conversation
/test all
Summary of Changes (Gemini Code Assist)

This pull request refactors the internal mechanism for scheduling barrier block events within the system's maintainer component, likely to improve synchronization or state management. Concurrently, it adjusts the suite of heavy integration tests by removing a specific DDL-related test, which may streamline the CI process or reflect changes in DDL handling strategy.
Code Review
This pull request refactors the DDL barrier logic by moving the scheduleBlockEvent call. Previously, this was called when all dispatchers reported being blocked, but now it's correctly called after the writer dispatcher confirms the DDL execution is complete. This improves the correctness of the state management. However, the PR description is missing, which makes it difficult to understand the full context, especially regarding the removal of the ddl_wait integration test. I've added comments to address this and suggest some minor code cleanups. Additionally, an obsolete todo comment on line 366 of maintainer/barrier.go could be removed as part of this change.
  'consistent_replicate_nfs consistent_replicate_storage_file consistent_replicate_storage_file_large_value consistent_replicate_storage_s3'
  # G11
- 'multi_changefeeds ddl_wait ddl_reentrant force_replicate_table multi_source'
+ 'multi_changefeeds ddl_reentrant force_replicate_table multi_source'
maintainer/barrier_event.go (outdated)
  	zap.Uint64("commitTs", be.commitTs),
  	zap.String("barrierType", be.blockedDispatchers.InfluenceType.String()))
- be.scheduleBlockEvent()
+ // be.scheduleBlockEvent()
/test --help
@hongyunyan: The specified target(s) for /test were not found. The following commands are available to trigger optional jobs. In response to this: /test --help
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/test pull-cdc-mysql-integration-light
/test pull-cdc-mysql-integration-heavy
/test pull-cdc-storage-integration-heavy
/test pull-cdc-kafka-integration-heavy
/test all
/gemini review
Summary of Changes

This pull request focuses on enhancing the robustness and correctness of DDL handling and scheduling within the system. It introduces a new ordered scheduling mechanism for DDL events to prevent data loss in complex scenarios, refactors core components for better maintainability and deadlock prevention, and updates the integration test suite with more flexible failpoints and a new comprehensive test case to validate these improvements under stress.
Code Review
This pull request introduces significant improvements to the DDL handling mechanism, enhancing its robustness and correctness, particularly in complex scenarios involving failovers and table topology changes. The key changes include:

- Ordered DDL scheduling: a new pending event queue (pendingScheduleEventMap) has been implemented using a min-heap. This ensures that DDL events that modify table structures are scheduled in the correct order based on their commit timestamps, preventing potential race conditions and data loss (e.g., a RECOVER TABLE executing before its corresponding TRUNCATE TABLE is finished).
- Deadlock fix: a critical deadlock between the operatorController and spanController has been resolved. The logic for removing tasks has been refactored to break a circular lock dependency, improving system stability.
- Enhanced testing: a new comprehensive integration test (fail_over_ddl_mix_random_delay) has been added to stress-test DDL handling under chaotic conditions, including random DDLs, DMLs, and server failovers with injected delays. The MySQLSinkExecDDLDelay failpoint has also been made more flexible.
Overall, this is a high-quality contribution that significantly strengthens the DDL processing pipeline. I have a couple of minor suggestions for further improvement.
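The ordered-scheduling idea in the first bullet can be sketched with Go's container/heap: pending block events are keyed by (blockTs, isSyncPoint) and popped in commit-ts order, so a later DDL can never be scheduled ahead of an earlier one. The eventKey fields follow the review; the heap type and helper names are illustrative, not the actual TiCDC implementation.

```go
package main

import (
	"container/heap"
	"fmt"
)

// eventKey orders pending block events: first by commit ts, then a
// plain DDL before a sync point at the same ts.
type eventKey struct {
	blockTs     uint64
	isSyncPoint bool
}

func compareEventKey(a, b eventKey) int {
	switch {
	case a.blockTs < b.blockTs:
		return -1
	case a.blockTs > b.blockTs:
		return 1
	case a.isSyncPoint == b.isSyncPoint:
		return 0
	case !a.isSyncPoint: // plain DDL schedules first
		return -1
	default:
		return 1
	}
}

// eventHeap is a min-heap of pending block events.
type eventHeap []eventKey

func (h eventHeap) Len() int           { return len(h) }
func (h eventHeap) Less(i, j int) bool { return compareEventKey(h[i], h[j]) < 0 }
func (h eventHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *eventHeap) Push(x any)        { *h = append(*h, x.(eventKey)) }
func (h *eventHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// drainOrder pushes the events and pops them back, returning commit
// timestamps in scheduling order.
func drainOrder(events []eventKey) []uint64 {
	h := &eventHeap{}
	for _, e := range events {
		heap.Push(h, e)
	}
	var order []uint64
	for h.Len() > 0 {
		order = append(order, heap.Pop(h).(eventKey).blockTs)
	}
	return order
}

func main() {
	// A RECOVER TABLE (ts 120) may be reported before the TRUNCATE
	// TABLE (ts 100) it depends on; the heap restores commit-ts order.
	fmt.Println(drainOrder([]eventKey{{blockTs: 120}, {blockTs: 100}}))
	// → [100 120]
}
```

Deduplication (the map part of pendingScheduleEventMap) would sit on top of this heap; the sketch only shows the ordering invariant.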
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}
The compareEventKey function is not a proper 3-way comparison function as it doesn't handle equality cases correctly. For example, if a and b are equal, it returns 1 instead of 0. While this might not cause issues with the current heap implementation due to deduplication in pendingScheduleEventMap, it's misleading and brittle. A future change might rely on it being a correct comparator.
I suggest refactoring the function to correctly return 0 for equal keys for clarity and robustness.
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if a.isSyncPoint == b.isSyncPoint {
		return 0
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}
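To show why a proper 3-way comparator matters, here is a self-contained sketch (eventKey reproduced from the diff; the usage itself is illustrative) demonstrating that the suggested version returns 0 for equal keys and plugs directly into the standard library's sorting:

```go
package main

import (
	"fmt"
	"slices"
)

type eventKey struct {
	blockTs     uint64
	isSyncPoint bool
}

// compareEventKey is the corrected 3-way comparator from the review
// suggestion: equal keys now compare as 0.
func compareEventKey(a, b eventKey) int {
	if a.blockTs < b.blockTs {
		return -1
	}
	if a.blockTs > b.blockTs {
		return 1
	}
	if a.isSyncPoint == b.isSyncPoint {
		return 0
	}
	if !a.isSyncPoint && b.isSyncPoint {
		return -1
	}
	return 1
}

func main() {
	keys := []eventKey{
		{blockTs: 200, isSyncPoint: true},
		{blockTs: 100, isSyncPoint: true},
		{blockTs: 100, isSyncPoint: false},
	}
	// A valid comparator works with slices.SortFunc out of the box.
	slices.SortFunc(keys, compareEventKey)
	fmt.Println(keys)
	// Equal keys compare as 0, so the comparator is now symmetric.
	fmt.Println(compareEventKey(keys[0], keys[0]))
}
```

With the original version, compareEventKey(a, a) returned 1, so both compareEventKey(a, b) and compareEventKey(b, a) could be positive for equal keys, which violates the contract sort-based callers expect.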
maintainer/span/span_controller.go (outdated)
	c.mu.Lock()
	defer c.mu.Unlock()
The function GetRemoveTasksBySchemaID uses a write lock (c.mu.Lock()), but it only performs read operations on the controller's state. This is inconsistent with GetRemoveTasksByTableIDs which correctly uses a read lock (c.mu.RLock()). Using a write lock here can unnecessarily block other concurrent read operations, impacting performance.
To improve concurrency, please use a read lock instead.
- c.mu.Lock()
- defer c.mu.Unlock()
+ c.mu.RLock()
+ defer c.mu.RUnlock()
	zap.String("dispatcher", be.writerDispatcher.String()),
	zap.Uint64("commitTs", be.commitTs),
	zap.String("barrierType", be.blockedDispatchers.InfluenceType.String()))
be.scheduleBlockEvent()
This is the core reason why TRUNCATE TABLE could lose data. If we schedule as soon as all dispatchers have reported their status (here), the schedule action may happen before the DDL has flushed successfully. The new table's dispatcher would then be created and receive DML before the TRUNCATE TABLE is executed. So I removed the call here and instead schedule when the dispatcher reports that the DDL has executed successfully.
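The fix can be pictured as a toy state machine (names are illustrative, not the real BarrierEvent API): scheduling moves from the "all dispatchers blocked" transition to the "writer reported DDL done" transition.

```go
package main

import "fmt"

// barrierEvent sketches the fix: scheduling of the block event is
// deferred until the writer dispatcher reports the DDL as executed,
// not when all dispatchers merely report being blocked.
type barrierEvent struct {
	allBlocked bool
	ddlDone    bool
	scheduled  bool
}

func (b *barrierEvent) onAllDispatchersBlocked() {
	b.allBlocked = true
	// Before the fix, scheduleBlockEvent() was called here: the new
	// table's dispatcher could start and receive DML before the
	// TRUNCATE TABLE had actually flushed.
}

func (b *barrierEvent) onWriterReportedDone() {
	b.ddlDone = true
	b.scheduleBlockEvent() // after the fix: schedule only once the DDL is flushed
}

func (b *barrierEvent) scheduleBlockEvent() { b.scheduled = true }

func main() {
	be := &barrierEvent{}
	be.onAllDispatchersBlocked()
	fmt.Println(be.scheduled) // still false: nothing scheduled yet
	be.onWriterReportedDone()
	fmt.Println(be.scheduled) // true
}
```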
}

trap 'stop_tidb_cluster; collect_logs $WORK_DIR' EXIT
main
Please also add a redo test.
/hold
/test pull-cdc-mysql-integration-heavy
(3 similar comments)
/merge
@hongyunyan: We have migrated to builtin 👉 Please use
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
/tide
/help
/test help
@hongyunyan: The specified target(s) for /test were not found. The following commands are available to trigger optional jobs. In response to this: /test help
/test pull-error-log-review
[APPROVAL NOTIFIER] This PR is APPROVED.

This pull request has been approved by: flowbehappy, lidezhu, wk989898. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Approvers can indicate their approval by writing /approve in a comment.
What problem does this PR solve?
Issue Number: close #3443
What is changed and how it works?
Summary of Changes
This pull request focuses on enhancing the robustness and correctness of DDL handling and scheduling within the system. It introduces a new ordered scheduling mechanism for DDL events to prevent data loss in complex scenarios, refactors core components for better maintainability and deadlock prevention, and updates the integration test suite with more flexible failpoints and a new comprehensive test case to validate these improvements under stress.
Highlights
- Ordered DDL event scheduling: the barrier logic moves BlockedEventMap and eventKey to a new helper file (barrier_helper.go) and introduces a pendingScheduleEventMap to manage DDL events that require strict scheduling order. The Barrier and BarrierEvent structs have been updated to leverage this new ordered scheduling, particularly for DDLs that modify tables (e.g., truncate, add/drop tables), to prevent data consistency issues during concurrent operations.
- Deadlock prevention: operator_controller and span_controller have been modified to separate the logic of identifying tasks for removal from the actual removal process. New methods like GetRemoveTasksByTableIDs and GetRemoveTasksBySchemaID in span_controller now retrieve tasks without holding locks, allowing operator_controller to process these tasks and remove replica sets before the spans are removed, thereby preventing potential deadlocks.
- Configurable failpoint: the MySQLSinkExecDDLDelay failpoint in pkg/sink/mysql/mysql_writer_ddl.go has been enhanced to accept a configurable delay duration. This allows for more flexible and precise testing of DDL execution delays in integration tests.
- Test updates: the ensure timeouts for redo log checks have been reduced in multiple tests, and the MySQLSinkExecDDLDelay failpoint is now used with a specific long delay (3600 seconds) in several consistent replication tests. A new integration test, fail_over_ddl_mix_random_delay, has been added to simulate mixed DML traffic, random DDLs with varying delays, and CDC failovers to improve robustness testing.

Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note