ttl: stabilize TestCancelWhileScan runtime by zanmato1984 · Pull Request #67657 · pingcap/tidb

zanmato1984 · 2026-04-09T11:28:39Z

What problem does this PR solve?

Issue Number: ref #66982

Problem Summary:

TestCancelWhileScan still times out in CI under resource pressure. The previous fix in #67285 addressed statement-boundary cancellation correctness, but the test still spent too much time in long-mode stress setup and execution, and could exceed the shard timeout.

What changed and how does it work?

This PR keeps the same cancellation assertions and makes the stress path cheaper and more deterministic:

- Batch test data inserts in TestCancelWhileScan instead of issuing 10k single-row inserts.
- Use bounded rounds (10 default, 30 in -long) instead of time-based loops.
- Add a small scan delay failpoint (sleepCoprRequest=200ms) so each round still exercises cancellation while avoiding heavy table size/time requirements.

This keeps the regression coverage focused on cancellation responsiveness while removing the long-tail runtime behavior that caused timeouts.
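The batching idea above can be sketched in isolation. This is a minimal, hedged example and not the code from the PR: the function name `buildBatchInserts`, the table name `t`, and the `(id, NOW())` row shape are all assumptions made for illustration. It only shows how many single-row statements collapse into a few multi-row statements joined with `strings.Join`.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBatchInserts returns INSERT statements covering ids [0, total),
// with at most batchSize rows per statement.
// The table name and the (id, NOW()) row shape are illustrative assumptions,
// not the schema used by the real test.
func buildBatchInserts(table string, total, batchSize int) []string {
	if total <= 0 || batchSize <= 0 {
		return nil
	}
	stmts := make([]string, 0, (total+batchSize-1)/batchSize)
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		// Collect the value tuples for this batch, then join them once.
		rows := make([]string, 0, end-start)
		for id := start; id < end; id++ {
			rows = append(rows, fmt.Sprintf("(%d, NOW())", id))
		}
		stmts = append(stmts,
			fmt.Sprintf("INSERT INTO %s VALUES %s", table, strings.Join(rows, ", ")))
	}
	return stmts
}

func main() {
	stmts := buildBatchInserts("t", 1000, 100)
	fmt.Println(len(stmts), "statements instead of 1000")
	// Show a small batch so the generated SQL shape is visible.
	for _, s := range buildBatchInserts("t", 3, 2) {
		fmt.Println(s)
	}
}
```

In a real test each returned statement would be executed once, so setup cost scales with the number of batches rather than the number of rows.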

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

None

Summary by CodeRabbit

Release Notes

This release contains internal test improvements with no user-facing changes.

Tests
- Improved test efficiency by batching data setup and reducing redundant operations.
- Made scan/cancel test runs deterministic by switching to fixed iterations.
- Enhanced fault-injection coverage to better exercise cancellation behavior.

pantheon-ai · 2026-04-09T11:28:46Z

@zanmato1984 I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

ti-chi-bot · 2026-04-09T11:28:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tangenta for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tiprow · 2026-04-09T11:28:58Z

Hi @zanmato1984. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo, meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

coderabbitai · 2026-04-09T11:29:02Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 10fa171a-1c8f-4937-b992-998a2d60b2cf

📥 Commits

Reviewing files that changed from the base of the PR and between fca6b36 and 10aeb41.

📒 Files selected for processing (1)

pkg/ttl/ttlworker/scan_integration_test.go

✅ Files skipped from review due to trivial changes (1)

pkg/ttl/ttlworker/scan_integration_test.go

📝 Walkthrough

Walkthrough

Test updated to use batched multi-row inserts (1,000 rows in 10 batches of 100), enable the sleepCoprRequest failpoint with return(200), and replace time-based scan/cancel looping with a fixed rounds := 10 iteration count; removed testflag import and related timing variables.

Changes

| Cohort / File(s) | Summary |
|---|---|
| TTL Scan Integration Test `pkg/ttl/ttlworker/scan_integration_test.go` | Replaced 10,000 single-row `INSERT`s with batched multi-row inserts (1,000 rows total, 100 per batch via `strings.Join`); enabled `github.com/pingcap/tidb/pkg/store/copr/sleepCoprRequest` failpoint (`return(200)`); changed scan/cancel loop from time-based to fixed `rounds := 10`; removed `testStart`, `testDuration`, and `testflag` import. |
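To make the `return(200)` term more concrete, here is a small stdlib-only stand-in for a named delay hook. It is explicitly not the pingcap/failpoint API: the `delayHooks` type and its `Enable`, `Disable`, and `Delay` methods are hypothetical names invented for this sketch, and reading the integer as milliseconds is an assumption based on the PR description's "sleepCoprRequest=200ms".

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// delayHooks is a hypothetical, simplified stand-in for a failpoint registry.
// It maps a hook name to a delay in milliseconds.
type delayHooks struct {
	mu sync.RWMutex
	ms map[string]int
}

func newDelayHooks() *delayHooks {
	return &delayHooks{ms: make(map[string]int)}
}

// Enable accepts only terms of the form "return(N)" and stores N.
// Treating N as milliseconds is an assumption of this sketch.
func (h *delayHooks) Enable(name, term string) error {
	var n int
	if _, err := fmt.Sscanf(term, "return(%d)", &n); err != nil {
		return fmt.Errorf("unsupported term %q: %w", term, err)
	}
	h.mu.Lock()
	defer h.mu.Unlock()
	h.ms[name] = n
	return nil
}

// Disable removes the hook so later lookups see no delay.
func (h *delayHooks) Disable(name string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.ms, name)
}

// Delay returns the configured delay, or zero when the hook is disabled.
func (h *delayHooks) Delay(name string) time.Duration {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return time.Duration(h.ms[name]) * time.Millisecond
}

func main() {
	hooks := newDelayHooks()
	if err := hooks.Enable("sleepCoprRequest", "return(200)"); err != nil {
		fmt.Println("enable failed:", err)
		return
	}
	defer hooks.Disable("sleepCoprRequest")

	// A simulated request path consults the hook before doing its work.
	start := time.Now()
	time.Sleep(hooks.Delay("sleepCoprRequest"))
	fmt.Println("simulated request took at least", time.Since(start).Round(10*time.Millisecond))
}
```

The point of such a hook in a cancellation test is that every request stays in flight for a known minimum time, so a small table is enough to guarantee the scan is still running when `cancel()` is called.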

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

ttl: honor scan task cancellation across statement boundaries #67285: Related changes to TTL scan cancellation test and enabling sleepCoprRequest failpoint in scan_integration_test.go.

Suggested labels

ok-to-test, approved, lgtm

Suggested reviewers

wjhuang2016
YangKeao
bb7133

Poem

🐰 Batches hop in, neat and spry,
Ten rounds now dance beneath the sky,
A sleepy copr asks for a pause,
Tests run steady, no wild cause,
Hooray — the rabbit stamps its paws!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: stabilizing the runtime of the TestCancelWhileScan test through more efficient data insertion and deterministic looping. |
| Description check | ✅ Passed | The description includes all required sections: problem summary with issue reference, explanation of changes, completed checklist, and release note. It clearly explains the optimization strategy. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Command failed


zanmato1984 · 2026-04-09

Role: Reviewer-R1

LGTM (round 1).

Checks:

- Original test intent is preserved: TestCancelWhileScan still asserts cancellation completes within 1s after cancel().
- Change scope is test-only and minimal: only pkg/ttl/ttlworker/scan_integration_test.go is updated.
- Recurrence analysis is consistent with evidence from the reopened flaky issue/build logs: current recurrence is timeout/runtime pressure rather than statement-boundary correctness.
- No unnecessary timing-only precise repro is retained: this update simplifies the stress profile and keeps deterministic cancellation coverage without extra reproducer-only paths.
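The shape of the assertion described in these checks (a fixed number of rounds, each cancelling an in-flight scan and requiring it to return within a limit) can be sketched with the standard library alone. This is a hedged illustration, not the test from the PR: `simulatedScan`, `cancelLatency`, and `runRounds` are hypothetical names, and the scan is replaced by a loop that waits on a timer.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// simulatedScan stands in for a scan task: it issues "requests" that each
// take delay, and returns as soon as the context is cancelled.
func simulatedScan(ctx context.Context, delay time.Duration) error {
	timer := time.NewTimer(delay)
	defer timer.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-timer.C:
			timer.Reset(delay) // next simulated request
		}
	}
}

// cancelLatency starts one scan, cancels it while a request is in flight,
// and returns the time between cancel() and the scan returning.
func cancelLatency(delay time.Duration) time.Duration {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	done := make(chan error, 1)
	go func() { done <- simulatedScan(ctx, delay) }()

	time.Sleep(delay / 2) // let the scan get in flight
	start := time.Now()
	cancel()
	<-done
	return time.Since(start)
}

// runRounds repeats the cancel check a fixed number of times, so total
// runtime is bounded by the round count rather than by a wall-clock budget.
func runRounds(rounds, delayMs, limitMs int) error {
	delay := time.Duration(delayMs) * time.Millisecond
	limit := time.Duration(limitMs) * time.Millisecond
	for i := 0; i < rounds; i++ {
		if got := cancelLatency(delay); got >= limit {
			return fmt.Errorf("round %d: cancellation took %v, limit %v", i, got, limit)
		}
	}
	return nil
}

func main() {
	// 10 rounds, 200ms per simulated request, 1s cancellation limit.
	if err := runRounds(10, 200, 1000); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("all rounds cancelled within the limit")
}
```

Because each round waits only half a request delay before cancelling, the whole run takes roughly `rounds × delay / 2`, which is what makes the runtime predictable under CI load.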

Validation:

./tools/check/failpoint-go-test.sh pkg/ttl/ttlworker -run '^TestCancelWhileScan$' -count=20 (pass)

codecov · 2026-04-09T11:46:08Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.3567%. Comparing base (997e75c) to head (10aeb41).
⚠️ Report is 29 commits behind head on master.

Additional details and impacted files

@@               Coverage Diff                @@
##             master     #67657        +/-   ##
================================================
- Coverage   77.5871%   77.3567%   -0.2305%     
================================================
  Files          1981       1965        -16     
  Lines        547950     551723      +3773     
================================================
+ Hits         425139     426795      +1656     
- Misses       122001     124914      +2913     
+ Partials        810         14       -796

| Flag | Coverage Δ | |
|---|---|---|
| integration | `40.8968% <ø> (+6.5571%)` | ⬆️ |
| unit | `76.6452% <ø> (+0.3119%)` | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ | |
|---|---|---|
| dumpling | `61.5065% <ø> (+0.0901%)` | ⬆️ |
| parser | `∅ <ø> (∅)` | |
| br | `49.9148% <ø> (-10.5164%)` | ⬇️ |


YangKeao · 2026-04-14T02:22:02Z

-	testStart := time.Now()
-	testDuration := time.Second
+	rounds := 10
 	if testflag.Long() {


Long mode doesn't affect the CI, so I don't think this change will be helpful.

zanmato1984 · 2026-04-14

Good point. I removed the testflag.Long() branch so CI and local runs now execute the same loop count (rounds := 10), while keeping the original cancellation assertion and test intent unchanged.

Validation:

./tools/check/failpoint-go-test.sh pkg/ttl/ttlworker -run "^TestCancelWhileScan$" -count=1 (pass)

make lint (pass)

Included in commit 10aeb41daf.

ttl: stabilize TestCancelWhileScan runtime

fca6b36

ti-chi-bot added the `release-note-none` label (denotes a PR that doesn't merit a release note) on Apr 9, 2026.

ti-chi-bot added the `size/S` label (denotes a PR that changes 10-29 lines, ignoring generated files) on Apr 9, 2026.

zanmato1984 commented Apr 9, 2026

YangKeao reviewed Apr 14, 2026

ttl: remove long-mode branch from cancel scan test

10aeb41

zanmato1984 wants to merge 2 commits into pingcap:master from zanmato1984:issue-66982-flaky-timeout
