ttl: honor scan task cancellation across statement boundaries#67285
ttl: honor scan task cancellation across statement boundaries#67285ti-chi-bot[bot] merged 6 commits intopingcap:masterfrom
Conversation
|
Review failed due to infrastructure/execution failure after retries. Please re-trigger review. ℹ️ Learn more details on Pantheon AI. |
|
Hi @zanmato1984. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThreads a cancellable per-scan context into TTL scan execution and sessions, injects a failpoint immediately before SQL-killer reset at statement cleanup, and enforces an explicit cancellation check after statement-context reset at statement boundaries. Changes
Sequence DiagramsequenceDiagram
participant KG as Kill Goroutine
participant Task as TTL Scan Task
participant Sess as Scan Session / Executor
participant Killer as SQL Killer
Task->>Task: scanCtx = context.WithCancel(ctx)
Task->>Sess: NewScanSession(scanCtx, ...)
Task->>Sess: Execute SQL using scanCtx
alt normal execution
Sess->>Sess: run statement
Note over Sess: failpoint beforeResetSQLKillerForTTLScan
Sess->>Killer: vars.SQLKiller.Reset()
Sess->>Sess: ResetContextOfStmt returns
Sess->>Sess: check ctx.Err() -> continue if nil
else cancellation path
KG->>KG: receive kill trigger
KG->>Task: cancelScanCtx()
KG->>Killer: rawSess.KillStmt()
Task->>Sess: scanCtx canceled -> operations return scanCtx.Err()
end
Sess-->>Task: propagate cancellation error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/ok-to-test |
703660b to
6ed9f08
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #67285 +/- ##
================================================
- Coverage 77.7173% 77.6777% -0.0396%
================================================
Files 1959 1945 -14
Lines 543377 545796 +2419
================================================
+ Hits 422298 423962 +1664
- Misses 120238 121832 +1594
+ Partials 841 2 -839
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
|
/retest |
|
@pantheon-ai please review |
|
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
|
/retest |
1 similar comment
|
/retest |
|
/check-issue-triage-complete |
|
/retest |
|
/hold |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/dxf/framework/storage/task_state_test.go`:
- Around line 151-153: The test currently calls sqlexec.ExecSQL(ctx,
se.GetSQLExecutor(), "select sleep(10)"), asserts require.NoError(t, err) and
then returns ctx.Err(); instead capture and return the ExecSQL error directly so
the test observes statement-level cancellation: replace the require.NoError
check with returning the err from the ExecSQL call (i.e., keep the call to
sqlexec.ExecSQL using ctx and se.GetSQLExecutor(), assign its error and return
that error) so cancellation surfaced by ExecSQL is asserted instead of
fabricating context.Canceled.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e60c7db3-0ab7-4a56-b235-434f4553a6d6
📒 Files selected for processing (2)
pkg/dxf/framework/storage/task_state_test.gopkg/dxf/framework/storage/task_table.go
YangKeao
left a comment
There was a problem hiding this comment.
LGTM
Though I doubt whether it'll be very helpful, because the original implementation will leak at most one statement, but after all this PR makes things better 👍 .,
| _, commitErr := sqlexec.ExecSQL(ctx, se.GetSQLExecutor(), sql) | ||
| if err == nil && commitErr != nil { | ||
| err = commitErr | ||
| commitErr := se.CommitTxn(ctx) |
There was a problem hiding this comment.
can you put this inside the comment on why we use begin explicitly, but use named method for commit/rollback
|
/retest |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bb7133, D3Hunter, YangKeao The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
7 similar comments
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/hold |
|
/retest |
|
/unhold |
|
/retest |
What problem does this PR solve?
Issue Number: ref #66982
Problem Summary
TestCancelWhileScanis flaky because TTL scan cancellation can fall into a statement-boundary gap.The scan task currently relies on
KillStmtto interrupt the running internal SQL. If cancellation happens between statements, the next internal statement resets the statement context before execution, which clears the statement-bound kill state. As a result, the scanSELECTcan still start and continue running even though the TTL task has already been canceled.What is changed and how it works?
This PR fixes the issue in two parts:
KillStmt.ResetContextOfStmt, immediately honor a canceled caller context before executing the next statement.Together, these changes close the statement-boundary cancellation gap and make TTL scan cancellation respond to task cancellation directly.
This PR also adds targeted regression coverage for the statement-boundary cancellation case.
Check List
Tests
Side effects
This changes internal SQL behavior only when the caller context is already canceled. That behavior should be more correct and should have negligible performance impact beyond an additional
ctx.Err()check and context propagation in TTL scan.Release note
Summary by CodeRabbit
Bug Fixes
Tests
Chores