Fix/8787 cancel pipeline keeps running#8832
Merged
klesh merged 2 commits intoapache:mainfrom Apr 18, 2026
Merged
Conversation
…he#8787) Three independent bugs caused pipeline cancellation to silently fail: 1. CancelPipeline discarded errors from CancelTask and never cancelled TASK_CREATED tasks in future stages. Now running tasks are cancelled via context, non-running tasks are batch-updated to TASK_CANCELLED in the DB, and errors are logged and returned. 2. gitextractor's storeRepoSnapshot (go-git path) had no ctx.Done() checks in its commit/blame loops, making it unresponsive to cancellation for 30+ minutes on large repos. Added cancellation checkpoints following the pattern already used elsewhere in the file. 3. Cancelled tasks were marked TASK_FAILED instead of TASK_CANCELLED, and ComputePipelineStatus never returned TASK_CANCELLED. Now RunTask checks for context cancellation and writes TASK_CANCELLED, and ComputePipelineStatus returns TASK_CANCELLED when the pipeline was cancelled by the user. Test gaps: RunTask deferred status logic, CancelPipeline flow, and storeRepoSnapshot have no existing unit tests. These are pre-existing gaps not introduced by this change. The only existing test, TestComputePipelineStatus, has been extended to cover isCancelled=true. Closes apache#8787 Related: apache#5585, apache#4188
Extract cancelRunningTasks and cancelPendingTasksInDB helpers from CancelPipeline for better separation of concerns. Also fixes: - .As(NotFound) replaced with .GetType() to prevent swallowing wrapped errors - Added TASK_RESUME to cancelPendingTasksInDB status filter - Error message now includes pipeline ID for traceability - Added TestCancelPipeline e2e tests with 3 subtests
klesh
approved these changes
Apr 18, 2026
Contributor
klesh
left a comment
There was a problem hiding this comment.
LGTM
Thanks for your contribution.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
pr-type/bug-fix,pr-type/feature-development, etc.Summary
Clicking the cancel button in the UI (which fires DELETE /api/pipelines/:id) always returns "Operation successfully completed" (HTTP 200), but the pipeline continues running. Observable for 30+ minutes after the cancel request before it stops on its own.
Three independent bugs were causing this:
CancelPipelinesilently discarded errors fromCancelTask(pipeline.go). The cancel-func was never invoked for running tasks, and pending tasks were not batch-updated in the DB. Fixed by classifying tasks into running vs pending, then delegating tocancelRunningTasks(triggers context cancellation) andcancelPendingTasksInDB(batch DB update with status guard).gitextractorignored context cancellation instoreRepoSnapshot(repo_gogit.go).gogit.Blame()has no context parameter, so once a blame started it ran to completion. Addedselect { case <-ctx.Done() }checks at the top of the outer commit loop and innerpatch.Stats()loop so cancellation is respected between iterations.Cancelled tasks were recorded as
TASK_FAILED(run_task.go,pipeline_runner.go). The deferred status-update block didn't distinguish cancellation from failure. Now detects cancellation viaerrors.Is(err, context.Canceled) || ctx.Err() == context.Canceled, sets status toTASK_CANCELLED, and skips writingfailed_sub_task.ComputePipelineStatusalso now returnsTASK_CANCELLEDwhen the pipeline was cancelled, taking priority over other statuses.Additional correctness improvements:
cancelPendingTasksInDBincludesTASK_RESUMEin the status filter so tasks awaiting restart recovery are also cancelledcancelRunningTasksuseserr.GetType() == errors.NotFoundinstead of.As(errors.NotFound)to avoid walking the entire error chainSkipOnFailguard inrun_task.goreuses theisCancelledvariable instead of recomputingDoes this close any open issues?
Closes #8787
Screenshots
N/A
Other Information
Additional improvements:
cancelPendingTasksInDBincludesTASK_RESUMEin the status filter so tasks awaiting restart recovery are also cancelledcancelRunningTasksuseserr.GetType() == errors.NotFoundinstead of.As(errors.NotFound)to avoid walking the entire error chainSkipOnFailguard inrun_task.goreuses theisCancelledvariable instead of recomputingTests:
TestComputePipelineStatusverifiesTASK_CANCELLEDis returned whenisCancelled=true, regardless of individual task statusesTestCancelPipelinewith 3 subtests: cancels all pending tasks, leaves completed tasks unchanged, returns error for non-existent pipeline