Fix CloudRunExecuteJobOperator ignoring task failures in deferrable mode #67767
Closed
shahar1 wants to merge 1 commit into
32d0167 to
13d1be3
Compare
When a Cloud Run Job's trigger receives a SUCCESS event it may still carry task-count fields indicating that not all tasks finished (e.g. a cancelled execution that resolved without an operation error). Add a defensive check in execute_complete that raises RuntimeError when succeeded_count + failed_count != task_count or when failed_count > 0, mirroring the validation already done in the non-deferrable path via _fail_if_execution_failed. Add unit tests covering both the cancelled (incomplete tasks) and failed-tasks scenarios. closes: apache#57791
13d1be3 to
635559d
Compare
Contributor
Author
Already fixed in main.
This PR continues and replaces #62278 by @Ajay9704.
Summary
When CloudRunExecuteJobOperator runs with deferrable=True and the underlying Cloud Run Job is cancelled (e.g. via the Google Cloud UI or API), the LRO completes without setting operation.error: cancelled tasks appear in cancelled_count rather than failed_count. The original deferrable path therefore silently returned success instead of failing, diverging from the non-deferrable behavior.
The trigger (CloudRunJobFinishedTrigger) was already fixed in main to detect this by deserialising the Execution proto and emitting a FAIL event. This PR adds a matching defensive check in execute_complete for the case where a SUCCESS event still carries task-count fields indicating an incomplete or partially-failed run, mirroring the validation already done on the non-deferrable path via _fail_if_execution_failed.
Changes
execute_complete: raises RuntimeError when succeeded_count + failed_count != task_count or when failed_count > 0.
closes: #57791
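For reviewers, the task-count check described under Changes can be sketched roughly as below. This is a standalone illustration, not the code from this PR: the helper name check_task_counts is hypothetical, and the event layout (a plain dict with task_count, succeeded_count and failed_count keys) is an assumption about what the trigger emits. It only shows the two conditions named above.

```python
from __future__ import annotations

from typing import Any


def check_task_counts(event: dict[str, Any]) -> None:
    """Raise RuntimeError if a SUCCESS event's task counts show a bad run.

    Hypothetical helper: the name and the event keys are assumptions made
    for illustration, not the provider's actual API.
    """
    task_count = event.get("task_count")
    if task_count is None:
        # No task-count fields on the event, so there is nothing to validate.
        return

    succeeded = event.get("succeeded_count") or 0
    failed = event.get("failed_count") or 0

    # Condition 1: at least one task reported failure.
    if failed > 0:
        raise RuntimeError(f"{failed} of {task_count} tasks failed")

    # Condition 2: some tasks neither succeeded nor failed, e.g. because the
    # execution was cancelled and they were counted elsewhere.
    if succeeded + failed != task_count:
        raise RuntimeError(
            f"only {succeeded + failed} of {task_count} tasks finished"
        )


# A fully successful run passes silently.
check_task_counts({"task_count": 3, "succeeded_count": 3, "failed_count": 0})

# A cancelled run (one task never finished) is rejected.
try:
    check_task_counts({"task_count": 3, "succeeded_count": 2, "failed_count": 0})
except RuntimeError as exc:
    print(f"rejected: {exc}")
```

Checking failed_count first keeps the error message specific when both conditions hold; an event without task-count fields is treated as nothing to validate, which is a design choice of this sketch rather than something the PR states.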
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (claude-sonnet-4-6) following the guidelines