Fix CI Slack notifications never firing recovery alerts #65119
Merged
potiuk merged 1 commit into apache:main on Apr 13, 2026
The artifact lookup for the previous notification state was using 'gh api' with -f parameters, which makes the CLI default to POST instead of GET. The artifacts endpoint returns 404 on POST, so the script always treated each run as the first one with no prior state. That meant the 'all tests passing' recovery notification could never fire, since it requires the previous state to contain failures. Force --method GET on the artifact lookup and add unit tests covering both the GET requirement and the determine_action state machine.
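The lookup described above can be sketched as follows. This is a minimal illustration, not the code from `scripts/ci/slack_notification_state.py`: the helper names `build_artifact_lookup_cmd` and `find_artifact`, the `per_page` parameter, and the injectable `run` argument are assumptions made for the example. It relies only on documented `gh api` behaviour: `-f` adds a string parameter, the method defaults to POST once any parameter is present, and `--method GET` sends the parameters as a query string.

```python
import json
import subprocess
from typing import Callable, List, Optional


def build_artifact_lookup_cmd(repo: str, artifact_name: str) -> List[str]:
    """Build the `gh api` argv for listing artifacts by name (hypothetical helper)."""
    return [
        "gh", "api",
        f"repos/{repo}/actions/artifacts",
        # Without an explicit method, gh switches to POST as soon as -f/-F
        # is present, and the artifacts list endpoint answers POST with 404.
        "--method", "GET",
        "-f", f"name={artifact_name}",
        "-f", "per_page=1",  # assumption: only one match is needed
    ]


def find_artifact(
    repo: str,
    artifact_name: str,
    run: Callable = subprocess.run,  # injectable so tests need no network
) -> Optional[dict]:
    """Return the first matching artifact's metadata, or None if the lookup fails."""
    cmd = build_artifact_lookup_cmd(repo, artifact_name)
    result = run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        # A failed call (such as the 404 on POST) looks like "no prior state".
        return None
    artifacts = json.loads(result.stdout).get("artifacts", [])
    return artifacts[0] if artifacts else None
```

Keeping the argv construction in its own function lets a unit test assert on the presence of `--method GET` without invoking the CLI.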
github-actions bot pushed a commit that referenced this pull request on Apr 13, 2026
…65119) (cherry picked from commit 35e002a) Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Backport successfully created: v3-2-test
Note: As of Merging PRs targeted for Airflow 3.X. If in doubt, please ask in the #release-management Slack channel.
dandanseo123 pushed a commit to dandanseo123/airflow that referenced this pull request on Apr 13, 2026
eladkal pushed a commit that referenced this pull request on Apr 14, 2026
…65119) (#65164) (cherry picked from commit 35e002a) Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Summary
The CI "Notify Slack" job (in `ci-amd-arm.yml`, `ci-notification.yml`, etc.) never sent the "all tests passing" recovery message after a failure was fixed. The problem is visible in the output of the determination step in run 24310256628 / job 70982878551.

Root cause:
`scripts/ci/slack_notification_state.py` calls `gh api repos/.../actions/artifacts -f name=...`. The `gh` CLI defaults to POST when any `-f`/`-F` parameter is passed, and the artifacts list endpoint returns 404 on POST. As a result, `download_previous_state()` always returned `None`, every run looked like the first run with no prior state, and `determine_action([], None)` always returned `"skip"`, so `notify_recovery` could never trigger.

Fix
- Force `--method GET` explicitly so the `-f` parameters are encoded as a query string instead of a POST body.
- Add unit tests in `scripts/tests/ci/test_slack_notification_state.py` covering:
  - `download_previous_state` must call `gh api` with `--method GET`.
  - The `determine_action` state machine (skip / notify_new / notify_recovery / change-of-failures).

Verified the broken call locally returns 404, and the fixed call returns the expected artifact metadata.
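To make the state machine concrete, here is a rough sketch of how the four outcomes could be decided. It is not the implementation from `scripts/ci/slack_notification_state.py`: the function name `determine_action_sketch`, the `{"failures": [...]}` shape of the stored state, the `"notify_changed"` label for a change of failures, and the choice to skip when failures are unchanged are all assumptions made for illustration. Only the `"skip"`, `"notify_new"`, and `"notify_recovery"` outcomes and the `([], None) -> "skip"` case come from the description above.

```python
from typing import List, Optional


def determine_action_sketch(
    current_failures: List[str],
    previous_state: Optional[dict],
) -> str:
    """Decide which notification (if any) to send for this run.

    Assumes previous_state is None when no prior state was found, or a dict
    such as {"failures": ["job-a", "job-b"]} (hypothetical shape).
    """
    previous = set((previous_state or {}).get("failures", []))
    current = set(current_failures)

    if not current:
        # All green: only a recovery if the previous run had failures.
        return "notify_recovery" if previous else "skip"
    if not previous:
        # Failures appeared where none were recorded before.
        return "notify_new"
    if current != previous:
        # Hypothetical label for the "change-of-failures" case.
        return "notify_changed"
    # Assumption: the same failures as last time are not re-announced.
    return "skip"
```

With this shape the bug is easy to see: if the previous state is always `None`, the first branch can only ever return `"skip"` for a green run, so a recovery notification is unreachable.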
Test plan
- `uv run --project scripts pytest scripts/tests/ci/test_slack_notification_state.py -xvs`
- The next `ci-amd-arm.yml` run on `main` should now correctly send a recovery notification when a previously-failing run flips to all green.

Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.6) following the guidelines