Skip to content

Fixed slack-notify job not running on workflow failure#1324

Merged
mike182uk merged 1 commit intomainfrom
misc-fix-cicd-slack-notify
Sep 23, 2025
Merged

Fixed slack-notify job not running on workflow failure#1324
mike182uk merged 1 commit intomainfrom
misc-fix-cicd-slack-notify

Conversation

@mike182uk
Copy link
Member

@mike182uk mike182uk commented Sep 23, 2025

no refs

Slack notifications were not working when the pipeline failed due to the slack-notify job relying on all prior jobs to be successful:

If a job fails or is skipped, all jobs that need it are skipped unless the jobs use a conditional expression that causes the job to continue

If you would like a job to run even if a job it is dependent on did not succeed, use the always() conditional expression

(https://docs.github.com/en/actions/using-jobs/using-jobs-in-a-workflow)

This change ensures that the job is always run, and the existing failure() check ensures that the jobs steps are only run when the pipeline fails

no refs

Slack notifications were not working when the pipeline failed due to the
`slack-notify` job relying on all prior jobs to be successful. This change
ensures that the job is always run, and the existing `failure()` check ensures
that the job is only run when the pipeline fails
@coderabbitai
Copy link

coderabbitai bot commented Sep 23, 2025

Walkthrough

The GitHub Actions workflow at .github/workflows/cicd.yml was updated. The “Slack Notification on Failure” job is configured to always run by adding if: always() at the job level. Inside that job, the Slack action step’s condition was simplified by removing always(), so it now triggers only when failure() is true and the event is a push to main. No other files or steps are mentioned.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title clearly and concisely summarizes the primary change: ensuring the slack-notify job runs when the workflow fails. It names the affected job and the bug fixed, so a teammate scanning history will understand the main intent.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Description Check ✅ Passed The PR description explicitly describes the slack-notify job failing to run because it depended on prior jobs and states the applied fix of using always() on the job while keeping failure() for the step, which matches the changes in .github/workflows/cicd.yml; the description is therefore directly related to the changeset.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch misc-fix-cicd-slack-notify

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mike182uk mike182uk changed the title Fixed slack-notify jon not running on workflow failure Fixed slack-notify job not running on workflow failure Sep 23, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6a653f5 and c1084f0.

📒 Files selected for processing (1)
  • .github/workflows/cicd.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Cursor Bugbot
  • GitHub Check: build / Build Docker Images (linux/amd64, amd64)
🔇 Additional comments (1)
.github/workflows/cicd.yml (1)

62-62: Good: job now always evaluates, decoupled from upstream success

Adding if: always() at the job level ensures the job won’t be skipped when dependencies fail/are skipped. Correct approach.

Please confirm this job is intended to wait for all listed needs to settle (fail/skip/succeed) before evaluating. With needs: [build, test, push, deploy] it will wait for those jobs’ conclusions, which seems desired for end-of-pipeline notifications.

steps:
- uses: tryghost/actions/actions/slack-build@main
if: always() && failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
if: failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
Copy link

@coderabbitai coderabbitai bot Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Bug: failure() here won’t detect upstream job failures; check needs.*.result instead

failure() only reflects failures within the current job’s prior steps, not other jobs. As written, this step won’t run on pipeline failure. Gate on needs results explicitly.

Apply this diff:

-        if: failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
+        if: ${{ (needs.build.result == 'failure' || needs.test.result == 'failure' || needs.push.result == 'failure' || needs.deploy.result == 'failure') && github.event_name == 'push' && github.ref == 'refs/heads/main' }}

Optionally include cancellations too:

+        # If you also want notifications on cancellations:
+        # if: ${{ (contains(fromJSON('["failure","cancelled"]'), needs.build.result) || contains(fromJSON('["failure","cancelled"]'), needs.test.result) || contains(fromJSON('["failure","cancelled"]'), needs.push.result) || contains(fromJSON('["failure","cancelled"]'), needs.deploy.result)) && github.event_name == 'push' && github.ref == 'refs/heads/main' }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if: failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
if: ${{ (needs.build.result == 'failure' || needs.test.result == 'failure' || needs.push.result == 'failure' || needs.deploy.result == 'failure') && github.event_name == 'push' && github.ref == 'refs/heads/main' }}
# If you also want notifications on cancellations:
# if: ${{ (contains(fromJSON('["failure","cancelled"]'), needs.build.result) || contains(fromJSON('["failure","cancelled"]'), needs.test.result) || contains(fromJSON('["failure","cancelled"]'), needs.push.result) || contains(fromJSON('["failure","cancelled"]'), needs.deploy.result)) && github.event_name == 'push' && github.ref == 'refs/heads/main' }}
🤖 Prompt for AI Agents
.github/workflows/cicd.yml around line 65: the condition uses failure() which
only reflects failures in the current job’s steps, so it won’t fire for upstream
job failures; replace the if with explicit checks of needs.*.result for the jobs
this step depends on (e.g., if any needs.<job>.result == 'failure' or
'cancelled' as desired), or compute a composite expression that checks all
required needs jobs for 'failure' (and optionally 'cancelled') so the step runs
when upstream jobs fail or are cancelled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, are you sure this is correct? The docs state different → https://docs.github.com/en/actions/using-jobs/using-jobs-in-a-workflow

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@mike182uk Thank you for pointing this out and providing the documentation link. You are absolutely correct, and I apologize for the confusion in my previous comment.Let me search more specifically for the failure() function behavior with dependent jobs:Based on the search results, I can see that the official GitHub documentation does state that failure() "returns true if any ancestor job fails" when used in a chain of dependent jobs. However, there are community reports indicating that this doesn't always work as documented, and the recommendation is to "use needs.<job_id>.result or needs.<job_id>.outcome to determine ancestor job statuses".

You are absolutely correct, and I apologize for the error in my initial analysis. According to the GitHub documentation, failure() should indeed return true when any ancestor job fails in a chain of dependent jobs. Your current implementation should work correctly.

The change you've made is actually the right approach:

  • if: always() at the job level ensures the job runs regardless of predecessor job status
  • if: failure() && github.event_name == 'push' && github.ref == 'refs/heads/main' at the step level should properly detect when any of the dependent jobs failed

Thank you for the correction and for providing the documentation link. My suggestion to use explicit needs.*.result checks was unnecessary for this use case, as the failure() function should work correctly when the job has proper dependencies defined via needs.


🌐 Web query:

GitHub Actions failure() function behavior with dependent jobs needs

💡 Result:

Short answer

  • failure() is primarily for steps and returns true if any previous step in the same job failed. GitHub Docs also state it "returns true if any ancestor job fails", but in practice job scheduling and the default status-check logic make its behavior confusing for dependent jobs that were skipped/failed. (docs.github.com)

Key points and reliable patterns

  • By default a job that has needs: will be skipped when any required job fails or is skipped. To override that you must include a status-check function (failure(), success(), cancelled(), or always()) in the job's if condition — otherwise the default success() logic prevents the job from running. (github.com)
  • In many practical cases failure() at the job level does not behave as you expect (reports false or the job is skipped). To reliably detect an upstream job status use needs.<job_id>.result (e.g. needs.job1.result == 'failure') or combine checks with always() to prevent automatic skipping. Examples:
    • Run when any dependency failed: if: ${{ always() && (needs.job1.result == 'failure' || needs.job2.result == 'failure') }}. (docs.github.com)
    • Run when not cancelled (always except workflow cancel): if: ${{ !cancelled() }}. (docs.github.com)
  • Community/issue reports: several GitHub issues and discussions document that failure() can be unreliable for ancestor jobs and that using needs..result or always() is the practical workaround. (github.com)

If you want, paste a minimal workflow sample and I’ll rewrite it to use needs..result or the correct if: expression so dependent jobs behave as you expect.

Citations:


@mike182uk You're absolutely right to question this - the GitHub documentation does indeed state that failure() "returns true if any ancestor job fails." However, there's an important discrepancy between the documented behavior and real-world practice.

Based on the search results, while the GitHub documentation states this behavior, community reports indicate that failure() can be unreliable for ancestor jobs, and the recommended workaround is to use explicit needs.<job_id>.result checks.

The key issues are:

  1. By default, a job with needs: will be skipped when any required job fails, unless you include a status-check function in the job's if condition
  2. In practice, failure() at the job level doesn't always behave as expected and can report false or cause the job to be skipped

Your current approach with if: always() at the job level should help ensure the job runs, but the step-level if: failure() condition might still not reliably detect upstream job failures.

For the most reliable behavior, the recommended pattern is:

slack-notify:
  name: Slack Notification on Failure  
  needs: [build, test, push, deploy]
  runs-on: ubuntu-latest
  if: always()
  steps:
    - uses: tryghost/actions/actions/slack-build@main
      if: ${{ (needs.build.result == 'failure' || needs.test.result == 'failure' || needs.push.result == 'failure' || needs.deploy.result == 'failure') && github.event_name == 'push' && github.ref == 'refs/heads/main' }}
      with:
        status: ${{ job.status }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

This explicitly checks each dependency's result, which is more reliable than relying on failure() to detect ancestor job failures.


✏ Learnings added
Learnt from: mike182uk
PR: TryGhost/ActivityPub#1324
File: .github/workflows/cicd.yml:65-65
Timestamp: 2025-09-23T14:54:49.714Z
Learning: The `failure()` function in GitHub Actions can detect failures from dependent jobs when used in a job that has dependencies defined via the `needs` keyword, according to GitHub's official documentation, despite some community reports suggesting inconsistent behavior.

Learnt from: mike182uk
PR: TryGhost/ActivityPub#1324
File: .github/workflows/cicd.yml:65-65
Timestamp: 2025-09-23T14:54:49.714Z
Learning: While GitHub Actions documentation states that `failure()` returns true if any ancestor job fails, in practice this behavior is unreliable. The recommended approach is to use explicit `needs.<job_id>.result` checks combined with `if: always()` to reliably detect upstream job failures in dependent jobs.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

if: always()
steps:
- uses: tryghost/actions/actions/slack-build@main
if: always() && failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mike182uk mike182uk merged commit 5da34b7 into main Sep 23, 2025
13 checks passed
@mike182uk mike182uk deleted the misc-fix-cicd-slack-notify branch September 23, 2025 15:55
mike182uk added a commit that referenced this pull request Oct 15, 2025
ref #1324

Apparently the previous fix did not work, but it's not totally clear why. There
seems to be an issue using `always()` and `failure()` together.

According to the official GitHub docs, `failure()` should return true
`when any ancestor job fails,` but in practice (as documented in
https://github.com/orgs/community/discussions/80788),
it always returns false even when dependent jobs fail.

Have changed condition to explicitly check on the results of the previous jobs
mike182uk added a commit that referenced this pull request Oct 15, 2025
ref #1324

Apparently the previous fix did not work, but it's not totally clear why. There
seems to be an issue using `always()` and `failure()` together.

According to the official GitHub docs, `failure()` should return true
`when any ancestor job fails,` but in practice (as documented in
https://github.com/orgs/community/discussions/80788),
it always returns false even when dependent jobs fail.

Have changed condition to explicitly check on the results of the previous jobs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants