Design: automatic CI-failure issue filing (workflow_run monitor)#79

Merged
toddysm merged 2 commits into main from docs/ci-failure-notifications on Jun 28, 2026

Conversation

@toddysm toddysm commented Jun 28, 2026

Owner

Summary

Adds the design document for Option 2 — a repository-wide workflow_run monitor that automatically files (and de-duplicates) a GitHub issue when a monitored workflow concludes in failure.

New doc: docs/architecture/workflows/ci-failure-notifications.md.

Highlights

  • Approach: a single report-ci-failure.yml (report / ci-failure) subscribes to monitored workflows via workflow_run and, on conclusion == failure, opens/updates a deduped CI failure: <workflow> issue.
  • New action: manage-failure-issue (kept separate from the promotion-specific manage-issue to avoid destabilising the override flow).
  • De-dup: one open issue per failing workflow, updated until it recovers.
  • Honest startup_failure caveat: workflow_run is not reliably emitted for startup failures; prevention (actionlint/CodeQL/review) remains the mitigation for that class.
  • Phased: Phase 1 = action + monitor (issue only); Phase 2 = auto-close on recovery + Slack ci-failure template.
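
The de-dup rule in the highlights above can be sketched roughly as follows. This is a minimal illustration in Python, not code from the design doc: the `Issue` type, `issue_title`, and `plan_action` are hypothetical names, and matching an open issue by exact title is an assumption about how "one open issue per failing workflow" might be enforced.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Title format taken from the summary: "CI failure: <workflow>".
TITLE_PREFIX = "CI failure: "


@dataclass
class Issue:
    """Hypothetical, simplified view of a GitHub issue."""
    number: int
    title: str
    state: str  # "open" or "closed"


def issue_title(workflow_name: str) -> str:
    """Build the de-dup key: one title per monitored workflow."""
    return f"{TITLE_PREFIX}{workflow_name}"


def plan_action(
    conclusion: str, workflow_name: str, issues: list[Issue]
) -> tuple[str, Optional[int]]:
    """Decide what the monitor should do for one completed run.

    Returns ("skip", None) when the run did not conclude in failure,
    ("update", n) when an open issue with the same title already exists,
    and ("create", None) otherwise.
    """
    # Only conclusion == "failure" triggers filing; any other conclusion
    # (including startup_failure) is skipped in this sketch.
    if conclusion != "failure":
        return ("skip", None)

    title = issue_title(workflow_name)
    for issue in issues:
        # Assumption: closed issues are ignored, so a new failure after
        # recovery opens a fresh issue.
        if issue.state == "open" and issue.title == title:
            return ("update", issue.number)
    return ("create", None)
```

For example, with an open `CI failure: build` issue numbered 7, `plan_action("failure", "build", issues)` yields `("update", 7)`, while a failing workflow with no open issue yields `("create", None)`.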

Process

Design only; no code yet. Implementation is tracked in a tracking issue and child tasks filed alongside this PR. Per the repo's process, this doc is up for review before implementation begins.

…tor)

Design doc for Option 2: a repo-wide report-ci-failure.yml workflow that subscribes to monitored workflows via workflow_run and opens/updates a deduped 'CI failure: <workflow>' issue (new manage-failure-issue action). Documents the startup_failure caveat, phased delivery, and alternatives. Proposed/under review; no code yet.

Copilot AI left a comment

Pull request overview

Adds a proposed architecture/design document describing a repository-wide GitHub Actions workflow_run monitor that files a de-duplicated GitHub issue when selected CI workflows conclude with failure.

Changes:

  • Introduces the overall approach (single workflow_run monitor workflow) and how it would identify failures.
  • Specifies proposed components (report-ci-failure.yml + new manage-failure-issue composite action) and issue de-duplication/lifecycle rules.
  • Documents limitations (startup_failure) and outlines phased delivery + security considerations.

Comment thread docs/architecture/workflows/ci-failure-notifications.md
Comment thread docs/architecture/workflows/ci-failure-notifications.md
Comment thread docs/architecture/workflows/ci-failure-notifications.md Outdated
@toddysm toddysm merged commit 192a79e into main Jun 28, 2026
3 checks passed
@toddysm toddysm deleted the docs/ci-failure-notifications branch June 28, 2026 03:22

2 participants