
tooling: add deterministic CI state classifier command #4578

Open
davidahmann wants to merge 3 commits into crewAIInc:main from davidahmann:codex/issue-4576-ci-state-classifier

@davidahmann

@davidahmann davidahmann commented Feb 24, 2026

Problem

PR check-state interpretation is currently manual and can misclassify sparse/no-check states.

Why Now

Cross-repo coordination requires deterministic CI state categories.

What Changed

  • Added scripts/ci_state_classifier.py.
  • Added deterministic categories: passed, failed, pending, no_checks, policy_blocked.
  • Added fixture/live modes and stable JSON output.
  • Added built-in self-test.
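
One way to read "stable JSON output" is that identical input always serializes to identical bytes. A minimal sketch of that idea, assuming a hypothetical `render_result` helper and illustrative field names (`state`, `counts`) that this PR does not confirm:

```python
import json
from typing import Any


def render_result(result: dict[str, Any], pretty: bool = False) -> str:
    """Serialize a classification result deterministically.

    Hypothetical helper: the real script's function and field names may differ.
    """
    if pretty:
        # Sorted keys keep the pretty form stable across runs as well.
        return json.dumps(result, sort_keys=True, indent=2)
    # Compact form: sorted keys and fixed separators keep the bytes stable.
    return json.dumps(result, sort_keys=True, separators=(",", ":"))


# Key insertion order does not affect the output.
print(render_result({"state": "passed", "counts": {"b": 1, "a": 2}}))
```

Sorting keys matters mostly for diffing fixture output and for downstream tools that hash or compare the result.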

Validation

  • python3 scripts/ci_state_classifier.py --self-test

Refs #4576


Note

Low Risk
Adds a standalone tooling script with no runtime impact on production code; main risk is misclassification due to heuristic policy-name matching and status normalization.

Overview
Adds a new scripts/ci_state_classifier.py CLI that fetches (via gh pr view) or loads a JSON statusCheckRollup payload and deterministically classifies PR check state as passed, failed, pending, no_checks, or policy_blocked.

The classifier normalizes check fields, treats known failure/pending statuses specially, and separates policy-style checks (e.g., CLA/DCO/Code Owners) so they can block without marking the PR as failed; output is stable JSON with optional pretty-printing plus a built-in --self-test.
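
A minimal sketch of how a classifier along these lines could be structured. The two constant sets are the ones quoted in the review threads on this PR; `POLICY_HINTS`, `normalize`, `is_policy`, and the precedence order (failed, then policy_blocked, then pending) are assumptions for illustration, not the script's confirmed logic:

```python
from typing import Any

FAIL_VALUES = {"FAILURE", "ERROR", "TIMED_OUT", "CANCELLED", "ACTION_REQUIRED"}
PENDING_VALUES = {"PENDING", "QUEUED", "IN_PROGRESS", "REQUESTED", "WAITING"}
# Assumed substring hints for policy-style checks; the PR's matching may differ.
# Substring matching is heuristic: "cla" would also match a check named "classifier".
POLICY_HINTS = ("cla", "dco", "code owners", "codeowners")


def normalize(raw: dict[str, Any]) -> dict[str, str]:
    """Reduce one rollup entry to name/status/conclusion strings."""
    # Check runs carry name/status/conclusion; commit statuses carry
    # context/state, so state is used as the fallback for both fields.
    state = str(raw.get("state") or "")
    return {
        "name": str(raw.get("name") or raw.get("context") or ""),
        "status": str(raw.get("status") or state).upper(),
        "conclusion": str(raw.get("conclusion") or state).upper(),
    }


def is_policy(check: dict[str, str]) -> bool:
    name = check["name"].lower()
    return any(hint in name for hint in POLICY_HINTS)


def is_failed(check: dict[str, str]) -> bool:
    return check["conclusion"] in FAIL_VALUES


def is_pending(check: dict[str, str]) -> bool:
    return check["status"] in PENDING_VALUES or check["conclusion"] in PENDING_VALUES


def classify(rollup: list[dict[str, Any]]) -> str:
    """Return one of: passed, failed, pending, no_checks, policy_blocked."""
    checks = [normalize(raw) for raw in rollup]
    if not checks:
        return "no_checks"
    regular = [c for c in checks if not is_policy(c)]
    policy = [c for c in checks if is_policy(c)]
    if any(is_failed(c) for c in regular):
        return "failed"
    # Policy checks block the PR without marking it as failed.
    if any(is_failed(c) or is_pending(c) for c in policy):
        return "policy_blocked"
    if any(is_pending(c) for c in regular):
        return "pending"
    return "passed"
```

Under these assumptions, an empty rollup yields `no_checks` rather than `passed`, which is the sparse/no-check case the PR description calls out.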

Written by Cursor Bugbot for commit a3bb111. This will update automatically on new commits.

@davidahmann
Author

Implemented issue #4576 with a deterministic CI classifier command at , including policy-block detection, fixture/live modes, and self-test. Validation: self-test passed.

This contribution was informed by patterns from Wrkr. Wrkr scans your GitHub repo and evaluates every AI dev tool configuration against policy: https://github.com/Clyra-AI/wrkr

@davidahmann
Author

Implementation summary: added scripts/ci_state_classifier.py with deterministic state categories, fixture/live input modes, policy-block detection, and built-in self-test checks. Validation: classifier self-test passed.

This contribution was informed by patterns from Wrkr. Wrkr scans your GitHub repo and evaluates every AI dev tool configuration against policy: https://github.com/Clyra-AI/wrkr


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b889e91276


Comment thread scripts/ci_state_classifier.py Outdated
from pathlib import Path
from typing import Any

FAIL_VALUES = {"FAILURE", "ERROR", "TIMED_OUT", "CANCELLED", "ACTION_REQUIRED"}

P1: Treat STARTUP_FAILURE/STALE conclusions as failed

GitHub check runs can finish with conclusions like STARTUP_FAILURE or STALE, but FAIL_VALUES omits both, so classify() falls through to passed when these are the only non-success outcomes. In repos where those conclusions occur (e.g., failed bootstraps or stale runs), this produces a false-green CI classification and can incorrectly signal that the PR is safe to merge.
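
The false-green case can be reproduced in isolation. A minimal sketch, where `FAIL_VALUES_FIXED` and `has_failure` are hypothetical names for illustration and the extended set is only the change suggested here, not necessarily what the script ends up using:

```python
FAIL_VALUES = {"FAILURE", "ERROR", "TIMED_OUT", "CANCELLED", "ACTION_REQUIRED"}
# Suggested extension from this review (assumed name, not from the script).
FAIL_VALUES_FIXED = FAIL_VALUES | {"STARTUP_FAILURE", "STALE"}


def has_failure(conclusions: list[str], fail_values: set[str]) -> bool:
    """Return True if any conclusion (case-insensitive) is a failure value."""
    return any(c.upper() in fail_values for c in conclusions)


conclusions = ["SUCCESS", "STARTUP_FAILURE"]
print(has_failure(conclusions, FAIL_VALUES))        # False: the failure is missed
print(has_failure(conclusions, FAIL_VALUES_FIXED))  # True
```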


Comment thread scripts/ci_state_classifier.py Outdated
from typing import Any

FAIL_VALUES = {"FAILURE", "ERROR", "TIMED_OUT", "CANCELLED", "ACTION_REQUIRED"}
PENDING_VALUES = {"PENDING", "QUEUED", "IN_PROGRESS", "REQUESTED", "WAITING"}

P1: Classify EXPECTED status as pending

_to_checks() maps commit status state into both conclusion and status, but PENDING_VALUES does not include EXPECTED. For status contexts that are required but not yet reported (state EXPECTED), classify() currently returns passed instead of a non-terminal state, which misrepresents PR readiness while required checks are still outstanding.
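
A small sketch of the gap, assuming the mapping behaves as described above (state copied into both fields). `status_context_to_check`, `is_pending`, and `PENDING_VALUES_FIXED` are hypothetical names for illustration, not the script's own helpers:

```python
from typing import Any

PENDING_VALUES = {"PENDING", "QUEUED", "IN_PROGRESS", "REQUESTED", "WAITING"}
# Suggested extension from this review (assumed name, not from the script).
PENDING_VALUES_FIXED = PENDING_VALUES | {"EXPECTED"}


def status_context_to_check(context: dict[str, Any]) -> dict[str, str]:
    """Copy a commit status's state into both status and conclusion."""
    state = str(context.get("state") or "").upper()
    return {
        "name": str(context.get("context") or ""),
        "status": state,
        "conclusion": state,
    }


def is_pending(check: dict[str, str], pending_values: set[str]) -> bool:
    return check["status"] in pending_values or check["conclusion"] in pending_values


# A required status context that has not reported yet.
check = status_context_to_check({"context": "ci/required", "state": "EXPECTED"})
print(is_pending(check, PENDING_VALUES))        # False: treated as terminal
print(is_pending(check, PENDING_VALUES_FIXED))  # True
```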



@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


Comment thread scripts/ci_state_classifier.py Outdated
@davidahmann davidahmann force-pushed the codex/issue-4576-ci-state-classifier branch from 98792c5 to a3bb111 Compare February 25, 2026 19:21
@github-actions
Contributor

This PR is stale because it has been open for 45 days with no activity.
