Merged
25 changes: 25 additions & 0 deletions .github/codex/prompts/ci-failure-triage.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
# EvalOps Codex CI Failure Triage

Investigate the failing GitHub Actions run for this repository and produce a
minimal fix plan or patch.

Required checks:

- Start from the exact failing run, job, and step. Do not infer from workflow
names alone.
- Fetch failed logs with `gh run view --log-failed` and fall back to the
Actions jobs API when the log output is empty.
- Distinguish stale failures on superseded SHAs from failures on the live PR or
`main` tip.
- Group related failures by root cause and avoid unrelated refactors.
- If the failure is a workflow issue, inspect path filters, generated workflow
surfaces, branch protection expectations, and pinned action policy.
- If the failure is test or code behavior, run the smallest local reproduction
before proposing broader gates.

Output:

- Root cause with run/job evidence.
- Minimal fix or the exact reason no code change is appropriate.
- Commands run locally.
- Remaining CI or review-thread work.
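
The first three required checks could be scripted roughly as follows. This is a minimal sketch, not part of the prompt: the helper names `classify_run` and `fetch_failed_logs` and the `REPO`/`RUN_ID` variables are hypothetical, and it assumes an authenticated `gh` CLI.

```bash
# Sketch only: helper names and the REPO/RUN_ID variables are illustrative.

# classify_run RUN_SHA TIP_SHA
# Prints "live" when the failed run targets the current tip, else "stale".
classify_run() {
  if [ "$1" = "$2" ]; then
    echo "live"
  else
    echo "stale"
  fi
}

# fetch_failed_logs RUN_ID REPO
# Prints failed-step logs; when that output is empty, falls back to listing
# failed jobs and their failed steps from the Actions jobs API.
fetch_failed_logs() {
  run_id="$1"
  repo="$2"
  logs="$(gh run view "$run_id" --repo "$repo" --log-failed 2>/dev/null || true)"
  if [ -n "$logs" ]; then
    printf '%s\n' "$logs"
  else
    gh api "repos/${repo}/actions/runs/${run_id}/jobs" --paginate \
      --jq '.jobs[] | select(.conclusion == "failure") | {job: .name, url: .html_url, failed_steps: [.steps[] | select(.conclusion == "failure") | .name]}'
  fi
}

# Example wiring (requires an authenticated gh):
#   run_sha="$(gh run view "$RUN_ID" --repo "$REPO" --json headSha --jq .headSha)"
#   tip_sha="$(gh api "repos/${REPO}/commits/main" --jq .sha)"
#   classify_run "$run_sha" "$tip_sha"
#   fetch_failed_logs "$RUN_ID" "$REPO"
```

Comparing against the `main` tip is only one choice; for a PR, the tip would instead be the PR head SHA.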
20 changes: 20 additions & 0 deletions .github/codex/prompts/label-churn-audit.md
@@ -0,0 +1,20 @@
# EvalOps Codex Label Churn Audit

Audit PR labels that are being added and removed repeatedly by automation.

Required checks:

- Inspect the PR timeline, issue events, workflow runs, bot comments, and
repository workflows that can mutate labels.
- Group label changes by actor, label, timestamp, and likely workflow source.
- Distinguish intended mutually exclusive labels from automation loops.
- Check whether human-committed code is expected to be agent-authored in this
repo before treating agent labels as suspicious.
- Identify the smallest durable fix: workflow condition, label ownership rule,
branch filter, debounce, or documentation update.

Output:

- A concise timeline of label mutations.
- The likely source workflow or automation.
- The durable fix and how to verify it.
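
Grouping label changes by actor and label could look roughly like this. This is a sketch under stated assumptions: `label_events` and `churn_candidates` are hypothetical helper names, the threshold is arbitrary, and `gh` is assumed to be authenticated. The `labeled` and `unlabeled` event types come from the GitHub issue events API.

```bash
# Sketch only: helper names and the threshold are illustrative.

# label_events REPO PR_NUMBER
# Prints one TSV row per label mutation: created_at, actor, event, label.
label_events() {
  gh api "repos/$1/issues/$2/events" --paginate \
    --jq '.[] | select(.event == "labeled" or .event == "unlabeled") | [.created_at, .actor.login, .event, .label.name] | @tsv'
}

# churn_candidates MIN
# Reads that TSV on stdin and prints "count<TAB>actor<TAB>label" for every
# actor/label pair mutated at least MIN times, highest count first.
churn_candidates() {
  awk -F '\t' -v min="$1" '
    { n[$2 "\t" $4]++ }
    END { for (k in n) if (n[k] >= min) print n[k] "\t" k }
  ' | sort -rn
}

# Example wiring (requires an authenticated gh):
#   label_events "$REPO" "$PR_NUMBER" | churn_candidates 3
```

A high count only flags a candidate. Whether it is an automation loop or an intended pair of mutually exclusive labels still needs the timestamps and the source workflow.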
22 changes: 22 additions & 0 deletions .github/codex/prompts/local-traffic-canary.md
@@ -0,0 +1,22 @@
# EvalOps Codex Local Traffic Canary

Investigate a failure in local developer tooling, traffic simulation, or
distributed tracing.

Required checks:

- Start from the failing command and preserve its output.
- Inspect `AGENTS.md`, Makefile targets, local compose files, traffic profiles,
and tracing docs before changing behavior.
- Prefer dry-run validations first, then dependency-backed local smoke only
when Docker and local ports are available.
- Verify that generated trace IDs, `traceparent`, NATS subjects, and manifest
paths match the repo contract.
- Keep fixes local-tooling focused unless the failure exposes a production
contract bug.

Output:

- Failing command and root cause.
- Patch or precise follow-up if credentials/local services are unavailable.
- Verification commands that future developers can run.
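
A dry-run check of a generated `traceparent` value might look like the sketch below. It assumes the repo contract follows the W3C Trace Context header layout (`version-traceid-parentid-flags` in lowercase hex), which this prompt does not confirm. The function name `valid_traceparent` is hypothetical.

```bash
# Sketch only: assumes the W3C Trace Context traceparent layout.

# valid_traceparent VALUE
# Returns 0 when VALUE has four lowercase-hex fields of length 2/32/16/2,
# a version other than "ff", and non-zero trace and parent IDs.
valid_traceparent() {
  printf '%s\n' "$1" \
    | grep -Eq '^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$' || return 1
  case "$1" in
    ff-*) return 1 ;;                                  # reserved version
    *-00000000000000000000000000000000-*) return 1 ;;  # all-zero trace ID
    *-0000000000000000-*) return 1 ;;                  # all-zero parent ID
  esac
  return 0
}

# Example:
#   valid_traceparent "$TRACEPARENT" || echo "traceparent violates contract" >&2
```

This needs neither Docker nor local ports, so it fits the dry-run-first ordering above.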
21 changes: 21 additions & 0 deletions .github/codex/prompts/post-merge-verify.md
@@ -0,0 +1,21 @@
# EvalOps Codex Post-Merge Verification

Verify that a recently merged PR is actually healthy on the default branch.

Required checks:

- Identify the merge commit and affected workflows on `main`.
- Check the latest default-branch GitHub Actions runs, not stale PR checks.
- For deploy or runtime changes, describe the GitOps or live-state validation
path and whether credentials were available.
- For local tooling, run the relevant local smoke or dry-run target.
- For tracing/event-bus work, verify trace propagation, subject/catalog
alignment, and local simulation manifests.
- If a follow-up is needed, create or describe a precise issue with acceptance
criteria.

Output:

- Healthy / unhealthy / inconclusive status.
- Evidence links or command outputs summarized in prose.
- Follow-up PR or issue recommendations.
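
The healthy / unhealthy / inconclusive decision could be sketched as below. The mapping from run conclusions to a status is an assumption for illustration, not a rule from this prompt. `runs_for_sha` and `health_status` are hypothetical names, and `--commit` assumes a `gh` version that supports that filter on `gh run list`.

```bash
# Sketch only: the conclusion-to-status mapping is an illustrative assumption.

# runs_for_sha REPO SHA
# Prints TSV rows of status, conclusion, name for main-branch runs on SHA.
runs_for_sha() {
  gh run list --repo "$1" --branch main --commit "$2" --limit 50 \
    --json status,conclusion,name \
    --jq '.[] | [.status, .conclusion, .name] | @tsv'
}

# health_status
# Reads that TSV on stdin. Any failed run means "unhealthy"; unfinished,
# cancelled, or missing runs mean "inconclusive"; otherwise "healthy".
health_status() {
  awk -F '\t' '
    $1 != "completed" { unknown = 1; next }
    $2 == "failure" || $2 == "timed_out" || $2 == "startup_failure" { failed = 1; next }
    $2 == "success" || $2 == "skipped" || $2 == "neutral" { ok = 1; next }
    { unknown = 1 }
    END {
      if (failed) print "unhealthy"
      else if (unknown || !ok) print "inconclusive"
      else print "healthy"
    }'
}

# Example wiring (requires an authenticated gh):
#   runs_for_sha "$REPO" "$MERGE_SHA" | health_status
```

Filtering by the merge SHA on `main` keeps stale PR checks out of the verdict.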
28 changes: 28 additions & 0 deletions .github/codex/prompts/pr-review.md
@@ -0,0 +1,28 @@
# EvalOps Codex PR Review

Review the pull request as an EvalOps maintainer. Focus on defects, behavioral
regressions, missing tests, generated artifact drift, security footguns, and
operational risk. Prefer concise findings over broad summaries.

Required checks:

- Inspect the diff against the PR base and identify the affected repos,
services, workflows, contracts, generated files, and deployment surfaces.
- Read any `AGENTS.md` files that apply to changed paths before reviewing.
- Use live GitHub context when available: PR description, labels, checks,
review comments, unresolved review threads, and recent CI failures.
- For generated code, verify whether the generator or checked-in output is the
source of truth before recommending direct edits.
- For infrastructure or workflow changes, call out whether the change affects
labels, branch protection, automation, release trains, or GitOps desired
state.
- For tracing or event-bus changes, verify trace context, subject/catalog
alignment, and local simulation coverage.

Output:

- Start with actionable findings ordered by severity.
- Include file paths and line references when possible.
- Include a short residual-risk note when the diff looks clean.
- Do not approve a PR solely because tests pass if unresolved review threads or
failing checks remain.
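
The last output rule could be backed by a small gate like the sketch below. `unresolved_threads` and `approval_gate` are hypothetical names, and the failing-check count is assumed to be supplied by the caller. The GraphQL query reads only the first 100 review threads, so larger PRs would need pagination.

```bash
# Sketch only: helper names are illustrative; reads at most 100 threads.

# unresolved_threads OWNER REPO PR_NUMBER
# Prints the number of unresolved review threads on the pull request.
unresolved_threads() {
  gh api graphql \
    -f query='query($owner: String!, $repo: String!, $number: Int!) {
      repository(owner: $owner, name: $repo) {
        pullRequest(number: $number) {
          reviewThreads(first: 100) { nodes { isResolved } }
        }
      }
    }' \
    -f owner="$1" -f repo="$2" -F number="$3" \
    --jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved | not)] | length'
}

# approval_gate UNRESOLVED_COUNT FAILING_COUNT
# Prints "clear" and returns 0 only when both counts are zero.
approval_gate() {
  if [ "$1" -gt 0 ] || [ "$2" -gt 0 ]; then
    echo "hold: ${1} unresolved thread(s), ${2} failing check(s)"
    return 1
  fi
  echo "clear"
}

# Example wiring (requires an authenticated gh; FAILING supplied by caller):
#   approval_gate "$(unresolved_threads "$OWNER" "$REPO" "$PR_NUMBER")" "$FAILING"
```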
9 changes: 9 additions & 0 deletions .github/workflow-templates/codex-ci-triage.properties.json
@@ -0,0 +1,9 @@
{
  "name": "Codex CI failure triage",
  "description": "Manually run Codex against a failed GitHub Actions run and optionally post the fix summary to a PR.",
  "iconName": "octicon pulse",
  "categories": [
    "Automation",
    "Continuous integration"
  ]
}
74 changes: 74 additions & 0 deletions .github/workflow-templates/codex-ci-triage.yml
@@ -0,0 +1,74 @@
name: Codex CI failure triage

on:
  workflow_dispatch:
    inputs:
      run_id:
        description: "GitHub Actions run id to triage"
        required: true
        type: string
      pr_number:
        description: "Optional PR number to comment on"
        required: false
        type: string

permissions:
  contents: write
  actions: read
  pull-requests: write
  issues: write

jobs:
  triage:
    runs-on: ubuntu-latest
    timeout-minutes: 45
    outputs:
      final_message: ${{ steps.run-codex.outputs.final-message }}
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Capture failed run evidence
        env:
          GH_TOKEN: ${{ github.token }}
          RUN_ID: ${{ inputs.run_id }}
        run: |
          {
            echo "# GitHub Actions failure"
            echo
            gh run view "${RUN_ID}" --repo "${GITHUB_REPOSITORY}" --json url,name,displayTitle,event,headBranch,headSha,conclusion,createdAt,updatedAt
            echo
            gh run view "${RUN_ID}" --repo "${GITHUB_REPOSITORY}" --log-failed || true
          } > codex-ci-evidence.md

      - name: Run Codex CI triage
        id: run-codex
        uses: openai/codex-action@5c3f4ccdb2b8790f73d6b21751ac00e602aa0c02
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Investigate the failure described in codex-ci-evidence.md. Start
            from the exact failed run/job/step, distinguish stale failures from
            live failures, and make the smallest safe patch when appropriate.
            Report commands run and remaining CI or review-thread work.
          codex-args: '["--full-auto"]'
          output-file: codex-ci-triage.md
          safety-strategy: drop-sudo
          sandbox: workspace-write

      - name: Post triage summary
        if: ${{ inputs.pr_number != '' && steps.run-codex.outputs.final-message != '' }}
        uses: actions/github-script@v7
        env:
          CODEX_FINAL_MESSAGE: ${{ steps.run-codex.outputs.final-message }}
          PR_NUMBER: ${{ inputs.pr_number }}
        with:
          github-token: ${{ github.token }}
          script: |
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: Number(process.env.PR_NUMBER),
              body: process.env.CODEX_FINAL_MESSAGE,
            });
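
Once the template above is installed in a repository, a manual dispatch might be wrapped as in the sketch below. It assumes the workflow is saved as `codex-ci-triage.yml` under `.github/workflows`, and `dispatch_triage` is a hypothetical helper name. The `run_id` and `pr_number` inputs are the ones declared in the workflow.

```bash
# Sketch only: assumes the template is installed as codex-ci-triage.yml.

# dispatch_triage RUN_ID [PR_NUMBER]
# Validates that RUN_ID is numeric, then triggers the workflow_dispatch event.
dispatch_triage() {
  run_id="$1"
  pr_number="${2:-}"
  case "$run_id" in
    ''|*[!0-9]*) echo "run_id must be numeric" >&2; return 2 ;;
  esac
  if [ -n "$pr_number" ]; then
    gh workflow run codex-ci-triage.yml -f run_id="$run_id" -f pr_number="$pr_number"
  else
    gh workflow run codex-ci-triage.yml -f run_id="$run_id"
  fi
}

# Example (requires an authenticated gh in the target repository):
#   dispatch_triage "$RUN_ID" "$PR_NUMBER"
```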
@@ -0,0 +1,9 @@
{
  "name": "Codex label churn audit",
  "description": "Have Codex inspect PR label mutation events and identify automation loops.",
  "iconName": "octicon tag",
  "categories": [
    "Automation",
    "Code review"
  ]
}
68 changes: 68 additions & 0 deletions .github/workflow-templates/codex-label-churn-audit.yml
@@ -0,0 +1,68 @@
name: Codex label churn audit

on:
  workflow_dispatch:
    inputs:
      pr_number:
        description: "Pull request number to audit"
        required: true
        type: string

permissions:
  contents: read
  pull-requests: write # the audit comment is posted on a pull request
  issues: write

jobs:
  audit:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Capture label timeline
        env:
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ inputs.pr_number }}
        run: |
          {
            echo "# Label timeline"
            echo
            gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/events" --paginate
            echo
            echo "# Workflows that mention labels"
            rg -n "add-label|remove-label|gh pr edit|issues.addLabels|issues.removeLabel|labels" .github/workflows scripts || true
          } > codex-label-churn-evidence.md

      - name: Run Codex label audit
        id: run-codex
        uses: openai/codex-action@5c3f4ccdb2b8790f73d6b21751ac00e602aa0c02
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Audit codex-label-churn-evidence.md. Identify which automation is
            adding and removing labels, whether the churn is intentional, and
            the smallest durable fix. Remember that human-committed code at
            EvalOps is usually LLM-authored, so agent-authorship labels should
            not be treated as suspicious by default.
          output-file: codex-label-churn-audit.md
          safety-strategy: drop-sudo
          sandbox: read-only

      - name: Comment with audit
        if: ${{ steps.run-codex.outputs.final-message != '' }}
        uses: actions/github-script@v7
        env:
          CODEX_FINAL_MESSAGE: ${{ steps.run-codex.outputs.final-message }}
          PR_NUMBER: ${{ inputs.pr_number }}
        with:
          github-token: ${{ github.token }}
          script: |
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: Number(process.env.PR_NUMBER),
              body: process.env.CODEX_FINAL_MESSAGE,
            });
@@ -0,0 +1,9 @@
{
  "name": "Codex post-merge verification",
  "description": "Have Codex inspect recent main-branch runs and summarize post-merge health.",
  "iconName": "octicon checklist",
  "categories": [
    "Automation",
    "Continuous integration"
  ]
}
70 changes: 70 additions & 0 deletions .github/workflow-templates/codex-post-merge-verify.yml
@@ -0,0 +1,70 @@
name: Codex post-merge verification

on:
  workflow_dispatch:
    inputs:
      merge_sha:
        description: "Merge commit or main-branch SHA to verify"
        required: false
        type: string
  schedule:
    - cron: "37 */6 * * *"

permissions:
  contents: read
  actions: read
  issues: write

jobs:
  verify:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0

      - name: Capture main-branch evidence
        env:
          GH_TOKEN: ${{ github.token }}
          MERGE_SHA: ${{ inputs.merge_sha }}
        run: |
          {
            echo "# Default-branch verification"
            echo
            echo "repository=${GITHUB_REPOSITORY}"
            echo "merge_sha=${MERGE_SHA:-${GITHUB_SHA}}"
            echo
            gh run list --repo "${GITHUB_REPOSITORY}" --branch main --limit 20 \
              --json databaseId,name,event,status,conclusion,headSha,createdAt,updatedAt,url
          } > codex-post-merge-evidence.md

      - name: Run Codex verifier
        id: run-codex
        uses: openai/codex-action@5c3f4ccdb2b8790f73d6b21751ac00e602aa0c02
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Verify default-branch health using codex-post-merge-evidence.md and
            the repository's local guidance. Decide whether the latest main
            state is healthy, unhealthy, or inconclusive. If unhealthy, propose
            the smallest follow-up with acceptance criteria.
          output-file: codex-post-merge-verify.md
          safety-strategy: drop-sudo
          sandbox: read-only

      - name: Publish verification report
        if: ${{ steps.run-codex.outputs.final-message != '' }}
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          title="[codex] Post-merge verification"
          if issue_number="$(gh issue list --state open --search "\"${title}\" in:title" --limit 1 --json number --jq '.[0].number // empty')" && [ -n "${issue_number}" ]; then
            gh issue comment "${issue_number}" --body-file codex-post-merge-verify.md
          else
            gh issue create --title "${title}" --body-file codex-post-merge-verify.md
          fi

      - name: Append report to summary
        if: ${{ always() && hashFiles('codex-post-merge-verify.md') != '' }}
        run: cat codex-post-merge-verify.md >> "${GITHUB_STEP_SUMMARY}"
10 changes: 10 additions & 0 deletions .github/workflow-templates/codex-pr-review.properties.json
@@ -0,0 +1,10 @@
{
  "name": "Codex pull request review",
  "description": "Run OpenAI Codex on PRs with EvalOps review guidance and post the findings back to the thread.",
  "iconName": "octicon code-review",
  "categories": [
    "Automation",
    "Code review",
    "Continuous integration"
  ]
}