CI toil-reduction sweep: registry determinism + codex auto-refresh + local CI dry-run (3 beads) #261
Merged
…fresh + local CI dry-run (soc-k47k, soc-7qq9, soc-ws40)

Three compounding fixes filed during the 2026-05-07 retrospective on CI/push-gate toil. See `.agents/learnings/2026-05-07-ci-push-gate-toil-pattern.md` for the data: 9 push iterations across 4 PRs, ~45min wall-clock + thousands of orchestrator tokens spent diagnosing identical-looking failures with different root causes.

soc-k47k (registry-check determinism):
- `scripts/generate-registry.sh` `build_knowledge_stores()` now walks `git ls-files .agents/` instead of filesystem `find`. CI's clean checkout and a session-built local checkout produce identical output because both consult the git index (deterministic).
- Removed the `[[ -d .agents ]] || return` early-exit so behavior is uniform regardless of working-tree state.
- Eliminates the dominant single source of failed-push retries this session (5 iterations across PRs #255 and #260 alone).

soc-7qq9 (PostToolUse codex auto-refresh):
- New `hooks/postedit-codex-refresh.sh` fires after Edit/Write of any `skills/<name>/SKILL.md`, `references/`, `scripts/`, or `schemas/` file. Auto-runs `scripts/refresh-codex-artifacts.sh --scope head` so codex hashes stay current; operator never sees the parity warning.
- Registered in `hooks/hooks.json` (matchers Edit + Write, 10s timeout).
- Embedded copy synced via `make sync-hooks`.
- Disable via `AGENTOPS_HOOKS_DISABLED=1` or `AGENTOPS_CODEX_AUTOREFRESH_DISABLED=1`.
- Best-effort: failures log to stderr but always exit 0.

soc-ws40 (local CI-equivalent dry-run):
- New `scripts/test-ci-deterministic-gates.sh` runs the 4 deterministic CI gates that have historically surprised PRs at push time: registry-check, skill-lint, heal-skill --strict, codex artifact metadata.
- Non-fail-fast: reports all failures together so you can fix them in one diagnostic round instead of N push iterations.
- Currently runs in <30s on a clean tree (vs. ~5min for one CI canary cycle).
Together these compound: with #1 + #2 + #3 wired, the CI surprises that drove this session's toil should be catchable locally before push.

- `go test ./...` → 11858 passed in 50 packages.
- `bash scripts/test-ci-deterministic-gates.sh` → 4/4 pass.
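To illustrate why the soc-k47k change removes the local/CI mismatch, here is a minimal sketch. It is not the actual `build_knowledge_stores()` code: it builds a throwaway repository with one tracked and one untracked file under `.agents/`, and the helper name `list_tracked_agents_files` and both file names are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: list files under .agents/ from the git index,
# not from the working tree. Sorted bytewise for a stable order.
list_tracked_agents_files() {
  git -C "$1" ls-files -- .agents/ | LC_ALL=C sort
}

# Throwaway repository standing in for a "session-built" local checkout.
demo_repo="$(mktemp -d)"
trap 'rm -rf "$demo_repo"' EXIT
git -C "$demo_repo" init -q
mkdir -p "$demo_repo/.agents/learnings"
echo "tracked" > "$demo_repo/.agents/learnings/tracked.md"
echo "scratch" > "$demo_repo/.agents/learnings/session-scratch.md"

# Only one of the two files enters the index.
git -C "$demo_repo" add .agents/learnings/tracked.md

echo "find sees:"
(cd "$demo_repo" && find .agents -type f | LC_ALL=C sort)

echo "git ls-files sees:"
list_tracked_agents_files "$demo_repo"
```

A clean CI checkout only ever contains tracked files, so the index-based listing is the one both environments can agree on. When nothing under `.agents/` is tracked, `git ls-files` prints nothing and still exits 0, which is consistent with dropping the directory-existence early-exit.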
Required by `scripts/check-test-fixture-parity.sh`: every `hooks/*.sh` must have at least one test reference. 4 smoke tests:
- non-skill path → silent exit 0
- `AGENTOPS_HOOKS_DISABLED=1` short-circuits
- `AGENTOPS_CODEX_AUTOREFRESH_DISABLED=1` short-circuits
- skill path outside any git repo → no-op
…t-codex-refresh tests
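The four smoke-test cases map onto a small amount of guard logic. The sketch below is an assumption about how such a hook could be shaped, not the contents of `hooks/postedit-codex-refresh.sh`: the function names `should_refresh` and `run_refresh_hook` are hypothetical, paths are assumed to be repo-relative, and `references/`, `scripts/`, and `schemas/` are assumed to sit under `skills/<name>/`.

```shell
#!/usr/bin/env bash

# Hypothetical predicate: succeed only when an edited path should trigger a refresh.
should_refresh() {
  local path="$1"
  # Either disable switch short-circuits before any path matching.
  [ "${AGENTOPS_HOOKS_DISABLED:-}" = "1" ] && return 1
  [ "${AGENTOPS_CODEX_AUTOREFRESH_DISABLED:-}" = "1" ] && return 1
  case "$path" in
    skills/*/SKILL.md | skills/*/references/* | skills/*/scripts/* | skills/*/schemas/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical wrapper: best-effort, so every branch returns 0.
# The refresh command is a parameter only so the sketch can be exercised
# without the real script; it defaults to the path named in this PR.
run_refresh_hook() {
  local path="$1" refresh_cmd="${2:-scripts/refresh-codex-artifacts.sh}"
  # Non-skill path or disabled hook: silent no-op.
  should_refresh "$path" || return 0
  # Outside a git work tree there is nothing to refresh: silent no-op.
  git rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 0
  # A failed refresh is logged to stderr but never fails the hook.
  "$refresh_cmd" --scope head || echo "codex refresh failed (ignored)" >&2
  return 0
}
```

Keeping the decision in a side-effect-free predicate is a design choice of this sketch: it lets the disable switches and path matching be tested without ever invoking the refresh script.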
Summary
Closes 3 beads (soc-k47k, soc-7qq9, soc-ws40) filed during the 2026-05-07 retrospective on CI/push-gate toil. See `.agents/learnings/2026-05-07-ci-push-gate-toil-pattern.md` for the data: 9 push iterations across 4 PRs, ~45min wall-clock + thousands of orchestrator tokens spent diagnosing identical-looking failures with different root causes.

Three compounding fixes shipped together to amortize the CI cycle cost (the very pattern the learning argues for).
What changed
soc-k47k: registry-check determinism
- `scripts/generate-registry.sh` `build_knowledge_stores()` now walks `git ls-files .agents/` instead of filesystem `find`. CI's clean checkout and a session-built local checkout produce identical output (both consult the git index, which is deterministic).
- Removed the `[[ -d .agents ]] || return` early-exit so behavior is uniform regardless of working-tree state.

soc-7qq9: PostToolUse codex auto-refresh
- New `hooks/postedit-codex-refresh.sh` fires after Edit/Write of any `skills/<name>/SKILL.md`, `references/`, `scripts/`, or `schemas/` file.
- Auto-runs `scripts/refresh-codex-artifacts.sh --scope head` so codex hashes stay current; operator never sees the parity warning.
- Registered in `hooks/hooks.json` (matchers Edit + Write, 10s timeout). Embedded copy synced via `make sync-hooks`.
- Disable via `AGENTOPS_HOOKS_DISABLED=1` or `AGENTOPS_CODEX_AUTOREFRESH_DISABLED=1`.

soc-ws40: local CI-equivalent dry-run
- New `scripts/test-ci-deterministic-gates.sh` runs the 4 deterministic CI gates in <30s: registry-check, skill-lint, heal-skill --strict, codex artifact metadata.

Test coverage
- 4 smoke tests for `postedit-codex-refresh.sh` in `tests/hooks/test-hooks.sh` (non-skill path no-op, both disable env vars, orphan path).
- `scripts/check-test-fixture-parity.sh` passes (was failing without the new tests).

Test plan
- `bash scripts/test-ci-deterministic-gates.sh` → 4/4 PASS
- `bash tests/hooks/test-hooks.sh` (post-edit-codex-refresh suite) → 4/4 PASS
- `bash scripts/check-test-fixture-parity.sh` → exit 0
- `cd cli && go test ./...` → 11858 passing
- `skills-codex/<name>/.agentops-generated.json` auto-refresh

How this composes
With #1 + #2 + #3 wired:
- `--check` mode is honest: no more "passes locally, fails on CI" surprise.

Together: the CI surprises that drove this session's toil should be catchable locally before push. Future PR CI cycles should compress from N iterations to 1.
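The "N iterations to 1" compression depends on the dry-run reporting every failing gate in a single pass rather than stopping at the first. A minimal sketch of that non-fail-fast pattern follows; it is not the contents of `scripts/test-ci-deterministic-gates.sh`. The function name `run_gates` is hypothetical, and `true`/`false` are placeholders for the real gate commands, which this PR description does not show.

```shell
#!/usr/bin/env bash

# Hypothetical runner: takes (name, command) pairs, runs every command,
# and only reports the overall result after all of them have finished.
run_gates() {
  local total=0 failed=0 name cmd
  while [ "$#" -ge 2 ]; do
    name="$1"
    cmd="$2"
    shift 2
    total=$((total + 1))
    if bash -c "$cmd" >/dev/null 2>&1; then
      echo "PASS  $name"
    else
      # Record the failure and keep going instead of exiting.
      echo "FAIL  $name"
      failed=$((failed + 1))
    fi
  done
  echo "$((total - failed))/$total pass"
  # Non-zero exit status if any gate failed.
  [ "$failed" -eq 0 ]
}

# Placeholder commands stand in for the four real gates.
run_gates \
  "registry-check" "true" \
  "skill-lint" "false" \
  "heal-skill --strict" "true" \
  "codex artifact metadata" "false" \
  || echo "fix every FAIL above in one round, then push"
```

With a fail-fast runner the second failure in this example would only surface on the next push; collecting results first is what lets both be fixed in one diagnostic round.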
Closes