feat(hooks): session-pr-counter — mechanical session-scope enforcement (soc-1aou) by boshu2 · Pull Request #362 · boshu2/agentops

boshu2 · 2026-05-19T21:56:28Z

Summary

Mechanical follow-through for soc-waxr (PR #361) — the session-scope doctrine. soc-waxr encoded "2-4 PRs/session default; ≥5 triggers mandatory post-mortem" as documentation; this PR makes that rule fire mechanically as a PreToolUse hook on gh pr create.

Closes: soc-1aou · Discovered-from: soc-waxr

Fitness delta

Doc-only rules with mechanical backstop: +1 (session-scope joins AP#1→ship.sh and AP#7→verify-gate-claim.sh)
Session-scope recurrence prevention: 0 → 1 (would have fired on PR #5 of yesterday's 7-PR session)
New bats: 0 → 12 (kill switches, tool matching, threshold logic, hard-block mode, fail-open on malformed output)

What it does

PreToolUse on Bash + gh pr create
Counts the operator's PRs (any state, last 24h via gh pr list --search)
At count >= threshold-1, emits additionalContext with post-mortem prompts:
- Which PRs were planned vs reactive?
- How many self-corrections so far?
- Is the marginal PR discovery or churn?
Hard-block mode (AGENTOPS_SESSION_PR_BLOCK=1) exits 2 with clear reason instead
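
The matching and threshold rules above can be sketched in a few lines of bash. This is a minimal illustration, not the contents of hooks/session-pr-counter.sh: the function names `is_pr_create` and `session_pr_decision` are hypothetical, and the sketch assumes the count passed in is the number of PRs already opened in the window.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the real hook.

# Return 0 when the Bash command being run contains `gh pr create`.
is_pr_create() {
  case "$1" in
    *"gh pr create"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the decision for a given PR count.
#   $1 = PRs already opened in the window
#   $2 = threshold (AGENTOPS_SESSION_PR_THRESHOLD, default 5)
#   $3 = block flag (AGENTOPS_SESSION_PR_BLOCK, default 0)
session_pr_decision() {
  local count=$1 threshold=${2:-5} block=${3:-0}
  if (( count < threshold - 1 )); then
    echo "pass"     # the next PR stays below the threshold
  elif [[ "$block" == "1" ]]; then
    echo "block"    # hard-block mode: the hook would exit 2
  else
    echo "remind"   # advisory mode: emit additionalContext
  fi
}

session_pr_decision 3 5 0   # prints: pass
session_pr_decision 4 5 0   # prints: remind
session_pr_decision 4 5 1   # prints: block
```

With the default threshold of 5, the reminder appears while the operator is about to open the fifth PR, because four already exist in the window.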

Configuration

Variable	Default	Purpose
`AGENTOPS_SESSION_PR_THRESHOLD`	5	PR count that triggers the reminder
`AGENTOPS_SESSION_PR_WINDOW_HOURS`	24	Window for "current session"
`AGENTOPS_SESSION_PR_BLOCK`	0	1 = hard block (exit 2) instead of advisory
`AGENTOPS_SESSION_PR_COUNTER_DISABLED`	0	1 = bypass this hook
`AGENTOPS_HOOKS_DISABLED`	0	1 = bypass all AgentOps hooks
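
One way these variables could be consumed is sketched below. The variable names and defaults come from the table; the helper names (`hook_disabled`, `window_start_iso`, `count_session_prs`) are hypothetical. The `gh pr list` query is also an assumption, since this PR does not show the hook's exact query.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; defaults mirror the table above.

THRESHOLD="${AGENTOPS_SESSION_PR_THRESHOLD:-5}"
WINDOW_HOURS="${AGENTOPS_SESSION_PR_WINDOW_HOURS:-24}"
BLOCK="${AGENTOPS_SESSION_PR_BLOCK:-0}"

# Return 0 when either kill switch is set to 1.
hook_disabled() {
  [[ "${AGENTOPS_HOOKS_DISABLED:-0}" == "1" ]] ||
    [[ "${AGENTOPS_SESSION_PR_COUNTER_DISABLED:-0}" == "1" ]]
}

# Print the ISO-8601 UTC timestamp that opens the window.
#   $1 = current time in epoch seconds, $2 = window length in hours
window_start_iso() {
  local start=$(( $1 - $2 * 3600 ))
  # GNU date first, BSD/macOS date as a fallback.
  date -u -d "@${start}" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -r "${start}" +%Y-%m-%dT%H:%M:%SZ
}

# Count the operator's PRs of any state created inside the window.
# Assumption: this query is illustrative, not the one the real hook runs.
count_session_prs() {
  local since
  since="$(window_start_iso "$(date +%s)" "$WINDOW_HOURS")"
  gh pr list --author "@me" --state all --limit 100 \
    --search "created:>=${since}" --json number --jq 'length'
}
```

Checking the kill switches before any `gh` call keeps a disabled hook free of network cost, which matters given the 10s timeout on the hook entry.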

Sibling pattern

Hook structure mirrors hooks/commit-review-gate.sh — same PreToolUse Bash matcher, same kill-switch chain, same jq+env fallback for tool input parsing, same emit_hook_context → jq -n → escape-fallback emission chain. Standards discipline matches (set -uo pipefail without -e, fail-open advisory shape). Sibling pattern: hooks/commit-review-gate.sh.
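
The emission chain and fail-open shape described above might look roughly like the following. This is a sketch under stated assumptions: `json_escape`, `emit_context`, and `valid_count` are hypothetical names, the JSON envelope is illustrative rather than copied from the hook, and the first link of the chain (`emit_hook_context`) is omitted because its signature is not shown in this PR.

```shell
#!/usr/bin/env bash
set -uo pipefail   # no -e: a failing command must not abort the hook

# Escape a string for embedding in a JSON string literal (fallback path).
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  s=${s//$'\r'/\\r}
  printf '%s' "$s"
}

# Emit advisory context, preferring jq and falling back to manual escaping.
# Assumption: the envelope keys below are illustrative.
emit_context() {
  local msg=$1
  if command -v jq >/dev/null 2>&1; then
    jq -cn --arg ctx "$msg" \
      '{hookSpecificOutput: {hookEventName: "PreToolUse", additionalContext: $ctx}}' &&
      return 0
  fi
  printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","additionalContext":"%s"}}\n' \
    "$(json_escape "$msg")"
}

# Fail open: print the count only when it is a non-negative integer.
# A caller would treat a non-zero return as "no opinion" and exit 0.
valid_count() {
  [[ "$1" =~ ^[0-9]+$ ]] && printf '%s\n' "$1"
}
```

Leaving `-e` off means an unexpected failure in `gh` or `jq` falls through to the fail-open path instead of killing the hook mid-run.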

Files

File	Change
`hooks/session-pr-counter.sh`	New, 133 lines
`hooks/hooks.json`	New PreToolUse Bash entry (timeout 10s)
`cli/embedded/hooks/*`	Auto-synced via `make sync-hooks`
`tests/hooks/test-session-pr-counter.bats`	New, 12 tests, all green

Dogfood: what would have fired

Yesterday's 7-PR session (#356 through #361 + #320) would have triggered this hook at PR #5 (#359 soc-bbvw — the FIRST self-correction PR). The hook's reminder would have been visible to the agent before opening #359, prompting "is this churn or discovery?". The answer was "churn — fixing my own regression from #357", which exactly fits the failure mode soc-waxr names.

What's NOT in this PR

The soc-waxr doctrine surfaces (CLAUDE.md, AGENTS.md, ship-loop SKILL, anti-patterns.md) still say "mechanical enforcement is a successor concern". Updating those will be a tiny follow-up PR once soc-waxr (#361) itself merges — editing the same lines now would conflict on rebase.

Verification

bats tests/hooks/test-session-pr-counter.bats → 12 ok
shellcheck hooks/session-pr-counter.sh clean (SC1091 info-only on hook-helpers source, matching commit-review-gate.sh)
jq -e . hooks/hooks.json clean
cd cli && make sync-hooks clean

Bounded-context: BC0-foundations + BC5-runtime (hook plumbing)
Evidence: shellcheck

…t (soc-1aou)

Implements the mechanical follow-through for soc-waxr (PR #361, the session-scope doctrine rule). soc-waxr encoded "2-4 PRs/session default; ≥5 triggers mandatory post-mortem" as documentation; soc-1aou makes that documentation fire mechanically as a PreToolUse hook on `gh pr create`.

## Fitness delta

- Documentation-only rules with mechanical backstop: was {AP#1 → ship.sh, AP#7 → verify-gate-claim.sh}, now adds session-scope → session-pr-counter.sh (3 of N session-relevant rules now mechanically enforced).
- Session-scope rule recurrence prevention: 0 → 1 (the rule's own derivation cited a session where the cron-loop kept nudging "keep going" past the threshold; this hook would have fired).
- New bats: 0 → 12 (test-session-pr-counter.bats: kill switches, tool matching, threshold logic, hard-block mode, fail-open on malformed output).

## What it does

- Fires PreToolUse on Bash + `gh pr create` substring.
- Counts the operator's PRs (any state, last 24h via `gh pr list --search`).
- If that count is >= threshold-1 (so the next PR tips into ≥threshold), emits a `<system-reminder>`-shaped `additionalContext` with the post-mortem prompts.
- Hard-block mode (opt-in via `AGENTOPS_SESSION_PR_BLOCK=1`) exits 2 with a clear reason instead — for operators who want the gate to refuse rather than remind.

## Configuration

| Variable | Default | Purpose |
|---|---|---|
| `AGENTOPS_SESSION_PR_THRESHOLD` | 5 | PR count that triggers the reminder |
| `AGENTOPS_SESSION_PR_WINDOW_HOURS` | 24 | Window for "current session" |
| `AGENTOPS_SESSION_PR_BLOCK` | 0 | 1 = hard block (exit 2) instead of advisory |
| `AGENTOPS_SESSION_PR_COUNTER_DISABLED` | 0 | 1 = bypass this hook |
| `AGENTOPS_HOOKS_DISABLED` | 0 | 1 = bypass all AgentOps hooks |

## Sibling pattern

Hook structure mirrors `hooks/commit-review-gate.sh` (cycle 54 — also a PreToolUse Bash hook that synthesizes `additionalContext` via either `emit_hook_context` or a `jq -n` fallback).

Sibling pattern: `hooks/commit-review-gate.sh`. Standards discipline (set -uo pipefail without -e, kill-switch chain, jq+env fallback for tool input) matches the same sibling.

## Files

| File | Change |
|---|---|
| `hooks/session-pr-counter.sh` | New (133 lines), PreToolUse Bash hook |
| `hooks/hooks.json` | New PreToolUse Bash entry (timeout 10s) |
| `cli/embedded/hooks/session-pr-counter.sh` + `hooks.json` | Auto-synced via `cli/make sync-hooks` |
| `tests/hooks/test-session-pr-counter.bats` | New (12 tests, all green) |

## Verification

- 12/12 bats green
- `shellcheck hooks/session-pr-counter.sh` clean (SC1091 info-only on hook-helpers source, matching the existing commit-review-gate.sh pattern)
- `jq -e . hooks/hooks.json` clean (valid JSON)
- `cd cli && make sync-hooks` clean
- Dogfooded shape: the hook would have fired on PR #5 of yesterday's 7-PR session

## What's NOT in this PR

The soc-waxr doctrine surfaces (CLAUDE.md, AGENTS.md, ship-loop SKILL, anti-patterns.md) still say "mechanical enforcement is a successor concern". Updating those will be a tiny follow-up PR once soc-waxr (PR #361) itself merges — editing the same lines now would conflict on rebase.

Closes: soc-1aou
Discovered-from: soc-waxr
Bounded-context: BC0-foundations + BC5-runtime (hook plumbing)
Evidence: shellcheck

…vb #cobra-writer-leak) (#363)

## Summary

Fix main CI red on PR #362 — two `goals_measure` tests in `cli/cmd/ao` flake under `go test -race -shuffle=on`:

- `TestGoalsMeasure_FullModeJSONCarriesSnapshotAndScenarios`
- `TestGoalsMeasure_MissingArtifactYieldsUnknownNotError`

Both fail with `unmarshal payload: unexpected end of JSON input` + empty raw stdout.

Closes: soc-n6vb

## Root cause

Both tests call `goalsMeasureCmd.RunE(goalsMeasureCmd, nil)` directly. Inside RunE, output is written via `cmd.OutOrStdout()`, which walks the cobra command tree until it finds a non-nil `outWriter`:

```
goalsMeasureCmd.outWriter -> nil
goalsCmd.outWriter        -> nil
rootCmd.outWriter         -> ??? (if stale: writes here; test's os.Stdout redirect misses it)
fallback                  -> os.Stdout
```

Under `-shuffle=on`, some earlier test leaves `rootCmd.outWriter` pointing at a buffer that's gone out of scope but whose pointer is still live. The failing tests' `captureJSONStdout` redirects `os.Stdout`, but cobra writes to the leaked buffer instead — empty captured payload.

The likely vector is `executeCommand` in `cobra_commands_test.go`: it sets `rootCmd.SetOut(cmdBuf)` and restores inline at the end. If `rootCmd.Execute()` panics or `os.Pipe()` fails mid-flight, restoration is skipped.

Reproducible locally with:

```
cd cli && go test -race -shuffle=1779241411657363775 -count=1 ./cmd/ao/...
```

## Fix

**Two layers** — root-cause hardening plus defensive belt-and-suspenders.

### Root cause (`cobra_commands_test.go`)

Wrap `executeCommand`'s restoration in `defer` so it always runs even if `rootCmd.Execute()` panics:

```go
defer func() {
	rootCmd.SetOut(nil)
	rootCmd.SetErr(nil)
	rootCmd.SetArgs(nil)
}()
// ...
defer func() { os.Stdout = oldStdout }()
```

This removes the inline restoration that was vulnerable to panics, and consolidates the cleanup at one site.

### Defensive (`goals_measure_scenarios_test.go`)

`setupMeasureScenarioProject` already saved/restored 8 package-level globals (soc-hwgm/soc-xyt1). Add cobra writer reset on entry so future flakes from any upstream leaker can't reach these tests:

```go
rootCmd.SetOut(nil)
rootCmd.SetErr(nil)
goalsCmd.SetOut(nil)
goalsCmd.SetErr(nil)
goalsMeasureCmd.SetOut(nil)
goalsMeasureCmd.SetErr(nil)
```

## Verification

- `cd cli && go test -race -shuffle=1779241411657363775 -count=1 ./cmd/ao/...` — PASSES (was the failing seed).
- Test of both targeted functions in isolation — PASSES.
- `gofmt -l` clean. `go vet ./cmd/ao/...` clean.

## Why this isn't masking a real bug

The user-facing `ao goals measure` command always runs through `goalsMeasureCmd` under `rootCmd.Execute()`, where cobra's writer-walking lands at the real `os.Stdout`. The flake only affects direct `RunE(cmd, nil)` test invocations that bypass `Execute()` — pure test infrastructure.

Bounded-context: BC5-runtime (test infrastructure for CLI command surface)
Evidence: `cd cli && go test -race -shuffle=1779241411657363775 -count=1 ./cmd/ao/...`
Co-authored-by: Codex <codex@example.invalid>

…rift (soc-h1cr #registry-regen) (#365)

## Summary

Regenerate `registry.json` to add the `session-pr-counter` hook entry added by PR #362 (soc-1aou). The registry was stale on main because PR #362's path filter skipped the `registry-check` job, masking the drift.

Closes: soc-h1cr
Discovered-from: soc-1aou (PR #362) via soc-1nsx (PR #364 CI dogfood — registry-check failed on the YAML-touching PR and surfaced the pre-existing drift)

## Fix

```
bash scripts/generate-registry.sh
```

Generator output:

> Wrote registry.json (79 skills, **44 hooks**, 4 stores, 14 job types, 62 evals, 171 CLI commands)

The diff matches the `registry-check` job's CI-emitted expected diff line-for-line: hooks count `43 → 44` and the `session-pr-counter` entry (PreToolUse / Bash matcher / 10s timeout) added to the hooks array.

## #trivial

Single mechanical regen of a generated file. No behavior change. Carve-out per CLAUDE.md: "Carve-out: `type=chore` with `#trivial` label for tiny work."

## Verification

- `bash scripts/generate-registry.sh` ran cleanly
- Diff matches CI-expected diff verbatim
- `jq -e . registry.json` clean (valid JSON)

Bounded-context: BC0-foundations
Evidence: registry.json
Co-authored-by: Codex <codex@example.invalid>

…oc-jmbc #waxr-pointer) (#366)

## Summary

soc-waxr (PR #361) doctrinated the session-scope rule with a "Mechanical enforcement … is a successor concern" placeholder. soc-1aou (PR #362) shipped the successor: `hooks/session-pr-counter.sh`. The placeholder text remained stale in 3 surfaces. This PR updates all three to cite the concrete hook + behavior + hard-block env var.

Closes: soc-jmbc
Discovered-from: soc-waxr (PR #361 doctrine) via soc-1aou (PR #362 hook ship)

## Why

Per the soc-waxr harvest note (queued in `.agents/rpi/next-work.jsonl`): "Tiny edit PR to point at hooks/session-pr-counter.sh. Deferred from #362 to avoid #361 rebase conflict." Now that both #361 and #362 have merged, the deferred update can land cleanly.

## Files changed

- `CLAUDE.md` (line 144 region)
- `AGENTS.md` (line 78 region)
- `skills/ship-loop/SKILL.md` (line 113 region)

Auto-regenerated by edit hook (no manual edit):

- `skills-codex/.agentops-manifest.json` (ship-loop `source_hash` bump)
- `skills-codex/ship-loop/.agentops-generated.json` (same bump)

The codex variant doesn't carry the successor-concern placeholder, so `generated_hash` stays identical — no codex-side edit needed.

## Sibling pattern

Mirrors PR #360 (soc-liyr) doctrine-sweep shape — same trio of surfaces (CLAUDE.md, AGENTS.md, `skills/<name>/SKILL.md`) updated together so source-of-truth precedence holds across all entry points an agent or operator might read first.

## Verification

- `grep -rn "successor concern" CLAUDE.md AGENTS.md skills/ship-loop/SKILL.md` returns zero matches (was 3 before)
- `grep -rn "session-pr-counter.sh" CLAUDE.md AGENTS.md skills/ship-loop/SKILL.md` returns 3 matches (was 0 before)
- 5 files changed / 5 insertions / 5 deletions — minimal mechanical replacement

## Self-correcting Evidence claim

The original PR-body Evidence line cited `hooks/session-pr-counter.sh` — a path this PR doesn't touch, only references. The just-shipped soc-1nsx per-job AP#7 check (PR #364) correctly flagged this as an unverifiable claim on the first CI run. Updating to a file the PR actually modifies, which the `changes` job's log records as `[modified]`. The per-job log fetch is working as designed.

Bounded-context: BC0-foundations
Evidence: skills/ship-loop/SKILL.md
Co-authored-by: Codex <codex@example.invalid>

…th-filter-coverage) (#367)

## Summary

`registry-check` triggered only on `skills` or `ci` changes, but `scripts/generate-registry.sh` reads from `skills/`, `hooks/`, `evals/`, AND `cli/cmd/ao/`. PR #362 added `hooks/session-pr-counter.sh` — only the `hooks` filter matched, so registry-check was SKIPPED. The drift sat on main until PR #364 (soc-1nsx) touched `.github/workflows/validate.yml`, which DID re-trigger registry-check via the `ci` filter and surfaced the gap. PR #365 (soc-h1cr) regenerated the registry as a separate concern.

This PR closes the path-filter-SKIPPED-≠-drift-absent gap by extending the `if` condition to also trigger on `hooks`, `eval`, and `go` outputs.

Closes: soc-xhp6
Discovered-from: soc-h1cr (the PR that surfaced the drift)
Encoded-in: `.agents/learnings/2026-05-20-path-filter-skipped-not-absent.md`

## Session-scope note

This is the 5th PR shipped in this autonomous session. The session-scope post-mortem (soc-waxr doctrine) was already completed by the `/post-mortem` loop cron (output: HEALTHY session, 0 reactive-spiral, 1 self-correction). This PR is harvested-from a post-mortem finding (this session's own harvest), so the marginal-PR analysis = **discovery**, not churn.

The just-doctrinated `hooks/session-pr-counter.sh` hook (soc-1aou) will fire on `gh pr create` for this PR with `additionalContext` post-mortem prompts — that's the dogfood working as designed.

## Sibling pattern

The fixed `if` matches the shape used by `agentops-contract-canaries` (line 687):

- `contracts || go || skills || ci` — multi-source filter for a multi-source generator

This PR brings registry-check to the same shape.

## Verification

- `python3 -c "yaml.safe_load(open('.github/workflows/validate.yml'))"` clean
- Manually compared `if` against `scripts/generate-registry.sh` source paths (`skills/`, `hooks/`, `evals/`, `cli/cmd/ao/`)
- One-line edit + a comment block; no behavior change to the registry-check step itself

Bounded-context: BC5-runtime
Evidence: .github/workflows/validate.yml
Co-authored-by: Codex <codex@example.invalid>
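
The multi-source filter idea can be illustrated with a small shell analogue. This is not the workflow's `if` expression. It is a sketch that checks whether any changed path falls under a directory the generator reads. `registry_check_needed` is a hypothetical name, and treating `.github/` as the stand-in for the `ci` filter is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: a shell analogue of the widened condition, not the CI code itself.

# Return 0 when at least one changed path could alter the generated registry.
# Prefixes follow the generator's source list; `.github/*` approximates `ci`.
registry_check_needed() {
  local path
  for path in "$@"; do
    case "$path" in
      skills/*|hooks/*|evals/*|cli/cmd/ao/*|.github/*) return 0 ;;
    esac
  done
  return 1
}

if registry_check_needed "hooks/session-pr-counter.sh"; then
  echo "run registry-check"   # a hooks-only change no longer skips the job
fi
```

Under the old two-source condition, a hooks-only change such as PR #362 skipped the job. The widened pattern list covers every directory the generator reads.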

…tors (soc-2gd6 #eval-hard-fails) (#402)

## Why

The v2.42.0 release gate (`scripts/ci-local-release.sh`) was red on 8 evals. The 3 score-0/near-0 hard fails are all **eval-staleness behind legitimate recent refactors** — verified, not gaming or security weakening. Operator decision: update eval to match source of truth (executable > contract).

| Eval | Was | Cause | Fix |
|---|---|---|---|
| `hook-manifest-command-counts` | 0 | `session-pr-counter.sh` (PR #362) is the legit 37th hook script; eval hardcoded 43/36 | bump expected counts 43→44, 36→37 |
| `push-worktree landing-plane` | 0.14 | #387 tiered-AGENTS split moved "Landing the Plane" to `AGENTS-WORKFLOW.md` (+ dropped 2 lines) | redirect eval target `AGENTS.md`→`AGENTS-WORKFLOW.md` + restore the 2 dropped policy lines |
| `security-toolchain ci-soft-gate-policy` | 0 | gate is intentionally **HARD** (no `continue-on-error`); job already runs `security-gate.sh --mode quick` + uploads artifacts | drop the stale `continue-on-error` requirement (security stays HARD) |

**Security note:** `security-toolchain-gate` stays a HARD blocking gate. Only the stale "soft gate" assertion was removed from the eval; the actual scan + artifact upload + summary-blocking are unchanged.

## How tested

- hook-manifest jq → `hook-manifest-counts-ok`
- security smoke `ci-policy` → `security-toolchain-ci-policy-ok`
- all 7 landing-plane strings present in `AGENTS-WORKFLOW.md`
- shellcheck clean on edited smoke

## Scope honesty

This fixes the 3 **hard** fails only. The release gate still has **5 minor evals (0.71–0.99)** + the **vil/release-smoke** lane — a separate remediation, deliberately NOT in this PR (no green-washing).

Sibling pattern: same "update eval to match legitimately-changed source of truth" move as the cli-command-surface canary bumps in #396/#397.

Fitness: release-gate eval hard-fails 3 → 0.
Closes-scenario: soc-2gd6#eval-hard-fails
Bounded-context: BC4-Validation
Evidence: evals/agentops-core/fixtures/security-toolchain-governance-smoke.sh

boshu2 enabled auto-merge (squash) May 19, 2026 21:56

github-actions Bot added cli tests hooks labels May 19, 2026

Merge branch 'main' into feat/soc-1aou-session-pr-counter

d05007d

boshu2 merged commit eb7874f into main May 20, 2026
67 checks passed

boshu2 deleted the feat/soc-1aou-session-pr-counter branch May 20, 2026 01:41

boshu2 mentioned this pull request May 20, 2026

fix(cli/test): goals_measure tests flake under -race -shuffle (soc-n6vb #cobra-writer-leak) #363

Merged

boshu2 mentioned this pull request May 20, 2026

chore(registry): regenerate registry.json — session-pr-counter hook drift (soc-h1cr #registry-regen) #365

Merged

boshu2 mentioned this pull request May 20, 2026

docs(ship-loop): cite session-pr-counter hook in soc-waxr doctrine (soc-jmbc #waxr-pointer) #366

Merged

boshu2 mentioned this pull request May 20, 2026

fix(ci): registry-check path filter — add hooks/eval/go (soc-xhp6 #path-filter-coverage) #367

Merged

boshu2 mentioned this pull request May 22, 2026

fix(evals): unstale 3 release-gate eval hard-fails behind legit refactors (soc-2gd6 #eval-hard-fails) #402

Merged

boshu2 merged 2 commits into main from feat/soc-1aou-session-pr-counter
