
fix(hooks): reconcile stale .agents/.gitignore deny-all (soc-rv5p)#263

Merged
boshu2 merged 1 commit into main from fix/soc-rv5p-corpus-durability on May 8, 2026

Owner

@boshu2 boshu2 commented May 8, 2026

Summary

Fixes the corpus-durability bug filed as soc-rv5p (the pre-flight finding from PR #262's positioning doc sweep).

The session-start hook's deny-all .agents/.gitignore child file was silently overriding the root .gitignore's !/.agents/... force-tracked allowlist. Force-tracked audit-truth artifacts (.agents/nightly/<date>/baseline-goals.json, .agents/findings/registry.jsonl, .agents/rpi/next-work.jsonl, .agents/evolve/cycle-history.jsonl) were dropping out of git unnoticed between sessions, because gitignore precedence lets patterns in a deeper .gitignore override those in a parent (last match wins), so the child's * beat every root-level re-include.

The prior hook only gated CREATION of the child file. Once a stale deny-all child existed (from an older hook version, external tooling, or a pre-fix session), nothing would remove it.

What changed

  • hooks/session-start.sh (and embedded copy): switch from one-shot guard to reconciler. When the parent .gitignore has a !/.agents/ allowlist, REMOVE any existing deny-all child. Otherwise (external repos), keep installing the safety belt.
  • tests/hooks/test-hooks.bats: 2 new regression tests
    • "removes stale deny-all child when parent has allowlist" — locks the reconciler
    • "preserves non-deny-all child gitignore" — locks reconciler scope (only * pattern is targeted)

All 86 hook bats tests pass; 226 hook shell tests still pass.

Verification

$ rm .agents/.gitignore && bash hooks/session-start.sh
$ git check-ignore -v .agents/nightly/foo.txt
.gitignore:140:!/.agents/nightly/**    .agents/nightly/foo.txt

The parent allowlist now governs as intended.

Beads

Closes soc-rv5p (filed as the pre-flight finding from the positioning doc sweep — the dogfood receipts claim becomes durable now).

Test plan

  • bats tests/hooks/test-hooks.bats — 86/86 pass
  • bash tests/hooks/test-hooks.sh — 226/226 pass
  • cd cli && make sync-hooks — embedded copy in sync
  • CI validate workflow passes

…llowlist (soc-rv5p)

The session-start hook installs a `*` deny-all `.agents/.gitignore` only
when the parent .gitignore lacks an allowlist for `/.agents/` paths. But
the prior condition `[ ! -f ... ] && ! grep -qE '^!/?\.agents/'` only
gated CREATION — it did nothing when a stale deny-all child file already
existed (left from an older hook version, an external tool, or a
pre-fix session). Once present, the deny-all child silently overrode
every parent `!/.agents/...` re-include via gitignore's last-match-wins
precedence, hiding force-tracked audit-truth artifacts:

  $ git check-ignore -v .agents/nightly/2026-05-07/baseline-goals.json
  .agents/.gitignore:2:*    .agents/nightly/2026-05-07/baseline-goals.json

This was the corpus-durability bug filed as soc-rv5p — the moat claim in
README/PRODUCT.md is fragile if the audit corpus can drop off git
invisibly between sessions.

Fix: switch from a one-shot guard to a reconciler.

- Root has `!/.agents/` allowlist → REMOVE any deny-all child (audit
  truth wins; this catches stale child files from older hook versions
  or external tooling)
- Root has no allowlist → INSTALL the deny-all child (external repos
  embedding AgentOps still get the safety belt)
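
The two branches above can be sketched as a small shell function. This is an illustration only, not the code in hooks/session-start.sh: the function names, the header comment written into the child file, and the definition of "deny-all" (the only non-comment, non-blank rule is `*`) are assumptions. The `grep` pattern for the parent allowlist is the one quoted in this commit message.

```shell
# Hypothetical helper: true when the file's only effective rule is `*`.
# Assumption: comments and blank lines do not count as rules.
is_deny_all() {
  local rules
  rules="$(grep -vE '^[[:space:]]*(#|$)' "$1" || true)"
  [ "$rules" = "*" ]
}

# Hypothetical reconciler: converge .agents/.gitignore on the desired state
# every time it runs, instead of only guarding creation.
reconcile_agents_gitignore() {
  local root="${1:-.}"
  local parent="$root/.gitignore"
  local child="$root/.agents/.gitignore"

  if [ -f "$parent" ] && grep -qE '^!/?\.agents/' "$parent"; then
    # Parent allowlist present: remove a stale deny-all child, but leave
    # any user-authored child rules untouched.
    if [ -f "$child" ] && is_deny_all "$child"; then
      rm -f "$child"
    fi
  elif [ ! -f "$child" ]; then
    # No allowlist (external repo): install the deny-all safety belt.
    mkdir -p "$root/.agents"
    printf '%s\n' '# installed by session-start hook (assumed header)' '*' > "$child"
  fi
}
```

Calling `reconcile_agents_gitignore .` repeatedly is intended to be idempotent: once the child file is in the state the parent calls for, further runs change nothing.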

Tests:

- "removes stale deny-all child when parent has allowlist" — locks the
  reconciler against regression
- "preserves non-deny-all child gitignore" — locks reconciler scope
  (only the deny-all `*` pattern is targeted; user-authored child rules
  survive)

Verified locally:

  $ rm .agents/.gitignore && bash hooks/session-start.sh
  $ git check-ignore -v .agents/nightly/foo.txt
  .gitignore:140:!/.agents/nightly/**    .agents/nightly/foo.txt

The parent allowlist now governs as intended.

Closes soc-rv5p.
@boshu2 boshu2 merged commit 2f3630a into main May 8, 2026
4 checks passed
@boshu2 boshu2 deleted the fix/soc-rv5p-corpus-durability branch May 8, 2026 01:40
boshu2 added a commit that referenced this pull request May 8, 2026
…nts (soc-bvhn)

Two eval pins drifted silently when tests were added:
  - evals/agentops-core/hook-lifecycle-behavior.json    "Total: 226"
  - evals/agentops-core/pre-push-gate-governance.json   "1..51"
The 226-pin only counted tests/hooks/test-hooks.sh and missed the 86-test
test-hooks.bats suite entirely; the 1..51-pin was a manually-bumped
plan line (commit bd10427 was 50→51 after PR #263 added a test).

Replace both with a new expectation type stdout_contains_auto_detect that:
  1. Re-executes a source-of-truth command (e.g. `bash tests/hooks/test-hooks.sh`)
  2. Captures a regex group from its combined stdout+stderr
  3. Verifies the case stdout contains the FULL match (drift = fail loud)
  4. Honors tolerance_pct (default 0; strict-by-default)
  5. Pre-checks the command's binary on PATH so missing `bats` etc. fails
     with an explicit "not found on PATH" message instead of regex miss
  6. Inherits the case's cwd / suite-dir + timeout policy

Drift surfaces a `BUMP the pinned value to "Total: N"` one-liner so the
operator can react in one edit. Migrated both pins to use the new type.
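
A rough shell sketch of the check described above, under stated assumptions: the real expectation type is implemented in Go, and `auto_detect_check`, its argument order, and its return codes are hypothetical. The sketch uses the whole regex match rather than a capture group and omits tolerance_pct, so it corresponds to the strict default only.

```shell
# Hypothetical helper: auto_detect_check SOURCE_CMD REGEX CASE_STDOUT
# Returns 0 on match, 1 on drift, 2 when the binary is missing,
# 3 when the regex matches nothing in the source command's output.
auto_detect_check() {
  local source_cmd="$1"
  local pattern="$2"
  local case_stdout="$3"
  local bin="${source_cmd%% *}"
  local live

  # Pre-check the binary so a missing tool is reported as such,
  # not as a confusing regex miss.
  if ! command -v "$bin" >/dev/null 2>&1; then
    printf '%s\n' "auto_detect: '$bin' not found on PATH" >&2
    return 2
  fi

  # Re-run the source of truth and take the first match from stdout+stderr.
  live="$(sh -c "$source_cmd" 2>&1 | grep -oE -- "$pattern" | head -n 1 || true)"
  if [ -z "$live" ]; then
    printf '%s\n' "auto_detect: pattern '$pattern' matched nothing" >&2
    return 3
  fi

  # Whole-word fixed-string match, so "Total: 22" does not pass against
  # a pinned "Total: 226".
  if printf '%s\n' "$case_stdout" | grep -qwF -- "$live"; then
    return 0
  fi
  printf '%s\n' "BUMP the pinned value to \"$live\"" >&2
  return 1
}
```

For example, `auto_detect_check "bash tests/hooks/test-hooks.sh" 'Total: [0-9]+' "$case_output"` would fail with the BUMP hint as soon as the live count moves away from the pinned one.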

Pre-mortem M3 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md)
patch is enforced indirectly: validExpectationType registers the new kind,
and the parser rejects empty command/pattern at load time.

Tests: 8 new in expectations_auto_detect_test.go covering pass-on-match,
fail-loud-on-drift, missing-binary-fails-loud, pattern-no-match, the
bats `1..N` plan-line shape, missing-command rejection, tolerance allowing
numeric drift, and registry membership.

Closes soc-bvhn from epic soc-xlw8.
boshu2 added a commit that referenced this pull request May 8, 2026
* docs(mkdocs): fix 5 anchor mismatches + 1 unresolvable relative link (soc-2yx9)

mkdocs slugify collapses em-dash spacing in headings to a single hyphen, so
'## A — B' produces id="a-b", not 'a--b'. Five doc cross-references encoded
the visual double-dash and were silently flagged INFO by `mkdocs build --strict`.

Fixes (slugs verified against built site/ HTML):
- docs/GLOSSARY.md       → how-it-works.md#ralph-wiggum-pattern-fresh-context-every-wave
- docs/cdlc.md (×2)      → the-science.md#part-6-the-convergence-cdlc-as-the-unifying-spine
- docs/context-lifecycle.md (×2) → #the-knowledge-ledger-session-to-session-flow
- docs/releases/2026-03-21-v2.28.0-notes.md → ../CHANGELOG.md#2280-2026-03-21

Plus one structural fix:
- docs/documentation-index.md → ../evals/workbench/ replaced with the
  github.com URL pattern used by every other extra-docs link in this file.
  evals/ lives outside docs_dir so mkdocs cannot resolve relative links there.

Acceptance: zero INFO/WARN/ERROR mentioning these 5 files in
`mkdocs build --strict`. Pre-existing INFO lines on templates/ and L*-*/
README directory links are out of scope (research §U4 / pre-mortem L3).

Closes soc-2yx9 from epic soc-xlw8.

* fix(goals): anchor cwd to GOALS.md repo root for `ao goals measure` (soc-crzz)

Reproduction: from /tmp, `ao goals measure --file <abs>/GOALS.md` returned
21/22 fails with "bash: scripts/check-*.sh: No such file or directory".
Goal Check strings are relative paths; bash inherited the caller's cwd.

Fix: at RunMeasure entry, derive the repo root from the absolute GoalsFile
path and chdir to it for the duration of the measurement; restore the prior
cwd via deferred no-op-safe restorer (`withGoalFileCwd`). When GoalsFile is
not inside a git repo, the helper returns a no-op so today's behavior is
preserved outside repos.
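
The same idea in shell form, as a sketch only: the actual fix is the Go helper `withGoalFileCwd`, and `run_from_goals_root` is a hypothetical analogue. It resolves the repo root with `git rev-parse --show-toplevel` and runs the command in a subshell, so the caller's cwd is restored automatically; an empty path or a file outside any repo falls through to the unchanged behavior.

```shell
# Hypothetical analogue of withGoalFileCwd:
#   run_from_goals_root GOALS_FILE COMMAND [ARGS...]
run_from_goals_root() {
  local goals_file="$1"
  shift
  local root=""

  # Resolve the repo root from the goals file's directory, if possible.
  if [ -n "$goals_file" ] && command -v git >/dev/null 2>&1; then
    root="$(git -C "$(dirname "$goals_file")" rev-parse --show-toplevel 2>/dev/null)" || root=""
  fi

  if [ -n "$root" ]; then
    # Subshell: the cd does not leak into the caller.
    ( cd "$root" && "$@" )
  else
    # Empty path or not inside a repo: run from the caller's cwd.
    "$@"
  fi
}
```

With this shape, a relative check such as `bash scripts/check-foo.sh` (an illustrative name) resolves against the repo that owns GOALS.md rather than against wherever the caller happened to be.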

Also expose `Paths.RepoRoot` from cli/internal/paths so future callers can
detect the resolved repo top without re-shelling git. The field is empty
when the resolver was given a directory not inside a repo, mirroring
`git rev-parse --show-toplevel` semantics.

Smoke (post-fix): `cd /tmp && ao goals measure --file <abs>/GOALS.md --json`
returns 22/22 pass, zero "No such file" errors.

Tests:
- TestRunMeasure_RelativeScripts_FromExternalCwd exercises the bug path
  end-to-end (fake git repo + relative-script goal + chdir to outside).
- TestWithGoalFileCwd_NoRepoFallback verifies the no-repo no-op.
- TestWithGoalFileCwd_EmptyPath verifies the empty-input no-op.

Per pre-mortem H1 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md).
Closes soc-crzz from epic soc-xlw8.

* feat(eval): add stdout_contains_auto_detect ratchet for live test counts (soc-bvhn)

Two eval pins drifted silently when tests were added:
  - evals/agentops-core/hook-lifecycle-behavior.json    "Total: 226"
  - evals/agentops-core/pre-push-gate-governance.json   "1..51"
The 226-pin only counted tests/hooks/test-hooks.sh and missed the 86-test
test-hooks.bats suite entirely; the 1..51-pin was a manually-bumped
plan line (commit bd10427 was 50→51 after PR #263 added a test).

Replace both with a new expectation type stdout_contains_auto_detect that:
  1. Re-executes a source-of-truth command (e.g. `bash tests/hooks/test-hooks.sh`)
  2. Captures a regex group from its combined stdout+stderr
  3. Verifies the case stdout contains the FULL match (drift = fail loud)
  4. Honors tolerance_pct (default 0; strict-by-default)
  5. Pre-checks the command's binary on PATH so missing `bats` etc. fails
     with an explicit "not found on PATH" message instead of regex miss
  6. Inherits the case's cwd / suite-dir + timeout policy

Drift surfaces a `BUMP the pinned value to "Total: N"` one-liner so the
operator can react in one edit. Migrated both pins to use the new type.

Pre-mortem M3 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md)
patch is enforced indirectly: validExpectationType registers the new kind,
and the parser rejects empty command/pattern at load time.

Tests: 8 new in expectations_auto_detect_test.go covering pass-on-match,
fail-loud-on-drift, missing-binary-fails-loud, pattern-no-match, the
bats `1..N` plan-line shape, missing-command rejection, tolerance allowing
numeric drift, and registry membership.

Closes soc-bvhn from epic soc-xlw8.

* feat(reconcile): observation-log aggregator unblocks Wave 1E (soc-ejq2)

Wave 1E (soc-f42z9) of the Reconciliation Engine epic cannot promote the
factory-claim-ledger validator from advisory to blocking until ≥20 advisory
runs are aggregated to .agents/reconcile/observation-log.jsonl. The CI job
(`factory-claim-ledger-strict (advisory)` at validate.yml:450-565) already
emits per-run observation JSON as a workflow artifact; nothing aggregated
them. This builds the missing pull-mode aggregator.

scripts/aggregate-observation-log.sh:
  1. `gh run list --workflow validate.yml ...` for recent runs
  2. `gh run download <run-id> -n factory-claim-ledger-observation` per run
     (silently skips runs that predate the advisory job)
  3. Schema-validates EVERY observation BEFORE dedup (pre-mortem M3 patch:
     `jq unique_by(.run_id)` silently collapses null-key entries — we reject
     malformed observations up front with `jq -e all(.run_id and ...)`)
  4. Dedups on run_id, atomic write to .agents/reconcile/observation-log.jsonl
  5. Backfills merged_anyway and ledger_updated:
     - pr_number=null (push-to-main): both false. Push-to-main observations
       satisfy the no-silent-merge criterion by definition.
     - else: `gh pr view` for state + `git log` for ledger-touch detection
  Empty-output case (no runs yet): produces valid empty JSONL, exits 0.
  Idempotent: re-running does not duplicate; verified by L2 test.
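
Steps 3 and 4 can be illustrated with a short jq-based sketch. It assumes `jq` is installed and validates only `run_id`; the real script checks further fields that this message elides, and `dedup_observations` is a hypothetical name. The point is ordering: records with a missing `run_id` all share the key `null`, so `unique_by(.run_id)` would keep one of them and drop the rest without complaint unless they are rejected first.

```shell
# Hypothetical helper: validate a JSONL file, then dedup on run_id.
# Prints deduplicated JSONL to stdout; returns 1 on malformed input.
dedup_observations() {
  local input="$1"

  # Validate first: every record must carry a non-null run_id.
  if ! jq -s -e 'all(.[]; .run_id != null)' "$input" >/dev/null; then
    printf '%s\n' "malformed observation: missing run_id" >&2
    return 1
  fi

  # Safe to dedup now that no record has a null key.
  jq -s -c 'unique_by(.run_id) | .[]' "$input"
}
```

An atomic write then amounts to sending this output to a temporary file in the target directory and renaming it over the log with `mv` once the function has succeeded.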

Skeleton .agents/reconcile/promotion-decision.md (gitignored, not in repo)
seeded by the script for operator to fill in false-positive baseline.

L1+L2 fixture-driven tests (9/9 pass): empty-input, dedup, schema rejection
of null run_id, push-to-main backfill semantics, --dry-run no-write,
idempotent byte-identical re-runs.

Adds `reconcile` to docs/contracts/agents-write-surfaces.md (prose table
and BEGIN/END allowlist) — required for `check-agents-write-surfaces.sh`
to pass after the new write surface lands.

NOT in scope (deferred to follow-up issues):
  - Scheduled .github/workflows/aggregate-observations.yml workflow
  - Operator-filled promotion-decision.md baseline
  - Wave 1E gate flip itself (soc-f42z9)

Per pre-mortem M3 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md).
Closes soc-ejq2 from epic soc-xlw8.

* feat(crank): wire CI-policy parity gate into wave acceptance (soc-il9k)

Prevents recurrence of c587b36-style manual fixes. Wave 1C of soc-e4ulx
added factory-claim-ledger-strict (advisory) to .github/workflows/validate.yml
without updating AGENTS.md or `summary.needs:`. Codex-team caught and fixed
the drift manually. This PR formalizes that recovery as a conditional
crank wave-acceptance gate, so the next wave that touches workflow YAML
fails loud at acceptance instead of slipping through.

Insertion (canonical and codex-mirrored):
  - skills/crank/SKILL.md Step 5.5: new "CI-Policy Parity Gate (conditional)"
    sub-section pointing to the worked example.
  - skills/crank/references/wave-patterns.md: appended worked example with
    drift→fix transcript referencing commit c587b36.
  - skills/swarm/references/local-mode.md (worker prompt template): added
    section 4 to CONDITIONAL PREFLIGHT CHECKS so workers run the gate
    proactively, not only at acceptance.
  - skills-codex/* parity copies + regenerated codex hashes.

Trigger (narrow grep — yaml workflow files only):
  if git diff --name-only HEAD~1 -- | grep -qE '^\.github/workflows/.*\.ya?ml$'; then
      bash scripts/validate-ci-policy-parity.sh || exit 1
  fi

CODEOWNERS-only or markdown-only changes do not trip the gate. The
validator (scripts/validate-ci-policy-parity.sh, 186 LOC, unchanged) is a
read-only dependency.

Golden fixture tests/integration/test-ci-policy-parity-wave-gate.sh
(9/9 pass): SKILL doc presence, trigger-pattern shape, worked-example
linkage, worker-template wiring, codex parity, validator presence,
drifted-fixture exits 1, aligned-fixture exits 0, live-repo PASS.

Per planning rule finding-2026-05-07-ci-parity-as-wave-acceptance — this
PR IS the formalization of that rule (self-referential by design).

Closes soc-il9k from epic soc-xlw8.

* feat(reconcile): thesis-stability gate + Wave 0 snapshot (soc-mt50)

Gates the Reconciliation Engine arc transition from Wave 1 to Wave 2
(soc-r3y8b) by diffing current README/PRODUCT/GOALS hero sections against
the snapshot frozen at Wave 0 close (commit ab479e2 — "docs(positioning):
wiki-framing sweep across all surface docs (soc-9xn0)").

scripts/check-thesis-stability.sh:
  Hero extractor anchored to '^## ' (with trailing space) per pre-mortem M2
  patch — code-fence-safe, excludes the H2 boundary line itself:
    awk 'NR==1, /^## / {if (!/^## /) print}' <file>
  Per-file diff against the snapshot's fenced ```...``` block. Operator
  decides accept/re-brainstorm/incidental on FAIL.
  Exit codes: 0 = no drift, 1 = drift (operator decision required), 2 =
  precondition error (snapshot missing, source files missing, or snapshot
  carries a literal `^<!-- WAVE_0_TODO -->$` line — anchored regex prevents
  prose mentions of the marker from triggering the precondition gate, per
  pre-mortem L6).
  --json mode emits {verdict, drifted, decision_template} for tooling.
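
A minimal sketch of the extractor and exit-code contract described above. The awk program is the one quoted in this message; everything else is an assumption for illustration. In particular, `check_hero` compares against a plain-text file holding one hero block, whereas the real script reads the fenced block out of the snapshot markdown, and the `--json` mode is not covered.

```shell
# Print everything before the first H2 heading, excluding the heading itself.
extract_hero() {
  awk 'NR==1, /^## / { if (!/^## /) print }' "$1"
}

# Hypothetical check: check_hero SNAPSHOT_HERO_FILE SOURCE_FILE
# Exit codes follow the contract above: 0 no drift, 1 drift, 2 precondition error.
check_hero() {
  local snapshot="$1"
  local source_file="$2"

  # Precondition: both files must exist.
  if [ ! -f "$snapshot" ] || [ ! -f "$source_file" ]; then
    return 2
  fi
  # Precondition: an unfilled snapshot is marked by a line that is exactly
  # the TODO marker; prose that merely mentions it does not match.
  if grep -qE '^<!-- WAVE_0_TODO -->$' "$snapshot"; then
    return 2
  fi

  # Diff the frozen hero against the current one; the diff goes to stdout.
  if extract_hero "$source_file" | diff -u "$snapshot" -; then
    return 0
  fi
  return 1
}
```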

Snapshot artifact (.agents/reconcile/wave-0-thesis-snapshot.md):
  Frozen verbatim hero blocks for README.md / PRODUCT.md / GOALS.md at
  closure SHA ab479e2. Header documents extraction contract + regen path.

Decision template (.agents/reconcile/thesis-stability-decision.md):
  Operator-fillable record. Three explicit options (accept / re-brainstorm
  / incidental). Re-validation matrix for Waves 2-4 acceptance criteria
  on accept-drift path.

.gitignore:
  Adds the snapshot + decision-template paths to the .agents/ allowlist
  alongside existing exceptions (rpi/next-work.jsonl, nightly/, evolve/,
  goals/, findings/registry.jsonl). observation-log.jsonl and
  promotion-decision.md remain runtime state (generated by aggregator and
  operator fill-in respectively) and stay ignored.

Smoke (run during implementation):
  - bash scripts/check-thesis-stability.sh        → exit 0 (no drift today)
  - bash scripts/check-thesis-stability.sh --help → usage emitted, exit 0
  - inject SMOKE-DRIFT into README hero, re-run   → exit 1 with diff
  - revert, re-run                                → exit 0

Per pre-mortem M2 + L6 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md).
Closes soc-mt50 from epic soc-xlw8.

* ci(reconcile): allowlist thesis-stability snapshot + decision artifacts (soc-mt50 follow-up)

PR-E (06f416e) added two static contract artifacts under .agents/reconcile/
that need to be tracked: wave-0-thesis-snapshot.md (frozen at Wave 0 SHA
ab479e2) and thesis-stability-decision.md (operator-fillable template).

The .gitignore exception alone is not sufficient — scripts/check-no-tracked-agents.sh
maintains a narrow audit-truth allowlist of paths under .agents/ that may
be tracked. Without an allowlist update, the pre-push-gate's "tracked
.agents state" check fails on these two new files.

Extends ALLOWED_PATHS_REGEX and ALLOWED_REINCLUDES_REGEX to include the
two reconcile paths. observation-log.jsonl + promotion-decision.md
intentionally remain runtime state (generated by the aggregator and
operator fill-in respectively) and are NOT added to the allowlist.

Updates the file's header comment to document the new reconcile entry
in the audit-truth allowlist surface.

Acceptance:
  - bash scripts/check-no-tracked-agents.sh → "no disallowed tracked repo-root .agents state"
  - bash scripts/pre-push-gate.sh --fast    → exit 0 (46 skipped, 0 failed)

Per CLAUDE.md "All exceptions require a coordinated update of .gitignore +
this allowlist + CLAUDE.md/PROGRAM.md guidance." (Reconciliation Engine
arc context covered in PR-E commit message.)

* chore(rpi): mark soc-9xn0 + soc-e4ulx next-work entries consumed (soc-ot2m)

Closes the housekeeping tail of epic soc-xlw8. Both unconsumed entries in
.agents/rpi/next-work.jsonl flip to consumed=true, claim_status=consumed,
consumed_by=soc-xlw8, consumed_at=2026-05-08T09:30:00-04:00.

Mapping (10 raw items → 7 work units → 6 PRs; U1 already resolved by
Wave 1A pre-session, no PR needed):
  U2 BATS count audit gap            → PR-C f75201f (soc-bvhn)
  U3 ao goals measure cwd            → PR-B 04e585e (soc-crzz)
  U4 mkdocs link rot + anchors       → PR-A cf58467 (soc-2yx9)
  U5 observation-log aggregator      → PR-D af681e9 (soc-ejq2)
  U6 thesis-stability gate           → PR-E 06f416e (+93e4cf05)
  U7 CI-policy parity in crank       → PR-F a2fb6a2 (soc-il9k)

Closure note (gitignored, local provenance):
  .agents/rpi/closure-2026-05-07-next-work-items.md

Also adds .agents/findings/registry.jsonl with three reusable findings
generated during this session (the file was untracked due to .agents/
gitignore precedence; the audit-truth allowlist already permits it):
  - finding-2026-05-07-mkdocs-slug-observed-not-asserted
  - finding-2026-05-07-sed-range-includes-boundary-line
  - finding-2026-05-07-jq-unique-by-collapses-null-keys

Acceptance:
  - bash scripts/validate-next-work-contract-parity.sh → exit 0
  - jq -c . < .agents/rpi/next-work.jsonl              → valid JSONL, 2 entries
  - bash scripts/check-no-tracked-agents.sh            → no disallowed paths
  - bash scripts/pre-push-gate.sh --fast               → exit 0 (46 skipped)

Per pre-mortem L5 (.agents/council/2026-05-07-pre-mortem-drain-open-next-work-items.md):
the validator was already passing pre-PR (the phase-2-summary's claim of
broken parity was stale). Acceptance is steady-state ("validator continues
to exit 0 after marking entries consumed"), not transition.

Closes soc-ot2m and epic soc-xlw8.

* chore(reconcile): record Wave 1E promotion-decision baseline (Option B defer)

Filled .agents/reconcile/promotion-decision.md from skeleton with computed
baseline from observation-log.jsonl: 3 obs / 0 fail / 2 PRs / vacuous FP
rate. Sample 3 < 20 threshold and PRs 2 < 3 threshold do not justify
promotion to blocking; first-day post-aggregator data is also clustered
within ~4h. Selected Option B (defer/extend sampling) with explicit
re-evaluation trigger documented (>=20 runs, >=3 PRs, >=1 fail).
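
The re-evaluation trigger is a conjunction of three thresholds, which can be written out as a tiny predicate. `ready_to_promote` is a hypothetical name used only here; the numbers are the ones stated above.

```shell
# Hypothetical predicate: ready_to_promote RUNS PRS FAILS
# Succeeds only when all three re-evaluation thresholds are met.
ready_to_promote() {
  local runs="$1"
  local prs="$2"
  local fails="$3"
  [ "$runs" -ge 20 ] && [ "$prs" -ge 3 ] && [ "$fails" -ge 1 ]
}
```

Applied to the recorded baseline, `ready_to_promote 3 2 0` fails on the first threshold, which matches the decision to defer.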

validate.yml gate stays advisory (continue-on-error: true intact). No
behavior change to CI. Re-evaluation tracked under bd soc-nsuq, which
blocks Wave 1E parent soc-f42z9.

Audit-truth allowlist extended in both .gitignore and
scripts/check-no-tracked-agents.sh to track the decision artifact, mirroring
the thesis-stability-decision.md allowlist pattern from 93e4cf0. The
promotion-decision is committed input to the Wave 1E gate-flip workflow,
not runtime state.

Refs: soc-xlw8 (RPI follow-up), soc-f42z9 (Wave 1E parent), soc-nsuq
(re-evaluation), soc-ejq2 (aggregator that produced the baseline).