docs: restructure firstmate agent guidance #79

Merged
kunchenguid merged 2 commits into main from fm/agents-restructure-q4
Jun 25, 2026

Conversation

@kunchenguid

Owner

Intent

Restructure firstmate's safety-critical AGENTS.md to be tighter and partly skill-extracted without behavioral loss, following the audit report at /Users/kunchen/github/kunchenguid/firstmate/data/agents-trim-a3/report.md plus the captain's deviations. Implement all lossless consolidation in AGENTS.md; create exactly three new agent-only skills with user-invocable: false for harness-adapters, stuck-crewmate-recovery, and secondmate-provisioning; fold afk daemon internals into the existing user-invocable /afk skill instead of creating afk-supervision; keep backlog/tasks-axi command mapping and project clone/create/init mechanics inline; preserve load-bearing safety reinforcements for never writing projects, yolo escalation limits, teardown protection, secondmate idle behavior, afk approval authority, and watcher drain-first/re-arm-last behavior. Preserve CLAUDE.md and .claude/skills symlinks, validate shell syntax and tests, then publish through no-mistakes as a PR for captain review.

What Changed

  • Split safety-critical agent guidance out of AGENTS.md into dedicated agent-only skills for harness adapters, stuck crewmate recovery, and secondmate provisioning, while retaining the existing user-invocable /afk skill for away-mode supervision internals.
  • Tightened the main firstmate manual to keep core lifecycle, project intake, delivery-mode, backlog, watcher, yolo, teardown, and safety rules inline while removing duplicated procedural detail.
  • Updated firstmate docs and helper scripts so spawned agents reference the restructured terminology and skill layout consistently.

Risk Assessment

✅ Low: Captain, risk is low because this is a docs-only extraction and the load-bearing procedures are either still inline or moved into explicitly triggered, non-user-invocable skills.

Testing

Inspected the target/base diff, ran the full existing shell behavior suite live and again into an evidence transcript, then generated a contract evidence report for the doc and skill surfaces; all checks passed, the worktree stayed clean, and separate shell syntax/static checks were not run because the prompt forbade static analysis.

Evidence: Behavior test transcript
# Firstmate AGENTS restructure behavior test transcript

Worktree: /Users/kunchen/.no-mistakes/worktrees/016d88035d58/01KVYS9QTF0ZWS5K5TEVQMHYQA
Base: e4d236bdc96be74863fe59b0a333bb94fb4cf3dd
Target: 12563ea06c0d84a87cf56d5ef25785c2d896852f


>>> tests/fm-afk-inject-e2e.test.sh
ok - Scenario A: partial input defers injection; digest arrives clean after idle
ok - Scenario B: swallowed Enter produces exactly one clean digest
all e2e injection tests passed

>>> tests/fm-bootstrap.test.sh
ok - bootstrap accepts treehouse get --lease support
ok - bootstrap reports treehouse without get --lease support
ok - bootstrap reports compatible optional tasks-axi availability
ok - bootstrap ignores incompatible optional tasks-axi

>>> tests/fm-composer-ghost.test.sh
ok - fm_tmux_strip_ghost drops dim/faint runs, keeps normal and bold text
ok - fm_tmux_strip_ghost handles combined SGR, ESC[22m, and reset-then-dim
ok - fm_tmux_strip_ghost keeps colored text with 2 payloads
ok - fm_pane_input_pending: a dim ghost-only composer is NOT pending
ok - fm_pane_input_pending: dim ghost inside a bordered composer is NOT pending
ok - fm_pane_input_pending: normal-intensity typed text is still pending
ok - fm_pane_input_pending: colored text with 2 payloads is still pending
ok - fm_pane_input_pending: real text plus a trailing ghost run is still pending
ok - fm-peek output is escape-free (no raw -e bytes reach firstmate context)

>>> tests/fm-secondmate.test.sh
ok - FM_HOME parameterizes data and state paths
ok - fm-lock status is scoped per home
ok - secondmates registry records scopes and allows overlapping project clone lists
ok - home seeding records routing scope from filled charter briefs
ok - home seed validation rejects duplicate home routes
ok - home seed validation rejects duplicate id routes
ok - home seed validation rejects nested home routes
leased worktree for dash
ok - home seeding durably leases treehouse-acquired dash homes under the secondmate id
ok - home seeding returns rejected acquired homes through treehouse
ok - home seed rollback warns when treehouse-acquired return fails
ok - home seeding leaves unsafe acquired active homes untouched
ok - home seeding rolls back failed clone attempts without residue
ok - home seeding refuses direct seed without filled charter text
ok - home seeding refuses unfilled placeholder charters
ok - home seeding refuses empty normalized charter fields
ok - home seeding refuses local-only projects
ok - home seeding refuses registry delimiter home paths
ok - home seeding refuses active home and repo root
ok - home seeding refuses homes marked for another id
ok - home seeding refuses homes registered to another id
ok - home seeding refuses same-id reassignment to a different home
ok - home seeding refuses registered home overlaps
ok - remote-backed subhome seeding requires a source origin
ok - remote-backed subhome seeding validates existing destination origins
ok - home seeding resolves relative source origins against the source project
ok - home seeding skips initialized existing no-mistakes clones
ok - home seeding refuses uninitialized existing no-mistakes clones
ok - home seeding refuses project destinations outside the subhome
ok - home seeding refuses operational directories outside the subhome
ok - home seeding refuses symlinked leaf files
ok - kind=secondmate spawn launches in the home and records routing meta
ok - secondmate spawn validates homes before launch
ok - secondmate spawn refuses operational directories outside the subhome
ok - fm-send resolves bare firstmate windows through this home
ok - restart recovery can respawn a secondmate from durable registry and charter
ok - secondmate teardown retires empty homes and releases routing
ok - secondmate teardown refuses to hide failed leased-home return
ok - secondmate teardown raw-removes plain-clone homes
ok - secondmate force teardown discards child work
ok - force teardown allows operational directory symlinks inside the subhome
ok - force teardown refuses operational directory symlinks outside the subhome
ok - secondmate teardown requires seeded home marker
ok - secondmate teardown refuses homes containing registered nested homes
ok - secondmate teardown refuses nested homes from the child registry
ok - force teardown validates subhome before child cleanup
ok - force teardown refuses child worktrees inside the active home
ok - force teardown refuses child worktrees inside the firstmate repo
ok - force teardown refuses unregistered child worktree paths
ok - secondmate teardown refuses ancestor homes
ok - secondmate teardown refuses descendant homes
ok - idle kind=secondmate pane is healthy and not stale
ok - secondmate charter brief is idle by default and does not self-initiate work
ok - fm-backlog-handoff moves in-scope items, is idempotent, and aborts safely
ok - fm-backlog-handoff creates absent sections and refuses unsafe homes

>>> tests/fm-spawn-batch.test.sh
ok - batch dispatch re-execs and reports every id=repo pair
ok - a single id=repo pair routes through batch dispatch
ok - single-task invocation (no '=') is untouched by batch detection
ok - batch dispatch rejects an argument that is not id=repo
ok - an arg whose id part contains '/' is not treated as a batch pair
ok - FM_HOME scopes projects/ paths for single-task spawn
ok - FM_PROJECTS_OVERRIDE scopes projects/ paths for single-task spawn

>>> tests/fm-teardown.test.sh
ok - local-only worktree with HEAD on a fork remote is torn down (fix holds)
ok - teardown prompts tasks-axi backlog refresh when compatible
ok - local-only worktree with truly unpushed work is refused (safety preserved)
ok - local-only worktree with work merged into local main is torn down (no regression)
ok - no-mistakes worktree with HEAD on origin is torn down (no regression)
ok - no-mistakes worktree with truly unpushed work is refused (no regression)
ok - local-only worktree with unpushed work is torn down under --force (escape hatch)

>>> tests/fm-update.test.sh
ok - T1 main + secondmate fast-forward, reread + nudge signalled
ok - T2 advance is a fast-forward, not a merge commit
ok - T3 reread gates on instruction surface, nudge on advancement
ok - T4 dirty secondmate skipped, local edit preserved
ok - T5 diverged secondmate skipped, local commit preserved
ok - T6 idempotent: a second run is a no-op
ok - T7 secondmate resolved from registry without inventing a window
ok - T8 deduped homes and excluded the firstmate repo itself
ok - T9 firstmate off its default branch is skipped, not forced
ok - T10 firstmate detached HEAD is skipped
ok - T11 unsafe secondmate home is not fast-forwarded
# all fm-update tests passed

>>> tests/fm-wake-queue.test.sh
ok - supervise daemon state root is scoped by FM_HOME
ok - concurrent append plus drain preserves queue records
ok - signal written while no watcher runs is caught on next run
ok - stale wake is queued before suppressor state is advanced
ok - check output is queued before cadence suppression
ok - simultaneous watcher starts leave exactly one live process
ok - two atomic drains cannot consume the same records twice
ok - drain collapses obvious duplicate heartbeat and signal records
ok - killed watcher stale lock is reclaimed
ok - live watcher lock with stale heartbeat is actionable
ok - concurrent fm_lock_try_acquire yields exactly one winner
ok - dead-pid stale lock is reclaimed by a single acquirer
ok - concurrent stale-lock steal yields exactly one winner
ok - live steal mutex is not reclaimed
ok - live-held lock is not stolen
ok - empty mid-acquire lock keeps a minimum grace
ok - late original claimant cannot claim a recreated lock
ok - paused mid-acquire claimant backs off to active stealer
ok - watch restart refuses to signal a reused pid
ok - watcher self-evicts when the lock pid no longer names it
ok - arm reports a live fresh watcher as healthy and exits zero
ok - arm starts a watcher and confirms it live before reporting started
ok - arm cleans child watcher and temp output on HUP
ok - arm propagates an immediate watcher wake before confirmation
ok - arm waits for a peer watcher beacon after child stands down
watcher: lock held by live pid 84552 but heartbeat is stale for 835657785s (>300s); inspect or stop that watcher before re-arming.
ok - arm reports FAILED and exits non-zero when no fresh watcher can be confirmed
ok - arm never reports healthy off a dead pid and self-heals into a confirmed watcher
ok - guard warns when queued wakes are pending
ok - guard orders watcher re-arm after queued wake drain
ok - guard leads with a prominent no-watcher banner before the queued-wakes warning
ok - guard stays silent when a fresh watcher is alive and no wakes are queued
ok - routine signal self-handles
ok - captain-relevant status verbs escalate
ok - check + unknown escalate; heartbeat self-handles
ok - transient stale self-handles and records a persistence marker
ok - stale + terminal status escalates immediately
ok - persistent stale escalates after threshold and clears its marker
ok - resumed (busy) stale clears its marker without escalating
ok - multiple escalations flush as a single batched digest
ok - batch flush measures max-delay from the first append, not the last
ok - catch-all scan escalates a missed terminal once, not twice
ok - handle_wake routes routine->self and captain->escalate
ok - INJECT_SKIP forces self-handle, bypassing captain-relevant classification
ok - is_wake_reason distinguishes watcher wake reasons from singleton-status stdout
ok - terminal-stale escalate removes its marker so housekeeping does not re-escalate
ok - captain signal escalate marks seen so the catch-all scan does not re-fire
ok - _collapse_newlines replaces newlines with literal separator
ok - afk flag absent: daemon does not inject, buffer preserved
ok - afk flag present: daemon injects with sentinel marker prefix
ok - injected digest is single-line (no embedded newlines)
ok - busy-guard defers injection when supervisor pane is busy
ok - marker detection: marker -> stay afk, no marker -> exit afk
ok - /afk invocation is exempt from afk exit (no self-cancel)
ok - should_exit_afk returns false when afk is not active
ok - strip_injection_marker removes the sentinel marker cleanly
ok - pane_input_pending detects partial input on the cursor line
ok - pane_input_pending: blank cursor line is not pending
ok - pane_input_pending: bare prompts are not pending (idle)
ok - pane_input_pending honors FM_COMPOSER_IDLE_RE after border stripping
ok - composer guard defers injection when pane has pending input
ok - swallowed Enter: type-once + Enter-retry, no concatenation
ok - normal inject: exactly one digest, one Enter, no duplicates
ok - classify_signal dedupes against the catch-all scan seen marker
ok - classify_stale dedupes against the signal path seen marker
ok - pane_input_pending: an idle bordered composer is NOT pending (afk-invx-i5)
ok - pane_input_pending: text inside a bordered composer is still pending
ok - submit-ACK confirms a submit when the composer returns to a bordered-empty box
ok - submit-ACK reports pending on a persistently swallowed Enter (type-once)
ok - max-defer on an empty stuck pane types once, alarms, and preserves the buffer
ok - max-defer flushes and clears the buffer on an empty bordered pane
ok - max-defer on a pending composer alarms without typing
ok - normal flush clears a stale wedge marker
ok - below MAX_DEFER: no inject, no alarm, buffer preserved
ok - max-defer does not flush or alarm while afk is inactive
ok - fm-send exits non-zero on a confirmed swallow, zero on a clean submit
ok - fm-send exits non-zero when initial text send fails

Evidence: AGENTS restructure contract evidence
# Firstmate AGENTS restructure contract evidence

Worktree: `/Users/kunchen/.no-mistakes/worktrees/016d88035d58/01KVYS9QTF0ZWS5K5TEVQMHYQA`

Base: `e4d236bdc96be74863fe59b0a333bb94fb4cf3dd`

Target: `12563ea06c0d84a87cf56d5ef25785c2d896852f`

## Changed files

```text
M	.agents/skills/afk/SKILL.md
A	.agents/skills/harness-adapters/SKILL.md
A	.agents/skills/secondmate-provisioning/SKILL.md
A	.agents/skills/stuck-crewmate-recovery/SKILL.md
M	AGENTS.md
```

## Symlink surface

```text
CLAUDE.md -> AGENTS.md
.claude/skills -> ../.agents/skills
```
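
A minimal sketch of how the two symlinks listed above could be re-checked from the repository root. `check_symlink` is a hypothetical helper written for illustration; it is not one of firstmate's scripts, and the actual evidence generator may work differently.

```shell
#!/usr/bin/env bash
# check_symlink LINK EXPECTED
# Hypothetical helper: succeed only when LINK is a symlink whose stored
# target is exactly EXPECTED (compared as written, not resolved).
check_symlink() {
  local link="$1" expected="$2" actual
  if [ ! -L "$link" ]; then
    echo "FAIL: $link is not a symlink"
    return 1
  fi
  actual="$(readlink "$link")"
  if [ "$actual" = "$expected" ]; then
    echo "PASS: $link -> $actual"
  else
    echo "FAIL: $link -> $actual (expected $expected)"
    return 1
  fi
}

# Usage from the repository root:
#   check_symlink CLAUDE.md AGENTS.md
#   check_symlink .claude/skills ../.agents/skills
```

Comparing the stored target rather than the resolved path keeps the check sensitive to a relative link being swapped for an absolute one.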

## Skill front matter

```text
.agents/skills/afk/SKILL.md
---
name: afk
description: Enter away-mode supervision. Use when the user invokes /afk (e.g. "/afk", "/afk back in an hour", "going afk"). Sets a durable away-mode flag so the sub-supervisor daemon can self-handle routine wakes and escalate only captain-relevant events as one batched digest, cutting supervision token cost during walk-away stretches. Exit is automatic; any real (unmarked) message returns to full per-wake responsiveness.
user-invocable: true
---

.agents/skills/harness-adapters/SKILL.md
---
name: harness-adapters
description: Agent-only reference for firstmate harness operations. Use before spawning or recovering a crewmate or secondmate, handling a trust dialog, sending a harness-specific skill invocation, interrupting or exiting an agent, resuming an exited agent, or verifying a new harness adapter. Contains verified facts for claude, codex, opencode, and pi.
user-invocable: false
---

.agents/skills/secondmate-provisioning/SKILL.md
---
name: secondmate-provisioning
description: Agent-only reference for persistent secondmate setup and retirement. Use when creating, seeding, validating, recovering, handing backlog to, or retiring a secondmate home, or when editing data/secondmates.md. Covers home leases, transactional seeding, project clone restrictions, idle charter, handoff helper, and teardown safety.
user-invocable: false
---

.agents/skills/stuck-crewmate-recovery/SKILL.md
---
name: stuck-crewmate-recovery
description: Agent-only playbook for stuck firstmate direct reports. Use after a stale wake, looping pane, repeated confusion, an answered-by-brief question, an unresponsive crewmate, or a failed steer. Escalates from peek, to one-line steer, to harness-specific interrupt, to relaunch with progress, to failed status.
user-invocable: false
---

.agents/skills/updatefirstmate/SKILL.md
---
name: updatefirstmate
description: Self-update a running firstmate and its secondmates to the latest from origin. Use when the captain invokes /updatefirstmate (e.g. "/updatefirstmate", "update firstmate", "pull the latest firstmate"). Fast-forwards this firstmate repo's default branch and every secondmate home from origin (fast-forward only, never forced, never disruptive), then re-reads AGENTS.md and nudges each updated secondmate to do the same, so the whole tree runs the latest bin/ and instructions.
user-invocable: true
---

```

## Contract checks

- PASS: the change adds exactly three new skill files
- PASS: new harness-adapters skill is agent-only
- PASS: new secondmate-provisioning skill is agent-only
- PASS: new stuck-crewmate-recovery skill is agent-only
- PASS: existing afk skill remains user-invocable
- PASS: CLAUDE.md still points at AGENTS.md
- PASS: .claude/skills still points at .agents/skills
- PASS: AGENTS keeps never-write safety reinforcement inline
- PASS: AGENTS keeps merge approval reinforcement inline
- PASS: AGENTS keeps teardown protection inline
- PASS: AGENTS keeps yolo escalation limits inline
- PASS: AGENTS keeps secondmate idle behavior inline
- PASS: AGENTS keeps project clone mechanics inline
- PASS: AGENTS keeps project create mechanics inline
- PASS: AGENTS keeps project initialization mechanics inline
- PASS: AGENTS keeps tasks-axi command mapping inline
- PASS: AGENTS keeps watcher drain-first instruction inline
- PASS: AGENTS keeps watcher re-arm-last instruction inline
- PASS: AGENTS keeps afk approval authority inline
- PASS: AGENTS references harness-adapters load trigger
- PASS: AGENTS references stuck-crewmate-recovery load trigger
- PASS: AGENTS references secondmate-provisioning load trigger
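
The agent-only checks above depend on reading `user-invocable` out of each skill's front matter. The following is a rough sketch of one way to do that, assuming the front matter is a leading `---` delimited block of simple `key: value` lines as in the excerpts in this report. Both function names are hypothetical and are not part of firstmate.

```shell
#!/usr/bin/env bash
# front_matter_value FILE KEY
# Hypothetical helper: print the value of KEY from the leading '---'
# delimited front matter of FILE. Prints nothing if FILE has no front
# matter or KEY is absent. Lines after the closing '---' are ignored.
front_matter_value() {
  local file="$1" key="$2"
  awk -v key="$key" '
    NR == 1 && $0 != "---" { exit }   # file does not start with front matter
    NR > 1 && $0 == "---"  { exit }   # closing delimiter: stop reading
    NR > 1 {
      prefix = key ": "
      if (index($0, prefix) == 1) {
        print substr($0, length(prefix) + 1)
        exit
      }
    }
  ' "$file"
}

# is_agent_only FILE
# Succeeds when the skill declares user-invocable: false.
is_agent_only() {
  [ "$(front_matter_value "$1" "user-invocable")" = "false" ]
}

# Usage from the repository root:
#   for f in .agents/skills/*/SKILL.md; do
#     is_agent_only "$f" && echo "agent-only: $f"
#   done
```

Matching on the literal `key: ` prefix instead of splitting on every colon keeps descriptions that themselves contain colons intact.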

## Inline anchor excerpts

```text
25:1. **Never write to a project.**
347:**yolo (orthogonal).** With `yolo=off` (default) every approval is the captain's: ask-user findings, PR merges, the local-only merge. With `yolo=on`, firstmate makes those calls itself without asking - resolve ask-user findings on your judgment, and run `gh-axi pr merge` / `bin/fm-merge-local.sh` once the work is green/approved - EXCEPT anything destructive, irreversible, or security-sensitive, which still escalates to the captain. Never merge a red PR even under yolo. After any merge you perform without asking the captain, post a one-line "merged <full PR URL or local main> after checks passed" FYI so the captain keeps a trail.
198:A secondmate is idle by default: it acts only on work the main firstmate routes to it.
409:At the start of every wake-handling turn and every recovery turn, run `bin/fm-wake-drain.sh` before peeking panes, reading status files beyond the reason line, or starting new work.
543:Map firstmate's real backlog operations to the approved commands:
583:- `harness-adapters` - load before spawning or recovering a crewmate or secondmate, handling a trust dialog, sending a harness-specific skill invocation, interrupting or exiting an agent, resuming an exited agent, or verifying a new harness adapter.
```
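
The yolo rule in the excerpt from AGENTS.md line 347 can be read as a small decision table. The sketch below is one possible encoding of that reading, not firstmate's implementation: the function name, the argument values (`on`/`off`, `green`/`red`, `yes`/`no`), and the four result labels are all assumptions introduced for illustration. Missing arguments default to the most conservative value.

```shell
#!/usr/bin/env bash
# merge_decision YOLO CHECKS SENSITIVE   (hypothetical helper)
#   YOLO:      on | off
#   CHECKS:    green | red
#   SENSITIVE: yes | no  (destructive, irreversible, or security-sensitive)
merge_decision() {
  local yolo="${1:-off}" checks="${2:-red}" sensitive="${3:-yes}"
  if [ "$checks" != "green" ]; then
    echo "hold"              # never merge a red PR, even under yolo
  elif [ "$yolo" != "on" ]; then
    echo "ask-captain"       # yolo=off: every approval is the captain's
  elif [ "$sensitive" != "no" ]; then
    echo "escalate"          # yolo=on still escalates sensitive changes
  else
    echo "merge-and-notify"  # merge, then post the one-line FYI
  fi
}

# Usage:
#   merge_decision on green no    # prints "merge-and-notify"
#   merge_decision off green no   # prints "ask-captain"
```

Checking the red-PR case first means no combination of the other inputs can reach a merge while checks are failing.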

Pipeline

Updates from git push no-mistakes

✅ **intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

✅ **Review** - passed

✅ No issues found.

✅ **Test** - passed

✅ No issues found.

  • git status --short && git rev-parse HEAD && git merge-base HEAD e4d236bdc96be74863fe59b0a333bb94fb4cf3dd
  • git diff --stat e4d236bdc96be74863fe59b0a333bb94fb4cf3dd..HEAD
  • git diff --name-status e4d236bdc96be74863fe59b0a333bb94fb4cf3dd..HEAD
  • find . -maxdepth 3 -type l -ls and readlink checks for CLAUDE.md and .claude/skills
  • for t in tests/*.test.sh; do bash "$t"; done
  • for t in tests/*.test.sh; do bash "$t"; done > /var/folders/5x/4nqprlbx0518k3ybcb1sz6gr0000gn/T/no-mistakes-evidence/01KVYS9QTF0ZWS5K5TEVQMHYQA/behavior-test-transcript.txt 2>&1
  • Generated /var/folders/5x/4nqprlbx0518k3ybcb1sz6gr0000gn/T/no-mistakes-evidence/01KVYS9QTF0ZWS5K5TEVQMHYQA/agents-restructure-contract-evidence.md to verify the three new agent-only skills, retained /afk user-invocable skill, symlinks, and inline safety anchors.
✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

@kunchenguid kunchenguid merged commit 9e21160 into main Jun 25, 2026
4 checks passed
@kunchenguid kunchenguid deleted the fm/agents-restructure-q4 branch June 25, 2026 07:25
leo1oel added a commit to leo1oel/nemo that referenced this pull request Jun 26, 2026
Port upstream kunchenguid#79's agent-guidance restructure, adapted to the Claude-only
herdr fork: relocate the detailed, situation-specific guidance out of AGENTS.md
into on-demand skill files, leaving slim pointers in the manual.

- .agents/skills/harness-adapters (new): the Claude Code adapter facts (busy
  signature, exit, interrupt, skill invocation, trust/bypass dialog, root/sudo
  IS_SANDBOX forwarding, ghost-text quirk + detector). Claude-only, herdr
  terminology - not upstream's four-harness, tmux version.
- .agents/skills/stuck-crewmate-recovery (new): the stuck-direct-report playbook.
- .agents/skills/secondmate-provisioning (new): routing table, fm-home-seed,
  idle-by-default contract, and fm-backlog-handoff.
- AGENTS.md sections 4/6/8 slimmed to pointers (624 -> 589 lines); fm-spawn.sh
  comment refs now point at the harness-adapters skill.

The agent-only skills carry user-invocable: false. No functional change.

Validation: shellcheck clean, full suite green (88 checks), no dangling refs.