docs: restructure firstmate agent guidance by kunchenguid · Pull Request #79 · kunchenguid/firstmate

kunchenguid · 2026-06-25T07:19:04Z

Intent

Restructure firstmate's safety-critical AGENTS.md to be tighter and partly skill-extracted without behavioral loss, following the audit report at /Users/kunchen/github/kunchenguid/firstmate/data/agents-trim-a3/report.md plus the captain's deviations. Implement all lossless consolidation in AGENTS.md; create exactly three new agent-only skills with user-invocable: false for harness-adapters, stuck-crewmate-recovery, and secondmate-provisioning; fold afk daemon internals into the existing user-invocable /afk skill instead of creating afk-supervision; keep backlog/tasks-axi command mapping and project clone/create/init mechanics inline; preserve load-bearing safety reinforcements for never writing projects, yolo escalation limits, teardown protection, secondmate idle behavior, afk approval authority, and watcher drain-first/re-arm-last behavior. Preserve CLAUDE.md and .claude/skills symlinks, validate shell syntax and tests, then publish through no-mistakes as a PR for captain review.

What Changed

Split safety-critical agent guidance out of AGENTS.md into dedicated agent-only skills for harness adapters, stuck crewmate recovery, and secondmate provisioning, while retaining the existing user-invocable /afk skill for away-mode supervision internals.
Tightened the main firstmate manual to keep core lifecycle, project intake, delivery-mode, backlog, watcher, yolo, teardown, and safety rules inline while removing duplicated procedural detail.
Updated firstmate docs and helper scripts so spawned agents reference the restructured terminology and skill layout consistently.

Risk Assessment

✅ Low: Captain, risk is low because this is a docs-only extraction and the load-bearing procedures are either still inline or moved into explicitly triggered, non-user-invocable skills.

Testing

Inspected the target/base diff, ran the full existing shell behavior suite live, ran it again into an evidence transcript, and then generated a contract evidence report for the doc and skill surfaces. All checks passed and the worktree stayed clean. Separate shell syntax/static checks were not run because the prompt forbade static analysis.

Evidence: Behavior test transcript

# Firstmate AGENTS restructure behavior test transcript

Worktree: /Users/kunchen/.no-mistakes/worktrees/016d88035d58/01KVYS9QTF0ZWS5K5TEVQMHYQA
Base: e4d236bdc96be74863fe59b0a333bb94fb4cf3dd
Target: 12563ea06c0d84a87cf56d5ef25785c2d896852f


>>> tests/fm-afk-inject-e2e.test.sh
ok - Scenario A: partial input defers injection; digest arrives clean after idle
ok - Scenario B: swallowed Enter produces exactly one clean digest
all e2e injection tests passed

>>> tests/fm-bootstrap.test.sh
ok - bootstrap accepts treehouse get --lease support
ok - bootstrap reports treehouse without get --lease support
ok - bootstrap reports compatible optional tasks-axi availability
ok - bootstrap ignores incompatible optional tasks-axi

>>> tests/fm-composer-ghost.test.sh
ok - fm_tmux_strip_ghost drops dim/faint runs, keeps normal and bold text
ok - fm_tmux_strip_ghost handles combined SGR, ESC[22m, and reset-then-dim
ok - fm_tmux_strip_ghost keeps colored text with 2 payloads
ok - fm_pane_input_pending: a dim ghost-only composer is NOT pending
ok - fm_pane_input_pending: dim ghost inside a bordered composer is NOT pending
ok - fm_pane_input_pending: normal-intensity typed text is still pending
ok - fm_pane_input_pending: colored text with 2 payloads is still pending
ok - fm_pane_input_pending: real text plus a trailing ghost run is still pending
ok - fm-peek output is escape-free (no raw -e bytes reach firstmate context)

>>> tests/fm-secondmate.test.sh
ok - FM_HOME parameterizes data and state paths
ok - fm-lock status is scoped per home
ok - secondmates registry records scopes and allows overlapping project clone lists
ok - home seeding records routing scope from filled charter briefs
ok - home seed validation rejects duplicate home routes
ok - home seed validation rejects duplicate id routes
ok - home seed validation rejects nested home routes
leased worktree for dash
ok - home seeding durably leases treehouse-acquired dash homes under the secondmate id
ok - home seeding returns rejected acquired homes through treehouse
ok - home seed rollback warns when treehouse-acquired return fails
ok - home seeding leaves unsafe acquired active homes untouched
ok - home seeding rolls back failed clone attempts without residue
ok - home seeding refuses direct seed without filled charter text
ok - home seeding refuses unfilled placeholder charters
ok - home seeding refuses empty normalized charter fields
ok - home seeding refuses local-only projects
ok - home seeding refuses registry delimiter home paths
ok - home seeding refuses active home and repo root
ok - home seeding refuses homes marked for another id
ok - home seeding refuses homes registered to another id
ok - home seeding refuses same-id reassignment to a different home
ok - home seeding refuses registered home overlaps
ok - remote-backed subhome seeding requires a source origin
ok - remote-backed subhome seeding validates existing destination origins
ok - home seeding resolves relative source origins against the source project
ok - home seeding skips initialized existing no-mistakes clones
ok - home seeding refuses uninitialized existing no-mistakes clones
ok - home seeding refuses project destinations outside the subhome
ok - home seeding refuses operational directories outside the subhome
ok - home seeding refuses symlinked leaf files
ok - kind=secondmate spawn launches in the home and records routing meta
ok - secondmate spawn validates homes before launch
ok - secondmate spawn refuses operational directories outside the subhome
ok - fm-send resolves bare firstmate windows through this home
ok - restart recovery can respawn a secondmate from durable registry and charter
ok - secondmate teardown retires empty homes and releases routing
ok - secondmate teardown refuses to hide failed leased-home return
ok - secondmate teardown raw-removes plain-clone homes
ok - secondmate force teardown discards child work
ok - force teardown allows operational directory symlinks inside the subhome
ok - force teardown refuses operational directory symlinks outside the subhome
ok - secondmate teardown requires seeded home marker
ok - secondmate teardown refuses homes containing registered nested homes
ok - secondmate teardown refuses nested homes from the child registry
ok - force teardown validates subhome before child cleanup
ok - force teardown refuses child worktrees inside the active home
ok - force teardown refuses child worktrees inside the firstmate repo
ok - force teardown refuses unregistered child worktree paths
ok - secondmate teardown refuses ancestor homes
ok - secondmate teardown refuses descendant homes
ok - idle kind=secondmate pane is healthy and not stale
ok - secondmate charter brief is idle by default and does not self-initiate work
ok - fm-backlog-handoff moves in-scope items, is idempotent, and aborts safely
ok - fm-backlog-handoff creates absent sections and refuses unsafe homes

>>> tests/fm-spawn-batch.test.sh
ok - batch dispatch re-execs and reports every id=repo pair
ok - a single id=repo pair routes through batch dispatch
ok - single-task invocation (no '=') is untouched by batch detection
ok - batch dispatch rejects an argument that is not id=repo
ok - an arg whose id part contains '/' is not treated as a batch pair
ok - FM_HOME scopes projects/ paths for single-task spawn
ok - FM_PROJECTS_OVERRIDE scopes projects/ paths for single-task spawn

>>> tests/fm-teardown.test.sh
ok - local-only worktree with HEAD on a fork remote is torn down (fix holds)
ok - teardown prompts tasks-axi backlog refresh when compatible
ok - local-only worktree with truly unpushed work is refused (safety preserved)
ok - local-only worktree with work merged into local main is torn down (no regression)
ok - no-mistakes worktree with HEAD on origin is torn down (no regression)
ok - no-mistakes worktree with truly unpushed work is refused (no regression)
ok - local-only worktree with unpushed work is torn down under --force (escape hatch)

>>> tests/fm-update.test.sh
ok - T1 main + secondmate fast-forward, reread + nudge signalled
ok - T2 advance is a fast-forward, not a merge commit
ok - T3 reread gates on instruction surface, nudge on advancement
ok - T4 dirty secondmate skipped, local edit preserved
ok - T5 diverged secondmate skipped, local commit preserved
ok - T6 idempotent: a second run is a no-op
ok - T7 secondmate resolved from registry without inventing a window
ok - T8 deduped homes and excluded the firstmate repo itself
ok - T9 firstmate off its default branch is skipped, not forced
ok - T10 firstmate detached HEAD is skipped
ok - T11 unsafe secondmate home is not fast-forwarded
# all fm-update tests passed

>>> tests/fm-wake-queue.test.sh
ok - supervise daemon state root is scoped by FM_HOME
ok - concurrent append plus drain preserves queue records
ok - signal written while no watcher runs is caught on next run
ok - stale wake is queued before suppressor state is advanced
ok - check output is queued before cadence suppression
ok - simultaneous watcher starts leave exactly one live process
ok - two atomic drains cannot consume the same records twice
ok - drain collapses obvious duplicate heartbeat and signal records
ok - killed watcher stale lock is reclaimed
ok - live watcher lock with stale heartbeat is actionable
ok - concurrent fm_lock_try_acquire yields exactly one winner
ok - dead-pid stale lock is reclaimed by a single acquirer
ok - concurrent stale-lock steal yields exactly one winner
ok - live steal mutex is not reclaimed
ok - live-held lock is not stolen
ok - empty mid-acquire lock keeps a minimum grace
ok - late original claimant cannot claim a recreated lock
ok - paused mid-acquire claimant backs off to active stealer
ok - watch restart refuses to signal a reused pid
ok - watcher self-evicts when the lock pid no longer names it
ok - arm reports a live fresh watcher as healthy and exits zero
ok - arm starts a watcher and confirms it live before reporting started
ok - arm cleans child watcher and temp output on HUP
ok - arm propagates an immediate watcher wake before confirmation
ok - arm waits for a peer watcher beacon after child stands down
watcher: lock held by live pid 84552 but heartbeat is stale for 835657785s (>300s); inspect or stop that watcher before re-arming.
ok - arm reports FAILED and exits non-zero when no fresh watcher can be confirmed
ok - arm never reports healthy off a dead pid and self-heals into a confirmed watcher
ok - guard warns when queued wakes are pending
ok - guard orders watcher re-arm after queued wake drain
ok - guard leads with a prominent no-watcher banner before the queued-wakes warning
ok - guard stays silent when a fresh watcher is alive and no wakes are queued
ok - routine signal self-handles
ok - captain-relevant status verbs escalate
ok - check + unknown escalate; heartbeat self-handles
ok - transient stale self-handles and records a persistence marker
ok - stale + terminal status escalates immediately
ok - persistent stale escalates after threshold and clears its marker
ok - resumed (busy) stale clears its marker without escalating
ok - multiple escalations flush as a single batched digest
ok - batch flush measures max-delay from the first append, not the last
ok - catch-all scan escalates a missed terminal once, not twice
ok - handle_wake routes routine->self and captain->escalate
ok - INJECT_SKIP forces self-handle, bypassing captain-relevant classification
ok - is_wake_reason distinguishes watcher wake reasons from singleton-status stdout
ok - terminal-stale escalate removes its marker so housekeeping does not re-escalate
ok - captain signal escalate marks seen so the catch-all scan does not re-fire
ok - _collapse_newlines replaces newlines with literal separator
ok - afk flag absent: daemon does not inject, buffer preserved
ok - afk flag present: daemon injects with sentinel marker prefix
ok - injected digest is single-line (no embedded newlines)
ok - busy-guard defers injection when supervisor pane is busy
ok - marker detection: marker -> stay afk, no marker -> exit afk
ok - /afk invocation is exempt from afk exit (no self-cancel)
ok - should_exit_afk returns false when afk is not active
ok - strip_injection_marker removes the sentinel marker cleanly
ok - pane_input_pending detects partial input on the cursor line
ok - pane_input_pending: blank cursor line is not pending
ok - pane_input_pending: bare prompts are not pending (idle)
ok - pane_input_pending honors FM_COMPOSER_IDLE_RE after border stripping
ok - composer guard defers injection when pane has pending input
ok - swallowed Enter: type-once + Enter-retry, no concatenation
ok - normal inject: exactly one digest, one Enter, no duplicates
ok - classify_signal dedupes against the catch-all scan seen marker
ok - classify_stale dedupes against the signal path seen marker
ok - pane_input_pending: an idle bordered composer is NOT pending (afk-invx-i5)
ok - pane_input_pending: text inside a bordered composer is still pending
ok - submit-ACK confirms a submit when the composer returns to a bordered-empty box
ok - submit-ACK reports pending on a persistently swallowed Enter (type-once)
ok - max-defer on an empty stuck pane types once, alarms, and preserves the buffer
ok - max-defer flushes and clears the buffer on an empty bordered pane
ok - max-defer on a pending composer alarms without typing
ok - normal flush clears a stale wedge marker
ok - below MAX_DEFER: no inject, no alarm, buffer preserved
ok - max-defer does not flush or alarm while afk is inactive
ok - fm-send exits non-zero on a confirmed swallow, zero on a clean submit
ok - fm-send exits non-zero when initial text send fails
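
The wake-queue check "two atomic drains cannot consume the same records twice" describes a property that a rename-based claim can provide. The following is a minimal sketch of that general technique, not firstmate's actual drain implementation; the function name `drain_queue` and the queue file layout are assumptions made for illustration. It covers only drain-versus-drain races and does not handle an appender that holds the file open across the rename.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not firstmate's code: claim the whole queue file with a
# single rename so that two racing drains cannot both read the same records.
drain_queue() {
  local queue="$1" claimed
  # Unique per-drain target next to the queue, so the rename stays on one filesystem.
  claimed="$(mktemp "$queue.drain.XXXXXX")" || return 1
  # rename(2) is atomic: of two racing drains, only one mv still finds the queue file.
  if mv -f "$queue" "$claimed" 2>/dev/null; then
    cat "$claimed"
  fi
  # Remove the claimed records (or the unused placeholder if the rename lost the race).
  rm -f "$claimed"
}
```

A drain that loses the race prints nothing, so the caller can treat empty output as "nothing to handle".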

Evidence: AGENTS restructure contract evidence

# Firstmate AGENTS restructure contract evidence

Worktree: `/Users/kunchen/.no-mistakes/worktrees/016d88035d58/01KVYS9QTF0ZWS5K5TEVQMHYQA`

Base: `e4d236bdc96be74863fe59b0a333bb94fb4cf3dd`

Target: `12563ea06c0d84a87cf56d5ef25785c2d896852f`

## Changed files

```text
M	.agents/skills/afk/SKILL.md
A	.agents/skills/harness-adapters/SKILL.md
A	.agents/skills/secondmate-provisioning/SKILL.md
A	.agents/skills/stuck-crewmate-recovery/SKILL.md
M	AGENTS.md
```

## Symlink surface

```text
CLAUDE.md -> AGENTS.md
.claude/skills -> ../.agents/skills
```
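
The symlink surface above can be re-checked with a few lines of shell. This is a minimal sketch that assumes it is run from the repository root; the helper name `check_link` is hypothetical and is not part of firstmate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a path is a symlink pointing at the expected target.
check_link() {
  local link="$1" expected="$2" actual
  if [ ! -L "$link" ]; then
    echo "FAIL: $link is not a symlink"
    return 1
  fi
  actual="$(readlink "$link")"
  if [ "$actual" = "$expected" ]; then
    echo "PASS: $link -> $actual"
  else
    echo "FAIL: $link -> $actual (expected $expected)"
    return 1
  fi
}

# Usage from the repository root (targets taken from the listing above):
#   check_link CLAUDE.md AGENTS.md
#   check_link .claude/skills ../.agents/skills
```

Comparing the raw `readlink` output, rather than a resolved path, keeps the check sensitive to a relative link being replaced by an absolute one.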

## Skill front matter

```text
.agents/skills/afk/SKILL.md
---
name: afk
description: Enter away-mode supervision. Use when the user invokes /afk (e.g. "/afk", "/afk back in an hour", "going afk"). Sets a durable away-mode flag so the sub-supervisor daemon can self-handle routine wakes and escalate only captain-relevant events as one batched digest, cutting supervision token cost during walk-away stretches. Exit is automatic; any real (unmarked) message returns to full per-wake responsiveness.
user-invocable: true
---

.agents/skills/harness-adapters/SKILL.md
---
name: harness-adapters
description: Agent-only reference for firstmate harness operations. Use before spawning or recovering a crewmate or secondmate, handling a trust dialog, sending a harness-specific skill invocation, interrupting or exiting an agent, resuming an exited agent, or verifying a new harness adapter. Contains verified facts for claude, codex, opencode, and pi.
user-invocable: false
---

.agents/skills/secondmate-provisioning/SKILL.md
---
name: secondmate-provisioning
description: Agent-only reference for persistent secondmate setup and retirement. Use when creating, seeding, validating, recovering, handing backlog to, or retiring a secondmate home, or when editing data/secondmates.md. Covers home leases, transactional seeding, project clone restrictions, idle charter, handoff helper, and teardown safety.
user-invocable: false
---

.agents/skills/stuck-crewmate-recovery/SKILL.md
---
name: stuck-crewmate-recovery
description: Agent-only playbook for stuck firstmate direct reports. Use after a stale wake, looping pane, repeated confusion, an answered-by-brief question, an unresponsive crewmate, or a failed steer. Escalates from peek, to one-line steer, to harness-specific interrupt, to relaunch with progress, to failed status.
user-invocable: false
---

.agents/skills/updatefirstmate/SKILL.md
---
name: updatefirstmate
description: Self-update a running firstmate and its secondmates to the latest from origin. Use when the captain invokes /updatefirstmate (e.g. "/updatefirstmate", "update firstmate", "pull the latest firstmate"). Fast-forwards this firstmate repo's default branch and every secondmate home from origin (fast-forward only, never forced, never disruptive), then re-reads AGENTS.md and nudges each updated secondmate to do the same, so the whole tree runs the latest bin/ and instructions.
user-invocable: true
---

```
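
The `user-invocable` values in the front matter above are what separate agent-only skills from user-invocable ones. As a hedged illustration, the sketch below reads one key out of a SKILL.md front-matter block; the helper name `front_matter_value` is hypothetical, and the parsing assumes the simple single-line `key: value` layout shown in the listing rather than full YAML.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one front-matter key from a SKILL.md file.
# Only the first '---' delimited block at the top of the file is scanned.
front_matter_value() {
  local file="$1" key="$2"
  awk -v key="$key" '
    NR == 1 && $0 != "---" { exit }   # file has no front matter
    NR > 1 && $0 == "---"  { exit }   # closing delimiter: stop scanning
    NR > 1 {
      prefix = key ": "
      if (index($0, prefix) == 1) {   # key must start the line
        print substr($0, length(prefix) + 1)
        exit
      }
    }
  ' "$file"
}

# Usage from the repository root (skill names taken from the listing above):
#   for s in harness-adapters secondmate-provisioning stuck-crewmate-recovery; do
#     [ "$(front_matter_value ".agents/skills/$s/SKILL.md" user-invocable)" = "false" ] \
#       && echo "PASS: $s is agent-only"
#   done
```

Matching the key only at the start of a line avoids false hits when a description happens to mention `user-invocable` in its text.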

## Contract checks

- PASS: the change adds exactly three new skill files
- PASS: new harness-adapters skill is agent-only
- PASS: new secondmate-provisioning skill is agent-only
- PASS: new stuck-crewmate-recovery skill is agent-only
- PASS: existing afk skill remains user-invocable
- PASS: CLAUDE.md still points at AGENTS.md
- PASS: .claude/skills still points at .agents/skills
- PASS: AGENTS keeps never-write safety reinforcement inline
- PASS: AGENTS keeps merge approval reinforcement inline
- PASS: AGENTS keeps teardown protection inline
- PASS: AGENTS keeps yolo escalation limits inline
- PASS: AGENTS keeps secondmate idle behavior inline
- PASS: AGENTS keeps project clone mechanics inline
- PASS: AGENTS keeps project create mechanics inline
- PASS: AGENTS keeps project initialization mechanics inline
- PASS: AGENTS keeps tasks-axi command mapping inline
- PASS: AGENTS keeps watcher drain-first instruction inline
- PASS: AGENTS keeps watcher re-arm-last instruction inline
- PASS: AGENTS keeps afk approval authority inline
- PASS: AGENTS references harness-adapters load trigger
- PASS: AGENTS references stuck-crewmate-recovery load trigger
- PASS: AGENTS references secondmate-provisioning load trigger
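
The first check in the list ("exactly three new skill files") can be derived from `git diff --name-status` output. The sketch below is one way to do that and is not the script that produced this report; the function name `count_added_skills` is hypothetical, and it assumes the tab-separated status/path format shown under "Changed files".

```shell
#!/usr/bin/env bash
# Hypothetical helper: count added SKILL.md files in `git diff --name-status`
# output read from stdin (tab-separated: status, then path).
count_added_skills() {
  awk -F '\t' '
    $1 == "A" && $2 ~ "^\\.agents/skills/[^/]+/SKILL\\.md$" { n++ }
    END { print n + 0 }   # n + 0 prints 0 when nothing matched
  '
}

# Usage from the repository root (base commit taken from this report):
#   added="$(git diff --name-status e4d236bdc96be74863fe59b0a333bb94fb4cf3dd..HEAD | count_added_skills)"
#   [ "$added" -eq 3 ] && echo "PASS: the change adds exactly three new skill files"
```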

## Inline anchor excerpts

```text
25:1. **Never write to a project.**
347:**yolo (orthogonal).** With `yolo=off` (default) every approval is the captain's: ask-user findings, PR merges, the local-only merge. With `yolo=on`, firstmate makes those calls itself without asking - resolve ask-user findings on your judgment, and run `gh-axi pr merge` / `bin/fm-merge-local.sh` once the work is green/approved - EXCEPT anything destructive, irreversible, or security-sensitive, which still escalates to the captain. Never merge a red PR even under yolo. After any merge you perform without asking the captain, post a one-line "merged <full PR URL or local main> after checks passed" FYI so the captain keeps a trail.
198:A secondmate is idle by default: it acts only on work the main firstmate routes to it.
409:At the start of every wake-handling turn and every recovery turn, run `bin/fm-wake-drain.sh` before peeking panes, reading status files beyond the reason line, or starting new work.
543:Map firstmate's real backlog operations to the approved commands:
583:- `harness-adapters` - load before spawning or recovering a crewmate or secondmate, handling a trust dialog, sending a harness-specific skill invocation, interrupting or exiting an agent, resuming an exited agent, or verifying a new harness adapter.
```
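
The yolo rule quoted in the excerpt above is a small decision table. The sketch below encodes one reading of it for illustration only: the function `merge_decision`, its three arguments, and its output labels are all hypothetical, and "green/approved" is simplified to a single `green` value. It is not code from firstmate.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of firstmate: map the quoted yolo rule to an action.
#   $1 yolo       "on" or "off"
#   $2 checks     "green" or "red"
#   $3 sensitive  "yes" if the change is destructive, irreversible, or security-sensitive
merge_decision() {
  local yolo="$1" checks="$2" sensitive="$3"
  if [ "$yolo" != "on" ]; then
    echo "ask-captain"         # yolo=off: every approval is the captain's
  elif [ "$checks" != "green" ]; then
    echo "do-not-merge"        # never merge a red PR, even under yolo
  elif [ "$sensitive" = "yes" ]; then
    echo "ask-captain"         # sensitive work still escalates under yolo
  else
    echo "merge-and-post-fyi"  # merge, then post the one-line FYI for the captain
  fi
}

# Example: merge_decision on green no   prints "merge-and-post-fyi"
```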

Pipeline

Updates from git push no-mistakes

✅ **Intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

✅ **Review** - passed

✅ No issues found.

✅ **Test** - passed

✅ No issues found.

git status --short && git rev-parse HEAD && git merge-base HEAD e4d236bdc96be74863fe59b0a333bb94fb4cf3dd
git diff --stat e4d236bdc96be74863fe59b0a333bb94fb4cf3dd..HEAD
git diff --name-status e4d236bdc96be74863fe59b0a333bb94fb4cf3dd..HEAD
find . -maxdepth 3 -type l -ls and readlink checks for CLAUDE.md and .claude/skills
for t in tests/*.test.sh; do bash "$t"; done
for t in tests/*.test.sh; do bash "$t"; done > /var/folders/5x/4nqprlbx0518k3ybcb1sz6gr0000gn/T/no-mistakes-evidence/01KVYS9QTF0ZWS5K5TEVQMHYQA/behavior-test-transcript.txt 2>&1
Generated /var/folders/5x/4nqprlbx0518k3ybcb1sz6gr0000gn/T/no-mistakes-evidence/01KVYS9QTF0ZWS5K5TEVQMHYQA/agents-restructure-contract-evidence.md to verify the three new agent-only skills, retained /afk user-invocable skill, symlinks, and inline safety anchors.

✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

Port upstream kunchenguid#79's agent-guidance restructure, adapted to the Claude-only herdr fork: relocate the detailed, situation-specific guidance out of AGENTS.md into on-demand skill files, leaving slim pointers in the manual.

- .agents/skills/harness-adapters (new): the Claude Code adapter facts (busy signature, exit, interrupt, skill invocation, trust/bypass dialog, root/sudo IS_SANDBOX forwarding, ghost-text quirk + detector). Claude-only, herdr terminology - not upstream's four-harness, tmux version.
- .agents/skills/stuck-crewmate-recovery (new): the stuck-direct-report playbook.
- .agents/skills/secondmate-provisioning (new): routing table, fm-home-seed, idle-by-default contract, and fm-backlog-handoff.
- AGENTS.md sections 4/6/8 slimmed to pointers (624 -> 589 lines); fm-spawn.sh comment refs now point at the harness-adapters skill.

The agent-only skills carry user-invocable: false. No functional change. Validation: shellcheck clean, full suite green (88 checks), no dangling refs.

kunchenguid added 2 commits June 25, 2026 00:00

docs: restructure firstmate agent manual

12563ea

no-mistakes(document): Sync agent skill documentation

0778239

kunchenguid merged commit 9e21160 into main Jun 25, 2026
4 checks passed

kunchenguid deleted the fm/agents-restructure-q4 branch June 25, 2026 07:25

kunchenguid mentioned this pull request Jun 25, 2026

test: consolidate lifecycle behavior coverage #80

Merged

leo1oel mentioned this pull request Jun 26, 2026

docs: restructure README into a concise overview plus docs/ tree (#82) leo1oel/nemo#9

Merged
