feat(agents): lean skill-dedup v1 — trust imported skills by ohld · Pull Request #177 · ffmemes/ff-backend

ohld · 2026-04-17T04:06:33Z

Summary

Audit of each AGENTS.md against the gstack skill catalog surfaced ~40 lines of custom prose that re-implements logic a skill the agent already imports already does, plus 3 skill imports that should be added. This PR ships the top-3 wins identified by the audit.

Stacked on top of #176 (issue hygiene v1). Target base was set to feat/agent-issue-hygiene-v1 — merge #176 first, then this.

Changes

QA (`agents/qa/AGENTS.md`, -45 lines)

Collapse Log Sources: 3 fenced blocks of raw Sentry/Coolify/DB commands become 3 one-liners.
Post-Deploy Verification: trust /canary for error scan + baseline; delete redundant "check Sentry / check DB" steps that /canary already covers. Add explicit Sentry issue scan.
Exploratory Testing: replace 8-bullet "improvise" list with /qa exhaustive (structured tier covers the same surface).
Critical-bug escalation now runs /investigate before handing off to CTO, producing a real root-cause report.
Added skills: health, investigate (attached via Paperclip API PATCH).

CEO (`agents/ceo/AGENTS.md`, -12 lines)

Drop the "think like a CEO" bullet list in the daily heartbeat — /plan-ceo-review already carries the 10-star thinking prompts.
Drop the weekly-review checklist — /retro produces exactly this output (commit history, velocity, per-person contributions, trends).

Staff Engineer (`agents/staff-engineer/AGENTS.md`, -11 lines)

Drop the generic PR review checklist (N+1, SQL injection, trust boundaries, stale reads, secrets) — /review covers it all.
Keep only project-specific paranoia: candidates.py SQL interpolation, blender weight invariants, public-repo secret rejection.
Add /codex review as adversarial second-opinion step (Codex CLI verified installed + authenticated on the Paperclip runtime at /paperclip/.codex/auth.json).
Added skill: codex (attached via Paperclip API PATCH).

Skills NOT added

sentry-cli and simplify exist in Claude Code's plugin marketplace but are NOT in garrytan/gstack — so we can't import them into Paperclip without extra work. QA's Sentry calls use the raw sentry CLI directly (already installed on the server). Staff Engineer drops /simplify entirely; /codex review + /review cover the surface.

What's already done

Edited AGENTS.md files locally
Deployed to Paperclip server (./agents/deploy.sh)
Re-imported gstack skills to pick up latest (POST /skills/import)
PATCH'd QA agent — attached health, investigate
PATCH'd Staff Engineer agent — attached codex
Verified attachment via GET /api/agents/<id>

Test plan

Next QA scan triggers /investigate on a critical error instead of a bare "CRITICAL" paperclip ticket
Next QA post-deploy run invokes /canary only (no manual Sentry+DB duplicate scan)
Next Staff Engineer PR review runs /review + /codex review in sequence
Next Monday CEO heartbeat runs /retro without the redundant checklist

Prompt-level hotfix targeting the three patterns from recent Paperclip issue data (200 issues analyzed): 1. Slug-first titles ([pr:NNN], [incident:<slug>], [deploy:...], etc.) so recurring work collapses onto one ticket instead of spawning new ones each firefight. 2. Dedupe preflight via paperclipListIssues/Api search before every paperclipCreateIssue — kills "DB pool exhausted x3 tickets" pattern. 3. Single-writer rule — only CEO opens strategic issues; other agents create execution tickets only from their explicit workflow. 4. Staff Engineer PR Review idempotency — check gh pr view state first; skip review entirely if merged/closed. Kills stale PR Review tickets for PRs that were merged/closed externally. Block is wrapped in HTML comment markers (BEGIN/END: issue-hygiene-v1) so it can be ripped out with one sed when Paperclip ships equivalent platform features. Also corrects runbook: local Paperclip MCP must be registered via `claude mcp add` (writes to ~/.claude.json), NOT via `mcpServers` in `.claude/settings.local.json` (Claude Code ignores that key there).

…reimplementing Based on a full audit of each AGENTS.md against the gstack skill catalog, deletes custom prose that duplicates an imported skill's logic, and adds skill imports for gaps. ## QA - Collapse "Log Sources" — 3 fenced code blocks of raw Sentry/Coolify/DB commands become 3 one-liners. `/sentry-cli` (new import) handles Sentry. - Post-Deploy Verification — trust `/canary` for error scan + baseline; delete the redundant "check Sentry, check DB" steps that `/canary` already covers. Add `/sentry-cli` scan alongside `/canary` for an explicit issue view. - Exploratory Testing — replace 8-bullet "improvise" list with `/qa exhaustive` (covers the same surface with a structured tier). - Bug escalation now runs `/investigate` (new import) before handing off to CTO, producing a real root-cause report instead of just the error string. - New skills in frontmatter: sentry-cli, health, investigate. ## CEO - Delete the "think like a CEO" bullet list in the daily heartbeat — `/plan-ceo-review` carries the 10-star thinking prompts. - Delete the weekly-review checklist — `/retro` produces exactly this output (commit history, velocity, per-person contributions, trends). ## Staff Engineer - Delete generic PR review checklist (N+1, SQL injection, trust boundaries, stale reads, secrets) — `/review` covers all of it. - Keep only project-specific paranoia: candidates.py interpolation, blender invariants, public-repo secret rejection. - Add `/codex review` (adversarial second opinion) + `/simplify` (reuse/quality audit) to the review workflow. - New skills in frontmatter: codex, simplify. Net: -40 lines across the 3 files, zero behavior loss. Skills added in frontmatter are documentation — actual attachment via Paperclip API PATCH needs to follow.

ohld · 2026-04-23T10:21:40Z

Retargeting to production base: close/reopen to re-fire CI and Staff Engineer trigger

…firefighting (#205) * feat(agents/se): Self-Check Gate + Anti-Patterns Log to close silent firefighting Adds a mandatory verification step before staff-engineer marks an execution issue done. Each PR-review outcome (merged / queued / blocked-CI / changes- requested / external / already-resolved) maps to a path with explicit checks that must pass; failures route to status=blocked with a reason instead of a silent close. Path A3 probes the next-link (Coolify deploy) via last_online_at on /api/v1/applications/<uuid>: if the container is still healthy on a pre-merge timestamp 5 min after merge, files [chain-broken:coolify-not-triggered] HIGH for CTO. This catches the GH→Coolify webhook drop case ohld reported. ANTI-PATTERNS.md is the case-log feeding the gate. Each row maps to a check; six seed rows from real PRs (#177 6-day silent trigger drop, #199 17h zero-artifact merge, #201 auto-merge race during changes-requested, #200 bare-merge CI race, the user-reported chain-break, and the SC=$(gh pr view --json comments) JSON corruption found while testing). Tested locally against PRs #199 (correctly identified as silent exit, 0 review-signal artifacts) and #201 (signal found, A3 timestamp probe passes). * fix(agents/se): address codex review of self-check gate Two real bugs codex caught before push: [P1] A2/B2/C2 grep was too narrow — only matched the comment-fallback form (STAFF ENGINEER REVIEW: APPROVED) used when GitHub self-review-blocks ohld. For non-ohld internal and external authors, step 7 posts a real `gh pr review --approve -b "Review summary"` whose body lacks the prefix. Those valid approvals would have failed the gate. Fix: accept EITHER a .reviews[] entry with state="APPROVED" OR the comment-prefix. Same dual-form for D1 CHANGES_REQUESTED. [P2] A3 Coolify probe fired immediately after merge, when last_online_at is still pre-merge from the previous deploy. Coolify needs ~3-5 min for the deploy + healthcheck cycle. Without a grace window, every healthy PR would file [chain-broken:coolify-not-triggered] and drown CTO in false alarms. Fix: probe only fires when now - mergedAt >= 300s; otherwise deferred to QA's hourly Process Health Check. Both fixes logged as #7 and #8 in ANTI-PATTERNS.md so the same blind spots don't reappear in future gate revisions. * fix(agents/se): address SE agent CHANGES REQUESTED on PR #205 Two P1s the SE agent caught reviewing this PR: [P1.1] Path A3 used BSD `date -u -j -f` which doesn't exist on the Linux agent runtime. Probe failed at the first line, MERGED_EPOCH was empty, the chain-broken issue never fired. Fix: GNU `date -u -d` auto-parses both mergedAt (ISO 8601) and last_online_at (YYYY-MM-DD HH:MM:SS). [P1.2] A2/B2/C2/D1/E1 jq commands grepped ALL artifacts on the PR. Spec said "for THIS run" but had no time filter, so a stale APPROVED comment from a prior wake would let a current silent-exit wake pass A2 — exactly the silent-close mode the gate was meant to fix. Fix: capture WAKE_START_ISO at the top of every wake, filter via --arg t in A2 + D1 jq calls. Skipped per minimal-code preference: ANTI-PATTERNS rows for these (caught pre-merge, not a production failure), Coolify UUID drift note, D3 MCP-tool prose tweak. All non-blocking from the SE review. Verified locally against PR #205 own data — the new wake-scope filter correctly finds the SE review when WAKE_START < submittedAt and rejects it otherwise. * fix(agents/se): route all SE wake exits through Self-Check Gate Codex adversarial review of PR #205 (5 findings, 1 P1 + 4 P2): [P1] Steps 0/7/8 bypassed the gate entirely — `paperclipUpdateIssue done|blocked` was called directly inside each step, so the verification block in step 9 was dead code as wired (cases #1/#3/#4/#5 from the ANTI-PATTERNS log all closed via those direct calls). Restructured so each terminal branch sets `OUTCOME_PATH=A|B|C|D|E|F` and jumps to step 9; step 9 is now the single `paperclipUpdateIssue` call site for the wake. Added an explicit terminal-status mapping table. [P2] `/tmp/sc.json` and `/tmp/app.json` are cross-run race traps — two parallel SE wakes overwrite each other's snapshots. PR-scoped to `/tmp/sc-${PR_NUMBER}.json` and `/tmp/app-${PR_NUMBER}.json`. [P2] Path A3 Coolify probe didn't validate the curl response. Empty body on 401/404/500/network failure → `jq -r .last_online_at` empty → `date -u -d ""` undefined → silent no-op. Now checks curl exit, HTTP 200, and non-empty `last_online_at`; on failure files `[chain-broken:coolify-probe-unhealthy]` for CTO. [P2] Path E was missing the `>= $WAKE_START_ISO` freshness filter that A2/B2/C2/D1 already have. A stale APPROVED review from a prior wake could have let a current silent-exit external-PR wake pass the gate. E1/E2 now filter by wake-start. [P2] A3 chain-broken contradicted the general "any failed check → blocked" rule. Called out A3 as the explicit non-blocking exception: SE delivered review + merge regardless; the broken handoff is a separate `[chain-broken:*]` ticket. Updated "When a check fails" and the terminal-status table to make this explicit. Co-Authored-By: Paperclip <noreply@paperclip.ing> * fix(agents/se): round-2 review fixes — bash semantics, precheck order Round-2 SE review of commit 72ea6ed found 2 P1 regressions and 1 P2 that the structural pass missed: [P1] CI-red branch fall-through. Old code had `exit 0`; my refactor replaced it with `# ... goto step 9`, but bash doesn't honor prose comments — execution fell through to `gh pr merge --squash --auto` and queued the merge for a PR that should stay blocked. Fix: introduce a `SKIP_MERGE` flag set by either precheck failure (CI red OR auto-merge disabled), and gate the `gh pr merge ...` block on `[ -z "$SKIP_MERGE" ]`. The single exit point is still step 9. [P1] A3 chain-broken issue never filed. Both A3 failure branches used `: "file [chain-broken:*] PR #<n> ..."`, but `:` is the bash null command — the string is just an evaluated argument, no issue is ever created. The wake closed `done` silently exactly as ANTI-PATTERNS #5 warned about. Fix: bash block now computes `A3_RESULT` and `A3_DETAIL` only; an explicit prose step below the bash block tells the agent to invoke the `paperclipCreateIssue` MCP tool when A3_RESULT is probe-unhealthy or not-triggered (filing an issue is a tool call, not a shell command, so it shouldn't have been in the bash block). [P2] `allow_auto_merge` precheck was documented AFTER the `gh pr merge --squash --auto` call that depends on it. If the setting drifts back to false, the merge call errors first and the diagnostic recovery never runs. Moved the precheck above the merge command (now under the same `c.` heading as the CI-red precheck), gated by the SKIP_MERGE flag described above. Co-Authored-By: Paperclip <noreply@paperclip.ing> --------- Co-authored-by: Paperclip <noreply@paperclip.ing>

ohld added 2 commits April 17, 2026 11:14

ohld changed the base branch from feat/agent-issue-hygiene-v1 to production April 23, 2026 10:21

ohld closed this Apr 23, 2026

ohld reopened this Apr 23, 2026

ohld merged commit a5fcf2c into production Apr 23, 2026
3 checks passed

ohld deleted the feat/agent-skill-dedup-v1 branch April 23, 2026 10:27

This was referenced Apr 24, 2026

fix(agents): Staff Engineer owns PR merge; drop Release Engineer from PR-loop #187

Merged

feat(agents/se): Self-Check Gate + Anti-Patterns Log to close silent firefighting #205

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agents): lean skill-dedup v1 — trust imported skills#177

feat(agents): lean skill-dedup v1 — trust imported skills#177
ohld merged 2 commits intoproductionfrom
feat/agent-skill-dedup-v1

ohld commented Apr 17, 2026

Uh oh!

ohld commented Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ohld commented Apr 17, 2026

Summary

Changes

QA (agents/qa/AGENTS.md, -45 lines)

CEO (agents/ceo/AGENTS.md, -12 lines)

Staff Engineer (agents/staff-engineer/AGENTS.md, -11 lines)

Skills NOT added

What's already done

Test plan

Uh oh!

ohld commented Apr 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

QA (`agents/qa/AGENTS.md`, -45 lines)

CEO (`agents/ceo/AGENTS.md`, -12 lines)

Staff Engineer (`agents/staff-engineer/AGENTS.md`, -11 lines)