docs: agent-fleet wave-1 synthesis (quickstart + 35-bug prioritized list) #35

Merged: avrabe merged 1 commit into main from docs/agent-fleet-findings on Apr 30, 2026

avrabe (Contributor) commented on Apr 30, 2026

14 specialist agents (junior dev, safety-critical architect, QA, security, DevOps, docs, performance, day-2 ops, Probot/Octokit expert, supply-chain, LLM prompt engineer, rivet, test-quality, senior reviewer) ran a parallel review of Temper, each with a non-overlapping scope. 13 of the 14 reported successfully; their findings are aggregated here.

Files

  • `docs/agent-fleet/quickstart.md` — hand-off doc for wave-2: what works (don't redesign), what's broken, what's deferred, how to pick up a fix.
  • `docs/agent-fleet/bugs.md` — 35 prioritized findings with file:line + flagging-agent. 5 🔴 critical, 10 🟠 high, 15 🟡 medium, 5 🟢 low. Action-priority "if only one PR this week" guide at the bottom.

Top critical findings (independently flagged by multiple agents)

  1. `synchronizeIssueLabels` silently DELETES user-created labels on every webhook tick. Destructive across the org.
  2. `WEBHOOK_SECRET` defaults to literal `"development"` when unset. Trust-boundary fail-open.
  3. Dashboard POST routes (`/dashboard/actions/{sync,review,analyze-org}`) unauthenticated.
  4. `issues.opened` provisioning lacks the auth gate that comment commands have.
  5. Test mock retrofit fakes the broken `octokit.issues.X` namespace, so the regression class behind PRs #22 ("fix: AI review failed silently — octokit.issues namespace undefined"), #29 ("fix: rivet binary path resolution + drop fragile .issues guard"), and #34 ("fix: rulesets empty-contexts + skip control-surface repos in config") is still active.
  6. `success: true` returned when configuration silently degraded (ruleset 422 → legacy fallback, signed-sigs 404 → swallow).
  7. `/health` always returns 200 even when scheduler is dead.
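
For finding 1, one possible direction is to make label sync non-destructive by construction: only labels inside a namespace the tool owns are ever eligible for deletion. The sketch below is not Temper's code. The function name `planLabelSync`, the `temper:` prefix, and the sample label names are all assumptions made for illustration.

```javascript
// Hypothetical helper; none of these names come from Temper's source.
// Computes a label sync plan that never deletes labels outside the managed prefix.
function planLabelSync(existingLabels, desiredLabels, managedPrefix) {
  const existing = new Set(existingLabels);
  const desired = new Set(desiredLabels);

  // Create anything desired that does not exist yet.
  const toCreate = [...desired].filter((name) => !existing.has(name));

  // Delete only labels that are inside the managed namespace AND no longer desired.
  const toDelete = [...existing].filter(
    (name) => name.startsWith(managedPrefix) && !desired.has(name)
  );

  // Everything else, including user-created labels, is left untouched.
  const untouched = [...existing].filter((name) => !toDelete.includes(name));

  return { toCreate, toDelete, untouched };
}

// Example with an assumed "temper:" prefix and made-up label names.
const plan = planLabelSync(
  ["temper:bug", "temper:stale", "good first issue"],
  ["temper:bug", "temper:security"],
  "temper:"
);
console.log(plan);
```

Here the user-created label `good first issue` ends up in `untouched` rather than `toDelete`, because it does not start with the managed prefix. Keeping the plan as a pure function also means it can be unit-tested without mocking Octokit.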

Cross-cutting theme: "looks like it ran" failure pattern. Fix by class, not case-by-case.
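
To illustrate what fixing "by class" could look like, here is a minimal sketch in which every step reports `ok`, `degraded`, or `failed`, and overall success is only claimed when nothing degraded. The names `runWithFallback` and `summarize`, the status strings, and the result shape are assumptions for illustration, not Temper's API.

```javascript
// Hypothetical sketch; step names and result shape are assumptions.
// Runs a primary action, falls back if it throws, and records that the fallback happened.
async function runWithFallback(name, primary, fallback) {
  try {
    await primary();
    return { name, status: "ok" };
  } catch (primaryError) {
    try {
      await fallback();
      // The fallback worked, but the caller must be told the primary path failed.
      return { name, status: "degraded", reason: String(primaryError.message) };
    } catch (fallbackError) {
      return { name, status: "failed", reason: String(fallbackError.message) };
    }
  }
}

// Success is reported only when every step is "ok"; degraded steps are listed, not hidden.
function summarize(steps) {
  const degraded = steps.filter((s) => s.status === "degraded").map((s) => s.name);
  const failed = steps.filter((s) => s.status === "failed").map((s) => s.name);
  return {
    success: degraded.length === 0 && failed.length === 0,
    degraded,
    failed,
  };
}

// Example: a simulated ruleset call that is rejected, followed by a working legacy fallback.
runWithFallback(
  "ruleset",
  async () => { throw new Error("422 from rulesets endpoint (simulated)"); },
  async () => {}
).then((step) => console.log(summarize([step])));
```

In this example the summary reports `success: false` and lists `ruleset` under `degraded`, so a fallback can no longer be mistaken for a clean run.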

What's deferred to wave-2 (open questions)

  • Multi-lens AI review (discover → fresh-session validate → emit)
  • GitHub Check Run API integration so AI review can actually gate merges
  • `pull_request.synchronize` event subscription
  • Replace `reviews.json` with a SQLite table
  • Split `src/app.js` (1159 lines)
  • `temper-admin` CLI
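
As a starting point for the Check Run item, the sketch below maps a list of review findings to the parameters GitHub's "create a check run" endpoint accepts (`name`, `head_sha`, `status`, `conclusion`, `output.title`, `output.summary`). The function name, the check name `temper/ai-review`, the finding shape, and the rule that critical or high findings block are all assumptions, not decisions made in wave-1.

```javascript
// Hypothetical mapping from AI review findings to Check Run parameters.
// The finding shape ({ severity, message }) and the blocking rule are assumptions.
function buildCheckRunParams({ owner, repo, headSha, findings }) {
  const blocking = findings.filter(
    (f) => f.severity === "critical" || f.severity === "high"
  );
  const conclusion = blocking.length > 0 ? "failure" : "success";

  return {
    owner,
    repo,
    name: "temper/ai-review", // assumed check name
    head_sha: headSha,
    status: "completed",
    conclusion,
    output: {
      title: `AI review: ${blocking.length} blocking finding(s)`,
      summary: findings.map((f) => `- [${f.severity}] ${f.message}`).join("\n"),
    },
  };
}

// Example with made-up values.
const params = buildCheckRunParams({
  owner: "example-org",
  repo: "example-repo",
  headSha: "0123456789abcdef0123456789abcdef01234567",
  findings: [
    { severity: "high", message: "unauthenticated POST route" },
    { severity: "low", message: "typo in README" },
  ],
});
console.log(params.conclusion, params.output.title);
```

An object like this could be passed to Octokit's `checks.create` method. The check only gates merges if it is also marked as required in the repository's branch protection or ruleset.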

🤖 Generated with Claude Code

Output of a 14-agent parallel review of Temper. Each agent had a distinct
persona (junior dev, safety-critical architect, QA, security, DevOps, docs
reviewer, performance, day-2 ops, Probot/Octokit expert, supply-chain, LLM
prompt engineer, rivet integration, test quality, senior reviewer) and a
non-overlapping scope.

13 of 14 reported successfully. Aggregated into two artifacts:

## docs/agent-fleet/quickstart.md
Hand-off doc for future agents: what Temper is, what's verified working
(don't redesign), what's confirmed broken, what's deliberately deferred,
and how to pick up a fix without repeating wave-1's investigation.

## docs/agent-fleet/bugs.md
Prioritized list of 35 findings, severity-tagged (5 critical, 10 high,
15 medium, 5 low). Each carries a file:line citation and at least one
wave-1 agent flag. Ends with action priorities ("if only one PR this
week", "if a sprint").

## Why these matter

Wave-1 surfaced bugs that 9 months of iterative dev hadn't:

- `synchronizeIssueLabels` silently deletes user-created labels on every
  webhook. Destructive across the org.
- `WEBHOOK_SECRET` defaults to literal "development" when env unset.
  Trust boundary fail-open.
- Dashboard POST routes are unauthenticated.
- Test mock retrofit fakes the broken `octokit.issues.X` namespace —
  exactly why PRs #22, #29, #34 each had to fix it again at a new call site.
- `success: true` returned when configuration silently degraded.
- `/health` always returns 200.
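
For the `WEBHOOK_SECRET` item, a fail-closed variant could look roughly like the sketch below: refuse to start when the secret is missing, instead of substituting a default. The helper name `resolveWebhookSecret` and the use of `NODE_ENV` to detect production are assumptions. Only the variable name `WEBHOOK_SECRET` and the literal `"development"` come from the finding.

```javascript
// Hypothetical helper, not Temper's code: resolve the webhook secret or refuse to start.
function resolveWebhookSecret(env) {
  const secret = env.WEBHOOK_SECRET;

  // Fail closed: a missing or blank secret is a startup error, never a default.
  if (typeof secret !== "string" || secret.trim() === "") {
    throw new Error("WEBHOOK_SECRET is not set; refusing to start");
  }

  // Assumption: NODE_ENV marks production. Reject the old placeholder value there.
  if (secret === "development" && env.NODE_ENV === "production") {
    throw new Error('WEBHOOK_SECRET must not be "development" in production');
  }

  return secret;
}

// Example: an empty environment is rejected instead of silently defaulting.
try {
  resolveWebhookSecret({});
} catch (error) {
  console.log(error.message);
}
```

At startup this would be called once as `resolveWebhookSecret(process.env)`, so a misconfigured deployment fails immediately rather than accepting webhooks signed with a guessable secret.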

The cross-cutting theme: "looks like it ran" failure pattern across
ruleset fallback, signed-signatures swallow, scheduler tick errors, and
review-save IO. Fix by class, not case-by-case.
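
One way to surface scheduler tick errors is a heartbeat that the `/health` handler consults instead of returning 200 unconditionally. The following is a minimal sketch under stated assumptions: `createHeartbeat`, `maxSilenceMs`, and the returned shape are hypothetical names, and the wiring into the real scheduler and route is left out.

```javascript
// Hypothetical sketch: tracks the scheduler's last successful tick and last error.
// `now` is injectable so the staleness logic can be tested without real time passing.
function createHeartbeat(maxSilenceMs, now = Date.now) {
  let lastOkAt = null;
  let lastError = null;

  return {
    // Call after each scheduler tick that completes without throwing.
    recordSuccess() {
      lastOkAt = now();
      lastError = null;
    },
    // Call from the tick's catch block instead of swallowing the error.
    recordFailure(error) {
      lastError = String(error && error.message ? error.message : error);
    },
    // Healthy only if a tick succeeded recently and no error is outstanding.
    health() {
      const stale = lastOkAt === null || now() - lastOkAt > maxSilenceMs;
      const healthy = !stale && lastError === null;
      return { httpStatus: healthy ? 200 : 503, healthy, lastOkAt, lastError };
    },
  };
}

// Example: unhealthy until the first successful tick is recorded.
const heartbeat = createHeartbeat(60000);
console.log(heartbeat.health().httpStatus); // 503
heartbeat.recordSuccess();
console.log(heartbeat.health().httpStatus); // 200
```

A `/health` handler would then respond with `heartbeat.health().httpStatus`, so a dead or erroring scheduler is visible to whatever monitors the endpoint.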

## Not in this commit

Wave-2 work — picking up specific fixes, multi-lens AI review, GitHub
Check Run integration, splitting `src/app.js`. Documented as open
questions in `quickstart.md`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
avrabe merged commit 94fce20 into main on Apr 30, 2026
3 checks passed
avrabe deleted the docs/agent-fleet-findings branch on April 30, 2026 04:08