feat: add plugin staged analysis pipeline #160
Add core type definitions, stage database, hook utilities, debounce helper, and results database updates for the multi-stage analysis pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add multi-source session scanner, evaluation assembler, and update data extractor, deterministic scorer/mapper, background analyzer, and report template for the staged analysis pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add session-start hook, get-domain-results, get-stage-output, and save-stage-output MCP tools. Update existing hooks and tools for the staged analysis pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add decomposed analysis skills (classify-type, generate-weekly-insights, summarize-projects, summarize-sessions, translate-report, verify-evidence) and update existing skills and plugin manifest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rebuild dist/ to match source changes for marketplace compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add packages/shared for cross-package type definitions and utilities. Add plugin parity test fixtures and unit tests for API and plugin. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update API routes, dashboard pages, analysis orchestrator, evaluation assembler, and local database to support the staged plugin pipeline. Update CLI index and uploader for compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update agent architecture docs, deployment guide, plugin docs, and user flows for the staged analysis pipeline. Update AGENTS.md and README.md with new plugin capabilities. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review: PR #160 -- Plugin Staged Analysis Pipeline
Reviewed 80+ source files across plugin core, shared package, web app, CLI, hooks, MCP tools, skills, and tests. Focused on CLAUDE.md compliance, bug scan, security, and pattern adherence.
What Was Done Well
Issues Found
Important (Should Fix)
1. `readCachedParsedSessions` silently returns empty array on error
export async function readCachedParsedSessions(): Promise<ParsedSession[]> {
try {
const raw = await readFile(PARSED_SESSIONS_CACHE, 'utf-8');
return JSON.parse(raw) as ParsedSession[];
} catch {
return []; // <-- silent fallback
}
}
This catches ALL errors (corrupt JSON, permission denied, out of memory) and returns an empty array. The downstream `extract_data` tool checks for `sessions.length === 0` and returns a user-facing "no data" message, which masks the real error. If the cache file exists but is corrupted, the user sees "No cached parsed sessions. Call scan_sessions first" instead of learning about the corruption. The CLAUDE.md No Fallback Policy says errors must propagate. The "file not found" case is the only expected error here. Consider narrowing the catch to distinguish file-not-found (return `[]`) from unexpected errors (rethrow).
2. Sync route auth weakness: Bearer token used as email lookup without verification
if (authHeader?.startsWith('Bearer ')) {
const token = authHeader.slice(7);
const userByEmail = findUserByEmail(token);
userId = userByEmail?.id ?? getCurrentUserFromRequest().id;
}
The Bearer token is treated as a raw email string for user lookup. If someone sends `Authorization: Bearer admin@company.com`, they can write analysis records as that user. This bypasses actual authentication. The comment says "Token could be an email or a user ID" but there is no cryptographic verification of either. For a self-hosted deployment this may be acceptable, but the fallback to `getCurrentUserFromRequest().id` when the email lookup fails means a typo doesn't fail: it silently writes to the wrong user.
3. Dual DB singleton risk between `stage-db.ts` and `results-db.ts`
Both modules open independent `better-sqlite3` connections to the same `results.db` file. While WAL mode permits concurrent readers, having two writer connections to the same DB file from the same process can cause SQLITE_BUSY if writes overlap. The `stage_outputs` table has a foreign key to `analysis_runs`, so `results-db` must create the run before `stage-db` can write to it. If the connections are not coordinated, the foreign key check might fail or a write-write collision could occur. Consider sharing a single DB connection between the two modules, or at minimum documenting the ordering constraint.
4. `save_domain_results` schema has `severity` enum mismatch
The legacy `PluginDomainResult` interface in the sync route is missing `'critical'` from its severity union. If a plugin sends a domain result with `severity: 'critical'`, the sync route still accepts it at runtime, because TypeScript interfaces are erased at compile time and the payload is not validated against this one, but the mismatch between the declared type and the actual data could cause confusion. This is a minor type-level inconsistency rather than a runtime bug, since the sync route uses the canonical run path now, but the legacy interface should be updated for correctness.
Suggestions (Nice to Have)
5. `debounce.ts` silent catches for filesystem operations are justified but undocumented
6. `readState()` returns `DEFAULT_STATE` on parse error
If the state file is corrupted, this returns default state rather than surfacing the problem. This is a pragmatic choice (the state file is ephemeral), but worth a comment explaining why this is an exception to the No Fallback Policy.
7. Consider adding input schema to `get_stage_output` tool registration
The `get_stage_output` tool accepts an optional `stage` string parameter, but unlike `save_stage_output` there is no enum constraint on the stage name. Adding a Zod enum validation matching `STAGE_NAMES` would give the LLM better guidance on valid stage names.
8. `generate_report` HTTP server serves all reports from `latest.html` on refresh
After the first request, subsequent requests re-read `latest.html` from disk. If another analysis completes and overwrites `latest.html`, refreshing the page shows the new report instead of the originally generated one. This is probably the intended behavior but worth confirming.
9. Gemini schema nesting depth compliance -- confirmed OK
10. Structured JSON convention -- confirmed OK
Architecture Summary
The PR successfully decomposes the monolithic analysis pipeline into a staged plugin-local architecture. The key data flow change is:
Before: SessionEnd hook -> spawn background process -> upload raw sessions to server -> server runs Gemini pipeline -> results stored on server
After: SessionEnd hook -> mark "pending" in state file -> next SessionStart injects context -> Claude Code runs local skill-based pipeline stage by stage -> results stored in local SQLite -> optional sync to team server
The `@betterprompt/shared` package becomes the canonical schema authority, with both the plugin and the web app importing from it. The evaluation assembly logic that previously lived only in the server is now shared, enabling the parity test fixtures to verify that plugin-assembled evaluations match server-assembled ones.
The 8 atomic commits are well-structured, each with a clear scope (core types, analysis engine, hooks/tools, skills, dist, shared package, web app/CLI, docs).
Verdict
Approve with reservations. Issues #1 (silent fallback in cache reader), #2 (auth weakness in sync route), and #3 (dual DB singletons) should be addressed. None are blocking for a merge if the team is aware, but #2 especially should get a tracking issue if not fixed in this PR.
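A minimal sketch of the narrowing suggested in issue #1, under stated assumptions: the `ParsedSession` shape here is a hypothetical stand-in, and the `read` parameter stands in for `readFile(PARSED_SESSIONS_CACHE, 'utf-8')` so the example stays self-contained. It is not the PR's actual implementation.

```typescript
// Hypothetical stand-in for the plugin's ParsedSession type.
interface ParsedSession {
  sessionId: string;
}

// True only for Node's "file not found" error code.
function isFileNotFound(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === 'ENOENT'
  );
}

// `read` is an assumption: it stands in for readFile(PARSED_SESSIONS_CACHE, 'utf-8').
async function readCachedParsedSessions(
  read: () => Promise<string>,
): Promise<ParsedSession[]> {
  let raw: string;
  try {
    raw = await read();
  } catch (err) {
    if (isFileNotFound(err)) return []; // expected cache miss
    throw err; // permission denied, I/O failures, etc. propagate
  }
  // JSON.parse sits outside the try block, so corrupt JSON also propagates.
  return JSON.parse(raw) as ParsedSession[];
}
```

With this shape, a downstream `sessions.length === 0` check can only be reached by a genuine cache miss or a genuinely empty cache.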
1. No Fallback: readCachedParsedSessions now only returns [] for ENOENT (cache miss), all other errors propagate per No Fallback Policy.
2. Auth: sync route validates BETTERPROMPT_AUTH_TOKEN as shared secret instead of accepting raw email as Bearer token. User identity moved to X-User-Email header, gated behind token validation.
3. Dual DB: stage-db now reuses the shared connection from results-db instead of opening an independent connection to the same file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
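The shared-connection change in item 3 can be pictured roughly as follows. This is a sketch under assumptions: `DbConnection` is a hypothetical minimal shape (the real modules use `better-sqlite3`), and `getSharedConnection` is an illustrative name, not a function confirmed by the PR.

```typescript
// Hypothetical minimal connection shape; the real modules use better-sqlite3.
interface DbConnection {
  readonly id: number;
}

// Single module-level handle owned by the results-db side.
let sharedConnection: DbConnection | undefined;

// Opens the connection on first use, then hands out the same instance.
// `open` is an assumption standing in for whatever opens results.db.
function getSharedConnection(open: () => DbConnection): DbConnection {
  if (sharedConnection === undefined) {
    sharedConnection = open();
  }
  return sharedConnection;
}

// Both modules would call the same accessor instead of opening their own handle.
let openCount = 0;
const openDb = (): DbConnection => ({ id: ++openCount });
const resultsDbHandle = getSharedConnection(openDb);
const stageDbHandle = getSharedConnection(openDb);
```

Because both callers receive the same handle, all writes go through one connection, which removes the in-process writer contention the review described.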
The @betterprompt/shared package dist/ is gitignored (root dist/ rule) so CI has no compiled output to resolve. Add a workspace build step after npm ci to compile shared types before typecheck, tests, and build. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review
Found 6 issues (filtered from 19 raw findings across 5 parallel review agents, threshold ≥ 80/100):
1. Auth bypass when
When the env var is unset, the token check at line 278 is skipped entirely, but
BetterPrompt/app/api/analysis/sync/route.ts Lines 278 to 297 in 5a098c9
2.
BetterPrompt/packages/plugin/hooks/post-session-handler.ts Lines 92 to 95 in 5a098c9
3. The "Focus Area Constraints" section references
BetterPrompt/packages/plugin/skills/write-content.md Lines 114 to 115 in 5a098c9
4. The schema defines
5. The local interface defines
6. CLAUDE.md Commands section lists stale CLI command (Score: 80, Doc Accuracy)
Line 31 in 5a098c9
13 additional findings below threshold (score < 80)
Generated with Claude Code using /ship-it
If useful, react with a thumbs-up. Otherwise, thumbs-down.
When BETTERPROMPT_AUTH_TOKEN is not configured, the X-User-Email header was still trusted for identity resolution, allowing unauthenticated callers to store analysis results under any known user's account. Now X-User-Email is only read after successful token validation. Also adds missing 'critical' severity to PluginDomainResult interface to match the canonical DomainGrowthAreaSchema. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
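The gating described in this commit can be sketched as a pure function. Everything here is illustrative: `resolveIdentity`, `HeaderMap`, and the `Identity` variants are hypothetical names, header keys are assumed to be lowercased, and falling back to a default user when no token is configured is an assumption about the self-hosted behavior rather than something the commit message states.

```typescript
type Identity =
  | { kind: 'rejected' } // token configured but missing or wrong
  | { kind: 'default-user' } // no trusted identity header available
  | { kind: 'email'; email: string }; // header trusted after token check

// Hypothetical header bag with lowercased keys.
type HeaderMap = Record<string, string | undefined>;

function resolveIdentity(
  headers: HeaderMap,
  configuredToken: string | undefined,
): Identity {
  // Without a configured secret nothing can be verified,
  // so the identity header is never read (assumed fallback behavior).
  if (!configuredToken) return { kind: 'default-user' };

  // Plain equality keeps the sketch short; a real route would likely
  // prefer a constant-time comparison.
  if (headers['authorization'] !== `Bearer ${configuredToken}`) {
    return { kind: 'rejected' };
  }

  // Only now is X-User-Email trusted.
  const email = headers['x-user-email'];
  return email ? { kind: 'email', email } : { kind: 'default-user' };
}
```

The key property is ordering: the identity header is consulted on exactly one code path, the one that has already passed token validation.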
force: true bypassed the 30-minute staleness check, causing any running analysis to be immediately marked as failed when a sub-session ended. Changed to force: false so only genuinely stale states are recovered, preventing corruption of in-flight analysis runs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
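To illustrate why `force: true` was harmful here, the following sketch models the recovery check. The `AnalysisState` shape, field names, and `recoverStaleState` are hypothetical; only the 30-minute threshold and the meaning of `force` come from the commit message.

```typescript
// 30-minute staleness threshold, as stated in the commit message.
const STALE_AFTER_MS = 30 * 60 * 1000;

// Hypothetical state-file shape for illustration only.
interface AnalysisState {
  status: 'idle' | 'pending' | 'running' | 'failed';
  startedAt: number; // epoch milliseconds
}

// Marks a running analysis as failed only if it is stale, unless forced.
function recoverStaleState(
  state: AnalysisState,
  now: number,
  force: boolean,
): AnalysisState {
  if (state.status !== 'running') return state;
  const isStale = now - state.startedAt > STALE_AFTER_MS;
  return force || isStale ? { ...state, status: 'failed' } : state;
}

// A run that started 5 minutes ago is still in flight.
const now = 1_700_000_000_000;
const inFlight: AnalysisState = { status: 'running', startedAt: now - 5 * 60 * 1000 };
const afterSubSessionEnd = recoverStaleState(inFlight, now, false);
```

With `force` set to `false`, the 5-minute-old run above is left untouched, whereas the forced variant would have marked it failed as soon as any sub-session ended.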
- write-content.md: replace stale impactScore/relatedDomains references with relatedQualities/actions to match TopFocusAreaSchema
- All 5 analyze-*.md skills: add severity and recommendation field instructions (MINIMUM 50 characters) to match DomainGrowthAreaSchema
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLI now prints a plugin migration notice and exits. The --auto flag is no longer recognized by the new main() entry point. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
packages/shared for cross-package type definitions and scoring logic
Changes
feat(plugin): add staged analysis types and infrastructure - Core types, stage DB, hook utils
feat(plugin): add multi-source scanner and analysis engine - Session scanner, evaluation assembler, deterministic scoring
feat(plugin): add session hooks and MCP tools - Session-start hook, get-domain-results, get/save-stage-output tools
feat(plugin): add analysis skills and plugin config - 6 new decomposed analysis skills
build(plugin): update compiled dist output - Rebuilt for marketplace compatibility
feat: add shared package and test infrastructure - Cross-package schemas, scoring, test fixtures + unit tests
refactor: update web app and CLI for plugin analysis pipeline - API routes, dashboard, orchestrator updates
docs: update architecture and plugin documentation - Architecture, deployment, plugin docs
Test Plan
(npx tsc --noEmit)
(npm run build)
(npm test)
Generated with Claude Code using /ship-it