danieliser/strategic-planner
Strategic-Planner Plugin Architecture & Patterns

Overview

The strategic-planner plugin is a workflow automation system built around 6 commands that orchestrate multi-phase planning and execution using Agent Teams. Key patterns: structured phases with human approval gates, fan-out/fan-in parallelization, task DAGs for progress tracking, and event-driven hooks for integration.


Command Structure (6 Commands)

1. /strategize — Planning Workflow

Tiers: Quick (3 phases) → Standard (5 phases) → Full (5 phases, 4-5 panelists) → Ultra (dual-team adversarial)

Phases:

  • Phase 1: Intake & Scoping — question framing, premise validation, constraints capture
  • Phase 2: Research (optional researcher agent spawned on Full/Ultra)
  • Phase 3: Spec Drafting (spec-writer agent)
  • Phase 4: Panel Review (2-5 panelists, Delphi-style, 1-3 rounds with convergence checks)
  • Phase 5: Executive Gate (lightweight exec on Standard, multi-exec on Full/Ultra)

Key Features:

  • Optional --red-team flag: Adversarial stress-test before panel
  • Optional --scenario-plan flag: Multi-scenario analysis instead of single recommendation
  • All phases have human approval gates between them
  • Creates task DAG upfront so TaskList tracks progress
  • Emits lifecycle events for integration (intake, question-framed, panel-round-complete, etc.)

Output: PLAN.md with spec, executive summary, acceptance criteria


2. /execute — Implementation Orchestration

Phases:

  • Phase 0: Plan Intake & Critical Review (parse tasks, build DAG, identify gaps)
  • Phase 1: Workspace Setup (create team, spawn reviewer + integrator)
  • Phase 2: Batch Execution Loop (implement batch → review batch → integrate)
  • Phase 3: Completion (final holistic review, lessons learned, shutdown)

Key Mechanics:

  • Full task DAG created upfront with explicit dependencies and batch boundaries
  • Reviewer agent (persistent team member) watches TaskList, runs 2-stage review on each task:
    • Stage 1: Spec compliance (did they build exactly what was requested?)
    • Stage 2: Code quality (tests, error handling, patterns, maintainability)
    • Fix-cycle protocol: max 3 cycles per stage before escalation
  • Integrator agent runs after each batch, checks full test suite + cross-task conflicts
  • Worktree strategy: creates per-implementer worktrees for parallel tasks (with --copy, --symlink, or --skip)
  • Batch gates separate batches; human checkpoint between batches to review progress, amend if needed
  • Final holistic review compares end-to-end against original plan requirements

Output: Committed implementation on feature branch, test results, TaskList showing all tasks completed


3. /audit — Codebase Audit & Modernization

Tiers: Quick (deps + security scan only) → Standard (all vectors) → Deep (+ architecture analysis)

Phases:

  • Phase 0: Project Reconnaissance (understand stack, prior audits, recall audit knowledge)
  • Phase 1: Vector Discovery (automated scan, user approval gate)
  • Phase 2: Parallel Audit (spawn auditor agents, one per vector group; 3 vectors max per agent)
  • Phase 3: Synthesis & Summary (deduplicate, cross-vector analysis, rank findings)
  • Phase 4 (optional): Fix & Loop (execute micro-plans with autonomy levels: gated/supervised/autonomous)
  • Phase 5: Reflection (store audit knowledge + process lessons)

Key Features:

  • Vectors: discrete review axes (dependencies, subsystems, patterns, concerns)
  • Health ratings: Green/Yellow/Red per vector
  • Impact scoring: (Severity × Affected Scope × Fix Cost Inversion) + Trend Weight
  • Micro-plans: executable PLAN.md files for Critical/Major findings only
  • Autonomy levels: gated (human approves each plan), supervised (human approves summary, auto-execute), autonomous (human approves vectors, auto-run cycles)
  • Loop support: iterative audit→fix→re-audit with regression circuit breaker (halts if more findings introduced than fixed)
  • Consensus validation: lead re-validates every Critical finding before including it (2-agent validation cuts false positives ~60%)
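The impact-scoring formula above can be sketched as a small function (an illustrative Python sketch; the 1-5 scales and parameter names are assumptions, not the plugin's actual schema):

```python
def impact_score(severity: int, affected_scope: int, fix_cost: int,
                 trend_weight: float = 0.0, max_cost: int = 5) -> float:
    """Rank a finding: (Severity x Affected Scope x Fix Cost Inversion) + Trend Weight.

    All inputs are assumed 1-5 scales; "Fix Cost Inversion" means cheap
    fixes rank higher, so the raw cost is inverted.
    """
    fix_cost_inversion = (max_cost + 1) - fix_cost  # cost 1 -> 5, cost 5 -> 1
    return severity * affected_scope * fix_cost_inversion + trend_weight

# A severe, widespread, cheap-to-fix finding outranks an equally severe but costly one:
assert impact_score(5, 4, 1) > impact_score(5, 4, 5)
```

The inversion term is what pushes quick wins to the top of the findings list.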

Output: Audit summary with baseline metrics, micro-plans in docs/audits/<date>/micro-plans/, optional auto-execution via /execute


4. /research — Structured Research

Tiers: Quick (5-8 sources, 3-5 min) → Standard (10-15 sources, 5-10 min) → Deep (15-25 sources, 10-20 min, spawn researcher agents)

Process:

  • Question framing (the right question, adjacent questions, wrong questions to avoid, hidden assumptions)
  • Prior research recall (AutoMem + research DB)
  • Execution (query decomposition, codebase analysis on Standard+, cross-source verification on Deep)
  • Save report to research DB with sources, recommendation, confidence levels

Output: Saved research report with cited sources, comparison matrices, actionable recommendations


5. /improve-process — Plugin Self-Improvement

Scope: Focus on specific workflows (strategize, execute, research, audit, agents, storage) or audit all

Phases:

  • Phase 1: Gather evidence (recall lessons-learned, check research DB, read current plugin files, look for patterns)
  • Phase 2: Categorize findings (Bugs, Pattern Shifts, Role/Tier Gaps, Process Friction, One-Offs, Self-Improvement)
  • Phase 3: Propose changes (show all proposals with evidence, risk analysis)
  • Phase 4: Human review (user approves, rejects, defers, modifies)
  • Phase 5: Implement approved changes
  • Phase 6: Record what changed (store improvement memory, associate with addressed lessons)

Output: Updated plugin files, improvement history in AutoMem


6. /troubleshoot — Diagnostic & Recovery

Not covered in detail here; it follows the same orchestration patterns as the other commands (phased workflow, human gates, lifecycle events).


Agent Structure (21 Specialized Agents)

Core Execution Agents

  • reviewer.md: Two-stage code reviewer (spec compliance → quality). Persistent team member. Monitors TaskList, sends fix requests via SendMessage.
  • implementer.md: Single-task implementation. Receives full task spec via TaskCreate description (not external files). TDD workflow, self-review before reporting.
  • integrator.md: Runs integration checks after batches (full test suite + cross-task conflicts).

Planning & Research Agents

  • researcher.md: Deep research execution. Applied on Full/Ultra tier planning. Queries, codebase analysis, cross-source verification.
  • spec-writer.md: Spec drafting & revision. Takes intake + research, produces spec. Includes pre-mortem stress-test before finalizing. Re-engages between panel rounds.

Panel Review Agents

  • panelist.md: Generic review panelist with role-injection at spawn. Scores 6 dimensions, provides structured feedback.
  • panelist-devils-advocate.md: Mandatory panelist role. Escalates dismissed concerns.
  • panelist-architect.md, panelist-user-advocate.md, panelist-ops-pragmatist.md, panelist-domain-expert.md, panelist-business-analyst.md, panelist-security-analyst.md: Role-specific versions with deeper instructions.
  • panelist-red-team.md, panelist-security-architect.md, panelist-systems-engineer.md, panelist-innovation-researcher.md: Additional roles for Ultra tier.
  • panelist-reconciliation.md: Compares two adversarial plans (Ultra tier only).

Audit & Improvement Agents

  • auditor.md: Per-vector auditor. Phase 1: read & map code, assess health. Phase 2: research alternatives. Phase 3: classify findings (observation/inference/evidence). Phase 4: self-validate. Phase 5: write micro-plans (Critical/Major only). Phase 6: report.
  • exec-reviewer.md: Executive reviewer (Standard/Full/Ultra, role varies).
  • troubleshooter.md: Diagnostic & recovery agent.

Skill Structure (6 Core Skills)

1. task-patterns — Task DAG Conventions

  • Gate tasks (start with "Gate:", activeForm = "Awaiting approval...")
  • Fan-out/fan-in (N parallel tasks blocked by predecessor, synthesis blocked by all N)
  • Batch boundaries (sequential batches separated by integration gates)
  • Phase lifecycle (mark task in_progress at phase start, completed at phase end)
  • activeForm conventions (present continuous tense)
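The fan-out/fan-in convention can be modeled as plain data (a hypothetical structure for illustration, not the actual TaskList schema):

```python
# Each task lists the task IDs that must complete before it can start.
tasks = {
    "gate":      {"blocked_by": [],        "status": "completed"},
    "panel-1":   {"blocked_by": ["gate"],  "status": "pending"},
    "panel-2":   {"blocked_by": ["gate"],  "status": "pending"},
    "panel-3":   {"blocked_by": ["gate"],  "status": "pending"},
    "synthesis": {"blocked_by": ["panel-1", "panel-2", "panel-3"],
                  "status": "pending"},
}

def unblocked(tasks):
    """Tasks whose blockers are all completed -- what an agent polls for."""
    return [tid for tid, t in tasks.items()
            if t["status"] == "pending"
            and all(tasks[b]["status"] == "completed" for b in t["blocked_by"])]

# Once the gate completes, the three panelists fan out in parallel;
# synthesis stays blocked until all of them fan back in.
```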

2. execution-methodology — Plan Parsing & DAG Construction

  • Plan parsing (extract task ID, spec, files, explicit dependencies)
  • Dependency inference (file overlap, imports, type relationships)
  • DAG to batch schedule (topological sort into parallelizable batches)
  • File ownership assignment (no two agents in same batch modify same file)
  • Task description template (full spec text, acceptance criteria, verify commands, files, context)
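The DAG-to-batch and file-ownership steps above amount to layered topological sorting with a per-batch conflict check; a minimal sketch under assumed data shapes:

```python
def batch_schedule(deps, files):
    """Group tasks into parallelizable batches.

    A task runs in the first batch after all its dependencies complete,
    and no two tasks in one batch may modify the same file.

    deps:  task -> set of prerequisite tasks
    files: task -> set of files the task modifies
    """
    done, batches = set(), []
    while len(done) < len(deps):
        ready = [t for t in deps if t not in done and deps[t] <= done]
        if not ready:
            raise ValueError("dependency cycle")
        batch, owned = [], set()
        for t in ready:
            if files[t] & owned:   # file conflict: defer to a later batch
                continue
            batch.append(t)
            owned |= files[t]
        batches.append(batch)
        done |= set(batch)
    return batches
```

Tasks "a" and "b" with no dependency edge but overlapping files end up in separate batches, which is the ownership rule doing its job.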

3. review-gates — Review Mechanics

  • Two-stage review: Stage 1 (spec compliance, read actual code) → Stage 2 (code quality)
  • Fix-cycle protocol (max 3 cycles per stage, escalate after)
  • Integration checkpoints (after every batch: full test suite + cross-task conflicts)
  • Final holistic review (reviewer pass over full diff against original requirements)
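The fix-cycle protocol is a bounded retry loop per stage; a sketch where `review` and `request_fix` are hypothetical callbacks standing in for the reviewer's actual tooling:

```python
def run_stage(review, request_fix, max_cycles=3):
    """Run one review stage: re-review after each fix, escalate past max_cycles.

    review()             -> list of issues (empty list means the stage passes)
    request_fix(issues, cycle) -> asks the implementer to address the issues
    """
    for cycle in range(1, max_cycles + 1):
        issues = review()
        if not issues:
            return "pass"
        request_fix(issues, cycle)
    return "escalate"   # max_cycles failed fix attempts: hand back to the lead
```

Stage 2 only starts once Stage 1 returns "pass"; an "escalate" stops the task and surfaces it at the next human checkpoint.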

4. panel-review — Panel Review Process

  • Team composition (Standard: 3 panelists; Full: 4-5; Ultra: two adversarial teams of 4 each)
  • Round structure (R1: independent scoring → R2: deliberation + re-score → R3 optional if no convergence)
  • Scoring rubric: 6 dimensions (Problem-Solution Fit, Feasibility, Completeness, Risk Awareness, Clarity, Elegance), 1-5 scale
  • Convergence check (std dev ≤ 0.75 on all dimensions)
  • Advancement thresholds (avg ≥ 3.5 auto-advance; < 3.0 blocked unless addressed)
  • Timeouts (7 min response, ping at 5 min)
  • Team lifecycle (keep panelists alive through exec review to preserve context; despawn only after final decision)
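The convergence and advancement checks above reduce to simple statistics; a sketch assuming one list of 1-5 scores per dimension (using population standard deviation, which is an assumption):

```python
from statistics import mean, pstdev

def converged(scores_by_dim, threshold=0.75):
    """Panel converges when score spread is within threshold on every dimension."""
    return all(pstdev(scores) <= threshold for scores in scores_by_dim.values())

def verdict(scores_by_dim):
    """avg >= 3.5 auto-advances; < 3.0 is blocked until concerns are addressed."""
    avg = mean(s for scores in scores_by_dim.values() for s in scores)
    if avg >= 3.5:
        return "advance"
    return "blocked" if avg < 3.0 else "discuss"
```

A panel scoring [1, 3, 5] on any dimension fails convergence (spread ≈ 1.63), which is what triggers the optional Round 3.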

5. audit-methodology — Audit Protocol

  • Vector definition (discrete review axis: dependency, subsystem, pattern, concern)
  • Vector identification (10 categories: dependencies, runtime, build, database, security, API, state, testing, observability, infra)
  • Health factors (CAST APPMARQ: Robustness, Security, Efficiency, Transferability, Changeability)
  • Impact scoring formula ((Severity × Affected Scope × Fix Cost Inversion) + Trend Weight)
  • Per-vector review (Phase 1: read/map, Phase 2: assess health, Phase 3: research alternatives, Phase 4: classify findings with observation/inference/evidence, Phase 5: write micro-plans, Phase 6: report)
  • Micro-plan format (self-contained, executable, includes risks + verification commands + sources)
  • Audit summary format (executive summary, vector table, findings by severity, micro-plan index)

6. spec-format — Spec Writing Guidance

  • Structure: Problem Statement → Solution → Architecture → Acceptance Criteria → Risks/Mitigations → Dependencies → Test Strategy → Non-Goals
  • Quality standards (architecture choices have "why", risks are specific, acceptance criteria testable, non-goals prevent scope creep)
  • Common mistakes (solution looking for problem, vague criteria, missing non-goals, generic risks, no test strategy, architecture without rationale)
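An illustrative skeleton of that structure (not the plugin's actual spec-template.md):

```markdown
# Spec: <title>

## Problem Statement
## Solution
## Architecture
<!-- each choice states *why*, not just *what* -->
## Acceptance Criteria
- [ ] Testable, specific criteria: no "works correctly"
## Risks & Mitigations
<!-- specific risks, not generic ones -->
## Dependencies
## Test Strategy
## Non-Goals
<!-- explicit exclusions to prevent scope creep -->
```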

Orchestration Patterns

1. Human Gates Between Phases

Every major decision point stops and waits for user confirmation:

  • Spec approval (before panel review)
  • Panel advancement (after Round 2, checking thresholds)
  • Exec decision (Go / Conditional Go / Kill)
  • Batch acceptance (user reviews progress, amends next batch, or aborts)

2. Fan-Out/Fan-In Parallelization

Multiple agents work in parallel, results synthesized:

  • Panel review: spec approval gate → N panelists (fan-out) → synthesis (fan-in)
  • Batch execution: batch gate → N implementers (fan-out) → N reviewers → integration gate (fan-in)
  • Audit: vector approval gate → N auditors (fan-out) → synthesis (fan-in)

3. Event-Driven Hooks

Emit lifecycle events for integration (if agent-hooks available):

  • strategize.intake.completed, strategize.question.framed, strategize.research.completed, strategize.spec.drafted, strategize.panel.round.complete, strategize.panel.completed, strategize.exec_gate.decided, strategize.finalize.started
  • execute.intake.completed, execute.batch.started, execute.batch.completed, execute.review.completed, execute.finalize.started
  • audit.session.started, audit.vectors.identified, audit.vector.completed, audit.session.completed, audit.cycle.started, audit.cycle.completed, audit.loop.completed
  • research.question.framed, research.execution.completed, research.report.saved, research.completed
  • troubleshoot.session.started, troubleshoot.phase.entered, troubleshoot.gate.checked, troubleshoot.rootcause.identified, troubleshoot.session.completed
  • validate.session.started, validate.decompose.completed, validate.verify.completed, validate.challenge.completed, validate.session.completed
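Event names follow a `<command>.<subject>.<verb>` convention. A consumer-side sketch (the `emit` helper and JSON-lines payload shape are hypothetical, not the agent-hooks API):

```python
import json
import time

def emit(event: str, **payload):
    """Emit one lifecycle event as a JSON line; in the real plugin, delivery
    would go through agent-hooks when that integration is installed."""
    record = {"event": event, "ts": time.time(), **payload}
    print(json.dumps(record))
    return record

emit("execute.batch.completed", batch=2, tasks_done=5)
```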

4. Task DAG for Visibility

Full task graph created upfront (not dynamically). Uses:

  • TaskCreate to define all tasks with descriptions
  • TaskUpdate to wire dependencies (addBlockedBy, addBlocks)
  • TaskList to track live progress
  • Phase tasks to mark in_progress/completed
  • Implementers/reviewers check TaskList to find unblocked work

5. Memory Management

AutoMem integration throughout:

  • Recall at phase start (prior research, audit knowledge, lessons-learned, code patterns)
  • Store at phase end (findings, process insights, decisions, rejected proposals)
  • Associations link related memories (EVOLVES_INTO, INVALIDATED_BY, DERIVED_FROM, etc.)
  • Different memory types: audit-knowledge, lessons-learned, patterns, preferences, decisions, insights

6. Configuration via CLAUDE.md or .claude/strategic-planner.local.md

Frontmatter YAML config:

  • automem_enabled (default: true)
  • default_model (default: sonnet)
  • worktree_strategy (copy/symlink/skip)
  • audit_output_dir, audit_autonomy, research_db_path, research_reports_path
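A minimal example of that config block (the key names come from the list above; the values for the last four keys are illustrative assumptions):

```yaml
# .claude/strategic-planner.local.md frontmatter (or the equivalent CLAUDE.md block)
automem_enabled: true           # default: true
default_model: sonnet           # default: sonnet
worktree_strategy: copy         # copy | symlink | skip
audit_output_dir: docs/audits
audit_autonomy: gated           # gated | supervised | autonomous
research_db_path: .claude/research.db
research_reports_path: docs/research
```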

Communication Patterns

Agent-to-Lead

  • Phase completion via SendMessage (type: "message", include summary)
  • Escalations (blockers, divergent scores, regressions)
  • Questions requiring human judgment

Lead-to-Agent

  • SendMessage with clarifications, approvals, next-phase signals
  • SendMessage to integrator: "Batch [N] merged. Run integration check."
  • SendMessage to reviewer: "All tasks complete. Run final holistic review."

Inter-Agent (Panelists)

  • DMs via SendMessage to debate specific disagreements
  • Devil's advocate escalates dismissed concerns to lead

Lead-to-User

  • Present gate decisions with context (scores, divergence, recommendations)
  • Batch checkpoints with implementation summary, review verdicts, integration status
  • Final reports with TaskList showing completion

Key Design Principles

  1. Evidence Before Assertions — Every finding needs code reference + research source. No guessing.
  2. Structured Deliberation — Debate specific disagreements, not just re-score. Devil's advocate escalates dismissed concerns.
  3. Two-Stage Review — Spec compliance first (read actual code, ignore reports), then code quality.
  4. False Positive Prevention — Auditors self-validate findings. Lead re-validates Critical findings (2-agent validation cuts false positives ~60%).
  5. Batch Boundaries — Integration gates after every batch, not just end. Catch regressions early.
  6. File Ownership — No two agents in same batch touch same file. Prevents merge conflicts + data loss.
  7. Self-Improvement Loop — /improve-process reviews lessons-learned, proposes changes, validates with user before implementing.
  8. Regression Circuit Breaker — Audit loop halts if more findings introduced than fixed (prevents runaway loops).
  9. Convergence Thresholds — Panel scoring must converge (std dev ≤ 0.75) or advance with documented disagreement.
  10. Lifecycle Preservation — Keep agents alive through relevant phases (panelists through exec review, reviewer through final holistic review) to preserve context.

Project Structure

strategic-planner/
├── .claude-plugin/
│   └── plugin.json              # Registration + MCP server config
├── commands/
│   ├── strategize.md            # Planning workflow (51KB+)
│   ├── execute.md               # Implementation orchestration
│   ├── audit.md                 # Codebase audit
│   ├── research.md              # Structured research
│   ├── improve-process.md       # Plugin self-improvement
│   └── troubleshoot.md          # Diagnostic & recovery
├── agents/
│   ├── reviewer.md              # Code review (2-stage)
│   ├── implementer.md           # Single-task implementation
│   ├── integrator.md            # Integration checks
│   ├── researcher.md            # Deep research
│   ├── spec-writer.md           # Spec drafting & revision
│   ├── panelist.md              # Generic panelist (role-injected)
│   ├── panelist-*.md            # 11 specialized roles
│   ├── auditor.md               # Per-vector auditor
│   ├── exec-reviewer.md         # Executive review
│   └── troubleshooter.md        # Diagnostic
├── skills/
│   ├── task-patterns/
│   │   └── SKILL.md
│   ├── execution-methodology/
│   │   └── SKILL.md
│   ├── review-gates/
│   │   └── SKILL.md
│   ├── panel-review/
│   │   ├── SKILL.md
│   │   └── references/
│   │       ├── scoring-rubric.md
│   │       └── role-catalog.md
│   ├── audit-methodology/
│   │   └── SKILL.md
│   ├── spec-format/
│   │   ├── SKILL.md
│   │   └── references/
│   │       └── spec-template.md
│   └── planning-methodology/
│       └── SKILL.md
└── mcp/
    └── server.js                # Research DB MCP server
