A Spec-Kit extension that orchestrates 19 specialized cognitive functions to handle the complete pre-code phase of software development. From an initial idea or existing codebase, Cognitive Squad autonomously discovers the domain, defines requirements, validates quality against IEEE/ISO standards, evaluates feasibility, designs architecture, builds a test strategy, and produces an estimated implementation plan -- all with evidence-graded confidence and a learning feedback loop that improves accuracy over time.
┌──────────────────────────────────────────────────────────┐
│ TIER 1: CORE SQUAD (7 agents, always active) │
│ │
│ MANAGER → DISCOVER → WHAT → WHY → ASSESS → HOW → PLAN │
└──────────────────────────┬───────────────────────────────┘
│ summons on demand
┌──────────────────────────▼───────────────────────────────┐
│ TIER 2: SPECIALIST POOL (7 specialists) │
│ │
│ SCIENTIST · SECURITY · TEST ARCHITECT · PERFORMANCE │
│ DOMAIN EXPERT · UX/A11Y · INNOVATE │
└──────────────────────────┬───────────────────────────────┘
│ runs after/between
┌──────────────────────────▼───────────────────────────────┐
│ TIER 3: LEARNING LAYER (4 functions + feedback) │
│ │
│ REFLECT · EVOLVE · CALIBRATE · GROUND │
│ + FEEDBACK intake (post-implementation) │
└──────────────────────────────────────────────────────────┘
Totals: 7 core + 7 specialists + 4 learning + 1 feedback intake = 19 cognitive functions.
The MANAGER routes through a state machine, dynamically adapting based on quality gates and domain signals:
INIT → DISCOVER → WHY1 (challenge assumptions)
→ WHAT (define requirements) → WHY2 (validate specs)
→ ASSESS (feasibility / kill gate)
→ [SPECIALISTS: SCIENTIST, SECURITY, DOMAIN, UX, PERFORMANCE]
→ HOW (architecture) → TEST ARCHITECT (mandatory)
→ PLAN (tasks, critical path, risk)
→ CONSENSUS (WHY3 + ASSESS2 + PLAN2 + specialists review)
→ FINALIZE (GROUND + REFLECT + CALIBRATE)
→ DONE
Any step can route back to an earlier stage if quality gates fail. ASSESS can kill a project entirely if it is infeasible. The MANAGER forces convergence after at most 5 iterations.
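The routing loop can be pictured in miniature. Everything below is illustrative: the real MANAGER runs agents rather than a scoring callback, and the stage list and scorer are stand-ins; only the numeric defaults mirror the documented settings (max_iterations, convergence_delta, quality_gates.overall).

```python
# Illustrative sketch of the MANAGER's convergence rule: re-run stages
# until the quality gate passes or the iteration cap is hit.
# `score_stage` is a hypothetical stand-in for a full agent run that
# returns an Understanding-style score in [0, 1].

MAX_ITERATIONS = 5          # analysis.max_iterations
GATE_THRESHOLD = 0.70       # quality_gates.overall
CONVERGENCE_DELTA = 0.02    # analysis.convergence_delta

def run_squad(score_stage, stages=("DISCOVER", "WHAT", "ASSESS", "HOW", "PLAN")):
    """Route through stages, looping while gates fail or scores still improve."""
    history = []
    for iteration in range(1, MAX_ITERATIONS + 1):
        scores = {stage: score_stage(stage, iteration) for stage in stages}
        overall = min(scores.values())  # the weakest stage gates the run
        history.append(overall)
        gates_pass = overall >= GATE_THRESHOLD
        # Stop if gates pass, or if the score has stopped moving (plateau).
        plateaued = (len(history) >= 2
                     and abs(history[-1] - history[-2]) < CONVERGENCE_DELTA)
        if gates_pass or plateaued:
            return {"status": "DONE" if gates_pass else "ESCALATED",
                    "iterations": iteration, "overall": overall}
    return {"status": "FORCED_CONVERGENCE",
            "iterations": MAX_ITERATIONS, "overall": history[-1]}
```

With a scorer that improves each pass, the run converges early; with a flat scorer it plateaus and escalates; otherwise the 5-iteration cap forces convergence.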
Install as a Spec-Kit extension:
# From registry
specify extension add cognitive-squad
# From local path (development)
specify extension add --dev /path/to/cognitive-squad

# Full autonomous run with a project description
/speckit.squad.run "Build a photo album app with sharing and tagging"
# Check progress mid-run
/speckit.squad.status
# After implementation is complete, feed back results
/speckit.squad.feedback 001

| Command | Description | When to use |
|---|---|---|
| /speckit.squad.run | Full autonomous cognitive squad run | Starting a new analysis or re-running on existing specs |
| /speckit.squad.status | Check current squad state and progress | Mid-run monitoring, reviewing prior runs |
| /speckit.squad.innovate | Manually trigger INNOVATE specialist | Stagnation, want alternative approaches |
| /speckit.squad.investigate | Manually trigger SCIENTIST for a question | Need evidence-graded research on a topic |
| /speckit.squad.ground | Manually trigger reality check on artifacts | Validate plans against real-world constraints |
| /speckit.squad.feedback | Post-implementation feedback intake | After building the project, to close the learning loop |
| /speckit.squad.resume | Provide answer to human escalation | Squad asked a question and is waiting for your input |
| Agent | Role | Key Output |
|---|---|---|
| MANAGER | Orchestrator -- routes agents, enforces convergence, resolves conflicts | state.json, routing log |
| DISCOVER | Reconnaissance -- maps domain, glossary, boundaries, assumptions | glossary.md, mental-model.md, boundaries.md |
| WHAT | Requirements definer -- testable specs from discovered territory | spec.md, domain decomposition |
| WHY | Adversarial critic -- finds holes, runs Understanding quality gates | issues.md, quality-gates.md |
| ASSESS | Strategic PM -- feasibility, estimation, prioritization, kill gate | feasibility.md, estimates.md, mvp-scope.md |
| HOW | Architect -- tech stack, data model, API contracts, ADRs | plan.md, data-model.md, contracts/ |
| PLAN | Operational PM -- tasks, critical path, dependencies, risk | tasks.md, critical-path.md, risk-matrix.md |
| Specialist | Trigger | Key Output |
|---|---|---|
| SCIENTIST | Unknowns, unproven tech, conflicting evidence | Investigation reports, experiment results |
| SECURITY | Auth, payments, PII, compliance domains | threat-model.md, compliance-requirements.md |
| TEST ARCHITECT | Mandatory after HOW | test-strategy.md, coverage-map.md |
| DOMAIN EXPERT | Domain-specific knowledge needed | Domain amendments to spec and plan |
| UX / A11Y | Frontend, user-facing features | accessibility-requirements.md, user-flow.md |
| PERFORMANCE | High-load, real-time, scalability needs | performance-requirements.md, capacity-model.md |
| INNOVATE | Stagnation, re-runs, circular reasoning | alternatives.md, challenge-assumptions.md |
| Function | When | Purpose |
|---|---|---|
| REFLECT | End of every run | Extracts patterns and pitfalls to knowledge base |
| EVOLVE | Start/end of re-runs | Diffs artifacts, detects regressions, flags stagnation |
| CALIBRATE | End of run + after feedback | Tracks AI accuracy per domain, adjusts confidence |
| GROUND | During FINALIZE | Reality-checks artifacts against real-world data |
| FEEDBACK | Post-implementation (manual) | Closes prediction-to-outcome loop for calibration |
Copy the template and customize:
cp config-template.yml squad-config.yml

Key settings:
| Setting | Default | Description |
|---|---|---|
| analysis.mode | auto | auto / greenfield / brownfield |
| analysis.max_iterations | 5 | Maximum squad iterations before forced convergence |
| analysis.token_budget_k | 1000 | Approximate token budget (thousands) |
| analysis.convergence_delta | 0.02 | Understanding score delta for convergence |
| specialists.max_active | 3 | Max simultaneous specialists |
| specialists.always_test_architect | true | Always summon TEST ARCHITECT |
| quality_gates.overall | 0.70 | Minimum Understanding overall score |
See config-template.yml for the complete reference.
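As a sketch, a customized squad-config.yml that uses only the keys documented above might look like this (the chosen values are illustrative, not recommendations):

```yaml
# squad-config.yml -- illustrative values; see config-template.yml
analysis:
  mode: brownfield          # auto / greenfield / brownfield
  max_iterations: 5
  token_budget_k: 600       # tighter budget than the 1000k default
  convergence_delta: 0.02
specialists:
  max_active: 2
  always_test_architect: true
quality_gates:
  overall: 0.75             # stricter than the 0.70 default
```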
Cognitive Squad learns over time through YAML knowledge files:
knowledge-base/
├── patterns.yaml # Reusable patterns (validated by REFLECT)
├── pitfalls.yaml # Common mistakes to avoid
├── calibration-profile.yaml # AI accuracy per domain
├── estimates-log.yaml # Predicted vs actual effort
└── feedback/ # Post-implementation outcome data
└── 001-{project}.yaml
The learning loop:
- REFLECT logs patterns and pitfalls after each run
- CALIBRATE tracks prediction accuracy per domain
- FEEDBACK (manual, post-implementation) provides ground truth
- EVOLVE detects stagnation and confirmation bias
- After 5-10 projects with feedback, estimates auto-adjust based on real data
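The auto-adjustment step can be pictured as a simple ratio correction. The field names and formula below are a hypothetical illustration, not the actual calibration-profile.yaml or estimates-log.yaml schema:

```python
# Illustrative calibration: compare predicted vs. actual effort from
# FEEDBACK entries and derive a per-domain correction factor that
# future estimates are multiplied by. Field names are hypothetical.

def calibration_factor(feedback_entries, domain):
    """Mean actual/predicted ratio for one domain (1.0 = perfectly calibrated)."""
    ratios = [e["actual_days"] / e["predicted_days"]
              for e in feedback_entries if e["domain"] == domain]
    if not ratios:
        return 1.0  # no ground truth yet: leave estimates unchanged
    return sum(ratios) / len(ratios)

def calibrated_estimate(raw_estimate_days, feedback_entries, domain):
    """Scale a raw estimate by the domain's observed prediction error."""
    return raw_estimate_days * calibration_factor(feedback_entries, domain)
```

If past web-app projects consistently took 1.5x the predicted effort, a new 10-day raw estimate in that domain would be calibrated up to 15 days, while domains without feedback stay unchanged.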
All research from SCIENTIST is graded for source quality:
| Grade | Description | Examples | Weight |
|---|---|---|---|
| A | Peer-reviewed research, ISO/IEEE standard | IEEE 830, published papers | 1.0 |
| B | Official documentation, proven benchmark | Framework docs, reproducible benchmarks | 0.8 |
| C | Well-regarded blog, conference talk | ThoughtWorks Radar, conference presentations | 0.6 |
| D | Stack Overflow, forum post, anecdotal | Accepted SO answers, Reddit threads | 0.3 |
| E | AI training data (unverified) | LLM-generated without citation | 0.1 |
Higher grade wins in conflicts. Same grade: more recent wins. Experiment validation can upgrade a source from C-E to B.
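These resolution rules can be sketched as follows. The weights come from the table above; the source record shape (claim, grade, year) is a hypothetical illustration, not the extension's actual data model:

```python
# Sketch of the conflict-resolution rules: higher grade wins; on a tie,
# the more recent source wins. Weights are taken from the grading table.

GRADE_WEIGHT = {"A": 1.0, "B": 0.8, "C": 0.6, "D": 0.3, "E": 0.1}

def resolve(sources):
    """Pick the winning source: best grade weight first, then most recent year."""
    return max(sources, key=lambda s: (GRADE_WEIGHT[s["grade"]], s["year"]))

def upgrade_after_experiment(source):
    """Experiment validation promotes a C-E source to grade B."""
    if source["grade"] in ("C", "D", "E"):
        return {**source, "grade": "B"}
    return source
```

So given two conflicting grade-B sources, the newer one wins; and a grade-D forum claim that survives an experiment is re-weighted as B evidence.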
- spec-kit >= 0.3.0 (required)
- understanding >= 3.4.0 (optional, enables WHY quality gates with 31 deterministic metrics)
- spec-kit-reverse-eng >= 1.0.0 (optional, enables brownfield codebase analysis)
- spec-kit -- The specification framework this extension runs on
- understanding -- IEEE/ISO-backed specification quality metrics
- spec-kit-reverse-eng -- Reverse engineering extension for brownfield analysis
MIT -- see LICENSE for details.