🧠 DevMind

Multi-agent AI system that investigates, plans, implements, reviews, and documents — autonomously.

Give DevMind a task and a GitHub repository. Six specialized agents collaborate to research the codebase, search the web for docs and best practices, design a solution, implement production-ready code, validate it with a code review pass, and generate a Pull Request with full documentation.

What it does

$ devmind run "Add rate limiting to the /api/users endpoint" \
    --repo owner/repo --stream

  🧠 DevMind — Multi-Agent Dev System v2.0

  Task:   Add rate limiting to the /api/users endpoint
  Repo:   owner/repo
  Mode:   streaming

  [ORCHESTRATOR] Task type: feature
  [ORCHESTRATOR] Phase 1/5 — Scout: investigating...
  [SCOUT] Reading 6 key files...
  [SCOUT] Searching docs for: redis...
  [SCOUT] Searching best practices...
  [SCOUT] Research complete.
  [ORCHESTRATOR] Phase 2/5 — Planner: designing...
  [PLANNER] Plan ready: 4 steps · 2 new · 1 modified · complexity=medium
  [ORCHESTRATOR] Phase 3/5 — Coder: implementing...
  [CODER]   Writing middleware/rate_limiter.py...
  [CODER]   Writing tests/test_rate_limiter.py...
  [CODER] Done: 2 source files, 1 test file.
  [ORCHESTRATOR] Phase 4/5 — Reviewer: validating...
  [ORCHESTRATOR] ✅ Reviewer: implementation approved.
  [ORCHESTRATOR] Phase 5/5 — Scribe: documenting...
  [ORCHESTRATOR] ✅ Pipeline complete.

  Files:   2 created/modified
  Tests:   1 test files
  Time:    52.3s · Tokens: 9,841

The 6 agents

Task input
    │
    ▼
┌────────────────────────────────────────────────────┐
│              🎯 Orchestrator                        │
│  Classifies task · Routes agents · Manages memory  │
└──┬──────┬──────────┬──────────┬──────────┬──────────┘
   │      │          │          │          │
   ▼      ▼          ▼          ▼          ▼
 🔍      📐         💻         🔍         📝
Scout  Planner    Coder    Reviewer    Scribe
─────  ───────    ─────    ────────    ──────
Reads  Designs    Writes   Validates   Writes
repo + solution   code +   code,       PR body
web    as JSON    tests    auto-fixes  + docs
   │      │          │          │          │
   └──────┴──────────┴──────────┴──────────┘
                    │
                    ▼
         PR with code + tests + docs

🔍 Scout — Research agent

Reads repository structure, key files, and README. Then searches the web for library documentation and best practices relevant to the task. Produces a findings report with actual code patterns from the repo plus external research.

📐 Planner — Design agent

Takes Scout's findings and designs a precise implementation plan in structured JSON: exact file paths to create or modify, test specs, and estimated complexity.

💻 Coder — Implementation agent

Implements code following the Planner's spec and the repo's exact conventions. Writes complete files with proper error handling. Writes real tests with real assertions.

🔍 Reviewer — Validation agent (new in v2)

Reviews the Coder's output before PR creation. Checks for bugs, security issues, fake tests, missing imports, and convention violations. If critical issues found, generates corrected files automatically.

📝 Scribe — Documentation agent

Writes the Pull Request body, changelog entry, and inline documentation. The PR body includes summary, changes list, testing instructions, and breaking change notes.

🎯 Orchestrator — Intelligence layer

Classifies the task type (feature/fix/refactor/tests/docs) to route optimally. Loads memory from previous runs on the same repo. Coordinates all agents and handles errors gracefully.

What's new in v2.0

Web search for Scout

Scout now searches the web for library documentation and best practices before writing its findings report. No API key needed — uses DuckDuckGo.

[SCOUT] Searching docs for: redis, fastapi...
[SCOUT] Searching best practices...

Reviewer agent (6th agent)

A dedicated code review pass before the PR is created. Catches security issues, bugs, and fake tests. Can auto-generate fixes for critical issues.

Memory system

DevMind remembers previous runs on the same repository. Architecture notes, coding conventions, and previous tasks are stored in ~/.devmind/memory/ and injected into subsequent runs.

devmind memory list              # See all repos with stored memory
devmind memory clear --repo owner/repo  # Clear memory for a repo

Streaming output

See each agent's output as it's generated in real time:

devmind run "Add tests" --repo owner/repo --stream

Smart task routing

The Orchestrator classifies tasks before running:

feature → full pipeline
tests_only → Scout + Planner + Coder (tests-focused) + Reviewer
docs_only → Scout + Scribe
fix → Scout + Planner + Coder + Reviewer (strict)
refactor → full pipeline

Quick start

Install

pip install git+https://github.com/cdelhierro5/devmind.git

Set API keys

export ANTHROPIC_API_KEY=sk-ant-...
export GITHUB_TOKEN=ghp_...

Run

# Dry run — analyze and implement without GitHub calls
devmind run "Add input validation to the registration form" \
  --repo owner/repo --dry-run

# Stream output in real time
devmind run "Add rate limiting to all API endpoints" \
  --repo owner/repo --stream

# Full run + create a real GitHub PR
devmind run "Write tests for the authentication module" \
  --repo owner/repo --create-pr

# Skip the Reviewer for faster runs
devmind run "Fix the typo in README" \
  --repo owner/repo --skip-review

Python API

from devmind.engine import DevMind

dm = DevMind(
    anthropic_key="sk-ant-...",
    github_token="ghp_...",
)

result = dm.run(
    task="Add input validation to the /api/register endpoint",
    repo="owner/repo",
    create_pr=True,
    stream=True,         # Stream output to stdout
    skip_review=False,   # Run Reviewer agent (default)
)

print(result.summary())

# Access individual agent outputs
print(result.scout_findings)
print(result.plan.summary)
print(result.plan.steps)

for path, code in result.implementation.files.items():
    print(f"\n--- {path} ---")
    print(code)

# Memory management
repos = dm.memory_list()        # List repos with memory
dm.memory_clear("owner/repo")  # Clear memory for a repo

Architecture

devmind/
│
├── devmind/
│   ├── engine.py               # DevMind class — public API
│   ├── models.py               # Shared data models
│   ├── cli.py                  # CLI entry point
│   │
│   ├── agents/
│   │   ├── base.py             # BaseAgent — streaming + standard LLM calls
│   │   ├── orchestrator.py     # Smart coordinator with task routing + memory
│   │   ├── scout.py            # Repo investigation + web search
│   │   ├── planner.py          # JSON plan generation
│   │   ├── coder.py            # Code implementation
│   │   ├── reviewer.py         # NEW: code review + auto-fix
│   │   └── scribe.py           # PR body + documentation
│   │
│   ├── providers/
│   │   └── github.py           # GitHub API — fetch context, push, create PR
│   │
│   └── tools/
│       ├── web_search.py       # NEW: DuckDuckGo search (no API key)
│       └── memory.py           # NEW: persistent JSON memory per repo
│
└── tests/
    └── test_devmind.py         # 48 tests — all run without API keys

CLI reference

devmind run <task> --repo <repo> [options]

  --repo, -r          GitHub repo URL or owner/repo (required)
  --create-pr         Create a real GitHub Pull Request
  --branch            Custom branch name (default: devmind/<task-slug>)
  --dry-run           Skip all GitHub API calls
  --stream, -s        Stream agent output in real time
  --skip-review       Skip Reviewer agent (faster)
  --verbose, -v       Show code previews in output
  --no-exit-code      Always exit 0

devmind agents          List all 6 agents and their roles
devmind memory list     List repos with stored memory
devmind memory clear --repo owner/repo
devmind --version

Python API reference

result = dm.run(task, repo, create_pr, branch_name, dry_run, stream, skip_review)

result.status              # TaskStatus.DONE | FAILED | PENDING
result.scout_findings      # str — Scout's research + web search results
result.plan                # Plan object
result.plan.summary        # str
result.plan.steps          # list[str]
result.plan.files_to_create   # list[str]
result.plan.files_to_modify   # list[str]
result.plan.estimated_complexity  # "low" | "medium" | "high"
result.implementation      # Implementation object
result.implementation.files   # dict[path → content]
result.implementation.tests   # dict[path → content]
result.pr_body             # str — markdown PR body
result.documentation       # str — documentation notes
result.messages            # list[AgentMessage] — full pipeline log
result.tokens_used         # int
result.elapsed_seconds     # float
result.error               # str
result.summary()           # human-readable summary string

Running tests

All 48 tests run without API keys — they use mocked Anthropic responses.

git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v

Test coverage:

Models, data flow, message logging
GitHub URL parsing, diff parsing
Planner JSON parsing and fallback handling
Coder file block parsing, test routing, content cleaning
Scribe block extraction, fallback PR body
Reviewer verdict parsing, auto-fix application, approved/rejected paths
Orchestrator full pipeline, error handling, skip_review flag, task classification
Scout web search library extraction
Memory: save/load, task history, context string generation
Streaming flag propagation through BaseAgent
DevMind engine: dry run, end-to-end with mocked LLM

What tasks work best

Good tasks (specific and scoped):

"Add Redis-based rate limiting to all API endpoints"
"Write pytest tests for the UserService class"
"Add JWT authentication to the /api/login endpoint"
"Refactor database.py to use a connection pool"
"Add request/response logging middleware"
"Create a /health endpoint with uptime and version info"
"Add pagination to the /api/posts endpoint"
"Extract email sending logic into a dedicated EmailService class"

Less suitable (too vague):

"Make the app better"
"Fix all bugs"
"Redesign the architecture"

Requirements

Python 3.11+
anthropic — the only runtime dependency
Anthropic API key (Claude Opus)
GitHub token (for repo access and PR creation)

pip install anthropic

GitHub token needs repo scope for private repos or public_repo for public.

Roadmap

Interactive mode — review and edit the plan before Coder runs
GitLab support — same workflow for GitLab merge requests
Multi-file context — Scout reads full file contents, not just structure
Self-healing — if tests fail, Coder retries with the error as context
Parallel Coder — implement multiple files concurrently
GitHub Issues integration — run DevMind directly from an issue number

Contributing

New agents, providers, and task examples are especially welcome.

git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v   # 48 should pass

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
README.md		README.md
coder.py		coder.py
engine.py		engine.py
memory.py		memory.py
orchestrator.py		orchestrator.py
planner.py		planner.py
reviewer.py		reviewer.py
scout.py		scout.py
scribe.py		scribe.py
test_devmind.py		test_devmind.py
web_search.py		web_search.py

Folders and files

Latest commit

History

Repository files navigation

🧠 DevMind

What it does

The 6 agents

🔍 Scout — Research agent

📐 Planner — Design agent

💻 Coder — Implementation agent

🔍 Reviewer — Validation agent (new in v2)

📝 Scribe — Documentation agent

🎯 Orchestrator — Intelligence layer

What's new in v2.0

Web search for Scout

Reviewer agent (6th agent)

Memory system

Streaming output

Smart task routing

Quick start

Install

Set API keys

Run

Python API

Architecture

CLI reference

Python API reference

Running tests

What tasks work best

Requirements

Roadmap

Contributing

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages