Multi-agent AI system that investigates, plans, implements, reviews, and documents, all autonomously.
Give DevMind a task and a GitHub repository. Six specialized agents collaborate to research the codebase, search the web for docs and best practices, design a solution, implement production-ready code, validate it with a code review pass, and generate a Pull Request with full documentation.
$ devmind run "Add rate limiting to the /api/users endpoint" \
--repo owner/repo --stream
🧠 DevMind - Multi-Agent Dev System v2.0
Task: Add rate limiting to the /api/users endpoint
Repo: owner/repo
Mode: streaming
[ORCHESTRATOR] Task type: feature
[ORCHESTRATOR] Phase 1/5 - Scout: investigating...
[SCOUT] Reading 6 key files...
[SCOUT] Searching docs for: redis...
[SCOUT] Searching best practices...
[SCOUT] Research complete.
[ORCHESTRATOR] Phase 2/5 - Planner: designing...
[PLANNER] Plan ready: 4 steps · 2 new · 1 modified · complexity=medium
[ORCHESTRATOR] Phase 3/5 - Coder: implementing...
[CODER] Writing middleware/rate_limiter.py...
[CODER] Writing tests/test_rate_limiter.py...
[CODER] Done: 2 source files, 1 test file.
[ORCHESTRATOR] Phase 4/5 - Reviewer: validating...
[ORCHESTRATOR] ✅ Reviewer: implementation approved.
[ORCHESTRATOR] Phase 5/5 - Scribe: documenting...
[ORCHESTRATOR] ✅ Pipeline complete.
Files: 2 created/modified
Tests: 1 test files
Time: 52.3s · Tokens: 9,841
                      Task input
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                    Orchestrator                     │
│  Classifies task · Routes agents · Manages memory   │
└────┬──────────┬──────────┬──────────┬──────────┬────┘
     │          │          │          │          │
     ▼          ▼          ▼          ▼          ▼
   Scout     Planner     Coder    Reviewer    Scribe
   ─────     ───────     ─────    ────────    ──────
   Reads     Designs     Writes   Validates   Writes
   repo +    solution    code +   code,       PR body
   web       as JSON     tests    auto-fixes  + docs
     │          │          │          │          │
     └──────────┴──────────┬──────────┴──────────┘
                           │
                           ▼
              PR with code + tests + docs
Scout: reads the repository structure, key files, and README, then searches the web for library documentation and best practices relevant to the task. It produces a findings report that combines actual code patterns from the repo with external research.
Planner: takes Scout's findings and designs a precise implementation plan as structured JSON: exact file paths to create or modify, test specs, and estimated complexity.
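A minimal sketch of how such a JSON plan might be parsed, with a fallback for replies that are not valid JSON. The field names mirror the `result.plan` attributes listed in the API reference further down; the `Plan` dataclass and the `parse_plan` helper are hypothetical stand-ins for illustration, not DevMind's actual code.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical stand-in for the plan object the Planner produces."""
    summary: str = ""
    steps: list[str] = field(default_factory=list)
    files_to_create: list[str] = field(default_factory=list)
    files_to_modify: list[str] = field(default_factory=list)
    estimated_complexity: str = "medium"


def parse_plan(raw: str) -> Plan:
    """Parse a JSON plan; fall back to a one-step plan if the reply is not JSON."""
    # Replies may wrap the JSON in prose, so keep only the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        text = raw.strip()
        return Plan(summary=text[:200], steps=[text])
    complexity = data.get("estimated_complexity", "medium")
    if complexity not in {"low", "medium", "high"}:
        complexity = "medium"  # assumed default for unexpected values
    return Plan(
        summary=str(data.get("summary", "")),
        steps=[str(s) for s in data.get("steps", [])],
        files_to_create=[str(p) for p in data.get("files_to_create", [])],
        files_to_modify=[str(p) for p in data.get("files_to_modify", [])],
        estimated_complexity=complexity,
    )
```

Falling back to a single free-text step keeps the pipeline moving when the model ignores the JSON instruction; whether DevMind does the same is an assumption.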
Coder: implements code following the Planner's spec and the repo's existing conventions. It writes complete files with proper error handling, plus real tests with real assertions.
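To illustrate how a single model reply could be split into source files and test files, here is a sketch. The `### FILE: <path>` marker format, the test-routing rule, and the `split_file_blocks` name are all assumptions made for this example; the README does not specify the format the Coder actually uses.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built indirectly to keep this example readable
# Assumed marker format: a line such as "### FILE: path/to/file.py"
FILE_HEADER = re.compile(r"^### FILE: (?P<path>\S+)[ \t]*$", re.MULTILINE)


def split_file_blocks(reply: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split a reply into ({source path: content}, {test path: content})."""
    files: dict[str, str] = {}
    tests: dict[str, str] = {}
    headers = list(FILE_HEADER.finditer(reply))
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(reply)
        lines = reply[header.end():end].strip().splitlines()
        # Content cleaning: drop a wrapping code fence if the model added one.
        if lines and lines[0].startswith(FENCE):
            lines = lines[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        path = header["path"]
        name = path.rsplit("/", 1)[-1]
        # Test routing (assumed rule): anything under tests/ or named test_*.
        is_test = path.startswith("tests/") or name.startswith("test_")
        (tests if is_test else files)[path] = "\n".join(lines) + "\n"
    return files, tests
```

The two returned dictionaries correspond to the shapes of `result.implementation.files` and `result.implementation.tests` in the API reference.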
Reviewer: reviews the Coder's output before PR creation. It checks for bugs, security issues, fake tests, missing imports, and convention violations. If it finds critical issues, it generates corrected files automatically.
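The auto-fix step can be pictured as overlaying corrected files on the Coder's output. The sketch below assumes a review shaped like `{"verdict": ..., "fixed_files": {...}}` and a three-way status; both the schema and the `apply_review` helper are hypothetical.

```python
def apply_review(files: dict[str, str], review: dict) -> tuple[str, dict[str, str]]:
    """Return (status, files), where status is 'approved', 'fixed', or 'rejected'.

    Assumed review shape:
    {"verdict": "approved" | "rejected", "fixed_files": {path: corrected_content}}
    """
    if review.get("verdict") == "approved":
        return "approved", dict(files)
    fixes = review.get("fixed_files") or {}
    if fixes:
        # Corrected files replace the originals; untouched files pass through.
        return "fixed", {**files, **fixes}
    return "rejected", dict(files)
```

Returning a new dictionary instead of mutating the input keeps the Coder's original output available for the pipeline log.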
Scribe: writes the Pull Request body, changelog entry, and inline documentation. The PR body includes a summary, a list of changes, testing instructions, and breaking-change notes.
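As a rough illustration of the four PR body parts named above, the following sketch assembles them into markdown. The section titles and the `build_pr_body` function are assumptions for this example, not the Scribe's actual template.

```python
def build_pr_body(summary: str, changes: list[str], testing: str, breaking: str = "") -> str:
    """Assemble a markdown PR body from summary, changes, testing, and breaking notes."""
    sections = [
        ("Summary", summary),
        ("Changes", "\n".join(f"- {item}" for item in changes)),
        ("Testing", testing),
        ("Breaking changes", breaking or "None."),  # assumed placeholder text
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```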
Orchestrator: classifies the task type (feature/fix/refactor/tests/docs) to decide which agents run, loads memory from previous runs on the same repo, coordinates all agents, and handles agent errors.
Scout now searches the web for library documentation and best practices before writing its findings report. No API key is needed; it uses DuckDuckGo.
[SCOUT] Searching docs for: redis, fastapi...
[SCOUT] Searching best practices...
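One way to picture the two search passes in that log is a small offline step that detects library names and turns them into queries. The watch-list, the query wording, and both function names below are hypothetical; the sketch deliberately stops before any network call.

```python
import re

# Hypothetical watch-list; a real implementation might read requirements.txt instead.
KNOWN_LIBRARIES = ("redis", "fastapi", "flask", "django", "sqlalchemy", "pytest")


def extract_libraries(text: str) -> list[str]:
    """Return the watch-list libraries mentioned in `text`, in watch-list order."""
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    return [lib for lib in KNOWN_LIBRARIES if lib in words]


def build_queries(task: str, libraries: list[str]) -> list[str]:
    """Build one docs query per library plus one best-practices query for the task."""
    queries = [f"{lib} official documentation" for lib in libraries]
    queries.append(f"{task} best practices")
    return queries
```

For a repo whose requirements mention `fastapi` and `redis`, `extract_libraries` yields both names, which lines up with the "Searching docs for: redis, fastapi..." log line.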
The Reviewer runs a dedicated code review pass before the PR is created. It catches security issues, bugs, and fake tests, and can auto-generate fixes for critical issues.
DevMind remembers previous runs on the same repository. Architecture notes, coding conventions, and previous tasks are stored in ~/.devmind/memory/ and injected into subsequent runs.
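A sketch of per-repo JSON memory along these lines is shown below. The file naming scheme, the `conventions` and `tasks` keys, and the helper names are assumptions; only the `~/.devmind/memory/` location and the one-JSON-file-per-repo idea come from this README.

```python
import json
from pathlib import Path

DEFAULT_ROOT = Path.home() / ".devmind" / "memory"


def memory_path(repo: str, root: Path = DEFAULT_ROOT) -> Path:
    # "owner/repo" contains a slash, so flatten it into a safe file name (assumed scheme).
    return root / (repo.replace("/", "__") + ".json")


def load_memory(repo: str, root: Path = DEFAULT_ROOT) -> dict:
    """Load stored memory for a repo, or an empty record if none exists yet."""
    path = memory_path(repo, root)
    if not path.exists():
        return {"conventions": [], "tasks": []}
    return json.loads(path.read_text(encoding="utf-8"))


def remember_task(repo: str, task: str, root: Path = DEFAULT_ROOT) -> dict:
    """Append a task to the repo's history and write the record back to disk."""
    memory = load_memory(repo, root)
    memory["tasks"].append(task)
    root.mkdir(parents=True, exist_ok=True)
    memory_path(repo, root).write_text(json.dumps(memory, indent=2), encoding="utf-8")
    return memory
```

Passing `root` explicitly makes the helpers easy to exercise against a temporary directory rather than the real home folder.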
devmind memory list # See all repos with stored memory
devmind memory clear --repo owner/repo # Clear memory for a repo
See each agent's output as it is generated, in real time:
devmind run "Add tests" --repo owner/repo --stream
The Orchestrator classifies tasks before running:
- feature → full pipeline
- tests_only → Scout + Planner + Coder (tests-focused) + Reviewer
- docs_only → Scout + Scribe
- fix → Scout + Planner + Coder + Reviewer (strict)
- refactor → full pipeline
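The routing list above maps naturally onto a lookup table. This is a sketch only: the lowercase agent identifiers, the `agents_for` helper, and the fallback to the full pipeline for unknown task types are assumptions.

```python
FULL_PIPELINE = ["scout", "planner", "coder", "reviewer", "scribe"]

# Task type -> agents to run, following the routing list in this README.
ROUTES = {
    "feature": FULL_PIPELINE,
    "refactor": FULL_PIPELINE,
    "fix": ["scout", "planner", "coder", "reviewer"],
    "tests_only": ["scout", "planner", "coder", "reviewer"],
    "docs_only": ["scout", "scribe"],
}


def agents_for(task_type: str, skip_review: bool = False) -> list[str]:
    """Return the agents to run for a task type, honoring a skip-review flag."""
    agents = list(ROUTES.get(task_type, FULL_PIPELINE))  # assumed fallback: full pipeline
    if skip_review and "reviewer" in agents:
        agents.remove("reviewer")
    return agents
```

Copying the list before removing the reviewer avoids mutating the shared `FULL_PIPELINE` constant.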
pip install git+https://github.com/cdelhierro5/devmind.git
export ANTHROPIC_API_KEY=sk-ant-...
export GITHUB_TOKEN=ghp_...
# Dry run: analyze and implement without GitHub calls
devmind run "Add input validation to the registration form" \
--repo owner/repo --dry-run
# Stream output in real time
devmind run "Add rate limiting to all API endpoints" \
--repo owner/repo --stream
# Full run + create a real GitHub PR
devmind run "Write tests for the authentication module" \
--repo owner/repo --create-pr
# Skip the Reviewer for faster runs
devmind run "Fix the typo in README" \
--repo owner/repo --skip-review
from devmind.engine import DevMind
dm = DevMind(
anthropic_key="sk-ant-...",
github_token="ghp_...",
)
result = dm.run(
task="Add input validation to the /api/register endpoint",
repo="owner/repo",
create_pr=True,
stream=True, # Stream output to stdout
skip_review=False, # Run Reviewer agent (default)
)
print(result.summary())
# Access individual agent outputs
print(result.scout_findings)
print(result.plan.summary)
print(result.plan.steps)
for path, code in result.implementation.files.items():
print(f"\n--- {path} ---")
print(code)
# Memory management
repos = dm.memory_list() # List repos with memory
dm.memory_clear("owner/repo") # Clear memory for a repo
devmind/
│
├── devmind/
│   ├── engine.py              # DevMind class: public API
│   ├── models.py              # Shared data models
│   ├── cli.py                 # CLI entry point
│   │
│   ├── agents/
│   │   ├── base.py            # BaseAgent: streaming + standard LLM calls
│   │   ├── orchestrator.py    # Smart coordinator with task routing + memory
│   │   ├── scout.py           # Repo investigation + web search
│   │   ├── planner.py         # JSON plan generation
│   │   ├── coder.py           # Code implementation
│   │   ├── reviewer.py        # NEW: code review + auto-fix
│   │   └── scribe.py          # PR body + documentation
│   │
│   ├── providers/
│   │   └── github.py          # GitHub API: fetch context, push, create PR
│   │
│   └── tools/
│       ├── web_search.py      # NEW: DuckDuckGo search (no API key)
│       └── memory.py          # NEW: persistent JSON memory per repo
│
└── tests/
    └── test_devmind.py        # 48 tests, all run without API keys
devmind run <task> --repo <repo> [options]
--repo, -r GitHub repo URL or owner/repo (required)
--create-pr Create a real GitHub Pull Request
--branch Custom branch name (default: devmind/<task-slug>)
--dry-run Skip all GitHub API calls
--stream, -s Stream agent output in real time
--skip-review Skip Reviewer agent (faster)
--verbose, -v Show code previews in output
--no-exit-code Always exit 0
devmind agents List all 6 agents and their roles
devmind memory list List repos with stored memory
devmind memory clear --repo owner/repo
devmind --version
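Two of the options above imply small normalization steps: `--repo` accepts either a URL or `owner/repo`, and `--branch` defaults to `devmind/<task-slug>`. The sketch below shows one plausible way to implement both; the accepted URL forms, the 40-character slug limit, and the function names are assumptions.

```python
import re

REPO_PATTERN = re.compile(
    r"(?:https?://github\.com/|git@github\.com:)?([\w.-]+)/([\w.-]+?)(?:\.git)?/?"
)


def normalize_repo(value: str) -> str:
    """Accept 'owner/repo' or a GitHub URL and return 'owner/repo'."""
    match = REPO_PATTERN.fullmatch(value.strip())
    if not match:
        raise ValueError(f"Not a GitHub repo reference: {value!r}")
    return f"{match.group(1)}/{match.group(2)}"


def default_branch(task: str, max_len: int = 40) -> str:
    """Derive a branch name of the form devmind/<task-slug> from the task text."""
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return "devmind/" + slug[:max_len].rstrip("-")
```

Truncating the slug keeps branch names manageable for long task descriptions.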
result = dm.run(task, repo, create_pr, branch_name, dry_run, stream, skip_review)
result.status # TaskStatus.DONE | FAILED | PENDING
result.scout_findings # str: Scout's research + web search results
result.plan # Plan object
result.plan.summary # str
result.plan.steps # list[str]
result.plan.files_to_create # list[str]
result.plan.files_to_modify # list[str]
result.plan.estimated_complexity # "low" | "medium" | "high"
result.implementation # Implementation object
result.implementation.files # dict[path → content]
result.implementation.tests # dict[path → content]
result.pr_body # str: markdown PR body
result.documentation # str: documentation notes
result.messages # list[AgentMessage]: full pipeline log
result.tokens_used # int
result.elapsed_seconds # float
result.error # str
result.summary() # human-readable summary string
All 48 tests run without API keys; they use mocked Anthropic responses.
git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v
Test coverage:
- Models, data flow, message logging
- GitHub URL parsing, diff parsing
- Planner JSON parsing and fallback handling
- Coder file block parsing, test routing, content cleaning
- Scribe block extraction, fallback PR body
- Reviewer verdict parsing, auto-fix application, approved/rejected paths
- Orchestrator full pipeline, error handling, skip_review flag, task classification
- Scout web search library extraction
- Memory: save/load, task history, context string generation
- Streaming flag propagation through BaseAgent
- DevMind engine: dry run, end-to-end with mocked LLM
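Running tests without API keys usually comes down to patching the method that would call the model. The sketch below shows the pattern with `unittest.mock`; `EchoAgent` and its `call_llm` method are hypothetical stand-ins, not classes from the DevMind test suite.

```python
from unittest.mock import patch


class EchoAgent:
    """Hypothetical agent; `call_llm` stands in for the real Anthropic API call."""

    def call_llm(self, prompt: str) -> str:
        raise RuntimeError("network call attempted without an API key")

    def run(self, task: str) -> str:
        return self.call_llm(f"Plan this task: {task}").strip()


def test_agent_runs_without_api_key():
    # Replace the LLM call with a canned reply so no key or network is needed.
    with patch.object(EchoAgent, "call_llm", return_value='  {"summary": "ok"}  ') as fake:
        result = EchoAgent().run("Add tests")
    assert result == '{"summary": "ok"}'
    fake.assert_called_once_with("Plan this task: Add tests")
```

Because the unpatched `call_llm` raises, any test that forgets to mock it fails immediately instead of trying to reach the network.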
Good tasks (specific and scoped):
"Add Redis-based rate limiting to all API endpoints"
"Write pytest tests for the UserService class"
"Add JWT authentication to the /api/login endpoint"
"Refactor database.py to use a connection pool"
"Add request/response logging middleware"
"Create a /health endpoint with uptime and version info"
"Add pagination to the /api/posts endpoint"
"Extract email sending logic into a dedicated EmailService class"
Less suitable (too vague):
"Make the app better"
"Fix all bugs"
"Redesign the architecture"
- Python 3.11+
- anthropic: the only runtime dependency
- Anthropic API key (Claude Opus)
- GitHub token (for repo access and PR creation)
pip install anthropic
The GitHub token needs the repo scope for private repositories, or public_repo for public ones.
- Interactive mode: review and edit the plan before Coder runs
- GitLab support: same workflow for GitLab merge requests
- Multi-file context: Scout reads full file contents, not just structure
- Self-healing: if tests fail, Coder retries with the error as context
- Parallel Coder: implement multiple files concurrently
- GitHub Issues integration: run DevMind directly from an issue number
New agents, providers, and task examples are especially welcome.
git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v # 48 should pass
License: MIT