Skip to content

cdelhierro5/devmind

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

2 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

🧠 DevMind

Multi-agent AI system that investigates, plans, implements, reviews, and documents β€” autonomously.

Give DevMind a task and a GitHub repository. Six specialized agents collaborate to research the codebase, search the web for docs and best practices, design a solution, implement production-ready code, validate it with a code review pass, and generate a Pull Request with full documentation.

Tests Python Powered by License


What it does

$ devmind run "Add rate limiting to the /api/users endpoint" \
    --repo owner/repo --stream

  🧠 DevMind β€” Multi-Agent Dev System v2.0

  Task:   Add rate limiting to the /api/users endpoint
  Repo:   owner/repo
  Mode:   streaming

  [ORCHESTRATOR] Task type: feature
  [ORCHESTRATOR] Phase 1/5 β€” Scout: investigating...
  [SCOUT] Reading 6 key files...
  [SCOUT] Searching docs for: redis...
  [SCOUT] Searching best practices...
  [SCOUT] Research complete.
  [ORCHESTRATOR] Phase 2/5 β€” Planner: designing...
  [PLANNER] Plan ready: 4 steps Β· 2 new Β· 1 modified Β· complexity=medium
  [ORCHESTRATOR] Phase 3/5 β€” Coder: implementing...
  [CODER]   Writing middleware/rate_limiter.py...
  [CODER]   Writing tests/test_rate_limiter.py...
  [CODER] Done: 2 source files, 1 test file.
  [ORCHESTRATOR] Phase 4/5 β€” Reviewer: validating...
  [ORCHESTRATOR] βœ… Reviewer: implementation approved.
  [ORCHESTRATOR] Phase 5/5 β€” Scribe: documenting...
  [ORCHESTRATOR] βœ… Pipeline complete.

  Files:   2 created/modified
  Tests:   1 test files
  Time:    52.3s Β· Tokens: 9,841

The 6 agents

Task input
    β”‚
    β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              🎯 Orchestrator                        β”‚
β”‚  Classifies task Β· Routes agents Β· Manages memory  β”‚
β””β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
   β”‚      β”‚          β”‚          β”‚          β”‚
   β–Ό      β–Ό          β–Ό          β–Ό          β–Ό
 πŸ”      πŸ“         πŸ’»         πŸ”         πŸ“
Scout  Planner    Coder    Reviewer    Scribe
─────  ───────    ─────    ────────    ──────
Reads  Designs    Writes   Validates   Writes
repo + solution   code +   code,       PR body
web    as JSON    tests    auto-fixes  + docs
   β”‚      β”‚          β”‚          β”‚          β”‚
   β””β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚
                    β–Ό
         PR with code + tests + docs

πŸ” Scout β€” Research agent

Reads repository structure, key files, and README. Then searches the web for library documentation and best practices relevant to the task. Produces a findings report with actual code patterns from the repo plus external research.

πŸ“ Planner β€” Design agent

Takes Scout's findings and designs a precise implementation plan in structured JSON: exact file paths to create or modify, test specs, and estimated complexity.

πŸ’» Coder β€” Implementation agent

Implements code following the Planner's spec and the repo's exact conventions. Writes complete files with proper error handling. Writes real tests with real assertions.

πŸ” Reviewer β€” Validation agent (new in v2)

Reviews the Coder's output before PR creation. Checks for bugs, security issues, fake tests, missing imports, and convention violations. If critical issues found, generates corrected files automatically.

πŸ“ Scribe β€” Documentation agent

Writes the Pull Request body, changelog entry, and inline documentation. The PR body includes summary, changes list, testing instructions, and breaking change notes.

🎯 Orchestrator β€” Intelligence layer

Classifies the task type (feature/fix/refactor/tests/docs) to route optimally. Loads memory from previous runs on the same repo. Coordinates all agents and handles errors gracefully.


What's new in v2.0

Web search for Scout

Scout now searches the web for library documentation and best practices before writing its findings report. No API key needed β€” uses DuckDuckGo.

[SCOUT] Searching docs for: redis, fastapi...
[SCOUT] Searching best practices...

Reviewer agent (6th agent)

A dedicated code review pass before the PR is created. Catches security issues, bugs, and fake tests. Can auto-generate fixes for critical issues.

Memory system

DevMind remembers previous runs on the same repository. Architecture notes, coding conventions, and previous tasks are stored in ~/.devmind/memory/ and injected into subsequent runs.

devmind memory list              # See all repos with stored memory
devmind memory clear --repo owner/repo  # Clear memory for a repo

Streaming output

See each agent's output as it's generated in real time:

devmind run "Add tests" --repo owner/repo --stream

Smart task routing

The Orchestrator classifies tasks before running:

  • feature β†’ full pipeline
  • tests_only β†’ Scout + Planner + Coder (tests-focused) + Reviewer
  • docs_only β†’ Scout + Scribe
  • fix β†’ Scout + Planner + Coder + Reviewer (strict)
  • refactor β†’ full pipeline

Quick start

Install

pip install git+https://github.com/cdelhierro5/devmind.git

Set API keys

export ANTHROPIC_API_KEY=sk-ant-...
export GITHUB_TOKEN=ghp_...

Run

# Dry run β€” analyze and implement without GitHub calls
devmind run "Add input validation to the registration form" \
  --repo owner/repo --dry-run

# Stream output in real time
devmind run "Add rate limiting to all API endpoints" \
  --repo owner/repo --stream

# Full run + create a real GitHub PR
devmind run "Write tests for the authentication module" \
  --repo owner/repo --create-pr

# Skip the Reviewer for faster runs
devmind run "Fix the typo in README" \
  --repo owner/repo --skip-review

Python API

from devmind.engine import DevMind

dm = DevMind(
    anthropic_key="sk-ant-...",
    github_token="ghp_...",
)

result = dm.run(
    task="Add input validation to the /api/register endpoint",
    repo="owner/repo",
    create_pr=True,
    stream=True,         # Stream output to stdout
    skip_review=False,   # Run Reviewer agent (default)
)

print(result.summary())

# Access individual agent outputs
print(result.scout_findings)
print(result.plan.summary)
print(result.plan.steps)

for path, code in result.implementation.files.items():
    print(f"\n--- {path} ---")
    print(code)

# Memory management
repos = dm.memory_list()        # List repos with memory
dm.memory_clear("owner/repo")  # Clear memory for a repo

Architecture

devmind/
β”‚
β”œβ”€β”€ devmind/
β”‚   β”œβ”€β”€ engine.py               # DevMind class β€” public API
β”‚   β”œβ”€β”€ models.py               # Shared data models
β”‚   β”œβ”€β”€ cli.py                  # CLI entry point
β”‚   β”‚
β”‚   β”œβ”€β”€ agents/
β”‚   β”‚   β”œβ”€β”€ base.py             # BaseAgent β€” streaming + standard LLM calls
β”‚   β”‚   β”œβ”€β”€ orchestrator.py     # Smart coordinator with task routing + memory
β”‚   β”‚   β”œβ”€β”€ scout.py            # Repo investigation + web search
β”‚   β”‚   β”œβ”€β”€ planner.py          # JSON plan generation
β”‚   β”‚   β”œβ”€β”€ coder.py            # Code implementation
β”‚   β”‚   β”œβ”€β”€ reviewer.py         # NEW: code review + auto-fix
β”‚   β”‚   └── scribe.py           # PR body + documentation
β”‚   β”‚
β”‚   β”œβ”€β”€ providers/
β”‚   β”‚   └── github.py           # GitHub API β€” fetch context, push, create PR
β”‚   β”‚
β”‚   └── tools/
β”‚       β”œβ”€β”€ web_search.py       # NEW: DuckDuckGo search (no API key)
β”‚       └── memory.py           # NEW: persistent JSON memory per repo
β”‚
└── tests/
    └── test_devmind.py         # 48 tests β€” all run without API keys

CLI reference

devmind run <task> --repo <repo> [options]

  --repo, -r          GitHub repo URL or owner/repo (required)
  --create-pr         Create a real GitHub Pull Request
  --branch            Custom branch name (default: devmind/<task-slug>)
  --dry-run           Skip all GitHub API calls
  --stream, -s        Stream agent output in real time
  --skip-review       Skip Reviewer agent (faster)
  --verbose, -v       Show code previews in output
  --no-exit-code      Always exit 0

devmind agents          List all 6 agents and their roles
devmind memory list     List repos with stored memory
devmind memory clear --repo owner/repo
devmind --version

Python API reference

result = dm.run(task, repo, create_pr, branch_name, dry_run, stream, skip_review)

result.status              # TaskStatus.DONE | FAILED | PENDING
result.scout_findings      # str β€” Scout's research + web search results
result.plan                # Plan object
result.plan.summary        # str
result.plan.steps          # list[str]
result.plan.files_to_create   # list[str]
result.plan.files_to_modify   # list[str]
result.plan.estimated_complexity  # "low" | "medium" | "high"
result.implementation      # Implementation object
result.implementation.files   # dict[path β†’ content]
result.implementation.tests   # dict[path β†’ content]
result.pr_body             # str β€” markdown PR body
result.documentation       # str β€” documentation notes
result.messages            # list[AgentMessage] β€” full pipeline log
result.tokens_used         # int
result.elapsed_seconds     # float
result.error               # str
result.summary()           # human-readable summary string

Running tests

All 48 tests run without API keys β€” they use mocked Anthropic responses.

git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v

Test coverage:

  • Models, data flow, message logging
  • GitHub URL parsing, diff parsing
  • Planner JSON parsing and fallback handling
  • Coder file block parsing, test routing, content cleaning
  • Scribe block extraction, fallback PR body
  • Reviewer verdict parsing, auto-fix application, approved/rejected paths
  • Orchestrator full pipeline, error handling, skip_review flag, task classification
  • Scout web search library extraction
  • Memory: save/load, task history, context string generation
  • Streaming flag propagation through BaseAgent
  • DevMind engine: dry run, end-to-end with mocked LLM

What tasks work best

Good tasks (specific and scoped):

"Add Redis-based rate limiting to all API endpoints"
"Write pytest tests for the UserService class"
"Add JWT authentication to the /api/login endpoint"
"Refactor database.py to use a connection pool"
"Add request/response logging middleware"
"Create a /health endpoint with uptime and version info"
"Add pagination to the /api/posts endpoint"
"Extract email sending logic into a dedicated EmailService class"

Less suitable (too vague):

"Make the app better"
"Fix all bugs"
"Redesign the architecture"

Requirements

  • Python 3.11+
  • anthropic β€” the only runtime dependency
  • Anthropic API key (Claude Opus)
  • GitHub token (for repo access and PR creation)
pip install anthropic

GitHub token needs repo scope for private repos or public_repo for public.


Roadmap

  • Interactive mode β€” review and edit the plan before Coder runs
  • GitLab support β€” same workflow for GitLab merge requests
  • Multi-file context β€” Scout reads full file contents, not just structure
  • Self-healing β€” if tests fail, Coder retries with the error as context
  • Parallel Coder β€” implement multiple files concurrently
  • GitHub Issues integration β€” run DevMind directly from an issue number

Contributing

New agents, providers, and task examples are especially welcome.

git clone https://github.com/cdelhierro5/devmind
cd devmind
pip install -e .
pytest tests/ -v   # 48 should pass

License

MIT

About

Multi-agent AI system that investigates, plans, implements and documents code autonomously using Claude Opus.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages