Claude + GPT-5.2 Pro working together through structured debate.
    ____
   / __ \__  ______
  / / / / / / / __ \
 / /_/ / /_/ / /_/ /
/_____/\__,_/\____/
Debate-Driven Development
Unlike simple "one agent does the task, the other reviews it" workflows, Duo uses adversarial collaboration:
┌─────────────────────────────────────────────────────────────────┐
│                       THE DEBATE WORKFLOW                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  CLAUDE (Lead Dev)                GPT-5.2 PRO (Tech Lead)       │
│  ├─ Knows codebase                ├─ Fresh perspective          │
│  ├─ Writes plans                  ├─ Asks hard questions        │
│  ├─ Implements code               ├─ Finds edge cases           │
│  └─ Defends decisions             └─ Challenges assumptions     │
│                                                                 │
│                      DEBATE ROUNDS (3-5x)                       │
│  ┌──────────────────────────────────────────────┐               │
│  │ Claude: "Here's my plan..."                  │               │
│  │ GPT:    "What about X? How does this fit Y?" │               │
│  │ Claude: "Good point, updating..." or         │               │
│  │         "Actually, here's why..."            │               │
│  │ GPT:    "APPROVED" or more challenges        │               │
│  └──────────────────────────────────────────────┘               │
│                                                                 │
│                  SUB-AGENTS (context fetchers)                  │
│  ┌──────────────────────────────────────────────┐               │
│  │ - search_files:  Find relevant code          │               │
│  │ - summarize:     Condense files for context  │               │
│  │ - find_patterns: "How do we do X?"           │               │
│  │ - check_deps:    What imports what?          │               │
│  └──────────────────────────────────────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
- Claude's strength: Deep codebase knowledge from Claude Code CLI
- GPT's strength: Fresh eyes, catches assumptions, asks "why not X?"
- Debate forces justification: Weak plans get caught before implementation
- Sub-agents prevent context bloat: Main agents stay focused on high-level thinking
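The sub-agent roles above map onto small, single-purpose helpers that hand back only paths or short summaries, which is what keeps the main agents' context lean. As a minimal sketch only (the function name and signature are assumptions, not the actual API in duo/sub_agents.py), a search_files-style fetcher could look like:

```python
from pathlib import Path

def search_files(root: str, pattern: str, max_results: int = 20) -> list[str]:
    """Hypothetical 'search_files' sub-agent: return paths whose text mentions a pattern.

    Illustration only; the real sub-agents in duo/sub_agents.py may work differently.
    """
    hits: list[str] = []
    for path in Path(root).rglob("*.py"):
        if pattern in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(str(path))
            if len(hits) >= max_results:
                break
    return hits

# e.g. search_files(".", "rate_limit") -> a short list of paths, not whole files
```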
Both agents run through CLI tools using your subscriptions:
| Tool | Subscription | Monthly Cost | Usage |
|---|---|---|---|
| Claude Code CLI | Claude Max | ~$100/mo | Unlimited |
| Codex CLI | ChatGPT Pro | $200/mo | Unlimited GPT-5.2 Pro |
Total: ~$300/mo fixed cost. No API charges. Run unlimited tasks.
- Python 3.10+
- Claude Code CLI (requires Claude Max subscription ~$100/mo)
- OpenAI Codex CLI (requires ChatGPT Pro subscription $200/mo)
# Claude Code CLI
npm install -g @anthropic-ai/claude-code
# OpenAI Codex CLI
npm install -g @openai/codex
claude # Sign in with your Claude account
codex # Sign in with your ChatGPT Pro account
From GitHub:
pip install git+https://github.com/URounder/duo.git
From source (for development):
git clone https://github.com/URounder/duo.git
cd duo
pip install -e .
# Check Duo is installed
duo --help
# Check CLIs are authenticated
claude --version
codex --version
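As an extra sanity check (standalone, not part of Duo), you can confirm both CLIs are on your PATH and that your Python is new enough:

```python
import shutil
import sys

# Standalone sanity check; not part of Duo itself.
assert sys.version_info >= (3, 10), "Duo requires Python 3.10+"
for cli in ("claude", "codex", "duo"):
    if shutil.which(cli) is None:
        print(f"Missing from PATH: {cli}")
```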
Duo is CLI-only by design (no paid API calls). Use the safe wrapper to ensure no API keys leak:
# Add to your PATH
export PATH="$PATH:/path/to/duo/bin"
# Run with guaranteed CLI-only execution
duo_safe start --verbose
duo_safe add "My task"
Or run directly with environment protection:
env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY DUO_NO_API=1 duo start
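The duo_safe wrapper and the DUO_NO_API flag imply a guard that refuses to start if API keys could leak into the agent subprocesses. Duo's actual check isn't shown here; a minimal sketch of that kind of guard (hypothetical function, assumed behavior) might be:

```python
import os
import sys

def assert_cli_only() -> None:
    """Hypothetical guard: abort if API keys are set while DUO_NO_API=1."""
    leaked = [k for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY") if os.environ.get(k)]
    if os.environ.get("DUO_NO_API") == "1" and leaked:
        sys.exit(f"Refusing to start: {', '.join(leaked)} set while DUO_NO_API=1")
```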
# Add a task
cd ~/projects/my-api
duo add "Add rate limiting to API endpoints" \
-d "Implement rate limiting with Redis. 100 req/min for auth, 1000 for general."
# Start the orchestrator
duo start
# Watch the debate unfold in your terminal
# Respond to escalations when needed
duo respond <id> -m "Yes, use the existing Redis connection"Sub-agents scan the codebase, summarize relevant files, and prepare context for both main agents.
Claude proposes architecture, files to change, and approach. Draws on deep codebase knowledge.
GPT asks hard questions:
- "What about edge case X?"
- "I looked at
bot/services/brm.py- shouldn't this follow that pattern?" - "How does this interact with the existing auth system?"
GPT can also ask sub-agents to fetch specific code to answer its own questions.
Claude either:
- Explains reasoning with evidence from codebase
- Acknowledges good points and updates the plan
3-5 rounds of back-and-forth. Both agents must explicitly agree before implementation.
Claude implements, GPT reviews the actual diff. Same challenge/defend dynamic continues.
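Put together, the orchestrator is a bounded challenge/defend loop. The sketch below is illustrative structure only; the callables stand in for the CLI wrappers, and the names are assumptions rather than the real interface in duo/debate_orchestrator.py:

```python
from typing import Callable

def run_debate(
    task: str,
    propose: Callable[[str], str],          # Claude: draft a plan for the task
    critique: Callable[[str, str], str],    # GPT: challenge the plan
    revise: Callable[[str, str, str], str], # Claude: defend or update the plan
    max_rounds: int = 5,
) -> str:
    """Illustrative challenge/defend loop; names assumed, not Duo's actual API."""
    plan = propose(task)                                  # Phase 2: Claude drafts a plan
    for _ in range(max_rounds):
        feedback = critique(task, plan)                   # Phase 3: GPT asks hard questions
        if feedback.strip().upper().startswith("APPROVED"):
            return plan                                   # Phase 5: explicit agreement
        plan = revise(task, plan, feedback)               # Phase 4: defend or revise
    raise RuntimeError("No agreement after max_rounds; escalate to a human")
```

The transcript below shows one such run reaching agreement.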
[CLAUDE - PLANNER]
I'll implement rate limiting using our existing Redis connection.
Files to modify:
- api/middleware/rate_limit.py (new)
- api/routes/auth.py (add decorator)
- api/routes/general.py (add decorator)
Approach: Token bucket algorithm, 100/min for auth, 1000/min for general.
[GPT - CRITIC]
Before I approve:
1. How do you handle distributed rate limiting across multiple API instances?
2. What happens when Redis is down? Fail open or closed?
[SUB-AGENT searching: "redis connection" "rate limit"]
Found: api/utils/redis.py has connection pooling, no existing rate limiting.
Found: config/settings.py has REDIS_URL and REDIS_TIMEOUT.
[GPT - CRITIC]
I see we have Redis connection pooling. Your plan doesn't mention:
- Using the existing connection pool
- Graceful degradation
What's your approach?
[CLAUDE - PLANNER]
Good catches. Updating plan:
- Will use existing pool from api/utils/redis.py
- Fail OPEN if Redis unavailable (log warning, allow request)
- Add circuit breaker pattern for Redis failures
REVISED PLAN:
...
[GPT - CRITIC]
APPROVED. The fail-open approach makes sense for rate limiting.
Proceed with implementation.
Config file: ~/.duo/config.json
{
"max_debate_rounds": 5,
"max_impl_iterations": 5,
"debate_style": "adversarial",
"require_tests": true,
"auto_commit": true,
"escalation_triggers": [
"security",
"breaking_change",
"architecture_decision",
"unclear_requirements"
]
}
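The config is plain JSON, so any tooling can read it. A minimal sketch of checking the escalation triggers (field names taken from the example above; the loading logic is an assumption, not Duo's own loader):

```python
import json
from pathlib import Path

# Minimal illustration of reading ~/.duo/config.json; Duo's own loader may differ.
config = json.loads((Path.home() / ".duo" / "config.json").read_text())
triggers = set(config.get("escalation_triggers", []))

def needs_human(tags: set[str]) -> bool:
    """True if any tag attached to a task or plan matches an escalation trigger."""
    return bool(tags & triggers)

print(needs_human({"security"}))  # True with the example config above
```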
Start the debate orchestrator.
duo start # Default settings
duo start --concurrent 2 # Run 2 tasks in parallel
duo start --debate-rounds 7 # More debate rounds
Add a task to the queue.
duo add "Task title" -d "Description"
duo add "Task" --project /path/to/project
duo add "Task" --priority urgentList all tasks.
duo list # All tasks
duo list --status pending # Only pending
Show current status and active debates.
Respond to an escalation.
duo respond <id> -m "Use JWT for the new auth endpoint"
duo respond <id> --action abort # Cancel task
View recent activity and debates.
duo logs # Last 50 lines
duo logs -n 200 # Last 200 lines
# Project Context
## Architecture
- FastAPI backend with PostgreSQL
- React frontend
- Redis for caching and rate limiting
## Code Patterns
- Use dependency injection for services
- All database operations go through repositories
- Follow existing patterns in bot/services/
## Testing
- pytest for backend
- All new code needs tests
- Run with: pytest -v
## Important Files
- bot/services/brm.py - Example of a well-structured service
- api/utils/redis.py - Redis connection management
Bad: "Add auth"
Good: "Add JWT authentication to the API. Should support refresh tokens, integrate with existing user model in models/user.py, and follow the patterns in api/auth/ for other auth methods."
Each task generates a detailed debate log at .duo/sessions/<task-id>.json:
{
"task": {"id": "abc123", "title": "Add rate limiting"},
"debate_rounds": 3,
"implementation_iterations": 1,
"agreed_plan": "...",
"debate_log": [
{
"speaker": "claude",
"role": "planner",
"message": "Here's my implementation plan...",
"phase": "planning"
},
{
"speaker": "gpt",
"role": "critic",
"message": "What about distributed rate limiting?",
"phase": "debate"
}
]
}
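Because the session file is plain JSON, a debate can be replayed outside the CLI. A small sketch using the field names from the example above (abc123 is just the sample task id):

```python
import json
from pathlib import Path

# Replay a debate from a session file; field names follow the example above.
session = json.loads(Path(".duo/sessions/abc123.json").read_text())
print(session["task"]["title"], "-", session["debate_rounds"], "debate rounds")
for turn in session["debate_log"]:
    print(f"[{turn['phase']}] {turn['speaker']} ({turn['role']}): {turn['message']}")
```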
Duo is designed to improve itself. Point it at its own codebase:
cd duo-agents
duo add "Improve the debate prompts to generate more specific challenges" \
-d "The GPT critic should ask more targeted questions based on codebase patterns"
duo start
Let it run overnight. Review the changes in the morning.
npm install -g @anthropic-ai/claude-code
claude # Sign in
npm install -g @openai/codex
codex # Sign in with ChatGPT Pro account
The agents might disagree fundamentally. Check logs and break the tie:
duo logs -n 100
duo respond <id> -m "Go with Claude's approach, the edge case is acceptable"Improve your CLAUDE.md with more context about important files and patterns.
duo-agents/
├── duo/
│   ├── __init__.py              # Package exports
│   ├── debate_orchestrator.py   # Main debate workflow
│   ├── agents.py                # Claude + Codex CLI wrappers
│   ├── sub_agents.py            # Context gatherers
│   ├── task_queue.py            # Priority queue
│   ├── workspace.py             # Git/test/file operations
│   ├── human_input.py           # Escalation handling
│   ├── logger.py                # Colored output
│   └── cli.py                   # Command interface
├── pyproject.toml
├── install.sh
└── README.md
MIT