Claude + GPT-5.2 Pro working together through structured debate.
    ____
   / __ \__  ______
  / / / / / / / __ \
 / /_/ / /_/ / /_/ /
/_____/\__,_/\____/
Debate-Driven Development
Unlike simple "one agent does the task, the other reviews it" workflows, Duo uses adversarial collaboration:
┌─────────────────────────────────────────────────────────────────┐
│                       THE DEBATE WORKFLOW                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  CLAUDE (Lead Dev)                GPT-5.2 PRO (Tech Lead)       │
│  ├─ Knows codebase                ├─ Fresh perspective          │
│  ├─ Writes plans                  ├─ Asks hard questions        │
│  ├─ Implements code               ├─ Finds edge cases           │
│  └─ Defends decisions             └─ Challenges assumptions     │
│                                                                 │
│                      DEBATE ROUNDS (3-5x)                       │
│  ┌──────────────────────────────────────────────┐               │
│  │ Claude: "Here's my plan..."                  │               │
│  │ GPT:    "What about X? How does this fit Y?" │               │
│  │ Claude: "Good point, updating..." or         │               │
│  │         "Actually, here's why..."            │               │
│  │ GPT:    "APPROVED" or more challenges        │               │
│  └──────────────────────────────────────────────┘               │
│                                                                 │
│                  SUB-AGENTS (context fetchers)                  │
│  ┌──────────────────────────────────────────────┐               │
│  │ - search_files:  Find relevant code          │               │
│  │ - summarize:     Condense files for context  │               │
│  │ - find_patterns: "How do we do X?"           │               │
│  │ - check_deps:    What imports what?          │               │
│  └──────────────────────────────────────────────┘               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
- Claude's strength: Deep codebase knowledge from Claude Code CLI
- GPT's strength: Fresh eyes, catches assumptions, asks "why not X?"
- Debate forces justification: Weak plans get caught before implementation
- Sub-agents prevent context bloat: Main agents stay focused on high-level thinking
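The sub-agent roles above map onto small, single-purpose helpers that hand back only paths or short summaries, which is what keeps the main agents' context lean. As a minimal sketch only (the function name and signature are assumptions, not the actual API in duo/sub_agents.py), a search_files-style fetcher could look like:

```python
from pathlib import Path

def search_files(root: str, pattern: str, max_results: int = 20) -> list[str]:
    """Hypothetical 'search_files' sub-agent: return paths whose text mentions a pattern.

    Illustration only; the real sub-agents in duo/sub_agents.py may work differently.
    """
    hits: list[str] = []
    for path in Path(root).rglob("*.py"):
        if pattern in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(str(path))
            if len(hits) >= max_results:
                break
    return hits

# e.g. search_files(".", "rate_limit") -> a short list of paths, not whole files
```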
Both agents run through CLI tools using your subscriptions:
| Tool | Subscription | Monthly Cost | Usage |
|---|---|---|---|
| Claude Code CLI | Claude Max | ~$100/mo | Unlimited |
| Codex CLI | ChatGPT Pro | $200/mo | Unlimited GPT-5.2 Pro |
Total: ~$300/mo fixed cost. No API charges. Run unlimited tasks.
- Python 3.10+
- Claude Code CLI (requires Claude Max subscription ~$100/mo)
- OpenAI Codex CLI (requires ChatGPT Pro subscription $200/mo)
# Claude Code CLI
npm install -g @anthropic-ai/claude-code
# OpenAI Codex CLI
npm install -g @openai/codex
claude # Sign in with your Claude account
codex # Sign in with your ChatGPT Pro account
From GitHub:
pip install git+https://github.com/URounder/duo.git
From source (for development):
git clone https://github.com/URounder/duo.git
cd duo
pip install -e .
# Check Duo is installed
duo --help
# Check CLIs are authenticated
claude --version
codex --version
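As an extra sanity check (standalone, not part of Duo), you can confirm both CLIs are on your PATH and that your Python is new enough:

```python
import shutil
import sys

# Standalone sanity check; not part of Duo itself.
assert sys.version_info >= (3, 10), "Duo requires Python 3.10+"
for cli in ("claude", "codex", "duo"):
    if shutil.which(cli) is None:
        print(f"Missing from PATH: {cli}")
```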
Duo is CLI-only by design (no paid API calls). Use the safe wrapper to ensure no API keys leak:
# Add to your PATH
export PATH="$PATH:/path/to/duo/bin"
# Run with guaranteed CLI-only execution
duo_safe start --verbose
duo_safe add "My task"
Or run directly with environment protection:
env -u OPENAI_API_KEY -u ANTHROPIC_API_KEY DUO_NO_API=1 duo start
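The duo_safe wrapper and the DUO_NO_API flag imply a guard that refuses to start if API keys could leak into the agent subprocesses. Duo's actual check isn't shown here; a minimal sketch of that kind of guard (hypothetical function, assumed behavior) might be:

```python
import os
import sys

def assert_cli_only() -> None:
    """Hypothetical guard: abort if API keys are set while DUO_NO_API=1."""
    leaked = [k for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY") if os.environ.get(k)]
    if os.environ.get("DUO_NO_API") == "1" and leaked:
        sys.exit(f"Refusing to start: {', '.join(leaked)} set while DUO_NO_API=1")
```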
# Add a task
cd ~/projects/my-api
duo add "Add rate limiting to API endpoints" \
-d "Implement rate limiting with Redis. 100 req/min for auth, 1000 for general."
# Start the orchestrator
duo start
# Watch the debate unfold in your terminal
# Respond to escalations when needed
duo respond <id> -m "Yes, use the existing Redis connection"Sub-agents scan the codebase, summarize relevant files, and prepare context for both main agents.
Claude proposes architecture, files to change, and approach. Draws on deep codebase knowledge.
GPT asks hard questions:
- "What about edge case X?"
- "I looked at
bot/services/brm.py- shouldn't this follow that pattern?" - "How does this interact with the existing auth system?"
GPT can also ask sub-agents to fetch specific code to answer its own questions.
Claude either:
- Explains reasoning with evidence from codebase
- Acknowledges good points and updates the plan
3-5 rounds of back-and-forth. Both agents must explicitly agree before implementation.
Claude implements, GPT reviews the actual diff. Same challenge/defend dynamic continues.
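Put together, the orchestrator is a bounded challenge/defend loop. The sketch below is illustrative structure only; the callables stand in for the CLI wrappers, and the names are assumptions rather than the real interface in duo/debate_orchestrator.py:

```python
from typing import Callable

def run_debate(
    task: str,
    propose: Callable[[str], str],          # Claude: draft a plan for the task
    critique: Callable[[str, str], str],    # GPT: challenge the plan
    revise: Callable[[str, str, str], str], # Claude: defend or update the plan
    max_rounds: int = 5,
) -> str:
    """Illustrative challenge/defend loop; names assumed, not Duo's actual API."""
    plan = propose(task)                                  # Phase 2: Claude drafts a plan
    for _ in range(max_rounds):
        feedback = critique(task, plan)                   # Phase 3: GPT asks hard questions
        if feedback.strip().upper().startswith("APPROVED"):
            return plan                                   # Phase 5: explicit agreement
        plan = revise(task, plan, feedback)               # Phase 4: defend or revise
    raise RuntimeError("No agreement after max_rounds; escalate to a human")
```

The transcript below shows one such run reaching agreement.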
[CLAUDE - PLANNER]
I'll implement rate limiting using our existing Redis connection.
Files to modify:
- api/middleware/rate_limit.py (new)
- api/routes/auth.py (add decorator)
- api/routes/general.py (add decorator)
Approach: Token bucket algorithm, 100/min for auth, 1000/min for general.
[GPT - CRITIC]
Before I approve:
1. How do you handle distributed rate limiting across multiple API instances?
2. What happens when Redis is down? Fail open or closed?
[SUB-AGENT searching: "redis connection" "rate limit"]
Found: api/utils/redis.py has connection pooling, no existing rate limiting.
Found: config/settings.py has REDIS_URL and REDIS_TIMEOUT.
[GPT - CRITIC]
I see we have Redis connection pooling. Your plan doesn't mention:
- Using the existing connection pool
- Graceful degradation
What's your approach?
[CLAUDE - PLANNER]
Good catches. Updating plan:
- Will use existing pool from api/utils/redis.py
- Fail OPEN if Redis unavailable (log warning, allow request)
- Add circuit breaker pattern for Redis failures
REVISED PLAN:
...
[GPT - CRITIC]
APPROVED. The fail-open approach makes sense for rate limiting.
Proceed with implementation.
Config file: ~/.duo/config.json
{
"max_debate_rounds": 5,
"max_impl_iterations": 5,
"debate_style": "adversarial",
"require_tests": true,
"auto_commit": true,
"escalation_triggers": [
"security",
"breaking_change",
"architecture_decision",
"unclear_requirements"
]
}
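The config is plain JSON, so any tooling can read it. A minimal sketch of checking the escalation triggers (field names taken from the example above; the loading logic is an assumption, not Duo's own loader):

```python
import json
from pathlib import Path

# Minimal illustration of reading ~/.duo/config.json; Duo's own loader may differ.
config = json.loads((Path.home() / ".duo" / "config.json").read_text())
triggers = set(config.get("escalation_triggers", []))

def needs_human(tags: set[str]) -> bool:
    """True if any tag attached to a task or plan matches an escalation trigger."""
    return bool(tags & triggers)

print(needs_human({"security"}))  # True with the example config above
```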
Start the debate orchestrator.
duo start # Default settings
duo start --concurrent 2 # Run 2 tasks in parallel
duo start --debate-rounds 7 # More debate rounds
Add a task to the queue.
duo add "Task title" -d "Description"
duo add "Task" --project /path/to/project
duo add "Task" --priority urgentList all tasks.
duo list # All tasks
duo list --status pending # Only pending
Show current status and active debates.
Respond to an escalation.
duo respond <id> -m "Use JWT for the new auth endpoint"
duo respond <id> --action abort # Cancel task
View recent activity and debates.
duo logs # Last 50 lines
duo logs -n 200 # Last 200 lines
# Project Context
## Architecture
- FastAPI backend with PostgreSQL
- React frontend
- Redis for caching and rate limiting
## Code Patterns
- Use dependency injection for services
- All database operations go through repositories
- Follow existing patterns in bot/services/
## Testing
- pytest for backend
- All new code needs tests
- Run with: pytest -v
## Important Files
- bot/services/brm.py - Example of a well-structured service
- api/utils/redis.py - Redis connection management
Bad: "Add auth"
Good: "Add JWT authentication to the API. Should support refresh tokens, integrate with existing user model in models/user.py, and follow the patterns in api/auth/ for other auth methods."
Each task generates a detailed debate log at .duo/sessions/<task-id>.json:
{
"task": {"id": "abc123", "title": "Add rate limiting"},
"debate_rounds": 3,
"implementation_iterations": 1,
"agreed_plan": "...",
"debate_log": [
{
"speaker": "claude",
"role": "planner",
"message": "Here's my implementation plan...",
"phase": "planning"
},
{
"speaker": "gpt",
"role": "critic",
"message": "What about distributed rate limiting?",
"phase": "debate"
}
]
}
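Because the session file is plain JSON, a debate can be replayed outside the CLI. A small sketch using the field names from the example above (abc123 is just the sample task id):

```python
import json
from pathlib import Path

# Replay a debate from a session file; field names follow the example above.
session = json.loads(Path(".duo/sessions/abc123.json").read_text())
print(session["task"]["title"], "-", session["debate_rounds"], "debate rounds")
for turn in session["debate_log"]:
    print(f"[{turn['phase']}] {turn['speaker']} ({turn['role']}): {turn['message']}")
```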
Duo is designed to improve itself. Point it at its own codebase:
cd duo-agents
duo add "Improve the debate prompts to generate more specific challenges" \
-d "The GPT critic should ask more targeted questions based on codebase patterns"
duo start
Let it run overnight. Review the changes in the morning.
npm install -g @anthropic-ai/claude-code
claude # Sign in
npm install -g @openai/codex
codex # Sign in with ChatGPT Pro account
The agents might disagree fundamentally. Check logs and break the tie:
duo logs -n 100
duo respond <id> -m "Go with Claude's approach, the edge case is acceptable"Improve your CLAUDE.md with more context about important files and patterns.
duo-agents/
├── duo/
│   ├── __init__.py              # Package exports
│   ├── debate_orchestrator.py   # Main debate workflow
│   ├── agents.py                # Claude + Codex CLI wrappers
│   ├── sub_agents.py            # Context gatherers
│   ├── task_queue.py            # Priority queue
│   ├── workspace.py             # Git/test/file operations
│   ├── human_input.py           # Escalation handling
│   ├── logger.py                # Colored output
│   └── cli.py                   # Command interface
├── pyproject.toml
├── install.sh
└── README.md
MIT