Production-ready Python implementation of the Free-MAD algorithm from the paper "Free-MAD: Consensus-Free Multi-Agent Debate".
Free-MAD is an approach to multi-agent AI systems that eliminates the need for consensus among agents while achieving better accuracy and efficiency than traditional debate methods.
When you have multiple AI agents working on the same problem, traditional approaches (MAD - Multi-Agent Debate) work like this:
- Agents debate until they agree (reach consensus)
- The final answer is chosen by majority vote
This has serious problems:
- Conformity bias: Agents with the right answer get pressured by the majority into changing their minds (like peer pressure)
- High cost: Multiple debate rounds are needed to reach agreement
- Majority tyranny: The right answer can lose if fewer agents picked it—truth doesn't always win by popularity
Free-MAD takes a fundamentally different approach:
- No consensus required - Agents can disagree throughout the entire debate
- Score the journey, not just the destination - Instead of only looking at final votes, Free-MAD evaluates the quality of reasoning across ALL debate rounds
- Quality beats quantity - A single agent with strong reasoning can win, even if all others disagree
Think of it like judges scoring a debate competition: they don't wait to see who "wins" by convincing everyone else. Instead, they score the quality of each debater's arguments throughout the entire debate. The best-argued position wins, regardless of whether it convinced the majority.
The Algorithm:
- Round 0 (Generation): All agents independently propose solutions
- Round 1+ (Critique): Agents debate in two modes:
- Conformity mode: Present arguments supporting their answer
- Anti-conformity mode: Find flaws in other agents' answers
- Scoring: Track the entire debate trajectory and score based on:
- Quality of arguments
- Valid criticisms found
- How positions evolved over time
- Decision: Select the answer with the highest score (not the most votes)
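The round structure above can be sketched in Python. Everything here is illustrative: run_debate and the agents-as-callables interface are assumptions made for the sketch; the real implementation drives external agent CLIs over stdin/stdout.

```python
def run_debate(agents, requirement, rounds=2):
    """Sketch of the Free-MAD round structure.

    agents: {agent_id: callable(mode, requirement, peer_answers) -> answer}
    (hypothetical interface for illustration). Returns one
    {agent_id: answer} snapshot per round; round 0 is independent generation.
    """
    # Round 0: every agent proposes a solution independently
    answers = {aid: agent("generate", requirement, {})
               for aid, agent in agents.items()}
    trajectory = [dict(answers)]
    # Rounds 1+: each agent critiques its peers' answers and may switch
    for _ in range(rounds):
        peers = dict(answers)
        answers = {
            aid: agent("critique", requirement,
                       {p: a for p, a in peers.items() if p != aid})
            for aid, agent in agents.items()
        }
        trajectory.append(dict(answers))
    # The decision step scores this whole trajectory, not a final vote
    return trajectory
```

The key point is that the return value is the full trajectory: no vote is taken, and disagreement can persist into the last round.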
Example:

Round 0 (generation):
- Agent 1: Answer A (with strong reasoning)
- Agent 2: Answer B
- Agent 3: Answer B

Round 1 (critique):
- Agent 1: Stays with A, points out flaws in B
- Agent 2: Switches to A (convinced by Agent 1's arguments)
- Agent 3: Stays with B

Traditional MAD: B wins (2 votes)
Free-MAD: A wins (higher score due to quality of reasoning)
This means a single agent with the right answer and strong reasoning can win, even if the majority disagrees—something impossible with traditional consensus-based approaches.
# With Poetry (recommended)
poetry install
poetry run freemad --version
# With pip
pip install -e .
freemad --version

# Using YAML configuration
poetry run freemad "Write a function that returns Fibonacci(n)." \
  --rounds 2 \
  --config config_examples/multi_agent.yaml

# Using JSON configuration
poetry run freemad "Write a function that returns Fibonacci(n)." \
  --rounds 2 \
  --config config_examples/multi_agent.json

Both YAML and JSON formats are supported. See config_examples/multi_agent.yaml or config_examples/multi_agent.json for complete configuration examples.
Free-MAD is configured via YAML or JSON files. Here's a minimal example:
agents:
  - id: claude-sonnet
    type: claude_code
    cli_command: "claude"
    cli_args: {model: "sonnet"}
    timeout: 600
  - id: gpt-5
    type: openai_codex
    cli_command: "codex exec"
    cli_args: {--model: "gpt-5.1"}
    cli_flags: ["--skip-git-repo-check"]
    cli_positional: ["-"]
    timeout: 600

topology:
  type: all_to_all  # all agents review all others
  seed: 427         # deterministic peer assignment

deadlines:
  soft_timeout_ms: 15000  # quorum wait
  hard_timeout_ms: 30000  # hard stop
  min_agents: 2           # quorum size

scoring:
  weights: [20.0, 25.0, 30.0, 20.0]  # [initial, change-penalty, change-bonus, keep]
  normalize: true                    # contributor-based normalization
  tie_break: deterministic           # or 'random'

security:
  cli_allowed_commands: ["claude", "codex"]
  cli_use_shell: false
  max_requirement_size: 20000
  max_solution_size: 400000

output:
  save_transcript: true
  transcript_dir: transcripts
  format: json

Complete configuration examples:
- YAML: config_examples/multi_agent.yaml
- JSON: config_examples/multi_agent.json
- All available options: config_examples/ALL_KEYS.yaml
Define the AI agents participating in the debate:
- id: Unique identifier
- type: Adapter type (claude_code, openai_codex)
- cli_command: Command used to invoke the agent
- cli_args: Key-value arguments passed to the CLI
- cli_flags: Boolean flags (e.g., ["--verbose"])
- cli_positional: Positional arguments (e.g., ["-"] for stdin)
- timeout: Per-call timeout in seconds
- config.temperature: Model temperature (0.0-1.0)
- config.max_tokens: Max output tokens (null = unlimited)
Control how agents review each other's work:
- all_to_all: Every agent reviews all others (full debate)
- k_reviewers: Each agent reviews k random peers
- ring: Agents review in a circular pattern
- star: All agents review a central hub agent
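The four topologies could be realized roughly as follows. assign_peers is a hypothetical helper (not part of the package API), and the choice of the first agent as the star hub is an assumption for the sketch:

```python
import random

def assign_peers(agent_ids, topology="all_to_all", k=2, seed=427):
    """Return {agent_id: [peer ids it reviews]} for each topology type.

    Illustrative sketch; names mirror the topology config keys.
    """
    rng = random.Random(seed)  # topology.seed makes assignment deterministic
    n = len(agent_ids)
    if topology == "all_to_all":
        return {a: [p for p in agent_ids if p != a] for a in agent_ids}
    if topology == "k_reviewers":
        return {a: rng.sample([p for p in agent_ids if p != a], min(k, n - 1))
                for a in agent_ids}
    if topology == "ring":
        return {a: [agent_ids[(i + 1) % n]] for i, a in enumerate(agent_ids)}
    if topology == "star":
        hub = agent_ids[0]  # assumption: the first listed agent is the hub
        peers = {a: [hub] for a in agent_ids if a != hub}
        peers[hub] = []    # assumption: the hub reviews no one
        return peers
    raise ValueError(f"unknown topology: {topology}")
```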
Configure the Free-MAD scoring algorithm:
- weights: [initial, change_penalty, change_bonus, keep] - weights for the scoring components
- normalize: Divide by contributor count to prevent score inflation
- tie_break: deterministic (first in list) or random
- random_seed: Seed for random tie-breaking
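A toy illustration of how the four weights might combine over a debate trajectory. score_trajectory is hypothetical; the actual scorer also weighs argument quality, which plain answer snapshots cannot capture:

```python
def score_trajectory(trajectory, weights=(20.0, 25.0, 30.0, 20.0), normalize=True):
    """Toy trajectory scorer over per-round {agent_id: answer} snapshots.

    weights = (initial, change_penalty, change_bonus, keep), mirroring
    scoring.weights. Sketch only: quality-of-argument signals are omitted.
    """
    w_init, w_penalty, w_bonus, w_keep = weights
    scores: dict = {}
    contributors: dict = {}
    for rnd, snapshot in enumerate(trajectory):
        for agent, answer in snapshot.items():
            contributors.setdefault(answer, set()).add(agent)
            if rnd == 0:
                scores[answer] = scores.get(answer, 0.0) + w_init
            else:
                prev = trajectory[rnd - 1][agent]
                if answer == prev:  # agent kept its position
                    scores[answer] = scores.get(answer, 0.0) + w_keep
                else:  # switched: reward the new answer, penalize the abandoned one
                    scores[answer] = scores.get(answer, 0.0) + w_bonus
                    scores[prev] = scores.get(prev, 0.0) - w_penalty
    if normalize:  # contributor-based normalization
        scores = {a: s / len(contributors[a]) for a, s in scores.items()}
    return scores
```

With the default weights, running this on the earlier three-agent example (A/B/B in round 0, A/A/B in round 1) puts A ahead of B even though B held the round-0 majority.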
Control debate round timing:
- soft_timeout_ms: Wait for quorum before proceeding
- hard_timeout_ms: Absolute deadline (accept late arrivals until this)
- min_agents: Quorum size at the soft deadline
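The soft/hard deadline semantics can be sketched with asyncio. collect_with_quorum is an illustrative helper under those stated semantics, not the package's actual scheduler:

```python
import asyncio

async def collect_with_quorum(coros, min_agents=2,
                              soft_timeout=15.0, hard_timeout=30.0):
    """Collect agent results: wait for everyone until the soft deadline,
    then return as soon as a quorum of min_agents has arrived; the hard
    deadline is an absolute cut-off either way."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    loop = asyncio.get_running_loop()
    start = loop.time()
    results, pending = [], set(tasks)
    while pending:
        elapsed = loop.time() - start
        if elapsed >= hard_timeout:
            break  # hard stop
        if elapsed >= soft_timeout and len(results) >= min_agents:
            break  # soft deadline passed and quorum reached
        # With quorum in hand we only wait out the soft deadline;
        # without it, we accept arrivals up to the hard deadline.
        wake = soft_timeout if len(results) >= min_agents else hard_timeout
        finished, pending = await asyncio.wait(
            pending, timeout=max(wake - elapsed, 0.0),
            return_when=asyncio.FIRST_COMPLETED)
        results.extend(t.result() for t in finished)
    for t in pending:
        t.cancel()  # drop agents that missed the deadline
    return results
```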
Security settings:
- cli_allowed_commands: Whitelist of allowed executables
- cli_use_shell: Must be false for security
- max_requirement_size: Input size cap (chars)
- max_solution_size: Output size cap (chars)
- redact_patterns: Regex patterns to redact from logs

Budget settings:
- max_total_time_sec: Overall wall-time budget
- max_round_time_sec: Per-round budget
- max_agent_time_sec: Per-agent call budget
- max_tokens_per_agent_per_round: Prompt truncation cap
- enable_token_truncation: Allow prompt truncation
- max_concurrent_agents: Parallelism limit

Output settings:
- save_transcript: Persist the debate transcript
- transcript_dir: Output directory
- format: json or markdown
- verbose: Print extra info during execution

Sandbox settings:
- enable_sandbox: Run solutions in a restricted Python sandbox
- sandbox_timeout_ms: Sandbox execution limit

Cache settings:
- enabled: On-disk memoization of agent outputs
- dir: Cache directory
- max_entries: Eviction limit
Free-MAD communicates with agents via stdin/stdout. Your agent CLI must:
- Accept a mode argument: <cli_command> generate or <cli_command> critique
- Read the prompt from stdin: the debate requirement or critique instructions
- Output structured response:
SOLUTION:
<your proposed solution>
REASONING:
<your reasoning/arguments>
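A minimal parser for this contract might look like the following (a sketch; the shipped parser may be stricter about marker order and repetition):

```python
def parse_agent_output(text: str) -> dict:
    """Split an agent reply into its SOLUTION: and REASONING: sections.

    Minimal sketch of the stdout contract; assumes each marker appears
    once and SOLUTION: comes first.
    """
    sol_marker, rea_marker = "SOLUTION:", "REASONING:"
    if sol_marker not in text or rea_marker not in text:
        raise ValueError("missing SOLUTION:/REASONING: markers")
    head, reasoning = text.split(rea_marker, 1)
    solution = head.split(sol_marker, 1)[1]
    return {"solution": solution.strip(), "reasoning": reasoning.strip()}
```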
If your agent doesn't follow this contract, wrap it:
#!/usr/bin/env python3
import sys
import subprocess

mode = sys.argv[1]  # 'generate' or 'critique'
prompt = sys.stdin.read()

# Call your actual agent
result = subprocess.run(
    ["your-agent-command", "--mode", mode],
    input=prompt,
    capture_output=True,
    text=True,
)

# Format output to match the SOLUTION:/REASONING: contract
print(f"SOLUTION:\n{result.stdout}")
print(f"\nREASONING:\nGenerated in {mode} mode")

# Install dev dependencies
poetry install --with dev
# Run tests
poetry run pytest -q
# With coverage
poetry run pytest --cov=freemad --cov-report=term --cov-report=xml

# Type checking
mypy .

# Pre-commit hooks
poetry run pre-commit install
poetry run pre-commit run --all-files

See AGENTS.md for detailed conventions:
- Immutable dataclasses
- StrEnums for constants
- No hard-coded strings internally
- Serialization at boundaries only
Debate transcripts capture the complete history for analysis:
{
  "final_answer_id": "abc123...",
  "final_solution": "def fibonacci(n): ...",
  "scores": {
    "abc123...": 85.5,
    "def456...": 72.3
  },
  "winning_agents": ["claude-sonnet"],
  "transcript": [
    {
      "round": 0,
      "type": "generation",
      "agents": {
        "claude-sonnet": {
          "response": { "solution": "...", "reasoning": "..." },
          "peers_assigned": [],
          "peers_seen": []
        }
      }
    },
    {
      "round": 1,
      "type": "critique",
      "agents": { ... }
    }
  ]
}

Find transcripts in transcripts/ by default when output.save_transcript: true.
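Reading a saved transcript back is straightforward. summarize_transcript below is a hypothetical helper that relies only on the top-level field names shown above:

```python
import json
from pathlib import Path

def summarize_transcript(path):
    """Print the winner and per-answer scores from a saved JSON transcript.

    Field names follow the transcript example above; helper is
    illustrative, not part of the package API.
    """
    data = json.loads(Path(path).read_text())
    print("winning agents:", ", ".join(data["winning_agents"]))
    # Highest-scoring answers first; '*' marks the selected final answer
    for answer_id, score in sorted(data["scores"].items(),
                                   key=lambda kv: kv[1], reverse=True):
        marker = "*" if answer_id == data["final_answer_id"] else " "
        print(f"{marker} {answer_id[:8]}  {score:.1f}")
    return data["final_solution"]
```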
Free-MAD includes a web-based dashboard to visualize debate results. The dashboard reads JSON transcripts and displays the final answer, winning agents, and scores.
poetry run freemad-dashboard --dir transcripts --host 127.0.0.1 --port 8001

Then open your browser to http://127.0.0.1:8001 to view the results.
Command Options:
- --dir: Directory containing JSON transcripts (default: transcripts)
- --host: Server host address (default: 127.0.0.1)
- --port: Server port (default: 8001)
- ✅ View final debate results
- ✅ See winning agents and scores
- ✅ Browse all transcript files
The dashboard is actively being developed. Planned features include:
Real-Time Debate Visualization:
- Live conversation view showing agent-to-agent interactions
- Visual timeline of debate rounds
- See who said what in each round
Metrics & Analytics:
- Token usage tracking per agent and per round
- Time/duration metrics for each debate phase
- Cost estimation based on model pricing
Agent Information:
- Display model configurations (temperature, max_tokens)
- Show agent types and CLI commands used
- Topology visualization (peer assignment graphs)
Configuration UI:
- Configure agents through the web interface
- Edit debate parameters (rounds, weights, timeouts)
- Save and load configuration presets
Interactive Final Agent:
- Chat with a final orchestrator agent
- Execute the winning solution interactively
- Provide feedback and iterate on results
Enhanced UX:
- Make the system more user-friendly vs. command-line only
- Drag-and-drop configuration builder
- Real-time progress indicators
Contributions Welcome! If you'd like to help build these features, please see CONTRIBUTING.md or open an issue to discuss implementation ideas.
- Verify cli_command is in your PATH
- Check cli_command is in security.cli_allowed_commands
- Increase agents[].timeout if needed
- Enable debug logging: logging.level: DEBUG
- Agents must output exactly the SOLUTION: and REASONING: markers
- Check the transcript to see what agents actually produced
- Test your agent CLI manually with echo prompts
- Increase deadlines.hard_timeout_ms
- Increase budget.max_round_time_sec
- Ensure deadlines.min_agents ≤ number of enabled agents
- Check early_stop_reason in the transcript
- Set topology.seed for consistent peer assignments
- Set scoring.random_seed for consistent tie-breaking
- Use scoring.tie_break: deterministic
- Issues: GitHub Issues
- Contributing: See CONTRIBUTING.md
- Code of Conduct: See CODE_OF_CONDUCT.md
- Security: See SECURITY.md for private vulnerability reporting
- Governance: See GOVERNANCE.md
If you use this implementation in your research, please cite:
@software{freemad2025,
  author = {Santilli, Jonathan},
  title = {FREE-MAD: Consensus-Free Multi-Agent Debate Implementation},
  year = {2025},
  url = {https://github.com/jonathansantilli/mad}
}

And the original paper:
@article{freemad2024,
  title = {Free-MAD: Consensus-Free Multi-Agent Debate},
  author = {...},
  journal = {arXiv preprint arXiv:2509.11035},
  year = {2025}
}

MIT License © 2025 Jonathan Santilli. See LICENSE for full text.
This project is independent and not affiliated with Anthropic, OpenAI, or any other vendor. "Claude", "Codex", and any other product names are trademarks of their respective owners and are used here only for identification.
This implementation is based on the paper:
"Free-MAD: Consensus-Free Multi-Agent Debate" arXiv:2509.11035v1 https://arxiv.org/html/2509.11035v1
- Eliminates consensus requirement: Agents can disagree throughout the debate
- Score-based decision mechanism: Evaluates entire debate trajectory, not just final votes
- Improved accuracy: Outperforms traditional MAD on reasoning benchmarks
- Better efficiency: Requires fewer debate rounds than consensus-based approaches
- Robustness: Resistant to conformity bias and communication attacks