Anti-sycophancy LLM council for business ideas.
Single-model LLM interactions are structurally biased toward agreement. RLHF training selects for responses that users rate positively — and users rate agreeable responses more positively. The result: any idea you bring to an LLM with enthusiasm gets received with enthusiasm. This is not a bug you can prompt your way out of. It requires a different architecture.
The Board runs five structured adversarial personas across multiple providers, has each model anonymously critique the others, and synthesizes the result — with critics by design, not by request.
Inspired by Karpathy's LLM Council and pAI's critical reviewer council (Poggio Lab, MIT).
Stage 1 — Parallel perspectives. Five personas run simultaneously, each on a configurable provider:
| Persona | Default model | Role |
|---|---|---|
| The Advocate | openai/gpt-5.4-mini |
Strongest honest case for |
| The Devil's Advocate | anthropic/claude-sonnet-4-6 |
Strongest honest case against + one fatal objection |
| The Realist | google/gemini-3.1-flash-lite-preview |
Most likely actual outcome with base rates |
| The Market Skeptic | mistralai/mistral-large-3 |
Three unanswered market questions |
| The Execution Skeptic | deepseek/deepseek-chat |
Three execution risks assuming market is correct |
Stage 2 — Anonymous cross-review. Each persona reviews the others' outputs with identities stripped. Models evaluate reasoning, not source — preventing deference to "prestigious" models.
Stage 3 — Chair synthesis. A Chair model synthesizes into a fixed structure: where the board agrees, where it disagrees, the three hardest objections, what would change the picture, and concrete kill criteria. The Chair is not allowed to offer a verdict or end with encouragement.
# 1. Clone and install
git clone https://github.com/your-org/the-board
cd the-board
pip install -r requirements.txt
# 2. Configure
cp config.example.yaml config.yaml
# Edit config.yaml — add your OpenRouter API key
# Get a key at: https://openrouter.ai/keys
# 3. Run
python board.pyOr pass your idea directly:
python board.py --idea "A B2B SaaS tool for managing freelancer contracts"Each persona has a drop-in model field in config.yaml. Any OpenRouter model string works:
personas:
devils_advocate:
model: "anthropic/claude-opus-4-6" # upgrade the most critical persona
market_skeptic:
model: "x-ai/grok-4" # swap to a different provider entirelyOne API key. The Board uses OpenRouter to route to all providers. You need one key, not five.
Why provider diversity matters. When five different model families — trained on different data, with different RLHF lineages — produce the adversarial perspectives, the disagreements are real, not simulated. Running five GPT-5.4-mini calls produces one model wearing five hats. Running five different providers produces five genuinely distinct perspectives.
python board.py [--idea TEXT] [--config PATH] [--fast] [--preview] [--budget USD]
[--persona NAME:MODEL ...]
--idea TEXT Idea to review (prompted interactively if omitted)
--config PATH Path to config file (default: config.yaml)
--fast Devil's Advocate + Chair only — lower cost, quick sanity check
--preview Show cost estimate and model assignments, then exit
--budget USD Auto-select cheapest models under this USD ceiling
--persona NAME:MODEL Override one persona's model inline (repeatable)
Examples:
# Preview cost and model assignments before running
python board.py --idea "..." --preview
# Run with budget ceiling
python board.py --idea "..." --budget 0.15
# Override the Chair to use a stronger model
python board.py --idea "..." --persona chair:anthropic/claude-opus-4-6
# Quick sanity check (Devil's Advocate + Chair only)
python board.py --idea "..." --fast| Mode | Config | Approx. cost |
|---|---|---|
| Fast mode | Devil's Advocate + Chair | ~$0.02–0.08 |
| Full board, budget models | All → GPT-5.4 nano | ~$0.02–0.05 |
| Full board, recommended | Mixed providers (default config) | ~$0.25–0.60 |
| Full board, premium | All frontier models | ~$1.00–2.50 |
Set cost_limit_usd in config.yaml to prevent accidental overruns. The board shows an estimate before running and aborts if it exceeds the limit.
Each session saves a Markdown report to outputs/ with:
- The Chair synthesis (structured, fixed format)
- Full per-persona Stage 1 perspectives with model labels
- Full Stage 2 cross-review responses
outputs/
20260411-143022-a-b2b-saas-tool-for-managing-fr.md
It does not eliminate sycophancy. Each individual persona is still an LLM with training-induced tendencies to soften criticism. The system prompts resist this but don't fully override it.
It does not replace judgment. The Board surfaces disagreement and names hard objections. A committed believer can still read the Kill Criteria and dismiss them. The tool changes the structure of the conversation — it cannot change the reader.
It does not give verdicts. The Chair is explicitly prohibited from saying "this is a good idea" or "this will fail." What it produces is: where confidence is warranted, where it isn't, and what would change that.
Persona prompts live in personas/. Edit them or add new ones. When contributing a new persona, include:
- The system prompt
- Three example ideas it was tested on
- The objections it produced (unedited)
- Whether those objections surprised the proposer
A persona is better if it produces criticism that surprises — not just confirms what was already suspected.
- v0.1 — CLI, 5 personas, drop-in provider config, cost preview, Markdown output
- v0.2 — Web UI with provider dropdowns, domain templates (B2B SaaS, consumer, services), session history
- v0.3 — Track record mode, Slack/email integration, custom persona builder
The Board does not tell you whether your idea is good. It tells you what questions you haven't answered yet.