Claude, GPT, and Gemini deliberate together — debating requirements, architecture, and implementation across 7 structured rounds — while the developer stays in control at every decision point.
When using AI to write code, I kept running into the same problem: the answer depends on which AI you ask.
Claude catches a security issue that GPT missed. GPT suggests a library that Gemini flags as deprecated. Gemini's architecture is clean but Claude's error handling is more robust. There's no single AI that's always right — each has different training data, different strengths, and fundamentally produces outputs by sampling from a probability distribution. Ask the same question twice and you may get different answers.
AI Roundtable's answer: run all three and let the best answer win.
Each AI speaks, sees what the others said, and can revise its position. A rotating judge evaluates the outputs and produces a merge directive — not "use Claude's code," but "use Claude's base, replace the auth function with GPT's argon2 implementation, and apply Gemini's transaction rollback." The result is synthesized output that is often better than any single agent would produce alone. This is ensemble learning applied to software engineering.
The second problem: most AI coding tools are black boxes. Devin and similar tools run autonomously and hand you a result. If it's wrong, you don't know why. AI Roundtable exposes the full deliberation in real time. You see every agent's reasoning, every disagreement, every consensus decision. You can intervene, redirect, or override at any point. The developer is a participant, not a spectator.
- Features
- Quick Start
- How It Works
- Auth0 & Security
- API Reference
- Configuration
- Project Structure
- Future Improvements
- Multi-agent deliberation — Claude, GPT, and Gemini debate each round; up to 3 iterations before consensus
- 7-round structured workflow — Requirements → Architecture → Development → Code Review → QA → DevOps → Execution Analysis
- Configurable rounds — Enable/disable any round and add custom instructions per round before starting
- Rotating judge — Different AI acts as judge each round to prevent anchoring bias
- Function-level code synthesis — After consensus, the BASE agent merges the best parts from all agents at the function level
- Chunked code review — Large codebases split into ~20K-token chunks so each agent stays within TPM limits
- Developer intervention — Inject notes mid-round or at consensus; retry any round with updated context
- Docker execution — Build and run generated code in Docker; stdout/stderr streamed live; if it fails, agents analyze and fix — user decides whether to re-run
- WebContainer support — Frontend/fullstack projects run in-browser via WebContainer API
- GitHub export via Auth0 Token Vault — AI agent pushes code to GitHub server-side; raw token never reaches the browser
- Session persistence — Resume any session via `?session=<id>`; sessions survive server restarts
- Project dashboard — View and resume all past sessions at `/projects`
Prerequisites: Python 3.11+, Node.js 18+, Docker, an Auth0 tenant (free at auth0.com), and API keys for at least one AI provider.
```bash
cd backend
python -m venv venv && source venv/bin/activate
pip install -r ../requirements.txt
```

Create `backend/.env`:

```env
ALLOWED_ORIGINS=http://localhost:3000
AUTH0_DOMAIN=your-tenant.us.auth0.com
AUTH0_AUDIENCE=https://your-api-identifier
AUTH0_MGMT_CLIENT_ID=<M2M app client ID>
AUTH0_MGMT_CLIENT_SECRET=<M2M app client secret>
SECRET_KEY=<run: openssl rand -hex 32>
```

```bash
cd frontend && npm install
```

Create `frontend/.env.local`:

```env
AUTH0_DOMAIN=your-tenant.us.auth0.com
AUTH0_CLIENT_ID=<from Auth0 Application settings>
AUTH0_CLIENT_SECRET=<from Auth0 Application settings>
AUTH0_SECRET=<run: openssl rand -hex 32>
APP_BASE_URL=http://localhost:3000
AUTH0_AUDIENCE=https://your-api-identifier
AUTH0_MGMT_CLIENT_ID=<M2M app client ID>
AUTH0_MGMT_CLIENT_SECRET=<M2M app client secret>
```

Auth0 Application settings:
- Allowed Callback URLs: `http://localhost:3000/auth/callback`
- Allowed Logout URLs: `http://localhost:3000`
- Allowed Web Origins: `http://localhost:3000`
Auth0 Management API — M2M scopes required: `read:users`, `update:users`, `read:user_idp_tokens`
```bash
# Terminal 1
cd backend && source venv/bin/activate
uvicorn app.main:app --reload --port 8000

# Terminal 2
cd frontend && npm run dev
```

Open http://localhost:3000.
Allows the AI agent to push generated code to GitHub without exposing any token to the browser.
- Create a GitHub OAuth App at `github.com/settings/developers` — callback URL: `https://<AUTH0_DOMAIN>/login/callback`, scope: `repo`
- Auth0 → Authentication → Social → GitHub → paste credentials → Purpose: Authentication and Connected Accounts for Token Vault
- Auth0 → Authentication → Social → GitHub → Applications tab → enable for your app
Sign in with GitHub and the token is stored automatically. Users who sign in with Google/email can still export via Personal Access Token.
You describe what you want to build. Three AIs deliberate across up to 7 rounds — agreeing on requirements, architecture, code, review, QA, and deployment setup. At each round you see the full debate, can inject notes, and confirm before moving forward. At the end, download a ZIP or export directly to GitHub.
Best for: APIs, web apps, CLI tools, scripts — anything that can be containerized.
Upload your existing files at the start. The agents extend or modify them rather than starting from scratch — so your structure and conventions are preserved. Useful for adding a feature, refactoring a module, or getting a multi-AI code review.
Best for: Adding features to an existing project, large refactors, getting a second (and third) opinion on code you already wrote.
Not every project needs all 7 rounds. Running just Requirements + Architecture gives you a detailed spec and system design in minutes. Running just Developer + Code Review gives you working code with a critique. Mix and match.
Common lightweight combos:
- Spec only: Requirements + Architecture
- Code only: Developer + Code Review
- Full cycle without Docker: Requirements → QA (skip DevOps + Execution & Analysis)
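These combos would be expressed through the `round_configs` option at session start. A hypothetical payload for a spec-only run; every field name other than `round_configs` (which the README itself mentions) is an illustrative assumption, not the documented schema:

```python
# Hypothetical body for POST /session/start: only "round_configs" is a name
# taken from this README; the other keys are illustrative assumptions.
payload = {
    "prompt": "Build a REST API for a todo list",
    "selected_agents": ["claude", "gpt", "gemini"],
    "round_configs": {
        "requirements": {"enabled": True, "instructions": "Keep scope minimal"},
        "architecture": {"enabled": True},
        "developer": {"enabled": False},   # spec-only: skip code-producing rounds
        "code_review": {"enabled": False},
        "qa": {"enabled": False},
        "devops": {"enabled": False},
        "execution": {"enabled": False},
    },
}

# Which rounds would actually run for this configuration
enabled_rounds = [name for name, cfg in payload["round_configs"].items()
                  if cfg["enabled"]]
```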
At every consensus point, you decide what happens next. If the agents disagree, you see the dispute and choose a direction. You can type a note mid-round to redirect the discussion. If a round goes wrong, roll it back and retry with updated instructions.
After DevOps generates a Dockerfile, the Execution & Analysis round auto-builds and runs the container, streams live logs to your browser, and has the agents analyze any failures and apply fixes automatically. Retries up to 3 times before surfacing a final failure.
Note: Success depends on AI-generated Dockerfile quality and project complexity. Simpler projects (single-service APIs) work reliably; complex multi-service setups may need manual intervention.
| # | Round | Role | Token Budget |
|---|---|---|---|
| 1 | Requirements | Principal Product Engineer | 2048 |
| 2 | Architecture | Distinguished Software Architect | 4096 |
| 3 | Development | Principal Software Engineer | 8192 |
| 4 | Code Review | Staff Engineer | 4096 |
| 5 | QA | Principal QA Engineer | 8192 |
| 6 | DevOps | Senior Platform Engineer | 4096 |
| 7 | Execution & Analysis | SRE / Runtime Debugger | 8192 |
Speaking order is randomized per round to prevent first-speaker anchoring bias. Any round can be skipped via `round_configs` at session start.
Each round runs up to 3 iterations. Agents see all previous outputs before speaking:
Agent A speaks → Agent B speaks (sees A) → Agent C speaks (sees A, B)
→ Judge evaluates → CONSENSUS? break : iterate (max 3)
→ After 3 iterations without consensus: escalate to developer
The judge rotates by round (`selected_agents[round_index % len(selected_agents)]`) so no single AI always anchors decisions.
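The loop and judge rotation can be sketched in a few lines of Python; `speak` and `judge` stand in for the real streaming model calls and are assumptions, not the app's code:

```python
def judge_for_round(selected_agents: list[str], round_index: int) -> str:
    # Rotate the judge so no single model anchors every decision.
    return selected_agents[round_index % len(selected_agents)]

def run_round(selected_agents, round_index, speak, judge, max_iters=3):
    """Sketch of the debate loop: each agent sees prior outputs, the rotating
    judge checks for consensus, and the round escalates to the developer
    after max_iters iterations without consensus."""
    judge_agent = judge_for_round(selected_agents, round_index)
    transcript = []
    for _ in range(max_iters):
        for agent in selected_agents:   # speaking order randomized in the real app
            transcript.append(speak(agent, transcript))
        verdict = judge(judge_agent, transcript)
        if verdict.startswith("CONSENSUS"):
            return verdict
    return "ESCALATE"                   # developer decides how to proceed
```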
Three judge formats by round type:
| Category | Rounds | Format |
|---|---|---|
| Discussion | Requirements, Architecture | CONSENSUS: <summary> or DISCUSS: <options> |
| Code-producing | Developer, QA, DevOps | CONSENSUS: BASE=<agent>. INCORPORATE: ... MUST_FIX: ... |
| Code-reviewing | Reviewer, Execution & Analysis | CONSENSUS: CRITICAL: ... WARNINGS: ... CONFIRMED_CLEAN: ... |
After consensus in code-producing rounds, the BASE agent re-generates all files incorporating the INCORPORATE directives — merging the best function-level contributions from every agent.
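A minimal parser for the code-producing directive format, assuming the plain-text layout shown in the table above (the app's actual parsing may differ):

```python
import re

def parse_merge_directive(text: str) -> dict:
    """Split a judge directive of the documented shape
    'CONSENSUS: BASE=<agent>. INCORPORATE: ... MUST_FIX: ...'
    into its parts. The field layout is an assumption based on the
    format the README lists, not the app's exact grammar."""
    base = re.search(r"BASE=(\w+)", text)
    inc = re.search(r"INCORPORATE:\s*(.*?)(?:MUST_FIX:|$)", text, re.S)
    fix = re.search(r"MUST_FIX:\s*(.*)", text, re.S)
    return {
        "base": base.group(1) if base else None,
        "incorporate": inc.group(1).strip() if inc else "",
        "must_fix": fix.group(1).strip() if fix else "",
    }
```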
The Reviewer round uses token-budget chunking instead of the debate loop to stay within GPT's ~30K TPM limit:
- Files sorted by priority: app code → config/infra → tests
- Grouped into ≤ 20K token chunks
- All agents review each chunk; findings accumulated
- Single consensus check on accumulated findings (text only — no code re-sent)
- Synthesizer applies fixes chunk by chunk
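The chunking step can be sketched as follows; the ~4-characters-per-token estimate and the file classifier are simplifying assumptions:

```python
PRIORITY = {"app": 0, "config": 1, "test": 2}

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (assumption, not the app's counter).
    return max(1, len(text) // 4)

def chunk_files(files: dict[str, str], budget: int = 20_000) -> list[list[str]]:
    """Group files into chunks whose combined token estimate stays within
    the budget. Priority buckets (app code -> config/infra -> tests) follow
    the README; the path-based classifier below is a simplification."""
    def bucket(path: str) -> int:
        if "test" in path:
            return PRIORITY["test"]
        if path.endswith((".yml", ".yaml", ".toml", "Dockerfile", ".ini")):
            return PRIORITY["config"]
        return PRIORITY["app"]

    chunks, current, used = [], [], 0
    for path in sorted(files, key=bucket):
        cost = estimate_tokens(files[path])
        if current and used + cost > budget:
            chunks.append(current)          # close the full chunk
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```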
After the DevOps round completes:
- `final_files` written to a temp directory
- `docker compose up --build` (if `docker-compose.yml` present) or `docker build && docker run`
- Stdout/stderr streamed as SSE events to the UI
- On container exit → Execution & Analysis round auto-starts; agents analyze the output and apply fixes
- After analysis completes → user is asked whether to re-run Docker with the fixed code
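A sketch of the execution step under those rules; the command selection mirrors the README, while `stream_container`, its `emit` callback, and the `roundtable-app` tag are illustrative assumptions:

```python
import subprocess

def docker_command(final_files: dict[str, str]) -> list[str]:
    """Choose the execution command as described: compose when a
    docker-compose.yml is among the generated files, plain build/run
    otherwise. The "roundtable-app" image tag is illustrative."""
    if "docker-compose.yml" in final_files:
        return ["docker", "compose", "up", "--build"]
    return ["sh", "-c",
            "docker build -t roundtable-app . && docker run --rm roundtable-app"]

def stream_container(cmd: list[str], workdir: str, emit) -> int:
    """Run the command and forward each output line to emit(), mirroring
    the exec_output / exec_done / exec_error SSE events. emit is a
    placeholder for the server's event sink."""
    proc = subprocess.Popen(cmd, cwd=workdir, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        emit("exec_output", line.rstrip())
    code = proc.wait()
    emit("exec_done" if code == 0 else "exec_error", str(code))
    return code
```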
Sign in with Google or GitHub. Every session is scoped to the authenticated user's `sub` claim. The backend verifies Auth0 JWTs (RS256, JWKS) on every request.
| Stage | Where | Encryption |
|---|---|---|
| Saved by user | Auth0 `user_metadata` | Fernet (server-side) |
| Active session | SQLite + in-memory | Fernet |
| In transit | HTTPS only | TLS |
Keys are never sent in request bodies beyond the initial save — the backend reads them from Auth0 user_metadata using the JWT's user_id.
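One detail worth noting: `openssl rand -hex 32` emits 64 hex characters (32 bytes), while the `cryptography` library's Fernet expects a urlsafe-base64 32-byte key. A plausible derivation, offered as an assumption about the app's internals rather than its actual code:

```python
import base64

def fernet_key_from_hex(hex_secret: str) -> bytes:
    """Convert a 64-char hex secret (as produced by `openssl rand -hex 32`)
    into the urlsafe-base64 form Fernet expects. How the app actually
    derives its key is an assumption; this shows one consistent way."""
    raw = bytes.fromhex(hex_secret)
    if len(raw) != 32:
        raise ValueError("SECRET_KEY must decode to exactly 32 bytes")
    return base64.urlsafe_b64encode(raw)
```

The resulting value can be passed straight to `cryptography.fernet.Fernet(...)`.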
The GitHub OAuth App is registered once (under the developer's account) as an application identity. Each user who connects GitHub gets their own token — the developer's account is never used for user pushes.
1. User signs in with GitHub → Auth0 stores that user's token in Token Vault
2. User clicks "Export to GitHub" → `POST /api/github/push`
3. Server reads the user's token from Token Vault (Management API)
4. Server pushes files to GitHub using the user's token
5. Repo URL returned — the token never reaches the browser
- Rate limiting: 10 req/min per IP (`slowapi`)
- Security headers: `X-Frame-Options: DENY`, `X-Content-Type-Options: nosniff`, HSTS
- Dockerfile scanning: rejects `curl | sh`, privileged flags, host network/PID mounts
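A minimal scanner in the spirit of that check; the exact patterns the app rejects are assumptions based on the list above:

```python
import re

# Patterns the README says are rejected; the precise rules are assumptions.
FORBIDDEN = [
    r"curl[^\n]*\|\s*(?:sudo\s+)?sh",   # curl | sh pipelines
    r"--privileged",                     # privileged containers
    r"--network[= ]host",                # host networking
    r"--pid[= ]host",                    # host PID namespace
]

def scan_dockerfile(text: str) -> list[str]:
    """Return the forbidden patterns found in a Dockerfile (empty = clean)."""
    return [p for p in FORBIDDEN if re.search(p, text)]
```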
| Provider | Where to get |
|---|---|
| Anthropic (Claude) | console.anthropic.com/settings/keys |
| OpenAI (GPT) | platform.openai.com/api-keys |
| Google (Gemini) | aistudio.google.com/app/apikey |
FastAPI backend (/api):
| Method | Endpoint | Description |
|---|---|---|
| POST | `/session/start` | Create session |
| GET | `/sessions` | List user's sessions (JWT required) |
| GET | `/session/{id}` | Get session state |
| GET | `/session/{id}/round` | Stream current round (SSE) |
| POST | `/session/developer-input` | Inject developer note |
| POST | `/session/{id}/retry` | Roll back to round index |
| GET | `/session/{id}/chat?message=` | Direct Q&A with judge (SSE) |
| GET | `/session/{id}/execute` | Run in Docker (SSE) |
| POST | `/session/{id}/stop` | Stop Docker container |
| POST | `/session/{id}/complete` | Mark complete |
| GET | `/session/{id}/files` | Get generated files |
| GET | `/session/{id}/download` | Download as ZIP |
Next.js proxy routes (/api, Auth0 session required):
| Method | Endpoint | Description |
|---|---|---|
| GET/PATCH | `/api/user/keys` | AI API keys via Auth0 `user_metadata` |
| GET | `/api/user/sessions` | Session list (adds Bearer token server-side) |
| GET | `/api/github/status` | GitHub Token Vault connection status |
| POST | `/api/github/push` | Server-side GitHub push via Token Vault |
SSE event types:
| Event | Description |
|---|---|
| `round_start` | Round begins with metadata |
| `agent_start` / `agent_end` | Agent speaking boundaries |
| `token` | Streaming token `{agent, token}` |
| `chunk_start` | Reviewer chunk starting |
| `debate_iteration` | New debate turn |
| `synthesis_start` / `synthesis_end` | Code merge in progress |
| `consensus` | Round result with summary, options, next_round |
| `exec_output` / `exec_done` / `exec_error` | Docker execution events |
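Clients consume these as standard Server-Sent Events. A minimal stdlib parser of the SSE wire format, handy for scripting against the stream; the event payloads shown in the usage test are hypothetical:

```python
def parse_sse(stream: str) -> list[tuple[str, str]]:
    """Parse SSE text into (event, data) pairs. Field names (event:, data:)
    and the blank-line event separator follow the SSE specification; this
    sketch ignores id:/retry: fields and comments."""
    events, event, data = [], "message", []
    for line in stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            events.append((event, "\n".join(data)))   # blank line ends the event
            event, data = "message", []
    if data:                                           # flush a trailing event
        events.append((event, "\n".join(data)))
    return events
```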
backend/.env:
| Variable | Default | Description |
|---|---|---|
| `ALLOWED_ORIGINS` | `http://localhost:3000` | CORS allowed origins |
| `AUTH0_DOMAIN` | (unset) | Enables JWT verification when set |
| `AUTH0_AUDIENCE` | (unset) | Must match frontend value |
| `AUTH0_MGMT_CLIENT_ID` | (unset) | M2M app — for `user_metadata` access |
| `AUTH0_MGMT_CLIENT_SECRET` | (unset) | M2M app secret |
| `SECRET_KEY` | (required) | Fernet key for API key encryption (`openssl rand -hex 32`) |
frontend/.env.local:
| Variable | Description |
|---|---|
| `AUTH0_DOMAIN` | Auth0 tenant domain |
| `AUTH0_CLIENT_ID` | Application client ID |
| `AUTH0_CLIENT_SECRET` | Application client secret |
| `AUTH0_SECRET` | Cookie encryption key (`openssl rand -hex 32`) |
| `APP_BASE_URL` | App base URL |
| `AUTH0_AUDIENCE` | API identifier |
| `AUTH0_MGMT_CLIENT_ID` | M2M client ID |
| `AUTH0_MGMT_CLIENT_SECRET` | M2M client secret |
AI Models:
| Agent | Deliberation | Judge (consensus) |
|---|---|---|
| Claude | `claude-sonnet-4-6` | `claude-haiku-4-5-20251001` |
| GPT | `gpt-4o` | `gpt-4o-mini` |
| Gemini | `gemini-1.5-pro` | `gemini-1.5-flash` |
```
ai-roundtable/
├── backend/
│   ├── app/
│   │   ├── main.py               # FastAPI app, CORS, rate limiting, security headers
│   │   ├── auth.py               # Auth0 JWT verification (RS256 + JWKS)
│   │   ├── config.py             # Settings from environment
│   │   ├── models/schemas.py     # Pydantic request/response models
│   │   ├── routers/session.py    # All API endpoints + debate loop orchestration
│   │   └── services/
│   │       ├── claude.py         # Anthropic streaming client
│   │       ├── gpt.py            # OpenAI streaming client
│   │       ├── gemini.py         # Google Gemini streaming client
│   │       ├── consensus.py      # Judge logic, consensus parsing
│   │       ├── orchestrator.py   # Round order, token budgets, role prompts
│   │       └── auth0_mgmt.py     # Auth0 Management API (user_metadata)
│   └── tests/
│       ├── unit/                 # Consensus, chunked review, session lock
│       └── integration/          # Session lifecycle, restore, retry
└── frontend/
    ├── app/
    │   ├── page.tsx              # Home — routes to setup or session restore
    │   ├── projects/page.tsx     # Session dashboard (Server Component)
    │   └── api/
    │       ├── user/keys/        # AI API keys via Auth0 user_metadata
    │       ├── user/sessions/    # Session list proxy (adds Bearer token)
    │       ├── github/status/    # Token Vault connection check
    │       └── github/push/      # Server-side GitHub push
    ├── components/
    │   ├── ChatRoom.tsx          # Main session UI
    │   ├── SetupScreen.tsx       # Agent selection + API key entry
    │   ├── GitHubExportModal.tsx # Token Vault + PAT export
    │   ├── ConsensusBanner.tsx   # Dispute resolution UI
    │   ├── ExecutionPanel.tsx    # Docker terminal output
    │   └── FileViewerModal.tsx   # In-browser file browser
    └── lib/
        ├── useRoundStream.ts     # Session state + SSE management
        ├── useAgentSetup.ts      # Agent config + Auth0 API key sync
        ├── api.ts                # HTTP/SSE client functions
        └── github.ts             # Client-side GitHub API (PAT fallback)
```
```bash
# Backend
cd backend && source venv/bin/activate
pytest                                       # all tests
pytest --cov=app --cov-report=term-missing   # with coverage

# Frontend
cd frontend && npm test
```

| Area | Current | Direction |
|---|---|---|
| Debate iterations | 3 (hardcoded) | Per-round config; more for complex projects |
| Judge model | Cost-optimized (Haiku, mini, Flash) | User-selectable: cost vs. accuracy |
| Token budget & code review | Fixed per round type; code review splits files into ~20K-token chunks (cross-file relationships can be missed); QA restricted to Claude/Gemini when multiple agents selected (GPT's 30K TPM limit is too low for full codebase) | Tiered plans: higher tiers get larger budgets, skip chunking, and allow all agents in QA — sending all files in one pass for a complete, relationship-aware review |
| WebContainer | npm projects only | Python, Deno runtime support |
| Cross-round memory | No persistent context | Long-running project continuity |
| GitHub export | Push to new repo only | Push to existing repo, open PR |
| Orchestration | Custom round loop in orchestrator.py | Migrate to LangGraph — declarative graph nodes per round, conditional edges for retry/consensus, built-in human-in-the-loop and state persistence |