chunga7879/ai-roundtable
AI Roundtable

Claude, GPT, and Gemini deliberate together — debating requirements, architecture, and implementation across 7 structured rounds — while the developer stays in control at every decision point.


Why This Exists

When using AI to write code, I kept running into the same problem: the answer depends on which AI you ask.

Claude catches a security issue that GPT missed. GPT suggests a library that Gemini flags as deprecated. Gemini's architecture is clean but Claude's error handling is more robust. There's no single AI that's always right — each has different training data, different strengths, and fundamentally produces outputs by sampling from a probability distribution. Ask the same question twice and you may get different answers.

AI Roundtable's answer: run all three and let the best answer win.

Each AI speaks, sees what the others said, and can revise its position. A rotating judge evaluates the outputs and produces a merge directive — not "use Claude's code," but "use Claude's base, replace the auth function with GPT's argon2 implementation, and apply Gemini's transaction rollback." The result is synthesized output that is better than any single agent could produce alone. This is ensemble learning applied to software engineering.

The second problem: most AI coding tools are black boxes. Devin and similar tools run autonomously and hand you a result. If it's wrong, you don't know why. AI Roundtable exposes the full deliberation in real time. You see every agent's reasoning, every disagreement, every consensus decision. You can intervene, redirect, or override at any point. The developer is a participant, not a spectator.


Features

  • Multi-agent deliberation — Claude, GPT, and Gemini debate each round; up to 3 iterations before consensus
  • 7-round structured workflow — Requirements → Architecture → Development → Code Review → QA → DevOps → Execution Analysis
  • Configurable rounds — Enable/disable any round and add custom instructions per round before starting
  • Rotating judge — Different AI acts as judge each round to prevent anchoring bias
  • Function-level code synthesis — After consensus, the BASE agent merges the best parts from all agents at the function level
  • Chunked code review — Large codebases split into ~20K-token chunks so each agent stays within TPM limits
  • Developer intervention — Inject notes mid-round or at consensus; retry any round with updated context
  • Docker execution — Build and run generated code in Docker; stdout/stderr streamed live; if it fails, agents analyze and fix — user decides whether to re-run
  • WebContainer support — Frontend/fullstack projects run in-browser via WebContainer API
  • GitHub export via Auth0 Token Vault — AI agent pushes code to GitHub server-side; raw token never reaches the browser
  • Session persistence — Resume any session via ?session=<id>; sessions survive server restarts
  • Project dashboard — View and resume all past sessions at /projects

Quick Start

Prerequisites: Python 3.11+, Node.js 18+, Docker, an Auth0 tenant (free at auth0.com), and API keys for at least one AI provider.

1. Backend

cd backend
python -m venv venv && source venv/bin/activate
pip install -r ../requirements.txt

Create backend/.env:

ALLOWED_ORIGINS=http://localhost:3000
AUTH0_DOMAIN=your-tenant.us.auth0.com
AUTH0_AUDIENCE=https://your-api-identifier
AUTH0_MGMT_CLIENT_ID=<M2M app client ID>
AUTH0_MGMT_CLIENT_SECRET=<M2M app client secret>
SECRET_KEY=<run: openssl rand -hex 32>

2. Frontend

cd frontend && npm install

Create frontend/.env.local:

AUTH0_DOMAIN=your-tenant.us.auth0.com
AUTH0_CLIENT_ID=<from Auth0 Application settings>
AUTH0_CLIENT_SECRET=<from Auth0 Application settings>
AUTH0_SECRET=<run: openssl rand -hex 32>
APP_BASE_URL=http://localhost:3000
AUTH0_AUDIENCE=https://your-api-identifier
AUTH0_MGMT_CLIENT_ID=<M2M app client ID>
AUTH0_MGMT_CLIENT_SECRET=<M2M app client secret>

Auth0 Application settings:

  • Allowed Callback URLs: http://localhost:3000/auth/callback
  • Allowed Logout URLs: http://localhost:3000
  • Allowed Web Origins: http://localhost:3000

Auth0 Management API — M2M scopes required: read:users, update:users, read:user_idp_tokens

3. Run

# Terminal 1
cd backend && source venv/bin/activate
uvicorn app.main:app --reload --port 8000

# Terminal 2
cd frontend && npm run dev

Open http://localhost:3000.

4. GitHub Token Vault (optional)

Allows the AI agent to push generated code to GitHub without exposing any token to the browser.

  1. Create a GitHub OAuth App at github.com/settings/developers — callback URL: https://<AUTH0_DOMAIN>/login/callback, scope: repo
  2. Auth0 → Authentication → Social → GitHub → paste credentials → Purpose: Authentication and Connected Accounts for Token Vault
  3. Auth0 → Authentication → Social → GitHub → Applications tab → enable for your app

Sign in with GitHub and the token is stored automatically. Users who sign in with Google/email can still export via Personal Access Token.


Main Use Cases

1. Build a new project from scratch

You describe what you want to build. Three AIs deliberate across up to 7 rounds — agreeing on requirements, architecture, code, review, QA, and deployment setup. At each round you see the full debate, can inject notes, and confirm before moving forward. At the end, download a ZIP or export directly to GitHub.

Best for: APIs, web apps, CLI tools, scripts — anything that can be containerized.


2. Improve an existing codebase

Upload your existing files at the start. The agents extend or modify them rather than starting from scratch — so your structure and conventions are preserved. Useful for adding a feature, refactoring a module, or getting a multi-AI code review.

Best for: Adding features to an existing project, large refactors, getting a second (and third) opinion on code you already wrote.


3. Pick only the rounds you need

Not every project needs all 7 rounds. Running just Requirements + Architecture gives you a detailed spec and system design in minutes. Running just Developer + Code Review gives you working code with a critique. Mix and match.

Common lightweight combos:

  • Spec only: Requirements + Architecture
  • Code only: Developer + Code Review
  • Full cycle without Docker: Requirements → QA (skip DevOps + Execution & Analysis)

4. Stay in control during execution

At every consensus point, you decide what happens next. If the agents disagree, you see the dispute and choose a direction. You can type a note mid-round to redirect the discussion. If a round goes wrong, roll it back and retry with updated instructions.


5. Run and debug generated code (Execution & Analysis)

After DevOps generates a Dockerfile, the Execution & Analysis round auto-builds and runs the container, streams live logs to your browser, and has the agents analyze any failures and apply fixes automatically. Retries up to 3 times before surfacing a final failure.

Note: Success depends on AI-generated Dockerfile quality and project complexity. Simpler projects (single-service APIs) work reliably; complex multi-service setups may need manual intervention.


How It Works

7-Round Structure

| # | Round | Role | Token Budget |
|---|-------|------|--------------|
| 1 | Requirements | Principal Product Engineer | 2048 |
| 2 | Architecture | Distinguished Software Architect | 4096 |
| 3 | Development | Principal Software Engineer | 8192 |
| 4 | Code Review | Staff Engineer | 4096 |
| 5 | QA | Principal QA Engineer | 8192 |
| 6 | DevOps | Senior Platform Engineer | 4096 |
| 7 | Execution & Analysis | SRE / Runtime Debugger | 8192 |

Speaking order is randomized per round to prevent first-speaker anchoring bias. Any round can be skipped via round_configs at session start.

Debate Loop

Each round runs up to 3 iterations. Agents see all previous outputs before speaking:

Agent A speaks → Agent B speaks (sees A) → Agent C speaks (sees A, B)
→ Judge evaluates → CONSENSUS? break : iterate (max 3)
→ After 3 iterations without consensus: escalate to developer
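
The loop can be sketched in Python (a simplified model: `run_debate`, the agent/judge objects, and their methods are illustrative, not the project's actual API):

```python
import random

MAX_ITERATIONS = 3

def run_debate(agents, judge, topic):
    """Simplified model of one round: agents speak in a randomized order,
    each seeing the full transcript so far; the judge checks for consensus
    after every pass, escalating to the developer after 3 failed passes."""
    transcript = []
    for iteration in range(MAX_ITERATIONS):
        order = random.sample(agents, len(agents))  # randomized speaking order
        for agent in order:
            transcript.append((agent.name, agent.speak(topic, transcript)))
        verdict = judge.evaluate(transcript)
        if verdict.startswith("CONSENSUS"):
            return verdict, iteration + 1
    return "ESCALATE_TO_DEVELOPER", MAX_ITERATIONS
```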

Consensus & Synthesis

The judge rotates by round (`selected_agents[round_index % len(selected_agents)]`) so no single AI always anchors decisions.

Three judge formats by round type:

| Category | Rounds | Format |
|----------|--------|--------|
| Discussion | Requirements, Architecture | `CONSENSUS: <summary>` or `DISCUSS: <options>` |
| Code-producing | Developer, QA, DevOps | `CONSENSUS: BASE=<agent>. INCORPORATE: ... MUST_FIX: ...` |
| Code-reviewing | Reviewer, Execution & Analysis | `CONSENSUS: CRITICAL: ... WARNINGS: ... CONFIRMED_CLEAN: ...` |

After consensus in code-producing rounds, the BASE agent re-generates all files incorporating the INCORPORATE directives — merging the best function-level contributions from every agent.
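
A judge directive in that format could be parsed with a small helper (a sketch of one plausible parser; the project's actual logic in `consensus.py` may differ):

```python
import re

def parse_merge_directive(text):
    """Split a code-producing round's verdict into its parts.
    Expected shape: 'CONSENSUS: BASE=<agent>. INCORPORATE: <...> MUST_FIX: <...>'"""
    base = re.search(r"BASE=(\w+)", text)
    incorporate = re.search(r"INCORPORATE:\s*(.*?)(?:\s*MUST_FIX:|$)", text, re.S)
    must_fix = re.search(r"MUST_FIX:\s*(.*)", text, re.S)
    return {
        "base": base.group(1) if base else None,
        "incorporate": incorporate.group(1).strip() if incorporate else "",
        "must_fix": must_fix.group(1).strip() if must_fix else "",
    }
```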

Chunked Code Review

The Reviewer round uses token-budget chunking instead of the debate loop to stay within GPT's ~30K TPM limit:

  1. Files sorted by priority: app code → config/infra → tests
  2. Grouped into ≤ 20K token chunks
  3. All agents review each chunk; findings accumulated
  4. Single consensus check on accumulated findings (text only — no code re-sent)
  5. Synthesizer applies fixes chunk by chunk
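
The grouping step can be sketched as a greedy packer over estimated token counts (illustrative: `estimate_tokens` uses a rough 4-characters-per-token heuristic, not the project's actual tokenizer):

```python
CHUNK_BUDGET = 20_000  # max estimated tokens per review chunk

def estimate_tokens(text):
    # rough heuristic: roughly 4 characters per token
    return len(text) // 4 + 1

def chunk_files(files, budget=CHUNK_BUDGET):
    """Greedily pack (path, content) pairs into chunks under the budget.
    Assumes files are already sorted by priority (app -> config -> tests)."""
    chunks, current, used = [], [], 0
    for path, content in files:
        cost = estimate_tokens(content)
        if current and used + cost > budget:
            chunks.append(current)       # budget exceeded: start a new chunk
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```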

Execution & Analysis (Round 7)

After DevOps round completes:

  1. final_files written to a temp directory
  2. docker compose up --build (if docker-compose.yml present) or docker build && docker run
  3. Stdout/stderr streamed as SSE events to the UI
  4. On container exit → Execution & Analysis round auto-starts; agents analyze the output and apply fixes
  5. After analysis completes → user is asked whether to re-run Docker with the fixed code
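
The build-and-stream step can be modeled with a generic subprocess helper (a sketch: `stream_command` is an illustrative name, and the real round runs Docker and forwards lines as SSE events):

```python
import subprocess

def stream_command(cmd, on_line):
    """Run a command and forward each stdout/stderr line to a callback,
    mirroring how container output is streamed to the UI line by line."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for line in proc.stdout:
        on_line(line.rstrip("\n"))
    return proc.wait()  # a non-zero exit code triggers the analysis step
```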

Auth0 & Security

Authentication

Sign in with Google or GitHub. Every session is scoped to the authenticated user's sub claim. The backend verifies Auth0 JWTs (RS256, JWKS) on every request.

API Key Storage

| Stage | Where | Encryption |
|-------|-------|------------|
| Saved by user | Auth0 `user_metadata` | Fernet (server-side) |
| Active session | SQLite + in-memory | Fernet |
| In transit | HTTPS only | TLS |

Keys are never sent in request bodies beyond the initial save — the backend reads them from Auth0 user_metadata using the JWT's user_id.
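
A sketch of that server-side encryption using the `cryptography` package's Fernet (function names are illustrative; the project's key handling may differ):

```python
from cryptography.fernet import Fernet

def encrypt_api_key(plaintext, fernet_key):
    # Fernet provides authenticated symmetric encryption (AES-128-CBC + HMAC)
    return Fernet(fernet_key).encrypt(plaintext.encode()).decode()

def decrypt_api_key(token, fernet_key):
    return Fernet(fernet_key).decrypt(token.encode()).decode()
```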

GitHub Token Vault

The GitHub OAuth App is registered once (under the developer's account) as an application identity. Each user who connects GitHub gets their own token — the developer's account is never used for user pushes.

User signs in with GitHub → Auth0 stores that user's token in Token Vault
User clicks "Export to GitHub" → POST /api/github/push
  Server reads user's token from Token Vault (Management API)
  Server pushes files to GitHub using user's token
  Returns repo URL — token never sent to browser

Other Security Measures

  • Rate limiting: 10 req/min per IP (slowapi)
  • Security headers: X-Frame-Options: DENY, X-Content-Type-Options: nosniff, HSTS
  • Dockerfile scanning: rejects `curl | sh`, privileged flags, host network/PID mounts
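
A minimal version of such a scanner (the patterns below are illustrative; the project's actual rule set may be broader):

```python
import re

# patterns that cause a generated Dockerfile to be rejected
FORBIDDEN = [
    re.compile(r"curl[^\n|]*\|\s*(sh|bash)"),   # piping downloads into a shell
    re.compile(r"--privileged"),                # privileged containers
    re.compile(r"--network[= ]host"),           # host network namespace
    re.compile(r"--pid[= ]host"),               # host PID namespace
]

def dockerfile_is_safe(text):
    return not any(p.search(text) for p in FORBIDDEN)
```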

AI Provider Keys

| Provider | Where to get |
|----------|--------------|
| Anthropic (Claude) | console.anthropic.com/settings/keys |
| OpenAI (GPT) | platform.openai.com/api-keys |
| Google (Gemini) | aistudio.google.com/app/apikey |

API Reference

FastAPI backend (/api):

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/session/start` | Create session |
| GET | `/sessions` | List user's sessions (JWT required) |
| GET | `/session/{id}` | Get session state |
| GET | `/session/{id}/round` | Stream current round (SSE) |
| POST | `/session/developer-input` | Inject developer note |
| POST | `/session/{id}/retry` | Roll back to round index |
| GET | `/session/{id}/chat?message=` | Direct Q&A with judge (SSE) |
| GET | `/session/{id}/execute` | Run in Docker (SSE) |
| POST | `/session/{id}/stop` | Stop Docker container |
| POST | `/session/{id}/complete` | Mark complete |
| GET | `/session/{id}/files` | Get generated files |
| GET | `/session/{id}/download` | Download as ZIP |

Next.js proxy routes (/api, Auth0 session required):

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET/PATCH | `/api/user/keys` | AI API keys via Auth0 user_metadata |
| GET | `/api/user/sessions` | Session list (adds Bearer token server-side) |
| GET | `/api/github/status` | GitHub Token Vault connection status |
| POST | `/api/github/push` | Server-side GitHub push via Token Vault |

SSE event types:

| Event | Description |
|-------|-------------|
| `round_start` | Round begins with metadata |
| `agent_start` / `agent_end` | Agent speaking boundaries |
| `token` | Streaming token `{agent, token}` |
| `chunk_start` | Reviewer chunk starting |
| `debate_iteration` | New debate turn |
| `synthesis_start` / `synthesis_end` | Code merge in progress |
| `consensus` | Round result with summary, options, next_round |
| `exec_output` / `exec_done` / `exec_error` | Docker execution events |
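
These arrive as standard SSE frames; a minimal parser sketch (assuming `event:` / `data:` lines with JSON payloads, which is an assumption about the wire format):

```python
import json

def parse_sse(stream_text):
    """Parse raw SSE text into (event, data) pairs.
    Assumes each frame is 'event: <type>' plus 'data: <json>' lines,
    with frames separated by a blank line."""
    events = []
    event_type, data_lines = None, []
    for line in stream_text.splitlines() + [""]:
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event_type:
            events.append((event_type, json.loads("\n".join(data_lines) or "null")))
            event_type, data_lines = None, []
    return events
```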

Configuration

backend/.env:

| Variable | Default | Description |
|----------|---------|-------------|
| ALLOWED_ORIGINS | http://localhost:3000 | CORS allowed origins |
| AUTH0_DOMAIN | (unset) | Enables JWT verification when set |
| AUTH0_AUDIENCE | (unset) | Must match frontend value |
| AUTH0_MGMT_CLIENT_ID | (unset) | M2M app, for user_metadata access |
| AUTH0_MGMT_CLIENT_SECRET | (unset) | M2M app secret |
| SECRET_KEY | (required) | Fernet key for API key encryption (`openssl rand -hex 32`) |

frontend/.env.local:

| Variable | Description |
|----------|-------------|
| AUTH0_DOMAIN | Auth0 tenant domain |
| AUTH0_CLIENT_ID | Application client ID |
| AUTH0_CLIENT_SECRET | Application client secret |
| AUTH0_SECRET | Cookie encryption key (`openssl rand -hex 32`) |
| APP_BASE_URL | App base URL |
| AUTH0_AUDIENCE | API identifier |
| AUTH0_MGMT_CLIENT_ID | M2M client ID |
| AUTH0_MGMT_CLIENT_SECRET | M2M client secret |

AI Models:

| Agent | Deliberation | Judge (consensus) |
|-------|--------------|-------------------|
| Claude | claude-sonnet-4-6 | claude-haiku-4-5-20251001 |
| GPT | gpt-4o | gpt-4o-mini |
| Gemini | gemini-1.5-pro | gemini-1.5-flash |

Project Structure

ai-roundtable/
├── backend/
│   ├── app/
│   │   ├── main.py               # FastAPI app, CORS, rate limiting, security headers
│   │   ├── auth.py               # Auth0 JWT verification (RS256 + JWKS)
│   │   ├── config.py             # Settings from environment
│   │   ├── models/schemas.py     # Pydantic request/response models
│   │   ├── routers/session.py    # All API endpoints + debate loop orchestration
│   │   └── services/
│   │       ├── claude.py         # Anthropic streaming client
│   │       ├── gpt.py            # OpenAI streaming client
│   │       ├── gemini.py         # Google Gemini streaming client
│   │       ├── consensus.py      # Judge logic, consensus parsing
│   │       ├── orchestrator.py   # Round order, token budgets, role prompts
│   │       └── auth0_mgmt.py     # Auth0 Management API (user_metadata)
│   └── tests/
│       ├── unit/                 # Consensus, chunked review, session lock
│       └── integration/          # Session lifecycle, restore, retry
└── frontend/
    ├── app/
    │   ├── page.tsx              # Home — routes to setup or session restore
    │   ├── projects/page.tsx     # Session dashboard (Server Component)
    │   └── api/
    │       ├── user/keys/        # AI API keys via Auth0 user_metadata
    │       ├── user/sessions/    # Session list proxy (adds Bearer token)
    │       ├── github/status/    # Token Vault connection check
    │       └── github/push/      # Server-side GitHub push
    ├── components/
    │   ├── ChatRoom.tsx          # Main session UI
    │   ├── SetupScreen.tsx       # Agent selection + API key entry
    │   ├── GitHubExportModal.tsx # Token Vault + PAT export
    │   ├── ConsensusBanner.tsx   # Dispute resolution UI
    │   ├── ExecutionPanel.tsx    # Docker terminal output
    │   └── FileViewerModal.tsx   # In-browser file browser
    └── lib/
        ├── useRoundStream.ts     # Session state + SSE management
        ├── useAgentSetup.ts      # Agent config + Auth0 API key sync
        ├── api.ts                # HTTP/SSE client functions
        └── github.ts             # Client-side GitHub API (PAT fallback)

Testing

# Backend
cd backend && source venv/bin/activate
pytest                                    # all tests
pytest --cov=app --cov-report=term-missing  # with coverage

# Frontend
cd frontend && npm test

Future Improvements

| Area | Current | Direction |
|------|---------|-----------|
| Debate iterations | 3 (hardcoded) | Per-round config; more for complex projects |
| Judge model | Cost-optimized (Haiku, mini, Flash) | User-selectable: cost vs. accuracy |
| Token budget & code review | Fixed per round type; code review splits files into ~12K-token chunks (cross-file relationships can be missed); QA restricted to Claude/Gemini when multiple agents are selected (GPT's 30K TPM limit is too low for a full codebase) | Tiered plans: higher tiers get larger budgets, skip chunking, and allow all agents in QA, sending all files in one pass for a complete, relationship-aware review |
| WebContainer | npm projects only | Python, Deno runtime support |
| Cross-round memory | No persistent context | Long-running project continuity |
| GitHub export | Push to new repo only | Push to existing repo, open PR |
| Orchestration | Custom round loop in orchestrator.py | Migrate to LangGraph: declarative graph nodes per round, conditional edges for retry/consensus, built-in human-in-the-loop and state persistence |
