Shavi2002/debate-engine

🧠 Debate Engine

A multi-agent epistemic reasoning system that researches both sides of any question using live web sources, structured belief graphs, and semantic contradiction detection.

Built with the thinkn.ai beliefs SDK + Exa + GPT-4o. thinkn.ai provides the core framework for simulating a two-sided debate between agents that place equal weight on advocating for and against the topic. Through the Exa API, real content (articles, research papers, blog posts) is pulled from the web and fed into a belief system that semantically manages arguments, identifies contradictions, and resolves knowledge gaps autonomously by pursuing the most promising query directions, continuously evolving and expanding as more data is collected. The system minimizes confusion, maximizes clarity, and summarizes grounded findings at the end.

What It Does

Traditional LLM research accumulates text. This system accumulates understanding.

Two opposing agents (pro and anti) independently research a question via live web search. Every piece of content they ingest is parsed into typed, confidence-weighted belief nodes in a shared namespace. The SDK automatically:

  • Detects semantic contradictions across sources that never reference each other
  • Suppresses the clarity score when genuine epistemic conflict exists
  • Tracks open gaps and ranks the highest-value next research actions
  • Fuses multi-agent outputs into a single coherent world state

A third judge agent reads the fused namespace and produces a structured, evidence-grounded verdict via GPT-4o.


Architecture

User Question
     │
     ▼
generateDebateConfig()          ← GPT-4o bootstraps sides, goal, 4 gaps, seed queries
     │
     ▼
┌──────────────────────────────────────────────┐
│               Shared Namespace               │
│                                              │
│  pro-agent  ──► beliefs.after(webContent)    │
│  anti-agent ──► beliefs.after(webContent)    │
│                          │                   │
│              SDK fuses, scores, detects      │
│              contradictions automatically    │
└──────────────────────────────────────────────┘
     │
     ▼
judge.read()                    ← full fused belief graph
     │
     ├──► debateDirector()      ← GPT-4o reads world.moves[] → writes Exa queries
     │         │
     │         ▼
     │    Exa web search → beliefs.after() → repeat N rounds
     │
     ▼
judge.before()                  ← structured briefing prompt injected into GPT-4o
     │
     ▼
GPT-4o Verdict                  ← grounded in belief graph, not raw web text

Project Structure

debate-ui/
├── app/
│   ├── page.tsx                  # Main UI - live SSE rendering
│   ├── api/
│   │   ├── debate/route.ts       # SSE stream - runs the debate loop
│   │   └── verdict/route.ts      # Judge verdict endpoint
├── src/
│   └── lib/
│       └── debate-runner.ts      # Core debate logic
└── .env.local                    # API keys

How It Works

1. Config Bootstrap

GPT-4o takes the user's question and generates the debate configuration:

  • Two named sides (pro / anti) with distinct research angles
  • A single overarching goal node
  • 4 investigable gap nodes (things the system doesn't know yet)
  • Seed search queries for round 1
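The generated configuration might look like the sketch below. The `DebateConfig` interface, its field names, and the sample values are assumptions for illustration, not the project's actual types.

```typescript
// Hypothetical shape of the bootstrap output (illustrative only).
interface DebateConfig {
  sides: { pro: string; anti: string }; // two named research angles
  goal: string;                         // single overarching goal node
  gaps: string[];                       // 4 investigable unknowns
  seedQueries: string[];                // round-1 Exa searches
}

const example: DebateConfig = {
  sides: {
    pro: "Remote work improves productivity",
    anti: "Remote work reduces productivity",
  },
  goal: "Determine whether remote work helps or hurts productivity",
  gaps: [
    "Effect on junior-employee onboarding",
    "Long-term retention differences",
    "Measured output vs. self-reported output",
    "Industry-by-industry variation",
  ],
  seedQueries: [
    "remote work productivity study 2024",
    "return to office productivity outcomes",
  ],
};
```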

2. Belief Agents β€” Shared Namespace

const proAgent  = new Beliefs({ apiKey, agent: 'pro',   namespace: ns })
const antiAgent = new Beliefs({ apiKey, agent: 'anti',  namespace: ns })
const judge     = new Beliefs({ apiKey, agent: 'judge', namespace: ns })

// All three agents write to and read from the same belief graph.
// The SDK fuses their outputs; no manual diffing needed.

3. The Round Loop

Each round:

  1. judge.read() - snapshot the full world state
  2. debateDirector() - GPT-4o reads world.moves[] (ranked by expected information gain) and writes Exa queries
  3. Both agents run Exa searches in parallel
  4. Each result page is fed into agent.after(webContent) - the SDK extracts beliefs, scores confidence, detects contradictions
  5. Resolved gaps are closed with beliefs.resolve(gap)
  6. State is streamed to the UI via SSE
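The six steps above can be sketched as one round function. The SDK and Exa calls are replaced here with hypothetical stand-ins (`read`, `director`, `search`, `ingest`, `isAnswered`, `resolve`, `emit`) so the control flow is the focus; the real wiring lives in debate-runner.ts.

```typescript
// Assumed shapes, illustrative only.
type Move = { query: string; value: number };
type World = { moves: Move[]; gaps: string[]; contradictions: string[][] };

interface RoundDeps {
  read: () => Promise<World>;                     // judge.read()
  director: (w: World) => Promise<string[]>;      // debateDirector(): world -> Exa queries
  search: (q: string) => Promise<string[]>;       // Exa search: query -> page texts
  ingest: (page: string) => Promise<void>;        // agent.after(webContent)
  isAnswered: (gap: string, w: World) => boolean; // hypothetical: did evidence answer the gap?
  resolve: (gap: string) => Promise<void>;        // beliefs.resolve(gap)
  emit: (w: World) => void;                       // SSE push to the UI
}

async function runRound(deps: RoundDeps): Promise<World> {
  const world = await deps.read();                                    // 1. snapshot world state
  const queries = await deps.director(world);                         // 2. GPT-4o picks queries
  const pages = (await Promise.all(queries.map(deps.search))).flat(); // 3. parallel searches
  for (const page of pages) await deps.ingest(page);                  // 4. extract + fuse beliefs
  const mid = await deps.read();
  for (const gap of mid.gaps)                                         // 5. close answered gaps
    if (deps.isAnswered(gap, mid)) await deps.resolve(gap);
  const after = await deps.read();
  deps.emit(after);                                                   // 6. stream state to UI
  return after;
}
```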

4. Early Exit

The runner exits when the SDK signals diminishing returns:

const shouldStop =
  world.moves.length === 0          ||  // no further high-value actions
  topMove.value < 0.1               ||  // expected information gain near zero
  (world.gaps.length === 0 &&
   world.contradictions.length >= CONTRADICTION_THRESHOLD)
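The same stop condition as a small pure function. The `World` shape is an assumption inferred from the fields the snippet reads, and the threshold value is illustrative:

```typescript
type Move = { value: number };
type World = { moves: Move[]; gaps: unknown[]; contradictions: unknown[] };

const CONTRADICTION_THRESHOLD = 5; // illustrative value

function shouldStop(world: World): boolean {
  const topMove = world.moves[0]; // moves are ranked by expected gain
  return (
    world.moves.length === 0 ||   // no further high-value actions
    topMove.value < 0.1 ||        // expected information gain near zero
    (world.gaps.length === 0 &&
      world.contradictions.length >= CONTRADICTION_THRESHOLD)
  );
}
```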

5. Judge Verdict

const context = await judge.before()   // structured belief-graph briefing
// context.prompt is injected into GPT-4o as system prompt
// The LLM summarises the belief graph, not the raw web
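One way the briefing could be wired into a chat-completion call. `buildVerdictMessages` is a hypothetical helper (the actual prompt wiring in debate-runner.ts may differ); the openai client call is shown only in comments.

```typescript
// Hypothetical helper: turn the SDK briefing into chat-completion messages.
type ChatMessage = { role: "system" | "user"; content: string };

function buildVerdictMessages(briefingPrompt: string, question: string): ChatMessage[] {
  return [
    { role: "system", content: briefingPrompt },                   // belief-graph briefing
    { role: "user", content: `Deliver a verdict on: ${question}` }, // original question
  ];
}

// Sketch of the call site (requires OPENAI_API_KEY):
// const context = await judge.before();
// const client = new OpenAI();
// const verdict = await client.chat.completions.create({
//   model: "gpt-4o",
//   messages: buildVerdictMessages(context.prompt, question),
// });
```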

SDK Methods

| Method | Purpose |
| --- | --- |
| `new Beliefs({ agent, namespace })` | Three agents sharing one namespace |
| `beliefs.add([...], { type: 'gap' })` | Seed 4 investigable unknowns |
| `beliefs.add({ type: 'goal' })` | Set the debate objective |
| `beliefs.after(webContent)` | Extract + fuse beliefs from Exa page text |
| `beliefs.read()` | Full world state: beliefs, gaps, contradictions, moves |
| `beliefs.before()` | Structured system prompt for GPT-4o verdict |
| `beliefs.resolve(gap)` | Explicitly close a gap answered by evidence |
| `beliefs.snapshot()` | Lightweight state read for UI polling |

The Clarity Score

Clarity is not a quality score; it is epistemic readiness, computed across four channels:

$$\text{clarity} = f(\underbrace{\text{decisionResolution}}_{\text{goals met}},\ \underbrace{\text{knowledgeCertainty}}_{\text{high-confidence beliefs}},\ \underbrace{\text{coherence}}_{\text{low contradictions}},\ \underbrace{\text{coverage}}_{\text{gaps closed}})$$

A clarity of 0.41 after 53 ingested sources is correct behavior: on a genuinely contested topic, the coherence channel stays suppressed. The system knows it doesn't know.
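A toy version of such a four-channel score is sketched below. The SDK's actual weighting is internal, so the equal-weight mean here is purely an assumption to build intuition for how a suppressed coherence channel drags clarity down.

```typescript
// Illustrative only: equal-weight mean over the four channels, each in [0, 1].
// The real SDK formula is not public; this just shows channel suppression.
function clarity(channels: {
  decisionResolution: number; // goals met
  knowledgeCertainty: number; // high-confidence beliefs
  coherence: number;          // low contradictions
  coverage: number;           // gaps closed
}): number {
  const { decisionResolution, knowledgeCertainty, coherence, coverage } = channels;
  return (decisionResolution + knowledgeCertainty + coherence + coverage) / 4;
}
```

With this toy weighting, zeroing the coherence channel on an otherwise well-researched topic pulls the score well below 1, mirroring the suppressed-clarity behavior described above.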


UI Walkthrough

Header - Scorecard

| Field | Description |
| --- | --- |
| Total beliefs | All claim nodes in the fused namespace |
| Established (> 0.70) | High-confidence, consistent beliefs |
| Contested (0.40–0.70) | Disputed or partially supported |
| Weak (< 0.40) | Low signal: single source or contradicted |
| Contradictions | Semantic conflicts detected across sources |
| Open gaps | Unknowns still unresolved |
| Gaps resolved | Gaps explicitly closed via beliefs.resolve() |
| Judge clarity | Overall epistemic readiness (0–1) |
| Sources ingested | Total Exa pages fed via after() |
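The three belief tiers can be derived by bucketing node confidences. The thresholds come from the scorecard above; the function shape itself is a hypothetical sketch, not the UI's actual code.

```typescript
// Bucket belief confidences into the scorecard tiers.
function tierCounts(confidences: number[]) {
  let established = 0, contested = 0, weak = 0;
  for (const c of confidences) {
    if (c > 0.7) established++;     // Established: > 0.70
    else if (c >= 0.4) contested++; // Contested: 0.40-0.70
    else weak++;                    // Weak: < 0.40
  }
  return { total: confidences.length, established, contested, weak };
}
```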

Round Cards

  • Director line (cyan italic) - GPT-4o's reasoning from world.moves[]
  • Next-round value - SDK's expected information gain for the next round
  • Source list - Exa pages fed into beliefs.after()
  • ⚡ Conflicts badge - contradictions detected in this round
  • WHAT CONFLICTED - the specific belief pairs that semantically negate each other
  • Clarity bars - per-agent clarity after round completion
  • Resolved chip (green) - gap closed this round via beliefs.resolve()

GPT-4o Verdict

Verdict sections map directly to belief graph confidence tiers; the structure comes from the SDK, not prompt engineering:

| Verdict Section | Belief Tier |
| --- | --- |
| Evidence clearly supports | Established (> 0.70) |
| Actively contested | Contested (0.40–0.70) |
| Genuinely unknown | Open gaps |

Setup

cd debate-ui
cp .env.local.example .env.local
npm install
npm run dev
# → http://localhost:3000

.env.local

BELIEFS_KEY=bel_live_...
EXA_API_KEY=...
OPENAI_API_KEY=...

Dependencies

| Package | Purpose |
| --- | --- |
| beliefs | Epistemic belief state SDK |
| exa-js | Neural web search for real-time evidence |
| openai | GPT-4o for director reasoning + verdict |
| next | Web UI + SSE streaming |

Get API Keys


Further Reading

About

Uses thinkn.ai to evolve a belief system orchestrated by multiple agents chasing overall coherence.
