🧠 ThinkGraph

Stop guessing. Start decomposing. Structured decomposition for LLM prompts — breaks complex questions into a dependency graph of atomic facts, resolves them sequentially, and synthesizes a grounded answer.

The Problem

When you ask an LLM a complex question, it tries to answer the whole thing at once — hallucinating details, missing constraints, and guessing facts it should verify first.

❌ "Compare React and Vue for enterprise SSR dashboard"
   → LLM guesses React is better without checking SSR maturity,
     enterprise adoption, team size trade-offs, or bundle size.

ThinkGraph intercepts the prompt and forces structured thinking before answering:

✅ LLM first resolves: "What is React's SSR maturity?" + "What is Vue's SSR maturity?"
   + "What are enterprise adoption rates?" + "Which fits a 10-person team?"
   Then synthesizes from verified facts only.

Result: 50%+ accuracy improvement on multi-hop prompts.

How It Works

Your Prompt
    │
    ▼
┌─────────┐     ┌───────────┐     ┌──────────┐     ┌────────────┐     ┌──────────────┐
│ TRIAGE  │────>│ DECOMPOSE │────>│  RESOLVE  │────>│ SELF-CONSIS│────>│  SYNTHESIZE  │
│         │     │           │     │           │     │   TENCY    │     │              │
│ Is this │     │ Emit DAG  │     │ Answer    │     │  VOTE (if  │     │ Build answer │
│ complex?│     │ of atomic │     │ each      │     │  enabled)  │     │ from verified│
│         │     │ sub-Qs    │     │ sub-Q in  │     │            │     │ facts only   │
│ Skip if │     │ with deps │     │ topo order│     │ Multiple   │     │              │
│ trivial │     │           │     │           │     │ attempts → │     │              │
│         │     │           │     │ Web search│     │ centroid   │     │              │
└─────────┘     └───────────┘     └──────────┘     └────────────┘     └──────────────┘

Features

Core Protocol (5-stage pipeline)

Stage	What it does
Triage	Classify prompt: trivial / single-hop / multi-hop / planning / creative
Decompose	Break into atomic sub-questions with explicit dependency DAG
Resolve	Answer each node in topological order, with caching
Synthesize	Build final answer from verified facts only
Present	Answer with uncertainty notes if any fact was low-confidence

Self-Consistency Voting

Run 2-3 synthesis attempts, vote on the most consistent one via Jaccard centroid. Catches hallucinations without extra LLM calls.

python thinkgraph.py vote "answer variant 1" "answer variant 2" "answer variant 3"
# {"winner": "...", "score": 0.72, "response_count": 3}

Web Grounding (DuckDuckGo — zero API key)

Auto-search for low-confidence facts. No API key needed — pure HTTP + HTML parsing.

python thinkgraph.py web-search "React 19 streaming SSR benchmark" --num-results 5

Prompt Compression (TF-IDF sentence extraction)

Compress long context before feeding synthesis. Keeps the most important sentences by TF-IDF weight.

python thinkgraph.py compress long_text.txt --ratio 0.4
# Compressed: 200 -> 80 words (kept 40%)

Dynamic DAG Pruning

After resolving parent nodes, automatically prune children whose questions are already answered by their parents.

python thinkgraph.py prune-dag graph.json --facts facts.json --prompt "your original question"

A/B Testing Mode

Score answers on keyword recall, precision, claim count, and uncertainty markers.

python thinkgraph.py ab-score "React has better SSR support" \
  --ground-truth "React and Vue both support SSR with React 18 offering streaming"
# keyword_recall: 50.00%  precision: 71.40%

Plugin Hooks

Register custom resolve functions (API calls, database lookups, shell commands).

python thinkgraph.py plugin-register my_api <<'EOF'
def my_api(question, ctx):
    return {"claim": api.lookup(question), "confidence": 0.95}
EOF
python thinkgraph.py plugin-list
# shell, weblookup, my_api

Export Formats

Export pipeline results as JSON, YAML, or Markdown report.

python thinkgraph.py export results.json --format markdown > report.md

MCP Server (7 tools)

Expose ThinkGraph as an MCP server. Compatible with Claude Desktop, Cursor, and any MCP client.

python mcp/thinkgraph_mcp.py

Configure in your MCP client:

{
  "mcpServers": {
    "thinkgraph": {
      "command": "python",
      "args": ["/path/to/mcp/thinkgraph_mcp.py"]
    }
  }
}

MCP Tool	Description
`thinkgraph_triage`	Classify prompt complexity
`thinkgraph_validate_dag`	Validate DAG, get execution batches
`thinkgraph_vote`	Self-consistency voting
`thinkgraph_web_search`	DuckDuckGo web search
`thinkgraph_cache_get`	Look up cached facts
`thinkgraph_cache_set`	Store resolved facts
`thinkgraph_tokens`	Estimate token count

Works With Everything

Agent	Setup	Auto-loaded?
OpenCode	`.opencode/skills/thinkgraph/SKILL.md`	✅ Yes
Claude Code	`~/.claude/skills/thinkgraph/SKILL.md`	✅ Yes
Cursor	`.cursor/rules/thinkgraph.mdc`	Via install script
Codex	`AGENTS.md` section	Via install script
Copilot	`.github/copilot-instructions.md`	Via install script
Gemini CLI	`GEMINI.md` section	Via install script

Quick Start

git clone https://github.com/Mayne-X/thinkgraph.git
cd thinkgraph

# Install adapters for all detected agents
python install.py

# Or dry-run first
python install.py --dry-run

# Restart your agent. ThinkGraph activates automatically on complex prompts.

CLI Reference

# Core pipeline
python thinkgraph.py triage "compare React and Vue"
python thinkgraph.py validate-dag graph.json
python thinkgraph.py cache-get "what is react ssr maturity"
python thinkgraph.py tokens "your text"

# New features
python thinkgraph.py vote "resp1" "resp2" "resp3"
python thinkgraph.py web-search "query" --num-results 5
python thinkgraph.py compress file.txt --ratio 0.4
python thinkgraph.py prune-dag graph.json --facts facts.json
python thinkgraph.py ab-score "answer" --ground-truth "reference"
python thinkgraph.py export results.json --format markdown
python thinkgraph.py plugin-list
python thinkgraph.py plugin-register myname "python code"

Token Budgets

Stage	Max
Triage	50
Decompose	200
Per sub-question	300
Synthesize	600
Hard ceiling	4× direct answer

Pipeline aborts to direct answer if ceiling is breached.

Performance Optimizations

Precompiled regex — all patterns compiled once at import, not per-call
LRU memoization — normalize_question, question_hash, estimate_tokens, and compute_term_freq are all cached
Global cache — facts persist at ~/.thinkgraph/cache.json across projects and sessions
Unicode-safe output — all CLI output uses sys.stdout.write to avoid cp1252 encoding errors
Efficient data structures — sets for membership, tuples for cached composite values

Tests

# All tests
python tests/test_golden.py      # 15/15 passing — triage, normalization, hashing
python tests/test_new_features.py # 14/14 passing — voting, web search, MCP server
python tests/benchmark.py        # 10 prompts — quality scoring (compression 70%, vote 64%)

# Quick smoke test
python cli/thinkgraph.py triage "compare React and Vue"
python cli/thinkgraph.py vote "React is fast" "React is very fast" "React is quick"

Project Structure

thinkgraph/
├── SKILL.md                    # Canonical protocol (the source of truth)
├── protocol/
│   ├── prompts.md              # Verbatim prompt templates for each stage
│   ├── dag.md                  # DAG schema, topo-sort pseudocode, cache format
│   └── questions.md            # Onboarding + per-invocation question templates
├── adapters/
│   ├── opencode/SKILL.md       # OpenCode skill
│   ├── claude/SKILL.md         # Claude Code (auto-loaded by OpenCode too)
│   ├── cursor/thinkgraph.mdc    # Cursor rules
│   ├── codex/AGENTS.md          # Codex section
│   ├── copilot/                # Copilot section
│   └── gemini/GEMINI.md        # Gemini CLI section
├── cli/
│   └── thinkgraph.py            # Helper CLI (Python 3.8+, stdlib only)
├── mcp/
│   ├── thinkgraph_mcp.py        # MCP server (JSON-RPC 2.0 over stdio)
│   └── README.md               # MCP setup guide
├── .github/workflows/
│   └── thinkgraph.yml          # GitHub Action — auto-analyze issues
├── tests/
│   ├── test_golden.py          # 15 tests
│   ├── test_new_features.py    # 14 tests
│   └── benchmark.py            # Quality benchmark suite
├── install.py                   # Multi-agent installer (auto-detect, idempotent)
├── README.md                   # This file
└── LICENSE

Roadmap

#	Feature	Status	Notes
1	Self-consistency voting	✅ Done	Jaccard centroid, `thinkgraph.py vote`
2	Web grounding	✅ Done	DuckDuckGo HTML, zero API key, `web-search`
3	MCP server	✅ Done	JSON-RPC 2.0 stdio, 7 tools, `mcp/thinkgraph_mcp.py`
4	Prompt compression	✅ Done	TF-IDF sentence extraction, `compress`
5	Benchmark suite	✅ Done	10 prompts, quality scoring, `tests/benchmark.py`
6	Cache sync	✅ Done	Global `~/.thinkgraph/cache.json`, per-project `.pcg/`
7	Streaming support	🔜 Planned	Incremental DAG + fact emission
8	Multi-model routing	🔜 Planned	Cheap sub-nodes, expensive synthesis
9	Dynamic DAG pruning	✅ Done	Auto-remove nodes whose parents answer them
10	Export formats	✅ Done	JSON, YAML, Markdown
11	Recursive depth (3+)	🔜 Planned	Max depth configurable, budget warnings
12	A/B testing mode	✅ Done	`ab-score`, keyword recall + precision
13	Plugin hooks	✅ Done	Custom resolve fns, `plugin-register`
14	CLI interactive mode	🔜 Planned	`thinkgraph interactive` REPL
15	GitHub Action	✅ Done	Auto-comment on issues, `thinkgraph.yml`

✅ = Implemented 🔜 = Planned 🚧 = In progress

Contributing

Fork → branch → commit → PR
Run tests: python tests/test_golden.py && python tests/test_new_features.py
Benchmark: python tests/benchmark.py

License

MIT — do whatever you want.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🧠 ThinkGraph

The Problem

How It Works

Features

Core Protocol (5-stage pipeline)

Self-Consistency Voting

Web Grounding (DuckDuckGo — zero API key)

Prompt Compression (TF-IDF sentence extraction)

Dynamic DAG Pruning

A/B Testing Mode

Plugin Hooks

Export Formats

MCP Server (7 tools)

Works With Everything

Quick Start

CLI Reference

Token Budgets

Performance Optimizations

Tests

Project Structure

Roadmap

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.github/workflows		.github/workflows
adapters		adapters
cli		cli
mcp		mcp
protocol		protocol
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md
install.py		install.py

Folders and files

Latest commit

History

Repository files navigation

🧠 ThinkGraph

The Problem

How It Works

Features

Core Protocol (5-stage pipeline)

Self-Consistency Voting

Web Grounding (DuckDuckGo — zero API key)

Prompt Compression (TF-IDF sentence extraction)

Dynamic DAG Pruning

A/B Testing Mode

Plugin Hooks

Export Formats

MCP Server (7 tools)

Works With Everything

Quick Start

CLI Reference

Token Budgets

Performance Optimizations

Tests

Project Structure

Roadmap

Contributing

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages