An autonomous multi-agent system that discovers research problems, writes academic papers, and iteratively finds optimal trading algorithms — all driven by Claude. Two modes, one CLI. Run it for an hour or a week; it keeps learning.
Point it at a topic. It discovers open problems, decomposes them into sub-questions, dispatches a swarm of Claude agents to research each one, runs experiments, generates figures, writes a LaTeX paper, runs a quality critic, compiles to PDF, and optionally publishes to ArXiv or Zenodo. Every paper also generates a blog post served via a Next.js web UI.
Point it at the crypto markets. It generates trading hypotheses, writes backtesting strategies, tests them against historical data, diagnoses failures, evolves better strategies, and repeats across N independent experiments. After each experiment it reflects on what worked, updates a persistent leaderboard, and extracts structured knowledge into an append-only knowledge graph — so every run makes the next run smarter.
**Research mode**

- Gather problems (Tavily + Claude)
- Evaluate viability
- Decompose into sub-problems
- Research swarm (parallel agents)
- Plan experiments
- Run code experiments + prototype
- Test & benchmark results
- Compare against prior work
- Generate figures (matplotlib)
- Write LaTeX paper
- Critic quality check
- Compile PDF
- Publish (ArXiv / Zenodo)
- Blog post → Next.js web UI

**Trading mode**

- Outer loop: N experiments
  - Discover hypothesis (KG + LB aware)
  - Develop strategy code
  - Inner loop: 5 iterations — Test (backtest) → Diagnose → Evolve
  - Reflect (Claude synthesizes learnings)
  - Update leaderboard + knowledge graph
- Kafka error bus
- Healer consumer (auto-fix broken code)
Every trading experiment produces two persistent artifacts that survive process restarts and inform future runs:
**Leaderboard** (`outputs/trading/leaderboard.json`) — ranks every experiment by composite score (Sharpe + 0.5 × CAGR). Before each new hypothesis is generated, Claude sees the top performers, recent failures, and which symbols have been tried, so it actively diversifies instead of repeating dead ends.
**Knowledge graph** (`outputs/trading/knowledge_graph.json`) — an append-only store of Subject-Predicate-Object triples extracted by Claude after every experiment. Compatible with ai-knowledge-graph for visualization. Example triples:
```
RSI oversold strategy  → fails during      → sustained bear markets
BTC/USDT 1d            → shows             → high mean-reversion in 2021
momentum crossover     → performs well in  → trending markets with low volatility
volume spike signal    → requires          → minimum 14-day warmup window
```
These triples are injected into every discovery prompt so hypothesis quality improves continuously across sessions.
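A minimal sketch of how these two artifacts could be read and extended. The JSON layouts and field names (`sharpe`, `cagr`, `subject`/`predicate`/`object`) are illustrative assumptions, not the project's actual schema; only the scoring formula and the append-only rule come from the description above.

```python
import json
from pathlib import Path


def composite_score(entry: dict) -> float:
    # Ranking metric described above: Sharpe + 0.5 * CAGR
    return entry["sharpe"] + 0.5 * entry["cagr"]


def top_performers(leaderboard_path: str, n: int = 3) -> list[dict]:
    # Best experiments first; this slice is what a discovery prompt would see
    entries = json.loads(Path(leaderboard_path).read_text())
    return sorted(entries, key=composite_score, reverse=True)[:n]


def append_triple(kg_path: str, subject: str, predicate: str, obj: str) -> None:
    # Append-only: existing triples are never mutated or deleted
    p = Path(kg_path)
    triples = json.loads(p.read_text()) if p.exists() else []
    triples.append({"subject": subject, "predicate": predicate, "object": obj})
    p.write_text(json.dumps(triples, indent=2))
```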
Every experiment produces a markdown wiki (`outputs/wiki/wiki_*.md`) logging the hypothesis, each test iteration's metrics, the diagnosis, and what the evolved strategy changed. Readable by humans, referenced by the pipeline.
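A hypothetical sketch of such a wiki writer. The section headings, the `wiki_exp{N}.md` filename, and the per-iteration fields are assumptions based on the description above, not the project's actual output format.

```python
from pathlib import Path


def write_wiki(exp: int, hypothesis: str, iterations: list[dict],
               out_dir: str = "outputs/wiki") -> Path:
    """Write one markdown log: hypothesis, then per-iteration metrics + diagnosis."""
    lines = [f"# Experiment {exp}", "", "## Hypothesis", hypothesis, ""]
    for i, it in enumerate(iterations, 1):
        lines += [
            f"## Iteration {i}",
            f"- Sharpe: {it['sharpe']}",
            f"- Diagnosis: {it['diagnosis']}",
            "",
        ]
    path = Path(out_dir) / f"wiki_exp{exp}.md"   # filename pattern is an assumption
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines))
    return path
```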
When a trading strategy crashes at runtime:
- The error is published to a `trading.errors` Kafka topic.
- The healer consumer picks it up, sends the broken file + full traceback to Claude, validates the fix, and writes the corrected code to disk.
- A success notification is published to `trading.fixes`.
- The pipeline resumes from the last checkpoint automatically.

The healer only touches files under `outputs/` — it can never corrupt pipeline source code.
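The publish half of this flow could be sketched as follows with kafka-python (the library named in the dependency table). Only the topic name `trading.errors` comes from the text above; the payload fields and helper names are illustrative assumptions.

```python
import json


def build_error_event(strategy_path: str, traceback_text: str) -> bytes:
    # Serialize a crash report for the trading.errors topic.
    # Field names are assumptions, not the project's actual message schema.
    return json.dumps({"file": strategy_path, "traceback": traceback_text}).encode()


def publish_error(strategy_path: str, traceback_text: str,
                  servers: str = "localhost:9092") -> None:
    # Import inside the function so the sketch loads without a broker installed.
    from kafka import KafkaProducer  # kafka-python

    producer = KafkaProducer(bootstrap_servers=servers)
    producer.send("trading.errors", build_error_event(strategy_path, traceback_text))
    producer.flush()  # block until the broker has the event
```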
A FastAPI server at http://localhost:8585 streams phase start/end/error events via Server-Sent Events while any pipeline is running. The blog web UI connects to it for live status.
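On the client side, consuming that stream mostly means parsing `data:` lines. A minimal sketch, assuming JSON payloads with `phase`/`status` fields — the exact event schema and endpoint path are assumptions, not documented behavior:

```python
import json
from typing import Iterator


def parse_sse(lines: Iterator[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per `data:` line of an SSE stream."""
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            yield json.loads(line[len("data:"):].strip())


# Usage against a running monitor (endpoint path is hypothetical):
#   import urllib.request
#   with urllib.request.urlopen("http://localhost:8585/events") as resp:
#       for event in parse_sse(raw.decode() for raw in resp):
#           print(event.get("phase"), event.get("status"))
```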
Research papers and trading wiki notes are automatically saved to a local database and served as blog posts. Deploy the Next.js frontend and browse all outputs at http://localhost:3000.
```bash
# 1. Clone and create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure API keys, agent choice, and publish settings
python -m src.main --setup

# 4. Start Kafka (required for trading mode self-healing)
docker compose up -d kafka

# 5. Authenticate the Claude CLI (used as the agent subprocess)
claude login
```

Requirements: Python 3.11+, Docker (for Kafka), `claude` CLI on PATH.
`.env` file (created by `--setup` or manually):

```bash
ANTHROPIC_API_KEY=your_key
TAVILY_API_KEY=your_key
```

```bash
# Recommended: one command starts Kafka + healer + pipeline
./run.sh

# Custom experiment count (default: 10)
EXPERIMENTS=5 ./run.sh

# Or run directly
python -m src.main --mode trading --agent claude --experiments 10
```

The pipeline runs unattended through the specified number of experiments. Each experiment:
- Generates a new hypothesis (using prior knowledge graph + leaderboard to avoid repeats)
- Writes and backtests a strategy (up to 5 evolve iterations)
- Reflects and updates the knowledge graph + leaderboard
- Starts the next experiment fresh
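Schematically, the steps above nest an evolve loop inside the experiment loop. In this sketch the phase callables are injected placeholders for the real modules in `src/pipeline/trading/`, and the `target_score` stopping rule is an assumption:

```python
from typing import Callable


def run_experiments(
    discover: Callable[[], str],        # hypothesis generation
    develop: Callable[[str], str],      # strategy code generation
    backtest: Callable[[str], float],   # test phase, returns a score
    evolve: Callable[[str, float], str],  # improve strategy given its score
    n: int = 10,
    max_evolve: int = 5,
    target_score: float = 1.0,
) -> list[tuple[str, float]]:
    """Run n independent experiments; each evolves a strategy up to max_evolve times."""
    results = []
    for _ in range(n):
        strategy = develop(discover())      # fresh hypothesis each experiment
        score = backtest(strategy)
        for _ in range(max_evolve - 1):     # inner test -> diagnose -> evolve loop
            if score >= target_score:
                break
            strategy = evolve(strategy, score)
            score = backtest(strategy)
        results.append((strategy, score))   # would feed leaderboard + KG updates
    return results
```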
```bash
python -m src.main --mode research --topic "LLM efficiency and compression" --agent claude

# Limit scope for faster runs
python -m src.main --mode research --topic "attention mechanisms" --max-problems 2 --max-accepted 1
```

```bash
python -m src.main --deploy-blog
# Blog at http://localhost:3000
# API + monitor at http://localhost:8585
```

`python -m src.main [OPTIONS]`
```
Core
  --mode {research,trading}           Pipeline to run (default: research)
  --agent {qwen,claude}               Agent CLI to use (default: from config)
  --setup                             Interactive setup wizard

Research
  --topic TEXT                        Seed topic for problem discovery
  --max-problems N                    Cap problems to evaluate (default: all)
  --max-accepted N                    Cap accepted problems to process (default: all)
  --max-research N                    Cap parallel research sub-tasks (default: all)
  --publish {none,arxiv-pkg,zenodo}   Post-PDF publish action (default: none)

Trading
  --experiments N                     Independent experiments to run (default: 10)
  --auto-resume                       Resume active session without prompting

Blog
  --deploy-blog                       Start blog server only, skip pipeline
```
| Path | Description |
|---|---|
| `outputs/trading/strategy_exp{N}.py` | Best strategy from each experiment |
| `outputs/trading/leaderboard.json` | All experiments ranked by score |
| `outputs/trading/knowledge_graph.json` | Accumulated SPO triples across all runs |
| `outputs/wiki/wiki_*.md` | Per-experiment strategy development log |
| `outputs/{problem_id}/paper_draft.tex` | Generated LaTeX paper |
| `outputs/{problem_id}/*.pdf` | Compiled PDF |
| `outputs/metrics_report.json` | Token usage + timing per phase |
| `localhost:3000` | Blog web UI |
| `localhost:8585` | Real-time monitor / SSE stream |
```
src/
  main.py               Orchestrator + CLI entry point
  leaderboard.py        Trading experiment leaderboard
  knowledge_graph.py    Append-only SPO knowledge graph
  healer_consumer.py    Kafka-driven auto-heal service
  monitor.py            Real-time pipeline monitor (FastAPI + SSE)
  metrics.py            Token + timing metrics collector
  pipeline/
    trading/
      discover.py       Hypothesis generation (KG + leaderboard aware)
      develop.py        Strategy code generation
      test.py           Backtesting (investing_algorithm_framework)
      diagnose.py       Failure root-cause analysis
      evolve.py         LLM-driven strategy improvement
      reflect.py        Post-experiment synthesis
      wiki.py           Markdown log writer
      session.py        Checkpoint / resume state machine
      error_bus.py      Kafka publish + wait-for-fix
    gather.py           Problem discovery (Tavily + Claude)
    evaluate.py         Viability screening
    decompose.py        Sub-problem decomposition
    research.py         Parallel research agent swarm
    plan.py             Experiment planner
    code.py             Experimenter + prototyper
    test.py             Results evaluator
    compare.py          Baseline comparator
    write.py            LaTeX paper writer
    critic.py           Quality reviewer
    pdf.py              pdflatex compiler
blog_web/               Next.js blog frontend
outputs/                All generated artifacts
run.sh                  One-command startup (Kafka + healer + pipeline)
docker-compose.yml      Kafka service definition
```
| Component | Library |
|---|---|
| Agent runtime | claude-agent-sdk, anthropic |
| Web search | tavily-python |
| Academic search | semanticscholar, arxiv |
| Backtesting | investing-algorithm-framework |
| Error bus | Apache Kafka via kafka-python |
| API / monitor | FastAPI + uvicorn |
| Blog frontend | Next.js |
| Figures | matplotlib |
| Data validation | pydantic |
| Knowledge graph format | ai-knowledge-graph compatible |