
DogukanGun/thinker


Thinker

An autonomous multi-agent system that discovers research problems, writes academic papers, and iteratively finds optimal trading algorithms — all driven by Claude. Two modes, one CLI. Run it for an hour or a week; it keeps learning.


What it does

Research mode

Point it at a topic. It discovers open problems, decomposes them into sub-questions, dispatches a swarm of Claude agents to research each one, runs experiments, generates figures, writes a LaTeX paper, runs a quality critic, compiles to PDF, and optionally publishes to ArXiv or Zenodo. Every paper also generates a blog post served via a Next.js web UI.

Trading mode

Point it at the crypto markets. It generates trading hypotheses, writes backtesting strategies, tests them against historical data, diagnoses failures, evolves better strategies, and repeats across N independent experiments. After each experiment it reflects on what worked, updates a persistent leaderboard, and extracts structured knowledge into an append-only knowledge graph — so every run makes the next run smarter.


Architecture

Research mode                          Trading mode
────────────────────────────────       ────────────────────────────────────────
Gather problems (Tavily + Claude)      Outer loop: N experiments
Evaluate viability                       Discover hypothesis (KG + LB aware)
Decompose into sub-problems              Develop strategy code
Research swarm (parallel agents)         Inner loop: 5 iterations
Plan experiments                           Test (backtest) → Diagnose → Evolve
Run code experiments + prototype         Reflect (Claude synthesizes learnings)
Test & benchmark results                 Update leaderboard + knowledge graph
Compare against prior work
Generate figures (matplotlib)          Kafka error bus
Write LaTeX paper                        Healer consumer (auto-fix broken code)
Critic quality check
Compile PDF
Publish (ArXiv / Zenodo)
Blog post → Next.js web UI

Features

Persistent knowledge across runs

Every trading experiment produces two persistent artifacts that survive process restarts and inform future runs:

Leaderboard (outputs/trading/leaderboard.json) — ranks every experiment by composite score (Sharpe + 0.5 × CAGR). Before each new hypothesis is generated, Claude sees the top performers, recent failures, and which symbols have been tried, so it actively diversifies instead of repeating dead ends.
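The ranking described above can be sketched in a few lines. This is an illustrative reimplementation, not the pipeline's actual code; the field names (`sharpe`, `cagr`, `experiment`) are assumptions about the leaderboard JSON schema.

```python
# Hypothetical sketch of the leaderboard's composite ranking.
# Field names are illustrative assumptions, not the real schema.

def composite_score(entry: dict) -> float:
    """Composite score as described in the README: Sharpe + 0.5 * CAGR."""
    return entry["sharpe"] + 0.5 * entry["cagr"]

def rank_experiments(entries: list[dict]) -> list[dict]:
    """Return experiments sorted best-first by composite score."""
    return sorted(entries, key=composite_score, reverse=True)

entries = [
    {"experiment": 1, "symbol": "BTC/USDT", "sharpe": 1.2, "cagr": 0.30},
    {"experiment": 2, "symbol": "ETH/USDT", "sharpe": 0.8, "cagr": 0.90},
]
top = rank_experiments(entries)  # experiment 1 wins: 1.35 vs 1.25
```

A high-Sharpe, modest-CAGR strategy can therefore outrank a higher-growth but riskier one, which is the point of the composite metric.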

Knowledge graph (outputs/trading/knowledge_graph.json) — an append-only store of Subject-Predicate-Object triples extracted by Claude after every experiment. Compatible with ai-knowledge-graph for visualization. Example triples:

RSI oversold strategy   → fails during        → sustained bear markets
BTC/USDT 1d             → shows               → high mean-reversion in 2021
momentum crossover      → performs well in    → trending markets with low volatility
volume spike signal     → requires            → minimum 14-day warmup window

These triples are injected into every discovery prompt so hypothesis quality improves continuously across sessions.

Strategy wiki

Every experiment produces a markdown wiki (outputs/wiki/wiki_*.md) logging the hypothesis, each test iteration's metrics, the diagnosis, and what the evolved strategy changed. Readable by humans, referenced by the pipeline.

Self-healing via Kafka

When a trading strategy crashes at runtime:

  1. The error is published to a trading.errors Kafka topic.
  2. The healer consumer picks it up, sends the broken file + full traceback to Claude, validates the fix, and writes the corrected code to disk.
  3. A success notification is published to trading.fixes.
  4. The pipeline resumes from the last checkpoint automatically.

The healer only touches files under outputs/ — it can never corrupt pipeline source code.
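That safety property boils down to a path check before any write. A guard along these lines would enforce it; the function name and exact mechanism are assumptions about how the healer is implemented:

```python
# Illustrative sketch of the healer's write guard: a fix is applied only
# if the broken file resolves to somewhere under outputs/.
from pathlib import Path

def is_healable(file_path: str, outputs_root: str = "outputs") -> bool:
    """Return True only for files inside the outputs/ tree, so the healer
    can never overwrite pipeline source code. resolve() defeats ../ tricks."""
    try:
        Path(file_path).resolve().relative_to(Path(outputs_root).resolve())
        return True
    except ValueError:
        return False
```

Note that resolving before comparing also rejects paths like `outputs/../src/main.py` that merely start with the right prefix.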

Real-time monitor

A FastAPI server at http://localhost:8585 streams phase start/end/error events via Server-Sent Events while any pipeline is running. The blog web UI connects to it for live status.

Blog web UI

Research papers and trading wiki notes are automatically saved to a local database and served as blog posts. Deploy the Next.js frontend and browse all outputs at http://localhost:3000.


Setup

# 1. Clone and create virtual environment
git clone https://github.com/DogukanGun/thinker.git
cd thinker
python3 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure API keys, agent choice, and publish settings
python -m src.main --setup

# 4. Start Kafka (required for trading mode self-healing)
docker compose up -d kafka

# 5. Authenticate the Claude CLI (used as the agent subprocess)
claude login

Requirements: Python 3.11+, Docker (for Kafka), claude CLI on PATH.

.env file (created by --setup or manually):

ANTHROPIC_API_KEY=your_key
TAVILY_API_KEY=your_key

Running

Trading mode — iterative algorithm discovery

# Recommended: one command starts Kafka + healer + pipeline
./run.sh

# Custom experiment count (default: 10)
EXPERIMENTS=5 ./run.sh

# Or run directly
python -m src.main --mode trading --agent claude --experiments 10

The pipeline runs unattended through the specified number of experiments. Each experiment:

  1. Generates a new hypothesis (using prior knowledge graph + leaderboard to avoid repeats)
  2. Writes and backtests a strategy (up to 5 evolve iterations)
  3. Reflects and updates the knowledge graph + leaderboard
  4. Starts the next experiment fresh
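The four steps above can be sketched as a skeleton of one experiment. Every stage here is a stand-in stub, not the pipeline's real API; a trivial "version counter" plays the role of backtest/diagnose/evolve:

```python
# Hypothetical skeleton of a single experiment from the loop above.
# All stage logic is stubbed out; only the control flow mirrors the README.
def run_experiment(max_iterations: int = 5) -> dict:
    hypothesis = "momentum crossover on BTC/USDT"  # stub for discover()
    strategy = {"name": hypothesis, "version": 1}  # stub for develop()
    for iteration in range(1, max_iterations + 1):
        passed = strategy["version"] >= 3          # stub backtest: v3 passes
        if passed:
            break
        # stub diagnose + evolve: produce an improved strategy revision
        strategy = {**strategy, "version": strategy["version"] + 1}
    # reflect() would update the knowledge graph + leaderboard here
    return {"strategy": strategy, "iterations": iteration, "passed": passed}
```

The inner loop exits early on a passing backtest, so a strategy that works on iteration 3 never burns the remaining evolve budget.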

Research mode — topic to paper

python -m src.main --mode research --topic "LLM efficiency and compression" --agent claude

# Limit scope for faster runs
python -m src.main --mode research --topic "attention mechanisms" --max-problems 2 --max-accepted 1

Serve the blog

python -m src.main --deploy-blog
# Blog at http://localhost:3000
# API + monitor at http://localhost:8585

CLI reference

python -m src.main [OPTIONS]

Core
  --mode {research,trading}            Pipeline to run (default: research)
  --agent {qwen,claude}                Agent CLI to use (default: from config)
  --setup                              Interactive setup wizard

Research
  --topic TEXT                         Seed topic for problem discovery
  --max-problems N                     Cap problems to evaluate (default: all)
  --max-accepted N                     Cap accepted problems to process (default: all)
  --max-research N                     Cap parallel research sub-tasks (default: all)
  --publish {none,arxiv-pkg,zenodo}    Post-PDF publish action (default: none)

Trading
  --experiments N                      Independent experiments to run (default: 10)
  --auto-resume                        Resume active session without prompting

Blog
  --deploy-blog                        Start blog server only, skip pipeline

Outputs

Path                                    Description
outputs/trading/strategy_exp{N}.py      Best strategy from each experiment
outputs/trading/leaderboard.json        All experiments ranked by score
outputs/trading/knowledge_graph.json    Accumulated SPO triples across all runs
outputs/wiki/wiki_*.md                  Per-experiment strategy development log
outputs/{problem_id}/paper_draft.tex    Generated LaTeX paper
outputs/{problem_id}/*.pdf              Compiled PDF
outputs/metrics_report.json             Token usage + timing per phase
localhost:3000                          Blog web UI
localhost:8585                          Real-time monitor / SSE stream

Project layout

src/
  main.py                    Orchestrator + CLI entry point
  leaderboard.py             Trading experiment leaderboard
  knowledge_graph.py         Append-only SPO knowledge graph
  healer_consumer.py         Kafka-driven auto-heal service
  monitor.py                 Real-time pipeline monitor (FastAPI + SSE)
  metrics.py                 Token + timing metrics collector
  pipeline/
    trading/
      discover.py            Hypothesis generation (KG + leaderboard aware)
      develop.py             Strategy code generation
      test.py                Backtesting (investing_algorithm_framework)
      diagnose.py            Failure root-cause analysis
      evolve.py              LLM-driven strategy improvement
      reflect.py             Post-experiment synthesis
      wiki.py                Markdown log writer
      session.py             Checkpoint / resume state machine
      error_bus.py           Kafka publish + wait-for-fix
    gather.py                Problem discovery (Tavily + Claude)
    evaluate.py              Viability screening
    decompose.py             Sub-problem decomposition
    research.py              Parallel research agent swarm
    plan.py                  Experiment planner
    code.py                  Experimenter + prototyper
    test.py                  Results evaluator
    compare.py               Baseline comparator
    write.py                 LaTeX paper writer
    critic.py                Quality reviewer
    pdf.py                   pdflatex compiler
blog_web/                    Next.js blog frontend
outputs/                     All generated artifacts
run.sh                       One-command startup (Kafka + healer + pipeline)
docker-compose.yml           Kafka service definition

Tech stack

Component               Library
Agent runtime           claude-agent-sdk, anthropic
Web search              tavily-python
Academic search         semanticscholar, arxiv
Backtesting             investing-algorithm-framework
Error bus               Apache Kafka via kafka-python
API / monitor           FastAPI + uvicorn
Blog frontend           Next.js
Figures                 matplotlib
Data validation         pydantic
Knowledge graph format  ai-knowledge-graph compatible
