Alpha Autoresearch

Autonomous Alpha Factor Research for Chinese A-Shares
AI agents invent, iterate, and optimize quantitative factors — while you sleep.

💡 What is this?

Inspired by Karpathy's autoresearch — applied to quantitative finance.

An AI agent autonomously runs an experiment loop overnight:

1. Modifies factors.py, inventing new Alpha101-style factors
2. Evaluates them against a unified dataset of 495 A-shares (2020–2025)
3. Checks 3 Pareto metrics: predictive power, stability, tradeability
4. Keeps only non-dominated factors, expanding the frontier

~60 experiments/hour. ~500 overnight. Zero human intervention.
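
The keep-or-discard step of this loop can be sketched as follows. This is a minimal illustration, not the logic in prepare.py: `Metrics`, `dominates`, and `update_frontier` are hypothetical names, the numbers in the usage lines are made up, and all three metrics are assumed to be oriented so that higher is better.

```python
from typing import Dict, NamedTuple


class Metrics(NamedTuple):
    """Three scores for one factor; all are treated as higher-is-better here."""
    rank_ic: float
    ic_ir: float
    turnover: float  # assumed to be a stability score: higher = cheaper to trade


def dominates(a: Metrics, b: Metrics) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_frontier(frontier: Dict[str, Metrics], name: str, cand: Metrics) -> bool:
    """Try to add a candidate; return True if it joined the frontier."""
    if any(dominates(kept, cand) or kept == cand for kept in frontier.values()):
        return False  # candidate is dominated (or an exact duplicate): discard it
    # Drop every kept factor that the candidate now dominates.
    for old in [n for n, m in frontier.items() if dominates(cand, m)]:
        del frontier[old]
    frontier[name] = cand
    return True


# Illustrative values only (not results from the repository).
frontier: Dict[str, Metrics] = {}
update_frontier(frontier, "a", Metrics(0.05, 0.30, 0.90))
update_frontier(frontier, "b", Metrics(0.03, 0.45, 0.95))  # trades IC for stability: kept
update_frontier(frontier, "c", Metrics(0.02, 0.20, 0.80))  # dominated by "a": rejected
update_frontier(frontier, "d", Metrics(0.06, 0.35, 0.92))  # dominates "a": replaces it
print(sorted(frontier))  # ['b', 'd']
```

A factor that is worse on one metric but better on another stays on the frontier, which is why the frontier can grow as well as shrink.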

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  factors.py  │────▶│  prepare.py  │────▶│  3 metrics   │──▶ pareto_frontier.json
│ Agent edits  │     │  Read-only   │     │ RankIC/IR/TO │    (non-dominated)
└──────────────┘     └──────────────┘     └──────────────┘

🚀 Quick Start

git clone https://github.com/1998x-stack/alpha-autoresearch.git
cd alpha-autoresearch
uv sync                          # install deps
uv run python prepare.py         # evaluate factors (sample dataset included)

Works out of the box: the repository includes a 50-stock sample dataset (6.7 MB), so no external data is needed. To build the full 495-stock dataset, run: uv run python prepare.py --build-cache

📊 Three First-Principles Metrics

A factor is only useful if it predicts returns, does so consistently, and is cheap to trade.

Metric	Formula	Higher value means
RankIC	`mean(Spearman(factor, forward_return))`	Stronger predictive signal
IC IR	`mean(IC) / std(IC)`	More consistent predictions
Turnover	`1 − mean(|rank_t − rank_{t−1}|)`	Cheaper to trade

These form a Pareto frontier — you can't maximize all three simultaneously. The agent discovers the tradeoff surface.
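
To make the three metrics concrete, here is a rough pure-Python sketch. It is not the code in prepare.py. The function names and the list-of-lists layout (one inner list of per-stock values per day) are hypothetical, `stdev` is the sample standard deviation, and the turnover score is assumed to be one minus the mean absolute day-over-day change in percentile rank.

```python
from statistics import mean, stdev
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ic_series(factor_by_day, fwd_ret_by_day) -> List[float]:
    """One cross-sectional rank correlation per day."""
    return [spearman(f, r) for f, r in zip(factor_by_day, fwd_ret_by_day)]


def rank_ic(ics: Sequence[float]) -> float:
    return mean(ics)


def ic_ir(ics: Sequence[float]) -> float:
    return mean(ics) / stdev(ics)  # needs at least two days with differing ICs


def rank_stability(factor_by_day) -> float:
    """1 - mean |percentile rank today - percentile rank yesterday| over all stocks."""
    pct = [[r / len(day) for r in ranks(day)] for day in factor_by_day]
    diffs = [abs(c - p) for prev, cur in zip(pct, pct[1:]) for c, p in zip(cur, prev)]
    return 1 - mean(diffs)
```

Under these assumptions a factor whose cross-sectional ordering never changes scores a stability of 1.0, and one that perfectly orders next-day returns every day has a RankIC of 1.0.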

🏗️ Architecture

File	Role	Modified by
`prepare.py`	Evaluation harness — 12 operators, 3 metrics, Pareto logic	Read-only
`factors.py`	Factor definitions — 1–10 Factor subclasses per experiment	AI agent
`program.md`	Agent instructions — 6 iteration principles, loop protocol	Human

12 Built-in Operators

cs_rank cs_zscore ts_rank rolling_corr rolling_cov rolling_std rolling_sum rolling_min rolling_max delta delay decay_linear
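
For readers unfamiliar with Alpha101-style operators, the sketch below shows one plausible reading of six of them. The semantics are assumptions based on common Alpha101 conventions, not the definitions in prepare.py, and plain Python lists stand in for the DataFrames the harness works on: time-series operators take one stock's history, cross-sectional operators take all stocks' values on one date.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence

Series = List[Optional[float]]  # None marks "not enough history yet"


def delay(x: Sequence[float], d: int) -> Series:
    """Value from d steps earlier."""
    return [x[i - d] if i >= d else None for i in range(len(x))]


def delta(x: Sequence[float], d: int) -> Series:
    """Change over d steps: x[t] - x[t-d]."""
    return [x[i] - x[i - d] if i >= d else None for i in range(len(x))]


def decay_linear(x: Sequence[float], window: int) -> Series:
    """Trailing weighted mean with weights 1..window, newest value weighted most."""
    weights = range(1, window + 1)
    total = sum(weights)
    out: Series = []
    for i in range(len(x)):
        if i + 1 < window:
            out.append(None)
            continue
        chunk = x[i + 1 - window : i + 1]
        out.append(sum(w * v for w, v in zip(weights, chunk)) / total)
    return out


def ts_rank(x: Sequence[float], window: int) -> Series:
    """Share of the trailing window that is <= the newest value, in (0, 1]."""
    out: Series = []
    for i in range(len(x)):
        if i + 1 < window:
            out.append(None)
            continue
        chunk = x[i + 1 - window : i + 1]
        out.append(sum(v <= x[i] for v in chunk) / window)
    return out


def cs_rank(values: Sequence[float]) -> List[float]:
    """Percentile rank of each stock within one date's cross-section, in (0, 1]."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]


def cs_zscore(values: Sequence[float]) -> List[float]:
    """Standardize one date's cross-section to mean 0 and unit (population) std."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]
```

The `cs_` prefix marks operators applied across stocks on a single date, while `ts_` and the rolling operators run along each stock's own history.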

16 Data Columns

open high low close volume vwap returns adv5–adv180

🔬 Experiment Results

30+ iterations, 48 factors generated, 0 crashes.

Highlight	Factor	Value
🥇 Best predictor	`hl_range`	IC = 0.0581
🥇 Strongest new	`cs_zscore_vol`	IC = 0.0537, TO = 0.922
🥈 Most consistent	`ts_rank_vol`	IR = 0.49
🥉 Cheapest to trade	`open_vwap_dev`	TO = 0.994

📖 Full Experiment Report (Chinese)
📖 Latest Discovery Report: 10 new factors, 4 frontier domination wins

✍️ Writing a Factor

from prepare import Factor, ops

class Factor001(Factor):
    name = "momentum_5d"

    def compute(self, df):
        m = df.set_index(["datetime", "symbol"])
        val = ops.cs_rank(m["close"] - ops.delay(m["close"], 5))
        return Factor.as_cs_series(df, val)

That's it. The factor is auto-discovered the next time you run uv run python prepare.py.

🧪 Tests

uv run pytest tests/ -v     # 31 tests: 15 ops + 7 metrics + 9 Pareto

📚 Documentation

File	Language	Content
README_ZH.md	Chinese	Complete project documentation
REPORT.md	Chinese	Detailed analysis of 30 experiment rounds
CONTEXT.md	EN	Domain glossary
program.md	EN	Agent instruction file

⚡ Design Principles

Single edit surface: the agent only touches factors.py
Immutable evaluation: prepare.py never changes
Pareto optimization: multi-objective, not single-number
Factor-count budget: up to 10 factors per experiment, ~60 experiments/hour
Simplicity bias: a 3-line factor at IC = 0.05 beats a 30-line factor at IC = 0.051

📄 License

MIT
