A harness-based framework for WorldQuant-style alpha research agents.
Agent generates candidates -> harness records, gates, evaluates, remembers, and evolves -> human explicitly selects what can reach real submission.
Chinese README · Quick Start · Visual Guide · Public Demo · Alpha-GPT Harness · Alpha Search Memory · Harness Contract · Agent Roles · Architecture · API · MCP · WQ Workflow · Safety
worldquant-harness is not a one-shot alpha generator. It is an explicit-submit, memory-driven Alpha-GPT-style research harness for WorldQuant-oriented alpha workflows.
The agent can propose hypotheses, candidate specs, batches, and reviews. The harness owns the lifecycle: candidate identity, sandbox execution, no-submit gates, review queues, rejection reasons, historical memory, profile evolution, and the explicit boundary before any real WQ BRAIN action.
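Summary: this project is not a one-shot alpha generator. It is a WorldQuant-style research harness with an explicit submit boundary, replayable memory, and reviewable artifacts. The agent can propose hypotheses and candidates, but real submission must be explicitly selected by a human.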
This project is not affiliated with or endorsed by WorldQuant or WorldQuant BRAIN. Review Disclaimer, Security, and Responsible Use before connecting credentials or publishing artifacts.
Most AI quant workflows stop at idea -> expression -> backtest. That leaves the hard parts outside the system: traceability, rejection memory, duplicate control, platform boundary control, and reproducible review.
worldquant-harness treats factor mining as a controlled research loop:
| Problem | Harness response |
|---|---|
| Candidate batches become hard to audit | Every candidate gets a stable ID and lifecycle artifacts |
| Failed ideas are repeated | Failures become structured memory and next-round constraints |
| Submission boundaries become ambiguous | Public demo, sandbox, presubmit, check-only, and real submit are separated |
| Agent context is fragile | Notes, events, review queues, and profile patches are persisted |
| Public releases can leak private work | Demo artifacts and visual packs are synthetic or sanitized |

| Layer | Responsibility |
|---|---|
| Agent interface | Turns a research brief into candidate batches through MCP tools, CLI scripts, or REST calls |
| Harness control plane | Assigns stable candidate identity, runs sandbox evaluation, applies presubmit gates, and builds a review queue |
| Memory and evolution | Converts lifecycle events, rejection reasons, reference context, and harness scores into next-run constraints |
| Submit boundary | Keeps public demo and sandbox paths no-submit by default; real WQ BRAIN submission requires credentials and an explicit command |
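
One way to read "stable candidate identity" is as a content-derived ID: the same expression and settings always map to the same ID, so reruns and duplicates collapse onto one record. The sketch below is an assumption about how such an ID could be built, not the harness's actual scheme; `stable_candidate_id`, the `cand_` prefix, and the normalization rule are hypothetical.

```python
import hashlib
import json


def stable_candidate_id(expression: str, settings: dict) -> str:
    """Hypothetical helper: derive a deterministic ID from expression + settings."""
    # Drop all whitespace so cosmetic formatting differences do not change the ID.
    # (Assumes whitespace is never significant in the expression language.)
    normalized = "".join(expression.split())
    # sort_keys gives a canonical JSON form regardless of dict insertion order.
    payload = json.dumps({"expr": normalized, "settings": settings}, sort_keys=True)
    return "cand_" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


a = stable_candidate_id("rank(close / open)", {"decay": 4, "neutralization": "MARKET"})
b = stable_candidate_id("rank( close/open )", {"neutralization": "MARKET", "decay": 4})
c = stable_candidate_id("rank(close / open)", {"decay": 5, "neutralization": "MARKET"})
print(a == b)  # same content, different formatting and key order
print(a == c)  # different decay setting
```

Hashing content rather than assigning sequential numbers means duplicate control needs no shared counter: two runs that propose the same candidate independently arrive at the same ID.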
The 2026-07 update adds a semantic Alpha-GPT layer above the existing harness: hypothesis records, constrained candidate specs, review decisions, reflection memory, and explicit submit evidence are now first-class artifacts. Community triage and local WQ run history are converted into reusable skill memory instead of staying in chat context.
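Architecture note: the new design splits the system into three layers. The underlying harness contract manages the lifecycle and the no-submit boundary; the Alpha-GPT semantic layer manages hypotheses, candidate specs, review, and reflection; and the memory layer converts community experience and local run trajectories into reusable skills, a repair queue, and submit/check queues.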
The default public path does not submit anything. Real WQ BRAIN actions require explicit credentials and explicit submission commands.
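
The boundary described above can be pictured as a guard that refuses to proceed unless every explicit condition holds. This is a minimal sketch under stated assumptions: `require_explicit_submit`, `SubmitBlocked`, and the `WQ_USERNAME` / `WQ_PASSWORD` variable names are hypothetical and are not taken from the project's code.

```python
import os


class SubmitBlocked(Exception):
    """Raised when a real-submit request does not meet every explicit condition."""


def require_explicit_submit(selected_ids, no_submit=True, env=None):
    """Hypothetical guard: return the IDs cleared for submission or raise SubmitBlocked."""
    env = os.environ if env is None else env
    if no_submit:
        raise SubmitBlocked("no-submit is the default; pass no_submit=False explicitly")
    if not selected_ids:
        raise SubmitBlocked("no human-selected alpha IDs were provided")
    # Placeholder credential names, chosen for this sketch only.
    missing = [key for key in ("WQ_USERNAME", "WQ_PASSWORD") if not env.get(key)]
    if missing:
        raise SubmitBlocked(f"missing credentials: {', '.join(missing)}")
    return list(selected_ids)


try:
    require_explicit_submit(["example_alpha_id"], env={})
except SubmitBlocked as exc:
    print(f"blocked: {exc}")
```

Making `no_submit=True` the default means a caller who forgets an argument lands on the safe side of the boundary.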
The public demo is the reproducible contract. It uses synthetic fixtures and guarded adapters, so it does not require WQ BRAIN, DeepSeek, Wind, or private market data.
```bash
git clone https://github.com/gyx09212214-prog/worldquant-harness.git
cd worldquant-harness
pip install -e ".[dev]"
python scripts/run_public_harness_demo.py --output-root reports/public_harness_demo
python scripts/validate_public_harness_artifacts.py reports/public_harness_demo
python scripts/run_public_harness_eval.py --output-root reports/public_harness_eval
```

The demo writes a complete no-submit research bundle:
| Artifact | Purpose |
|---|---|
| `candidate_specs.jsonl` | Candidate source, tags, and design intent |
| `hypotheses.jsonl` | Alpha-GPT-style research hypothesis |
| `alpha_gpt_candidate_specs.jsonl` | Candidate specs linked to placeholder templates, bindings, constraints, and hypothesis |
| `simulation_results.jsonl` | Guarded adapter outcomes |
| `review_queue.jsonl` | Candidates queued for gate review |
| `review_decisions.jsonl` | Promote/retry/reject decisions for the Alpha-GPT loop |
| `presubmit_ready_sequential.jsonl` | Accepted candidates |
| `presubmit_rejected.jsonl` | Rejection reasons and blocker memory |
| `alpha_lifecycle_events.jsonl` | Append-only lifecycle trace |
| `submit_evidence.json` | Explicit-submit boundary evidence; public eval records no real submit attempt |
| `eval_summary.json` | Harness score and gate decision |
| `evolution_result.json` | Next-generation profile candidate |
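
An append-only lifecycle trace such as `alpha_lifecycle_events.jsonl` can be consumed by replaying events in order. The sketch below illustrates the idea with a temporary file; the record fields (`candidate_id`, `stage`, `reason`) and the stage names are assumptions made for the example, not the demo's actual schema.

```python
import json
import tempfile
from pathlib import Path


def append_event(path: Path, candidate_id: str, stage: str, **detail) -> None:
    """Append one lifecycle event as a JSON line; existing lines are never rewritten."""
    record = {"candidate_id": candidate_id, "stage": stage, **detail}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def read_events(path: Path) -> list:
    """Load every event in write order."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def latest_stage(events: list) -> dict:
    """Later events overwrite earlier ones, leaving each candidate's current stage."""
    return {event["candidate_id"]: event["stage"] for event in events}


log_path = Path(tempfile.mkdtemp()) / "alpha_lifecycle_events.jsonl"
append_event(log_path, "cand_001", "generated")
append_event(log_path, "cand_002", "generated")
append_event(log_path, "cand_001", "simulated")
append_event(log_path, "cand_002", "rejected", reason="high_self_correlation")
append_event(log_path, "cand_001", "presubmit_ready")

state = latest_stage(read_events(log_path))
print(state)
```

Because nothing is edited in place, the full history of a candidate stays available for audit even after its current state changes.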
The smallest no-submit Alpha-GPT loop does not need WQ BRAIN credentials:
```bash
python scripts/wq_alpha_gpt_workflow.py demo --topic "analyst revision momentum"
```

It writes hypothesis, placeholder template, candidate spec, local validation, review queue, reflection memory, profile patch, and submit-evidence artifacts under `reports/examples/alpha_gpt_demo/`.
This branch adds the first full Alpha-GPT-style memory workflow. The important change is architectural: research state now moves through explicit semantic records instead of remaining as loose prompt text.
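The focus of this update is structuring the research process: from research hypothesis, candidate generation, review decision, failure memory, and profile patch through to explicit submit evidence, every step produces an auditable artifact.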
| Area | What changed |
|---|---|
| Alpha-GPT harness | Adds hypotheses.jsonl, alpha_gpt_candidate_specs.jsonl, review_decisions.jsonl, reflection_records.jsonl, and submit_evidence.json to the public contract path |
| Community skill memory | Converts WQ Community triage and forum memory into reusable gates plus refined failure-action repair routes |
| WQ alpha search memory | Merges local simulation/check/submit artifacts into a trajectory ledger, family scores, near-pass repair queue, and submit/check target queues |
| Explicit submit loop | Adds a local candidate-file driven simulation/submit script; it still requires credentials and an explicit run without `--no-submit` |
| Iteration audit | Writes iteration_audit.jsonl, iteration_audit_summary.json, and iteration_audit.md by default so each run explains tweaks, results, failure causes, and next actions |
| Code review cleanup | Consolidates JSON artifact helpers for the new memory workflow and tightens configuration-driven scoring behavior |
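
To make the alpha search memory row concrete, the sketch below turns a small trajectory ledger into a near-pass repair queue and per-family pass rates. The thresholds, the 10% margin, and the record fields are all illustrative assumptions rather than the project's configuration.

```python
from collections import defaultdict

# Illustrative thresholds and margin; assumptions for this sketch only.
MIN_SHARPE = 1.25
MIN_FITNESS = 1.0
NEAR_MARGIN = 0.10  # within 10% of the weakest threshold counts as "near"


def classify(record: dict) -> str:
    """Label a run as pass, near_pass, or fail based on its weakest metric."""
    worst = min(record["sharpe"] / MIN_SHARPE, record["fitness"] / MIN_FITNESS)
    if worst >= 1.0:
        return "pass"
    if worst >= 1.0 - NEAR_MARGIN:
        return "near_pass"
    return "fail"


def build_queues(ledger: list):
    """Return (near-pass repair queue, pass rate per family)."""
    repair_queue = [r["candidate_id"] for r in ledger if classify(r) == "near_pass"]
    counts = defaultdict(lambda: [0, 0])  # family -> [passes, total]
    for r in ledger:
        counts[r["family"]][0] += classify(r) == "pass"
        counts[r["family"]][1] += 1
    family_scores = {fam: passes / total for fam, (passes, total) in counts.items()}
    return repair_queue, family_scores


ledger = [
    {"candidate_id": "c1", "family": "revision", "sharpe": 1.40, "fitness": 1.10},
    {"candidate_id": "c2", "family": "revision", "sharpe": 1.20, "fitness": 1.05},
    {"candidate_id": "c3", "family": "reversal", "sharpe": 0.60, "fitness": 0.40},
]
repair_queue, family_scores = build_queues(ledger)
print(repair_queue, family_scores)
```

Separating near-passes from outright failures lets the next round spend effort on candidates a small change might rescue, while family scores show where exploration has been productive.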
Key reusable skills:
| Skill | Role |
|---|---|
| `community::near_pass_repair` | Backward-compatible near-pass route; now points into metric overlay and correlation family-shift repair buckets |
| `community::alpha_template_transform` | Backward-compatible template route; now points into the direct-template clone blocker |
| `community::operation_attribution` | Backward-compatible operator route; now points into turnover/density, unit probe, and concentration repair buckets |
| `community::submission_gate` | Backward-compatible submit gate; now points into stale-check, duplicate, and similarity-blocking buckets |
| `community_failure::*` | Refined failure-action skills distilled from forum/submission records: metric near-pass overlay repair, correlation family shift, template clone blocker, low-coverage/concentration repair, turnover/density repair, pending-check gating, duplicate blocking, and platform/unit probes |
| `near_sc_cutoff_settings_repair` | Freeze a strong parent expression and vary neutralization/decay/truncation near the SELF_CORRELATION cutoff |
| `top5_high_score_low_corr_submit` | Rank explicit submit/check work by WQ score, eligibility, and correlation risk |
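
The `near_sc_cutoff_settings_repair` idea, freezing the expression and varying only the settings, amounts to a small grid sweep. The following is a sketch under assumptions: the decay and truncation grids are illustrative values, `settings_variants` is a hypothetical helper, and the expression is a placeholder.

```python
from itertools import product

# Neutralization names appear elsewhere in this README; the numeric grids are illustrative.
NEUTRALIZATIONS = ["MARKET", "INDUSTRY", "SUBINDUSTRY"]
DECAYS = [2, 4, 8]
TRUNCATIONS = [0.05, 0.08]


def settings_variants(parent_expression: str, parent_settings: dict) -> list:
    """Hypothetical helper: keep the expression fixed and enumerate nearby settings."""
    variants = []
    for neut, decay, trunc in product(NEUTRALIZATIONS, DECAYS, TRUNCATIONS):
        settings = {"neutralization": neut, "decay": decay, "truncation": trunc}
        if settings == parent_settings:
            continue  # the parent itself has already been evaluated
        variants.append({"expression": parent_expression, "settings": settings})
    return variants


parent = {"neutralization": "MARKET", "decay": 4, "truncation": 0.08}
variants = settings_variants("<parent expression>", parent)
print(len(variants))
```

Holding the expression constant keeps the repair attributable: any change in the correlation check can only come from the settings that were varied.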
Update commits on this branch:
| Commit | Purpose |
|---|---|
| `485a58d` feat: add alpha-gpt harness memory workflow | Adds the Alpha-GPT semantic artifacts, community skill memory, alpha search memory, docs, scripts, and tests |
| chore: document and tidy alpha-gpt harness workflow | Documents the new design in this README and performs focused review cleanup before pushing to GitHub |
For details, see Alpha-GPT Harness, Alpha Search Memory, and WQ Workflow.
The visual pack is generated from public-safe artifacts. It is meant to explain the harness rather than disclose private research.
| View | What it shows |
|---|---|
| Overview | Human goal -> agent -> harness -> memory -> review |
| Architecture | Agent interface, harness control plane, memory feedback, and submit boundary |
| Artifact lifecycle | Candidate specs, simulations, review queues, and memory |
| Public demo trace | Candidate movement through ready and rejected states |
| Memory feedback | How blockers become future constraints |
| Quality dashboard | Quality review for submitted and generated alphas |
| Submit boundary | No-submit, check-only, and real submit separation |
| Strategy display | Selected active validation metrics without alpha expressions |
| Release boundary | Public, private, and review-required artifacts |
The public demo proves the harness contract. The examples below are selected active validation records for strategy display. They are included to show that harness-controlled research can reach submit-quality candidates. Exact alpha expressions and platform code panels are intentionally omitted.
| Alpha ID | Status | WQ Sharpe | WQ Fitness | WQ Returns | Turnover | Drawdown | Neutralization |
|---|---|---|---|---|---|---|---|
| `3qz93wP6` | ACTIVE | 2.38 | 1.60 | 18.94% | 41.79% | 6.49% | MARKET |
| `3q7Rew3e` | ACTIVE | 2.08 | 1.68 | 10.32% | 15.87% | 4.47% | SUBINDUSTRY |
| `YPN9QR0M` | ACTIVE | 2.06 | 1.72 | 8.70% | 10.93% | 3.36% | SUBINDUSTRY |
| `akd1QGp1` | ACTIVE | 1.81 | 1.78 | 12.08% | 11.87% | 5.93% | INDUSTRY |
Historical validation metrics. Alpha expressions are not published.
Past factor performance does not guarantee future returns. These validation records are not required to run the open-source demo and do not constitute investment advice.
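
As a sanity check on the table, the Fitness column can be recomputed from the other columns, assuming the fitness definition commonly cited from the WQ BRAIN documentation: Sharpe × sqrt(|Returns| / max(Turnover, 0.125)). The formula is an assumption about how the platform derives the metric, and the inputs are the rounded values shown above.

```python
import math


def wq_fitness(sharpe: float, returns: float, turnover: float) -> float:
    """Assumed formula: Sharpe * sqrt(|Returns| / max(Turnover, 0.125)); inputs as fractions."""
    return sharpe * math.sqrt(abs(returns) / max(turnover, 0.125))


# (Sharpe, Returns, Turnover, reported Fitness), copied from the table above.
rows = {
    "3qz93wP6": (2.38, 0.1894, 0.4179, 1.60),
    "3q7Rew3e": (2.08, 0.1032, 0.1587, 1.68),
    "YPN9QR0M": (2.06, 0.0870, 0.1093, 1.72),
    "akd1QGp1": (1.81, 0.1208, 0.1187, 1.78),
}

for alpha_id, (sharpe, returns, turnover, reported) in rows.items():
    computed = wq_fitness(sharpe, returns, turnover)
    print(f"{alpha_id}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this assumed formula the recomputed values agree with the reported column to two decimals. The last two rows have turnover below 12.5%, so the 0.125 floor applies.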
The agent can explore, but it works inside a contract. The harness owns state and creates reviewable artifacts.
| Agent action | Harness control |
|---|---|
| Generate candidates | Stable candidate IDs, source tags, field and operator extraction |
| Run experiments | Sandbox artifacts and no-submit defaults |
| Interpret results | Structured review queue and rejection reasons |
| Learn from failures | Memory records, blocked signatures, field-family stats |
| Plan the next batch | Profile evolution and explicit child experiment |
| Submit | Human-selected alpha IDs only |
The executable contract is implemented through HarnessRun, HarnessStep, HarnessEvent, ArtifactRef, DecisionGate, MemoryDelta, ProfilePatch, and the Alpha-GPT semantic records for hypothesis, candidate spec, review decision, reflection, and submit evidence. See Agent Harness Contract, Alpha-GPT Harness, and Agent Roles.
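
The "learn from failures" row can be illustrated with a small rejection memory that blocks repeats of a failed signature. This is a sketch only: `RejectionMemory` and `signature` are hypothetical, and a signature here is simply the set of identifiers in the expression, which is far coarser than a real implementation would likely use.

```python
import re


def signature(expression: str) -> tuple:
    """Hypothetical signature: the sorted set of operator and field names used."""
    return tuple(sorted(set(re.findall(r"[A-Za-z_][A-Za-z_0-9]*", expression))))


class RejectionMemory:
    """Remembers why a signature was rejected and filters later batches against it."""

    def __init__(self):
        self.blocked = {}  # signature -> rejection reason

    def record_rejection(self, expression: str, reason: str) -> None:
        self.blocked[signature(expression)] = reason

    def filter_batch(self, expressions: list) -> tuple:
        kept, skipped = [], []
        for expr in expressions:
            reason = self.blocked.get(signature(expr))
            if reason is None:
                kept.append(expr)
            else:
                skipped.append((expr, reason))
        return kept, skipped


memory = RejectionMemory()
memory.record_rejection("rank(ts_delta(close, 5))", "low_sharpe")
kept, skipped = memory.filter_batch([
    "rank(ts_delta(close, 10))",  # same operators and field, different window
    "rank(ts_mean(volume, 5))",
])
print(kept, skipped)
```

Returning the skipped candidates together with the stored reason keeps the filter auditable, since a reviewer can see which earlier failure caused each block.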
| Area | Capability |
|---|---|
| Harness orchestration | Public no-submit eval, sandbox experiments, presubmit gates, lifecycle traces |
| Memory | History ingest, blocker signatures, community skills, trajectory ledgers, factor-family stats, profile evolution |
| Agent access | MCP tools, CLI scripts, REST API, monitoring UI |
| Review | Quality review dashboards, Alpha-GPT review decisions, submit efficiency reports, ready/rejected queues |
| WQ boundary | Check-only inspection and explicit credentialed submission commands |
| Local research | Local parser, backtest, anti-overfit checks, walk-forward validation |
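
Walk-forward validation, listed under local research above, evaluates a factor on a sequence of later windows that never overlap the data used before them. The generator below is a generic sketch of the splitting step, not the project's implementation; `walk_forward_splits` and its parameters are hypothetical.

```python
def walk_forward_splits(n_periods: int, train_size: int, test_size: int):
    """Yield (train_range, test_range) pairs; each test window follows its train window."""
    start = 0
    while start + train_size + test_size <= n_periods:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window


splits = list(walk_forward_splits(n_periods=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), list(test))
```

Advancing by `test_size` keeps the test windows disjoint, so each period is scored out of sample at most once.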
MCP tool surface
The MCP server exposes harness and research operations for agent workflows, including public harness runs, presubmit evaluation, history ingestion, memory maintenance, status inspection, local backtesting, factor scoring, diagnostics, anti-overfit checks, rolling validation, and explicit WQ BRAIN check/submit commands.
Start the HTTP server:
```bash
python -m worldquant_harness --transport http
```

Use MCP from Claude Code or Claude Desktop:

```json
{
  "mcpServers": {
    "worldquant-harness": {
      "command": "python",
      "args": ["-m", "worldquant_harness"]
    }
  }
}
```

For the full local setup, Windows notes, optional PostgreSQL, optional DeepSeek configuration, and API examples, use Quick Start.
```text
worldquant-harness/
├── worldquant_harness/   # Backend, parser, harness contracts, MCP, API
├── frontend/             # React monitoring dashboard
├── scripts/              # Public demo, visual pack, review, WQ workflows
├── tests/                # Parser, backtest, workflow, API, harness tests
├── example_factor/       # Sanitized historical validation screenshots
└── docs/                 # Architecture, API, MCP, workflow, safety docs
```
- Use your own credentials for external services and follow their terms, policies, rate limits, and data restrictions.
- Keep `.env`, `.secrets/`, local databases, raw platform exports, submit/check ledgers, and full research reports private.
- Do not publish alpha expressions, private platform exports, or unsanitized screenshots without review.
- Do not present generated factors, screenshots, or backtests as guaranteed returns.
- Review Open Source Audit and Release Checklist before publishing a fork or release.
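
Part of the checklist above can be automated with a path check before publishing. The sketch below is a hypothetical pre-publish helper, not a script shipped with the project: `.env` and `.secrets/` come from the list above, while the database patterns and the sample file names are illustrative assumptions.

```python
from fnmatch import fnmatch

# ".env" and ".secrets/" are named in the list above; the database patterns are assumptions.
PRIVATE_PATTERNS = [".env", ".secrets/*", "*.db", "*.sqlite"]


def private_paths(paths: list) -> list:
    """Return the paths that match any private pattern, by full path or by file name."""
    flagged = []
    for path in paths:
        normalized = path.replace("\\", "/")
        name = normalized.rsplit("/", 1)[-1]
        if any(fnmatch(normalized, p) or fnmatch(name, p) for p in PRIVATE_PATTERNS):
            flagged.append(path)
    return flagged


staged = ["README.md", ".env", ".secrets/wq_token.json", "data/local.db", "docs/safety.md"]
flagged = private_paths(staged)
print(flagged)
```

A check like this only catches known file locations; it does not replace reviewing artifacts for alpha expressions or unsanitized screenshots.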
MIT. Copyright and attribution details are recorded in NOTICE.
This public release is maintained as worldquant-harness. Derivative works should retain the copyright notice and comply with the MIT License terms.
See NOTICE, DISCLAIMER, SECURITY, and CODE_OF_CONDUCT for details.