worldquant-harness

A harness-based framework for WorldQuant-style alpha research agents.

Agent generates candidates -> harness records, gates, evaluates, remembers, and evolves -> human explicitly selects what can reach real submission.

CI Python FastAPI React License

Chinese README · Quick Start · Visual Guide · Public Demo · Alpha-GPT Harness · Alpha Search Memory · Harness Contract · Agent Roles · Architecture · API · MCP · WQ Workflow · Safety

[Figure: worldquant-harness overview]

What It Is

worldquant-harness is not a one-shot alpha generator. It is an explicit-submit, memory-driven Alpha-GPT-style research harness for WorldQuant-oriented alpha workflows.

The agent can propose hypotheses, candidate specs, batches, and reviews. The harness owns the lifecycle: candidate identity, sandbox execution, no-submit gates, review queues, rejection reasons, historical memory, profile evolution, and the explicit boundary before any real WQ BRAIN action.

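Summary: this project is not a one-shot alpha generator. It is a WorldQuant-style research harness with an explicit submit boundary, replayable memory, and reviewable artifacts. The agent can propose hypotheses and candidates, but a human must explicitly select anything that reaches real submission.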

This project is not affiliated with or endorsed by WorldQuant or WorldQuant BRAIN. Review Disclaimer, Security, and Responsible Use before connecting credentials or publishing artifacts.

Why Harness

Most AI quant workflows stop at idea -> expression -> backtest. That leaves the hard parts outside the system: traceability, rejection memory, duplicate control, platform boundary control, and reproducible review.

worldquant-harness treats factor mining as a controlled research loop:

| Problem | Harness response |
| --- | --- |
| Candidate batches become hard to audit | Every candidate gets a stable ID and lifecycle artifacts |
| Failed ideas are repeated | Failures become structured memory and next-round constraints |
| Submission boundaries become ambiguous | Public demo, sandbox, presubmit, check-only, and real submit are separated |
| Agent context is fragile | Notes, events, review queues, and profile patches are persisted |
| Public releases can leak private work | Demo artifacts and visual packs are synthetic or sanitized |
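As an illustration of the first two rows, the sketch below derives a deterministic candidate ID from an expression and its settings, and keeps a small rejection memory keyed by that ID. The names `candidate_id` and `RejectionMemory`, the `cand_` prefix, and the hashing scheme are assumptions made for this example; they are not taken from the repository.

```python
import hashlib
import json
import re
from typing import Optional


def candidate_id(expression: str, settings: dict) -> str:
    """Hypothetical scheme: hash the whitespace-free expression plus sorted settings."""
    normalized = re.sub(r"\s+", "", expression)
    payload = json.dumps({"expr": normalized, "settings": settings}, sort_keys=True)
    return "cand_" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


class RejectionMemory:
    """Hypothetical store mapping rejected candidate IDs to rejection reasons."""

    def __init__(self) -> None:
        self._reasons = {}

    def reject(self, cid: str, reason: str) -> None:
        self._reasons[cid] = reason

    def blocked_reason(self, cid: str) -> Optional[str]:
        # Returns None when the candidate has not been rejected before.
        return self._reasons.get(cid)
```

Because the ID depends only on normalized content, a repeated idea maps to the same ID and can be caught before it is re-evaluated.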

Architecture

[Figure: worldquant-harness system architecture]

| Layer | Responsibility |
| --- | --- |
| Agent interface | Turns a research brief into candidate batches through MCP tools, CLI scripts, or REST calls |
| Harness control plane | Assigns stable candidate identity, runs sandbox evaluation, applies presubmit gates, and builds a review queue |
| Memory and evolution | Converts lifecycle events, rejection reasons, reference context, and harness scores into next-run constraints |
| Submit boundary | Keeps public demo and sandbox paths no-submit by default; real WQ BRAIN submission requires credentials and an explicit command |

The 2026-07 update adds a semantic Alpha-GPT layer above the existing harness: hypothesis records, constrained candidate specs, review decisions, reflection memory, and explicit submit evidence are now first-class artifacts. Community triage and local WQ run history are converted into reusable skill memory instead of staying in chat context.

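Architecture note: the new design splits the system into three layers. The underlying harness contract manages the lifecycle and the no-submit boundary. The Alpha-GPT semantic layer manages hypotheses, candidate specs, review, and reflection. The memory layer converts community experience and local run trajectories into reusable skills, a repair queue, and submit/check queues.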

The default public path does not submit anything. Real WQ BRAIN actions require explicit credentials and explicit submission commands.
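The boundary described above can be sketched as a small guard that defaults to no-submit and refuses anything else unless credentials, and for a real submit a human-selected list of alpha IDs, are supplied. `SubmitMode` and `resolve_submit_mode` are hypothetical names for this illustration and do not describe the project's actual API.

```python
from enum import Enum
from typing import Optional, Sequence


class SubmitMode(Enum):
    NO_SUBMIT = "no_submit"
    CHECK_ONLY = "check_only"
    REAL_SUBMIT = "real_submit"


def resolve_submit_mode(requested: str = "no_submit",
                        credentials: Optional[str] = None,
                        human_selected_ids: Sequence[str] = ()) -> SubmitMode:
    """Return the permitted mode, or raise if the request crosses the boundary."""
    mode = SubmitMode(requested)
    if mode is SubmitMode.NO_SUBMIT:
        return mode  # default path: nothing leaves the sandbox
    if not credentials:
        raise PermissionError(f"{mode.value} requires explicit credentials")
    if mode is SubmitMode.REAL_SUBMIT and not human_selected_ids:
        raise PermissionError("real submit requires human-selected alpha IDs")
    return mode
```

Making the safe mode the default argument means a caller has to opt in to every step that touches the external platform.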

Public Demo

The public demo is the reproducible contract. It uses synthetic fixtures and guarded adapters, so it does not require WQ BRAIN, DeepSeek, Wind, or private market data.

git clone https://github.com/gyx09212214-prog/worldquant-harness.git
cd worldquant-harness
pip install -e ".[dev]"
python scripts/run_public_harness_demo.py --output-root reports/public_harness_demo
python scripts/validate_public_harness_artifacts.py reports/public_harness_demo
python scripts/run_public_harness_eval.py --output-root reports/public_harness_eval

The demo writes a complete no-submit research bundle:

| Artifact | Purpose |
| --- | --- |
| candidate_specs.jsonl | Candidate source, tags, and design intent |
| hypotheses.jsonl | Alpha-GPT-style research hypothesis |
| alpha_gpt_candidate_specs.jsonl | Candidate specs linked to placeholder templates, bindings, constraints, and hypothesis |
| simulation_results.jsonl | Guarded adapter outcomes |
| review_queue.jsonl | Candidates queued for gate review |
| review_decisions.jsonl | Promote/retry/reject decisions for the Alpha-GPT loop |
| presubmit_ready_sequential.jsonl | Accepted candidates |
| presubmit_rejected.jsonl | Rejection reasons and blocker memory |
| alpha_lifecycle_events.jsonl | Append-only lifecycle trace |
| submit_evidence.json | Explicit-submit boundary evidence; public eval records no real submit attempt |
| eval_summary.json | Harness score and gate decision |
| evolution_result.json | Next-generation profile candidate |
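To get a quick view of a generated bundle, a small script can count the records in each JSONL artifact. This is a minimal sketch that assumes the `.jsonl` files sit directly under the output root and hold one JSON object per line; the actual directory layout and record schema are not specified here, and `summarize_bundle` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_bundle(root: str) -> dict:
    """Count the JSON records in each *.jsonl file directly under root."""
    counts = {}
    for path in sorted(Path(root).glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            # Skip blank lines; json.loads raises on a malformed record.
            records = [json.loads(line) for line in handle if line.strip()]
        counts[path.name] = len(records)
    return counts


if __name__ == "__main__":
    # Output root taken from the demo command above.
    for name, count in summarize_bundle("reports/public_harness_demo").items():
        print(f"{name}: {count} records")
```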

Alpha-GPT Dry Run

The smallest no-submit Alpha-GPT loop does not need WQ BRAIN credentials:

python scripts/wq_alpha_gpt_workflow.py demo --topic "analyst revision momentum"

It writes hypothesis, placeholder template, candidate spec, local validation, review queue, reflection memory, profile patch, and submit-evidence artifacts under reports/examples/alpha_gpt_demo/.
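The relationship between a placeholder template, its bindings, and the resulting candidate spec might look roughly like the sketch below. The `$placeholder` syntax, the field name `est_eps`, the record keys, and the `render_candidate` helper are all assumptions for illustration; the artifact schema the script actually writes may differ.

```python
from string import Template


def render_candidate(template: str, bindings: dict, hypothesis_id: str) -> dict:
    """Bind placeholders into an expression and keep the link back to the hypothesis."""
    # substitute() raises KeyError if any placeholder is left unbound.
    expression = Template(template).substitute(bindings)
    return {
        "hypothesis_id": hypothesis_id,
        "template": template,
        "bindings": bindings,
        "expression": expression,
    }


spec = render_candidate(
    "rank(ts_delta($field, $window))",       # illustrative template
    {"field": "est_eps", "window": 20},      # hypothetical field name and lookback
    hypothesis_id="hyp_analyst_revision_momentum",
)
```

Keeping the template and bindings alongside the rendered expression lets a reviewer trace each candidate back to the hypothesis that produced it.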

2026-07 Update

This branch adds the first full Alpha-GPT-style memory workflow. The important change is architectural: research state now moves through explicit semantic records instead of remaining as loose prompt text.

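The focus of this update is structuring the research process: research hypothesis, candidate generation, review decision, failure memory, profile patch, and explicit submit evidence each produce an auditable artifact.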

| Area | What changed |
| --- | --- |
| Alpha-GPT harness | Adds hypotheses.jsonl, alpha_gpt_candidate_specs.jsonl, review_decisions.jsonl, reflection_records.jsonl, and submit_evidence.json to the public contract path |
| Community skill memory | Converts WQ Community triage and forum memory into reusable gates plus refined failure-action repair routes |
| WQ alpha search memory | Merges local simulation/check/submit artifacts into a trajectory ledger, family scores, near-pass repair queue, and submit/check target queues |
| Explicit submit loop | Adds a local candidate-file driven simulation/submit script; it still requires credentials and an explicit run without --no-submit |
| Iteration audit | Writes iteration_audit.jsonl, iteration_audit_summary.json, and iteration_audit.md by default so each run explains tweaks, results, failure causes, and next actions |
| Code review cleanup | Consolidates JSON artifact helpers for the new memory workflow and tightens configuration-driven scoring behavior |

Key reusable skills:

| Skill | Role |
| --- | --- |
| community::near_pass_repair | Backward-compatible near-pass route; now points into metric overlay and correlation family-shift repair buckets |
| community::alpha_template_transform | Backward-compatible template route; now points into the direct-template clone blocker |
| community::operation_attribution | Backward-compatible operator route; now points into turnover/density, unit probe, and concentration repair buckets |
| community::submission_gate | Backward-compatible submit gate; now points into stale-check, duplicate, and similarity-blocking buckets |
| community_failure::* | Refined failure-action skills distilled from forum/submission records: metric near-pass overlay repair, correlation family shift, template clone blocker, low-coverage/concentration repair, turnover/density repair, pending-check gating, duplicate blocking, and platform/unit probes |
| near_sc_cutoff_settings_repair | Freeze a strong parent expression and vary neutralization/decay/truncation near SELF_CORRELATION cutoff |
| top5_high_score_low_corr_submit | Rank explicit submit/check work by WQ score, eligibility, and correlation risk |
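The last two skills lend themselves to a short sketch: one function enumerates settings variants around a frozen parent expression, and another ranks eligible candidates by score while filtering on correlation. The dictionary keys, the `max_corr=0.7` threshold, and both function names are assumptions for this example and are not read from the repository's skill definitions.

```python
from itertools import product


def settings_variants(parent_expression, neutralizations, decays, truncations):
    """Keep the expression fixed and enumerate every settings combination."""
    return [
        {"expression": parent_expression, "neutralization": n, "decay": d, "truncation": t}
        for n, d, t in product(neutralizations, decays, truncations)
    ]


def rank_submit_targets(candidates, max_corr=0.7, top_n=5):
    """Drop ineligible or too-correlated candidates, then sort by score (ties: lower correlation first)."""
    eligible = [c for c in candidates if c["eligible"] and c["self_corr"] < max_corr]
    return sorted(eligible, key=lambda c: (-c["score"], c["self_corr"]))[:top_n]


variants = settings_variants(
    "parent_expr",                 # placeholder for a frozen parent expression
    ["MARKET", "INDUSTRY"],
    [0, 4],
    [0.05, 0.08],
)
```

Both functions only produce work queues; nothing in this sketch simulates or submits anything.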

Update commits on this branch:

| Commit | Purpose |
| --- | --- |
| 485a58d feat: add alpha-gpt harness memory workflow | Adds the Alpha-GPT semantic artifacts, community skill memory, alpha search memory, docs, scripts, and tests |
| chore: document and tidy alpha-gpt harness workflow | Documents the new design in this README and performs focused review cleanup before pushing to GitHub |

For details, see Alpha-GPT Harness, Alpha Search Memory, and WQ Workflow.

Visual Pack

The visual pack is generated from public-safe artifacts. It is meant to explain the harness rather than disclose private research.

| View | What it shows |
| --- | --- |
| Overview | Human goal -> agent -> harness -> memory -> review |
| Architecture | Agent interface, harness control plane, memory feedback, and submit boundary |
| Artifact lifecycle | Candidate specs, simulations, review queues, and memory |
| Public demo trace | Candidate movement through ready and rejected states |
| Memory feedback | How blockers become future constraints |
| Quality dashboard | Submitted and generated quality review |
| Submit boundary | No-submit, check-only, and real submit separation |
| Strategy display | Selected active validation metrics without alpha expressions |
| Release boundary | Public, private, and review-required artifacts |

Strategy Display Validation

The public demo proves the harness contract. The examples below are selected active validation records for strategy display. They are included to show that harness-controlled research can reach submit-quality candidates. Exact alpha expressions and platform code panels are intentionally omitted.

[Figure: sanitized strategy display, alpha metrics without expressions]

| Alpha ID | Status | WQ Sharpe | WQ Fitness | WQ Returns | Turnover | Drawdown | Neutralization |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3qz93wP6 | ACTIVE | 2.38 | 1.60 | 18.94% | 41.79% | 6.49% | MARKET |
| 3q7Rew3e | ACTIVE | 2.08 | 1.68 | 10.32% | 15.87% | 4.47% | SUBINDUSTRY |
| YPN9QR0M | ACTIVE | 2.06 | 1.72 | 8.70% | 10.93% | 3.36% | SUBINDUSTRY |
| akd1QGp1 | ACTIVE | 1.81 | 1.78 | 12.08% | 11.87% | 5.93% | INDUSTRY |

Historical validation metrics. Alpha expressions are not published.
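The Fitness column appears consistent with the commonly cited BRAIN definition, Fitness = Sharpe * sqrt(|Returns| / max(Turnover, 0.125)). The check below recomputes it from the Sharpe, Returns, and Turnover values in the table. The formula is an assumption about how the platform derives the metric, not something stated in this repository.

```python
import math


def fitness(sharpe: float, returns: float, turnover: float) -> float:
    """Assumed definition: Sharpe scaled by sqrt(|returns| / max(turnover, 0.125))."""
    return sharpe * math.sqrt(abs(returns) / max(turnover, 0.125))


# (Sharpe, Returns, Turnover, reported Fitness) copied from the table above.
rows = {
    "3qz93wP6": (2.38, 0.1894, 0.4179, 1.60),
    "3q7Rew3e": (2.08, 0.1032, 0.1587, 1.68),
    "YPN9QR0M": (2.06, 0.0870, 0.1093, 1.72),
    "akd1QGp1": (1.81, 0.1208, 0.1187, 1.78),
}

for alpha_id, (sharpe, returns, turnover, reported) in rows.items():
    computed = fitness(sharpe, returns, turnover)
    print(f"{alpha_id}: computed {computed:.2f}, reported {reported:.2f}")
```

For the last two rows turnover is below 12.5%, so the floor in the denominator applies.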

Past factor performance does not guarantee future returns. These validation records are not required to run the open-source demo and do not constitute investment advice.

Agent Contract

The agent can explore, but it works inside a contract. The harness owns state and creates reviewable artifacts.

| Agent action | Harness control |
| --- | --- |
| Generate candidates | Stable candidate IDs, source tags, field and operator extraction |
| Run experiments | Sandbox artifacts and no-submit defaults |
| Interpret results | Structured review queue and rejection reasons |
| Learn from failures | Memory records, blocked signatures, field-family stats |
| Plan the next batch | Profile evolution and explicit child experiment |
| Submit | Human-selected alpha IDs only |

The executable contract is implemented through HarnessRun, HarnessStep, HarnessEvent, ArtifactRef, DecisionGate, MemoryDelta, ProfilePatch, and the Alpha-GPT semantic records for hypothesis, candidate spec, review decision, reflection, and submit evidence. See Agent Harness Contract, Alpha-GPT Harness, and Agent Roles.
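To make the shape of such a contract concrete, here is a deliberately simplified sketch that reuses three of the record names above. The fields, the stage strings, and the single Sharpe-based gate are hypothetical; the real `HarnessRun`, `HarnessEvent`, and `DecisionGate` classes in the repository will have their own definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class HarnessEvent:
    """Immutable lifecycle record (hypothetical fields)."""
    candidate_id: str
    stage: str
    detail: str = ""


@dataclass
class DecisionGate:
    """Hypothetical gate with a single minimum-Sharpe rule."""
    name: str
    min_sharpe: float

    def decide(self, candidate_id: str, sharpe: float) -> HarnessEvent:
        if sharpe >= self.min_sharpe:
            return HarnessEvent(candidate_id, "presubmit_ready")
        reason = f"sharpe {sharpe:.2f} < {self.min_sharpe:.2f}"
        return HarnessEvent(candidate_id, "presubmit_rejected", reason)


@dataclass
class HarnessRun:
    """Owns the append-only event list for one run."""
    run_id: str
    events: List[HarnessEvent] = field(default_factory=list)

    def record(self, event: HarnessEvent) -> None:
        self.events.append(event)

    def rejected(self) -> List[HarnessEvent]:
        return [e for e in self.events if e.stage == "presubmit_rejected"]
```

The point of the sketch is the ownership split: the agent supplies candidates and metrics, while the run object records every gate outcome with its reason.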

Core Capabilities

| Area | Capability |
| --- | --- |
| Harness orchestration | Public no-submit eval, sandbox experiments, presubmit gates, lifecycle traces |
| Memory | History ingest, blocker signatures, community skills, trajectory ledgers, factor-family stats, profile evolution |
| Agent access | MCP tools, CLI scripts, REST API, monitoring UI |
| Review | Quality review dashboards, Alpha-GPT review decisions, submit efficiency reports, ready/rejected queues |
| WQ boundary | Check-only inspection and explicit credentialed submission commands |
| Local research | Local parser, backtest, anti-overfit checks, walk-forward validation |

MCP tool surface

The MCP server exposes harness and research operations for agent workflows, including public harness runs, presubmit evaluation, history ingestion, memory maintenance, status inspection, local backtesting, factor scoring, diagnostics, anti-overfit checks, rolling validation, and explicit WQ BRAIN check/submit commands.

Setup

Start the HTTP server:

python -m worldquant_harness --transport http

Use MCP from Claude Code or Claude Desktop:

{
  "mcpServers": {
    "worldquant-harness": {
      "command": "python",
      "args": ["-m", "worldquant_harness"]
    }
  }
}

For the full local setup, Windows notes, optional PostgreSQL, optional DeepSeek configuration, and API examples, use Quick Start.

Project Layout

worldquant-harness/
├── worldquant_harness/          # Backend, parser, harness contracts, MCP, API
├── frontend/                    # React monitoring dashboard
├── scripts/                     # Public demo, visual pack, review, WQ workflows
├── tests/                       # Parser, backtest, workflow, API, harness tests
├── example_factor/              # Sanitized historical validation screenshots
└── docs/                        # Architecture, API, MCP, workflow, safety docs

Responsible Use

  • Use your own credentials for external services and follow their terms, policies, rate limits, and data restrictions.
  • Keep .env, .secrets/, local databases, raw platform exports, submit/check ledgers, and full research reports private.
  • Do not publish alpha expressions, private platform exports, or unsanitized screenshots without review.
  • Do not present generated factors, screenshots, or backtests as guaranteed returns.
  • Review Open Source Audit and Release Checklist before publishing a fork or release.

License

MIT. Copyright and attribution details are recorded in NOTICE.

This public release is maintained as worldquant-harness. Derivative works should retain the copyright notice and comply with the MIT License terms.

See NOTICE, DISCLAIMER, SECURITY, and CODE_OF_CONDUCT for details.
