alpha-101: extend with more alphas (starting with Alpha #2) #28

@gangtao

Scope

Extend the existing apps/alpha-101/ app — currently focused on WorldQuant Alpha #1 — to also implement additional formulas from the 101 Formulaic Alphas paper. Start with Alpha #2:

```
−1 × correlation(rank(delta(log(volume), 2)), rank((close − open) / open), 6)
```
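A minimal sketch of the formula in plain Python, to pin down the operator semantics before writing the views. It assumes the paper's definitions (`rank` is cross-sectional, `delta(x, d)` is `x_t − x_{t−d}`, `correlation(x, y, d)` is a rolling time-series correlation over `d` buckets). The bar layout, the function names, and the choice of fractional ranks with tie averaging are illustrative assumptions, not the app's actual implementation.

```python
import math
import random

def cross_rank(values):
    """Fractional cross-sectional rank in (0, 1]; ties share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg / n
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation, or None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    # Clamp to absorb floating-point overshoot just outside [-1, 1].
    return max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))

def alpha_2(bars, window=6):
    """bars: {symbol: [(open, close, volume), ...]}, one tuple per bucket (assumed layout)."""
    symbols = sorted(bars)
    n_buckets = len(bars[symbols[0]])
    rank_vol = {s: [None] * n_buckets for s in symbols}
    rank_ret = {s: [None] * n_buckets for s in symbols}
    for t in range(2, n_buckets):  # delta(., 2) needs two prior buckets
        vol_delta = [math.log(bars[s][t][2]) - math.log(bars[s][t - 2][2]) for s in symbols]
        intraday = [(bars[s][t][1] - bars[s][t][0]) / bars[s][t][0] for s in symbols]
        for s, rv, rr in zip(symbols, cross_rank(vol_delta), cross_rank(intraday)):
            rank_vol[s][t] = rv
            rank_ret[s][t] = rr
    out = {s: [None] * n_buckets for s in symbols}
    first = 2 + window - 1  # first bucket with a full window of ranks
    for s in symbols:
        for t in range(first, n_buckets):
            lo = t - window + 1
            r = pearson(rank_vol[s][lo:t + 1], rank_ret[s][lo:t + 1])
            out[s][t] = None if r is None else -r
    return out

# Synthetic demo: 4 hypothetical symbols, 12 buckets of random bars.
rng = random.Random(0)
demo_bars = {}
for sym in "ABCD":
    rows = []
    for _ in range(12):
        o = 100.0 * (1 + rng.uniform(-0.01, 0.01))
        c = o * (1 + rng.uniform(-0.01, 0.01))
        rows.append((o, c, rng.lognormvariate(8, 0.5)))
    demo_bars[sym] = rows
demo = alpha_2(demo_bars)
```

With a 2-bucket delta and a 6-bucket window, the first seven buckets of each symbol have no value, which is worth keeping in mind when sizing dashboard time ranges.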

The two alphas share the same upstream data pipeline (random source → MV → market_data → bars), so the goal is to extend the shared pieces once rather than duplicate them.

Required pipeline changes

  1. Random source (`ddl/001`) — add a `volume` field (random integer or log-normal).
  2. Persistent stream (`ddl/002`) — add a `volume` column.
  3. MV (`ddl/003`) — pass volume through.
  4. `v_bars` (`ddl/004`) — expose `open`, `close`, and `volume` per bucket (currently only `close`).
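
The bucket aggregation in step 4 could look roughly like the sketch below. It assumes `open` is the first tick price in a bucket, `close` is the last, and `volume` is the sum of tick volumes; the issue does not spell these out, and the tick tuple layout and 5-second bucket width are hypothetical.

```python
def build_bars(ticks, bucket_seconds=5):
    """Aggregate ticks into per-bucket, per-symbol bars.

    ticks: iterable of (ts_seconds, symbol, price, volume), sorted by ts (assumed layout).
    Returns {(bucket_start, symbol): {"open": ..., "close": ..., "volume": ...}}.
    """
    bars = {}
    for ts, symbol, price, volume in ticks:
        bucket = int(ts // bucket_seconds) * bucket_seconds
        key = (bucket, symbol)
        bar = bars.get(key)
        if bar is None:
            # First tick in the bucket sets open; close starts equal to it.
            bars[key] = {"open": price, "close": price, "volume": volume}
        else:
            # Later ticks move close forward and accumulate volume.
            bar["close"] = price
            bar["volume"] += volume
    return bars

sample_ticks = [
    (0, "A", 10.0, 100),
    (1, "A", 11.0, 50),
    (4, "A", 10.5, 25),
    (5, "A", 10.7, 10),
]
sample_bars = build_bars(sample_ticks)
```

Because `close` is computed the same way as before, Alpha #1's views should see unchanged values as long as they select columns by name.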

Alpha #1's downstream views (`v_features` → `v_ts_argmax_5` → `v_alpha_1` → `v_backtest`) must keep working with the new schema. This needs a live verification step.

New views for Alpha #2

  • `v_features_2` — per stock per bucket: `intraday_ret = (close − open) / open` and `log_vol_delta_2 = log(volume_t) − log(volume_{t−2})`.
  • `v_alpha_2` — per bucket compute cross-sectional ranks of both features (using the mean-zero rank pattern from Alpha #1), then per stock compute the rolling 6-bucket Pearson correlation between the two rank series, then negate.
  • `v_backtest_2` — same shape as `v_backtest` for Alpha #1: `pnl = lag(alpha_2) × returns`.
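
As a rough illustration of the two building blocks named above, here is a sketch of a mean-zero rank and of the lagged PnL. The exact rank pattern used by Alpha #1 is not shown in this issue, so `mean_zero_rank` is an assumed form (rank divided by count, then centered on the cross-sectional mean), and `returns` is assumed to be the close-to-close bucket return. Both function names are hypothetical.

```python
def mean_zero_rank(values):
    """Cross-sectional rank scaled to (0, 1], shifted so the bucket mean is zero.

    Assumed form of the 'mean-zero rank pattern'; ties are broken by input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for pos, i in enumerate(order):
        ranks[i] = (pos + 1) / n
    mean = sum(ranks) / n
    return [r - mean for r in ranks]

def backtest(alpha, close):
    """pnl_t = alpha_{t-1} * returns_t for one symbol.

    alpha may contain None during warm-up; returns_t is assumed to be
    close_t / close_{t-1} - 1.
    """
    pnl = [None] * len(close)
    for t in range(1, len(close)):
        prev_alpha = alpha[t - 1]
        if prev_alpha is None:
            continue  # no position was held going into this bucket
        ret = close[t] / close[t - 1] - 1.0
        pnl[t] = prev_alpha * ret
    return pnl

centered = mean_zero_rank([10, 30, 20])
sample_pnl = backtest([None, 0.5, -1.0, 0.2], [100.0, 110.0, 121.0, 108.9])
```

Lagging the alpha by one bucket keeps the backtest free of look-ahead: the position for bucket `t` uses only information available at the end of bucket `t − 1`.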

Dashboards

Either extend the existing two dashboards with Alpha #2 panels, or add two more dashboards (`Alpha #2 Live`, `Alpha #2 Backtest`) mirroring the Alpha #1 structure. Recommend the second — keeps each alpha's panels coherent.

Acceptance

  • Single `.tpapp` install brings up both alphas
  • `v_bars` exposes `open`, `close`, `volume`; existing Alpha #1 pipeline still works
  • `v_alpha_2` emits values bounded in `[−1, 1]` (it's a correlation)
  • Backtests report for both alphas, and both show an approximately null result on synthetic data (random ticks carry no edge)
  • README documents both alphas + the shared pipeline
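
One possible way to make the "null result" criterion checkable is a t-statistic on the per-bucket PnL, sketched below. The helper name and the rule of thumb that `|t|` below about 2 is consistent with no edge are assumptions for illustration, not part of the acceptance criteria.

```python
import math
import statistics

def pnl_t_stat(pnl):
    """t-statistic of mean PnL against zero; None entries (warm-up) are skipped.

    Returns None when there are fewer than two values or the PnL is constant.
    """
    values = [p for p in pnl if p is not None]
    if len(values) < 2:
        return None
    spread = statistics.stdev(values)
    if spread == 0:
        return None
    return statistics.mean(values) / (spread / math.sqrt(len(values)))

flat = pnl_t_stat([None, 0.01, -0.01, 0.02, -0.02])
trending = pnl_t_stat([1.0, 2.0, 3.0])
```

On random ticks, a value of `|t|` well above 2 for either alpha would more likely point to a look-ahead bug in the views than to a real edge.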
