feat(universe): Phase 4.6 — survivorship-bias fix (historical S&P 500 membership)#274

Merged: dackclup merged 1 commit into `main` from `claude/phase-4.6-survivorship-bias-fix` on May 27, 2026

Conversation

@dackclup
Owner

⚠️ STATUS: DRAFT — DO NOT MERGE WITHOUT REVIEW

First executed feature of the Research Report v1.0 autonomous pipeline. P0 per Report §7.4: a credibility-bearing prerequisite for every subsequent validation gate in Phase 5 (ML), Phase 6 (Lazy Prices), and Phase 7 (NCO).

Why this matters

Without historical S&P 500 membership, every backtest / IC / PBO / DSR computation that uses pre-today dates is silently survivorship-biased: stocks that delisted or got dropped from the index are excluded from the historical view of the universe. Hou-Xue-Zhang (2020) RFS calls this out as the PRIMARY failure mode in factor-zoo replication; McLean-Pontiff (2016) JF locks the 32% post-publication decay budget that ONLY holds under survivorship-corrected estimates.

The weekly cron's forward compute is unaffected — current S&P 500 membership remains correct for forward-looking ranking. This fix is foundational for every honest validation in subsequent Phases.

What landed

| File | Purpose |
|---|---|
| `data/sp500_membership_historical.csv` (NEW, ~55 events) | Static CSV of ADD / REMOVE / RENAME events with per-row Wikipedia or S&P Dow Jones press-release source URLs. Coverage 2020-01-01 → present. |
| `compute/ingest/historical_universe.py` (NEW) | `members_at(as_of_date, current_universe, anchor_date)` pure function returning `MembershipResult(tickers, is_complete, events_applied, note)`. Reverses post-as-of events to compute the historical universe. |
| `compute/output/schemas.py` | 2 new Metadata fields: `universe_membership_as_of` (ISO date) + `survivorship_bias_corrected` (bool). |
| `frontend/lib/types.ts` | TS lockstep mirror. |
| `frontend/lib/schema-snapshot.json` | Regenerated via `--update-snapshot`. |
| `compute/config.py` | `SCHEMA_VERSION` `0.10.6-phase4.5e` → `0.10.7-phase4.6`. |
| `tests/test_config.py` | Schema-version pin updated. |
| `tests/test_ingest/test_historical_universe.py` (NEW) | 20 tests covering parse helpers + `members_at` semantics + CSV invariants. |

Hard rules preserved

  • ✅ Rule 9 — schema triple lockstep atomic (Pydantic + TS + snapshot in same commit)
  • ✅ Rule 18 — observability surface ships in SAME PR as the module
  • ✅ Rule 16 — N/A (no scoring change; no composite touched)
  • ✅ License: factual list (uncopyrightable per Feist 1991 + 17 USC 102(b)); per-row source citation
  • ✅ No new pip / npm deps
  • ✅ No live browser inference
  • ✅ No FastAPI / Postgres / Docker
  • ✅ No trade recommendations of specific tickers (CSV is factual index history only)
  • ✅ Universe S&P 500 only

Verification

| Check | Result |
|---|---|
| `ruff check` (new files) | ✅ clean |
| `python -m compute.output.schema_check` | ✅ Schema snapshot in sync |
| `python -m pytest tests/test_config.py` + `tests/test_ingest/test_historical_universe.py` | ✅ 21 passed in 0.05s |
| Test count delta | +20 |
| Schema-version pin | 0.10.6-phase4.5e → 0.10.7-phase4.6 ✅ |
| CSV row validation | all rows have ADD/REMOVE/RENAME action + http source_url ✅ |
| Known-event regressions | TSLA 2020-12-21 ✅ · SMCI 2024-03-18 ✅ · ATVI 2023-10-18 ✅ · SVB 2023-03-13 ✅ |

Known-event coverage in test suite

  • TSLA add 2020-12-21: members_at(2020-12-20) excludes TSLA ✅; members_at(2020-12-22) includes TSLA ✅
  • SMCI add 2024-03-18: members_at(2024-03-17) excludes SMCI ✅
  • ATVI removal 2023-10-18 (MSFT acquisition): members_at(2023-10-17) includes ATVI ✅ (would be missed by current-universe-only)
  • SVB collapse 2023-03-13: members_at(2023-03-12) includes SVB ✅ (classic survivorship-bias example)
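
The reverse-walk idea behind these regression cases can be sketched as follows. This is a minimal illustration, not the module's code: `Event`, `members_at_sketch`, and the two-event log are hypothetical, RENAME handling is omitted, and treating an event dated exactly on the as-of date as already in effect is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """One row of a hypothetical membership-event log."""
    when: date
    action: str  # "ADD" or "REMOVE" (RENAME omitted for brevity)
    ticker: str


def members_at_sketch(as_of: date, current: frozenset, events: list) -> frozenset:
    """Start from today's universe and undo every event dated after `as_of`."""
    tickers = set(current)
    for ev in sorted(events, key=lambda e: e.when, reverse=True):
        if ev.when <= as_of:
            break  # this and all older events were already in effect at as_of
        if ev.action == "ADD":
            tickers.discard(ev.ticker)  # added later, so not yet a member
        elif ev.action == "REMOVE":
            tickers.add(ev.ticker)  # removed later, so still a member
    return frozenset(tickers)


# Toy data: two of the events named above, plus a made-up current universe.
EVENTS = [
    Event(date(2020, 12, 21), "ADD", "TSLA"),
    Event(date(2023, 10, 18), "REMOVE", "ATVI"),
]
CURRENT = frozenset({"AAPL", "MSFT", "TSLA"})

before_tsla = members_at_sketch(date(2020, 12, 20), CURRENT, EVENTS)
after_tsla = members_at_sketch(date(2020, 12, 22), CURRENT, EVENTS)
```

Here `before_tsla` drops TSLA and restores ATVI, while `after_tsla` keeps TSLA, which mirrors the TSLA and ATVI cases in the list above.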

Methodology citations

  • Hou, Xue, Zhang (2020). "Replicating Anomalies." Review of Financial Studies 33(5):2019-2133.
  • McLean, Pontiff (2016). "Does Academic Research Destroy Stock Return Predictability?" Journal of Finance 71(1):5-32.
  • License posture: Feist Publications v. Rural Tel. Service Co. (1991) — factual lists are not copyrightable.

NOT in this PR (deferred to follow-up PRs)

  • compute/validation/pbo_dsr.py integration to accept universe_provider kwarg — unblocks PBO/DSR re-runs with honest historical universes
  • compute/main.py wiring to populate Metadata.universe_membership_as_of in forward cron
  • Verify-helper Section M for membership accounting equation
  • Extending CSV coverage backward (pre-2020 events)
  • Honest re-validation of existing pillars + manipulation_index with new universe provider (may revise published baselines)

Test plan

  • ruff check . — clean (new files)
  • python -m compute.output.schema_check — clean
  • pytest tests/test_ingest/test_historical_universe.py tests/test_config.py — 21/21 pass
  • CI (Python lint+test, Frontend build) — expected green
  • Vercel preview — expected green (no frontend code touched beyond types.ts)
  • methodology-scientist subagent — verify HXZ 2020 + MP 2016 citations (FOLLOW-UP review)
  • schema-sentinel subagent — verify Pydantic ↔ TS ↔ snapshot triple sync (auto-fired by hook)
  • dependency-auditor — N/A (no new deps)
  • security-reviewer — N/A (no new env-vars, workflows, or surfaces)

Subscribe-after-open suggestion: same as PR #271 / #272 / #273 — subscribe me to PR activity for CI + review comments.

Generated by Claude Code

… membership)

Closes Research Report v1.0 §7.4, the highest-priority credibility
fix for the top-1% claim. Without this module, every backtest / IC /
PBO / DSR computation using pre-today dates is silently survivorship-
biased: stocks that delisted or were dropped from the S&P 500 are
excluded from the historical view of the universe.

The weekly cron's FORWARD compute is unaffected — current S&P 500
membership remains correct for forward-looking ranking. This fix is
the foundation for HONEST validation in subsequent Phases (5 ML, 6
Lazy Prices, 7 NCO portfolio).

## Changes

- `data/sp500_membership_historical.csv` (NEW, ~55 events) — static
  CSV of ADD / REMOVE / RENAME events with per-row Wikipedia or S&P
  press-release source URLs. Coverage: 2020-01-01 → present, the
  5-year window relevant to current backtest validation. Subsequent
  PRs may extend coverage backward.

- `compute/ingest/historical_universe.py` (NEW) —
  `members_at(as_of_date, current_universe, anchor_date=None) ->
  MembershipResult` pure function. Walks events in reverse from today
  back to as_of_date, undoing each ADD/REMOVE that post-dates the
  query. Returns `MembershipResult(tickers, as_of_date, anchor_date,
  is_complete, events_applied, note)` — `is_complete=False` for dates
  before `EARLIEST_EVENT_DATE = 2020-01-01` with a loud warning log
  (degraded mode, NOT silent fallback).

- `compute/output/schemas.py` — Rule 18 observability surface:
  `Metadata.universe_membership_as_of: str | None` (ISO date of the
  historical snapshot used by this compute run; forward cron = today)
  + `Metadata.survivorship_bias_corrected: bool | None` (True when
  the lookup was honest, False when degraded to current-only fallback).

- `frontend/lib/types.ts` — TS-side lockstep mirror of the 2 new
  fields.

- `frontend/lib/schema-snapshot.json` — regenerated via
  `python -m compute.output.schema_check --update-snapshot`.

- `compute/config.py` — `SCHEMA_VERSION` `0.10.6-phase4.5e` →
  `0.10.7-phase4.6` (PATCH bump, additive fields, non-breaking).

- `tests/test_config.py` — schema-version pin updated 0.10.6 →
  0.10.7-phase4.6 with full rationale docstring.

- `tests/test_ingest/test_historical_universe.py` (NEW) — 20 tests:
  - 5 parse-helper edge cases (well-formed / comment / blank /
    invalid action / malformed date)
  - 2 anchor / future-date guard cases
  - 5 known-historical-event regression tests (TSLA 2020-12-21,
    SMCI 2024-03-18, ATVI 2023-10-18 acquisition by MSFT, SVB
    2023-03-13 collapse)
  - 2 pre-coverage degradation cases (is_complete=False below
    EARLIEST_EVENT_DATE; exact-boundary case is_complete=True)
  - 3 list_known_events accessor tests (sorted, date filter, count
    invariant)
  - 3 CSV-structure invariants (action in valid set, every row
    has http source_url, coverage count >= 30)
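
The parse-helper edge cases listed above (well-formed, comment, blank, invalid action, malformed date) could be exercised against something like the sketch below. The column order `date,action,ticker,source_url`, the name `parse_event_line`, and the choice to raise `ValueError` on bad rows are all assumptions for illustration; the real CSV layout and helper are not shown in this PR text.

```python
from datetime import date

VALID_ACTIONS = {"ADD", "REMOVE", "RENAME"}


def parse_event_line(line: str):
    """Return (date, action, ticker, source_url), or None for skippable lines.

    Raises ValueError on a malformed row. Column order is hypothetical, and a
    plain split assumes no commas inside the URL.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # blank line or comment
    parts = [p.strip() for p in stripped.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 columns, got {len(parts)}")
    raw_date, action, ticker, source_url = parts
    when = date.fromisoformat(raw_date)  # raises ValueError on a malformed date
    if action not in VALID_ACTIONS:
        raise ValueError(f"invalid action: {action!r}")
    if not source_url.startswith("http"):
        raise ValueError("source_url must be an http(s) URL")
    return when, action, ticker, source_url


# example.com is a placeholder, not a real source URL from the CSV.
row = parse_event_line("2020-12-21,ADD,TSLA,https://example.com/source")
```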

## Hard rules preserved

- Rule 9 (schema triple) — atomic Pydantic + TS + snapshot bump ✅
- Rule 18 (observability-before-wiring) — diagnostic surface shipped
  in same PR as the module ✅
- Rule 16 (composite formula sacred) — N/A (no scoring change) ✅
- License: factual list (uncopyrightable per Feist 1991 + 17 USC
  102(b)); Wikipedia source CC BY-SA 4.0, our compilation = original
  work under project MIT license; per-row citation included ✅
- No new pip / npm deps ✅
- No live browser inference ✅
- No FastAPI / Postgres / Docker ✅
- No trade recommendations ✅

## Methodology citations

- Hou, Xue, Zhang (2020). "Replicating Anomalies." Review of
  Financial Studies 33(5):2019-2133. Replication crisis evidence
  emphasizing survivorship as PRIMARY failure mode in factor-zoo
  work.
- McLean, Pontiff (2016). "Does Academic Research Destroy Stock
  Return Predictability?" Journal of Finance 71(1):5-32. 32% post-
  publication decay; survivorship correction is the first-order
  honest-estimate adjustment per the Report's hard-constraint locked
  decay budget.

## Verification

- `ruff check compute/ingest/historical_universe.py
  tests/test_ingest/test_historical_universe.py` — clean
- `python -m compute.output.schema_check` — Schema snapshot in sync
- `python -m pytest tests/test_config.py::test_schema_version_is_phase4_6
  tests/test_ingest/test_historical_universe.py` — 21 passed
- Test suite count: +20 (new offline tests)

## NOT in this PR (deferred to follow-up PRs)

- `compute/validation/pbo_dsr.py` integration to ACCEPT
  `universe_provider` kwarg — separate PR after this lands;
  unblocks PBO/DSR re-runs with honest historical universes
- `compute/main.py` wiring to populate `Metadata.universe_membership_as_of`
  in forward cron — separate PR
- Verify-helper Section M for membership accounting equation
- Extending CSV coverage backward (pre-2020 events)
- Honest re-validation of existing pillars + manipulation_index
  using new universe provider (likely revises some published numbers)
@vercel

vercel Bot commented May 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| quantrank | Ready | Preview, Comment | May 27, 2026 11:35am |

@github-actions
Contributor

Pre-merge production simulation

| Field | Value |
|---|---|
| Duration | 319s |
| Universe size | 500 |
| Schema version | 0.10.7-phase4.6 |
| Compute commit | 20685bb8dc77cb76a6e5235c9505014f329db95b |
| PR-branch output | pr-274-compute-output (14-day retention) |

Diff vs main

| Field | Main | PR | Δ |
|---|---|---|---|
| Universe size | 502 | 500 | -2 |
| Schema version | 0.10.6-phase4.5e | 0.10.7-phase4.6 | ⚠️ bumped |

Main baseline: 2026-05-26T23:19:25Z (0.5 days old)

Top-10 movers (sorted by |Δcomposite_score|)

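| Ticker | PR rank | main rank | Δrank | PR score | main score | Δscore |
|---|---|---|---|---|---|---|
| AXON | 500 | 500 | +0 | 25.73 | 27.06 | -1.33 |
| NDSN | 187 | 203 | +16 | 53.24 | 52.63 | +0.61 |
| EXPD | 52 | 45 | -7 | 61.84 | 62.38 | -0.54 |
| PM | 289 | 299 | +10 | 48.90 | 48.37 | +0.53 |
| UNP | 139 | 143 | +4 | 56.07 | 55.68 | +0.39 |
| CHD | 236 | 222 | -14 | 51.51 | 51.88 | -0.37 |
| MNST | 85 | 81 | -4 | 58.76 | 59.10 | -0.34 |
| ATO | 372 | 378 | +6 | 44.26 | 43.92 | +0.34 |
| MDLZ | 484 | 486 | +2 | 34.18 | 33.87 | +0.31 |
| CPB | 445 | 448 | +3 | 39.01 | 38.70 | +0.31 |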

Tickers in main only (2): CASY, CMI

@dackclup dackclup marked this pull request as ready for review May 27, 2026 11:48
@dackclup dackclup merged commit f288884 into main May 27, 2026
5 checks passed
@dackclup dackclup deleted the claude/phase-4.6-survivorship-bias-fix branch May 27, 2026 11:48
dackclup added a commit that referenced this pull request May 27, 2026
…bo_dsr gates (#275)

Closes the next-task gap surfaced by Research Report v1.0 §7.4 follow-up:
PR #274 landed `compute/ingest/historical_universe.members_at()` as a
library but nothing called it. This PR threads an optional
`universe_provider` callable through `factor_passes_gates()` so the
returned metrics dict carries honest universe provenance — closing the
loop between the historical-membership module and the validation
gates that every Phase 4+ (OSAP / JKP / Qlib / IPCA) and Phase 5
(ML meta-learner) candidate factor must pass.

## What changed

- `compute/validation/pbo_dsr.py::factor_passes_gates()` — 3 new
  optional kwargs (`universe_provider`, `as_of_date`, `current_universe`).
  When all 3 are passed, the function calls the provider once and
  enriches the metrics dict with `universe_as_of` (ISO date),
  `universe_size` (int), and `survivorship_bias_corrected` (bool).
  When the provider returns `is_complete=False` (e.g., pre-2020
  date), the flag flips False AND a warning is logged. When the
  provider raises, behavior degrades gracefully (3 fields stay None,
  validation still completes from the supplied returns_matrix).

- `compute/validation/pbo_dsr.py::today_utc_date()` — small helper
  for callers wiring `as_of_date=today_utc_date()` in forward-cron
  validation paths.

- `tests/test_validation/test_pbo_dsr.py` — 6 new tests covering:
  * backward-compat (no universe kwargs → 3 new keys = None)
  * happy path (members_at() with date well inside coverage)
  * degraded path (pre-EARLIEST_EVENT_DATE → is_complete=False)
  * provider-raises (graceful degradation, validation still runs)
  * partial-kwargs warning (caller passes only some of the 3)
  * today_utc_date helper smoke
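
The four behaviours tested above (legacy call, happy path, degraded path, provider raising) can be sketched in isolation. This is an illustrative helper, not the code in `pbo_dsr.py`: `universe_provenance` and `stub_provider` are hypothetical names, the two-argument provider call mirrors the `members_at(as_of_date, current_universe, ...)` signature, and taking `universe_as_of` from the caller's `as_of_date` is an assumption.

```python
import logging
from datetime import date
from types import SimpleNamespace

log = logging.getLogger("provenance_sketch")

PROVENANCE_KEYS = ("universe_as_of", "universe_size", "survivorship_bias_corrected")


def universe_provenance(universe_provider=None, as_of_date=None, current_universe=None):
    """Return the three provenance keys, all None unless the lookup succeeds."""
    out = dict.fromkeys(PROVENANCE_KEYS)  # every value starts as None
    supplied = [arg is not None for arg in (universe_provider, as_of_date, current_universe)]
    if not all(supplied):
        if any(supplied):
            log.warning("only some universe kwargs supplied; provenance left as None")
        return out  # legacy call or partial kwargs
    try:
        result = universe_provider(as_of_date, current_universe)
    except Exception:
        log.warning("universe_provider raised; provenance left as None", exc_info=True)
        return out  # validation can still proceed without provenance
    if not result.is_complete:
        log.warning("historical universe incomplete for %s", as_of_date)
    out["universe_as_of"] = as_of_date.isoformat()
    out["universe_size"] = len(result.tickers)
    out["survivorship_bias_corrected"] = bool(result.is_complete)
    return out


def stub_provider(as_of_date, current_universe):
    """Hypothetical provider: complete only on or after 2020-01-01."""
    return SimpleNamespace(tickers=current_universe,
                           is_complete=as_of_date >= date(2020, 1, 1))


happy = universe_provenance(stub_provider, date(2023, 6, 1), frozenset({"AAPL", "MSFT"}))
degraded = universe_provenance(stub_provider, date(2010, 1, 1), frozenset({"AAPL"}))
```

Keeping the provenance lookup separate from the PBO/DSR math is what lets a failing provider degrade to `None` values without affecting the gate result.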

## Caller migration

Pre-Phase-4.6 callers (osap-integration / jkp-integration / qlib
scout / ipca scout PRs already merged) are byte-identical — all 3
new kwargs default to None and the function returns the legacy
10-key metrics dict augmented with 3 None values. No caller code
needs to change.

NEW callers (Phase 4i.1+ integration PRs + Phase 5 ML meta-learner)
should pass:

    from compute.ingest.historical_universe import members_at
    from compute.ingest.universe import get_sp500_constituents

    current = frozenset(get_sp500_constituents().ticker)
    passes, metrics = factor_passes_gates(
        factor_returns, returns_matrix,
        n_trials=n_trials,
        universe_provider=members_at,
        as_of_date=as_of_date,  # backtest cutoff or today_utc_date()
        current_universe=current,
    )

The metrics dict can then be threaded into
`compute/output/schemas.py::Metadata.universe_membership_as_of` +
`Metadata.survivorship_bias_corrected` at writer time (separate
follow-up PR — `compute/main.py` wiring stays out of this scope).

## Why this matters (anchor)

Hou-Xue-Zhang (2020) RFS replication-crisis evidence emphasizes
survivorship as the PRIMARY failure mode in factor-zoo work. Every
PBO/DSR number that doesn't carry universe provenance is suspect
— pre-Phase-4.6 we couldn't tell, post-Phase-4.6 we can. This PR
is the first place that knowledge surfaces in the validation metric
dict.

## Hard rules preserved

- Rule 9 (schema triple) — N/A; no Pydantic / TS / snapshot touched
  in this PR. Metadata wiring is deferred to a separate writer PR
  per scope discipline.
- Rule 16 (composite formula sacred) — N/A; no scoring change.
- Rule 18 (observability before wiring) — diagnostic surface
  (metrics dict keys) ships in same PR as the integration; consumer
  code (writer) lands next, after this metric surface lives in
  production for ≥ 1 cron cycle.
- License: no new deps. Pure forward TYPE_CHECKING import for type
  hinting `MembershipResult` (zero runtime cost).

## Verification

- `ruff check` — clean (linter moved `Callable` to
  `collections.abc` per modern convention; deliberate, not reverted)
- `pytest tests/test_validation/test_pbo_dsr.py
  tests/test_ingest/test_historical_universe.py
  tests/test_config.py` — 59 passed in 9.34s
- Test count delta: +6 (universe-provider integration cases)
- Backward-compat case explicitly tested (no universe kwargs →
  legacy behavior preserved)

## NOT in this PR (deferred to follow-ups)

- `compute/main.py` writer wiring to populate
  `Metadata.universe_membership_as_of` + `survivorship_bias_corrected`
  in the forward-cron output JSON
- Re-validation of existing pillars + `manipulation_index` with
  the historical universe (likely revises some published baselines
  DOWNWARD — explicit honest-correction PR)
- Verify-helper Section M for universe-provenance accounting
  equation

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 27, 2026
…tadata in forward cron (#276)

Closes the last leg of the Phase 4.6 chain. PR #274 landed the
`historical_universe.members_at()` module + 2 nullable Metadata
fields. PR #275 wired `universe_provider` into `pbo_dsr.factor_passes_gates()`
so validation gates carry honest provenance. This PR makes the
forward-cron `metadata.json` output ACTUALLY populate those fields
instead of leaving them None.

## What changed

- `compute/main.py` — `Metadata(...)` construction now passes:
  - `universe_membership_as_of=now.date().isoformat()` (today's date
    — forward cron scores as-of today)
  - `survivorship_bias_corrected=True` (today's S&P 500 IS the
    honest universe for an as-of-today query, per the PR #274
    schema docstring semantic — True means "this output's universe
    assumption is honest for its as_of_date")

- `tests/test_output/test_writer.py` — 2 new round-trip tests:
  - Phase 4.6 happy path: both fields survive Pydantic → JSON
  - Legacy snapshot back-compat: when neither field is passed
    (pre-0.10.7 caller pattern), Pydantic defaults to None and JSON
    writes nulls

## Hard rules preserved

- ✅ Rule 9 (schema triple) — no schema change in this PR (fields
  already in schemas.py + types.ts + snapshot from PR #274)
- ✅ Rule 16 — N/A (no scoring change)
- ✅ Rule 18 — observability surface from PR #274 is now actually
  populated; consumers can branch on it
- ✅ No new deps
- ✅ No new env-vars

## Verification

- `ruff check compute/main.py tests/test_output/test_writer.py` — clean
- `python -m compute.output.schema_check` — Schema snapshot in sync
- `python -m pytest tests/test_output/test_writer.py -k metadata` —
  4 passed (2 existing + 2 new)

## What goes live on next cron

Next weekday cron (Thu 2026-05-28 22:00 UTC) writes:

    metadata.json:
      ...
      universe_membership_as_of: "2026-05-28"
      survivorship_bias_corrected: true

Backward compat: legacy snapshots (pre-0.10.7) still have these
fields as null per the Pydantic optional default.

## Closes the Phase 4.6 chain

| Layer | PR | Status |
|---|---|---|
| Module | #274 | members_at() + CSV + tests |
| Schema | #274 | Metadata fields + types.ts + snapshot |
| Validation gate | #275 | universe_provider kwarg in pbo_dsr |
| **Writer** | **this PR** | **forward cron populates Metadata** |

## NOT in this PR (next follow-ups)

- Honest re-validation of existing pillars + manipulation_index with
  historical universe (likely shifts PBO/DSR baselines DOWN 5-15%)
- Verify-helper Section M for universe-provenance accounting equation
- Backtest harness that consumes the new universe_provider end-to-end

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 27, 2026
…ss (#277)

Closes the first leg of honest re-validation per Research Report v1.0 §7.4
task #2. Pure-function + CLI scaffolding that answers the foundational
question: **what's the universe drift between today and any historical
as-of date?** Future PRs in this chain layer per-pillar IC / PBO / DSR
re-baselines on top.

## What ships

- `compute/validation/universe_drift.py` (NEW, ~150 LOC) —
  `compute_universe_drift(as_of_date, current_universe) -> UniverseDriftReport`.
  Wraps `historical_universe.members_at()` and returns the 3-way
  symmetric-difference partition: `added_since` / `removed_since` /
  `unchanged` plus size + completeness diagnostic. `format_drift_report()`
  renders the human-readable text block with `+N more` cap for long
  ticker lists.

- `scripts/historical_pillar_revalidate.py` (NEW) — CLI wrapper:
  - `--as-of YYYY-MM-DD` (required)
  - `--json` for downstream tooling
  - `--no-fetch-universe` for offline / CI / smoke runs
  - Exit codes: 0 (clean) / 1 (degraded `is_complete=False`) / 2 (usage)

- `tests/test_validation/test_universe_drift.py` (NEW, 11 tests):
  - Added-since contains recent additions (SMCI, DASH, FSLR, PANW)
  - Removed-since contains delistings (SVB 2023-03-13)
  - Anchor-date = zero drift
  - Pre-EARLIEST_EVENT_DATE = is_complete=False, degraded
  - Future date raises ValueError
  - Partition + size invariants (added+unchanged = current;
    removed+unchanged = historical)
  - Text rendering contains 4 required section labels
  - max_listed cap produces "+N more" suffix
  - Dataclass is frozen
  - anchor_date default = today UTC

- `docs/research/historical-revalidation-harness.md` (NEW) —
  methodology, CLI usage with sample output, acceptance criteria
  for next 6 follow-up PRs, caveats
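
The three-way partition and its size invariants reduce to plain set algebra. The sketch below uses hypothetical names (`DriftSketch`, `drift`, `format_capped`) and a toy universe; the exact "+N more" wording produced by `format_drift_report()` is not shown in this message, so the string format here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftSketch:
    """Hypothetical stand-in for UniverseDriftReport (partition fields only)."""
    added_since: frozenset
    removed_since: frozenset
    unchanged: frozenset


def drift(historical: frozenset, current: frozenset) -> DriftSketch:
    """Partition the union of two universes into added / removed / unchanged."""
    return DriftSketch(
        added_since=current - historical,    # members today, not at as_of
        removed_since=historical - current,  # members at as_of, gone today
        unchanged=current & historical,      # members at both dates
    )


def format_capped(tickers, max_listed=3):
    """Render a sorted ticker list, collapsing the tail into '+N more'."""
    names = sorted(tickers)
    shown = ", ".join(names[:max_listed])
    extra = len(names) - max_listed
    return f"{shown}, +{extra} more" if extra > 0 else shown


historical = frozenset({"AAPL", "ATVI", "MSFT"})
current = frozenset({"AAPL", "MSFT", "SMCI"})
report = drift(historical, current)

# Size invariants from the test list above.
assert report.added_since | report.unchanged == current
assert report.removed_since | report.unchanged == historical
```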

## CLI smoke output (2023-06-01, 7-ticker synthetic universe)

    ADDED since as_of   : 1 tickers
      SMCI
    REMOVED since as_of : 9 tickers
      AAP, ATVI, BIO, BLL, DISH, ETSY, LNC, WHR, ZION
      ↑ this is the SURVIVORSHIP-BIAS-CORRECTED cohort —
        current-universe-only views silently EXCLUDE these

These 9 are the exact cohort an honest backtest at as-of 2023-06-01
must include. Current-universe-only views (= all pre-Phase-4.6 work)
silently dropped them.

## Hard rules preserved

- ✅ Rule 9 — no schema change (validation-internal module)
- ✅ Rule 16 — N/A (no scoring change)
- ✅ Rule 18 — diagnostic surface (UniverseDriftReport dataclass)
  ships in same PR as the module
- ✅ License — no new deps; CSV already on disk from PR #274
- ✅ No frontend touched

## Verification

- `pytest tests/test_validation/test_universe_drift.py` — 11/11 pass
- `ruff check` — clean
- CLI smoke `--no-fetch-universe --as-of 2023-06-01` — exits 0,
  produces expected 9-ticker REMOVED cohort
- CLI degraded `--as-of 2010-01-01` — exits 1, loud warning

## What this PR does NOT do (deferred to next PRs in chain)

Per the docs file's "Future-work TODO list":

1. Git-archived `rankings.json` time-series loader (1d)
2. Forward-return computation per ticker from cache (0.5d)
3. Per-pillar IC at historical dates (1d, needs 1+2)
4. PBO/DSR re-baseline via `factor_passes_gates(universe_provider=...)`
   (1d, needs 3)
5. `manipulation_index` distribution shift report (0.5d, needs 1)
6. `docs/research/honest-baseline-2026-05-27.md` with revised PBO/DSR
   numbers (0.5d, needs 4)

Total to honest-baseline report: ~4-5 days focused dev across a
sequence of PRs.

## Methodology citations

- Hou, Xue, Zhang (2020). "Replicating Anomalies." Review of
  Financial Studies 33(5):2019-2133.
- McLean, Pontiff (2016). "Does Academic Research Destroy Stock
  Return Predictability?" Journal of Finance 71(1):5-32.
- License: factual list (uncopyrightable per Feist 1991).

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 27, 2026
…ignored price cache (#280)

New `compute/validation/forward_returns.py` reads the gitignored
`compute/cache/prices/<TICKER>.parquet` cache (written by
`compute.ingest.prices.fetch_prices`) and computes close-to-close N-month
forward total return at any as-of date. Pairs with PR #278's
`load_ranking_history()` to close the honest IC re-baseline loop
(ranking at T from #2a; realized return at T+horizon from this PR).

API:
- `compute_forward_return(ticker, as_of_date, horizon_months, *, cache_dir=None) -> float | None`
- `compute_forward_return_detailed(...) -> ForwardReturnResult` carrying
  the actual trading dates / closes / note
- `compute_forward_returns_batch(...)` universe batch wrapper
- `coverage_report(...)` failure-mode aggregator for the Hou-Xue-Zhang
  2020 RFS-style coverage check

Edge cases handled: missing parquet → None; no close column → None;
as-of doesn't snap within 5d → None; horizon past last cached row →
censored = None; start_close ≤ 0 or NaN → None; end_close NaN → None;
non-DatetimeIndex parquet → coerced back via `pd.to_datetime`.
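
A minimal pure-Python sketch of the return calculation and the edge cases above. It is not the module's implementation: a `dict` of date to close stands in for the parquet cache, and snapping forward to the next cached date within 5 calendar days is an assumption (the message only says "5d" and does not state the snap direction).

```python
import bisect
import calendar
import math
from datetime import date, timedelta

SNAP_DAYS = 5  # assumed to be calendar days


def add_months(d: date, months: int) -> date:
    """Shift by whole months, clamping the day to the target month's length."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def snap_forward(dates, target):
    """First cached date on or after `target` within SNAP_DAYS, else None."""
    i = bisect.bisect_left(dates, target)
    if i == len(dates) or dates[i] - target > timedelta(days=SNAP_DAYS):
        return None
    return dates[i]


def forward_return(closes: dict, as_of: date, horizon_months: int):
    """Close-to-close return from as_of to as_of + horizon, or None."""
    dates = sorted(closes)
    start = snap_forward(dates, as_of)
    end = snap_forward(dates, add_months(as_of, horizon_months))
    if start is None or end is None:
        return None  # no snap, or horizon censored past the last cached row
    p0, p1 = closes[start], closes[end]
    if p0 is None or p1 is None or math.isnan(p0) or math.isnan(p1) or p0 <= 0:
        return None
    return p1 / p0 - 1.0


# Toy cache with made-up prices.
closes = {date(2023, 1, 3): 100.0, date(2023, 4, 3): 110.0}
three_month = forward_return(closes, date(2023, 1, 1), 3)
censored = forward_return(closes, date(2023, 1, 1), 6)
```

Returning `None` for a censored horizon, rather than a partial-period return, keeps incomplete observations out of downstream IC aggregates.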

Source semantics: prefers `Adj Close` (dividend-adjusted) over `Close`;
NAIVE returns (no costs / slippage); survivorship-bias correction is
NOT done here — callers pair with PR #274's `members_at()` for honest
universe construction (Hou-Xue-Zhang 2020 RFS).

Tests: 19 new + 1 live-cache smoke (auto-skipped without warm cache).
Synthetic OHLCV parquets in `tmp_path` via `cache_dir=tmp_path`. Mirrors
PR #278's synthetic-fixture + live-smoke pattern.

Schema impact: zero (read-only consumer of existing cache shape).
Production-wiring impact: zero (validation tool; no `compute/main.py`
hook). #2c per-pillar IC re-baseline will be the first consumer.

Honest-baseline disclaimer per Research Report v1.0 autonomous mission
constraint: outputs feed IC/DSR/PBO re-baselining, NOT a backtest.
Downstream α claims must net frictions (≥30bp/leg), cite McLean-
Pontiff 2016 32% decay, and cap at 2-5% net per the ceiling.

PHASE_STATUS_INFLIGHT.md updated per PR #237 side-file convention.
Harness doc TODO list updated: 4 of 6 items now landed.

https://claude.ai/code/session_01AGU8d6pm4u2fQQ5cebg9qa

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 27, 2026
…tes (#281)

New `compute/validation/historical_ic.py` orchestrator pairs PR #278's
`load_ranking_history` (ranking at T) with PR #280's
`compute_forward_returns_batch` (realized return at T + horizon) and
computes per-pillar Spearman IC across the historical window —
closes the IC re-baseline half of the Phase 4.6 chain.

API:
- `compute_pillar_ic(scores, returns, *, method, min_tickers)`
  pure cross-sectional IC for one (pillar, date) pair
- `compute_historical_ic_report(start, end, *, horizon_months,
  pillars, ...)` walks rankings.json snapshots + forward returns
  cache, aggregates into `HistoricalICReport`
- `format_ic_report(report)` human-readable text rendering
- `PillarICEntry` / `PillarICSummary` / `HistoricalICReport`
  three frozen-dataclass carriers

Spearman computed as Pearson on rank-transformed series (Spearman
1904 + Conover 1999 §5.4) to avoid pulling scipy into the dep set
(QuantRank ships without scipy; pandas' `Series.corr(method=
'spearman')` requires it transitively).
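
The rank-then-Pearson construction can be sketched without any third-party dependency. The function names are hypothetical and the module's own code may differ; ties receive the average of their rank positions, which is the usual convention for Spearman with ties.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation, or None when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # std = 0, so the correlation is undefined
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation as Pearson on rank-transformed inputs."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotone but nonlinear relation still gives a perfect rank correlation.
rho = spearman([1, 2, 3, 4], [1, 8, 27, 64])
```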

Drops with descriptive notes:
- cross-section < MIN_TICKERS_PER_DATE = 30 (Grinold-Kahn 2000 §4.2)
- None / NaN / inf in either input
- constant inputs (std=0 → correlation undefined)

Aggregates per pillar: mean / std / median / min / max / IC IR /
hit-rate. IC IR = mean/std × sqrt(n_dates) (Grinold-Kahn 2000 §4.4).
Hit-rate = fraction of dates with strictly positive IC.
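
A small sketch of the aggregation as stated, using only the standard library. `summarize_ic` is a hypothetical name, and using the sample standard deviation (ddof = 1) is an assumption since the message does not say which estimator is used.

```python
import math
import statistics


def summarize_ic(ics):
    """Aggregate a per-date IC series into summary statistics."""
    if not ics:
        raise ValueError("need at least one IC observation")
    n = len(ics)
    mean = statistics.fmean(ics)
    std = statistics.stdev(ics) if n > 1 else 0.0  # sample std (assumed)
    return {
        "n_dates": n,
        "mean": mean,
        "std": std,
        "median": statistics.median(ics),
        "min": min(ics),
        "max": max(ics),
        # Formula as stated above: mean / std * sqrt(n_dates).
        "ic_ir": mean / std * math.sqrt(n) if std > 0 else None,
        # Strictly positive: an IC of exactly 0.0 does not count as a hit.
        "hit_rate": sum(1 for v in ics if v > 0) / n,
    }


# Made-up IC series for illustration only.
summary = summarize_ic([0.1, -0.05, 0.2, 0.0])
```

Note that mean / std × sqrt(n) has the same form as a t-statistic for the mean IC, so the value grows with the number of dates; the sketch follows the formula as written in this message.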

Honest-baseline disclaimer per Research Report v1.0:
- IC reported here is NAIVE — no costs / slippage / sector
  neutralization. Real net-of-cost IC typically 30-50% smaller per
  McLean-Pontiff 2016 JF post-publication decay
- The historical universe MUST come from PR #274 members_at() to
  avoid survivorship bias (Hou-Xue-Zhang 2020 RFS); orchestrator
  reads the historical universe FROM rankings.json at as-of T which
  is correct by construction (snapshot itself is historical universe)
- Report is a TIME SERIES + summary, not a single headline number

Tests: 28 new (28 passing). Coverage: module constants, pure IC
computation edge cases (perfect ±1.0, constant inputs, NaN drops,
below-min cross-section, method validation), summary aggregation
math (IC IR formula pinned, hit-rate semantics), orchestrator
full-path (one date / multi-date / missing pillar / malformed JSON),
text rendering, and a live-git smoke that auto-degrades gracefully
when the gitignored price cache is absent.

Schema impact: zero. No new Pydantic / TS / snapshot field.
Production-wiring impact: zero. No compute/main.py import. The
orchestrator is purely a validation / re-baseline tool. Downstream
PRs (#2d PBO/DSR re-baseline + #2f honest-baseline report) consume
the output.

Phase 4.6 chain status: 5 of 6 items now landed (#1/#2 PR #277, #2a
PR #278, #2b PR #280, #2c this PR, #2e PR #279; #2d gate kwarg
shipped PR #275). #4 PBO/DSR re-baseline needs a warm-CI execution
to publish actual numbers; #6 honest-baseline doc closes the chain.

PHASE_STATUS_INFLIGHT.md updated per PR #237 side-file convention.
Harness doc TODO list updated: 5 of 6 items now landed.

https://claude.ai/code/session_01AGU8d6pm4u2fQQ5cebg9qa

Co-authored-by: Claude <noreply@anthropic.com>