Context
Phase 4h (merged 2026-05-18 via PR #112) shipped OSAP signal replication in factor-exposure proxy mode: every ticker receives the same signal map, derived from the market-wide OSAP long-short return cross-section at as_of. This issue tracks the graduation to true per-stock signal replication.
Scope cut acknowledgement
The proxy-mode design is intentional and documented in compute/features/osap_replicate.py:14-35:
Scope note (locked 2026-05-18 plan audit). This is the
factor-exposure proxy version: every ticker receives the same
signal map, derived from the market-wide OSAP long-short return at
as_of. True per-stock signal replication — porting the ~100
signal formulas from OSAP's SAS / Stata source into pandas, fed by
our existing compute/features/ pillar inputs — is the deferred
heavy lift. The proxy version is sufficient for Phase 4h's blend
target because:
- osap_blended_score is observability-only in this phase
(Top-5 ranking still uses composite_score; SKILL.md Rule 16).
- PR 4b §2 PBO/DSR gate
(compute/validation/pbo_dsr.py::factor_passes_gates) runs on
the long-short returns themselves, not the per-stock projection
— so signal acceptance is identical to the full version.
- Per-stock replication of all 100 signals slips Phase 4h by weeks
without unblocking 4i/4j/4k.
Contract stability guarantee
Per the same docstring (osap_replicate.py L33-35):
If this module needs to graduate to true per-stock replication
later, the contract (compute_osap_signals(returns, tickers, as_of) -> dict[str, dict[str, float] | None]) stays stable — only the
inner signal → rank derivation changes per ticker.
Implication for callers: this issue is a strictly internal refactor of compute_osap_signals. No schema bump. No changes to:
- compute/scoring/osap_blend.py::apply_osap_blend (still consumes dict[str, dict[str, float] | None])
- compute/validation/osap_validation.py::gate_osap_signals (operates on long-short returns, not per-stock projection; proxy vs true is invisible to it)
- compute/main.py wiring (public API surface identical)
- compute/output/schemas.py (StockDetail.osap_signals field already typed for either mode)
- Top-5 rotation invariant (SKILL.md Rule 16)
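To make the stability guarantee concrete, here is a minimal sketch of the proxy-mode shape of the contract. It is not the code in osap_replicate.py. Only the signature and return type come from the docstring; the function body, the layout of `returns` (a mapping from date to per-signal long-short return), and the rule that a missing as_of yields None for every ticker are assumptions made for illustration.

```python
from datetime import date
from typing import Optional

# Assumed layout for illustration: as_of date -> {signal name: long-short return}.
SignalMap = dict[str, float]
LongShortReturns = dict[date, SignalMap]


def compute_osap_signals(
    returns: LongShortReturns,
    tickers: list[str],
    as_of: date,
) -> dict[str, Optional[SignalMap]]:
    """Proxy-mode sketch: every ticker gets a copy of the same signal map."""
    market_wide = returns.get(as_of)
    if market_wide is None:
        # Assumption: no OSAP data at as_of means no signals for any ticker.
        return {ticker: None for ticker in tickers}
    # Copy per ticker so a caller mutating one entry cannot affect another.
    return {ticker: dict(market_wide) for ticker in tickers}


# Hypothetical sample data.
example_returns = {date(2026, 5, 15): {"Mom1m": 0.012, "BM": -0.004}}
signals = compute_osap_signals(example_returns, ["AAA", "BBB"], date(2026, 5, 15))
missing = compute_osap_signals(example_returns, ["AAA"], date(2026, 5, 16))
```

A true per-stock implementation would change only how each ticker's map is built. Callers such as apply_osap_blend would still receive a dict[str, dict[str, float] | None].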
Triggers (open implementation PR when EITHER fires)
- Phase 5 backtest infrastructure lands (.claude/skills/phase-4/backtest-infrastructure/PLAN.md): the full walk-forward + purged + embargoed CV harness can directly compare per-stock IC of proxy vs true replication. Without it, we'd be choosing between proxy and true on intuition alone. This is the recommended trigger.
- Analyst / user feedback indicates the proxy is too coarse, e.g., concrete complaints that "ticker X's osap_blended_score doesn't reflect its idiosyncratic exposure", or metadata.osap_signals_coverage_pct shows uniform 100% / 0% bands that misrepresent reality. The first weekly cron after merge will surface real coverage distributions; revisit if they look off.
Implementation outline (when triggered)
Per-signal porting from OSAP's SAS / Stata source into pandas. Existing compute/features/ pillar inputs already cover the majority of the value / profitability / growth / momentum / investment / quality / risk inputs the 100-signal manifest needs. Implementation should be incremental:
- Port the highest-coverage signals first (Mom1m, BM, GP, Accruals: the ones used in tests/test_features/test_osap_e2e_integration.py), ~10-15 signals.
- Add per-signal unit tests asserting that the pandas port of each SAS/Stata formula, once stocks are bucketed into long-short portfolios, reproduces OSAP's released long-short returns within tolerance.
- Toggle proxy → true inside compute_osap_signals via a feature flag (config.OSAP_REPLICATION_MODE = "proxy" | "true") so the cutover can be A/B-compared via parallel cron runs.
- After IC evidence accumulates (Phase 5 backtest), flip the default and retire the proxy code path.
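The flag-based toggle in the outline could be wired roughly as follows. This is a sketch under assumptions: only the flag name OSAP_REPLICATION_MODE and its two values come from this issue. The helper names `_proxy_signals` and `_per_stock_signals`, the simplified argument types, and passing the mode as a function argument instead of reading it from the config module are hypothetical.

```python
from typing import Callable, Optional

SignalMap = dict[str, float]
Signals = dict[str, Optional[SignalMap]]


def _proxy_signals(market_wide: SignalMap, tickers: list[str]) -> Signals:
    # Current behaviour: the same market-wide map for every ticker.
    return {ticker: dict(market_wide) for ticker in tickers}


def _per_stock_signals(market_wide: SignalMap, tickers: list[str]) -> Signals:
    # Placeholder for the ported formulas; not implemented in this sketch.
    raise NotImplementedError("per-stock replication is not ported yet")


# One entry per allowed value of OSAP_REPLICATION_MODE.
_MODES: dict[str, Callable[[SignalMap, list[str]], Signals]] = {
    "proxy": _proxy_signals,
    "true": _per_stock_signals,
}


def compute_signals(
    market_wide: SignalMap, tickers: list[str], mode: str = "proxy"
) -> Signals:
    """Dispatch on the replication mode; reject unknown values loudly."""
    try:
        implementation = _MODES[mode]
    except KeyError:
        raise ValueError(f"unknown OSAP_REPLICATION_MODE: {mode!r}") from None
    return implementation(market_wide, tickers)


result = compute_signals({"Mom1m": 0.012}, ["AAA", "BBB"], mode="proxy")
```

Keeping both paths behind one dispatch table means parallel cron runs differ only in the flag value, and retiring the proxy later amounts to deleting one entry.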
Effort estimate
- 10-15 high-coverage signals: ~2-3 weeks (1 senior eng)
- Full 100-signal port: ~6-8 weeks
- Phase 5 backtest pre-req: ~10-12 weeks (per CLAUDE.md "Next deliverable" tracking)
Realistic earliest start: ~Phase 5 +2 weeks.
Out of scope for this issue
- Top-5 ranking cutover to composite_score_osap_adjusted: separate decision, governed by SKILL.md Rule 16 + Phase 5 IC evidence.
- WRDS path: locked CSV-only per osap-integration/PLAN.md:165-169. WRDS replication is a different debate.
- Per-pillar OSAP weight tuning — locked 50/50 default; Phase 5 ML meta-learner re-tunes that, not this issue.
Related
- c9abb055 head, 5-commit cluster
- 0.9.0-phase4h (SKILL.md schema-versions table)
- .claude/skills/phase-4/osap-integration/PLAN.md
- .claude/skills/phase-4/backtest-infrastructure/PLAN.md