Phase 4h.2 — OSAP_SIGNALS_100 manifest signal-name reconciliation (production 0% acceptance + 78% silent drop) #116

@dackclup

Phase 4h.2 Part 1 SHIPPED → Production diagnostic data in hand

PR #118 merged 2026-05-19, schema bumped 0.9.0-phase4h → 0.9.1-phase4h.2,
new optional Metadata fields wired:

  • osap_signals_missing_from_dataset: list[str] | None
  • osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None

First production cron with these fields landed at commit 3da995dc
(2026-05-19 12:13 UTC, git_commit 182c02de). Production findings
below — Part 2 scope updated.


Production findings (2026-05-19 cron — first 0.9.1 run)

Manifest accounting

OSAP_SIGNALS_100 universe:           100
osap_signals_missing_from_dataset:    22   (Step 1 drops — missing column)
osap_excluded_signals (from gate):    22   (Step 3 drops — failed gate)
osap_signals_used:                     0   (Step 4 survivors — empty)
osap_gate_diagnostics:                22   (per-signal verdicts)
────────────────────────────────────────────
ACCOUNTED:                            44
UNACCOUNTED (silent-drop gap):        56  ← NEW finding
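
The gap arithmetic above, as a minimal sketch. The helper name `unaccounted` is hypothetical; the counts are the ones reported by the 2026-05-19 cron.

```python
UNIVERSE = 100  # size of OSAP_SIGNALS_100


def unaccounted(universe: int, missing: int, gated: int) -> int:
    """Signals that were neither dropped at Step 1 nor seen by the gate at Step 3."""
    return universe - (missing + gated)


# 22 missing from the dataset, 22 with gate diagnostics.
gap = unaccounted(UNIVERSE, missing=22, gated=22)
print(gap)  # 56
```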

NEW: 56-signal silent-drop gap (Step 2)

Root cause: compute/features/osap_replicate.py::compute_long_short_returns
at L120 + L135-136 hard-codes the long/short ports as port=01 /
port=10 (decile-organized signals). OSAP delivers some signals
as quintiles, terciles, or alternative deciles where the SHORT
bucket is not labelled 10 — those signals' pivot has no
port=10 column, and lines L135-136 silently return an empty
DataFrame.

Code reference (compute/features/osap_replicate.py:91-140):

df = df[df["port"].isin([LONG_PORT_LABEL, SHORT_PORT_LABEL])]
pivot = df.pivot_table(index=["signalname", "date"], columns="port", values="ret", ...)
if LONG_PORT_LABEL not in pivot.columns or SHORT_PORT_LABEL not in pivot.columns:
    return pd.DataFrame(columns=["signalname", "date", "ls_return"])

LONG_PORT_LABEL = "01" and SHORT_PORT_LABEL = "10" (both
constants at L60-65). Phase 4h ingest code at compute/ingest/osap.py:47
already flags this: "Phase 4h may switch to deciles_vw etc. once
the blend logic"
— the switch never landed.

Why it's not visible in the diagnostic surface — the silent
drop happens before gate_osap_signals runs. The signals
appear neither in osap_signals_missing_from_dataset (they
exist in the raw frame) nor in osap_gate_diagnostics (they
never reach the gate). They simply vanish.

Gate-reaching signals (22 of 100)

All 22 signals that reach the gate fail with rejection_reason='low_dsr',
a 0% acceptance rate. Sharpe distribution:

  • min: -1.40
  • median: -0.55
  • max: 0.094 (only 1 of 22 positive)
  • PBO uniformly low (~0.027 — never the rejection cause)

Top 5 ranking unaffected (Rule 16 holds)

Top-5 still ranks by raw composite_score. osap_blended_score
is None for all 502 stocks because no signals survived. Path-b
blend with w=0.5 reduces to identity when the OSAP aggregate
is None.
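
A minimal sketch of why the ranking is unaffected, assuming the path-b blend is a convex combination that falls back to the raw score when no OSAP aggregate exists. `blend_score`, the tickers, and the scores are hypothetical illustrations, not the repo's code; in production `osap_blended_score` itself is stored as None and the ranking uses `composite_score` directly.

```python
from typing import Optional


def blend_score(composite: float, osap_aggregate: Optional[float], w: float = 0.5) -> float:
    """Assumed blend: (1 - w) * composite + w * osap, identity when OSAP is absent."""
    if osap_aggregate is None:
        return composite
    return (1.0 - w) * composite + w * osap_aggregate


# Hypothetical composite scores; no OSAP signal survived, so every aggregate is None.
composite_scores = {"AAA": 0.91, "BBB": 0.87, "CCC": 0.42, "DDD": 0.65}
osap_aggregates = {ticker: None for ticker in composite_scores}

raw_rank = sorted(composite_scores, key=composite_scores.get, reverse=True)
blended = {t: blend_score(s, osap_aggregates[t]) for t, s in composite_scores.items()}
blended_rank = sorted(blended, key=blended.get, reverse=True)

print(raw_rank == blended_rank)  # True
```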


Part 2 scope (UPDATED)

A. Fix the 56-signal silent-drop gap (NEW — was not in original Part 1.5 brief)

Three options:

  1. Multi-port adapter — extend compute_long_short_returns
    to recognize quintile/tercile alternatives. Per-signal port
    inference: for each signalname, find min(port) and max(port);
    use those as LONG/SHORT. Adds ~20 LOC, dataset-agnostic.

  2. OSAP port_type query — query OSAP for each signal's
    port structure metadata and dispatch. Requires upstream
    investigation; depends on openassetpricing API surface.

  3. Manifest curation — drop the 56 non-decile signals from
    OSAP_SIGNALS_100 and re-publish as OSAP_SIGNALS_44 (or
    investigate which subset is decile-organized first). Smallest
    diff, but loses signal coverage.

Recommended: Option 1 (multi-port adapter). Restores 56
signals to the pipeline, gives the gate something to evaluate.
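
A rough sketch of what Option 1 could look like, not the actual patch. It assumes the raw frame has the columns used in the code reference above (signalname, date, port, ret) and that port labels are numeric strings; `long_short_returns_inferred` is a hypothetical name, and non-numeric labels are simply ignored here.

```python
import pandas as pd


def long_short_returns_inferred(df: pd.DataFrame) -> pd.DataFrame:
    """Long-short return per (signalname, date), with extreme ports inferred per signal."""
    out = df.copy()
    # Compare ports numerically so "05" and "10" order correctly.
    out["port_num"] = pd.to_numeric(out["port"], errors="coerce")
    out = out.dropna(subset=["port_num"])

    # Per-signal extremes: lowest port is treated as LONG, highest as SHORT,
    # mirroring the current 01/10 convention.
    bounds = out.groupby("signalname")["port_num"].agg(lo="min", hi="max").reset_index()
    out = out.merge(bounds, on="signalname")
    out = out[out["lo"] < out["hi"]]  # a single-port signal cannot form a spread

    keys = ["signalname", "date"]
    long_leg = out[out["port_num"] == out["lo"]].groupby(keys)["ret"].mean()
    short_leg = out[out["port_num"] == out["hi"]].groupby(keys)["ret"].mean()
    # Index alignment leaves NaN where one leg is missing for a date; drop those rows.
    return (long_leg - short_leg).dropna().rename("ls_return").reset_index()


# Hypothetical data: sigA is decile-organized, sigB is quintile-organized.
demo = pd.DataFrame(
    [
        ("sigA", "2024-01", "01", 0.03),
        ("sigA", "2024-01", "05", 0.02),
        ("sigA", "2024-01", "10", 0.01),
        ("sigB", "2024-01", "01", 0.02),
        ("sigB", "2024-01", "03", 0.00),
        ("sigB", "2024-01", "05", -0.01),
    ],
    columns=["signalname", "date", "port", "ret"],
)
result = long_short_returns_inferred(demo)
print(result)
```

With the hard-coded 01/10 filter, sigB would produce no rows; here both signals yield one long-short return each. Whether the lowest port is really the intended long leg for every signal is a separate question (see B).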

B. Investigate uniformly-negative Sharpe / failing DSR

22-of-22 low_dsr rejections are suspicious. Hypotheses:

  • Long-short returns are computed without sign-correction.
    Some OSAP signals are "anomalies" where the short leg
    outperforms the long leg (intentional naming inversion in
    the academic literature). The compute_osap_signals proxy
    may need sign-aware aggregation.
  • DSR threshold (> 0.95) is calibrated against actively-traded
    factors; OSAP's monthly returns may need a longer history
    for the haircut to clear.

Either confirm one of these hypotheses or relax the DSR threshold
(e.g., > 0.50 for scout-quality signals).
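
To help weigh the two hypotheses, here is a sketch of the probabilistic Sharpe ratio, the building block of the textbook deflated Sharpe ratio (DSR is this quantity evaluated against a benchmark Sharpe that grows with the number of trials). The repo's own DSR code is not shown in this issue, so the formula choice, the 240-month history, the normal-returns defaults, and the reading of the reported Sharpes as annualized are all assumptions.

```python
import math
from statistics import NormalDist


def probabilistic_sharpe(sr: float, n_obs: int, sr_benchmark: float = 0.0,
                         skew: float = 0.0, kurt: float = 3.0) -> float:
    """P(true Sharpe > sr_benchmark), with sr and sr_benchmark per period (not annualized)."""
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr_benchmark) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


N_MONTHS = 240  # hypothetical history length
# min / median / max Sharpe from the findings, assumed annualized.
for annual_sr in (-1.40, -0.55, 0.094):
    monthly_sr = annual_sr / math.sqrt(12)
    print(annual_sr, round(probabilistic_sharpe(monthly_sr, N_MONTHS), 4))
```

Under this formulation, any negative Sharpe scores below 0.5 even against a zero benchmark, and the lone positive signal (0.094) stays well under 0.95. If the repo's DSR has this shape, relaxing the threshold to > 0.50 would not by itself admit the negative-Sharpe signals; the sign-correction hypothesis would need checking first.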

C. Surface the silent-drop gap in diagnostics

Add an additional Metadata field:

  • osap_signals_dropped_no_long_short: list[str] | None

So Step 2's silent drops are visible alongside Step 1's
(missing_from_dataset) and Step 3's (gate_diagnostics).
After Part 2 lands, the manifest accounting equation should
balance: missing + dropped_no_ls + gated == 100.
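
One way the new field and the balance check could be computed, as a sketch over plain name collections. `account_for_manifest` and its arguments are hypothetical; the real pipeline would derive them from the raw frame, the long-short output, and the gate.

```python
def account_for_manifest(manifest, raw_signals, ls_signals, gate_diagnostics):
    """Split the manifest into Step 1, Step 2, and Step 3 buckets."""
    manifest_set = set(manifest)
    raw_set = manifest_set & set(raw_signals)
    missing = sorted(manifest_set - raw_set)            # Step 1: no column in the dataset
    dropped_no_ls = sorted(raw_set - set(ls_signals))   # Step 2: no long/short pair formed
    gated = sorted(manifest_set & set(gate_diagnostics))  # Step 3: gate verdict recorded
    return missing, dropped_no_ls, gated


# Synthetic names reproducing the 22 / 56 / 22 split from the findings.
manifest = [f"sig{i:03d}" for i in range(100)]
raw_signals = manifest[22:]       # 78 present in the raw frame
ls_signals = manifest[78:]        # 22 produced long-short returns
gate_diagnostics = {name: "low_dsr" for name in ls_signals}

missing, dropped_no_ls, gated = account_for_manifest(
    manifest, raw_signals, ls_signals, gate_diagnostics
)
print(len(missing), len(dropped_no_ls), len(gated))  # 22 56 22
assert len(missing) + len(dropped_no_ls) + len(gated) == len(manifest)
```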

D. Production verification

After Part 2 lands and one cron cycle completes:

  • `osap_signals_missing_from_dataset + osap_signals_dropped_no_long_short
    + osap_gate_diagnostics == 100` (manifest accounted for)
  • At least 1 signal in osap_signals_used (gate acceptance > 0)
  • Top-5 unchanged from raw composite ranking (Rule 16 lock holds)

Out-of-scope (still — original Part 1.5 brief)

  • Replacing the 50/50 blend weight with IC-calibrated weight
    (defer until Part 2 IC data exists)
  • Per-sector signal weighting
  • ML-driven signal selection (Phase 5)

Schema bump for Part 2

PATCH bump 0.9.1-phase4h.2 → 0.9.2-phase4h.2 for the new
osap_signals_dropped_no_long_short field (additive optional).
No new Pydantic model needed — list[str] | None is symmetric
to the existing missing-from-dataset field.

— updated 2026-05-19 post-cron by Phase 4 auditor session