Phase 4h.2 Part 1 SHIPPED → Production diagnostic data in hand
PR #118 merged 2026-05-19, schema bumped 0.9.0-phase4h → 0.9.1-phase4h.2,
new optional Metadata fields wired:
osap_signals_missing_from_dataset: list[str] | None
osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None
First production cron with these fields landed at commit 3da995dc
(2026-05-19 12:13 UTC, git_commit 182c02de). Production findings
below — Part 2 scope updated.
Production findings (2026-05-19 cron — first 0.9.1 run)
Manifest accounting
OSAP_SIGNALS_100 universe: 100
osap_signals_missing_from_dataset: 22 (Step 1 drops — missing column)
osap_excluded_signals (from gate): 22 (Step 3 drops — failed gate)
osap_signals_used: 0 (Step 4 survivors — empty)
osap_gate_diagnostics: 22 (per-signal verdicts)
────────────────────────────────────────────
ACCOUNTED: 44
UNACCOUNTED (silent-drop gap): 56 ← NEW finding
NEW: 56-signal silent-drop gap (Step 2)
Root cause — compute/features/osap_replicate.py::compute_long_short_returns
at L120 + L135-136 hard-codes the long/short ports as port=01 /
port=10 (decile-organized signals). OSAP delivers some signals
as quintiles, terciles, or alternative deciles where the SHORT
bucket is not labelled 10 — those signals' pivot has no
port=10 column, and lines L135-136 silently return an empty
DataFrame.
Code reference — compute/features/osap_replicate.py:91-140:
df = df[df["port"].isin([LONG_PORT_LABEL, SHORT_PORT_LABEL])]
pivot = df.pivot_table(index=["signalname", "date"], columns="port", values="ret", ...)
if LONG_PORT_LABEL not in pivot.columns or SHORT_PORT_LABEL not in pivot.columns:
return pd.DataFrame(columns=["signalname", "date", "ls_return"])
LONG_PORT_LABEL = "01" and SHORT_PORT_LABEL = "10" (both
constants at L60-65). Phase 4h ingest code at compute/ingest/osap.py:47
already flags this: "Phase 4h may switch to deciles_vw etc. once
the blend logic" — the switch never landed.
Why it's not visible in the diagnostic surface — the silent
drop happens before gate_osap_signals runs. The signals
appear neither in osap_signals_missing_from_dataset (they
exist in the raw frame) nor in osap_gate_diagnostics (they
never reach the gate). They simply vanish.
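The drop can be reproduced in isolation. The sketch below mirrors the filter, pivot, and guard sequence from the excerpt above. The function name `long_short_or_empty`, the toy data, the long-minus-short subtraction, and calling it on one signal's rows at a time are all assumptions made for illustration; the elided `pivot_table` arguments are left at pandas defaults.

```python
import pandas as pd

LONG_PORT_LABEL = "01"
SHORT_PORT_LABEL = "10"


def long_short_or_empty(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative mirror of the filter -> pivot -> guard sequence."""
    df = df[df["port"].isin([LONG_PORT_LABEL, SHORT_PORT_LABEL])]
    pivot = df.pivot_table(index=["signalname", "date"], columns="port", values="ret")
    if LONG_PORT_LABEL not in pivot.columns or SHORT_PORT_LABEL not in pivot.columns:
        # No error, no log line: the caller just receives zero rows.
        return pd.DataFrame(columns=["signalname", "date", "ls_return"])
    # Assumption: the spread is long minus short.
    ls = (pivot[LONG_PORT_LABEL] - pivot[SHORT_PORT_LABEL]).rename("ls_return")
    return ls.reset_index()


# Hypothetical quintile signal: the extreme bucket is "05", not "10".
quintile = pd.DataFrame({
    "signalname": ["QuintileSig"] * 4,
    "date": pd.to_datetime(["2024-01-31", "2024-01-31", "2024-02-29", "2024-02-29"]),
    "port": ["01", "05", "01", "05"],
    "ret": [0.010, 0.004, -0.002, 0.006],
})
# Same rows relabelled as a decile signal.
decile = quintile.assign(signalname="DecileSig", port=["01", "10", "01", "10"])

quintile_ls = long_short_or_empty(quintile)  # empty frame
decile_ls = long_short_or_empty(decile)      # one row per date
```

The quintile signal is present in the raw frame, so it is not a Step 1 drop, and it produces no rows for the gate, so it never gets a Step 3 verdict.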
Gate-reaching signals (22 of 100)
All 22 signals that reach the gate fail with rejection_reason='low_dsr' —
0% acceptance rate. Sharpe distribution:
- min: -1.40
- median: -0.55
- max: 0.094 (only 1 of 22 positive)
- PBO uniformly low (~0.027 — never the rejection cause)
Top 5 ranking unaffected (Rule 16 holds)
Top-5 still ranks by raw composite_score. osap_blended_score
is None for all 502 stocks because no signals survived. Path-b
blend with w=0.5 reduces to identity when the OSAP aggregate
is None.
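As a minimal sketch of why the ranking is unchanged, assume the path-b blend is a convex combination of the two scores and falls back to the raw score when there is no OSAP aggregate. The function name `ranking_score` and the exact blend formula are assumptions; only `w=0.5` and the identity behaviour come from the findings above.

```python
from typing import Optional


def ranking_score(composite_score: float,
                  osap_aggregate: Optional[float],
                  w: float = 0.5) -> float:
    """Hypothetical path-b blend; identity when no OSAP aggregate exists."""
    if osap_aggregate is None:
        return composite_score
    return (1.0 - w) * composite_score + w * osap_aggregate


# Toy scores (not production data): with no aggregate, the order is
# exactly the raw composite order.
raw = {"AAA": 0.91, "BBB": 0.72, "CCC": 0.85}
ranked = sorted(raw, key=lambda t: ranking_score(raw[t], None), reverse=True)
```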
Part 2 scope (UPDATED)
A. Fix the 56-signal silent-drop gap (NEW — was not in original Part 1.5 brief)
Three options:
1. Multi-port adapter: extend compute_long_short_returns
   to recognize quintile/tercile alternatives. Per-signal port
   inference: for each signalname, find min(port) and max(port);
   use those as LONG/SHORT. Adds ~20 LOC, dataset-agnostic.
2. OSAP port_type query: query OSAP for each signal's
   port structure metadata and dispatch. Requires upstream
   investigation; depends on openassetpricing API surface.
3. Manifest curation: drop the 56 non-decile signals from
   OSAP_SIGNALS_100 and re-publish as OSAP_SIGNALS_44 (or
   investigate which subset is decile-organized first). Smallest
   diff, but loses signal coverage.
Recommended: Option 1 (multi-port adapter). Restores 56
signals to the pipeline, gives the gate something to evaluate.
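A rough sketch of what Option 1 could look like, under stated assumptions: bucket labels are zero-padded digit strings (so string min/max matches numeric order), the lowest label is the long leg (matching `LONG_PORT_LABEL = "01"`), and the spread is long minus short. The function name `infer_long_short_returns` and the toy data are hypothetical; this is not the existing `compute_long_short_returns`.

```python
import pandas as pd


def infer_long_short_returns(df: pd.DataFrame) -> pd.DataFrame:
    """Per-signal long-short returns using each signal's own extreme ports."""
    empty = pd.DataFrame(columns=["signalname", "date", "ls_return"])
    # Assumption: only digit-string labels are portfolio buckets.
    df = df[df["port"].str.isdigit()]
    frames = []
    for name, grp in df.groupby("signalname"):
        long_label, short_label = grp["port"].min(), grp["port"].max()
        if long_label == short_label:
            continue  # a single bucket cannot form a spread
        pivot = grp.pivot_table(index="date", columns="port", values="ret")
        ls = (pivot[long_label] - pivot[short_label]).dropna()
        if ls.empty:
            continue
        frames.append(pd.DataFrame({
            "signalname": name,
            "date": ls.index,
            "ls_return": ls.to_numpy(),
        }))
    return pd.concat(frames, ignore_index=True) if frames else empty


# Toy frame: one decile signal and one quintile signal.
frame = pd.DataFrame({
    "signalname": ["DecileSig"] * 4 + ["QuintileSig"] * 4,
    "date": pd.to_datetime(
        ["2024-01-31", "2024-01-31", "2024-02-29", "2024-02-29"] * 2
    ),
    "port": ["01", "10", "01", "10", "01", "05", "01", "05"],
    "ret": [0.010, 0.004, -0.002, 0.006, 0.008, 0.001, 0.003, 0.005],
})
result = infer_long_short_returns(frame)
```

Both signals produce rows here, where the hard-coded `"10"` label would have dropped the quintile one. Signals skipped by the `continue` branches are still silent in this sketch; a real change would want to record them.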
B. Investigate uniformly-negative Sharpe / failing DSR
22-of-22 low_dsr rejections are suspicious. Hypotheses:
- Long-short returns are computed without sign-correction.
Some OSAP signals are "anomalies" where the short leg
outperforms the long leg (intentional naming inversion in
the academic literature). The compute_osap_signals proxy
may need sign-aware aggregation.
- DSR threshold (> 0.95) is calibrated against actively-traded
factors; OSAP's monthly returns may need a longer history
for the haircut to clear.
Either confirm one of these hypotheses or relax the DSR threshold
(e.g., > 0.50 for scout-quality signals).
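Before relaxing the threshold, a quick triage of the first hypothesis is possible from the numbers already in hand. The sketch below runs a one-sided sign test on the observed split (21 of 22 Sharpe ratios negative). It assumes the signals are independent and that a sign-neutral pipeline would give positive and negative Sharpe with equal probability; OSAP signals are correlated, so the result overstates the evidence and is a prompt to look, not proof. The function name `sign_test_tail` is hypothetical.

```python
from math import comb


def sign_test_tail(n_signals: int, n_negative: int) -> float:
    """P(at least n_negative negatives out of n_signals) when p = 0.5."""
    favourable = sum(comb(n_signals, k) for k in range(n_negative, n_signals + 1))
    return favourable / 2 ** n_signals


# Observed in the 2026-05-19 cron: 22 gated signals, only 1 positive Sharpe.
p_value = sign_test_tail(n_signals=22, n_negative=21)
```

The tail probability is 23 / 2^22, about 5.5e-6, so a sign-neutral pipeline is a poor explanation for the split. That is consistent with a systematic leg inversion, which makes the sign-correction hypothesis worth checking first.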
C. Surface the silent-drop gap in diagnostics
Add an additional Metadata field:
osap_signals_dropped_no_long_short: list[str] | None
So Step 2's silent drops are visible alongside Step 1's
(missing_from_dataset) and Step 3's (gate_diagnostics).
After Part 2 lands, the manifest accounting equation should
balance: missing + dropped_no_ls + gated == 100.
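One way to populate the field is to record the drop where it happens rather than derive it later by subtraction, which would make the equation balance by construction and hide any further gaps. A minimal sketch, assuming a hypothetical per-signal callable `compute_ls` that returns a sized collection of long-short rows:

```python
from typing import Callable, Dict, List, Sequence, Tuple


def split_by_long_short(
    signal_names: Sequence[str],
    compute_ls: Callable[[str], Sequence],
) -> Tuple[Dict[str, Sequence], List[str]]:
    """Separate signals with long-short rows from those with none."""
    kept: Dict[str, Sequence] = {}
    dropped: List[str] = []
    for name in signal_names:
        ls = compute_ls(name)
        if len(ls) == 0:
            dropped.append(name)  # would feed osap_signals_dropped_no_long_short
        else:
            kept[name] = ls
    return kept, sorted(dropped)


# Toy stand-in for the real computation (hypothetical signal names).
fake_ls = {"DecileSig": [0.006, -0.008], "QuintileSig": []}
kept, dropped = split_by_long_short(list(fake_ls), fake_ls.__getitem__)
```

`len()` works for both lists and pandas DataFrames, so the same shape should carry over if `compute_ls` returns frames.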
D. Production verification
After Part 2 lands and one cron cycle completes:
- `len(osap_signals_missing_from_dataset) + len(osap_signals_dropped_no_long_short) + len(osap_gate_diagnostics) == 100` (manifest accounted for)
- At least 1 signal in osap_signals_used (gate acceptance > 0)
- Top-5 unchanged from raw composite ranking (Rule 16 lock holds)
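The three checks above could be scripted against the published metadata roughly as follows. This is a sketch: it assumes the metadata is available as a plain dict keyed by the field names used in this issue, that `osap_signals_used` is a list, and that the caller supplies both top-5 lists. The function name `verify_part2` is hypothetical.

```python
from typing import Dict, List


def verify_part2(metadata: Dict, top5_raw: List[str], top5_published: List[str],
                 universe_size: int = 100) -> List[str]:
    """Return a list of failed checks; an empty list means all passed."""
    missing = metadata.get("osap_signals_missing_from_dataset") or []
    dropped = metadata.get("osap_signals_dropped_no_long_short") or []
    gated = metadata.get("osap_gate_diagnostics") or {}
    used = metadata.get("osap_signals_used") or []

    failures = []
    accounted = len(missing) + len(dropped) + len(gated)
    if accounted != universe_size:
        failures.append(f"accounting: {accounted} of {universe_size} signals accounted for")
    if not used:
        failures.append("gate: no signal in osap_signals_used")
    if top5_raw != top5_published:
        failures.append("rule16: published top-5 differs from raw composite top-5")
    return failures


# Shape of the 2026-05-19 run: 22 missing, 22 gated, nothing used.
current = {
    "osap_signals_missing_from_dataset": [f"m{i}" for i in range(22)],
    "osap_gate_diagnostics": {f"g{i}": {} for i in range(22)},
    "osap_signals_used": [],
}
current_failures = verify_part2(current, ["A", "B"], ["A", "B"])
```

Run against the current production shape, the accounting and gate checks fail and the Rule 16 check passes, which matches the findings above.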
Out-of-scope (still — original Part 1.5 brief)
- Replacing the 50/50 blend weight with IC-calibrated weight
(defer until Part 2 IC data exists)
- Per-sector signal weighting
- ML-driven signal selection (Phase 5)
Schema bump for Part 2
PATCH bump 0.9.1-phase4h.2 → 0.9.2-phase4h.2 for the new
osap_signals_dropped_no_long_short field (additive optional).
No new Pydantic model needed — list[str] | None is symmetric
to the existing missing-from-dataset field.
Related
— updated 2026-05-19 post-cron by Phase 4 auditor session