feat(observability): commit 1 — schema delta for OSAP gate diagnostics + silent-drop surface

claude · claude · commit 428729add21a · 2026-05-19T07:46:45.000Z
Phase 4h.2 Part 1 (issue #116), commit 1 of 3. Schema delta only; orchestration wiring lands in commits 2 (silent-drop) + 3 (gate diagnostics).

**SCHEMA_VERSION** ``0.9.0-phase4h`` → ``0.9.1-phase4h.2``. PATCH bump per SKILL.md L305 lock: "Add a new optional field (default = None) → patch". Both new fields are ``| None = None`` additive optional; legacy 0.9.0-phase4h JSONs deserialize cleanly (verified by ``tests/test_output/test_schema_phase4h2.py::test_metadata_backward_compat_with_0_9_0_payload``).

**New Pydantic model** (``compute/output/schemas.py``):
- ``OsapGateDiagnostic``: 4 nullable fields per the locked refinement: ``pbo`` · ``dsr`` · ``sharpe`` · ``rejection_reason``. All default to ``None`` (no positional-required fields) so per-signal diagnostics serialize cleanly even for accepted signals (where ``rejection_reason`` is ``None``).
- Mirrored in ``frontend/lib/types.ts::OsapGateDiagnostic`` + regenerated ``frontend/lib/schema-snapshot.json``.

**Metadata additions** (both ``| None = None`` per the schema-versioning rule):
- ``osap_signals_missing_from_dataset: list[str] | None``: surfaces the silent-drop bug from #116. Production today shows 78/100 manifest signals missing from the dataset surface; commit 2 will populate this field via ``compute_missing_signals`` in ``compute/features/osap_replicate.py``.
- ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None``: surfaces per-signal PBO/DSR/Sharpe/rejection_reason. Production today shows 22 signals reaching the gate with 0% acceptance; commit 3 will populate this dict via the existing ``gate_results`` output of ``compute/validation/osap_validation.py::gate_osap_signals``.

**Registry updates**:
- ``compute/output/schema_check.py::TRACKED_MODELS``: added ``OsapGateDiagnostic`` so the BaseModel-subclass tracking test (``tests/test_output/test_schema_check.py::test_A5_tracked_models_count_matches_schemas_module``) doesn't flag the new class as untracked.
- ``tests/test_config.py``: SCHEMA_VERSION pin updated to ``0.9.1-phase4h.2``.
- ``tests/test_smoke.py``: unchanged (``startswith("0.9.")`` still passes).

**Tests** (``tests/test_output/test_schema_phase4h2.py``, 7 offline):
1. ``test_osap_gate_diagnostic_round_trip_with_all_fields``: full field round-trip.
2. ``test_osap_gate_diagnostic_all_fields_default_to_none``: empty construction validates per refinement #3 lock.
3. ``test_osap_gate_diagnostic_rejection_reason_taxonomy``: canonical 4-value taxonomy round-trips (``high_pbo`` / ``low_dsr`` / ``insufficient_data`` / ``gate_failed``).
4. ``test_metadata_round_trip_with_new_fields_populated``: end-to-end Metadata with both new fields filled.
5. ``test_metadata_backward_compat_with_0_9_0_payload``: legacy payload deserializes; new fields default to ``None``.
6. ``test_metadata_new_fields_default_to_none``: verbose restatement of the additive-optional contract.
7. ``test_metadata_extra_forbid_rejects_unknown_fields``: locks the schema surface against silent field renames.

**Verification**:
- ``ruff check .`` → clean
- ``python -m compute.output.schema_check`` → in-sync (Python ↔ TypeScript ↔ snapshot)
- ``pytest tests/ -m "not network"`` → **918 passed** (911 prior + 7 new)

**Out of scope this commit** (commits 2 + 3):
- ``compute/features/osap_replicate.py`` set-diff helper for the silent-drop list (commit 2)
- ``compute/main.py`` orchestration wiring both fields (commits 2 + 3)
- ``compute/main.py`` unit test for silent-drop pass-through (commit 2)
- Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) (commit 3)

**Defense layer**: unchanged at 17.
**Top-5 rotation**: unchanged (Rule 16 lock, observability-only).

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
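
The set-diff helper deferred to commit 2 could look roughly like the sketch below. Only the name ``compute_missing_signals`` comes from this message; the signature, the order-preserving and de-duplicating behaviour, and the sample signal lists are assumptions for illustration, not the planned implementation.

```python
from __future__ import annotations

from collections.abc import Iterable


def compute_missing_signals(
    manifest: Iterable[str], dataset_signals: Iterable[str]
) -> list[str]:
    """Return manifest entries absent from the dataset, in manifest order.

    Hypothetical signature: the real helper in
    compute/features/osap_replicate.py may differ.
    """
    present = set(dataset_signals)
    seen: set[str] = set()
    missing: list[str] = []
    for name in manifest:
        # Skip names the dataset has, and avoid reporting a duplicate twice.
        if name not in present and name not in seen:
            missing.append(name)
            seen.add(name)
    return missing


# Illustrative inputs only; the production manifest is OSAP_SIGNALS_100.
manifest = ["AOP", "AccrualsBM", "BM", "ChEQ", "Mom12m"]
dataset = ["BM", "Mom12m"]
missing = compute_missing_signals(manifest, dataset)
print(missing)  # ['AOP', 'AccrualsBM', 'ChEQ']
```

Keeping manifest order (rather than returning a bare ``set``) would make the resulting ``osap_signals_missing_from_dataset`` list stable across runs, which keeps JSON diffs readable.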
diff --git a/compute/config.py b/compute/config.py
@@ -27,7 +27,7 @@
 MODELS_DIR: Path = PROJECT_ROOT / "models"
 
 UNIVERSE: str = "SP500"
-SCHEMA_VERSION: str = "0.9.0-phase4h"
+SCHEMA_VERSION: str = "0.9.1-phase4h.2"
 
 PRICES_PERIOD: str = "5y"
 MAX_PARALLEL_FETCHES: int = 10
diff --git a/compute/output/schema_check.py b/compute/output/schema_check.py
@@ -42,6 +42,7 @@
 from compute.output.schemas import (
     DataQuality,
     Metadata,
+    OsapGateDiagnostic,
     PillarBaseline,
     PillarScores,
     RawMetrics,
@@ -54,6 +55,7 @@
 TRACKED_MODELS: list[type[BaseModel]] = [
     DataQuality,
     Metadata,
+    OsapGateDiagnostic,
     PillarBaseline,
     PillarScores,
     RawMetrics,
diff --git a/compute/output/schemas.py b/compute/output/schemas.py
@@ -77,6 +77,28 @@ class StockSummary(BaseModel):
     exited_top5: bool = False
 
 
+class OsapGateDiagnostic(BaseModel):
+    """Per-signal PBO/DSR gate decision surfaced into
+    ``Metadata.osap_gate_diagnostics``. Phase 4h.2 Part 1 observability
+    addition (issue #116) — lets future debugging answer "why did this
+    signal reject?" without a local re-run of the PBO/DSR cohort.
+
+    All 4 fields default to ``None`` so partially populated diagnostics
+    still validate. ``rejection_reason`` taxonomy mirrors
+    ``compute/validation/osap_validation.py::GateResult.rejection_reason``:
+    one of ``"high_pbo"`` / ``"low_dsr"`` / ``"insufficient_data"`` /
+    ``"gate_failed"`` for rejected signals; ``None`` for accepted
+    signals.
+    """
+
+    model_config = ConfigDict(extra="forbid")
+
+    pbo: float | None = None
+    dsr: float | None = None
+    sharpe: float | None = None
+    rejection_reason: str | None = None
+
+
 class Metadata(BaseModel):
     model_config = ConfigDict(extra="forbid")
 
@@ -96,6 +118,15 @@ class Metadata(BaseModel):
     osap_excluded_signals: list[str] | None = None
     osap_signals_ic_12m: dict[str, float] | None = None
     osap_signals_coverage_pct: dict[str, float] | None = None
+    # Phase 4h.2 Part 1 — observability for the manifest-vs-dataset gap
+    # and per-signal gate decisions surfaced by issue #116.
+    # ``osap_signals_missing_from_dataset`` lists ``OSAP_SIGNALS_100``
+    # entries that the OSAP fetch returned no rows for (silent drop in
+    # 0.9.0-phase4h; visible here). ``osap_gate_diagnostics`` carries
+    # the per-signal PBO/DSR/Sharpe/rejection_reason for every signal
+    # that reached the gate.
+    osap_signals_missing_from_dataset: list[str] | None = None
+    osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None = None
 
 
 class RawMetrics(BaseModel):
diff --git a/frontend/lib/schema-snapshot.json b/frontend/lib/schema-snapshot.json
@@ -72,6 +72,11 @@
       "required": false,
       "default": null
     },
+    "osap_gate_diagnostics": {
+      "type": "dict[str, OsapGateDiagnostic] | None",
+      "required": false,
+      "default": null
+    },
     "osap_signals_coverage_pct": {
       "type": "dict[str, float] | None",
       "required": false,
@@ -82,6 +87,11 @@
       "required": false,
       "default": null
     },
+    "osap_signals_missing_from_dataset": {
+      "type": "list[str] | None",
+      "required": false,
+      "default": null
+    },
     "osap_signals_used": {
       "type": "list[str] | None",
       "required": false,
@@ -108,6 +118,28 @@
       "default": "<required>"
     }
   },
+  "OsapGateDiagnostic": {
+    "dsr": {
+      "type": "float | None",
+      "required": false,
+      "default": null
+    },
+    "pbo": {
+      "type": "float | None",
+      "required": false,
+      "default": null
+    },
+    "rejection_reason": {
+      "type": "str | None",
+      "required": false,
+      "default": null
+    },
+    "sharpe": {
+      "type": "float | None",
+      "required": false,
+      "default": null
+    }
+  },
   "PillarBaseline": {
     "label": {
       "type": "str",
diff --git a/frontend/lib/types.ts b/frontend/lib/types.ts
@@ -82,6 +82,28 @@ export type Metadata = {
   osap_excluded_signals: string[] | null;
   osap_signals_ic_12m: Record<string, number> | null;
   osap_signals_coverage_pct: Record<string, number> | null;
+  // Phase 4h.2 Part 1 — observability for the manifest-vs-dataset gap
+  // and per-signal gate decisions surfaced by issue #116.
+  // `osap_signals_missing_from_dataset` lists OSAP_SIGNALS_100 entries
+  // that the OSAP fetch returned no rows for (silent drops in
+  // 0.9.0-phase4h; visible here). `osap_gate_diagnostics` carries the
+  // per-signal PBO/DSR/Sharpe/rejection_reason for every signal that
+  // reached the gate. Both null on legacy outputs from before
+  // 0.9.1-phase4h.2.
+  osap_signals_missing_from_dataset: string[] | null;
+  osap_gate_diagnostics: Record<string, OsapGateDiagnostic> | null;
+};
+
+// Phase 4h.2 Part 1 — per-signal gate decision shape. Mirrors
+// `compute/output/schemas.py::OsapGateDiagnostic`. All 4 fields nullable
+// so partially populated diagnostics type-check. `rejection_reason` is one
+// of "high_pbo" / "low_dsr" / "insufficient_data" / "gate_failed" for
+// rejected signals; null for accepted signals.
+export type OsapGateDiagnostic = {
+  pbo: number | null;
+  dsr: number | null;
+  sharpe: number | null;
+  rejection_reason: string | null;
 };
 
 // Phase 3d Tier-2 event defenses. Surfaces in StockDetail.tier2_events.
diff --git a/tests/test_config.py b/tests/test_config.py
@@ -10,8 +10,8 @@
 from compute import config
 
 
-def test_schema_version_is_phase4h():
-    assert config.SCHEMA_VERSION == "0.9.0-phase4h"
+def test_schema_version_is_phase4h_2():
+    assert config.SCHEMA_VERSION == "0.9.1-phase4h.2"
 
 
 def test_eight_k_lookback_veto_is_one_year():
diff --git a/tests/test_output/test_schema_phase4h2.py b/tests/test_output/test_schema_phase4h2.py
@@ -0,0 +1,141 @@
+"""Pydantic round-trip + legacy backward-compat tests for the
+Phase 4h.2 Part 1 schema additions (issue #116).
+
+Two new optional fields land on ``Metadata``:
+
+- ``osap_signals_missing_from_dataset: list[str] | None = None``
+- ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None = None``
+
+Plus a new nested model ``OsapGateDiagnostic`` with 4 nullable fields.
+All defaults are ``None`` so a legacy 0.9.0-phase4h ``metadata.json``
+deserializes cleanly with the new fields set to ``None`` — the
+canonical "additive optional field → PATCH bump" pattern from
+SKILL.md L305 ("Add a new optional field (default = None) → patch").
+"""
+
+from __future__ import annotations
+
+from compute.output.schemas import Metadata, OsapGateDiagnostic
+
+
+def _legacy_0_9_0_metadata_payload() -> dict:
+    """A canonical 0.9.0-phase4h metadata.json payload — the production
+    shape *before* this PR's additions. Used to assert backward-compat.
+    """
+    return {
+        "version": "0.9.0-phase4h",
+        "last_update_utc": "2026-05-18T22:00:00Z",
+        "next_update_utc": "2026-05-25T22:00:00Z",
+        "universe": "sp500",
+        "universe_size": 502,
+        "compute_run_id": "local",
+        "git_commit": "fbd1acf461847d835967bd701a098af93c9d2bd7",
+        "mos_trailing_ic_smoke": 0.05,
+        "tier2_coverage_pct": 0.97,
+        "fundamentals_coverage_pct": 0.99,
+        "fundamentals_latency_p50_seconds": 1.2,
+        "fundamentals_latency_p95_seconds": 5.4,
+        "osap_signals_used": None,
+        "osap_excluded_signals": [
+            "AbnormalAccruals", "AssetGrowth", "BM", "BetaFP", "CF",
+        ],
+        "osap_signals_ic_12m": None,
+        "osap_signals_coverage_pct": None,
+    }
+
+
+def test_osap_gate_diagnostic_round_trip_with_all_fields():
+    """Accepted-signal path: numeric fields set, round-trip via model_dump."""
+    diag = OsapGateDiagnostic(
+        pbo=0.45,
+        dsr=0.12,
+        sharpe=0.34,
+        rejection_reason=None,
+    )
+    payload = diag.model_dump()
+    restored = OsapGateDiagnostic.model_validate(payload)
+    assert restored == diag
+
+
+def test_osap_gate_diagnostic_all_fields_default_to_none():
+    """Empty construction (zero args) — every field is None.
+    Per refinement #3: all 4 fields explicit ``= None`` defaults."""
+    diag = OsapGateDiagnostic()
+    assert diag.pbo is None
+    assert diag.dsr is None
+    assert diag.sharpe is None
+    assert diag.rejection_reason is None
+
+
+def test_osap_gate_diagnostic_rejection_reason_taxonomy():
+    """Mirrors ``compute/validation/osap_validation.py::GateResult``
+    rejection_reason values. The model accepts any str (no Literal
+    constraint), but the canonical taxonomy should round-trip cleanly.
+    """
+    for reason in ("high_pbo", "low_dsr", "insufficient_data", "gate_failed"):
+        diag = OsapGateDiagnostic(rejection_reason=reason)
+        restored = OsapGateDiagnostic.model_validate(diag.model_dump())
+        assert restored.rejection_reason == reason
+
+
+def test_metadata_round_trip_with_new_fields_populated():
+    """End-to-end: Metadata with both new fields populated."""
+    payload = _legacy_0_9_0_metadata_payload()
+    payload["osap_signals_missing_from_dataset"] = ["AOP", "AccrualsBM", "ChEQ"]
+    payload["osap_gate_diagnostics"] = {
+        "BM": {"pbo": 0.6, "dsr": -0.1, "sharpe": 0.05, "rejection_reason": "high_pbo"},
+        "Mom12m": {"pbo": 0.4, "dsr": -0.3, "sharpe": 0.2, "rejection_reason": "low_dsr"},
+    }
+    meta = Metadata.model_validate(payload)
+    assert meta.osap_signals_missing_from_dataset == ["AOP", "AccrualsBM", "ChEQ"]
+    assert isinstance(meta.osap_gate_diagnostics, dict)
+    assert meta.osap_gate_diagnostics["BM"].pbo == 0.6
+    assert meta.osap_gate_diagnostics["BM"].rejection_reason == "high_pbo"
+    assert meta.osap_gate_diagnostics["Mom12m"].dsr == -0.3
+
+    restored = Metadata.model_validate(meta.model_dump())
+    assert restored == meta
+
+
+def test_metadata_backward_compat_with_0_9_0_payload():
+    """Legacy 0.9.0-phase4h JSON (no new fields) deserializes cleanly
+    — the backward-compat guarantee that justifies a PATCH bump per
+    SKILL.md L305 ('Add a new optional field (default = None) →
+    patch').
+    """
+    legacy_payload = _legacy_0_9_0_metadata_payload()
+    assert "osap_signals_missing_from_dataset" not in legacy_payload
+    assert "osap_gate_diagnostics" not in legacy_payload
+
+    meta = Metadata.model_validate(legacy_payload)
+    assert meta.version == "0.9.0-phase4h"
+    assert meta.osap_signals_missing_from_dataset is None
+    assert meta.osap_gate_diagnostics is None
+    # Existing Phase 4h fields still populated.
+    assert meta.osap_excluded_signals == [
+        "AbnormalAccruals", "AssetGrowth", "BM", "BetaFP", "CF",
+    ]
+
+
+def test_metadata_new_fields_default_to_none():
+    """Constructing a Metadata without supplying the new fields leaves
+    them at None — same semantics as every other Phase-4h OSAP field."""
+    payload = _legacy_0_9_0_metadata_payload()
+    meta = Metadata.model_validate(payload)
+    assert meta.osap_signals_missing_from_dataset is None
+    assert meta.osap_gate_diagnostics is None
+
+
+def test_metadata_extra_forbid_rejects_unknown_fields():
+    """``model_config = ConfigDict(extra='forbid')`` catches typo'd
+    field names. Locks the schema surface so future refactors that
+    rename ``osap_signals_missing_from_dataset`` (e.g., to
+    ``osap_manifest_missing``) raise instead of silently producing
+    a no-op field."""
+    import pytest as _pytest
+    from pydantic import ValidationError
+
+    payload = _legacy_0_9_0_metadata_payload()
+    payload["osap_signals_missing"] = ["typo_field_name"]  # not the real field
+    with _pytest.raises(ValidationError):
+        Metadata.model_validate(payload)
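
The payload shape exercised by ``test_metadata_round_trip_with_new_fields_populated`` (a dict keyed by signal name, each value carrying ``pbo`` / ``dsr`` / ``sharpe`` / ``rejection_reason``) is what commit 3 would need to produce from the gate output. A minimal sketch of that mapping follows. ``GateResultStub`` and ``to_gate_diagnostics`` are hypothetical names, and the stub's fields are assumptions: only ``rejection_reason`` is confirmed by the docstrings in this diff. The taxonomy guard is an extra illustration; the schema itself accepts any string.

```python
from __future__ import annotations

from dataclasses import dataclass

# Canonical taxonomy named in the OsapGateDiagnostic docstring.
REJECTION_REASONS = frozenset(
    {"high_pbo", "low_dsr", "insufficient_data", "gate_failed"}
)


@dataclass(frozen=True)
class GateResultStub:
    """Hypothetical stand-in for GateResult; field names are assumptions."""

    signal: str
    pbo: float | None
    dsr: float | None
    sharpe: float | None
    rejection_reason: str | None  # None means the signal was accepted


def to_gate_diagnostics(results: list[GateResultStub]) -> dict[str, dict]:
    """Build a plain-dict payload in the ``osap_gate_diagnostics`` shape."""
    out: dict[str, dict] = {}
    for r in results:
        reason = r.rejection_reason
        # Optional guard, not part of the schema: flag off-taxonomy reasons.
        if reason is not None and reason not in REJECTION_REASONS:
            raise ValueError(f"unknown rejection_reason: {reason!r}")
        out[r.signal] = {
            "pbo": r.pbo,
            "dsr": r.dsr,
            "sharpe": r.sharpe,
            "rejection_reason": reason,
        }
    return out


# Values copied from the round-trip test payload above.
results = [
    GateResultStub("BM", 0.6, -0.1, 0.05, "high_pbo"),
    GateResultStub("Mom12m", 0.4, -0.3, 0.2, "low_dsr"),
]
diagnostics = to_gate_diagnostics(results)
```

The resulting dict could be passed as ``payload["osap_gate_diagnostics"]`` before ``Metadata.model_validate(payload)``, assuming the real ``GateResult`` exposes comparable values.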