feat(observability): Phase 4h.2 Part 1 — OSAP gate diagnostics + silent-drop metadata surface (#116)#118
Merged
Conversation
…s + silent-drop surface Phase 4h.2 Part 1 (issue #116) — commit 1 of 3. Schema delta only; orchestration wiring lands in commits 2 (silent-drop) + 3 (gate diagnostics). **SCHEMA_VERSION** ``0.9.0-phase4h`` → ``0.9.1-phase4h.2``. PATCH bump per SKILL.md L305 lock: "Add a new optional field (default = None) → patch". Both new fields are ``| None = None`` additive optional; legacy 0.9.0-phase4h JSONs deserialize cleanly (verified by ``tests/test_output/test_schema_phase4h2.py:: test_metadata_backward_compat_with_0_9_0_payload``). **New Pydantic model** (``compute/output/schemas.py``): - ``OsapGateDiagnostic`` — 4 nullable fields per the locked refinement: ``pbo`` · ``dsr`` · ``sharpe`` · ``rejection_reason``. All default to ``None`` (no positional-required fields) so per-signal diagnostics serialize cleanly even for accepted signals (where ``rejection_reason`` is ``None``). - Mirrored in ``frontend/lib/types.ts::OsapGateDiagnostic`` + regenerated ``frontend/lib/schema-snapshot.json``. **Metadata additions** (both ``| None = None`` per the schema- versioning rule): - ``osap_signals_missing_from_dataset: list[str] | None`` — surfaces the silent-drop bug from #116. Production today shows 78/100 manifest signals missing from the dataset surface; commit 2 will populate this field via ``compute_missing_signals`` in ``compute/features/osap_replicate.py``. - ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None`` — surfaces per-signal PBO/DSR/Sharpe/rejection_reason. Production today shows 22 signals reaching the gate with 0% acceptance; commit 3 will populate this dict via the existing ``gate_results`` output of ``compute/validation/osap_validation.py:: gate_osap_signals``. **Registry updates**: - ``compute/output/schema_check.py::TRACKED_MODELS`` — added ``OsapGateDiagnostic`` so the BaseModel-subclass tracking test (``tests/test_output/test_schema_check.py::test_A5_tracked_ models_count_matches_schemas_module``) doesn't flag the new class as untracked. - ``tests/test_config.py`` — SCHEMA_VERSION pin updated to ``0.9.1-phase4h.2``. - ``tests/test_smoke.py`` — unchanged (``startswith("0.9.")`` still passes). **Tests** (``tests/test_output/test_schema_phase4h2.py``, 7 offline): 1. ``test_osap_gate_diagnostic_round_trip_with_all_fields`` — full field round-trip. 2. ``test_osap_gate_diagnostic_all_fields_default_to_none`` — empty construction validates per refinement #3 lock. 3. ``test_osap_gate_diagnostic_rejection_reason_taxonomy`` — canonical 4-value taxonomy round-trips (``high_pbo`` / ``low_dsr`` / ``insufficient_data`` / ``gate_failed``). 4. ``test_metadata_round_trip_with_new_fields_populated`` — end-to- end Metadata with both new fields filled. 5. ``test_metadata_backward_compat_with_0_9_0_payload`` — legacy payload deserializes; new fields default to ``None``. 6. ``test_metadata_new_fields_default_to_none`` — verbose restatement of the additive-optional contract. 7. ``test_metadata_extra_forbid_rejects_unknown_fields`` — locks the schema surface against silent field renames. **Verification**: - ``ruff check .`` → clean - ``python -m compute.output.schema_check`` → in-sync (Python ↔ TypeScript ↔ snapshot) - ``pytest tests/ -m "not network"`` → **918 passed** (911 prior + 7 new) **Out of scope this commit** (commits 2 + 3): - ``compute/features/osap_replicate.py`` set-diff helper for the silent-drop list (commit 2) - ``compute/main.py`` orchestration wiring both fields (commits 2 + 3) - ``compute/main.py`` unit test for silent-drop pass-through (commit 2) - Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) (commit 3) **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged (Rule 16 lock — observability-only). https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
…wiring Phase 4h.2 Part 1 (issue #116) — commit 2 of 3. Populates the ``osap_signals_missing_from_dataset`` field landed in commit 1's schema delta. Closes the observability gap that hid 78/100 manifest signals in the first production 0.9.0-phase4h run. **Refinement 4 decision (locked)**: helper lives in ``compute/features/osap_replicate.py``, not inline in ``compute/main.py``. Rationale: - Unit-testable in isolation (no orchestrator setup) - Mirrors the existing ``coverage_by_signal`` helper one function above — same pure-helper shape (no I/O, no logging, no side effects) - Keeps ``compute/main.py`` orchestration thin **Helper** (``compute/features/osap_replicate.py``, +31 LOC): ```python def signals_in_dataframe(df: pd.DataFrame) -> frozenset[str]: """Return unique ``signalname`` values present in an OSAP returns DataFrame. Phase 4h.2 Part 1 helper (issue #116)...""" if df.empty or "signalname" not in df.columns: return frozenset() return frozenset(df["signalname"].unique().tolist()) ``` Defensive against empty DataFrame AND missing ``signalname`` column — both yield empty frozenset so the caller's set diff reports the full manifest as missing (safe-by-default). **Wiring** (``compute/main.py``, +27 LOC): 1. Import added to the existing ``from compute.features.osap_replicate import (...)`` block (4-symbol import). 2. Variable initialized BEFORE the OSAP try block alongside the other observability accumulators: ``osap_signals_missing_from_dataset: list[str] = []``. 3. Inside the try, after ``fetch_osap_returns`` returns: ```python present_signals = signals_in_dataframe(osap_returns_raw) osap_signals_missing_from_dataset = sorted( set(config.OSAP_SIGNALS_100) - present_signals ) if osap_signals_missing_from_dataset: logger.warning( "OSAP manifest signals not in dataset: %d/%d missing " "(first 5: %s)", len(osap_signals_missing_from_dataset), len(config.OSAP_SIGNALS_100), osap_signals_missing_from_dataset[:5], ) ``` Warning fires only when the set diff is non-empty (no log spam on a clean cron). 4. Reset to ``[]`` in the OSAP-pipeline-failed ``except`` branch so graceful degradation leaves every osap_* field at ``None``. 5. Wired into the ``Metadata(...)`` constructor with the same ``or None`` idiom Phase 4h established: ```python osap_signals_missing_from_dataset=( osap_signals_missing_from_dataset or None ), ``` **Tests** (``tests/test_features/test_osap_replicate.py``, +58 LOC, 4 new offline tests appended to the Phase 4h commit-2 suite): 1. ``test_signals_in_dataframe_empty_returns_empty_frozenset`` — empty DataFrame (correct schema, zero rows) → empty frozenset. 2. ``test_signals_in_dataframe_no_signalname_column_returns_empty_frozenset`` — defensive against the schema-drift case where the ``signalname`` column itself disappears upstream. 3. ``test_signals_in_dataframe_unique_signals_dedup`` — multi-row input with duplicates → set dedups. 4. ``test_signals_in_dataframe_setdiff_with_manifest_simulates_silent_drop`` — end-to-end simulation of the issue-#116 silent-drop: 5-signal manifest, 2-signal dataset, set diff = sorted missing-3 (``["AOP", "AccrualsBM", "ChEQ"]``). Helper-level integration tests are sufficient for commit 2; ``compute/main.py`` orchestration-test deferred to the next weekly cron (the production output IS the integration test). ``compute/main.py`` import smoke verified locally: ``python -c "from compute.main import run_weekly_compute; ..."`` → OK. **Verification**: - ``ruff check .`` → clean - ``python -m compute.output.schema_check`` → in-sync (NO schema delta this commit — field already landed in commit 1) - ``python -c "from compute.main import run_weekly_compute, ..., signals_in_dataframe"`` → OK - ``pytest tests/ -m "not network"`` → **922 passed** (918 prior + 4 new helper tests) **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged (Rule 16 lock — observability-only). **Out of scope this commit** (commit 3): - ``osap_gate_diagnostics`` orchestration (per-signal PBO/DSR/ Sharpe/rejection_reason populated from ``gate_results``) - Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) - PR body + open Draft PR + STOP at verification ladder step 8 **Production observation after this lands**: Once the next weekly cron runs ``0.9.1-phase4h.2``, the ``osap_signals_missing_from_dataset`` field will surface the 78 silent-drop signals from #116 (or whatever the current count is — manifest reconciliation in a later sub-PR may reduce it). https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
**FINAL** commit of the Phase 4h.2 Part 1 3-commit cluster (issue #116). Populates the ``osap_gate_diagnostics`` field landed in commit 1's schema delta + docs the full Part-1 surface so reviewers + future maintainers see the schema and observability contract in one place. **`compute/main.py` wiring** (+23 LOC): 1. Import added to ``from compute.output.schemas import (...)``: ``OsapGateDiagnostic`` inserted alphabetically between ``Metadata`` and ``PillarScores`` (schemas import already used at this site, no new module touched). 2. Variable initialized BEFORE the OSAP try block: ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] = {}``. 3. Populated inside the try after ``gate_results = gate_osap_signals(osap_ls, requested_signals= config.OSAP_SIGNALS_100)`` and BEFORE ``filter_accepted_signals`` — captures EVERY signal that reached the gate (both accepted and rejected). Accepted carry ``rejection_reason=None``; rejected carry one of the canonical taxonomy values (``high_pbo`` / ``low_dsr`` / ``insufficient_data`` / ``gate_failed``) per ``compute/validation/osap_validation.py::GateResult``. 4. Reset to ``{}`` in the OSAP-pipeline-failed ``except`` branch so graceful degradation continues to leave every osap_* field at ``None``. 5. Wired into the ``Metadata(...)`` constructor with the established ``or None`` idiom: ```python osap_gate_diagnostics=osap_gate_diagnostics or None, ``` **Tests** (``tests/test_output/test_schema_phase4h2.py``, +55 LOC, 2 new offline appended to commit 1's suite): 1. ``test_metadata_gate_diagnostics_round_trip_with_production_cohort_shape`` — simulates the production observation from #116 (22 signals reach the gate, all rejected with a mix of rejection_reason values across the canonical 4-value taxonomy); asserts the dict-of-OsapGateDiagnostic structure survives ``model_validate`` → ``model_dump`` → ``model_validate`` round-trip. 2. ``test_metadata_gate_diagnostics_accepted_signal_has_null_rejection_reason`` — locks the ``rejection_reason=None`` semantics for accepted signals (Pydantic preserves None rather than coercing to a sentinel string). **Docs** (atomic with the wiring): - ``CLAUDE.md`` ``## Phase status`` — schema line updated to ``0.9.1-phase4h.2`` with the PATCH-bump framing; preserved the prior MINOR-bump history (`0.8.0-phase4.5f` → `0.9.0-phase4h` via PR #112). - ``PHASE_STATUS.md`` row 4 — Phase 4h.2 Part 1 sub-status added; describes both new fields, the Part-1 / Part-2 split rationale ("Part 2 opens after ≥1 week of production diagnostic data accumulates"), and the "no new veto / no rank change" invariant. - ``SKILL.md`` schema-versions table — new row for ``0.9.1-phase4h.2`` inserted above the ``0.9.0-phase4h`` row; cites the SKILL.md L305 PATCH-bump quote verbatim, locks the ``OsapGateDiagnostic`` "all 4 fields explicit = None" refinement in writing, and documents the set-diff helper placement decision (``compute/features/osap_replicate.py::signals_in_dataframe`` per refinement #4). - ``WORKFLOW.md`` — unchanged; no "Open items" checkbox list for Phase 4h.2 yet (would be created when Part 2 is scoped). **Verification ladder** (steps 1-5 complete): - ``ruff check .`` → clean ✅ - ``pytest tests/ -m "not network"`` → **924 passed** (911 baseline + 13 new across the 3-commit cluster: 7 schema + 4 helper + 2 gate-diagnostic) ✅ - ``python -m compute.output.schema_check`` → in-sync (no new schema delta this commit; the snapshot already captured both fields + ``OsapGateDiagnostic`` from commit 1's regen) ✅ - ``python -c "from compute.main import run_weekly_compute; from compute.output.schemas import OsapGateDiagnostic; ..."`` → OK ✅ Steps 6-8 next: ``git push`` → open Draft PR → ``subscribe_pr_activity`` + STOP for user audit + Mark-Ready authorization. **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged. **Schema version**: ``0.9.1-phase4h.2`` (locked from commit 1). **Cluster summary**: | # | SHA | LOC | Tests added | |---|---|---|---| | 1 — schema delta | ``428729ad`` | 231 | +7 (round-trip + backward-compat) | | 2 — silent-drop wiring | ``c7949403`` | 116 | +4 (helper unit tests) | | 3 — gate diagnostics + docs (this) | TBD | ~86 | +2 (gate-diag round-trip) | | **Total** | — | ~433 | **+13** | Within the Option-β diagnostic-first scope (~250-350 LOC budget; + docs); under the original plan's ~300 LOC estimate. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
dackclup
added a commit
that referenced
this pull request
May 19, 2026
…manifest + 6 offline tests (#119) Phase 4j scout PR — 3rd of 4 factor-library scouts (OSAP ✅ #110, JKP ✅ #114, Qlib THIS, IPCA next as 4k). Ships `pyqlib` install + Alpha158 158-feature manifest + 6 offline tests. NO production wiring; yfinance-to-Qlib BYO adapter + full Alpha158 compute on 502-ticker universe deferred to follow-on integration PR. 5 pre-plan investigations (all verified 2026-05-19): 1. PyPI package: `pyqlib` 0.9.7 (canonical). Alternative names (`qlib`, `microsoft-qlib`) return 404. 2. License: MIT via wheel METADATA classifier. No CC BY-NC complication like JKP — safe for Phase 6+ commercial roadmap. 3. Data init: `qlib.init(provider_uri=..., region="us")`. NO public US data bundle — Qlib's default covers CN A-share only; US universe is BYO via local .bin files. 4. Alpha158 surface: `qlib.contrib.data.handler.Alpha158` → 158 columns; manifest captured via `Alpha158DL.get_feature_config()[1]` and hardcoded; offline test 3 locks against upstream drift. 5. CI install footprint: ~150-180 MB net-new (mlflow / lightgbm / cvxpy / pymongo / redis / gym / jupyter + nbconvert transitives). One-time cold-start; pip wheel caching mitigates subsequent runs. Critical scope decisions: - NO @network test for this scout — Qlib has no remote CDN; data flow is local-bin filesystem I/O. Originally planned synthetic-OHLCV→bin→init→Alpha158 smoke test was dropped because pyqlib's PyPI wheel doesn't bundle `scripts/dump_bin.py`. Replacement: manifest-vs-runtime-introspection drift detector (stronger than the dropped test — fires on every pip install upgrade if Qlib changes the feature set). - Module name `compute/ingest/qlib_features.py` (NOT `qlib.py`) — Python import resolution would shadow the installed `qlib` package, breaking the entire factor-library integration. Distinct module name avoids namespace collision. - Tenacity NOT applied — Qlib's data flow is local filesystem I/O, no network retry semantics needed. First ingest module in QuantRank that diverges from the canonical `compute/ingest/osap.py:52-56` retry decorator (documented in module docstring). Module layer (compute/ingest/qlib_features.py, ~186 LOC): - `QLIB_DATA_CACHE: Path` constant (gitignored via parent `compute/cache/`) - `QLIB_INSTRUMENTS_UNIVERSE = "sp500"` (custom universe for future BYO bundle) - `ALPHA158_FEATURE_NAMES: tuple[str, ...]` — 158 hardcoded entries, asserted at module load - `init_qlib(provider_uri=None)` — thin wrapper around `qlib.init(region="us")`; idempotent - `fetch_alpha158_features(*, instruments, start_time, end_time)` — Alpha158 handler wrapper Config layer (compute/config.py, +23 LOC): - `QLIB_DATA_CACHE: Path = CACHE_DIR / "qlib" / "us_data"` - `QLIB_DATA_MAX_AGE_DAYS: int = 31` - `ALPHA158_FEATURE_COUNT: int = 158` (asserted against module manifest length) Tests (6 offline; ~113 LOC): 1. `test_alpha158_feature_manifest_has_158_entries` — primary CI signal (pure cardinality + uniqueness, no Qlib runtime) 2. `test_alpha158_feature_manifest_first_5_anchor` — K-bar leading features anchor (KMID, KLEN, KMID2, KUP, KUP2) 3. `test_alpha158_feature_manifest_matches_runtime_introspection` — drift detector (manifest == `Alpha158DL.get_feature_config()[1]`) 4. `test_qlib_data_cache_constant_under_repo_cache_dir` — config sanity 5. `test_init_qlib_passes_us_region_and_provider_uri` — monkeypatch capture 6. `test_init_qlib_defaults_to_config_cache_when_no_uri` — default path verified pyproject.toml: `pyqlib>=0.9.7,<0.10` added to `[factors]` extra (authorized in advance via plan-mode approval; pin range because Qlib's API drifts across minor versions). Ask-first surfaces touched: - `pyproject.toml [factors]` — extended (authorized via plan-mode) - `ci.yml` UNCHANGED (`[dev,factors]` install already covers new dep) - `compute-rankings.yml` UNTOUCHED per user hard constraint - Schema triple UNTOUCHED (no schema delta this scout) Verification (local): - ruff check . → clean - pytest tests/ -m "not network" → 930 passed (924 prior + 6 new) - python -m compute.output.schema_check → in-sync - python -c "from compute.ingest.qlib_features import ..." → OK 158 - Vercel preview ✅ READY Defense layer unchanged at 17. Top-5 rotation unchanged (no scoring touched). Schema unchanged at 0.9.1-phase4h.2. After this merges → 3 of 4 factor-library scouts done. Phase 4k (IPCA) is the final scout; once 4k merges → eligible for `v1.1.0-phase4` tag. Out of scope (deferred to follow-on full Phase 4j integration PR, ~5-commit cluster): - yfinance-to-Qlib BYO adapter (~150 LOC + custom S&P 500 instruments universe registration) - Full Alpha158 feature compute on 502-ticker universe → 502 × N_dates × 158 DataFrame - Per-feature cross-validation framework (PBO/DSR doesn't apply to per-stock-per-date features; walk-forward IC scoring per feature is the likely replacement) - Schema additions (StockDetail.qlib_features + Metadata.qlib_features_used + IC observability) → schema bump 0.9.1-phase4h.2 → 0.10.0-phase4j - compute/main.py wiring decision (observability-only? blended into composite? Phase-5 ML-meta-learner-only consumer?) Audit history: - Plan-audit round 1: 5 pre-plan investigations verified · MIT lock · heavy-deps disclosure approved - Plan-audit rounds 2-5: same plan re-paste loop (session-side stuck); main session verified PR #119 unchanged at each check - Pre-CI audit: clean (1 legitimate pivot — test #6 swapped from end-to-end smoke to manifest drift detector because pyqlib wheel lacks scripts/dump_bin.py) - Conditional Mark-Ready authorization given on Vercel ✅ + mergeable_state clean - Squash merged per "merge call is yours" delegation pattern (PR #112 / #114 / #118 precedent) https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
This was referenced May 19, 2026
dackclup
added a commit
that referenced
this pull request
May 20, 2026
Part of epic #125 (Item #4 of 6). Doc-only PR — no code changes, no schema delta, no test additions. Phase 4h timeline (2026-05-18 → 2026-05-19) demonstrated the cost of shipping production wiring + gate logic without a diagnostic surface: - PR #112 (Phase 4h): OSAP signal replication + PBO/DSR gate + Path-b blend, NO observability surface for gate decisions - First production cron: every signal failed gate, no way to know why - PR #118 (Phase 4h.2 Part 1): retrofit diagnostic surface (osap_signals_missing_from_dataset + osap_gate_diagnostics) - Second production cron: 22 missing + 22 fail low_dsr, 56 silently dropped (gap that Part 1 still couldn't fully expose) - PR #124 (Phase 4h.2 Part 2): root-cause fix (multi-port adapter) + osap_signals_dropped_no_long_short closing the accounting gap The combined cost of Phase 4h.2 Parts 1 + 2 (~10 hours across 2 PRs) would have been ~30 minutes of additional Phase 4h scope if the diagnostic surface had shipped alongside the production wiring. Files (3 changed, +83 LOC) --------------------------- - WORKFLOW.md (+63 LOC) — new section "# Observability-Before-Wiring Pattern" inserted between the mobile playbook table and the "Initial Prompts" section. Includes mandatory checklist (6 items) + anti-pattern statement + 3 reference precedents (PR #112 bad, PR #118 good, PR #124 good) - SKILL.md (+14 LOC) — new "Rule 18: Observability-before-wiring" appended to the Core Behavior Rules section (Rule 17 was the prior trailing rule). Links back to WORKFLOW.md for the mandatory checklist detail - CLAUDE.md (+6 LOC) — 1 bullet added to ## Conventions referencing the new Rule 18 + WORKFLOW.md section Files NOT touched (deliberately per scope) ------------------------------------------- - PHASE_STATUS.md — chronological log; pattern guidance belongs in WORKFLOW.md / SKILL.md / CLAUDE.md, not in the historical tracker - AGENTS.md — cross-tool agent doc; lookups defer to WORKFLOW.md by default, so a fresh duplicate would just create drift risk - compute/ / frontend/ / tests/ — doc-only PR, no behavior change Constraints honored ------------------- - No code changes — pure markdown additions - No schema delta — schema_check confirms in-sync - No test additions — pytest count unchanged at 959 - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no new hardcoded test-count claims introduced — the precedents reference PRs and hour estimates, not "N offline + M @network" drift patterns) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged) https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>
10 tasks
dackclup
added a commit
that referenced
this pull request
May 20, 2026
…ble skills (#132) 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>
dackclup
added a commit
that referenced
this pull request
May 20, 2026
…sk C.1 recovery) (#135) * docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU * docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md Recovers Task C.1 from the original handoff that was silent-dropped in the prior PR #132 commit (50da720). The handoff explicitly named "Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable skills library; the auditor session caught the omission and authorized this follow-up commit on the existing branch. Files (2 new, +138 LOC) ------------------------ - .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC) — vendored content of upstream skills/karpathy-guidelines/SKILL.md (67 LOC, byte-for-byte preserved) + 15-line appended attribution block referencing the upstream source, commit SHA, and the Karpathy tweet that motivated the guidelines. - THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root) — third-party license disclosures. Section "karpathy-guidelines (Claude Code skill)" carries source URL, license declaration, vendored path, vendored date, upstream commit SHA, upstream first-commit date, and the full standard MIT License text with copyright attributed to "multica-ai contributors" (upstream has no individual copyright line and no standalone LICENSE file; the `license: MIT` claim appears in upstream README.md § License and each skill's YAML frontmatter). Upstream provenance ------------------- - Source: https://github.com/multica-ai/andrej-karpathy-skills - Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2 - Upstream first commit: 2026-01-27 - Vendored date: 2026-05-20 - License: MIT (declared) Verbatim content preserved -------------------------- `diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md .claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY the 15-line appended attribution block at lines 68-82. The upstream 67-line content (YAML frontmatter + "Karpathy Guidelines" heading + the 4 principles) is byte-for-byte unchanged. Per the spec constraint: "เก็บ 4 principles verbatim. แก้ได้แค่ 'เพิ่ม' attribution block ท้ายไฟล์". License-disclosure caveat ------------------------- Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README + YAML frontmatter but does NOT ship a standalone LICENSE file. The `THIRD_PARTY_NOTICES.md` entry includes the standard MIT License template with copyright attributed to the GitHub org ("multica-ai contributors"), matching the principle that an MIT declaration without a formal copyright line still licenses to the redistributor; the attribution is conservative. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no test-count drift introduced by this commit) - python tools/check_branch_collisions.py "karpathy" → no scope collisions detected - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests) - Skill registry pickup verified via session reload — `portable-karpathy-guidelines` appears in the available-skills list with the upstream description verbatim Constraints honored ------------------- - No squash / amend of the prior 50da720 commit — this is a fresh commit pushed on top of the existing branch (per spec "ห้าม squash old commit") - No touch to the 4 already-shipped portable skills in 50da720 - No touch to compute/ / frontend/ / tests/ - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Karpathy SKILL.md upstream content preserved verbatim; only the attribution block appended below the original content PR description update will follow as a separate `gh pr edit` / MCP `update_pull_request` call so the new "License Compliance" section + the audit-table row for karpathy-guidelines land in the PR body. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU --------- Co-authored-by: Claude <noreply@anthropic.com>
dackclup
added a commit
that referenced
this pull request
May 20, 2026
…4 staleness (#139) Closes #133. Docs/skills-only PR. Task A — Portable skills library final 2 (closes #133) ------------------------------------------------------ Extracts the last 2 deferred-but-tracked patterns from epic #125: - .claude/skills/portable-annotate-before-veto/SKILL.md (108 LOC): Progressive-rollout pattern for defense / risk flags. Ship as annotate FIRST, promote to veto only after ≥ 1 production cron of observation + threshold calibration + cohort-acceptance check. Forcing precedent: Phase 4.5 cluster (loss_avoidance_pattern at 0% fire rate would've been a no-op or hotfix candidate as a veto; annotate made it observable). - .claude/skills/portable-graceful-degradation-try-except/SKILL.md (115 LOC): Wrap every external-data integration call site in a try/except that sets ALL related output fields to None on failure + writes a structured log line + sets a per-integration status Metadata field. 3-rule contract: no partial state, no log swallowing, downstream-aware. Forcing precedent: OSAP integration in compute/main.py (PRs #112 → #118 → #124). Both skills follow the established portable-* convention from PR #132 (YAML frontmatter + Pattern + Trigger + Skip + QuantRank precedent section). Each pattern section is project-agnostic; QuantRank refs confined to the labeled "QuantRank precedent" sections at the bottom. Task B — PHASE_STATUS.md row 4 staleness fix --------------------------------------------- PHASE_STATUS.md row 4 said "Phase 4h.2 Part 2 in flight in this PR" since PR #124's prep work. PR #124 merged 2026-05-19 (commit sequence visible in main: ...124...118...112...). Updated to "Phase 4h.2 Part 2 merged via PR #124 (2026-05-19)" — the rest of the row 4 text (multi-port OSAP adapter description, IC-decay deferral note) stays unchanged. This was flagged in PR #132 body and tracked as a small follow-up. No other PHASE_STATUS.md edits — row 4 is the only stale entry. Task C — Docs lockstep ----------------------- CLAUDE.md row 33 skill count: 35 → 37 (QR-origin portable category 4 → 6, total reflects the 2 new skills landed here). Categorisation unchanged otherwise; 9arm license-pending caveat still flagged with cross-reference to issue #137. Skill inventory after this PR (37 total) ----------------------------------------- - QuantRank operational: 12 - QR-origin portable extract: 6 (was 4; +annotate-before-veto + graceful-degradation-try-except) - Anthropic vendored: 6 - External MIT vendored: 9 (Karpathy + 8 mattpocock, unchanged) - External license-pending vendored: 4 (9arm, unchanged) Verification ladder ------------------- - ruff check . → All checks passed - python -m compute.output.schema_check → Schema snapshot in sync - python tools/check_doc_test_counts.py → exit 0 - pytest tests/ -m "not network" → not run locally (sandbox missing pandas); CI will verify. Changes are docs/skills-only. - Skill registry pickup verified via session reload — both portable-annotate-before-veto and portable-graceful-degradation-try-except register with full YAML-frontmatter descriptions. Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to WORKFLOW.md (out of scope; could file a future follow-up if WORKFLOW.md needs to cross-reference the two new portable skills) - No squash / amend of prior commits - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - 2 new portable skills pattern descriptions are project-agnostic; QR refs only in labeled "precedent" sections Epic #125 status after this PR ------------------------------- - #130 (quarterly cohort-threshold review tracker) — recurring, unchanged - #133 (portable skills library remaining) — CLOSED by this PR - #137 (9arm-skills license clarification) — external action, waiting on user to file upstream issue at thananon/9arm-skills Epic #125 Item 3 (Pre-merge production simulation) remains the only substantive open scope. PHASE_STATUS.md row 4 staleness was the last housekeeping task. https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2 Co-authored-by: Claude <noreply@anthropic.com>
4 tasks
dackclup
added a commit
that referenced
this pull request
May 20, 2026
…PR A) (#141) First PR in the multi-PR .md optimization sequence (Option D scope — yกเครื่อง). PR A is the low-risk baseline: fixes 2 broken skill frontmatters that prevent dispatch + drift-fixes 4 stale facts in agent docs. Critical YAML fix: - branch-collision-check/SKILL.md and pr-quality-gate/SKILL.md had multi-line `description:` plain-scalar frontmatter that PyYAML (and Claude Code's skill loader) couldn't parse because lines contain `#123` / `#X` issue references after whitespace — YAML treats ` #` as a comment marker, so everything after the first comment-trigger got eaten and the loader fell back to displaying `name: name` in the available-skills list. Both skills were effectively undispatchable from any session. - Fix: change `description:` to `description: >` (folded block scalar) so newlines become spaces and `#` mid-content is treated as literal text. Verified live in this session — system reminder now shows the full TRIGGER/SKIP descriptions for both. Stale-fact pass: - .claude/skills/README.md L14-16: "27 invocation-triggerable skills" → references CLAUDE.md as the canonical count (38) to prevent future drift. Future top-level skill add/remove only needs to bump CLAUDE.md §Layout, not three files. - AGENTS.md L104: ".claude/skills/ # 24 loaded skills" → 38. - AGENTS.md L287: "Schema version: 0.8.0-phase4.5f" → 0.9.2-phase4h.2 (3 versions behind). Now references SKILL.md schema-version table for full history. - CLAUDE.md L181-192 (§Phase status): "Current schema 0.9.1-phase4h.2 ... Phase 4h in flight in PR #112" → 0.9.2-phase4h.2 + Phase 4h shipped (Parts 1+2 done via #112/#118/#124). - CLAUDE.md + AGENTS.md §Phase status: "Epic #125 Item 3 in flight via PR #140" → "PR 1 of 2 shipped" at commit a52aa2d; PR 2 remaining. CLAUDE.md + AGENTS.md edit ships per the lockstep convention. No code touched, no schema touched — pre-merge-prod-sim.yml won't trigger (paths compute/scoring + compute/features unaffected). Next in optimization sequence: PR B (CLAUDE.md token diet) — TBD after user reviews this one. Co-authored-by: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 4h.2 Part 1 — observability follow-up to Phase 4h (PR #112). Closes the observability gap in issue #116 by adding 2 new optional
Metadatafields that surface what's currently invisible in production:osap_signals_missing_from_dataset: list[str] | None— the silent-drop list (78/100 manifest signals returned no rows from the OSAP fetch in the first 0.9.0-phase4h production run; surfaces them as a first-class metadata field instead of hiding the gap)osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None— per-signal PBO/DSR/Sharpe/rejection_reason for every signal that reaches the gate (was previously available only at compute time asgate_results[signal].rejection_reason; not persisted anywhere)No new veto, no rank change. Observability-only per SKILL.md Rule 16. Top-5 still ranks raw
composite_score. Defense layer unchanged at 17. Production rankings UNAFFECTED.Why a PATCH bump (not MINOR)
Per SKILL.md L305 verbatim:
Both new
Metadatafields default toNone. The newOsapGateDiagnosticPydantic model has all 4 fields explicit= Nonedefaults (locked refinement). Legacy 0.9.0-phase4h JSONs deserialize cleanly with new fields set toNone(asserted bytests/test_output/test_schema_phase4h2.py::test_metadata_backward_compat_with_0_9_0_payload).Schema version:
0.9.0-phase4h→0.9.1-phase4h.2.Part-1 / Part-2 split rationale
This PR is Part 1 of 2 — diagnostic-first. Part 2 (threshold calibration + manifest reconciliation) is deferred until ≥1 week of production diagnostic data accumulates from Part 1's new fields. Why split:
n_partitions=16?DSR_VETO_THRESHOLD=0.0? Without per-signal PBO/DSR floats from production, those decisions are guesswork. Part 1 provides the data; Part 2 makes the call.3-commit cluster
428729adOsapGateDiagnosticmodel + 2 newMetadatafields +SCHEMA_VERSIONbump +TRACKED_MODELSregistry + types.ts mirror + snapshot regenc7949403signals_in_dataframehelper incompute/features/osap_replicate.py+compute/main.pyorchestration populatesosap_signals_missing_from_dataset6391bdfecompute/main.pypopulatesosap_gate_diagnosticsfromgate_results+ docs (CLAUDE.mdschema line,PHASE_STATUS.mdrow 4 footer,SKILL.mdnew schema-versions row)Architectural locks (carried forward)
OsapGateDiagnostic— all 4 fields| None = None(refinement feat(phase-1): universe + prices + momentum stub #3): no positional-required fields, so per-signal diagnostics serialize cleanly even for accepted signals (whererejection_reasonisNone).compute/features/osap_replicate.py::signals_in_dataframe, mirroring the existingcoverage_by_signalhelper in the same module. Pure helper, no I/O, no logging, unit-testable in isolation.[]/{}) in the OSAP-pipeline-failedexceptbranch, then theor Noneidiom in theMetadata(...)constructor converts toNonefor the JSON output. Production output stays clean even if the OSAP fetch raises.composite_score) intact: this PR is observability-only.osap_gate_diagnosticsis metadata for debugging;osap_signals_missing_from_datasetis a list of strings. Neither touches scoring.PHASE3_WEIGHTSsum-to-1.0 invariant atcompute/scoring/composite.py:43-45untouched (Phase 4h architectural lock preserved).Verification ladder
ruff check .pytest tests/ -m "not network"pytest -m network --run-networkpython -m compute.output.schema_check--update-snapshot)python -c "from compute.main import run_weekly_compute; from compute.output.schemas import OsapGateDiagnostic; ..."git push -u origin claude/resume-quantrank-phase-4.5-Zh0pO6391bdfesubscribe_pr_activity+ STOP for user auditAsk-first surfaces touched
NONE. Verified per
AGENTS.md:.github/workflows/compute-rankings.yml— UNTOUCHED.github/workflows/ci.yml— UNTOUCHEDpyproject.toml— UNTOUCHED (no new dep)schemas.py/types.ts/schema-snapshot.json) — moved IN LOCKSTEP in commit 1;python -m compute.output.schema_checkin-sync post-edit; this is the ask-first surface that WAS authorized in advance via the plan-mode approvalProduction impact after merge
Once the next weekly
compute-rankings.ymlcron runs0.9.1-phase4h.2, two new diagnostic fields will appear infrontend/public/data/metadata.json:{ "version": "0.9.1-phase4h.2", // ... unchanged Phase 4h OSAP fields ... "osap_signals_missing_from_dataset": [ "AOP", "AbnormalAccrualsPercent", "AccrualsBM", "Activism1", ... // ~78 entries (current production) ], "osap_gate_diagnostics": { "BM": {"pbo": 0.6, "dsr": -0.1, "sharpe": 0.05, "rejection_reason": "high_pbo"}, "Mom12m": {"pbo": 0.4, "dsr": -0.3, "sharpe": 0.2, "rejection_reason": "low_dsr"}, // ~22 entries (every signal that reached the gate) } }Part 2 will consume this data to make the threshold-calibration call.
Out of scope (deferred to Part 2 — separate PR after ≥1 week of data)
PBO_THRESHOLD,DSR_THRESHOLD,n_partitionstuning)OSAP_SIGNALS_100against actual dataset surface)dl_port("op", "pandas")is even the right dataset call (dl_signal/dl_all_signalsmay carry the 78 missing).claude/skills/phase-4/osap-integration/PLAN.mdacceptance criterion (aspirational 70% → data-grounded threshold)Test plan
924 passed offline, schema-check in-sync, backward-compat verified6391bdfe(Python + Frontend + Vercel)🤖 Drafted with Claude Code via the Anthropic SDK.
Generated by Claude Code