feat(observability): Phase 4h.2 Part 1 — OSAP gate diagnostics + silent-drop metadata surface (#116) by dackclup · Pull Request #118 · dackclup/quantrank

dackclup · 2026-05-19T08:14:57Z

Summary

Phase 4h.2 Part 1 — observability follow-up to Phase 4h (PR #112). Closes the observability gap in issue #116 by adding 2 new optional Metadata fields that surface what's currently invisible in production:

osap_signals_missing_from_dataset: list[str] | None — the silent-drop list (78/100 manifest signals returned no rows from the OSAP fetch in the first 0.9.0-phase4h production run; surfaces them as a first-class metadata field instead of hiding the gap)
osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None — per-signal PBO/DSR/Sharpe/rejection_reason for every signal that reaches the gate (was previously available only at compute time as gate_results[signal].rejection_reason; not persisted anywhere)

No new veto, no rank change. Observability-only per SKILL.md Rule 16. Top-5 still ranks raw composite_score. Defense layer unchanged at 17. Production rankings UNAFFECTED.

Why a PATCH bump (not MINOR)

Per SKILL.md L305 verbatim:

Per phase-4/schema-versioning/PLAN.md: "Add a new optional field (default = None) → patch".

Both new Metadata fields default to None. The new OsapGateDiagnostic Pydantic model has all 4 fields explicit = None defaults (locked refinement). Legacy 0.9.0-phase4h JSONs deserialize cleanly with new fields set to None (asserted by tests/test_output/test_schema_phase4h2.py::test_metadata_backward_compat_with_0_9_0_payload).

Schema version: 0.9.0-phase4h → 0.9.1-phase4h.2.

Part-1 / Part-2 split rationale

This PR is Part 1 of 2 — diagnostic-first. Part 2 (threshold calibration + manifest reconciliation) is deferred until ≥1 week of production diagnostic data accumulates from Part 1's new fields. Why split:

Part 1 ships ONLY observability surface. No scoring layer touched. No threshold value changed. Low risk.
Part 2's threshold-calibration decisions need grounded data — should we relax n_partitions=16? DSR_VETO_THRESHOLD=0.0? Without per-signal PBO/DSR floats from production, those decisions are guesswork. Part 1 provides the data; Part 2 makes the call.
Phase 4h's original aspirational "≥ 70% acceptance" criterion was set pre-data; current 0% acceptance is symptomatic, not a regression. Fix the observability gap first, calibrate second.

3-commit cluster

#	SHA	Purpose	LOC	Tests added
1	`428729ad`	Schema delta — `OsapGateDiagnostic` model + 2 new `Metadata` fields + `SCHEMA_VERSION` bump + `TRACKED_MODELS` registry + types.ts mirror + snapshot regen	231	+7 (round-trip + backward-compat)
2	`c7949403`	Silent-drop wiring — `signals_in_dataframe` helper in `compute/features/osap_replicate.py` + `compute/main.py` orchestration populates `osap_signals_missing_from_dataset`	116	+4 (helper unit tests)
3	`6391bdfe`	Gate diagnostics wiring — `compute/main.py` populates `osap_gate_diagnostics` from `gate_results` + docs (`CLAUDE.md` schema line, `PHASE_STATUS.md` row 4 footer, `SKILL.md` new schema-versions row)	86	+2 (gate-diag round-trip)
Total	—	—	~433	+13

Architectural locks (carried forward)

OsapGateDiagnostic — all 4 fields | None = None (refinement feat(phase-1): universe + prices + momentum stub #3): no positional-required fields, so per-signal diagnostics serialize cleanly even for accepted signals (where rejection_reason is None).
Set-diff helper placement (refinement feat(phase-2): SEC EDGAR fundamentals + per-stock detail pages #4): lives at compute/features/osap_replicate.py::signals_in_dataframe, mirroring the existing coverage_by_signal helper in the same module. Pure helper, no I/O, no logging, unit-testable in isolation.
Graceful degradation preserved: both new fields reset to empty ([] / {}) in the OSAP-pipeline-failed except branch, then the or None idiom in the Metadata(...) constructor converts to None for the JSON output. Production output stays clean even if the OSAP fetch raises.
Rule 16 (Top-5 = raw composite_score) intact: this PR is observability-only. osap_gate_diagnostics is metadata for debugging; osap_signals_missing_from_dataset is a list of strings. Neither touches scoring.
PHASE3_WEIGHTS sum-to-1.0 invariant at compute/scoring/composite.py:43-45 untouched (Phase 4h architectural lock preserved).

Verification ladder

Step	Command	Result
1	`ruff check .`	✅ clean
2	`pytest tests/ -m "not network"`	✅ 924 passed (911 prior + 13 new)
3	`pytest -m network --run-network`	(unchanged at 20; no new @network this PR)
4	`python -m compute.output.schema_check`	✅ in-sync (snapshot regenerated in commit 1 via `--update-snapshot`)
5	`python -c "from compute.main import run_weekly_compute; from compute.output.schemas import OsapGateDiagnostic; ..."`	✅ OK
6	`git push -u origin claude/resume-quantrank-phase-4.5-Zh0pO`	✅ at `6391bdfe`
7	Open PR as Draft (this PR)	✅
8	`subscribe_pr_activity` + STOP for user audit	⏳ next

Ask-first surfaces touched

NONE. Verified per AGENTS.md:

.github/workflows/compute-rankings.yml — UNTOUCHED
.github/workflows/ci.yml — UNTOUCHED
pyproject.toml — UNTOUCHED (no new dep)
Schema triple (schemas.py / types.ts / schema-snapshot.json) — moved IN LOCKSTEP in commit 1; python -m compute.output.schema_check in-sync post-edit; this is the ask-first surface that WAS authorized in advance via the plan-mode approval

Production impact after merge

Once the next weekly compute-rankings.yml cron runs 0.9.1-phase4h.2, two new diagnostic fields will appear in frontend/public/data/metadata.json:

{
  "version": "0.9.1-phase4h.2",
  // ... unchanged Phase 4h OSAP fields ...
  "osap_signals_missing_from_dataset": [
    "AOP", "AbnormalAccrualsPercent", "AccrualsBM", "Activism1", ...
    // ~78 entries (current production)
  ],
  "osap_gate_diagnostics": {
    "BM": {"pbo": 0.6, "dsr": -0.1, "sharpe": 0.05, "rejection_reason": "high_pbo"},
    "Mom12m": {"pbo": 0.4, "dsr": -0.3, "sharpe": 0.2, "rejection_reason": "low_dsr"},
    // ~22 entries (every signal that reached the gate)
  }
}

Part 2 will consume this data to make the threshold-calibration call.

Out of scope (deferred to Part 2 — separate PR after ≥1 week of data)

Threshold calibration (PBO_THRESHOLD, DSR_THRESHOLD, n_partitions tuning)
Manifest reconciliation (rename / drop / re-source OSAP_SIGNALS_100 against actual dataset surface)
Investigation of whether dl_port("op", "pandas") is even the right dataset call (dl_signal / dl_all_signals may carry the 78 missing)
Update .claude/skills/phase-4/osap-integration/PLAN.md acceptance criterion (aspirational 70% → data-grounded threshold)
Re-run + confirm acceptance against the calibrated threshold

Test plan

Commit 1 (schema delta) local — 924 passed offline, schema-check in-sync, backward-compat verified
Commit 2 (silent-drop wiring) local — helper unit-tested, import smoke OK
Commit 3 (gate diagnostics + docs) local — gate-diag round-trip + production cohort shape simulated
CI green on 6391bdfe (Python + Frontend + Vercel)
User audit: schema delta + Part-1/Part-2 split rationale + line citations
User authorizes Draft → Ready flip
(post-merge) Next weekly cron writes the new fields into production metadata.json
(post-merge, ~1 week) Part 2 PR opens with calibration decisions grounded in real diagnostic data

🤖 Drafted with Claude Code via the Anthropic SDK.

Generated by Claude Code

…s + silent-drop surface Phase 4h.2 Part 1 (issue #116) — commit 1 of 3. Schema delta only; orchestration wiring lands in commits 2 (silent-drop) + 3 (gate diagnostics). **SCHEMA_VERSION** ``0.9.0-phase4h`` → ``0.9.1-phase4h.2``. PATCH bump per SKILL.md L305 lock: "Add a new optional field (default = None) → patch". Both new fields are ``| None = None`` additive optional; legacy 0.9.0-phase4h JSONs deserialize cleanly (verified by ``tests/test_output/test_schema_phase4h2.py:: test_metadata_backward_compat_with_0_9_0_payload``). **New Pydantic model** (``compute/output/schemas.py``): - ``OsapGateDiagnostic`` — 4 nullable fields per the locked refinement: ``pbo`` · ``dsr`` · ``sharpe`` · ``rejection_reason``. All default to ``None`` (no positional-required fields) so per-signal diagnostics serialize cleanly even for accepted signals (where ``rejection_reason`` is ``None``). - Mirrored in ``frontend/lib/types.ts::OsapGateDiagnostic`` + regenerated ``frontend/lib/schema-snapshot.json``. **Metadata additions** (both ``| None = None`` per the schema- versioning rule): - ``osap_signals_missing_from_dataset: list[str] | None`` — surfaces the silent-drop bug from #116. Production today shows 78/100 manifest signals missing from the dataset surface; commit 2 will populate this field via ``compute_missing_signals`` in ``compute/features/osap_replicate.py``. - ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None`` — surfaces per-signal PBO/DSR/Sharpe/rejection_reason. Production today shows 22 signals reaching the gate with 0% acceptance; commit 3 will populate this dict via the existing ``gate_results`` output of ``compute/validation/osap_validation.py:: gate_osap_signals``. **Registry updates**: - ``compute/output/schema_check.py::TRACKED_MODELS`` — added ``OsapGateDiagnostic`` so the BaseModel-subclass tracking test (``tests/test_output/test_schema_check.py::test_A5_tracked_ models_count_matches_schemas_module``) doesn't flag the new class as untracked. - ``tests/test_config.py`` — SCHEMA_VERSION pin updated to ``0.9.1-phase4h.2``. - ``tests/test_smoke.py`` — unchanged (``startswith("0.9.")`` still passes). **Tests** (``tests/test_output/test_schema_phase4h2.py``, 7 offline): 1. ``test_osap_gate_diagnostic_round_trip_with_all_fields`` — full field round-trip. 2. ``test_osap_gate_diagnostic_all_fields_default_to_none`` — empty construction validates per refinement #3 lock. 3. ``test_osap_gate_diagnostic_rejection_reason_taxonomy`` — canonical 4-value taxonomy round-trips (``high_pbo`` / ``low_dsr`` / ``insufficient_data`` / ``gate_failed``). 4. ``test_metadata_round_trip_with_new_fields_populated`` — end-to- end Metadata with both new fields filled. 5. ``test_metadata_backward_compat_with_0_9_0_payload`` — legacy payload deserializes; new fields default to ``None``. 6. ``test_metadata_new_fields_default_to_none`` — verbose restatement of the additive-optional contract. 7. ``test_metadata_extra_forbid_rejects_unknown_fields`` — locks the schema surface against silent field renames. **Verification**: - ``ruff check .`` → clean - ``python -m compute.output.schema_check`` → in-sync (Python ↔ TypeScript ↔ snapshot) - ``pytest tests/ -m "not network"`` → **918 passed** (911 prior + 7 new) **Out of scope this commit** (commits 2 + 3): - ``compute/features/osap_replicate.py`` set-diff helper for the silent-drop list (commit 2) - ``compute/main.py`` orchestration wiring both fields (commits 2 + 3) - ``compute/main.py`` unit test for silent-drop pass-through (commit 2) - Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) (commit 3) **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged (Rule 16 lock — observability-only). https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

…wiring Phase 4h.2 Part 1 (issue #116) — commit 2 of 3. Populates the ``osap_signals_missing_from_dataset`` field landed in commit 1's schema delta. Closes the observability gap that hid 78/100 manifest signals in the first production 0.9.0-phase4h run. **Refinement 4 decision (locked)**: helper lives in ``compute/features/osap_replicate.py``, not inline in ``compute/main.py``. Rationale: - Unit-testable in isolation (no orchestrator setup) - Mirrors the existing ``coverage_by_signal`` helper one function above — same pure-helper shape (no I/O, no logging, no side effects) - Keeps ``compute/main.py`` orchestration thin **Helper** (``compute/features/osap_replicate.py``, +31 LOC): ```python def signals_in_dataframe(df: pd.DataFrame) -> frozenset[str]: """Return unique ``signalname`` values present in an OSAP returns DataFrame. Phase 4h.2 Part 1 helper (issue #116)...""" if df.empty or "signalname" not in df.columns: return frozenset() return frozenset(df["signalname"].unique().tolist()) ``` Defensive against empty DataFrame AND missing ``signalname`` column — both yield empty frozenset so the caller's set diff reports the full manifest as missing (safe-by-default). **Wiring** (``compute/main.py``, +27 LOC): 1. Import added to the existing ``from compute.features.osap_replicate import (...)`` block (4-symbol import). 2. Variable initialized BEFORE the OSAP try block alongside the other observability accumulators: ``osap_signals_missing_from_dataset: list[str] = []``. 3. Inside the try, after ``fetch_osap_returns`` returns: ```python present_signals = signals_in_dataframe(osap_returns_raw) osap_signals_missing_from_dataset = sorted( set(config.OSAP_SIGNALS_100) - present_signals ) if osap_signals_missing_from_dataset: logger.warning( "OSAP manifest signals not in dataset: %d/%d missing " "(first 5: %s)", len(osap_signals_missing_from_dataset), len(config.OSAP_SIGNALS_100), osap_signals_missing_from_dataset[:5], ) ``` Warning fires only when the set diff is non-empty (no log spam on a clean cron). 4. Reset to ``[]`` in the OSAP-pipeline-failed ``except`` branch so graceful degradation leaves every osap_* field at ``None``. 5. Wired into the ``Metadata(...)`` constructor with the same ``or None`` idiom Phase 4h established: ```python osap_signals_missing_from_dataset=( osap_signals_missing_from_dataset or None ), ``` **Tests** (``tests/test_features/test_osap_replicate.py``, +58 LOC, 4 new offline tests appended to the Phase 4h commit-2 suite): 1. ``test_signals_in_dataframe_empty_returns_empty_frozenset`` — empty DataFrame (correct schema, zero rows) → empty frozenset. 2. ``test_signals_in_dataframe_no_signalname_column_returns_empty_frozenset`` — defensive against the schema-drift case where the ``signalname`` column itself disappears upstream. 3. ``test_signals_in_dataframe_unique_signals_dedup`` — multi-row input with duplicates → set dedups. 4. ``test_signals_in_dataframe_setdiff_with_manifest_simulates_silent_drop`` — end-to-end simulation of the issue-#116 silent-drop: 5-signal manifest, 2-signal dataset, set diff = sorted missing-3 (``["AOP", "AccrualsBM", "ChEQ"]``). Helper-level integration tests are sufficient for commit 2; ``compute/main.py`` orchestration-test deferred to the next weekly cron (the production output IS the integration test). ``compute/main.py`` import smoke verified locally: ``python -c "from compute.main import run_weekly_compute; ..."`` → OK. **Verification**: - ``ruff check .`` → clean - ``python -m compute.output.schema_check`` → in-sync (NO schema delta this commit — field already landed in commit 1) - ``python -c "from compute.main import run_weekly_compute, ..., signals_in_dataframe"`` → OK - ``pytest tests/ -m "not network"`` → **922 passed** (918 prior + 4 new helper tests) **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged (Rule 16 lock — observability-only). **Out of scope this commit** (commit 3): - ``osap_gate_diagnostics`` orchestration (per-signal PBO/DSR/ Sharpe/rejection_reason populated from ``gate_results``) - Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) - PR body + open Draft PR + STOP at verification ladder step 8 **Production observation after this lands**: Once the next weekly cron runs ``0.9.1-phase4h.2``, the ``osap_signals_missing_from_dataset`` field will surface the 78 silent-drop signals from #116 (or whatever the current count is — manifest reconciliation in a later sub-PR may reduce it). https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

**FINAL** commit of the Phase 4h.2 Part 1 3-commit cluster (issue #116). Populates the ``osap_gate_diagnostics`` field landed in commit 1's schema delta + docs the full Part-1 surface so reviewers + future maintainers see the schema and observability contract in one place. **`compute/main.py` wiring** (+23 LOC): 1. Import added to ``from compute.output.schemas import (...)``: ``OsapGateDiagnostic`` inserted alphabetically between ``Metadata`` and ``PillarScores`` (schemas import already used at this site, no new module touched). 2. Variable initialized BEFORE the OSAP try block: ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] = {}``. 3. Populated inside the try after ``gate_results = gate_osap_signals(osap_ls, requested_signals= config.OSAP_SIGNALS_100)`` and BEFORE ``filter_accepted_signals`` — captures EVERY signal that reached the gate (both accepted and rejected). Accepted carry ``rejection_reason=None``; rejected carry one of the canonical taxonomy values (``high_pbo`` / ``low_dsr`` / ``insufficient_data`` / ``gate_failed``) per ``compute/validation/osap_validation.py::GateResult``. 4. Reset to ``{}`` in the OSAP-pipeline-failed ``except`` branch so graceful degradation continues to leave every osap_* field at ``None``. 5. Wired into the ``Metadata(...)`` constructor with the established ``or None`` idiom: ```python osap_gate_diagnostics=osap_gate_diagnostics or None, ``` **Tests** (``tests/test_output/test_schema_phase4h2.py``, +55 LOC, 2 new offline appended to commit 1's suite): 1. ``test_metadata_gate_diagnostics_round_trip_with_production_cohort_shape`` — simulates the production observation from #116 (22 signals reach the gate, all rejected with a mix of rejection_reason values across the canonical 4-value taxonomy); asserts the dict-of-OsapGateDiagnostic structure survives ``model_validate`` → ``model_dump`` → ``model_validate`` round-trip. 2. ``test_metadata_gate_diagnostics_accepted_signal_has_null_rejection_reason`` — locks the ``rejection_reason=None`` semantics for accepted signals (Pydantic preserves None rather than coercing to a sentinel string). **Docs** (atomic with the wiring): - ``CLAUDE.md`` ``## Phase status`` — schema line updated to ``0.9.1-phase4h.2`` with the PATCH-bump framing; preserved the prior MINOR-bump history (`0.8.0-phase4.5f` → `0.9.0-phase4h` via PR #112). - ``PHASE_STATUS.md`` row 4 — Phase 4h.2 Part 1 sub-status added; describes both new fields, the Part-1 / Part-2 split rationale ("Part 2 opens after ≥1 week of production diagnostic data accumulates"), and the "no new veto / no rank change" invariant. - ``SKILL.md`` schema-versions table — new row for ``0.9.1-phase4h.2`` inserted above the ``0.9.0-phase4h`` row; cites the SKILL.md L305 PATCH-bump quote verbatim, locks the ``OsapGateDiagnostic`` "all 4 fields explicit = None" refinement in writing, and documents the set-diff helper placement decision (``compute/features/osap_replicate.py::signals_in_dataframe`` per refinement #4). - ``WORKFLOW.md`` — unchanged; no "Open items" checkbox list for Phase 4h.2 yet (would be created when Part 2 is scoped). **Verification ladder** (steps 1-5 complete): - ``ruff check .`` → clean ✅ - ``pytest tests/ -m "not network"`` → **924 passed** (911 baseline + 13 new across the 3-commit cluster: 7 schema + 4 helper + 2 gate-diagnostic) ✅ - ``python -m compute.output.schema_check`` → in-sync (no new schema delta this commit; the snapshot already captured both fields + ``OsapGateDiagnostic`` from commit 1's regen) ✅ - ``python -c "from compute.main import run_weekly_compute; from compute.output.schemas import OsapGateDiagnostic; ..."`` → OK ✅ Steps 6-8 next: ``git push`` → open Draft PR → ``subscribe_pr_activity`` + STOP for user audit + Mark-Ready authorization. **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged. **Schema version**: ``0.9.1-phase4h.2`` (locked from commit 1). **Cluster summary**: | # | SHA | LOC | Tests added | |---|---|---|---| | 1 — schema delta | ``428729ad`` | 231 | +7 (round-trip + backward-compat) | | 2 — silent-drop wiring | ``c7949403`` | 116 | +4 (helper unit tests) | | 3 — gate diagnostics + docs (this) | TBD | ~86 | +2 (gate-diag round-trip) | | **Total** | — | ~433 | **+13** | Within the Option-β diagnostic-first scope (~250-350 LOC budget; + docs); under the original plan's ~300 LOC estimate. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

vercel · 2026-05-19T08:15:03Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
quantrank	Ready	Preview, Comment	May 19, 2026 8:15am

…manifest + 6 offline tests (#119) Phase 4j scout PR — 3rd of 4 factor-library scouts (OSAP ✅ #110, JKP ✅ #114, Qlib THIS, IPCA next as 4k). Ships `pyqlib` install + Alpha158 158-feature manifest + 6 offline tests. NO production wiring; yfinance-to-Qlib BYO adapter + full Alpha158 compute on 502-ticker universe deferred to follow-on integration PR. 5 pre-plan investigations (all verified 2026-05-19): 1. PyPI package: `pyqlib` 0.9.7 (canonical). Alternative names (`qlib`, `microsoft-qlib`) return 404. 2. License: MIT via wheel METADATA classifier. No CC BY-NC complication like JKP — safe for Phase 6+ commercial roadmap. 3. Data init: `qlib.init(provider_uri=..., region="us")`. NO public US data bundle — Qlib's default covers CN A-share only; US universe is BYO via local .bin files. 4. Alpha158 surface: `qlib.contrib.data.handler.Alpha158` → 158 columns; manifest captured via `Alpha158DL.get_feature_config()[1]` and hardcoded; offline test 3 locks against upstream drift. 5. CI install footprint: ~150-180 MB net-new (mlflow / lightgbm / cvxpy / pymongo / redis / gym / jupyter + nbconvert transitives). One-time cold-start; pip wheel caching mitigates subsequent runs. Critical scope decisions: - NO @network test for this scout — Qlib has no remote CDN; data flow is local-bin filesystem I/O. Originally planned synthetic-OHLCV→bin→init→Alpha158 smoke test was dropped because pyqlib's PyPI wheel doesn't bundle `scripts/dump_bin.py`. Replacement: manifest-vs-runtime-introspection drift detector (stronger than the dropped test — fires on every pip install upgrade if Qlib changes the feature set). - Module name `compute/ingest/qlib_features.py` (NOT `qlib.py`) — Python import resolution would shadow the installed `qlib` package, breaking the entire factor-library integration. Distinct module name avoids namespace collision. - Tenacity NOT applied — Qlib's data flow is local filesystem I/O, no network retry semantics needed. First ingest module in QuantRank that diverges from the canonical `compute/ingest/osap.py:52-56` retry decorator (documented in module docstring). Module layer (compute/ingest/qlib_features.py, ~186 LOC): - `QLIB_DATA_CACHE: Path` constant (gitignored via parent `compute/cache/`) - `QLIB_INSTRUMENTS_UNIVERSE = "sp500"` (custom universe for future BYO bundle) - `ALPHA158_FEATURE_NAMES: tuple[str, ...]` — 158 hardcoded entries, asserted at module load - `init_qlib(provider_uri=None)` — thin wrapper around `qlib.init(region="us")`; idempotent - `fetch_alpha158_features(*, instruments, start_time, end_time)` — Alpha158 handler wrapper Config layer (compute/config.py, +23 LOC): - `QLIB_DATA_CACHE: Path = CACHE_DIR / "qlib" / "us_data"` - `QLIB_DATA_MAX_AGE_DAYS: int = 31` - `ALPHA158_FEATURE_COUNT: int = 158` (asserted against module manifest length) Tests (6 offline; ~113 LOC): 1. `test_alpha158_feature_manifest_has_158_entries` — primary CI signal (pure cardinality + uniqueness, no Qlib runtime) 2. `test_alpha158_feature_manifest_first_5_anchor` — K-bar leading features anchor (KMID, KLEN, KMID2, KUP, KUP2) 3. `test_alpha158_feature_manifest_matches_runtime_introspection` — drift detector (manifest == `Alpha158DL.get_feature_config()[1]`) 4. `test_qlib_data_cache_constant_under_repo_cache_dir` — config sanity 5. `test_init_qlib_passes_us_region_and_provider_uri` — monkeypatch capture 6. `test_init_qlib_defaults_to_config_cache_when_no_uri` — default path verified pyproject.toml: `pyqlib>=0.9.7,<0.10` added to `[factors]` extra (authorized in advance via plan-mode approval; pin range because Qlib's API drifts across minor versions). Ask-first surfaces touched: - `pyproject.toml [factors]` — extended (authorized via plan-mode) - `ci.yml` UNCHANGED (`[dev,factors]` install already covers new dep) - `compute-rankings.yml` UNTOUCHED per user hard constraint - Schema triple UNTOUCHED (no schema delta this scout) Verification (local): - ruff check . → clean - pytest tests/ -m "not network" → 930 passed (924 prior + 6 new) - python -m compute.output.schema_check → in-sync - python -c "from compute.ingest.qlib_features import ..." → OK 158 - Vercel preview ✅ READY Defense layer unchanged at 17. Top-5 rotation unchanged (no scoring touched). Schema unchanged at 0.9.1-phase4h.2. After this merges → 3 of 4 factor-library scouts done. Phase 4k (IPCA) is the final scout; once 4k merges → eligible for `v1.1.0-phase4` tag. Out of scope (deferred to follow-on full Phase 4j integration PR, ~5-commit cluster): - yfinance-to-Qlib BYO adapter (~150 LOC + custom S&P 500 instruments universe registration) - Full Alpha158 feature compute on 502-ticker universe → 502 × N_dates × 158 DataFrame - Per-feature cross-validation framework (PBO/DSR doesn't apply to per-stock-per-date features; walk-forward IC scoring per feature is the likely replacement) - Schema additions (StockDetail.qlib_features + Metadata.qlib_features_used + IC observability) → schema bump 0.9.1-phase4h.2 → 0.10.0-phase4j - compute/main.py wiring decision (observability-only? blended into composite? Phase-5 ML-meta-learner-only consumer?) Audit history: - Plan-audit round 1: 5 pre-plan investigations verified · MIT lock · heavy-deps disclosure approved - Plan-audit rounds 2-5: same plan re-paste loop (session-side stuck); main session verified PR #119 unchanged at each check - Pre-CI audit: clean (1 legitimate pivot — test #6 swapped from end-to-end smoke to manifest drift detector because pyqlib wheel lacks scripts/dump_bin.py) - Conditional Mark-Ready authorization given on Vercel ✅ + mergeable_state clean - Squash merged per "merge call is yours" delegation pattern (PR #112 / #114 / #118 precedent) https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Part of epic #125 (Item #4 of 6). Doc-only PR — no code changes, no schema delta, no test additions. Phase 4h timeline (2026-05-18 → 2026-05-19) demonstrated the cost of shipping production wiring + gate logic without a diagnostic surface: - PR #112 (Phase 4h): OSAP signal replication + PBO/DSR gate + Path-b blend, NO observability surface for gate decisions - First production cron: every signal failed gate, no way to know why - PR #118 (Phase 4h.2 Part 1): retrofit diagnostic surface (osap_signals_missing_from_dataset + osap_gate_diagnostics) - Second production cron: 22 missing + 22 fail low_dsr, 56 silently dropped (gap that Part 1 still couldn't fully expose) - PR #124 (Phase 4h.2 Part 2): root-cause fix (multi-port adapter) + osap_signals_dropped_no_long_short closing the accounting gap The combined cost of Phase 4h.2 Parts 1 + 2 (~10 hours across 2 PRs) would have been ~30 minutes of additional Phase 4h scope if the diagnostic surface had shipped alongside the production wiring. Files (3 changed, +83 LOC) --------------------------- - WORKFLOW.md (+63 LOC) — new section "# Observability-Before-Wiring Pattern" inserted between the mobile playbook table and the "Initial Prompts" section. Includes mandatory checklist (6 items) + anti-pattern statement + 3 reference precedents (PR #112 bad, PR #118 good, PR #124 good) - SKILL.md (+14 LOC) — new "Rule 18: Observability-before-wiring" appended to the Core Behavior Rules section (Rule 17 was the prior trailing rule). Links back to WORKFLOW.md for the mandatory checklist detail - CLAUDE.md (+6 LOC) — 1 bullet added to ## Conventions referencing the new Rule 18 + WORKFLOW.md section Files NOT touched (deliberately per scope) ------------------------------------------- - PHASE_STATUS.md — chronological log; pattern guidance belongs in WORKFLOW.md / SKILL.md / CLAUDE.md, not in the historical tracker - AGENTS.md — cross-tool agent doc; lookups defer to WORKFLOW.md by default, so a fresh duplicate would just create drift risk - compute/ / frontend/ / tests/ — doc-only PR, no behavior change Constraints honored ------------------- - No code changes — pure markdown additions - No schema delta — schema_check confirms in-sync - No test additions — pytest count unchanged at 959 - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no new hardcoded test-count claims introduced — the precedents reference PRs and hour estimates, not "N offline + M @network" drift patterns) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged) https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…ble skills (#132) 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…sk C.1 recovery) (#135) * docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU * docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md Recovers Task C.1 from the original handoff that was silent-dropped in the prior PR #132 commit (50da720). The handoff explicitly named "Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable skills library; the auditor session caught the omission and authorized this follow-up commit on the existing branch. Files (2 new, +138 LOC) ------------------------ - .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC) — vendored content of upstream skills/karpathy-guidelines/SKILL.md (67 LOC, byte-for-byte preserved) + 15-line appended attribution block referencing the upstream source, commit SHA, and the Karpathy tweet that motivated the guidelines. - THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root) — third-party license disclosures. Section "karpathy-guidelines (Claude Code skill)" carries source URL, license declaration, vendored path, vendored date, upstream commit SHA, upstream first-commit date, and the full standard MIT License text with copyright attributed to "multica-ai contributors" (upstream has no individual copyright line and no standalone LICENSE file; the `license: MIT` claim appears in upstream README.md § License and each skill's YAML frontmatter). Upstream provenance ------------------- - Source: https://github.com/multica-ai/andrej-karpathy-skills - Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2 - Upstream first commit: 2026-01-27 - Vendored date: 2026-05-20 - License: MIT (declared) Verbatim content preserved -------------------------- `diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md .claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY the 15-line appended attribution block at lines 68-82. The upstream 67-line content (YAML frontmatter + "Karpathy Guidelines" heading + the 4 principles) is byte-for-byte unchanged. Per the spec constraint: "เก็บ 4 principles verbatim. แก้ได้แค่ 'เพิ่ม' attribution block ท้ายไฟล์". License-disclosure caveat ------------------------- Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README + YAML frontmatter but does NOT ship a standalone LICENSE file. The `THIRD_PARTY_NOTICES.md` entry includes the standard MIT License template with copyright attributed to the GitHub org ("multica-ai contributors"), matching the principle that an MIT declaration without a formal copyright line still licenses to the redistributor; the attribution is conservative. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no test-count drift introduced by this commit) - python tools/check_branch_collisions.py "karpathy" → no scope collisions detected - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests) - Skill registry pickup verified via session reload — `portable-karpathy-guidelines` appears in the available-skills list with the upstream description verbatim Constraints honored ------------------- - No squash / amend of the prior 50da720 commit — this is a fresh commit pushed on top of the existing branch (per spec "ห้าม squash old commit") - No touch to the 4 already-shipped portable skills in 50da720 - No touch to compute/ / frontend/ / tests/ - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Karpathy SKILL.md upstream content preserved verbatim; only the attribution block appended below the original content PR description update will follow as a separate `gh pr edit` / MCP `update_pull_request` call so the new "License Compliance" section + the audit-table row for karpathy-guidelines land in the PR body. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU --------- Co-authored-by: Claude <noreply@anthropic.com>

…4 staleness (#139) Closes #133. Docs/skills-only PR. Task A — Portable skills library final 2 (closes #133) ------------------------------------------------------ Extracts the last 2 deferred-but-tracked patterns from epic #125: - .claude/skills/portable-annotate-before-veto/SKILL.md (108 LOC): Progressive-rollout pattern for defense / risk flags. Ship as annotate FIRST, promote to veto only after ≥ 1 production cron of observation + threshold calibration + cohort-acceptance check. Forcing precedent: Phase 4.5 cluster (loss_avoidance_pattern at 0% fire rate would've been a no-op or hotfix candidate as a veto; annotate made it observable). - .claude/skills/portable-graceful-degradation-try-except/SKILL.md (115 LOC): Wrap every external-data integration call site in a try/except that sets ALL related output fields to None on failure + writes a structured log line + sets a per-integration status Metadata field. 3-rule contract: no partial state, no log swallowing, downstream-aware. Forcing precedent: OSAP integration in compute/main.py (PRs #112 → #118 → #124). Both skills follow the established portable-* convention from PR #132 (YAML frontmatter + Pattern + Trigger + Skip + QuantRank precedent section). Each pattern section is project-agnostic; QuantRank refs confined to the labeled "QuantRank precedent" sections at the bottom. Task B — PHASE_STATUS.md row 4 staleness fix --------------------------------------------- PHASE_STATUS.md row 4 said "Phase 4h.2 Part 2 in flight in this PR" since PR #124's prep work. PR #124 merged 2026-05-19 (commit sequence visible in main: ...124...118...112...). Updated to "Phase 4h.2 Part 2 merged via PR #124 (2026-05-19)" — the rest of the row 4 text (multi-port OSAP adapter description, IC-decay deferral note) stays unchanged. This was flagged in PR #132 body and tracked as a small follow-up. No other PHASE_STATUS.md edits — row 4 is the only stale entry. Task C — Docs lockstep ----------------------- CLAUDE.md row 33 skill count: 35 → 37 (QR-origin portable category 4 → 6, total reflects the 2 new skills landed here). Categorisation unchanged otherwise; 9arm license-pending caveat still flagged with cross-reference to issue #137. Skill inventory after this PR (37 total) ----------------------------------------- - QuantRank operational: 12 - QR-origin portable extract: 6 (was 4; +annotate-before-veto + graceful-degradation-try-except) - Anthropic vendored: 6 - External MIT vendored: 9 (Karpathy + 8 mattpocock, unchanged) - External license-pending vendored: 4 (9arm, unchanged) Verification ladder ------------------- - ruff check . → All checks passed - python -m compute.output.schema_check → Schema snapshot in sync - python tools/check_doc_test_counts.py → exit 0 - pytest tests/ -m "not network" → not run locally (sandbox missing pandas); CI will verify. Changes are docs/skills-only. - Skill registry pickup verified via session reload — both portable-annotate-before-veto and portable-graceful-degradation-try-except register with full YAML-frontmatter descriptions. Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to WORKFLOW.md (out of scope; could file a future follow-up if WORKFLOW.md needs to cross-reference the two new portable skills) - No squash / amend of prior commits - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - 2 new portable skills pattern descriptions are project-agnostic; QR refs only in labeled "precedent" sections Epic #125 status after this PR ------------------------------- - #130 (quarterly cohort-threshold review tracker) — recurring, unchanged - #133 (portable skills library remaining) — CLOSED by this PR - #137 (9arm-skills license clarification) — external action, waiting on user to file upstream issue at thananon/9arm-skills Epic #125 Item 3 (Pre-merge production simulation) remains the only substantive open scope. PHASE_STATUS.md row 4 staleness was the last housekeeping task. https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2 Co-authored-by: Claude <noreply@anthropic.com>

…PR A) (#141) First PR in the multi-PR .md optimization sequence (Option D scope — yกเครื่อง). PR A is the low-risk baseline: fixes 2 broken skill frontmatters that prevent dispatch + drift-fixes 4 stale facts in agent docs. Critical YAML fix: - branch-collision-check/SKILL.md and pr-quality-gate/SKILL.md had multi-line `description:` plain-scalar frontmatter that PyYAML (and Claude Code's skill loader) couldn't parse because lines contain `#123` / `#X` issue references after whitespace — YAML treats ` #` as a comment marker, so everything after the first comment-trigger got eaten and the loader fell back to displaying `name: name` in the available-skills list. Both skills were effectively undispatchable from any session. - Fix: change `description:` to `description: >` (folded block scalar) so newlines become spaces and `#` mid-content is treated as literal text. Verified live in this session — system reminder now shows the full TRIGGER/SKIP descriptions for both. Stale-fact pass: - .claude/skills/README.md L14-16: "27 invocation-triggerable skills" → references CLAUDE.md as the canonical count (38) to prevent future drift. Future top-level skill add/remove only needs to bump CLAUDE.md §Layout, not three files. - AGENTS.md L104: ".claude/skills/ # 24 loaded skills" → 38. - AGENTS.md L287: "Schema version: 0.8.0-phase4.5f" → 0.9.2-phase4h.2 (3 versions behind). Now references SKILL.md schema-version table for full history. - CLAUDE.md L181-192 (§Phase status): "Current schema 0.9.1-phase4h.2 ... Phase 4h in flight in PR #112" → 0.9.2-phase4h.2 + Phase 4h shipped (Parts 1+2 done via #112/#118/#124). - CLAUDE.md + AGENTS.md §Phase status: "Epic #125 Item 3 in flight via PR #140" → "PR 1 of 2 shipped" at commit a52aa2d; PR 2 remaining. CLAUDE.md + AGENTS.md edit ships per the lockstep convention. No code touched, no schema touched — pre-merge-prod-sim.yml won't trigger (paths compute/scoring + compute/features unaffected). Next in optimization sequence: PR B (CLAUDE.md token diet) — TBD after user reviews this one. Co-authored-by: Claude <noreply@anthropic.com>

claude added 3 commits May 19, 2026 07:46

vercel Bot deployed to Preview May 19, 2026 08:15 View deployment

dackclup marked this pull request as ready for review May 19, 2026 08:38

dackclup merged commit 2125aea into main May 19, 2026
4 checks passed

dackclup deleted the claude/resume-quantrank-phase-4.5-Zh0pO branch May 19, 2026 08:39

dackclup mentioned this pull request May 20, 2026

docs(skills): SKILL.md schema bump + worker-session-handoff + 3-5 portable skills #132

Merged

10 tasks

dackclup mentioned this pull request May 20, 2026

docs(skills): Close epic #125 — 2 portable skills + PHASE_STATUS row 4 staleness #139

Merged

dackclup mentioned this pull request May 20, 2026

docs(md-drift): YAML frontmatter fix + stale-fact pass (Optimization PR A) #141

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(observability): Phase 4h.2 Part 1 — OSAP gate diagnostics + silent-drop metadata surface (#116)#118

feat(observability): Phase 4h.2 Part 1 — OSAP gate diagnostics + silent-drop metadata surface (#116)#118
dackclup merged 3 commits into
mainfrom
claude/resume-quantrank-phase-4.5-Zh0pO

dackclup commented May 19, 2026

Uh oh!

vercel Bot commented May 19, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dackclup commented May 19, 2026

Summary

Why a PATCH bump (not MINOR)

Part-1 / Part-2 split rationale

3-commit cluster

Architectural locks (carried forward)

Verification ladder

Ask-first surfaces touched

Production impact after merge

Out of scope (deferred to Part 2 — separate PR after ≥1 week of data)

Test plan

Uh oh!

vercel Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vercel Bot commented May 19, 2026 •

edited

Loading