
feat(observability): Phase 4h.2 Part 1 — OSAP gate diagnostics + silent-drop metadata surface (#116)#118

Merged
dackclup merged 3 commits into main from claude/resume-quantrank-phase-4.5-Zh0pO
May 19, 2026

Conversation

@dackclup
Owner

Summary

Phase 4h.2 Part 1 — observability follow-up to Phase 4h (PR #112). Closes the observability gap in issue #116 by adding 2 new optional Metadata fields that surface what's currently invisible in production:

  1. osap_signals_missing_from_dataset: list[str] | None — the silent-drop list (78/100 manifest signals returned no rows from the OSAP fetch in the first 0.9.0-phase4h production run; surfaces them as a first-class metadata field instead of hiding the gap)
  2. osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None: per-signal PBO/DSR/Sharpe/rejection_reason for every signal that reaches the gate (previously available only at compute time as gate_results[signal].rejection_reason; not persisted anywhere)

No new veto, no rank change. Observability-only per SKILL.md Rule 16. Top-5 still ranks raw composite_score. Defense layer unchanged at 17. Production rankings UNAFFECTED.

Why a PATCH bump (not MINOR)

Per SKILL.md L305 verbatim:

Per phase-4/schema-versioning/PLAN.md: "Add a new optional field (default = None) → patch".

Both new Metadata fields default to None. The new OsapGateDiagnostic Pydantic model has all 4 fields explicit = None defaults (locked refinement). Legacy 0.9.0-phase4h JSONs deserialize cleanly with new fields set to None (asserted by tests/test_output/test_schema_phase4h2.py::test_metadata_backward_compat_with_0_9_0_payload).
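
The additive-optional contract can be illustrated with a minimal sketch. It uses stdlib dataclasses as a stand-in for the Pydantic models in compute/output/schemas.py, which this PR body does not reproduce. The field types (float for pbo/dsr/sharpe, str for rejection_reason) are assumptions inferred from the example payload, and `metadata_from_payload` is a hypothetical loader, not a function in the repository.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class OsapGateDiagnostic:
    # All four fields default to None, so no argument is required.
    pbo: float | None = None
    dsr: float | None = None
    sharpe: float | None = None
    rejection_reason: str | None = None


@dataclass(frozen=True)
class Metadata:
    version: str
    # New in 0.9.1-phase4h.2; both default to None (additive optional).
    osap_signals_missing_from_dataset: list[str] | None = None
    osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None = None


def metadata_from_payload(payload: dict) -> Metadata:
    """Hypothetical loader: build Metadata from a decoded JSON object."""
    raw = payload.get("osap_gate_diagnostics")
    diagnostics = (
        {name: OsapGateDiagnostic(**fields) for name, fields in raw.items()}
        if raw is not None
        else None
    )
    return Metadata(
        version=payload["version"],
        osap_signals_missing_from_dataset=payload.get(
            "osap_signals_missing_from_dataset"
        ),
        osap_gate_diagnostics=diagnostics,
    )


# A legacy payload has neither new key; both fields fall back to None.
legacy = metadata_from_payload({"version": "0.9.0-phase4h"})

# A current payload may populate only some diagnostic fields.
current = metadata_from_payload(
    {
        "version": "0.9.1-phase4h.2",
        "osap_signals_missing_from_dataset": ["AOP"],
        "osap_gate_diagnostics": {
            "BM": {"pbo": 0.6, "rejection_reason": "high_pbo"}
        },
    }
)
```

Because every new field has a default, an older payload needs no migration step, which is the property the PATCH-bump rule relies on.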

Schema version: 0.9.0-phase4h → 0.9.1-phase4h.2.

Part-1 / Part-2 split rationale

This PR is Part 1 of 2 — diagnostic-first. Part 2 (threshold calibration + manifest reconciliation) is deferred until ≥1 week of production diagnostic data accumulates from Part 1's new fields. Why split:

  • Part 1 ships ONLY observability surface. No scoring layer touched. No threshold value changed. Low risk.
  • Part 2's threshold-calibration decisions need grounded data — should we relax n_partitions=16? DSR_VETO_THRESHOLD=0.0? Without per-signal PBO/DSR floats from production, those decisions are guesswork. Part 1 provides the data; Part 2 makes the call.
  • Phase 4h's original aspirational "≥ 70% acceptance" criterion was set pre-data; current 0% acceptance is symptomatic, not a regression. Fix the observability gap first, calibrate second.

3-commit cluster

| # | SHA | Purpose | LOC | Tests added |
|---|---|---|---|---|
| 1 | 428729ad | Schema delta — OsapGateDiagnostic model + 2 new Metadata fields + SCHEMA_VERSION bump + TRACKED_MODELS registry + types.ts mirror + snapshot regen | 231 | +7 (round-trip + backward-compat) |
| 2 | c7949403 | Silent-drop wiring — signals_in_dataframe helper in compute/features/osap_replicate.py + compute/main.py orchestration populates osap_signals_missing_from_dataset | 116 | +4 (helper unit tests) |
| 3 | 6391bdfe | Gate diagnostics wiring — compute/main.py populates osap_gate_diagnostics from gate_results + docs (CLAUDE.md schema line, PHASE_STATUS.md row 4 footer, SKILL.md new schema-versions row) | 86 | +2 (gate-diag round-trip) |
| Total | | | ~433 | +13 |

Architectural locks (carried forward)

  1. OsapGateDiagnostic — all 4 fields | None = None (refinement #3): no positional-required fields, so per-signal diagnostics serialize cleanly even for accepted signals (where rejection_reason is None).
  2. Set-diff helper placement (refinement #4): lives at compute/features/osap_replicate.py::signals_in_dataframe, mirroring the existing coverage_by_signal helper in the same module. Pure helper, no I/O, no logging, unit-testable in isolation.
  3. Graceful degradation preserved: both new fields reset to empty ([] / {}) in the OSAP-pipeline-failed except branch, then the or None idiom in the Metadata(...) constructor converts to None for the JSON output. Production output stays clean even if the OSAP fetch raises.
  4. Rule 16 (Top-5 = raw composite_score) intact: this PR is observability-only. osap_gate_diagnostics is metadata for debugging; osap_signals_missing_from_dataset is a list of strings. Neither touches scoring.
  5. PHASE3_WEIGHTS sum-to-1.0 invariant at compute/scoring/composite.py:43-45 untouched (Phase 4h architectural lock preserved).
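
The graceful-degradation lock (item 3) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in compute/main.py: `collect_osap_observability`, `fetch_present_signals`, and `failing_fetch` are hypothetical names, and the three-signal manifest stands in for config.OSAP_SIGNALS_100.

```python
from __future__ import annotations

from typing import Callable


def collect_osap_observability(
    fetch_present_signals: Callable[[], set[str]],
    manifest: list[str],
) -> dict[str, list[str] | None]:
    """Return the metadata fragment for the silent-drop field."""
    missing: list[str] = []  # initialized before the try block
    try:
        present = fetch_present_signals()
        missing = sorted(set(manifest) - present)
    except Exception:
        # Pipeline failed: reset so the field degrades to None below.
        missing = []
    # An empty list is falsy, so `or None` keeps the JSON output clean.
    return {"osap_signals_missing_from_dataset": missing or None}


def failing_fetch() -> set[str]:
    """Hypothetical fetch that simulates the OSAP pipeline raising."""
    raise RuntimeError("simulated OSAP fetch failure")


manifest = ["AOP", "BM", "Mom12m"]  # hypothetical 3-signal manifest
degraded = collect_osap_observability(failing_fetch, manifest)
partial = collect_osap_observability(lambda: {"BM"}, manifest)
complete = collect_osap_observability(lambda: set(manifest), manifest)
```

One consequence of the idiom is that a failed fetch and a fully covered manifest both serialize as None, so the field alone does not distinguish the two cases.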

Verification ladder

| Step | Command | Result |
|---|---|---|
| 1 | ruff check . | ✅ clean |
| 2 | pytest tests/ -m "not network" | 924 passed (911 prior + 13 new) |
| 3 | pytest -m network --run-network | (unchanged at 20; no new @network this PR) |
| 4 | python -m compute.output.schema_check | ✅ in-sync (snapshot regenerated in commit 1 via --update-snapshot) |
| 5 | python -c "from compute.main import run_weekly_compute; from compute.output.schemas import OsapGateDiagnostic; ..." | ✅ OK |
| 6 | git push -u origin claude/resume-quantrank-phase-4.5-Zh0pO | ✅ at 6391bdfe |
| 7 | Open PR as Draft | (this PR) |
| 8 | subscribe_pr_activity + STOP for user audit | ⏳ next |

Ask-first surfaces touched

NONE. Verified per AGENTS.md:

  • .github/workflows/compute-rankings.yml — UNTOUCHED
  • .github/workflows/ci.yml — UNTOUCHED
  • pyproject.toml — UNTOUCHED (no new dep)
  • Schema triple (schemas.py / types.ts / schema-snapshot.json) — moved IN LOCKSTEP in commit 1; python -m compute.output.schema_check in-sync post-edit; this is the ask-first surface that WAS authorized in advance via the plan-mode approval

Production impact after merge

Once the next weekly compute-rankings.yml cron runs 0.9.1-phase4h.2, two new diagnostic fields will appear in frontend/public/data/metadata.json:

{
  "version": "0.9.1-phase4h.2",
  // ... unchanged Phase 4h OSAP fields ...
  "osap_signals_missing_from_dataset": [
    "AOP", "AbnormalAccrualsPercent", "AccrualsBM", "Activism1", ...
    // ~78 entries (current production)
  ],
  "osap_gate_diagnostics": {
    "BM": {"pbo": 0.6, "dsr": -0.1, "sharpe": 0.05, "rejection_reason": "high_pbo"},
    "Mom12m": {"pbo": 0.4, "dsr": -0.3, "sharpe": 0.2, "rejection_reason": "low_dsr"},
    // ~22 entries (every signal that reached the gate)
  }
}

Part 2 will consume this data to make the threshold-calibration call.
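
As a rough idea of how a consumer might read these fields, the sketch below tallies gate outcomes from a trimmed copy of the illustrative payload above. `summarize_gate` is a hypothetical helper, and the "accepted" label for a null rejection_reason is an assumption made for this example only.

```python
import json
from collections import Counter

# Trimmed from the illustrative payload above (valid JSON, no comments).
metadata_json = """
{
  "version": "0.9.1-phase4h.2",
  "osap_signals_missing_from_dataset": ["AOP", "AccrualsBM"],
  "osap_gate_diagnostics": {
    "BM": {"pbo": 0.6, "dsr": -0.1, "sharpe": 0.05, "rejection_reason": "high_pbo"},
    "Mom12m": {"pbo": 0.4, "dsr": -0.3, "sharpe": 0.2, "rejection_reason": "low_dsr"}
  }
}
"""


def summarize_gate(metadata: dict) -> dict:
    """Tally gate outcomes; tolerate the fields being absent or null."""
    diagnostics = metadata.get("osap_gate_diagnostics") or {}
    missing = metadata.get("osap_signals_missing_from_dataset") or []
    reasons = Counter(
        diag.get("rejection_reason") or "accepted"
        for diag in diagnostics.values()
    )
    return {
        "reached_gate": len(diagnostics),
        "missing_from_dataset": len(missing),
        "by_outcome": dict(reasons),
    }


summary = summarize_gate(json.loads(metadata_json))
legacy_summary = summarize_gate({"version": "0.9.0-phase4h"})
```

Treating an absent or null field as empty keeps the same reader usable against pre-0.9.1 metadata files.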

Out of scope (deferred to Part 2 — separate PR after ≥1 week of data)

  • Threshold calibration (PBO_THRESHOLD, DSR_THRESHOLD, n_partitions tuning)
  • Manifest reconciliation (rename / drop / re-source OSAP_SIGNALS_100 against actual dataset surface)
  • Investigation of whether dl_port("op", "pandas") is even the right dataset call (dl_signal / dl_all_signals may carry the 78 missing)
  • Update .claude/skills/phase-4/osap-integration/PLAN.md acceptance criterion (aspirational 70% → data-grounded threshold)
  • Re-run + confirm acceptance against the calibrated threshold

Test plan

  • Commit 1 (schema delta) local — 924 passed offline, schema-check in-sync, backward-compat verified
  • Commit 2 (silent-drop wiring) local — helper unit-tested, import smoke OK
  • Commit 3 (gate diagnostics + docs) local — gate-diag round-trip + production cohort shape simulated
  • CI green on 6391bdfe (Python + Frontend + Vercel)
  • User audit: schema delta + Part-1/Part-2 split rationale + line citations
  • User authorizes Draft → Ready flip
  • (post-merge) Next weekly cron writes the new fields into production metadata.json
  • (post-merge, ~1 week) Part 2 PR opens with calibration decisions grounded in real diagnostic data

🤖 Drafted with Claude Code via the Anthropic SDK.


Generated by Claude Code

claude added 3 commits May 19, 2026 07:46
…s + silent-drop surface

Phase 4h.2 Part 1 (issue #116) — commit 1 of 3. Schema delta only;
orchestration wiring lands in commits 2 (silent-drop) + 3 (gate
diagnostics).

**SCHEMA_VERSION** ``0.9.0-phase4h`` → ``0.9.1-phase4h.2``.

PATCH bump per SKILL.md L305 lock: "Add a new optional field
(default = None) → patch". Both new fields are ``| None = None``
additive optional; legacy 0.9.0-phase4h JSONs deserialize cleanly
(verified by ``tests/test_output/test_schema_phase4h2.py::
test_metadata_backward_compat_with_0_9_0_payload``).

**New Pydantic model** (``compute/output/schemas.py``):

- ``OsapGateDiagnostic`` — 4 nullable fields per the locked
  refinement: ``pbo`` · ``dsr`` · ``sharpe`` · ``rejection_reason``.
  All default to ``None`` (no positional-required fields) so
  per-signal diagnostics serialize cleanly even for accepted
  signals (where ``rejection_reason`` is ``None``).
- Mirrored in ``frontend/lib/types.ts::OsapGateDiagnostic`` +
  regenerated ``frontend/lib/schema-snapshot.json``.

**Metadata additions** (both ``| None = None`` per the schema-
versioning rule):

- ``osap_signals_missing_from_dataset: list[str] | None`` — surfaces
  the silent-drop bug from #116. Production today shows 78/100
  manifest signals missing from the dataset surface; commit 2 will
  populate this field via ``compute_missing_signals`` in
  ``compute/features/osap_replicate.py``.
- ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] | None``
  — surfaces per-signal PBO/DSR/Sharpe/rejection_reason. Production
  today shows 22 signals reaching the gate with 0% acceptance;
  commit 3 will populate this dict via the existing ``gate_results``
  output of ``compute/validation/osap_validation.py::
  gate_osap_signals``.

**Registry updates**:

- ``compute/output/schema_check.py::TRACKED_MODELS`` — added
  ``OsapGateDiagnostic`` so the BaseModel-subclass tracking test
  (``tests/test_output/test_schema_check.py::test_A5_tracked_
  models_count_matches_schemas_module``) doesn't flag the new
  class as untracked.
- ``tests/test_config.py`` — SCHEMA_VERSION pin updated to
  ``0.9.1-phase4h.2``.
- ``tests/test_smoke.py`` — unchanged (``startswith("0.9.")``
  still passes).

**Tests** (``tests/test_output/test_schema_phase4h2.py``, 7
offline):

1. ``test_osap_gate_diagnostic_round_trip_with_all_fields`` — full
   field round-trip.
2. ``test_osap_gate_diagnostic_all_fields_default_to_none`` — empty
   construction validates per refinement #3 lock.
3. ``test_osap_gate_diagnostic_rejection_reason_taxonomy`` —
   canonical 4-value taxonomy round-trips (``high_pbo`` /
   ``low_dsr`` / ``insufficient_data`` / ``gate_failed``).
4. ``test_metadata_round_trip_with_new_fields_populated`` — end-to-
   end Metadata with both new fields filled.
5. ``test_metadata_backward_compat_with_0_9_0_payload`` — legacy
   payload deserializes; new fields default to ``None``.
6. ``test_metadata_new_fields_default_to_none`` — verbose
   restatement of the additive-optional contract.
7. ``test_metadata_extra_forbid_rejects_unknown_fields`` — locks
   the schema surface against silent field renames.

**Verification**:

- ``ruff check .`` → clean
- ``python -m compute.output.schema_check`` → in-sync (Python ↔
  TypeScript ↔ snapshot)
- ``pytest tests/ -m "not network"`` → **918 passed** (911 prior +
  7 new)

**Out of scope this commit** (commits 2 + 3):

- ``compute/features/osap_replicate.py`` set-diff helper for the
  silent-drop list (commit 2)
- ``compute/main.py`` orchestration wiring both fields (commits 2
  + 3)
- ``compute/main.py`` unit test for silent-drop pass-through
  (commit 2)
- Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW) (commit 3)

**Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged
(Rule 16 lock — observability-only).

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
…wiring

Phase 4h.2 Part 1 (issue #116) — commit 2 of 3. Populates the
``osap_signals_missing_from_dataset`` field landed in commit 1's
schema delta. Closes the observability gap that hid 78/100
manifest signals in the first production 0.9.0-phase4h run.

**Refinement 4 decision (locked)**: helper lives in
``compute/features/osap_replicate.py``, not inline in
``compute/main.py``. Rationale:

- Unit-testable in isolation (no orchestrator setup)
- Mirrors the existing ``coverage_by_signal`` helper one function
  above — same pure-helper shape (no I/O, no logging, no side
  effects)
- Keeps ``compute/main.py`` orchestration thin

**Helper** (``compute/features/osap_replicate.py``, +31 LOC):

```python
def signals_in_dataframe(df: pd.DataFrame) -> frozenset[str]:
    """Return unique ``signalname`` values present in an OSAP
    returns DataFrame. Phase 4h.2 Part 1 helper (issue #116)..."""
    if df.empty or "signalname" not in df.columns:
        return frozenset()
    return frozenset(df["signalname"].unique().tolist())
```

Defensive against empty DataFrame AND missing ``signalname``
column — both yield empty frozenset so the caller's set diff
reports the full manifest as missing (safe-by-default).

**Wiring** (``compute/main.py``, +27 LOC):

1. Import added to the existing
   ``from compute.features.osap_replicate import (...)`` block
   (4-symbol import).
2. Variable initialized BEFORE the OSAP try block alongside the
   other observability accumulators:
   ``osap_signals_missing_from_dataset: list[str] = []``.
3. Inside the try, after ``fetch_osap_returns`` returns:
   ```python
   present_signals = signals_in_dataframe(osap_returns_raw)
   osap_signals_missing_from_dataset = sorted(
       set(config.OSAP_SIGNALS_100) - present_signals
   )
   if osap_signals_missing_from_dataset:
       logger.warning(
           "OSAP manifest signals not in dataset: %d/%d missing "
           "(first 5: %s)",
           len(osap_signals_missing_from_dataset),
           len(config.OSAP_SIGNALS_100),
           osap_signals_missing_from_dataset[:5],
       )
   ```
   Warning fires only when the set diff is non-empty (no log spam
   on a clean cron).
4. Reset to ``[]`` in the OSAP-pipeline-failed ``except`` branch
   so graceful degradation leaves every osap_* field at ``None``.
5. Wired into the ``Metadata(...)`` constructor with the same
   ``or None`` idiom Phase 4h established:
   ```python
   osap_signals_missing_from_dataset=(
       osap_signals_missing_from_dataset or None
   ),
   ```

**Tests** (``tests/test_features/test_osap_replicate.py``, +58
LOC, 4 new offline tests appended to the Phase 4h commit-2 suite):

1. ``test_signals_in_dataframe_empty_returns_empty_frozenset`` —
   empty DataFrame (correct schema, zero rows) → empty frozenset.
2. ``test_signals_in_dataframe_no_signalname_column_returns_empty_frozenset``
   — defensive against the schema-drift case where the
   ``signalname`` column itself disappears upstream.
3. ``test_signals_in_dataframe_unique_signals_dedup`` — multi-row
   input with duplicates → set dedups.
4. ``test_signals_in_dataframe_setdiff_with_manifest_simulates_silent_drop``
   — end-to-end simulation of the issue-#116 silent-drop: 5-signal
   manifest, 2-signal dataset, set diff = sorted missing-3
   (``["AOP", "AccrualsBM", "ChEQ"]``).

Helper-level integration tests are sufficient for commit 2;
``compute/main.py`` orchestration-test deferred to the next weekly
cron (the production output IS the integration test).
``compute/main.py`` import smoke verified locally:
``python -c "from compute.main import run_weekly_compute; ..."``
→ OK.
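
The silent-drop simulation in test 4 can be approximated without pandas. In this sketch `signal_names` is a hypothetical stand-in for ``signals_in_dataframe`` that works on a list of row dicts; the two present signals (``BM``, ``Mom12m``) and the ``date`` / ``ret`` keys are assumptions, since the commit message names only the three missing signals.

```python
def signal_names(rows: list) -> frozenset:
    """Pandas-free stand-in: unique signalname values across rows."""
    return frozenset(
        row["signalname"] for row in rows if "signalname" in row
    )


# Hypothetical 5-signal manifest; the real one is config.OSAP_SIGNALS_100.
manifest = ["AOP", "AccrualsBM", "BM", "ChEQ", "Mom12m"]

# Hypothetical fetch result: two signals present, one on two dates.
rows = [
    {"signalname": "BM", "date": "2026-03-31", "ret": 0.01},
    {"signalname": "BM", "date": "2026-04-30", "ret": -0.02},
    {"signalname": "Mom12m", "date": "2026-04-30", "ret": 0.03},
]

present = signal_names(rows)
missing = sorted(set(manifest) - present)

# With no rows at all, the whole manifest is reported as missing.
missing_when_empty = sorted(set(manifest) - signal_names([]))
```

Sorting the difference gives a stable order for the JSON output, so successive weekly files can be compared directly.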

**Verification**:

- ``ruff check .`` → clean
- ``python -m compute.output.schema_check`` → in-sync (NO schema
  delta this commit — field already landed in commit 1)
- ``python -c "from compute.main import run_weekly_compute, ...,
  signals_in_dataframe"`` → OK
- ``pytest tests/ -m "not network"`` → **922 passed** (918 prior +
  4 new helper tests)

**Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged
(Rule 16 lock — observability-only).

**Out of scope this commit** (commit 3):

- ``osap_gate_diagnostics`` orchestration (per-signal PBO/DSR/
  Sharpe/rejection_reason populated from ``gate_results``)
- Docs updates (PHASE_STATUS / SKILL / CLAUDE / WORKFLOW)
- PR body + open Draft PR + STOP at verification ladder step 8

**Production observation after this lands**:

Once the next weekly cron runs ``0.9.1-phase4h.2``, the
``osap_signals_missing_from_dataset`` field will surface the 78
silent-drop signals from #116 (or whatever the current count is —
manifest reconciliation in a later sub-PR may reduce it).

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
**FINAL** commit of the Phase 4h.2 Part 1 3-commit cluster (issue
#116). Populates the ``osap_gate_diagnostics`` field landed in
commit 1's schema delta + docs the full Part-1 surface so reviewers
+ future maintainers see the schema and observability contract in
one place.

**`compute/main.py` wiring** (+23 LOC):

1. Import added to ``from compute.output.schemas import (...)``:
   ``OsapGateDiagnostic`` inserted alphabetically between ``Metadata``
   and ``PillarScores`` (schemas import already used at this site,
   no new module touched).
2. Variable initialized BEFORE the OSAP try block:
   ``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] = {}``.
3. Populated inside the try after
   ``gate_results = gate_osap_signals(osap_ls, requested_signals=
   config.OSAP_SIGNALS_100)`` and BEFORE
   ``filter_accepted_signals`` — captures EVERY signal that reached
   the gate (both accepted and rejected). Accepted carry
   ``rejection_reason=None``; rejected carry one of the canonical
   taxonomy values (``high_pbo`` / ``low_dsr`` / ``insufficient_data``
   / ``gate_failed``) per
   ``compute/validation/osap_validation.py::GateResult``.
4. Reset to ``{}`` in the OSAP-pipeline-failed ``except`` branch so
   graceful degradation continues to leave every osap_* field at
   ``None``.
5. Wired into the ``Metadata(...)`` constructor with the established
   ``or None`` idiom:
   ```python
   osap_gate_diagnostics=osap_gate_diagnostics or None,
   ```
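
A minimal sketch of step 3, assuming a ``GateResult`` shape that this commit message does not show. The ``accepted`` flag, the stdlib dataclasses (used here in place of the Pydantic model), ``to_diagnostics``, and the sample values for the accepted signal are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Hypothetical shape; the real class lives in osap_validation.py."""
    accepted: bool
    pbo: float | None = None
    dsr: float | None = None
    sharpe: float | None = None
    rejection_reason: str | None = None


@dataclass(frozen=True)
class OsapGateDiagnostic:
    """Dataclass stand-in for the Pydantic model of the same name."""
    pbo: float | None = None
    dsr: float | None = None
    sharpe: float | None = None
    rejection_reason: str | None = None


def to_diagnostics(
    gate_results: dict[str, GateResult],
) -> dict[str, OsapGateDiagnostic]:
    """Record every signal that reached the gate, accepted or rejected."""
    return {
        name: OsapGateDiagnostic(
            pbo=result.pbo,
            dsr=result.dsr,
            sharpe=result.sharpe,
            rejection_reason=result.rejection_reason,
        )
        for name, result in gate_results.items()
    }


gate_results = {
    "BM": GateResult(
        accepted=False, pbo=0.6, dsr=-0.1, sharpe=0.05,
        rejection_reason="high_pbo",
    ),
    # Illustrative accepted signal; values are made up for the example.
    "Mom12m": GateResult(accepted=True, pbo=0.2, dsr=0.4, sharpe=0.9),
}

# Capture happens before filtering, so rejected signals stay visible.
diagnostics = to_diagnostics(gate_results)
accepted = {name for name, result in gate_results.items() if result.accepted}
```

Building the dict before any filtering is what lets a run with zero accepted signals still report why each one failed.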

**Tests** (``tests/test_output/test_schema_phase4h2.py``, +55 LOC, 2
new offline appended to commit 1's suite):

1. ``test_metadata_gate_diagnostics_round_trip_with_production_cohort_shape``
   — simulates the production observation from #116 (22 signals
   reach the gate, all rejected with a mix of rejection_reason
   values across the canonical 4-value taxonomy); asserts the
   dict-of-OsapGateDiagnostic structure survives ``model_validate``
   → ``model_dump`` → ``model_validate`` round-trip.
2. ``test_metadata_gate_diagnostics_accepted_signal_has_null_rejection_reason``
   — locks the ``rejection_reason=None`` semantics for accepted
   signals (Pydantic preserves None rather than coercing to a
   sentinel string).

**Docs** (atomic with the wiring):

- ``CLAUDE.md`` ``## Phase status`` — schema line updated to
  ``0.9.1-phase4h.2`` with the PATCH-bump framing; preserved the
  prior MINOR-bump history (`0.8.0-phase4.5f` → `0.9.0-phase4h` via
  PR #112).
- ``PHASE_STATUS.md`` row 4 — Phase 4h.2 Part 1 sub-status added;
  describes both new fields, the Part-1 / Part-2 split rationale
  ("Part 2 opens after ≥1 week of production diagnostic data
  accumulates"), and the "no new veto / no rank change" invariant.
- ``SKILL.md`` schema-versions table — new row for
  ``0.9.1-phase4h.2`` inserted above the ``0.9.0-phase4h`` row;
  cites the SKILL.md L305 PATCH-bump quote verbatim, locks the
  ``OsapGateDiagnostic`` "all 4 fields explicit = None" refinement
  in writing, and documents the set-diff helper placement decision
  (``compute/features/osap_replicate.py::signals_in_dataframe``
  per refinement #4).
- ``WORKFLOW.md`` — unchanged; no "Open items" checkbox list for
  Phase 4h.2 yet (would be created when Part 2 is scoped).

**Verification ladder** (steps 1-5 complete):

- ``ruff check .`` → clean ✅
- ``pytest tests/ -m "not network"`` → **924 passed** (911 baseline
  + 13 new across the 3-commit cluster: 7 schema + 4 helper + 2
  gate-diagnostic) ✅
- ``python -m compute.output.schema_check`` → in-sync (no new
  schema delta this commit; the snapshot already captured both
  fields + ``OsapGateDiagnostic`` from commit 1's regen) ✅
- ``python -c "from compute.main import run_weekly_compute;
  from compute.output.schemas import OsapGateDiagnostic; ..."``
  → OK ✅

Steps 6-8 next: ``git push`` → open Draft PR → ``subscribe_pr_activity``
+ STOP for user audit + Mark-Ready authorization.

**Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged.
**Schema version**: ``0.9.1-phase4h.2`` (locked from commit 1).

**Cluster summary**:

| # | SHA | LOC | Tests added |
|---|---|---|---|
| 1 — schema delta | ``428729ad`` | 231 | +7 (round-trip + backward-compat) |
| 2 — silent-drop wiring | ``c7949403`` | 116 | +4 (helper unit tests) |
| 3 — gate diagnostics + docs (this) | TBD | ~86 | +2 (gate-diag round-trip) |
| **Total** | — | ~433 | **+13** |

Within the Option-β diagnostic-first scope (~250-350 LOC budget; +
docs); under the original plan's ~300 LOC estimate.

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
@vercel

vercel Bot commented May 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| quantrank | Ready | Preview, Comment | May 19, 2026 8:15am |

@dackclup dackclup marked this pull request as ready for review May 19, 2026 08:38
@dackclup dackclup merged commit 2125aea into main May 19, 2026
4 checks passed
@dackclup dackclup deleted the claude/resume-quantrank-phase-4.5-Zh0pO branch May 19, 2026 08:39
dackclup added a commit that referenced this pull request May 19, 2026
…manifest + 6 offline tests (#119)

Phase 4j scout PR — 3rd of 4 factor-library scouts (OSAP ✅ #110, JKP ✅ #114, Qlib THIS, IPCA next as 4k). Ships `pyqlib` install + Alpha158 158-feature manifest + 6 offline tests. NO production wiring; yfinance-to-Qlib BYO adapter + full Alpha158 compute on 502-ticker universe deferred to follow-on integration PR.

5 pre-plan investigations (all verified 2026-05-19):

1. PyPI package: `pyqlib` 0.9.7 (canonical). Alternative names (`qlib`, `microsoft-qlib`) return 404.
2. License: MIT via wheel METADATA classifier. No CC BY-NC complication like JKP — safe for Phase 6+ commercial roadmap.
3. Data init: `qlib.init(provider_uri=..., region="us")`. NO public US data bundle — Qlib's default covers CN A-share only; US universe is BYO via local .bin files.
4. Alpha158 surface: `qlib.contrib.data.handler.Alpha158` → 158 columns; manifest captured via `Alpha158DL.get_feature_config()[1]` and hardcoded; offline test 3 locks against upstream drift.
5. CI install footprint: ~150-180 MB net-new (mlflow / lightgbm / cvxpy / pymongo / redis / gym / jupyter + nbconvert transitives). One-time cold-start; pip wheel caching mitigates subsequent runs.

Critical scope decisions:

- NO @network test for this scout — Qlib has no remote CDN; data flow is local-bin filesystem I/O. Originally planned synthetic-OHLCV→bin→init→Alpha158 smoke test was dropped because pyqlib's PyPI wheel doesn't bundle `scripts/dump_bin.py`. Replacement: manifest-vs-runtime-introspection drift detector (stronger than the dropped test — fires on every pip install upgrade if Qlib changes the feature set).
- Module name `compute/ingest/qlib_features.py` (NOT `qlib.py`) — Python import resolution would shadow the installed `qlib` package, breaking the entire factor-library integration. Distinct module name avoids namespace collision.
- Tenacity NOT applied — Qlib's data flow is local filesystem I/O, no network retry semantics needed. First ingest module in QuantRank that diverges from the canonical `compute/ingest/osap.py:52-56` retry decorator (documented in module docstring).

Module layer (compute/ingest/qlib_features.py, ~186 LOC):
- `QLIB_DATA_CACHE: Path` constant (gitignored via parent `compute/cache/`)
- `QLIB_INSTRUMENTS_UNIVERSE = "sp500"` (custom universe for future BYO bundle)
- `ALPHA158_FEATURE_NAMES: tuple[str, ...]` — 158 hardcoded entries, asserted at module load
- `init_qlib(provider_uri=None)` — thin wrapper around `qlib.init(region="us")`; idempotent
- `fetch_alpha158_features(*, instruments, start_time, end_time)` — Alpha158 handler wrapper

Config layer (compute/config.py, +23 LOC):
- `QLIB_DATA_CACHE: Path = CACHE_DIR / "qlib" / "us_data"`
- `QLIB_DATA_MAX_AGE_DAYS: int = 31`
- `ALPHA158_FEATURE_COUNT: int = 158` (asserted against module manifest length)

Tests (6 offline; ~113 LOC):
1. `test_alpha158_feature_manifest_has_158_entries` — primary CI signal (pure cardinality + uniqueness, no Qlib runtime)
2. `test_alpha158_feature_manifest_first_5_anchor` — K-bar leading features anchor (KMID, KLEN, KMID2, KUP, KUP2)
3. `test_alpha158_feature_manifest_matches_runtime_introspection` — drift detector (manifest == `Alpha158DL.get_feature_config()[1]`)
4. `test_qlib_data_cache_constant_under_repo_cache_dir` — config sanity
5. `test_init_qlib_passes_us_region_and_provider_uri` — monkeypatch capture
6. `test_init_qlib_defaults_to_config_cache_when_no_uri` — default path verified
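
The drift-detector idea behind test 3 can be sketched generically. This is not the test in the repository: `diff_manifest` is a hypothetical helper, `runtime_head` simulates what a runtime call such as `Alpha158DL.get_feature_config()[1]` might return, and `KUP3` is an invented name used only to show a rename being caught.

```python
def diff_manifest(manifest: tuple, runtime_names: list) -> dict:
    """Compare a hardcoded manifest with names reported at runtime."""
    pinned, live = set(manifest), set(runtime_names)
    return {
        "removed_upstream": sorted(pinned - live),
        "added_upstream": sorted(live - pinned),
        # Same names in a different order is also a form of drift.
        "order_changed": pinned == live
        and list(manifest) != list(runtime_names),
    }


# First five Alpha158 names, per the anchor test; the rest are omitted.
MANIFEST_HEAD = ("KMID", "KLEN", "KMID2", "KUP", "KUP2")

# Hypothetical runtime output simulating an upstream rename of KUP2.
runtime_head = ["KMID", "KLEN", "KMID2", "KUP", "KUP3"]

in_sync = diff_manifest(MANIFEST_HEAD, list(MANIFEST_HEAD))
drifted = diff_manifest(MANIFEST_HEAD, runtime_head)
```

Reporting the removed and added names separately makes a failing run actionable, since the diff shows which manifest entries to update.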

pyproject.toml: `pyqlib>=0.9.7,<0.10` added to `[factors]` extra (authorized in advance via plan-mode approval; pin range because Qlib's API drifts across minor versions).

Ask-first surfaces touched:
- `pyproject.toml [factors]` — extended (authorized via plan-mode)
- `ci.yml` UNCHANGED (`[dev,factors]` install already covers new dep)
- `compute-rankings.yml` UNTOUCHED per user hard constraint
- Schema triple UNTOUCHED (no schema delta this scout)

Verification (local):
- ruff check . → clean
- pytest tests/ -m "not network" → 930 passed (924 prior + 6 new)
- python -m compute.output.schema_check → in-sync
- python -c "from compute.ingest.qlib_features import ..." → OK 158
- Vercel preview ✅ READY

Defense layer unchanged at 17. Top-5 rotation unchanged (no scoring touched). Schema unchanged at 0.9.1-phase4h.2.

After this merges → 3 of 4 factor-library scouts done. Phase 4k (IPCA) is the final scout; once 4k merges → eligible for `v1.1.0-phase4` tag.

Out of scope (deferred to follow-on full Phase 4j integration PR, ~5-commit cluster):
- yfinance-to-Qlib BYO adapter (~150 LOC + custom S&P 500 instruments universe registration)
- Full Alpha158 feature compute on 502-ticker universe → 502 × N_dates × 158 DataFrame
- Per-feature cross-validation framework (PBO/DSR doesn't apply to per-stock-per-date features; walk-forward IC scoring per feature is the likely replacement)
- Schema additions (StockDetail.qlib_features + Metadata.qlib_features_used + IC observability) → schema bump 0.9.1-phase4h.2 → 0.10.0-phase4j
- compute/main.py wiring decision (observability-only? blended into composite? Phase-5 ML-meta-learner-only consumer?)

Audit history:
- Plan-audit round 1: 5 pre-plan investigations verified · MIT lock · heavy-deps disclosure approved
- Plan-audit rounds 2-5: same plan re-paste loop (session-side stuck); main session verified PR #119 unchanged at each check
- Pre-CI audit: clean (1 legitimate pivot — test #6 swapped from end-to-end smoke to manifest drift detector because pyqlib wheel lacks scripts/dump_bin.py)
- Conditional Mark-Ready authorization given on Vercel ✅ + mergeable_state clean
- Squash merged per "merge call is yours" delegation pattern (PR #112 / #114 / #118 precedent)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup added a commit that referenced this pull request May 20, 2026
Part of epic #125 (Item #4 of 6). Doc-only PR — no code changes,
no schema delta, no test additions.

Phase 4h timeline (2026-05-18 → 2026-05-19) demonstrated the cost of
shipping production wiring + gate logic without a diagnostic surface:

- PR #112 (Phase 4h): OSAP signal replication + PBO/DSR gate + Path-b
  blend, NO observability surface for gate decisions
- First production cron: every signal failed gate, no way to know why
- PR #118 (Phase 4h.2 Part 1): retrofit diagnostic surface
  (osap_signals_missing_from_dataset + osap_gate_diagnostics)
- Second production cron: 22 missing + 22 fail low_dsr, 56 silently
  dropped (gap that Part 1 still couldn't fully expose)
- PR #124 (Phase 4h.2 Part 2): root-cause fix (multi-port adapter)
  + osap_signals_dropped_no_long_short closing the accounting gap

The combined cost of Phase 4h.2 Parts 1 + 2 (~10 hours across 2 PRs)
would have been ~30 minutes of additional Phase 4h scope if the
diagnostic surface had shipped alongside the production wiring.

Files (3 changed, +83 LOC)
---------------------------
- WORKFLOW.md (+63 LOC) — new section "# Observability-Before-Wiring
  Pattern" inserted between the mobile playbook table and the
  "Initial Prompts" section. Includes mandatory checklist (6 items)
  + anti-pattern statement + 3 reference precedents (PR #112 bad,
  PR #118 good, PR #124 good)
- SKILL.md (+14 LOC) — new "Rule 18: Observability-before-wiring"
  appended to the Core Behavior Rules section (Rule 17 was the prior
  trailing rule). Links back to WORKFLOW.md for the mandatory
  checklist detail
- CLAUDE.md (+6 LOC) — 1 bullet added to ## Conventions referencing
  the new Rule 18 + WORKFLOW.md section

Files NOT touched (deliberately per scope)
-------------------------------------------
- PHASE_STATUS.md — chronological log; pattern guidance belongs in
  WORKFLOW.md / SKILL.md / CLAUDE.md, not in the historical tracker
- AGENTS.md — cross-tool agent doc; lookups defer to WORKFLOW.md
  by default, so a fresh duplicate would just create drift risk
- compute/ / frontend/ / tests/ — doc-only PR, no behavior change

Constraints honored
-------------------
- No code changes — pure markdown additions
- No schema delta — schema_check confirms in-sync
- No test additions — pytest count unchanged at 959
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger (compute-rankings.yml untouched)

Verification ladder all green
------------------------------
- ruff check . → All checks passed
- python tools/check_doc_test_counts.py → exit 0 (no new hardcoded
  test-count claims introduced — the precedents reference PRs and
  hour estimates, not "N offline + M @network" drift patterns)
- python -m compute.output.schema_check → in sync (no schema touch)
- python -m pytest tests/ -m "not network" → 959 passed (unchanged)

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…ble skills (#132)

3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR —
no code, no schema delta, no test additions.

Task A — SKILL.md schema-version table fixes
---------------------------------------------
Two stale "in flight" entries flipped to merged + 1 new row inserted:

- Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged
  2026-05-19)"
- Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged
  2026-05-19)"
- NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port
  OSAP adapter + osap_signals_dropped_no_long_short field, closing
  the 100-signal accounting equation; DSR sign-inversion deferred to
  Part 3

PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this
PR" staleness — confirmed via grep but DELIBERATELY not updated here
per Task A explicit scope (SKILL.md only). Recommend a follow-up
phase-status-bump PR after this lands.

Task B — New worker-session-handoff skill
------------------------------------------
.claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML
frontmatter + 5 sections:

- When to use vs inline (≤50 LOC single-file → inline; ≥2 files /
  new dep / code logic → handoff)
- Constraint lock library (8 standard locks: composite/PHASE3,
  Rule 16, Rule 18, no-merge, no force-push, no --no-verify,
  no workflow_dispatch, schema triple)
- Anti-pattern: paste-loop avoidance (single outer code-block
  fence; reference PR #123 as related-but-distinct paste-loop
  failure mode)
- Template (paste-ready, single ```` outer code block with
  language tag ` text` so inner triple-backticks pass through)
- Reference invocations + QuantRank precedents (PR #124, #127, #131)

Codifies the handoff shape that appeared verbatim across PRs #123,
#124, #127, #128, #129, #131 — user copies ONE block instead of
editing 5 template snippets per handoff.

Task C — Portable skills library (4 skills, +417 LOC)
-----------------------------------------------------
Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md +
WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131.
Identified 7 candidate patterns; classified by portability:

- ✅ scout-then-integrate (portable; vendoring pattern, no QR logic)
- ✅ observability-before-wiring (portable; gate-diagnostic pattern)
- ✅ drift-detector-manifest (portable; API surface lock pattern)
- ✅ schema-triple-lockstep (portable; Python/TS JSON contract)
- 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED
   to follow-up issue, lower value vs the 4 shipped)
- 🟡 pre-plan-investigations (subsumed by scout-then-integrate's
   Phase 1 § "Pre-plan investigations" — no separate skill needed)
- 🟡 graceful-degradation-try-except (portable; error-handling
   pattern — DEFERRED to follow-up issue, the wrapper is generally
   1-line so doesn't warrant a dedicated skill)

4 shipped (each ≤ 109 LOC):
  .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC)
  .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC)
  .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC)
  .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC)

Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from
`.claude/skills/`) because Claude Code's skill registry doesn't
recurse into nested subdirectories per CLAUDE.md ## Conventions.
Confirmed via session reload — all 4 portable + worker-session-
handoff registered correctly.

Each portable skill has:
- YAML frontmatter (name + description + TRIGGER + SKIP)
- ## Pattern section (generic, no QR business logic)
- ## Trigger conditions + ## Skip conditions
- ## QuantRank precedent (1 paragraph, clearly labeled as precedent
  not pattern definition)

Task C constraint check:
- All portable skills' core pattern descriptions are project-agnostic
  (read `.claude/skills/portable-*/SKILL.md` ## Pattern
  sections — zero references to OSAP / IPCA / pillar / Top-5
  inside the pattern body; only inside the labeled "QuantRank
  precedent" section at the bottom)
- 3 of 4 portable skills are 103-109 LOC (slightly over the
  100-LOC target — pattern + trigger + skip + precedent sections
  require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold).
  The 99-LOC one (scout-then-integrate) shows the cap is achievable
  but tight.

Files (6 changed, +580 LOC, no deletions)
------------------------------------------
- SKILL.md — schema-version table fixes (Task A)
- 5 new SKILL.md files in .claude/skills/ (Tasks B + C)

Verification ladder all green
------------------------------
- ruff check . → All checks passed
- python tools/check_doc_test_counts.py → exit 0
- python tools/check_branch_collisions.py "skill" "portable" →
  expected ⚠️ on #131 (own adjacent work, not a duplicate)
- python -m compute.output.schema_check → in sync (no schema touch)
- python -m pytest tests/ -m "not network" → 959 passed
  (unchanged; tools/ + .claude/skills/ aren't imported by tests)
- Claude Code skill registry pick-up verified via session reload —
  all 5 new skills (worker-session-handoff + 4 portable-*) appear
  in the available-skills list

Constraints honored
-------------------
- No touch to compute/ / frontend/ / tests/
- No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope =
  SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up)
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- Task C portable skills are project-agnostic in their pattern
  description (QR refs confined to labeled "precedent" sections)

Follow-up issue (to file post-merge)
------------------------------------
Title: "Portable Skills Library — extract remaining tacit patterns"
- annotate-before-veto (progressive rule rollout)
- graceful-degradation-try-except (1-line wrapper guidance)
- pre-plan-investigations as standalone (currently subsumed)
- Anything else surfaced by future PR descriptions

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…sk C.1 recovery) (#135)

* docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills

* docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md

Recovers Task C.1 of the original handoff, which was silently dropped in
the prior PR #132 commit (50da720). The handoff explicitly named
"Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable
skills library; the auditor session caught the omission and authorized
this follow-up commit on the existing branch.

Files (2 new, +138 LOC)
------------------------
- .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC) —
  vendored content of upstream skills/karpathy-guidelines/SKILL.md
  (67 LOC, byte-for-byte preserved) + 15-line appended attribution
  block referencing the upstream source, commit SHA, and the
  Karpathy tweet that motivated the guidelines.

- THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root) — third-party
  license disclosures. Section "karpathy-guidelines (Claude Code
  skill)" carries source URL, license declaration, vendored path,
  vendored date, upstream commit SHA, upstream first-commit date,
  and the full standard MIT License text with copyright attributed
  to "multica-ai contributors" (upstream has no individual copyright
  line and no standalone LICENSE file; the `license: MIT` claim
  appears in upstream README.md § License and each skill's YAML
  frontmatter).

Upstream provenance
-------------------
- Source: https://github.com/multica-ai/andrej-karpathy-skills
- Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2
- Upstream first commit: 2026-01-27
- Vendored date: 2026-05-20
- License: MIT (declared)

Verbatim content preserved
--------------------------
`diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md
.claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY
the 15-line appended attribution block at lines 68-82. The upstream
67-line content (YAML frontmatter + "Karpathy Guidelines" heading +
the 4 principles) is byte-for-byte unchanged. Per the spec
constraint: "Keep the 4 principles verbatim. The only edit allowed is to
'append' an attribution block at the end of the file".
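
The append-only property described above can also be checked mechanically rather than by reading a diff. The following is a minimal sketch, assuming both files are read as bytes; the helper name `appended_suffix` and the sample byte strings are hypothetical and do not come from the repository.

```python
from typing import Optional


def appended_suffix(upstream: bytes, vendored: bytes) -> Optional[bytes]:
    """Return the bytes appended after the upstream content, or None if the
    vendored copy does not begin with the upstream bytes unchanged."""
    if not vendored.startswith(upstream):
        return None
    return vendored[len(upstream):]


# Illustrative stand-ins for the two SKILL.md files (not the real content).
upstream = b"---\nname: example\n---\n# Guidelines\n"
vendored = upstream + b"\n## Attribution\nSource: upstream repository\n"

print(appended_suffix(upstream, vendored) is not None)    # True
print(appended_suffix(upstream, b"edited " + vendored))   # None
```

A non-None result means the upstream content survives as an exact prefix, so the returned suffix is the only part that needs review.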

License-disclosure caveat
-------------------------
Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README
+ YAML frontmatter but does NOT ship a standalone LICENSE file. The
`THIRD_PARTY_NOTICES.md` entry includes the standard MIT License
template with copyright attributed to the GitHub org ("multica-ai
contributors"), matching the principle that an MIT declaration
without a formal copyright line still licenses to the redistributor;
the attribution is conservative.

Verification ladder all green
------------------------------
- ruff check . → All checks passed
- python tools/check_doc_test_counts.py → exit 0 (no test-count
  drift introduced by this commit)
- python tools/check_branch_collisions.py "karpathy" → no scope
  collisions detected
- python -m compute.output.schema_check → in sync (no schema touch)
- python -m pytest tests/ -m "not network" → 959 passed (unchanged;
  .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests)
- Skill registry pickup verified via session reload —
  `portable-karpathy-guidelines` appears in the available-skills list
  with the upstream description verbatim

Constraints honored
-------------------
- No squash / amend of the prior 50da720 commit — this is a fresh
  commit pushed on top of the existing branch (per spec
  "do not squash the old commit")
- No touch to the 4 already-shipped portable skills in 50da720
- No touch to compute/ / frontend/ / tests/
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- Karpathy SKILL.md upstream content preserved verbatim; only the
  attribution block appended below the original content

PR description update will follow as a separate `gh pr edit` /
MCP `update_pull_request` call so the new "License Compliance"
section + the audit-table row for karpathy-guidelines land in the
PR body.

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

---------

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…4 staleness (#139)

Closes #133. Docs/skills-only PR.

Task A — Portable skills library final 2 (closes #133)
------------------------------------------------------
Extracts the last 2 deferred-but-tracked patterns from epic #125:

- .claude/skills/portable-annotate-before-veto/SKILL.md (108 LOC):
  Progressive-rollout pattern for defense / risk flags. Ship as
  annotate FIRST, promote to veto only after ≥ 1 production cron of
  observation + threshold calibration + cohort-acceptance check.
  Forcing precedent: Phase 4.5 cluster (loss_avoidance_pattern at 0%
  fire rate would've been a no-op or hotfix candidate as a veto;
  annotate made it observable).

- .claude/skills/portable-graceful-degradation-try-except/SKILL.md
  (115 LOC): Wrap every external-data integration call site in a
  try/except that sets ALL related output fields to None on failure
  + writes a structured log line + sets a per-integration status
  Metadata field. 3-rule contract: no partial state, no log
  swallowing, downstream-aware. Forcing precedent: OSAP integration
  in compute/main.py (PRs #112/#118/#124).
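
The annotate-before-veto rollout in the first bullet can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual content: `Candidate`, `Flag`, `apply_flag`, and `fire_rate` are hypothetical names, and the threshold is made up.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Candidate:
    name: str
    score: float
    annotations: List[str] = field(default_factory=list)


@dataclass
class Flag:
    name: str
    fires: Callable[[Candidate], bool]
    mode: str = "annotate"  # promote to "veto" only after observation


def apply_flag(candidates: List[Candidate], flag: Flag) -> List[Candidate]:
    """Annotate mode labels flagged candidates; veto mode removes them."""
    kept = []
    for c in candidates:
        if flag.fires(c):
            if flag.mode == "veto":
                continue  # hard exclusion
            c.annotations.append(flag.name)  # observable, ranking untouched
        kept.append(c)
    return kept


def fire_rate(candidates: List[Candidate], flag: Flag) -> float:
    """Share of candidates the flag fires on; 0.0 suggests a no-op rule."""
    if not candidates:
        return 0.0
    return sum(1 for c in candidates if flag.fires(c)) / len(candidates)


pool = [Candidate("AAA", 0.9), Candidate("BBB", 0.1)]
low_score = Flag("low_score", fires=lambda c: c.score < 0.2)

print([(c.name, c.annotations) for c in apply_flag(pool, low_score)])
# [('AAA', []), ('BBB', ['low_score'])]

low_score.mode = "veto"
print([c.name for c in apply_flag(pool, low_score)])
# ['AAA']
```

Measuring `fire_rate` while the flag is still in annotate mode is what exposes a rule that never fires, before it is promoted to a veto.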

Both skills follow the established portable-* convention from PR
#132 (YAML frontmatter + Pattern + Trigger + Skip + QuantRank
precedent section). Each pattern section is project-agnostic;
QuantRank refs confined to the labeled "QuantRank precedent"
sections at the bottom.
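
A minimal sketch of the try/except contract from the graceful-degradation bullet, assuming a single fetch callable. The `Metadata` fields, the `run_integration` name, and the log keys are hypothetical and are not the fields used in compute/main.py.

```python
import json
import logging
from dataclasses import dataclass
from typing import Callable, Dict, Optional

logger = logging.getLogger("integration")


@dataclass
class Metadata:
    # Hypothetical field names; all integration outputs move together.
    signal_scores: Optional[Dict[str, float]] = None
    signal_count: Optional[int] = None
    integration_status: str = "not_run"


def run_integration(fetch: Callable[[], Dict[str, float]],
                    meta: Metadata) -> Metadata:
    try:
        scores = fetch()
        count = len(scores)  # compute everything before assigning anything
    except Exception as exc:
        # Rule 1, no partial state: every related field is reset to None.
        meta.signal_scores = None
        meta.signal_count = None
        meta.integration_status = "failed"
        # Rule 2, no log swallowing: emit one structured line.
        logger.warning(json.dumps({
            "event": "integration_failed",
            "error": type(exc).__name__,
            "detail": str(exc),
        }))
        return meta
    meta.signal_scores = scores
    meta.signal_count = count
    meta.integration_status = "ok"
    return meta


def unavailable() -> Dict[str, float]:
    raise ConnectionError("upstream unavailable")


result = run_integration(unavailable, Metadata())
print(result.integration_status, result.signal_scores)  # failed None
```

Downstream code can then branch on `integration_status` instead of inspecting each output field, which is one way to read the "downstream-aware" rule. Catching `Exception` broadly is a deliberate choice in this sketch; a real call site may prefer narrower exception types.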

Task B — PHASE_STATUS.md row 4 staleness fix
---------------------------------------------
PHASE_STATUS.md row 4 said "Phase 4h.2 Part 2 in flight in this PR"
since PR #124's prep work. PR #124 merged 2026-05-19 (commit
sequence visible in main: ...124...118...112...). Updated to
"Phase 4h.2 Part 2 merged via PR #124 (2026-05-19)" — the rest of
the row 4 text (multi-port OSAP adapter description, IC-decay
deferral note) stays unchanged.

This was flagged in PR #132 body and tracked as a small follow-up.
No other PHASE_STATUS.md edits — row 4 is the only stale entry.

Task C — Docs lockstep
-----------------------
CLAUDE.md row 33 skill count: 35 → 37 (QR-origin portable category
4 → 6, total reflects the 2 new skills landed here). Categorisation
unchanged otherwise; 9arm license-pending caveat still flagged with
cross-reference to issue #137.

Skill inventory after this PR (37 total)
-----------------------------------------
- QuantRank operational: 12
- QR-origin portable extract: 6 (was 4; +annotate-before-veto +
  graceful-degradation-try-except)
- Anthropic vendored: 6
- External MIT vendored: 9 (Karpathy + 8 mattpocock, unchanged)
- External license-pending vendored: 4 (9arm, unchanged)

Verification ladder
-------------------
- ruff check . → All checks passed
- python -m compute.output.schema_check → Schema snapshot in sync
- python tools/check_doc_test_counts.py → exit 0
- pytest tests/ -m "not network" → not run locally (sandbox missing
  pandas); CI will verify. Changes are docs/skills-only.
- Skill registry pickup verified via session reload — both
  portable-annotate-before-veto and
  portable-graceful-degradation-try-except register with full
  YAML-frontmatter descriptions.

Constraints honored
-------------------
- No touch to compute/ / frontend/ / tests/
- No touch to WORKFLOW.md (out of scope; could file a future
  follow-up if WORKFLOW.md needs to cross-reference the two new
  portable skills)
- No squash / amend of prior commits
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- The 2 new portable skills' pattern descriptions are project-agnostic;
  QR refs only in labeled "precedent" sections

Epic #125 status after this PR
-------------------------------
- #130 (quarterly cohort-threshold review tracker) — recurring,
  unchanged
- #133 (portable skills library remaining) — CLOSED by this PR
- #137 (9arm-skills license clarification) — external action,
  waiting on user to file upstream issue at thananon/9arm-skills

Epic #125 Item 3 (Pre-merge production simulation) remains the
only substantive open scope. PHASE_STATUS.md row 4 staleness was
the last housekeeping task.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…PR A) (#141)

First PR in the multi-PR .md optimization sequence (Option D scope:
overhaul). PR A is the low-risk baseline: fixes 2 broken skill
frontmatters that prevent dispatch + drift-fixes 4 stale facts in
agent docs.

Critical YAML fix:
- branch-collision-check/SKILL.md and pr-quality-gate/SKILL.md had
  multi-line `description:` plain-scalar frontmatter that PyYAML
  (and Claude Code's skill loader) couldn't parse because lines
  contain `#123` / `#X` issue references after whitespace — YAML
  treats ` #` as a comment marker, so everything after the first
  comment-trigger got eaten and the loader fell back to displaying
  `name: name` in the available-skills list. Both skills were
  effectively undispatchable from any session.
- Fix: change `description:` to `description: >` (folded block
  scalar) so newlines become spaces and `#` mid-content is treated
  as literal text. Verified live in this session — system reminder
  now shows the full TRIGGER/SKIP descriptions for both.
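
The comment rule behind this fix can be illustrated with a deliberately simplified, stdlib-only model. It is not a YAML parser: it ignores chomping, indentation, and blank-line handling, and does not reproduce the loader failure itself. It only shows where a plain scalar's value is cut and why a folded block scalar keeps the `#`. The function names and the sample text are hypothetical; a real check would load the frontmatter with an actual YAML library.

```python
import re
from typing import List


def plain_scalar(text: str) -> str:
    """Simplified model of a plain (unquoted) scalar: '#' preceded by
    whitespace starts a comment, so the value ends there."""
    return re.split(r"\s+#", text, maxsplit=1)[0].strip()


def folded_scalar(lines: List[str]) -> str:
    """Simplified model of a folded block scalar (`description: >`):
    line breaks fold into spaces and '#' is ordinary text."""
    return " ".join(line.strip() for line in lines)


# Illustrative description text, not the real skill frontmatter.
lines = ["Run before opening a PR that", "touches #123 or a sibling issue"]

print(plain_scalar(" ".join(lines)))
# Run before opening a PR that touches
print(folded_scalar(lines))
# Run before opening a PR that touches #123 or a sibling issue
```

Note that `#` with no whitespace before it (for example `issue#123`) does not start a comment, which is why only some descriptions were affected.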

Stale-fact pass:
- .claude/skills/README.md L14-16: "27 invocation-triggerable
  skills" → references CLAUDE.md as the canonical count (38) to
  prevent future drift. Future top-level skill add/remove only
  needs to bump CLAUDE.md §Layout, not three files.
- AGENTS.md L104: ".claude/skills/ # 24 loaded skills" → 38.
- AGENTS.md L287: "Schema version: 0.8.0-phase4.5f" → 0.9.2-phase4h.2
  (3 versions behind). Now references SKILL.md schema-version
  table for full history.
- CLAUDE.md L181-192 (§Phase status): "Current schema 0.9.1-phase4h.2
  ... Phase 4h in flight in PR #112" → 0.9.2-phase4h.2 + Phase 4h
  shipped (Parts 1+2 done via #112/#118/#124).
- CLAUDE.md + AGENTS.md §Phase status: "Epic #125 Item 3 in flight
  via PR #140" → "PR 1 of 2 shipped" at commit a52aa2d; PR 2
  remaining.

CLAUDE.md + AGENTS.md edit ships per the lockstep convention. No
code touched, no schema touched — pre-merge-prod-sim.yml won't
trigger (paths compute/scoring + compute/features unaffected).

Next in optimization sequence: PR B (CLAUDE.md token diet) — TBD
after user reviews this one.

Co-authored-by: Claude <noreply@anthropic.com>