feat(features): IPCA scout — ipca MIT install + InstrumentedPCA 8-method API surface lock + 6 synthetic-fixture tests by dackclup · Pull Request #123 · dackclup/quantrank

dackclup · 2026-05-19T15:35:24Z

Summary

Phase 4k scout — 4th and final of 4 factor-library scouts (4h OSAP ✅, 4i JKP ✅, 4j Qlib ✅, 4k IPCA). Installs ipca==0.6.7 (MIT, Buechner+Bybee 2019, github.com/bkelly-lab/ipca) and locks the InstrumentedPCA public API surface at module load.

Reference: Kelly, Pruitt, Su (2019). "Characteristics are covariances: A unified model of risk and return." JFE 134(3), 501-524.

After this scout merges, all 4 factor-library scouts complete and v1.1.0-phase4 tag becomes eligible (gated on the 4h.2 Part 2 / 4i.1 / 4j.1 / 4k.1 integration PRs landing — ~6-8w combined effort, separate session per CLAUDE.md multi-session audit pattern).

Structural distinctness vs prior 3 scouts

Scout	Data shape	Computation	Per-stock surface
4h OSAP	factor returns CSV	downloaded	proxy / 36m regression
4i JKP	factor returns CSV	downloaded	36m regression
4j Qlib	per-stock per-date features	computed locally from OHLCV	native Alpha158
4k IPCA	panel (N × T × L characteristics)	sklearn-style estimator: characteristics → latent factor loadings	`Gamma` (L×K loadings) + `Factors` (K×T returns) from ALS decomposition

5 pre-plan investigations (verbatim, 2026-05-19, against `ipca==0.6.7`)

PyPI canonical name — pip index versions ipca → ipca 0.6.7 latest. ipca-py / pyipca 404. Last upstream release 2021-04-22 — ~5 years stale → pinned to 0.6.x band.
License: MIT (verbatim from ipca-0.6.7.dist-info/LICENSE.md): MIT License - Copyright (c) [2019] [Matthias Buechner, Leland Bybee]. Same as Qlib 4j, unlike JKP 4i's CC BY-NC 4.0 → no Phase 6+ commercial complication.
sklearn-style API surface (extracted from wheel ipca/ipca.py) — 8 public methods on InstrumentedPCA(BaseEstimator):
```
fit · get_factors · fit_path · predict · predict_panel ·
predict_portfolio · score · predictOOS
```
Notable divergence: NO transform / fit_transform (sklearn TransformerMixin pattern absent). Use fit + .Gamma/.Factors attrs + predict_panel() for the panel-prediction path. RegressorMixin imported in source but unused.
Data requirements (from maintainer's ipca/test_ipca.py):
- Panel: pandas DataFrame with MultiIndex (entity, time), or numpy ndarray + explicit indices arg.
- Min stable size: maintainer uses 10 firms × 20 years × 2 chars; our scout fixture 5 × 30 × 10 is comfortably above the floor.
- Unbalanced panels + interior NaNs supported.
- For 502-ticker universe (integration-PR scope), data_type="portfolio" is the recommended scaling path (ALS on Q matrix, not raw panel).
CI install footprint — net-new transitives over [factors] baseline:
- numba (~50 MB w/ llvmlite ~30 MB) + progressbar (~50 KB)
- ~50-80 MB total — substantially lighter than Qlib's 150-180 MB.
- scipy / joblib / scikit-learn already in tree via Phase 4h/4i.

NO `@network` test (deliberate)

IPCA is a pure local sklearn-style decomposition — there is no remote endpoint to retry against (unlike OSAP 4h's Chen-Zimmermann CDN or JKP 4i's S3 bucket). Mirrors Phase 4j Qlib rationale at compute/ingest/qlib_features.py:23-30. Scout ships 6 offline tests / 0 @network.

Files (5 changed, +380 / −1)

compute/features/ipca_factors.py (NEW, ~140 LOC) — init_ipca() factory + fit_ipca_panel() wrapper + INSTRUMENTED_PCA_PUBLIC_API 8-method tuple with module-load assert against config.IPCA_PUBLIC_API_METHOD_COUNT. NOT tenacity-wrapped (no network).
tests/test_features/test_ipca_factors.py (NEW, ~190 LOC) — 6 offline tests with inline @pytest.fixture synthetic 5×30×10 panel (np.random.RandomState(42), pandas MultiIndex shape matches maintainer's canonical ipca/test_ipca.py). 4/6 tests use pytest.importorskip("ipca") for graceful skip when [factors] extra absent.
compute/config.py — Phase 4k block: IPCA_FITTED_ARTIFACTS_CACHE, IPCA_FITTED_ARTIFACTS_MAX_AGE_DAYS=31, IPCA_PUBLIC_API_METHOD_COUNT=8.
pyproject.toml — append ipca>=0.6.7,<0.7 to [factors] (after pyqlib); pinned to 0.6.x band due to upstream staleness.
PHASE_STATUS.md row 4 — promote 4j scout to ✅ shipped (PR feat(ingest): Qlib scout — pyqlib MIT install + Alpha158 handler smoke + 158-feature manifest #119) and mark 4k scout in-flight.

Out of scope (deferred to ~Phase 4k.1 integration PR)

Characteristics-matrix construction (which Phase 3 features feed IPCA's X matrix?)
Universe-wide fit on 502 × N_dates × ~30 panel (data_type="portfolio" recommended)
Walk-forward / rolling-window refit cadence (monthly? quarterly?)
Latent-factor integration into composite (annotate-only? blend?)
Schema additions (StockDetail.ipca_loadings, Metadata.ipca_in_sample_r2, etc.) — schema bump 0.9.1-phase4h.2 → 0.10.0-phase4k deferred.
IPCA outputs factor loadings (not portfolio returns) so existing compute/validation/pbo_dsr.py doesn't directly apply — integration PR will need OOS R² + IC walk-forward observability instead (per PLAN.md acceptance criteria: ≥30% in-sample R², IC > 0.05 OOS).

Test plan

ruff check . → All checks passed
python -m pytest tests/ -m "not network" → 936 passed (930 baseline + 6 new, 1m48s)
python -m pytest tests/test_features/test_ipca_factors.py -v → 6 passed (with ipca installed); 2 passed + 4 skipped (without — importorskip works as expected)
python -m compute.output.schema_check → snapshot in sync (no schema delta)
python -c "from compute.features.ipca_factors import INSTRUMENTED_PCA_PUBLIC_API; print(len(INSTRUMENTED_PCA_PUBLIC_API))" → 8
Synthetic 5×30×10 fit end-to-end:
- Gamma.shape == (10, 2) (L × n_factors) ✓
- Factors.shape == (2, 30) (n_factors × T) ✓
- metad == {N: 5, T: 30, L: 10} ✓
CI green on claude/resume-quantrank-phase-4.5-Zh0pO
User audit + Mark-Ready authorization

Risks

Upstream ipca last released 2021-04-22 — 5 years stale. Mitigation: >=0.6.7,<0.7 pin + module-load API-surface assertion catches any silent drift on future upgrade.
numba is the heavy transitive (~50 MB w/ llvmlite ~30 MB) — can fail on some Python/glibc combos. Mitigated by CI's Ubuntu runner; if cold-start install fails, escalate via fix-amend.
InstrumentedPCA lacks transform / fit_transform (sklearn pattern absent) — documented in module docstring; scout uses fit + .Gamma/.Factors + predict_panel() instead.
pytest.importorskip("ipca") masks real failures when [factors] absent — acceptable per the established Phase 4h/4i/4j precedent; CI always installs [factors] so real failures still surface.

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

Generated by Claude Code

…e + 158-feature manifest Phase 4j scout PR. Mirrors the proven Phase 4i scout pattern (PR #114) for Microsoft Qlib's Alpha158 feature library. Scope is install + API surface + manifest verification ONLY; the yfinance-to-Qlib BYO adapter + full Alpha158 feature compute on the 502-ticker universe ships in a follow-on integration PR. **Pre-plan access-path discovery** (verified 2026-05-19; full record in ``compute/ingest/qlib_features.py`` module docstring): 1. **PyPI package**: ``pyqlib`` 0.9.7 (also 0.9.6 available). Other candidate names (``qlib``, ``microsoft-qlib``) return 404. 2. **License**: MIT (verified via wheel METADATA inspection — ``Classifier: License :: OSI Approved :: MIT License``). **No CC BY-NC complication** like JKP. Safe for Phase 6+ commercial roadmap. 3. **Data init**: ``qlib.init(provider_uri=..., region=REG_US)`` where ``REG_US = "us"``. **NO public US data bundle published by Qlib** — the ``provider_uri`` defaults to ``~/.qlib/qlib_data/cn_data`` (Chinese A-share, irrelevant for QuantRank); the US universe is BYO via local ``.bin`` files. 4. **Alpha158 surface**: ``qlib.contrib.data.handler.Alpha158`` → ``handler.fetch(col_set="feature")`` returns a DataFrame with ``(datetime, instrument)`` MultiIndex × 158 feature columns. The 158-name manifest is fetched via ``Alpha158DL.get_feature_config()[1]`` — captured at scout time and hardcoded for stability; offline test 3 below locks it against upstream drift. **Module** (``compute/ingest/qlib_features.py``, 186 LOC including docstring): - Module-name choice locked per architectural review: NOT ``compute/ingest/qlib.py``. Python's import resolution would treat the latter as the ``qlib`` package and shadow the actual installed PyPI package, breaking the entire integration. Distinct module name avoids the namespace collision. - ``QLIB_INSTRUMENTS_UNIVERSE = "sp500"`` — custom universe ID; integration PR registers this against Qlib's instruments API. - ``ALPHA158_FEATURE_NAMES: tuple[str, ...]`` — 158-name manifest hardcoded from ``Alpha158DL.get_feature_config()[1]`` at scout implementation time against pyqlib 0.9.7. Cardinality asserted at module load against ``config.ALPHA158_FEATURE_COUNT``. - ``init_qlib(provider_uri=None)`` — idempotent thin wrapper around ``qlib.init(provider_uri=..., region="us")``. Local import so the scout module loads even when ``[factors]`` extra isn't installed. - ``fetch_alpha158_features(*, instruments, start_time, end_time)`` — forward-compat wrapper around ``Alpha158(...).fetch(col_set= "feature")``. NOT exercised end-to-end by the scout (see §"No ``@network`` test" below). **Config** (``compute/config.py``, +23 LOC): new ``# --- Phase 4j scout: Microsoft Qlib (Alpha158) integration ---`` block adds: - ``QLIB_DATA_CACHE: Path = CACHE_DIR / "qlib" / "us_data"`` (gitignored — ``compute/cache/`` parent glob at .gitignore:221 covers it). - ``QLIB_DATA_MAX_AGE_DAYS: int = 31`` (BYO bundle, monthly refresh). - ``ALPHA158_FEATURE_COUNT: int = 158``. **pyproject.toml**: ``[factors]`` extra extended with ``pyqlib>=0.9.7,<0.10``. The ``<0.10`` cap pins against Qlib 0.10+ which may drift the feature set; offline test 3 will catch any drift on a deliberate version bump. **Tests** (``tests/test_ingest/test_qlib_features.py``, 113 LOC, 6 offline — NO ``@network``): 1. ``test_alpha158_feature_manifest_has_158_entries`` — primary CI signal. Pure cardinality + uniqueness check; survives even when the ``[factors]`` extra isn't installed. 2. ``test_alpha158_feature_manifest_first_5_anchor`` — anchors the K-bar leading features (``KMID, KLEN, KMID2, KUP, KUP2``) against the canonical Qlib v0.9.7 surface. 3. ``test_alpha158_feature_manifest_matches_runtime_introspection`` — hardcoded tuple must equal ``Alpha158DL.get_feature_config() [1]``. Wrapped in ``pytest.importorskip("qlib")``. The drift detector. 4. ``test_qlib_data_cache_constant_under_repo_cache_dir`` — config sanity + locks gitignore coverage via the ``compute/cache/`` parent glob. 5. ``test_init_qlib_passes_us_region_and_provider_uri`` — monkeypatch capture; asserts ``region="us"`` + provided ``provider_uri`` are passed through. 6. ``test_init_qlib_defaults_to_config_cache_when_no_uri`` — default ``provider_uri`` resolves to ``config.QLIB_DATA_CACHE``. **Critical scope decision — NO ``@network`` test for this scout**: Phase 4h scout (PR #110) and Phase 4i scout (PR #114) each had a ``@pytest.mark.network`` test that hit a remote CDN. **Qlib has no remote CDN** — its data flow is local-bin filesystem I/O, not download-from-network. The originally planned synthetic-OHLCV → ``.bin`` conversion → ``init_qlib`` → ``Alpha158.fetch`` smoke test was DROPPED post-investigation: pyqlib's PyPI wheel does NOT bundle the ``scripts/dump_bin.py`` utility needed for OHLCV → ``.bin`` conversion. That scaffolding is integration-PR scope. Test #3 (runtime introspection match) is the **replacement verification surface** — actually a stronger drift detector than the dropped end-to-end test would have been, because it asserts the hardcoded manifest matches upstream on every ``pip install``. **CI install footprint impact**: ~150-180 MB net-new. ``pyqlib`` pulls ~22 transitive deps including ``mlflow`` (~20 MB), ``lightgbm`` (~15 MB), ``cvxpy`` (~30 MB), ``pymongo``, ``redis`` client, ``gym``, ``jupyter``, ``nbconvert``. None of these heavy deps are actually consumed by the scout — they come along for the ride because pyqlib doesn't expose a ``[minimal]`` extra. CI cold- start latency bump is one-time per workflow; pip wheel caching mitigates subsequent runs. **Tenacity policy NOT applied**: Qlib's data flow is local filesystem I/O. No network retry semantics needed. This is the first ingest module in QuantRank that diverges from the canonical ``compute/ingest/osap.py:52-56`` retry decorator (documented explicitly in the module docstring). **Verification ladder** (steps 1-5 complete): - ``ruff check .`` → clean ✅ - ``pytest tests/ -m "not network"`` → **930 passed** (924 baseline + 6 new offline) ✅ - ``pytest -m network --run-network`` → 20 (unchanged; NO new ``@network``) ✅ - ``python -m compute.output.schema_check`` → in-sync (NO schema delta this scout) ✅ - ``python -c "from compute.ingest.qlib_features import init_qlib, fetch_alpha158_features, ALPHA158_FEATURE_NAMES; print('OK', len(ALPHA158_FEATURE_NAMES))"`` → ``OK 158`` ✅ Steps 6-8: ``git push`` → open Draft PR → ``subscribe_pr_activity`` + STOP for user audit + Mark-Ready authorization. **Ask-first surfaces touched**: NONE for the workflow / schema triple. ``pyproject.toml [factors]`` extra extended in this commit (authorized in advance via the plan-mode approval). ``.github/workflows/ci.yml`` unchanged (``[dev,factors]`` install already covers the new pyqlib dep). ``.github/workflows/compute-rankings.yml`` UNTOUCHED per user hard constraint. **Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged. **Schema version**: unchanged at ``0.9.1-phase4h.2`` (no schema delta this scout). **Out of scope** (deferred to follow-on full Phase 4j integration PR, ~5-commit cluster like Phase 4h): - yfinance-to-Qlib BYO adapter (~150 LOC; ``compute/cache/prices/ *.parquet`` → Qlib ``.bin`` format conversion) - Full Alpha158 feature compute on 502-ticker universe (502 × N_dates × 158 DataFrame) - Per-feature cross-validation framework (PBO/DSR doesn't directly apply to per-stock-per-date features — walk-forward IC scoring per feature is the likely replacement) - Schema additions (``StockDetail.qlib_features`` + ``Metadata.qlib_features_used`` + IC observability) → bump ``0.9.1-phase4h.2 → 0.10.0-phase4j`` - ``compute/main.py`` wiring decision (observability-only? blended into composite? Phase-5 ML-meta-learner-only consumer?) - Top-5 rotation impact analysis (Rule 16 lock applies) https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

@pytest

…hod API surface lock + 6 synthetic-fixture tests Phase 4k scout — 4th and final of 4 factor-library scouts (4h OSAP ✅, 4i JKP ✅, 4j Qlib ✅, 4k IPCA). Installs ipca==0.6.7 (MIT, Buechner+Bybee 2019, github.com/bkelly-lab/ipca) and locks the InstrumentedPCA public API surface at module load with the 8-name tuple INSTRUMENTED_PCA_PUBLIC_API = (fit, get_factors, fit_path, predict, predict_panel, predict_portfolio, score, predictOOS) — drift detector against any future ipca>0.6.7 upgrade silently dropping or renaming a method (upstream last released 2021-04-22, ~5 years stale). Reference: Kelly, Pruitt, Su (2019) "Characteristics are covariances: A unified model of risk and return" JFE 134(3) 501-524. Files - compute/features/ipca_factors.py (NEW, ~140 LOC) — init_ipca() factory + fit_ipca_panel() wrapper + 8-method API manifest with module-load assertion against config.IPCA_PUBLIC_API_METHOD_COUNT - tests/test_features/test_ipca_factors.py (NEW, ~190 LOC) — 6 offline tests with inline @pytest.fixture synthetic 5x30x10 panel (np.random.RandomState(42), pandas MultiIndex shape matches maintainer's canonical ipca/test_ipca.py example). 4/6 use pytest.importorskip("ipca") for graceful skip when [factors] extra absent - compute/config.py — Phase 4k block: IPCA_FITTED_ARTIFACTS_CACHE, IPCA_FITTED_ARTIFACTS_MAX_AGE_DAYS, IPCA_PUBLIC_API_METHOD_COUNT=8 - pyproject.toml — append ipca>=0.6.7,<0.7 to [factors] (after pyqlib); pinned to 0.6.x band due to upstream staleness - PHASE_STATUS.md row 4 — promote 4j scout to ✅ shipped (PR #119) and mark 4k scout in-flight Structural distinctness vs prior 3 scouts: IPCA takes a panel (N entities × T dates × L characteristics) and produces Gamma (L×K loadings) + Factors (K×T latent factor returns) via ALS decomposition. Different shape from 4h/4i (factor returns CSV) and 4j (per-stock OHLCV → features). Characteristics-matrix construction, universe-wide fit on the 502-ticker universe, composite-blend decision, and schema additions are integration-PR scope (~Phase 4k.1). NO @network test — IPCA is a pure local sklearn-style decomposition with no remote endpoint to retry against (mirrors Phase 4j Qlib rationale). Test count: 930 baseline offline + 6 new = 936 offline. @network slot unchanged at 20. Heavy-deps disclosure: net-new transitives are numba (~50 MB w/ llvmlite ~30 MB) + tiny progressbar. CI install footprint bump ~50-80 MB — substantially lighter than Qlib's 150-180 MB. Notable upstream API divergence: InstrumentedPCA lacks transform / fit_transform (no sklearn TransformerMixin). Panel-prediction path uses fit + .Gamma/.Factors attrs + predict_panel(). Verification ladder all green: - ruff check . → All checks passed - pytest tests/ -m "not network" → 936 passed (1m48s) - pytest tests/test_features/test_ipca_factors.py → 6 passed - python -m compute.output.schema_check → in sync (no schema delta) - python -c module load → INSTRUMENTED_PCA_PUBLIC_API len == 8 ✓ - 5x30x10 fit produces Gamma.shape == (10, 2), Factors.shape == (2, 30), metad == {N=5, T=30, L=10} ✓ After this scout merges, all 4 factor-library scouts complete and v1.1.0-phase4 tag becomes eligible (gated on 4h.2 Part 2 / 4i.1 / 4j.1 / 4k.1 integration PRs landing). https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

vercel · 2026-05-19T15:35:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
quantrank	Ready	Preview, Comment	May 19, 2026 3:35pm

dackclup · 2026-05-19T15:40:17Z

Closing as duplicate — Phase 4j (Qlib) and Phase 4k (IPCA) scouts already shipped via separate PRs while this branch was in flight:

PR feat(ingest): Qlib scout — pyqlib MIT install + Alpha158 handler smoke + 158-feature manifest #119 (merged 2026-05-19, commit f0ade65b) — Phase 4j Qlib scout: pyqlib install + ALPHA158_FEATURE_NAMES 158-entry manifest + 6 offline tests
PR feat(features): IPCA scout — ipca MIT install + InstrumentedPCA 8-method API surface lock + 6 synthetic-fixture tests #121 (merged 2026-05-19, commit 182c02de) — Phase 4k IPCA scout: ipca install + INSTRUMENTED_PCA_PUBLIC_API 8-method lock + 6 offline tests

Files in this PR are functionally equivalent to what's already on main. mergeable_state: "dirty" confirms the conflict — merging would re-introduce shipped artifacts.

Production cron has already run successfully on commit 3da995dc (post-#121 merge) with schema 0.9.1-phase4h.2 intact. No follow-up needed from this branch.

Next deliverable per current planning: Phase 4h.2 Part 2 (tracked in updated issue #116) — fixes the 56-signal silent-drop gap discovered in the first 0.9.1 production cron + adds the osap_signals_dropped_no_long_short accounting-balance diagnostic. Separate session handoff is in flight.

— closed by Phase 4 auditor session, branch claude/phase-0-scaffolding-Yx96M

Generated by Claude Code

Part of epic #125 (Item #6 of 6). Pure tooling addition — no runtime / scoring / schema impact. Motivation ---------- PR #123 (2026-05-19, closed without merging): a worker session opened a Phase 4j + 4k scout duplicate on branch `claude/resume-quantrank-phase-4.5-Zh0pO` while the main session shipped the same work directly via PRs #119 (Qlib) + #121 (IPCA). Root cause: the worker session never inspected the `claude/*` branch list + recent PRs before writing code, producing 100% wasted effort. This change ships a preflight check that surfaces in-flight scope BEFORE any code is written, so the duplicate-PR failure mode is caught at the handoff-prompt entry rather than at PR review. Files (2 new, +271 LOC) ------------------------ - tools/check_branch_collisions.py (+149 LOC) — git-only preflight script. Lists active `claude/*` branches via `git ls-remote origin "refs/heads/claude/*"` and recent main-branch commits via `git log --since="48 hours ago" --oneline --no-merges origin/main`. Optional keyword args flag case-insensitive substring matches. Always exit 0 (informational only). - .claude/skills/branch-collision-check/SKILL.md (+122 LOC) — skill description with YAML frontmatter, trigger conditions (handoff prompts, Phase / issue / Item #N mentions, fresh worker sessions), skip conditions (doc-only chores, iteration #2+, user-authorized parallel work), sample output (clean + warning), and output-interpretation guidance pointing the caller to STOP + ask the user when any ⚠️ line surfaces. Design notes ------------ - Git-only data sources — no `gh` CLI / GitHub API auth required. Works in the QuantRank Claude Code Web sandbox where `gh` is unavailable, and on any contributor machine with bare git. - 48-hour window — matches typical worker ↔ main session handoff cadence; long enough to catch duplicate work, short enough to keep the output scannable. - Pure read-only — no destructive git ops, no branch creation, no push, no GitHub API mutation. Always returns exit 0; the caller decides whether to proceed. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_branch_collisions.py → lists 1 active claude/* branch + 16 recent commits (last 48h), exit 0 - python tools/check_branch_collisions.py "Alpha158" → fires ⚠️ on PR #119 commit "Alpha158 158-feature manifest", summary reports "1 potential scope collision(s) found", exit 0 - python tools/check_branch_collisions.py "Phase 99 nonsense" → no match, summary reports "No scope collisions detected", exit 0 - python tools/check_doc_test_counts.py → exit 0 (Item #2 guard still passes; new files don't introduce hardcoded counts) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - SKILL.md YAML frontmatter parses — confirmed via Claude Code's skill registry picking it up at module load Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ — tools/ + .claude/skills/ only - No network calls / no GitHub API auth — git remote ls + git log - No destructive actions — read-only preflight check - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) Epic #125 status after this PR ------------------------------- Item #1 ✅ Hypothesis property tests (PR #127) Item #2 ✅ Strip hardcoded test counts + CI guard (PR #128) Item #4 ✅ Observability-before-wiring pattern (PR #129) Item #6 ✅ Branch-collision preflight (this PR) Items #3, #5 remain — separate PRs per epic decomposition. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…ble skills (#132) 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…sk C.1 recovery) (#135) * docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU * docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md Recovers Task C.1 from the original handoff that was silent-dropped in the prior PR #132 commit (50da720). The handoff explicitly named "Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable skills library; the auditor session caught the omission and authorized this follow-up commit on the existing branch. Files (2 new, +138 LOC) ------------------------ - .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC) — vendored content of upstream skills/karpathy-guidelines/SKILL.md (67 LOC, byte-for-byte preserved) + 15-line appended attribution block referencing the upstream source, commit SHA, and the Karpathy tweet that motivated the guidelines. - THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root) — third-party license disclosures. Section "karpathy-guidelines (Claude Code skill)" carries source URL, license declaration, vendored path, vendored date, upstream commit SHA, upstream first-commit date, and the full standard MIT License text with copyright attributed to "multica-ai contributors" (upstream has no individual copyright line and no standalone LICENSE file; the `license: MIT` claim appears in upstream README.md § License and each skill's YAML frontmatter). Upstream provenance ------------------- - Source: https://github.com/multica-ai/andrej-karpathy-skills - Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2 - Upstream first commit: 2026-01-27 - Vendored date: 2026-05-20 - License: MIT (declared) Verbatim content preserved -------------------------- `diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md .claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY the 15-line appended attribution block at lines 68-82. The upstream 67-line content (YAML frontmatter + "Karpathy Guidelines" heading + the 4 principles) is byte-for-byte unchanged. Per the spec constraint: "เก็บ 4 principles verbatim. แก้ได้แค่ 'เพิ่ม' attribution block ท้ายไฟล์". License-disclosure caveat ------------------------- Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README + YAML frontmatter but does NOT ship a standalone LICENSE file. The `THIRD_PARTY_NOTICES.md` entry includes the standard MIT License template with copyright attributed to the GitHub org ("multica-ai contributors"), matching the principle that an MIT declaration without a formal copyright line still licenses to the redistributor; the attribution is conservative. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no test-count drift introduced by this commit) - python tools/check_branch_collisions.py "karpathy" → no scope collisions detected - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests) - Skill registry pickup verified via session reload — `portable-karpathy-guidelines` appears in the available-skills list with the upstream description verbatim Constraints honored ------------------- - No squash / amend of the prior 50da720 commit — this is a fresh commit pushed on top of the existing branch (per spec "ห้าม squash old commit") - No touch to the 4 already-shipped portable skills in 50da720 - No touch to compute/ / frontend/ / tests/ - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Karpathy SKILL.md upstream content preserved verbatim; only the attribution block appended below the original content PR description update will follow as a separate `gh pr edit` / MCP `update_pull_request` call so the new "License Compliance" section + the audit-table row for karpathy-guidelines land in the PR body. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU --------- Co-authored-by: Claude <noreply@anthropic.com>

…PR A) (#141) First PR in the multi-PR .md optimization sequence (Option D scope — yกเครื่อง). PR A is the low-risk baseline: fixes 2 broken skill frontmatters that prevent dispatch + drift-fixes 4 stale facts in agent docs. Critical YAML fix: - branch-collision-check/SKILL.md and pr-quality-gate/SKILL.md had multi-line `description:` plain-scalar frontmatter that PyYAML (and Claude Code's skill loader) couldn't parse because lines contain `#123` / `#X` issue references after whitespace — YAML treats ` #` as a comment marker, so everything after the first comment-trigger got eaten and the loader fell back to displaying `name: name` in the available-skills list. Both skills were effectively undispatchable from any session. - Fix: change `description:` to `description: >` (folded block scalar) so newlines become spaces and `#` mid-content is treated as literal text. Verified live in this session — system reminder now shows the full TRIGGER/SKIP descriptions for both. Stale-fact pass: - .claude/skills/README.md L14-16: "27 invocation-triggerable skills" → references CLAUDE.md as the canonical count (38) to prevent future drift. Future top-level skill add/remove only needs to bump CLAUDE.md §Layout, not three files. - AGENTS.md L104: ".claude/skills/ # 24 loaded skills" → 38. - AGENTS.md L287: "Schema version: 0.8.0-phase4.5f" → 0.9.2-phase4h.2 (3 versions behind). Now references SKILL.md schema-version table for full history. - CLAUDE.md L181-192 (§Phase status): "Current schema 0.9.1-phase4h.2 ... Phase 4h in flight in PR #112" → 0.9.2-phase4h.2 + Phase 4h shipped (Parts 1+2 done via #112/#118/#124). - CLAUDE.md + AGENTS.md §Phase status: "Epic #125 Item 3 in flight via PR #140" → "PR 1 of 2 shipped" at commit a52aa2d; PR 2 remaining. CLAUDE.md + AGENTS.md edit ships per the lockstep convention. No code touched, no schema touched — pre-merge-prod-sim.yml won't trigger (paths compute/scoring + compute/features unaffected). Next in optimization sequence: PR B (CLAUDE.md token diet) — TBD after user reviews this one. Co-authored-by: Claude <noreply@anthropic.com>

…em 6) (#203) New tools/check_cross_session_collision.py hits the GitHub API for claude/* branches updated in the last 7 days and open PRs matching a scope keyword, exits 1 on collision and 0 when clean. Authenticated via GH_TOKEN / GITHUB_TOKEN env vars or gh CLI; exits 2 with a clear message when no auth is available (no silent failure). New .claude/skills/cross-session-collision-check/SKILL.md wraps the script with trigger/skip conditions, false-positive guard (merged+closed branches excluded by GitHub API design), auth instructions, and a comparison table vs the existing git-only branch-collision-check skill. phase-coordinator Mode A updated to run BOTH skills in sequence: Step 1 git-only (branch-collision-check, no auth, 48h window), Step 2 GitHub API (cross-session-collision-check, GH_TOKEN, 7d window). Together they cover the full failure-mode space that produced PR #123. CLAUDE.md + AGENTS.md updated in lockstep: skill count 42 -> 43, phase status entry added, AGENTS.md phase-coordinator skill list updated. Verification: ruff clean on new script; 1054 offline tests pass (no baseline change — tooling-only PR, no new tests). https://claude.ai/code/session_01D6NTyJZa5LWHWakbF5dT29 Co-authored-by: Claude <noreply@anthropic.com>

claude added 2 commits May 19, 2026 09:32

vercel Bot deployed to Preview May 19, 2026 15:35 View deployment

dackclup closed this May 19, 2026

This was referenced May 20, 2026

Process hygiene epic — close the gap between CI-green and provably-correct (6 improvements) #125

Closed

feat(tooling): Cross-session branch-collision check skill (#125) #131

Merged

dackclup mentioned this pull request May 20, 2026

docs(skills): SKILL.md schema bump + worker-session-handoff + 3-5 portable skills #132

Merged

10 tasks

dackclup mentioned this pull request May 22, 2026

feat(skills): cross-session branch-collision detector (closes #125 item 6) #203

Merged

dackclup deleted the claude/resume-quantrank-phase-4.5-Zh0pO branch May 22, 2026 09:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(features): IPCA scout — ipca MIT install + InstrumentedPCA 8-method API surface lock + 6 synthetic-fixture tests#123

feat(features): IPCA scout — ipca MIT install + InstrumentedPCA 8-method API surface lock + 6 synthetic-fixture tests#123
dackclup wants to merge 2 commits into
mainfrom
claude/resume-quantrank-phase-4.5-Zh0pO

dackclup commented May 19, 2026

Uh oh!

vercel Bot commented May 19, 2026 •

edited

Loading

Uh oh!

dackclup commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dackclup commented May 19, 2026

Summary

Structural distinctness vs prior 3 scouts

5 pre-plan investigations (verbatim, 2026-05-19, against ipca==0.6.7)

NO @network test (deliberate)

Files (5 changed, +380 / −1)

Out of scope (deferred to ~Phase 4k.1 integration PR)

Test plan

Risks

Uh oh!

vercel Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dackclup commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

5 pre-plan investigations (verbatim, 2026-05-19, against `ipca==0.6.7`)

NO `@network` test (deliberate)

vercel Bot commented May 19, 2026 •

edited

Loading