feat(phase-4h): OSAP integration — foundation + replicate + blend + PBO/DSR gate#112
Merged
Phase 4h commit 1 of N. Lays the groundwork for the OSAP composite-blend
integration without yet wiring any new compute paths into main.py — the
schema bump + workflow install bump land first so subsequent commits
(replicate.py, blend.py, validation.py, main.py wiring) can extend the
foundation without each touching ask-first surfaces.
Changes:
- `compute/ingest/osap.py`: extend `fetch_osap_returns()` with keyword-only
`signals: list[str] | None` + `as_of: date | None` filters (non-breaking
— both default None). Filter happens post-cache-load so a callsite asking
for 20 signals doesn't invalidate a callsite asking for all 1,188. The
cache always stores the full bulk parquet.
- `compute/output/schemas.py`:
- `StockDetail`: +`osap_signals: dict[str, float] | None` (per-stock
signalname → cross-sectional rank for the PBO/DSR-accepted subset) +
`osap_blended_score: float | None` (the 50/50 blend output).
- `Metadata`: +`osap_signals_used: list[str] | None` +
`osap_excluded_signals: list[str] | None` +
`osap_signals_ic_12m: dict[str, float] | None` +
`osap_signals_coverage_pct: dict[str, float] | None`.
- All fields Optional with None default → forward-compatible with
legacy 0.8.x JSONs.
- `compute/config.py`: `SCHEMA_VERSION` 0.8.0-phase4.5f → 0.9.0-phase4h
(MINOR bump = new phase per SKILL.md schema-versions convention).
- `frontend/lib/types.ts` + `frontend/lib/schema-snapshot.json`: mirror
the schema additions; snapshot regenerated via `schema_check
--update-snapshot`.
- `.github/workflows/compute-rankings.yml`: install line
`pip install -e .` → `pip install -e ".[factors]"` so the next cron
run is ready when commit 5 (main.py wiring) lands — pinned
`openassetpricing==0.0.2` already in pyproject `factors` extra from
PR #110.
- `tests/test_config.py` + `tests/test_smoke.py`: SCHEMA_VERSION pin
assertions updated to phase4h.
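The post-cache-load filtering described for `fetch_osap_returns()` can be sketched roughly as below. This is a dependency-free illustration, not the module's code: `load_full_cache`, `fetch_returns`, the row layout, and the `<=` as-of semantics are all assumptions made for the example.

```python
from datetime import date

# Hypothetical stand-in for the cached bulk table (one dict per row).
_CACHE = [
    {"signalname": "Mom1m", "date": date(2024, 1, 31), "ret": 0.5},
    {"signalname": "Mom1m", "date": date(2024, 2, 29), "ret": 0.2},
    {"signalname": "BM", "date": date(2024, 1, 31), "ret": -0.1},
]


def load_full_cache():
    """Always return the full cached table, whatever the caller asked for."""
    return list(_CACHE)


def fetch_returns(*, signals=None, as_of=None):
    """Keyword-only filters applied AFTER the cache load.

    Both default to None, so existing callers see the unfiltered table.
    """
    rows = load_full_cache()
    if signals is not None:
        wanted = set(signals)
        rows = [r for r in rows if r["signalname"] in wanted]
    if as_of is not None:
        # Assumption: "as of" keeps observations dated on or before as_of.
        rows = [r for r in rows if r["date"] <= as_of]
    return rows
```

Because the filter runs on a copy of the cached rows, a narrow request never changes what a later unfiltered request sees.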
Blend approach (locked in plan audit 2026-05-18, Path b):
`apply_osap_blend()` will compute outside `compute_composite()` — formula
`composite_score_osap_adjusted = (1 - weight) × composite_score +
weight × osap_signal_aggregate`. This commit does NOT yet touch the
composite path; that lands in commit 3 (osap_blend.py). The
PHASE3_WEIGHTS sum-to-1.0 invariant at composite.py:43-45 stays intact.
Verification (this commit only):
- `ruff check .` → clean
- `python -m pytest tests/ -m "not network"` → 861 passed
- `python -m compute.output.schema_check` → in sync (no
`--update-snapshot` needed; regen committed)
- `cd frontend && npx tsc --noEmit` → clean
- `cd frontend && npm run build` → 506 static pages, identical
shape to post-PR-110 production build
Ask-first surfaces touched (per AGENTS.md):
- `.github/workflows/compute-rankings.yml` (AGENTS.md:105)
- Schema triple (AGENTS.md:229-231)
- `SCHEMA_VERSION` (compute/config.py:30)
All flagged in advance in the plan audit; user authorized 2026-05-18.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
…or-exposure proxy)
Phase 4h commit 2 of N. Lays the per-stock signal mapping layer that
commit 3's blend module will consume. The manifest is committed to
config (so the validation harness in commit 4 can iterate over it
without importing from the replicate module), and the replicate
module ships the *factor-exposure proxy* version of the algorithm —
honest, well-documented, and complete-enough for the Phase 4h blend
target.
Module layer:
- `compute/config.py`: OSAP_SIGNALS_100 manifest (exactly 100 entries
across 8 theme buckets matching osap-integration/PLAN.md L60-73).
Two `assert` statements at module-load time pin the cardinality
and uniqueness invariants — any future edit that strays from 100
unique signals fails at import (caught by ruff / pytest collect,
surfaces during local dev). Manifest is aspirational: commit 4's
PBO/DSR gate filters out any signal that doesn't resolve in the
fetched OSAP DataFrame and logs the rejection under
`metadata.json::osap_excluded_signals`.
- `compute/features/osap_replicate.py` NEW: three public functions
+ one orchestrator + one coverage helper.
- `compute_long_short_returns(returns)` — pivots port to columns,
derives ls_return = port=01 − port=10, drops decile buckets and
incomplete pairs.
- `select_as_of_cross_section(ls_returns, as_of)` — picks the
most-recent observation per signal at or before as_of.
- `rank_signals_cross_sectional(cross_section)` — pandas
`.rank(method='average', pct=True)`; no scipy dependency.
- `compute_osap_signals(returns, tickers, as_of, requested_signals)`
— orchestrator returning `{ticker: {signalname: rank} | None}`.
- `coverage_by_signal(signal_map)` — per-signal coverage %
helper for commit 5's `metadata.json` write.
*Factor-exposure proxy mode* (locked 2026-05-18 plan audit, Path
consistent with §Scope IN #2): every ticker receives the same
signal map, derived from the market-wide OSAP long-short return
cross-section at as_of. True per-stock signal replication (porting
100 OSAP SAS/Stata formulas into pandas) is the deferred heavy
lift. Module docstring documents why this is sufficient for the
blend target:
1. osap_blended_score is observability-only this phase (SKILL.md
Rule 16 — Top-5 ranking still uses raw composite_score).
2. PBO/DSR gate (commit 4) runs on the long-short returns
themselves, not the per-stock projection, so signal acceptance
is identical to the full version.
3. Per-stock replication of all 100 signals slips Phase 4h by
weeks without unblocking 4i/4j/4k.
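A minimal sketch of the long-short derivation and the factor-exposure proxy described above, written without pandas so it stays self-contained. The function names, the row layout, and the zero-padded port labels are assumptions for illustration. The `01` minus `10` direction follows the commit text, and the ranking mirrors what `rank(method='average', pct=True)` computes.

```python
def long_short(rows, long_port="01", short_port="10"):
    """Collapse per-port returns into one long-short return per (signal, date)."""
    legs = {}
    for r in rows:
        port = str(r["port"]).zfill(2)  # coerce integer ports such as 1 to "01"
        legs.setdefault((r["signalname"], r["date"]), {})[port] = r["ret"]
    # Incomplete pairs (one leg missing) are dropped.
    return {
        key: by_port[long_port] - by_port[short_port]
        for key, by_port in legs.items()
        if long_port in by_port and short_port in by_port
    }


def pct_rank_average(values):
    """Percentile rank in (0, 1]; tied values share the average rank."""
    ordered = sorted(values.items(), key=lambda kv: kv[1])
    n = len(ordered)
    ranks = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average / n
        i = j + 1
    return ranks


def proxy_signal_map(ranks, tickers):
    """Proxy mode: every ticker gets the same map, or None if it is empty."""
    return {t: (dict(ranks) if ranks else None) for t in tickers}
```

For example, `pct_rank_average({"a": 0.1, "b": 0.3, "c": 0.3, "d": 0.5})` gives the tied signals `b` and `c` the shared rank 0.625.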
Tests (`tests/test_features/test_osap_replicate.py`, 14 offline):
- 4 covering `compute_long_short_returns`: basic 2-signal happy
path, missing-short-port drop, decile-bucket exclusion, integer
port column coercion.
- 3 covering `select_as_of_cross_section`: most-recent-per-signal
pick, future-date filter, empty-window None handling.
- 2 covering `rank_signals_cross_sectional`: unit-interval
normalisation, ties get average rank.
- 5 covering `compute_osap_signals` end-to-end: full happy path
with proxy-mode invariant assertion, empty-returns None policy,
universe-gap None policy, manifest cardinality sanity, and a
smoke test against the shipped scout fixture
(`tests/fixtures/osap_returns_sample.csv` from PR #110).
Verification:
- `ruff check .` → clean
- `python -m pytest tests/ -m "not network"` → 875 passed (861
prior + 14 new osap_replicate)
- No `compute.main` import yet — wiring lands in commit 5
- No schema bump (already at 0.9.0-phase4h from commit 1)
Universe-gap policy: tickers receive None (not zero, not an imputed
neutral) when the as-of cross-section is empty. Pillar
`compute_composite(neutralize_missing=True)` imputes 50.0 for
missing pillars; OSAP intentionally does not. Commit 3's blend
layer treats None as "no OSAP adjustment" and passes
composite_score through unchanged.
Next: commit 3 — `compute/scoring/osap_blend.py` (~80 LOC + 8
tests). The Path-b formula
`(1 - weight) × composite_score + weight × osap_signal_aggregate`
stays OUTSIDE `compute_composite`; PHASE3_WEIGHTS sum-to-1.0
invariant at composite.py:43-45 stays intact.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
…ult)
Phase 4h commit 3 of N. Ships the blend layer that consumes commit 2's
per-ticker signal map and produces ``composite_score_osap_adjusted``
without touching the PHASE3_WEIGHTS sum-to-1.0 invariant at
``compute/scoring/composite.py:43-45``.
**Path-b architecture decision** — the OSAP correction is applied
*outside* ``compute_composite()``, NOT as a 9th slot in
PHASE3_WEIGHTS. Two reasons documented in module docstring:
1. Adding a 9th slot would either fail the existing
``abs(_W_SUM - 1.0) > 1e-9 → ValueError`` invariant or force a
pro-rata redistribution of the 8 active Phase-3 pillars — both
alter the established composite math retroactively.
2. Phase 4h's blend is observability-only (Top-5 still ranked by
raw composite_score per SKILL.md Rule 16); a layered
``composite_score → composite_score_osap_adjusted`` keeps the
pre-blend score on every StockDetail for direct
delta-attribution.
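The first reason can be made concrete with a small sketch of a sum-to-1.0 weight check. The pillar names and the equal weights are hypothetical; only the `1e-9` tolerance and the `ValueError` come from the text above.

```python
# Hypothetical 8-pillar weights; the real PHASE3_WEIGHTS values are not shown here.
WEIGHTS = {f"pillar_{i}": 0.125 for i in range(1, 9)}


def check_weights(weights, tol=1e-9):
    """Raise ValueError when the weights do not sum to 1.0 within tolerance."""
    total = sum(weights.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total}, expected 1.0")
    return total


check_weights(WEIGHTS)  # passes: 8 x 0.125 == 1.0

# Adding a ninth slot without rescaling the other eight trips the invariant.
with_osap = dict(WEIGHTS, osap=0.1)
try:
    check_weights(with_osap)
except ValueError as exc:
    print("rejected:", exc)
```

Applying the blend as a separate layer avoids both the failure and the rescaling.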
Module layer (`compute/scoring/osap_blend.py`, 96 LOC):
- ``OSAP_BLEND_WEIGHT_DEFAULT = 0.5`` — locked at
osap-integration/PLAN.md L168-170 (50/50 default). Phase 5 ML
meta-learner is where this can move.
- ``aggregate_osap_signals(signal_map) -> pd.Series`` — pools the
per-ticker ``{signalname: rank}`` map into a single 0-100 score
via arithmetic mean of ranks × 100. NaN for tickers with ``None``
inner map (universe gap) or empty ``{}``. Empty input → empty
Series. Matches the shape returned by commit 2's
``compute_osap_signals``.
- ``apply_osap_blend(composite_scores, osap_signal_aggregate,
weight=OSAP_BLEND_WEIGHT_DEFAULT) -> pd.Series`` — Path-b formula
``(1 - weight) × composite_score + weight × osap_signal_aggregate``.
Output indexed by ``composite_scores.index``, dtype float, name
``composite_score_osap_adjusted``, clipped to [0, 100] to match
composite-score domain.
**Universe-gap policy** — tickers whose OSAP aggregate is NaN
(after reindex) pass their raw composite_score through unchanged.
NO impute. This is intentionally distinct from pillar
``compute_composite(neutralize_missing=True)`` which imputes 50.0
for missing pillar values: an OSAP-blank ticker is "no information
added", not "no information available", so imputing 50.0 would
silently shrink the composite toward neutral and bias Top-5
against OSAP-covered names.
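The aggregate and blend steps, including the pass-through for OSAP-blank tickers, can be sketched as follows. This uses plain dicts rather than `pd.Series` to stay dependency-free; the function names are shortened stand-ins, and the sketch assumes the behaviour is as described above.

```python
import math


def aggregate(signal_map):
    """Mean of ranks x 100 per ticker; NaN when the inner map is None or empty."""
    return {
        ticker: (sum(ranks.values()) / len(ranks) * 100.0 if ranks else math.nan)
        for ticker, ranks in signal_map.items()
    }


def blend(composite, osap_aggregate, weight=0.5):
    """(1 - weight) * composite + weight * osap, clipped to [0, 100].

    Tickers with no OSAP aggregate keep their raw composite score (no impute).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    out = {}
    for ticker, score in composite.items():  # output follows the composite index
        osap = osap_aggregate.get(ticker, math.nan)
        if math.isnan(osap):
            out[ticker] = float(score)
        else:
            blended = (1.0 - weight) * score + weight * osap
            out[ticker] = min(100.0, max(0.0, blended))
    return out


signal_map = {"AAA": {"s1": 0.25, "s2": 0.75}, "BBB": None, "ZZZ": {"s1": 1.0}}
composite = {"AAA": 80, "BBB": 60}
print(blend(composite, aggregate(signal_map)))
```

Here `AAA` blends 80 with an aggregate of 50, `BBB` passes through at 60, and `ZZZ` is dropped because it is absent from the composite index.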
Tests (`tests/test_scoring/test_osap_blend.py`, 17 offline,
exceeded plan's 8-test floor):
- 4 covering aggregate: mean-of-ranks × 100 math, None → NaN,
empty inner dict → NaN, empty input → empty Series.
- 12 covering apply_osap_blend: 50/50 basic, weight=0 pass-through,
weight=1 OSAP-only-where-covered, NaN OSAP fallback, empty
composite, output clipping to [0, 100], invalid weight raises
(< 0 and > 1), extra OSAP tickers dropped via reindex, missing
OSAP ticker pass-through, default-weight matches constant,
end-to-end shape with commit 2's signal-map format, dtype
preservation (int composite → float blended).
- 1 cross-module sanity: round-trip with the exact signal_map shape
produced by ``compute_osap_signals`` (commit 2) → aggregate →
blend, asserting universe-gap and math.
Verification:
- ``ruff check .`` → clean (ruff auto-reorganized one import block;
no logic changes)
- ``python -m pytest tests/ -m "not network"`` → 892 passed
(875 prior + 17 new osap_blend)
- No ``compute.main`` import yet — wiring lands in commit 5
- No schema bump (already at 0.9.0-phase4h from commit 1)
- ``PHASE3_WEIGHTS`` untouched; composite.py L43-45 invariant
intact (manually verified: ``grep -n PHASE3_WEIGHTS
compute/scoring/composite.py`` returns the same 4 hits as
pre-commit)
Next: commit 4 — ``compute/validation/osap_validation.py`` (PBO/DSR
hard gate per-signal + rolling-12m-Spearman-IC observability,
~120 LOC + 10 tests). Wraps PR #60's ``factor_passes_gates``
(``compute/validation/pbo_dsr.py:388``); accepted-signal subset
feeds commit 5's ``compute/main.py`` wiring.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
Phase 4h commit 4 of N. Ships the cohort-aware wrapper that decides
which of the 100 candidate OSAP signals are accepted into the
``composite_score_osap_adjusted`` blend (commit 3). Wraps PR #60's
``compute/validation/pbo_dsr.py::factor_passes_gates`` — does NOT
reimplement PBO or DSR math.
Module layer (``compute/validation/osap_validation.py``, 220 LOC):
- ``GateResult(frozen dataclass)`` — per-signal verdict with
``accepted: bool``, ``pbo / dsr / sharpe: float | None``,
``n_observations: int``, ``rejection_reason: str | None`` in
``{None, 'high_pbo', 'low_dsr', 'gate_failed', 'insufficient_data'}``.
Distinct ``'gate_failed'`` category for diagnostic clarity when both
PBO AND DSR fail simultaneously (Bailey 2014 pure-noise cohorts fail
this way — verified by tests #1, #4, #8).
- ``gate_osap_signals(long_short_returns, requested_signals=None,
pbo_threshold=PBO_VETO_THRESHOLD, dsr_threshold=DSR_VETO_THRESHOLD,
n_partitions=DEFAULT_N_PARTITIONS) -> dict[str, GateResult]`` — pivots
commit-2's long-format DF to wide (date × signal), runs Bailey 2014
cohort framing (``n_trials = wide.shape[1]``), per-signal loop calling
``factor_passes_gates`` with the established defaults. Module
constants imported from ``pbo_dsr`` — NOT redefined.
- ``compute_rolling_ic_12m(long_short_returns, signalname) -> float |
None`` — observability-only Spearman lag-1 IC over the most recent 12
monthly observations. Pure pandas (no scipy — matches ``pbo_dsr.py``'s
hand-rolled Beasley-Springer-Moro precedent for the inverse normal
CDF). Never gates acceptance.
- ``filter_accepted_signals(gate_results) -> (accepted, excluded)`` —
sorted-alphabetical split, feeds commit-5's metadata writer.
**NaN policy — LOCKED, documented in module docstring**:
Source-verified asymmetry in ``compute/validation/pbo_dsr.py``:
- ``compute_pbo`` (L187-284) is **NaN-UNSAFE** — L234
``to_numpy(dtype=float)`` then L256-257 ``.mean(axis=0)`` /
``.std(axis=0)`` then L261 ``np.argmax`` silently corrupts on any NaN
cell.
- ``compute_deflated_sharpe`` (L287-385) is **NaN-SAFE** — L323 strips
internally via ``arr = arr[~np.isnan(arr)]``.
Because ``factor_passes_gates`` accepts ``factor_returns`` and
``returns_matrix`` independently, this wrapper feeds different NaN
treatments to each side:
1. ``factor_returns = wide[sig].dropna()`` — DSR's internal strip
handles it. No information lost.
2. ``returns_matrix = wide.fillna(0.0)`` — zero-fill, NOT mean-fill,
NOT ``dropna(how='any')``.
Zero-fill chosen over the two alternatives:
- ``dropna(how='any')`` would decimate the 100-signal × monthly matrix
below ``n_partitions=16`` rows once any earnings-event-only signal is
included, collapsing the Bailey 2014 multiple-testing
``n_trials = cohort_size`` correction. Test #13
``test_gate_osap_signals_sparse_cohort_zero_filled_not_decimated`` is
the regression guard against accidental revert.
- ``fillna(column_mean)`` would deflate per-signal variance, inflate
Sharpe, bias PBO toward false acceptance — silently rewarding sparse
signals for low coverage.
- ``fillna(0.0)`` is the honest OSAP-semantic: absence-of-coverage for
``(signal, month)`` means "no portfolio formed / no information
generated" → zero return is the right proxy. Bailey 2014 PBO is
rank-based within each period; zero-imputation symmetrically pushes
coverage-gap rows toward indeterminate rank.
Acknowledged trade-off: sparse-coverage signals see their Sharpe
shrunk toward zero by the zero-fill, raising DSR rejection
probability. Cohort-fair but penalizes legitimate event-only signals.
Phase 4h scope accepts this — the Phase 5 backtest harness
(``defense-infrastructure/PLAN.md:270``) runs full walk-forward CV per
signal and supersedes this gate when it ships.
Standalone module discipline: zero imports from
``compute.features.osap_replicate`` (commit 2),
``compute.scoring.osap_blend`` (commit 3), or ``compute.main``. Only
``compute.validation.pbo_dsr`` for primitives + constants.
Validation runs on the long-short returns DataFrame contract only.
Tests (``tests/test_validation/test_osap_validation.py``, 14 offline,
exceeded plan's 13-test target):
1. ``random_noise_yields_high_pbo`` — Bailey 2014 invariant:
pure-noise cohort → zero acceptances, all reasons in {'high_pbo',
'low_dsr', 'gate_failed'}, no 'insufficient_data'
2. ``low_sharpe_signal_rejected_for_dsr`` — near-zero σ signal →
'low_dsr' or 'gate_failed' with DSR ≤ 0
3. ``strong_signal_accepted`` — monotone-drift signal beats noisy
cohort → accepted=True, populated pbo/dsr/sharpe floats
4. ``insufficient_data`` — < ``MIN_OBS_PER_SIGNAL`` rows in cohort →
all signals rejected with 'insufficient_data'
5. ``requested_signals_filter`` — subset filter applied pre-pivot
6. ``requested_none_uses_all_signals_in_df`` — default covers all
7. ``empty_input_returns_empty_dict`` — empty DF → {}, no crash
8. ``single_signal_cohort_rejects_with_insufficient_data`` — cohort
size < 2 short-circuit
9. ``compute_rolling_ic_12m_known_signal`` — monotone series →
Spearman = 1.0 ± 1e-9
10. ``compute_rolling_ic_12m_insufficient_history`` — < 13 obs → None
11. ``compute_rolling_ic_12m_nan_safe_with_gaps`` — NaN outside
tail(13) window pruned cleanly; tail-13 strictly monotone Spearman =
1.0
12. ``filter_accepted_signals_splits_into_sorted_lists`` — sorted
union round-trip
13. ``sparse_cohort_zero_filled_not_decimated`` — REGRESSION GUARD: 3
of 10 signals with 25% NaN coverage at staggered offsets; all 10 still
get real PBO/DSR runs (none short-circuit) — fails immediately if
cohort policy reverts to dropna(how='any')
14. ``module_load_constants_sourced_from_pbo_dsr`` —
MIN_OBS_PER_SIGNAL == DEFAULT_N_PARTITIONS == 16; canonical Phase 4
gate
Verification:
- ``ruff check compute/validation/osap_validation.py
tests/test_validation/test_osap_validation.py`` → clean
- ``pytest tests/ -m "not network"`` → 906 passed (892 prior + 14 new)
- Import sanity: ``from compute.validation.osap_validation import
gate_osap_signals, compute_rolling_ic_12m, filter_accepted_signals,
GateResult, MIN_OBS_PER_SIGNAL, ROLLING_IC_WINDOW_MONTHS`` → OK
- No schema change (still 0.9.0-phase4h from commit 1)
- No ``PHASE3_WEIGHTS`` touched
- No imports from osap_replicate / osap_blend / main — standalone
verified
Next: commit 5 — ``compute/main.py`` wiring +
``compute/ingest/osap.py`` kwargs (~70 LOC). Wires fetch → replicate →
gate → blend end-to-end; integration ``@network`` test against real
OSAP fetch with 20-ticker compute slice + sanity-IC on Mom1m signal.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
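The row-count argument behind commit 4's zero-fill NaN policy can be illustrated with a toy cohort. This is a plain-Python sketch with made-up numbers (24 months, one event-only signal reporting every third month), not the wrapper's pandas code; only the 16-row partition requirement comes from the commit text.

```python
import math

N_PARTITIONS = 16  # minimum rows needed by the gate, per the commit text

# Toy cohort: 24 monthly rows x 3 signals. The third signal is event-only
# and has a value in every third month; other months are NaN (made-up data).
matrix = [
    [0.01, -0.02, 0.03 if month % 3 == 0 else math.nan]
    for month in range(24)
]


def drop_any_nan_rows(rows):
    """Equivalent of dropna(how='any'): discard a month if any signal is NaN."""
    return [r for r in rows if not any(math.isnan(v) for v in r)]


def zero_fill(rows):
    """Equivalent of fillna(0.0): keep every month, treat gaps as zero return."""
    return [[0.0 if math.isnan(v) else v for v in r] for r in rows]


dropped = drop_any_nan_rows(matrix)
filled = zero_fill(matrix)

print(len(dropped), len(dropped) >= N_PARTITIONS)  # too few rows for the gate
print(len(filled), len(filled) >= N_PARTITIONS)    # full history retained
```

One sparse column is enough to push the drop-any-row matrix below the partition count for the whole cohort, which is the failure the regression guard protects against.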
**FINAL** commit of the Phase 4h 5-commit cluster. Wires commits 1-4
into the weekly compute orchestrator and ships the end-to-end
``@pytest.mark.network`` integration test.
Wiring (`compute/main.py`):
- New imports for the 4 OSAP layers (ingest / features / scoring /
validation). Ruff auto-reorganized them; no logic delta from the
reorganization.
- New OSAP pipeline block inserted after ``asof_date = now.date()``
and before the Step-8 per-ticker loop (~100 LOC). Steps:
1. ``fetch_osap_returns(signals=OSAP_SIGNALS_100, as_of=asof_date)``
— single bulk fetch (cached parquet per PR #110 + commit 1
``signals``/``as_of`` kwargs).
2. ``compute_long_short_returns`` — commit 2 helper.
3. ``gate_osap_signals`` + ``filter_accepted_signals`` — commit 4
PBO/DSR hard gate (PBO ≤ 0.5 AND DSR > 0 inherited from PR
#60's ``factor_passes_gates`` defaults).
4. ``compute_rolling_ic_12m`` per accepted signal — observability
only, populates ``metadata.osap_signals_ic_12m``.
5. ``compute_osap_signals`` over the *accepted* subset only —
produces per-ticker proxy signal map.
6. ``coverage_by_signal`` — populates
``metadata.osap_signals_coverage_pct``.
7. ``aggregate_osap_signals`` → ``apply_osap_blend(composite,
aggregate, weight=0.5)`` — commit 3 Path-b. STAYS OUTSIDE
``compute_composite()``: ``PHASE3_WEIGHTS`` sum-to-1.0
invariant (``compute/scoring/composite.py:43-45``) is intact.
- **Top-5 ranking still uses raw ``composite_score`` per SKILL.md
Rule 16.** ``composite_score_osap_adjusted`` is written into
``StockDetail.osap_blended_score`` as an observability column —
Phase 5 ML meta-learner is where 50/50 may be retuned and a
ranking cutover authorized.
- **Graceful degradation** — entire block wrapped in try/except:
on any failure (network outage, ``openassetpricing`` import
error, OSAP release shift, gate exception) all six OSAP-bearing
fields degrade to ``None`` and weekly production continues. OSAP
is observability-only this phase — non-essential to the static
ranking output.
- ``StockDetail`` per-ticker writer (existing loop): two new
fields wired — ``osap_signals=osap_signal_map.get(ticker)``
(dict or None per universe-gap policy) and ``osap_blended_score``
(rounded float or None when reindex misses or value is NaN).
- ``Metadata`` writer: four new fields wired —
``osap_signals_used`` (sorted list of accepted signals),
``osap_excluded_signals`` (sorted list of PBO/DSR rejects),
``osap_signals_ic_12m`` (per-signal rolling-12m Spearman IC,
observability), ``osap_signals_coverage_pct`` (per-signal % of
tickers populated). Each defaults to ``None`` (not empty list/
dict) when the OSAP pipeline degrades, matching the
``| None = None`` schema contract from commit 1.
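The graceful-degradation contract above might look roughly like this. The six field names come from the commit text; `run_osap_block` and the `pipeline` callable are hypothetical, and the real block sits inline in `compute/main.py` rather than in a helper.

```python
import logging

logger = logging.getLogger("osap_sketch")

# The six OSAP-bearing fields named in the commit message.
OSAP_FIELDS = (
    "osap_signals",
    "osap_blended_score",
    "osap_signals_used",
    "osap_excluded_signals",
    "osap_signals_ic_12m",
    "osap_signals_coverage_pct",
)


def run_osap_block(pipeline):
    """Run a hypothetical pipeline callable; on any failure return all-None.

    The weekly run continues either way because OSAP is observability-only.
    """
    try:
        result = pipeline()
        return {name: result[name] for name in OSAP_FIELDS}
    except Exception:
        logger.warning("OSAP pipeline failed; degrading to None", exc_info=True)
        return {name: None for name in OSAP_FIELDS}
```

Returning `None` rather than an empty list or dict keeps a degraded run distinguishable from a run in which the gate simply accepted nothing.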
Test (`tests/test_features/test_osap_e2e_integration.py`, NEW, ~150
LOC, 1 ``@pytest.mark.network @pytest.mark.timeout(600)``):
Full ingest → replicate → gate → IC → blend chain against the real
OSAP package release, 4-signal × 20-ticker slice (kept cheap so the
e2e test stays under the 300s ceiling on shared runners). Asserts:
- Live fetch returns non-empty filtered DataFrame
- Long-short derivation produces the expected
``{signalname, date, ls_return}`` schema
- Every ``GateResult`` has a sensible structure (accepted ⇒
``rejection_reason=None`` + populated PBO/DSR floats; rejected ⇒
one of 4 enumerated reasons)
- Mom1m rolling-12m IC is finite within [-1, 1] (not asserting > 0
— single-window IC is noisy)
- 20-ticker proxy signal map populates the universe
- ``aggregate_osap_signals`` → ``apply_osap_blend`` round-trips and
the output is clipped to [0, 100] over the full 20-ticker index
Test does NOT run full ``compute/main.py`` (502-ticker EDGAR fetch
exceeds CI budget); compute/main.py wiring correctness is verified
by the 906 offline unit tests across the 4 OSAP layers plus this
e2e chain that confirms data shapes match end-to-end.
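For reference, the rolling-12m Spearman lag-1 IC that the test checks for Mom1m can be approximated without scipy as a Pearson correlation of average ranks. This sketch is a guess at the shape of the computation. The function names are hypothetical, and dropping missing values before taking the 13-observation tail is an assumption, not confirmed behaviour.

```python
import math


def _average_ranks(xs):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def _pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # constant series: correlation undefined
    return cov / math.sqrt(var_a * var_b)


def rolling_ic_12m(series, window=12):
    """Spearman correlation between x[t-1] and x[t] over the last 12 pairs."""
    clean = [v for v in series if v is not None and not math.isnan(v)]
    if len(clean) < window + 1:  # 12 lagged pairs need 13 observations
        return None
    tail = clean[-(window + 1):]
    return _pearson(_average_ranks(tail[:-1]), _average_ranks(tail[1:]))
```

A strictly increasing series yields an IC of 1.0 up to floating-point error, and fewer than 13 usable observations yields `None`, matching the behaviour listed for commit 4's tests.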
Docs (atomic with the wiring per user direction):
- ``CLAUDE.md`` ``## Phase status`` — schema bump line updated to
``0.9.0-phase4h``, "Phase 4h in flight in PR #112" stanza added,
test counts bumped to 906 offline + 19 ``@network``. Defense
layer count UNCHANGED at 17 (annotate-only blend, no new veto).
- ``PHASE_STATUS.md`` Phase 4 row — Phase 4h sub-status added,
schema bump cited, "no new veto" lock cited.
- ``SKILL.md`` schema-versions table — new row for
``0.9.0-phase4h`` marked "in flight in PR #112" so the table
doesn't lie pre-merge; documents the 6 new optional fields,
Path-b architecture lock, hard-gate criteria, NaN-policy lock,
observability-only framing, and graceful-degradation contract.
- ``WORKFLOW.md`` — unchanged this commit; Phase 4h plan reference
already present, post-merge tick belongs to the merge PR.
Verification:
- ``ruff check .`` → clean (auto-reorganized import block in
compute/main.py + e2e test; no logic delta)
- ``pytest tests/ -m "not network"`` → **906 passed** (892 prior +
14 new from commit 4) — confirms zero regression from the
``compute/main.py`` wiring
- ``python -m compute.output.schema_check`` → in-sync (no schema
delta this commit — schema bump landed in commit 1)
- ``python -c "from compute import main"`` → OK; SCHEMA_VERSION
resolves to ``0.9.0-phase4h``
**STOP** here per user instruction — awaiting audit before Mark-
Ready flip on PR #112. After Ready + merge, Section I post-merge:
Vercel MCP 4-call (deploy health) + Playwright 4-ticker matrix
including one zero-OSAP-coverage ticker as the new failure mode.
Held issue (Phase 4h.1 full per-stock signal replication) auto-
files on the merge webhook event.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
This was referenced May 18, 2026
dackclup pushed a commit that referenced this pull request May 19, 2026
**FINAL** commit of the Phase 4h.2 Part 1 3-commit cluster (issue
#116). Populates the ``osap_gate_diagnostics`` field landed in commit
1's schema delta + docs the full Part-1 surface so reviewers + future
maintainers see the schema and observability contract in one place.
**`compute/main.py` wiring** (+23 LOC):
1. Import added to ``from compute.output.schemas import (...)``:
``OsapGateDiagnostic`` inserted alphabetically between ``Metadata``
and ``PillarScores`` (schemas import already used at this site, no new
module touched).
2. Variable initialized BEFORE the OSAP try block:
``osap_gate_diagnostics: dict[str, OsapGateDiagnostic] = {}``.
3. Populated inside the try after ``gate_results =
gate_osap_signals(osap_ls, requested_signals=
config.OSAP_SIGNALS_100)`` and BEFORE ``filter_accepted_signals`` —
captures EVERY signal that reached the gate (both accepted and
rejected). Accepted carry ``rejection_reason=None``; rejected carry
one of the canonical taxonomy values (``high_pbo`` / ``low_dsr`` /
``insufficient_data`` / ``gate_failed``) per
``compute/validation/osap_validation.py::GateResult``.
4. Reset to ``{}`` in the OSAP-pipeline-failed ``except`` branch so
graceful degradation continues to leave every osap_* field at
``None``.
5. Wired into the ``Metadata(...)`` constructor with the established
``or None`` idiom:
```python
osap_gate_diagnostics=osap_gate_diagnostics or None,
```
**Tests** (``tests/test_output/test_schema_phase4h2.py``, +55 LOC, 2
new offline appended to commit 1's suite):
1. ``test_metadata_gate_diagnostics_round_trip_with_production_cohort_shape``
— simulates the production observation from #116 (22 signals reach
the gate, all rejected with a mix of rejection_reason values across
the canonical 4-value taxonomy); asserts the
dict-of-OsapGateDiagnostic structure survives ``model_validate`` →
``model_dump`` → ``model_validate`` round-trip.
2. ``test_metadata_gate_diagnostics_accepted_signal_has_null_rejection_reason``
— locks the ``rejection_reason=None`` semantics for accepted signals
(Pydantic preserves None rather than coercing to a sentinel string).
**Docs** (atomic with the wiring):
- ``CLAUDE.md`` ``## Phase status`` — schema line updated to
``0.9.1-phase4h.2`` with the PATCH-bump framing; preserved the prior
MINOR-bump history (`0.8.0-phase4.5f` → `0.9.0-phase4h` via PR #112).
- ``PHASE_STATUS.md`` row 4 — Phase 4h.2 Part 1 sub-status added;
describes both new fields, the Part-1 / Part-2 split rationale ("Part
2 opens after ≥1 week of production diagnostic data accumulates"),
and the "no new veto / no rank change" invariant.
- ``SKILL.md`` schema-versions table — new row for
``0.9.1-phase4h.2`` inserted above the ``0.9.0-phase4h`` row; cites
the SKILL.md L305 PATCH-bump quote verbatim, locks the
``OsapGateDiagnostic`` "all 4 fields explicit = None" refinement in
writing, and documents the set-diff helper placement decision
(``compute/features/osap_replicate.py::signals_in_dataframe`` per
refinement #4).
- ``WORKFLOW.md`` — unchanged; no "Open items" checkbox list for Phase
4h.2 yet (would be created when Part 2 is scoped).
**Verification ladder** (steps 1-5 complete):
- ``ruff check .`` → clean ✅
- ``pytest tests/ -m "not network"`` → **924 passed** (911 baseline +
13 new across the 3-commit cluster: 7 schema + 4 helper + 2
gate-diagnostic) ✅
- ``python -m compute.output.schema_check`` → in-sync (no new schema
delta this commit; the snapshot already captured both fields +
``OsapGateDiagnostic`` from commit 1's regen) ✅
- ``python -c "from compute.main import run_weekly_compute; from
compute.output.schemas import OsapGateDiagnostic; ..."`` → OK ✅
Steps 6-8 next: ``git push`` → open Draft PR →
``subscribe_pr_activity`` + STOP for user audit + Mark-Ready
authorization.
**Defense layer**: unchanged at 17. **Top-5 rotation**: unchanged.
**Schema version**: ``0.9.1-phase4h.2`` (locked from commit 1).
**Cluster summary**:
| # | SHA | LOC | Tests added |
|---|---|---|---|
| 1 — schema delta | ``428729ad`` | 231 | +7 (round-trip + backward-compat) |
| 2 — silent-drop wiring | ``c7949403`` | 116 | +4 (helper unit tests) |
| 3 — gate diagnostics + docs (this) | TBD | ~86 | +2 (gate-diag round-trip) |
| **Total** | — | ~433 | **+13** |
Within the Option-β diagnostic-first scope (~250-350 LOC budget; +
docs); under the original plan's ~300 LOC estimate.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
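The gate-diagnostics capture and the ``or None`` idiom from this commit can be sketched with a standard-library dataclass standing in for the Pydantic model. The four field names are hypothetical, since the commit states only that there are four fields defaulting to ``None``, and both helper functions are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateDiagnostic:
    # Hypothetical field names; the commit says four fields, all "= None".
    pbo: Optional[float] = None
    dsr: Optional[float] = None
    sharpe: Optional[float] = None
    rejection_reason: Optional[str] = None


def collect_diagnostics(gate_results):
    """Record EVERY signal that reached the gate, accepted or rejected."""
    return {name: GateDiagnostic(**fields) for name, fields in gate_results.items()}


def metadata_value(diagnostics):
    """The `or None` idiom: an empty dict is written out as None."""
    return diagnostics or None


gate_results = {
    "SignalA": {"pbo": 0.2, "dsr": 0.4, "sharpe": 1.1},  # accepted
    "SignalB": {"pbo": 0.9, "dsr": -0.1, "sharpe": 0.1,
                "rejection_reason": "gate_failed"},      # rejected
}
diagnostics = collect_diagnostics(gate_results)
print(metadata_value(diagnostics)["SignalA"].rejection_reason)
print(metadata_value({}))
```

An accepted signal keeps ``rejection_reason=None``, and the empty dict left by the reset in the ``except`` branch is written out as ``None``.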
dackclup added a commit that referenced this pull request May 19, 2026
…manifest + 6 offline tests (#119)
Phase 4j scout PR — 3rd of 4 factor-library scouts (OSAP ✅ #110, JKP
✅ #114, Qlib THIS, IPCA next as 4k). Ships `pyqlib` install +
Alpha158 158-feature manifest + 6 offline tests. NO production wiring;
yfinance-to-Qlib BYO adapter + full Alpha158 compute on 502-ticker
universe deferred to follow-on integration PR.
5 pre-plan investigations (all verified 2026-05-19):
1. PyPI package: `pyqlib` 0.9.7 (canonical). Alternative names
(`qlib`, `microsoft-qlib`) return 404.
2. License: MIT via wheel METADATA classifier. No CC BY-NC
complication like JKP — safe for Phase 6+ commercial roadmap.
3. Data init: `qlib.init(provider_uri=..., region="us")`. NO public US
data bundle — Qlib's default covers CN A-share only; US universe is
BYO via local .bin files.
4. Alpha158 surface: `qlib.contrib.data.handler.Alpha158` → 158
columns; manifest captured via `Alpha158DL.get_feature_config()[1]`
and hardcoded; offline test 3 locks against upstream drift.
5. CI install footprint: ~150-180 MB net-new (mlflow / lightgbm /
cvxpy / pymongo / redis / gym / jupyter + nbconvert transitives).
One-time cold-start; pip wheel caching mitigates subsequent runs.
Critical scope decisions:
- NO @network test for this scout — Qlib has no remote CDN; data flow
is local-bin filesystem I/O. Originally planned
synthetic-OHLCV→bin→init→Alpha158 smoke test was dropped because
pyqlib's PyPI wheel doesn't bundle `scripts/dump_bin.py`. Replacement:
manifest-vs-runtime-introspection drift detector (stronger than the
dropped test — fires on every pip install upgrade if Qlib changes the
feature set).
- Module name `compute/ingest/qlib_features.py` (NOT `qlib.py`) —
Python import resolution would shadow the installed `qlib` package,
breaking the entire factor-library integration. Distinct module name
avoids namespace collision.
- Tenacity NOT applied — Qlib's data flow is local filesystem I/O, no
network retry semantics needed. First ingest module in QuantRank that
diverges from the canonical `compute/ingest/osap.py:52-56` retry
decorator (documented in module docstring).
Module layer (compute/ingest/qlib_features.py, ~186 LOC):
- `QLIB_DATA_CACHE: Path` constant (gitignored via parent
`compute/cache/`)
- `QLIB_INSTRUMENTS_UNIVERSE = "sp500"` (custom universe for future
BYO bundle)
- `ALPHA158_FEATURE_NAMES: tuple[str, ...]` — 158 hardcoded entries,
asserted at module load
- `init_qlib(provider_uri=None)` — thin wrapper around
`qlib.init(region="us")`; idempotent
- `fetch_alpha158_features(*, instruments, start_time, end_time)` —
Alpha158 handler wrapper
Config layer (compute/config.py, +23 LOC):
- `QLIB_DATA_CACHE: Path = CACHE_DIR / "qlib" / "us_data"`
- `QLIB_DATA_MAX_AGE_DAYS: int = 31`
- `ALPHA158_FEATURE_COUNT: int = 158` (asserted against module
manifest length)
Tests (6 offline; ~113 LOC):
1. `test_alpha158_feature_manifest_has_158_entries` — primary CI
signal (pure cardinality + uniqueness, no Qlib runtime)
2. `test_alpha158_feature_manifest_first_5_anchor` — K-bar leading
features anchor (KMID, KLEN, KMID2, KUP, KUP2)
3. `test_alpha158_feature_manifest_matches_runtime_introspection` —
drift detector (manifest == `Alpha158DL.get_feature_config()[1]`)
4. `test_qlib_data_cache_constant_under_repo_cache_dir` — config
sanity
5. `test_init_qlib_passes_us_region_and_provider_uri` — monkeypatch
capture
6. `test_init_qlib_defaults_to_config_cache_when_no_uri` — default
path verified
pyproject.toml: `pyqlib>=0.9.7,<0.10` added to `[factors]` extra
(authorized in advance via plan-mode approval; pin range because
Qlib's API drifts across minor versions).
Ask-first surfaces touched:
- `pyproject.toml [factors]` — extended (authorized via plan-mode)
- `ci.yml` UNCHANGED (`[dev,factors]` install already covers new dep)
- `compute-rankings.yml` UNTOUCHED per user hard constraint
- Schema triple UNTOUCHED (no schema delta this scout)
Verification (local):
- ruff check . → clean
- pytest tests/ -m "not network" → 930 passed (924 prior + 6 new)
- python -m compute.output.schema_check → in-sync
- python -c "from compute.ingest.qlib_features import ..." → OK 158
- Vercel preview ✅ READY
Defense layer unchanged at 17. Top-5 rotation unchanged (no scoring
touched). Schema unchanged at 0.9.1-phase4h.2.
After this merges → 3 of 4 factor-library scouts done. Phase 4k (IPCA)
is the final scout; once 4k merges → eligible for `v1.1.0-phase4` tag.
Out of scope (deferred to follow-on full Phase 4j integration PR,
~5-commit cluster):
- yfinance-to-Qlib BYO adapter (~150 LOC + custom S&P 500 instruments
universe registration)
- Full Alpha158 feature compute on 502-ticker universe → 502 ×
N_dates × 158 DataFrame
- Per-feature cross-validation framework (PBO/DSR doesn't apply to
per-stock-per-date features; walk-forward IC scoring per feature is
the likely replacement)
- Schema additions (StockDetail.qlib_features +
Metadata.qlib_features_used + IC observability) → schema bump
0.9.1-phase4h.2 → 0.10.0-phase4j
- compute/main.py wiring decision (observability-only? blended into
composite? Phase-5 ML-meta-learner-only consumer?)
Audit history:
- Plan-audit round 1: 5 pre-plan investigations verified · MIT lock ·
heavy-deps disclosure approved
- Plan-audit rounds 2-5: same plan re-paste loop (session-side stuck);
main session verified PR #119 unchanged at each check
- Pre-CI audit: clean (1 legitimate pivot — test #6 swapped from
end-to-end smoke to manifest drift detector because pyqlib wheel lacks
scripts/dump_bin.py)
- Conditional Mark-Ready authorization given on Vercel ✅ +
mergeable_state clean
- Squash merged per "merge call is yours" delegation pattern (PR #112
/ #114 / #118 precedent)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
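The manifest checks described in this scout (cardinality, uniqueness, and drift against the runtime feature list) can be sketched in a few lines. The five names are the K-bar anchors quoted above, standing in for the full 158-entry manifest. `check_manifest`, `manifest_drift`, and the `runtime_names` list are hypothetical; a real drift test would obtain the runtime list from Qlib rather than from a literal.

```python
# Stand-in manifest: the five K-bar anchor names from the commit text.
MANIFEST = ("KMID", "KLEN", "KMID2", "KUP", "KUP2")
EXPECTED_COUNT = 5  # 158 for the real Alpha158 manifest


def check_manifest(names, expected_count):
    """Fail fast when the manifest has the wrong size or repeats a name."""
    if len(names) != expected_count:
        raise AssertionError(f"expected {expected_count} names, got {len(names)}")
    if len(set(names)) != len(names):
        raise AssertionError("manifest contains duplicate feature names")


def manifest_drift(manifest, runtime_names):
    """Report names that exist on only one side of the comparison."""
    return {
        "missing_upstream": sorted(set(manifest) - set(runtime_names)),
        "new_upstream": sorted(set(runtime_names) - set(manifest)),
    }


check_manifest(MANIFEST, EXPECTED_COUNT)

# Hypothetical runtime list after an upstream change renamed one feature.
runtime_names = ["KMID", "KLEN", "KMID2", "KUP", "KUP3"]
print(manifest_drift(MANIFEST, runtime_names))
```

A drift test would then assert that both lists in the report are empty, so any upstream rename or addition fails the suite.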
This was referenced May 19, 2026
dackclup added a commit that referenced this pull request on May 20, 2026
…variants (#127) Closes #126. Process Hygiene Item #1 (parent epic #125). Adds Hypothesis property- based tests as the new defense line for "untested data-shape assumption" bugs — the class that hid the OSAP quintile/tercile silent-drop in PR #112's CI until production cron diagnostics caught it (subsequently fixed in PR #124 / Phase 4h.2 Part 2). If a `@given` property over `port_count ∈ {2,3,5,10}` had existed in Phase 4h, the hardcoded `port=10` filter would have been falsified the first time the CI ran. Test-addition only. No scoring / feature behavior touched. No schema delta. No CI workflow changes. Sub-task 1 — Hypothesis added to [dev] extra (pyproject.toml) -------------------------------------------------------------- `hypothesis>=6.92` joins `pytest` + `ruff` in the `[dev]` optional extra. Pure-Python dep (no C extensions); CI footprint negligible. Sub-task 2 — Property tests for osap_replicate.py (7 tests, 394 LOC) --------------------------------------------------------------------- New file: tests/test_features/test_osap_replicate_properties.py 7 property tests covering data-shape invariants the Phase 4h.2 Part 2 multi-port adapter must satisfy: 1. `test_compute_long_short_returns_handles_any_port_cardinality` — for port_count ∈ [2, 10] and n_dates ∈ [1, 12], the adapter produces exactly n_dates LS rows with ls_return == port_count - 1. THE headline property — would have caught the PR #112 bug. 2. `test_signals_dropped_no_long_short_returns_sorted_unique` — contract for the Metadata.osap_signals_dropped_no_long_short field: sorted, no duplicates, single-port signals appear, two-port signals don't. 3. `test_normalize_port_label_int_input_yields_2char_zfill` — port=int(1..10) → '01'..'10' for any input list. Idempotent. 4. `test_normalize_port_label_str_input_yields_2char_zfill` — mixed '1' / '01' / '10' inputs normalize to a uniform 2-char width. 5. 
`test_part2_accounting_invariant_under_random_partition` — the Phase 4h.2 Part 2 accounting equation (manifest = missing + dropped + gated + used) holds for any 3-way partition of a synthetic manifest into the bucket set. Uses st.composite to draw disjoint partitions. 6. `test_coverage_by_signal_returns_pct_in_0_to_100` — domain contract for the coverage helper (0..100 percent, NOT 0..1 fraction). 7. `test_rank_signals_cross_sectional_returns_unit_interval` — ranks live in (0, 1] for any non-empty cross-section. Sub-task 3 — Property tests for scoring transforms (7 tests, 340 LOC) --------------------------------------------------------------------- New file: tests/test_scoring/test_transforms_properties.py 7 property tests covering composite (compute/scoring/composite.py) and OSAP blend (compute/scoring/osap_blend.py) — pure-numeric transforms whose output domains are contract-locked by the downstream Pydantic + TypeScript schemas. Composite tests (4): A. `test_compute_composite_output_bounded_0_to_100` — for any pillar input in [0, 100], composite ∈ [0, 100] (the writer + Pydantic contract) B. `test_compute_composite_all_50_inputs_yield_composite_50` — neutral-pillar input collapses to composite == 50 (catches accidental weight-vector drift) C. `test_compute_composite_neutralize_missing_imputes_nan_to_50` — NaN pillar inputs are imputed when neutralize_missing=True; all-NaN → composite == 50.0 D. `test_compute_composite_constant_input_equals_input` — constant-pillar input → composite == that constant (PHASE3 weight-sum-equals-1.0 invariant expressed as a property) OSAP blend tests (3): E. `test_apply_osap_blend_output_bounded_and_nan_passthrough` — blend ∈ [0, 100]; NaN OSAP → composite passthrough; finite OSAP → interior point between composite and osap F. `test_aggregate_osap_signals_finite_values_in_0_to_100` — finite aggregate values live in [0, 100]; NaN allowed for universe gaps G. 
`test_apply_osap_blend_weight_zero_is_identity_on_composite` — weight=0 leaves composite unchanged (locks the Phase 4h observability-only design property + Rule 16: Top-5 still ranks raw composite) Sub-task 4 — CI integration + .gitignore + docs ------------------------------------------------- - `.gitignore` already covers `.hypothesis/` at line 50 (Python's default boilerplate) — no edit needed. - CLAUDE.md ## Gotchas — 1-line note that Hypothesis is the new defense line for data-shape bugs (paired with example tests), with the `@settings(deadline=None)` anti-pattern flagged. - CI hypothesis.errors.Flaky behaviour: default profile makes flaky examples fail-fast (no retry); the `pytest -m "not network"` CI invocation inherits this. NO `@settings(deadline=None)` used in this PR — slow examples surface as honest failures. Sanity verification (NOT committed) ----------------------------------- As part of pre-push verification I temporarily reverted the multi- port adapter at compute/features/osap_replicate.py:143 (`agg(["min", "max"])` → `agg(["min", "min"])`) and confirmed `test_compute_long_short_returns_handles_any_port_cardinality` fails with "Falsifying example: port_count=2, n_dates=1". Reverted the break before commit. 
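The headline property and the sanity break described above can be illustrated with a dependency-free sketch. It assumes rows of `(date, port, ret)` tuples and hypothetical helper names (`normalize_port_label`, `long_short_returns`); the real adapter in `compute/features/osap_replicate.py` is not reproduced here. The check enumerates port counts exhaustively instead of drawing them with Hypothesis.

```python
def normalize_port_label(port):
    """Normalize an int or str port label to a 2-character zero-padded string."""
    return str(int(port)).zfill(2)


def long_short_returns(rows):
    """Per-date long-short return: highest-port return minus lowest-port return.

    `rows` is an iterable of (date, port, ret). Dates with fewer than two
    distinct ports are skipped because no long-short leg can be formed.
    """
    by_date = {}
    for date, port, ret in rows:
        by_date.setdefault(date, {})[normalize_port_label(port)] = ret
    out = {}
    for date, ports in by_date.items():
        if len(ports) < 2:
            continue
        lo, hi = min(ports), max(ports)  # 2-char labels sort in numeric order
        out[date] = ports[hi] - ports[lo]
    return out


def synthetic_rows(port_count, n_dates):
    """Synthetic panel in which port p earns a return of exactly p."""
    return [(d, p, float(p)) for d in range(n_dates) for p in range(1, port_count + 1)]


def check_all_cardinalities():
    """Exhaustive form of the property: one LS row per date, spread == port_count - 1."""
    for port_count in range(2, 11):
        for n_dates in range(1, 13):
            ls = long_short_returns(synthetic_rows(port_count, n_dates))
            assert len(ls) == n_dates
            assert all(v == port_count - 1 for v in ls.values())
    return True
```

Replacing `max(ports)` with `min(ports)` makes every spread zero, so the check fails at `port_count=2, n_dates=1`, which mirrors the falsifying example reported for the temporary revert.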
Constraints honored ------------------- - NO modification to compute_composite() / PHASE3_WEIGHTS sum=1.0 invariant (composite.py:43-45) — pure test-addition PR - Rule 16: Top-5 still ranks raw composite_score; no scoring touched - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) - Schema triple untouched (no schemas.py / types.ts changes) - NO @settings(deadline=None) — default deterministic deadline - NO RuleBasedStateMachine (out of scope per issue #126) Test count delta ---------------- Before: 945 passed (Phase 4h.2 Part 2 baseline) After: 959 passed (+14 property tests across 2 new files) Files (4 changed, +747 / 0) ---------------------------- - pyproject.toml — +6 (hypothesis>=6.92 in [dev]) - CLAUDE.md — +7 (## Gotchas note) - tests/test_features/test_osap_replicate_properties.py — +394 NEW - tests/test_scoring/test_transforms_properties.py — +340 NEW Verification ladder all green ------------------------------ - ruff check . → All checks passed - python -m pytest tests/ -m "not network" → 959 passed (1m46s) - python -m pytest tests/test_features/test_osap_replicate_properties.py tests/test_scoring/test_transforms_properties.py → 14 passed (5s) - python -m compute.output.schema_check → in sync (no schema delta) - Sanity break-revert confirmed property test catches a regression No regression discovered ------------------------ Property tests passed on first execution against current main (commit 80c6641, Phase 4h.2 Part 2 already merged). No hidden bugs surfaced beyond the 56-signal gap that PR #124 already fixed — which itself is a good signal that the multi-port adapter handles the [2, 10] cardinality region cleanly. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>
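The composite and blend properties listed in Sub-task 3 follow from both transforms being convex combinations. A minimal sketch, assuming hypothetical function names and an illustrative weight vector; the real `PHASE3_WEIGHTS` and the signatures of `compute_composite()` and `apply_osap_blend()` are not shown in this thread.

```python
import math

# Illustrative weights only; the real PHASE3_WEIGHTS are not quoted here.
EXAMPLE_WEIGHTS = {"value": 0.25, "quality": 0.25, "momentum": 0.25, "risk": 0.25}


def composite(pillars, weights=EXAMPLE_WEIGHTS, neutral=50.0):
    """Weighted mean of pillar scores in [0, 100].

    NaN or missing pillars are imputed to `neutral`, so an all-NaN input
    collapses to the neutral score.
    """
    total = 0.0
    for name, w in weights.items():
        value = pillars.get(name, math.nan)
        total += w * (neutral if math.isnan(value) else value)
    return total


def blend(composite_score, osap_score, weight=0.5):
    """Convex blend of composite and OSAP scores.

    A NaN OSAP score passes the composite through unchanged; weight=0
    is the identity on the composite.
    """
    if math.isnan(osap_score):
        return composite_score
    return (1.0 - weight) * composite_score + weight * osap_score
```

Because the weights sum to 1.0, a constant pillar input `c` yields a composite of `c`, and any blend of two values in [0, 100] stays in [0, 100].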
dackclup added a commit that referenced this pull request on May 20, 2026
Part of epic #125 (Item #4 of 6). Doc-only PR — no code changes, no schema delta, no test additions. Phase 4h timeline (2026-05-18 → 2026-05-19) demonstrated the cost of shipping production wiring + gate logic without a diagnostic surface: - PR #112 (Phase 4h): OSAP signal replication + PBO/DSR gate + Path-b blend, NO observability surface for gate decisions - First production cron: every signal failed gate, no way to know why - PR #118 (Phase 4h.2 Part 1): retrofit diagnostic surface (osap_signals_missing_from_dataset + osap_gate_diagnostics) - Second production cron: 22 missing + 22 fail low_dsr, 56 silently dropped (gap that Part 1 still couldn't fully expose) - PR #124 (Phase 4h.2 Part 2): root-cause fix (multi-port adapter) + osap_signals_dropped_no_long_short closing the accounting gap The combined cost of Phase 4h.2 Parts 1 + 2 (~10 hours across 2 PRs) would have been ~30 minutes of additional Phase 4h scope if the diagnostic surface had shipped alongside the production wiring. Files (3 changed, +83 LOC) --------------------------- - WORKFLOW.md (+63 LOC) — new section "# Observability-Before-Wiring Pattern" inserted between the mobile playbook table and the "Initial Prompts" section. Includes mandatory checklist (6 items) + anti-pattern statement + 3 reference precedents (PR #112 bad, PR #118 good, PR #124 good) - SKILL.md (+14 LOC) — new "Rule 18: Observability-before-wiring" appended to the Core Behavior Rules section (Rule 17 was the prior trailing rule). 
Links back to WORKFLOW.md for the mandatory checklist detail - CLAUDE.md (+6 LOC) — 1 bullet added to ## Conventions referencing the new Rule 18 + WORKFLOW.md section Files NOT touched (deliberately per scope) ------------------------------------------- - PHASE_STATUS.md — chronological log; pattern guidance belongs in WORKFLOW.md / SKILL.md / CLAUDE.md, not in the historical tracker - AGENTS.md — cross-tool agent doc; lookups defer to WORKFLOW.md by default, so a fresh duplicate would just create drift risk - compute/ / frontend/ / tests/ — doc-only PR, no behavior change Constraints honored ------------------- - No code changes — pure markdown additions - No schema delta — schema_check confirms in-sync - No test additions — pytest count unchanged at 959 - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no new hardcoded test-count claims introduced — the precedents reference PRs and hour estimates, not "N offline + M @network" drift patterns) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged) https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>
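The accounting gap in the timeline above (56 signals that no diagnostic field reported) is the kind of problem a closed accounting check surfaces. A minimal sketch, assuming a hypothetical `account_for_signals` helper; the bucket names follow the equation manifest = missing + dropped + gated + used quoted in PR #127, and the signal names in any usage are synthetic.

```python
def account_for_signals(manifest, missing, dropped, gated, used):
    """Check that every manifest signal lands in exactly one bucket.

    Returns (unaccounted, double_counted); both are empty when the
    accounting equation manifest = missing + dropped + gated + used holds.
    """
    buckets = {"missing": missing, "dropped": dropped, "gated": gated, "used": used}
    seen = {}
    for bucket_name, signals in buckets.items():
        for signal in signals:
            seen.setdefault(signal, []).append(bucket_name)
    unaccounted = sorted(set(manifest) - set(seen))
    double_counted = sorted(s for s, names in seen.items() if len(names) > 1)
    return unaccounted, double_counted
```

With a 100-signal manifest, 22 missing, 22 gated, none used, and no `dropped` bucket reported, the helper returns 56 unaccounted names, which matches the gap the second production cron left open.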
dackclup added a commit that referenced this pull request on May 20, 2026
…ble skills (#132) 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. 
Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. 
Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . 
→ All checks passed
- python tools/check_doc_test_counts.py → exit 0
- python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate)
- python -m compute.output.schema_check → in sync (no schema touch)
- python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests)
- Claude Code skill registry pick-up verified via session reload: all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list

Constraints honored
-------------------
- No touch to compute/ / frontend/ / tests/
- No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up)
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections)

Follow-up issue (to file post-merge)
------------------------------------
Title: "Portable Skills Library — extract remaining tacit patterns"
- annotate-before-veto (progressive rule rollout)
- graceful-degradation-try-except (1-line wrapper guidance)
- pre-plan-investigations as standalone (currently subsumed)
- Anything else surfaced by future PR descriptions

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request on May 20, 2026
…sk C.1 recovery) (#135)
* docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills
3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR: no code, no schema delta, no test additions.
https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
* docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md
Recovers Task C.1 from the original handoff that was silent-dropped in the prior PR #132 commit (50da720). The handoff explicitly named "Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable skills library; the auditor session caught the omission and authorized this follow-up commit on the existing branch.
Files (2 new, +138 LOC)
------------------------
- .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC): vendored content of upstream skills/karpathy-guidelines/SKILL.md (67 LOC, byte-for-byte preserved) + 15-line appended attribution block referencing the upstream source, commit SHA, and the Karpathy tweet that motivated the guidelines.
- THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root): third-party license disclosures. Section "karpathy-guidelines (Claude Code skill)" carries source URL, license declaration, vendored path, vendored date, upstream commit SHA, upstream first-commit date, and the full standard MIT License text with copyright attributed to "multica-ai contributors" (upstream has no individual copyright line and no standalone LICENSE file; the `license: MIT` claim appears in upstream README.md § License and each skill's YAML frontmatter).

Upstream provenance
-------------------
- Source: https://github.com/multica-ai/andrej-karpathy-skills
- Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2
- Upstream first commit: 2026-01-27
- Vendored date: 2026-05-20
- License: MIT (declared)

Verbatim content preserved
--------------------------
`diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md .claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY the 15-line appended attribution block at lines 68-82. The upstream 67-line content (YAML frontmatter + "Karpathy Guidelines" heading + the 4 principles) is byte-for-byte unchanged. Per the spec constraint: "Keep the 4 principles verbatim. The only permitted edit is to 'append' an attribution block at the end of the file."

License-disclosure caveat
-------------------------
Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README + YAML frontmatter but does NOT ship a standalone LICENSE file. The `THIRD_PARTY_NOTICES.md` entry includes the standard MIT License template with copyright attributed to the GitHub org ("multica-ai contributors"), matching the principle that an MIT declaration without a formal copyright line still licenses to the redistributor; the attribution is conservative.

Verification ladder all green
------------------------------
- ruff check . → All checks passed
- python tools/check_doc_test_counts.py → exit 0 (no test-count drift introduced by this commit)
- python tools/check_branch_collisions.py "karpathy" → no scope collisions detected
- python -m compute.output.schema_check → in sync (no schema touch)
- python -m pytest tests/ -m "not network" → 959 passed (unchanged; .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests)
- Skill registry pickup verified via session reload: `portable-karpathy-guidelines` appears in the available-skills list with the upstream description verbatim

Constraints honored
-------------------
- No squash / amend of the prior 50da720 commit: this is a fresh commit pushed on top of the existing branch (per spec "do not squash the old commit")
- No touch to the 4 already-shipped portable skills in 50da720
- No touch to compute/ / frontend/ / tests/
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- Karpathy SKILL.md upstream content preserved verbatim; only the attribution block appended below the original content

PR description update will follow as a separate `gh pr edit` / MCP `update_pull_request` call so the new "License Compliance" section + the audit-table row for karpathy-guidelines land in the PR body.

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
---------
Co-authored-by: Claude <noreply@anthropic.com>
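The verbatim-vendoring claim (upstream content unchanged, attribution appended at the end) can also be checked programmatically instead of by reading `diff` output. A minimal sketch, assuming a hypothetical `appended_suffix` helper that operates on file contents already read into strings:

```python
def appended_suffix(upstream_text, vendored_text):
    """Return the lines appended to a vendored file.

    Returns None if the upstream content is not preserved verbatim as a
    prefix of the vendored content, i.e. something other than an append
    happened.
    """
    if not vendored_text.startswith(upstream_text):
        return None
    return vendored_text[len(upstream_text):].splitlines()
```

For the vendored skill described above, the helper would be expected to return the 15 attribution lines, assuming the upstream file ends with a newline; a `None` result would mean the upstream body was modified.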
dackclup added a commit that referenced this pull request on May 20, 2026
…136) Vendoring + cleanup PR. Docs/skills-only — no code, no schema delta, no test additions. Task A — Vendor 8 mattpocock/skills selections ---------------------------------------------- Upstream: https://github.com/mattpocock/skills (MIT, Copyright (c) 2026 Matt Pocock). Vendored at upstream HEAD d54c497aa94400a496d3f2c38be10fa5f284c5a9 (2026-05-20). Selection criterion: engineering-core skills applicable to QuantRank's Python + TypeScript stack and PR-iteration workflow. Vendored 8 of upstream's 18 skills (flat naming under .claude/skills/ matches the portable-* convention from PR #132): .claude/skills/mattpocock-diagnose/ SKILL.md (128 LOC = 117 upstream + 11 attribution) scripts/hitl-loop.template.sh (verbatim) .claude/skills/mattpocock-tdd/ SKILL.md (120 LOC = 109 + 11) + 5 sidecars: deep-modules / interface-design / mocking / refactoring / tests (.md, verbatim) .claude/skills/mattpocock-to-issues/ SKILL.md (94 LOC = 83 + 11) .claude/skills/mattpocock-to-prd/ SKILL.md (87 LOC = 76 + 11) .claude/skills/mattpocock-setup-harness/ SKILL.md (132 LOC = 121 + 11; disable-model-invocation: true) + 5 sidecars: domain / issue-tracker-github / issue-tracker- gitlab / issue-tracker-local / triage-labels (.md, verbatim) .claude/skills/mattpocock-handoff/ SKILL.md (26 LOC = 15 + 11) .claude/skills/mattpocock-write-a-skill/ SKILL.md (128 LOC = 117 + 11) .claude/skills/mattpocock-grill-me/ SKILL.md (21 LOC = 10 + 11) Total: 19 new files, ~860 LOC of upstream content + 88 LOC attribution blocks. Each vendored SKILL.md carries upstream content byte-for-byte plus an 11-line appended "## License + Attribution" block referencing the upstream SHA + repo's THIRD_PARTY_NOTICES.md. Sidecars (referenced via ./domain.md style links) vendored verbatim. 
Skipped 10 upstream skills: - caveman / scaffold-exercises / setup-pre-commit / migrate-to- shoehorn / git-guardrails-claude-code (TypeScript-specific or redundant with QuantRank's existing CI guardrails) - grill-with-docs / improve-codebase-architecture / triage / prototype / zoom-out (lower-priority for current QR workflow) - all in-progress/ deprecated/ personal/ entries Registry pickup verified — 7 of 8 mattpocock skills appear in the available-skills list (mattpocock-diagnose / -tdd / -to-issues / -to-prd / -handoff / -write-a-skill / -grill-me); mattpocock-setup- harness has upstream `disable-model-invocation: true` (user-invoked only, not model-invoked). Task B — Remove 11 unused skills --------------------------------- QuantRank is a static-site finance dashboard — Office docs / Slack GIFs / art-generation / branded-design tooling don't apply. Deleted: algorithmic-art (p5.js generative art) brand-guidelines (Anthropic brand colors) canvas-design (poster / PDF visual art, 5.6 MB of fonts) docx (Word document tooling) internal-comms (corporate status reports) pdf (PDF form filling / OCR) pptx (PowerPoint deck generation) slack-gif-creator (Slack-optimized animated GIFs) theme-factory (artifact theme presets) web-artifacts-builder (claude.ai shadcn artifact builder) xlsx (Excel spreadsheet tooling) Total: 306 files deleted (~80,000 LOC dropped, dominated by embedded Office XSD schemas, fonts, and validators). Reduces clone size by ~10 MB. 
Kept (still relevant for QuantRank work):
- mcp-builder (Phase 5 ML may surface an MCP server)
- claude-api (Phase 5 ML SDK work)
- skill-creator (maintainer-only)
- webapp-testing (Playwright Section I verification)
- frontend-design + frontend-design-system (UI work)
- doc-coauthoring (PR descriptions, plans)

Task C: Docs lockstep
-----------------------
- CLAUDE.md row 33: skill count "24 invocation-triggerable skills (7 QuantRank + 17 Anthropic vendored)" → "31 invocation-triggerable skills (12 QuantRank operational + 4 QR-origin portable + 6 Anthropic vendored + 9 external MIT vendored — Karpathy + 8 mattpocock)"
- THIRD_PARTY_NOTICES.md: new "mattpocock-skills" section appended after the existing karpathy-guidelines section. Carries source URL, license, upstream SHA, vendored-skill list, full MIT License text verbatim (Copyright (c) 2026 Matt Pocock per upstream LICENSE).

Verification ladder
-------------------
- ruff check . → All checks passed
- python -m compute.output.schema_check → Schema snapshot in sync
- python tools/check_doc_test_counts.py → exit 0
- python tools/check_branch_collisions.py "skill" "mattpocock" → 3 historical false positives (PRs #110/#112/#114: JKP/OSAP scouts whose commit messages contained "skill"; unrelated)
- pytest tests/ -m "not network" → not run locally (sandbox missing pandas); CI will verify. Changes are docs/skills-only, with zero Python source touched.
- Skill registry pickup verified via session reload: 7 of 8 mattpocock-* skills appear and all 11 removed skills no longer appear; the remaining mattpocock-setup-harness is correctly hidden by its upstream `disable-model-invocation: true` frontmatter.
Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (out of scope) - mattpocock SKILL.md content preserved byte-for-byte; only the 11-line attribution block appended below upstream content - Sidecars vendored verbatim (referenced by SKILL.md via ./<sidecar>.md links — links continue to resolve in the vendored layout) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2 Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request on May 20, 2026
…4 staleness (#139)

Closes #133. Docs/skills-only PR.

Task A: Portable skills library final 2 (closes #133)
------------------------------------------------------
Extracts the last 2 deferred-but-tracked patterns from epic #125:

- .claude/skills/portable-annotate-before-veto/SKILL.md (108 LOC): Progressive-rollout pattern for defense / risk flags. Ship as annotate FIRST, promote to veto only after ≥ 1 production cron of observation + threshold calibration + cohort-acceptance check. Forcing precedent: Phase 4.5 cluster (loss_avoidance_pattern at 0% fire rate would've been a no-op or hotfix candidate as a veto; annotate made it observable).
- .claude/skills/portable-graceful-degradation-try-except/SKILL.md (115 LOC): Wrap every external-data integration call site in a try/except that sets ALL related output fields to None on failure + writes a structured log line + sets a per-integration status Metadata field. 3-rule contract: no partial state, no log swallowing, downstream-aware. Forcing precedent: OSAP integration in compute/main.py (PRs #112 → #118 → #124).

Both skills follow the established portable-* convention from PR #132 (YAML frontmatter + Pattern + Trigger + Skip + QuantRank precedent section). Each pattern section is project-agnostic; QuantRank refs are confined to the labeled "QuantRank precedent" sections at the bottom.

Task B: PHASE_STATUS.md row 4 staleness fix
---------------------------------------------
PHASE_STATUS.md row 4 said "Phase 4h.2 Part 2 in flight in this PR" since PR #124's prep work. PR #124 merged 2026-05-19 (commit sequence visible in main: ...124...118...112...). Updated to "Phase 4h.2 Part 2 merged via PR #124 (2026-05-19)"; the rest of the row 4 text (multi-port OSAP adapter description, IC-decay deferral note) stays unchanged. This was flagged in PR #132 body and tracked as a small follow-up.

No other PHASE_STATUS.md edits; row 4 is the only stale entry.

Task C: Docs lockstep
-----------------------
CLAUDE.md row 33 skill count: 35 → 37 (QR-origin portable category 4 → 6, total reflects the 2 new skills landed here). Categorisation unchanged otherwise; 9arm license-pending caveat still flagged with cross-reference to issue #137.

Skill inventory after this PR (37 total)
-----------------------------------------
- QuantRank operational: 12
- QR-origin portable extract: 6 (was 4; +annotate-before-veto + graceful-degradation-try-except)
- Anthropic vendored: 6
- External MIT vendored: 9 (Karpathy + 8 mattpocock, unchanged)
- External license-pending vendored: 4 (9arm, unchanged)

Verification ladder
-------------------
- ruff check . → All checks passed
- python -m compute.output.schema_check → Schema snapshot in sync
- python tools/check_doc_test_counts.py → exit 0
- pytest tests/ -m "not network" → not run locally (sandbox missing pandas); CI will verify. Changes are docs/skills-only.
- Skill registry pickup verified via session reload: both portable-annotate-before-veto and portable-graceful-degradation-try-except register with full YAML-frontmatter descriptions.

Constraints honored
-------------------
- No touch to compute/ / frontend/ / tests/
- No touch to WORKFLOW.md (out of scope; could file a future follow-up if WORKFLOW.md needs to cross-reference the two new portable skills)
- No squash / amend of prior commits
- No push to main; no force-push; no --no-verify
- No workflow_dispatch trigger
- The 2 new portable skills' pattern descriptions are project-agnostic; QR refs only in labeled "precedent" sections

Epic #125 status after this PR
-------------------------------
- #130 (quarterly cohort-threshold review tracker): recurring, unchanged
- #133 (portable skills library remaining): CLOSED by this PR
- #137 (9arm-skills license clarification): external action, waiting on user to file upstream issue at thananon/9arm-skills

Epic #125 Item 3 (Pre-merge production simulation) remains the only substantive open scope. PHASE_STATUS.md row 4 staleness was the last housekeeping task.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
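The graceful-degradation contract described under Task A (all related fields set to None on failure, a structured log line, a per-integration status field) can be illustrated with a minimal sketch. This is not the code from compute/main.py: `run_osap_step`, the `fetch` callable, the payload shape, and the `osap_status` field name are all hypothetical stand-ins chosen for illustration.

```python
import json
import logging
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger("integrations")


@dataclass
class StockDetail:
    ticker: str
    osap_signals: Optional[dict] = None
    osap_blended_score: Optional[float] = None


@dataclass
class Metadata:
    osap_signals_used: Optional[list] = None
    osap_status: str = "not_run"  # hypothetical name for the status field


def run_osap_step(stocks: list, metadata: Metadata, fetch: Callable[[], dict]) -> None:
    """Apply one external integration with an all-or-nothing outcome."""
    try:
        payload = fetch()  # external call; may raise or return incomplete data
        # Stage every value first so a failure halfway cannot leave partial state.
        staged = {
            s.ticker: (payload["signals"][s.ticker], payload["blended"][s.ticker])
            for s in stocks
        }
        used = list(payload["used"])
    except Exception as exc:
        # Rule 1 (no partial state): reset ALL related output fields.
        for s in stocks:
            s.osap_signals = None
            s.osap_blended_score = None
        metadata.osap_signals_used = None
        # Rule 3 (downstream-aware): consumers can read the status field.
        metadata.osap_status = "failed"
        # Rule 2 (no log swallowing): emit one structured log line.
        logger.warning(json.dumps({
            "integration": "osap",
            "status": "failed",
            "error_type": type(exc).__name__,
            "error": str(exc),
        }))
        return
    for s in stocks:
        s.osap_signals, s.osap_blended_score = staged[s.ticker]
    metadata.osap_signals_used = used
    metadata.osap_status = "ok"
```

Staging the results inside the `try` block is the design choice that matters here: a payload missing a single ticker fails the whole step rather than populating some stocks and not others.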
4 tasks
dackclup
added a commit
that referenced
this pull request
May 20, 2026
…PR A) (#141)

First PR in the multi-PR .md optimization sequence (Option D scope, yกเครื่อง). PR A is the low-risk baseline: fixes 2 broken skill frontmatters that prevent dispatch + drift-fixes 4 stale facts in agent docs.

Critical YAML fix:
- branch-collision-check/SKILL.md and pr-quality-gate/SKILL.md had multi-line `description:` plain-scalar frontmatter that PyYAML (and Claude Code's skill loader) couldn't parse because lines contain `#123` / `#X` issue references after whitespace. YAML treats ` #` as a comment marker, so everything after the first comment-trigger got eaten and the loader fell back to displaying `name: name` in the available-skills list. Both skills were effectively undispatchable from any session.
- Fix: change `description:` to `description: >` (folded block scalar) so newlines become spaces and `#` mid-content is treated as literal text. Verified live in this session: the system reminder now shows the full TRIGGER/SKIP descriptions for both.

Stale-fact pass:
- .claude/skills/README.md L14-16: "27 invocation-triggerable skills" → references CLAUDE.md as the canonical count (38) to prevent future drift. Future top-level skill add/remove only needs to bump CLAUDE.md §Layout, not three files.
- AGENTS.md L104: ".claude/skills/ # 24 loaded skills" → 38.
- AGENTS.md L287: "Schema version: 0.8.0-phase4.5f" → 0.9.2-phase4h.2 (3 versions behind). Now references SKILL.md schema-version table for full history.
- CLAUDE.md L181-192 (§Phase status): "Current schema 0.9.1-phase4h.2 ... Phase 4h in flight in PR #112" → 0.9.2-phase4h.2 + Phase 4h shipped (Parts 1+2 done via #112/#118/#124).
- CLAUDE.md + AGENTS.md §Phase status: "Epic #125 Item 3 in flight via PR #140" → "PR 1 of 2 shipped" at commit a52aa2d; PR 2 remaining.

CLAUDE.md + AGENTS.md edit ships per the lockstep convention. No code touched, no schema touched, so pre-merge-prod-sim.yml won't trigger (paths compute/scoring + compute/features unaffected).

Next in optimization sequence: PR B (CLAUDE.md token diet), TBD after user reviews this one.

Co-authored-by: Claude <noreply@anthropic.com>
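The ` #` comment behaviour behind the YAML fix can be reproduced with a small PyYAML sketch. The frontmatter text below is invented for illustration (it is not the real SKILL.md content), and it shows the simpler single-line case, where the plain scalar is silently truncated rather than failing to parse.

```python
import yaml  # PyYAML, third-party

# Plain scalar: whitespace followed by '#' starts a comment, so the tail is lost.
PLAIN = "name: pr-quality-gate\ndescription: Run before merging a PR #123 style reference\n"

# Folded block scalar ('>'): line breaks fold into spaces and '#' stays literal.
FOLDED = (
    "name: pr-quality-gate\n"
    "description: >\n"
    "  Run before merging a PR\n"
    "  #123 style reference\n"
)

plain = yaml.safe_load(PLAIN)
folded = yaml.safe_load(FOLDED)

print(repr(plain["description"]))   # truncated at the comment marker
print(repr(folded["description"]))  # full text, with a trailing newline
```

A quoted scalar would also keep the `#` literal; the folded form has the advantage of leaving long multi-line descriptions readable in the source file.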
Summary
Phase 4h commit 1 of N: foundation only. Lays the schema-triple groundwork + the cron-workflow extras bump so subsequent commits (replicate / blend / validation / main.py wiring) can extend without each touching ask-first surfaces. No new compute paths are wired into `main.py` yet. The schema additions are all Optional with `None` defaults; the next weekly cron run will write `version: "0.9.0-phase4h"` JSONs with the OSAP fields absent (i.e., null), which is forward-compatible.

This PR will accumulate ~4 more commits before flipping Draft → Ready:

- `compute/features/osap_replicate.py`
- `compute/scoring/osap_blend.py` (Path b, outside `compute_composite`)
- `compute/validation/osap_validation.py` (PBO/DSR gate wrapper)
- `compute/main.py` wiring + `@network` integration test

Plan ref: `/root/.claude/plans/resume-quantrank-swift-barto.md` (audited 2026-05-18; all 5 citation/blend issues fixed; line numbers re-verified with grep).

Commit 1 changes

- `compute/ingest/osap.py`: extend `fetch_osap_returns()` with keyword-only `signals: list[str] | None` + `as_of: date | None` filters. Non-breaking (defaults `None`); cache always stores the full bulk parquet, and the filter happens post-load so multiple callsites with different subsets don't fight over the cache.
- `compute/output/schemas.py` (Pydantic):
  - `StockDetail`: +`osap_signals: dict[str, float] | None`, `osap_blended_score: float | None`.
  - `Metadata`: +`osap_signals_used: list[str] | None`, `osap_excluded_signals: list[str] | None`, `osap_signals_ic_12m: dict[str, float] | None`, `osap_signals_coverage_pct: dict[str, float] | None`.
- `compute/config.py`: `SCHEMA_VERSION` `"0.8.0-phase4.5f"` → `"0.9.0-phase4h"` (MINOR bump = new phase per SKILL.md convention).
- `frontend/lib/types.ts` + `frontend/lib/schema-snapshot.json`: mirror Pydantic; snapshot auto-regenerated.
- `.github/workflows/compute-rankings.yml`: install line `pip install -e .` → `pip install -e ".[factors]"` so weekly cron has `openassetpricing` in its env once commit 5 wires it into `main.py`.
- `tests/test_config.py` + `tests/test_smoke.py`: SCHEMA_VERSION pin assertions → `0.9.0-phase4h`.

`AGENTS.md`)

All three flagged in the plan audit before this commit; user authorized 2026-05-18:

- `.github/workflows/compute-rankings.yml` (`AGENTS.md:105` + line 233 wildcard). Single-line install bump; no timeout / ordering / cache changes.
- `AGENTS.md:229-231`): `schemas.py` + `types.ts` + `schema-snapshot.json` moved in lockstep in this one commit. `schema_check` (no `--update-snapshot`) confirms in-sync post-edit.
- `SCHEMA_VERSION` (`compute/config.py:30`): bumped MINOR. Legacy 0.8.x JSONs deserialize cleanly because the new fields are all `| None = None`.

Blend approach lock (audit feedback)

`apply_osap_blend()` (commit 3) will compute outside `compute_composite()`:

The `PHASE3_WEIGHTS` sum-to-1.0 invariant at `compute/scoring/composite.py:43-45` stays intact; no 9th slot added. 50/50 default locked per `osap-integration/PLAN.md:168-170`.

Verification (this commit only, local)

- `ruff check .` → clean
- `python -m pytest tests/ -m "not network"` → 861 passed in 25s (no regressions)
- `python -m compute.output.schema_check` → ✓ in sync
- `cd frontend && npx tsc --noEmit` → clean
- `cd frontend && npx next build` → 506 static pages generated, identical shape to post-PR-110 production build

Test plan (PR-level, accumulates across commits)

- `06bdac76` (`ci.yml` + `Frontend (build)`)
- `@network` integration test + per-signal PBO/DSR scorecard posted as PR comment

Branch note

SDK preamble locks branch to `claude/resume-quantrank-phase-4.5-Zh0pO` (same harness allocation that hosted PR #110). The branch was deleted on remote post-PR-110-merge; this push creates it fresh on top of current `main`. Title/scope reflect Phase 4h, not Phase 4.5.

🤖 Drafted with Claude Code via the Anthropic SDK.
Generated by Claude Code
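As a rough illustration of the "Blend approach lock" section, the sketch below keeps an eight-slot weight table summing to 1.0 and applies the OSAP blend as a separate step afterwards. The original formula is not reproduced in this page, so the function bodies, the slot names, the equal weights, and the choice to return None when no OSAP score exists are all assumptions rather than the project's code.

```python
import math
from typing import Optional

# Hypothetical slot names and equal weights; only the "sums to 1.0" property
# and the absence of a 9th OSAP slot are taken from the PR description.
PHASE3_WEIGHTS = {f"factor_{i}": 0.125 for i in range(1, 9)}


def compute_composite(scores: dict) -> float:
    """Weighted sum over the existing slots; the invariant is checked here."""
    assert math.isclose(sum(PHASE3_WEIGHTS.values()), 1.0)
    return sum(w * scores[name] for name, w in PHASE3_WEIGHTS.items())


def apply_osap_blend(composite: float, osap_score: Optional[float],
                     osap_weight: float = 0.5) -> Optional[float]:
    """Blend outside the composite; 0.5 mirrors the stated 50/50 default."""
    if osap_score is None:
        return None  # assumed: no OSAP data means no blended score
    return (1.0 - osap_weight) * composite + osap_weight * osap_score


scores = {name: 0.6 for name in PHASE3_WEIGHTS}
composite = compute_composite(scores)
blended = apply_osap_blend(composite, 0.8)
```

Blending after the composite means the weight table never has to be renormalised when OSAP data is missing, which is one plausible reason for keeping it out of `compute_composite()`.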