feat(skills): scaffold 43 project-specific skills across all phases #13

Merged
dackclup merged 1 commit into main from claude/skills-scaffolding on May 10, 2026

Conversation

@dackclup
Owner

feat(skills): scaffold 43 project-specific skills across all phases

QuantRank-specific Claude Code skills covering the compute → output → verify lifecycle. Skills here encode the conventions, file paths, schema versions, and verification patterns that emerged from PR-3a through PR-3d so each phase doesn't re-invent them.

Independent of PR #12 (Phase 3d Tier-2 work). Pure docs — no Python / TypeScript / schema changes.

Why

Throughout PR-3a → PR-3d we developed repeatable patterns for:

Each pattern was rediscovered or re-explained per phase. Skills codify them once so future phases (3e, 4, 5, 6, 7, 8) can invoke the patterns directly without rebuilding from scratch.

Scope

Cross-phase skills (7 — full SKILL.md):

| Skill | What it does |
|---|---|
| verify-production-output | Section A-H scan of frontend/public/data/ |
| schema-check | Pydantic ↔ TypeScript snapshot guard wrapper |
| defense-scorecard | Vetoes / guards / annotate flag tally vs baseline |
| top5-rotation-audit | entered_top5 / exited_top5 invariant verification |
| network-test-runner | pytest --run-network with EDGAR_USER_AGENT |
| phase-status-bump | PHASE_STATUS.md + SKILL.md + WORKFLOW.md sync flow |
| pr-iteration-flow | Draft ↔ Ready cycle (codifies PR-3d 5-iteration pattern) |

Phase-specific stubs (36 — frontmatter + intent + acceptance criteria):

| Phase | Skills |
|---|---|
| Phase 1 (2) | universe-refresh, yfinance-debug |
| Phase 2 (3) | fundamentals-cache-warm, xbrl-tag-debug, sec-api-health-check |
| Phase 3a (2) | pillar-imputation-check, sector-neutralization-debug |
| Phase 3b (3) | altman-debug, sloan-debug, nsi-debug |
| Phase 3c (3) | ensemble-method-debug, outlier-detection-debug, mos-display-clamp-check |
| Phase 3d (3) | tier2-deferred-mode-check, going-concern-fp-audit, fundamentals-coverage-report |
| Phase 3e (3) | beneish-mscore-debug, dechow-fscore-debug, honest-limitations-section |
| Phase 4 (5) | 8k-events-pre-cache, going-concern-phrase-refine, ipca-factor-fit, alpha158-fit, chronic-slow-ticker-special-case |
| Phase 5 (4) | triple-barrier-label, meta-label, conformal-predict, shap-explain |
| Phase 6 (3) | whisper-transcribe, finbert-score, lazy-prices-detect |
| Phase 7 (3) | student-t-hmm-fit, nco-portfolio-allocate, tda-risk-off |
| Phase 8 (2) | universe-expand-sp1500, microcap-skip |

Each stub captures intent, acceptance criteria, and references to related modules / docs / issues — so when each phase begins, the stub is the starting spec, not a blank page.

Layout

.claude/skills/
├── README.md                                  (index + authoring conventions)
├── verify-production-output/SKILL.md          ┐
├── schema-check/SKILL.md                      │
├── defense-scorecard/SKILL.md                 │ 7 cross-phase
├── top5-rotation-audit/SKILL.md               │ (full content)
├── network-test-runner/SKILL.md               │
├── phase-status-bump/SKILL.md                 │
├── pr-iteration-flow/SKILL.md                 ┘
├── phase-1/                                   ┐
│   ├── universe-refresh/SKILL.md              │
│   └── yfinance-debug/SKILL.md                │
├── phase-2/                                   │ 36 phase-specific
├── phase-3a/                                  │ stubs
├── phase-3b/                                  │
├── phase-3c/                                  │
├── phase-3d/                                  │
├── phase-3e/                                  │
├── phase-4/                                   │
├── phase-5/                                   │
├── phase-6/                                   │
├── phase-7/                                   │
└── phase-8/                                   ┘

Stats

  • 44 files (1 README + 7 cross-phase + 36 stubs)
  • +2,593 lines, 0 deletions
  • Pure docs — ruff check . clean (no Python touched), no schema impact, no test impact

Why now

  • Phase 3d's 5-iteration polish cycle revealed the value of pr-iteration-flow as a documented pattern
  • Phase 4 will need 8k-events-pre-cache and going-concern-phrase-refine skills — the stubs are the planning doc for that phase
  • Cross-phase skills (especially verify-production-output + defense-scorecard) accelerate the next workflow_dispatch verification cycle

What's NOT here

  • No helper scripts (each SKILL.md describes the helper but doesn't include implementation). Helpers get authored when the skill is fleshed out.
  • No tests for skills (skills are agent invocation prompts, not code — they're tested by being invoked + producing useful output)
  • No global config changes — .claude/skills/ is local to this repo only. Generic Claude Code skills can still be installed from skills.sh separately

Reviewer checklist

  • Every SKILL.md has YAML frontmatter (name, description)
  • Cross-phase skills explain when to use, what they do, inputs, outputs, anti-patterns
  • Phase-specific stubs include acceptance criteria + references to existing code / docs
  • No duplication: each skill has a single owner phase or "cross-phase" label
  • README.md table is up to date with the file structure
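
The first checklist item (every SKILL.md carries `name` and `description` frontmatter) lends itself to a mechanical check. The following is a minimal sketch, not a helper shipped in this PR: the function names are hypothetical, and the parser only understands flat `key: value` lines between `---` delimiters rather than full YAML.

```python
from pathlib import Path

REQUIRED_KEYS = ("name", "description")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of flat 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the frontmatter as missing


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    fields = parse_frontmatter(text)
    return [key for key in REQUIRED_KEYS if not fields.get(key)]


def audit(root: Path) -> dict[str, list[str]]:
    """Map each SKILL.md under root to its missing keys (clean files omitted)."""
    report = {}
    for path in sorted(root.rglob("SKILL.md")):
        missing = missing_keys(path.read_text(encoding="utf-8"))
        if missing:
            report[str(path)] = missing
    return report
```

Calling `audit(Path(".claude/skills"))` from the repo root would return an empty dict when every skill passes.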

Post-merge

  • No deletions
  • No tag (skills are infra, not user-visible features)
  • Future phases author helpers + flesh out stubs in their own PRs

Generated with Claude Code · Tested with Anthropic API



QuantRank-specific Claude Code skills covering compute → output →
verify lifecycle. Skills encode the conventions, file paths, schema
versions, and verification patterns that emerged from PR-3a through
PR-3d so each phase doesn't re-invent them.

Layout (.claude/skills/):
- README.md (index + authoring conventions)
- 7 cross-phase skills (full SKILL.md each, ~120-200 LOC):
  - verify-production-output (Section A-H scan template)
  - schema-check (Pydantic ↔ TypeScript snapshot guard wrapper)
  - defense-scorecard (vetoes / guards / annotate flag tally)
  - top5-rotation-audit (entered_top5 / exited_top5 invariant)
  - network-test-runner (pytest --run-network with EDGAR_USER_AGENT)
  - phase-status-bump (PHASE_STATUS.md + SKILL.md + WORKFLOW.md sync)
  - pr-iteration-flow (Draft↔Ready cycle codifying PR-3d pattern)
- 36 phase-specific stubs (frontmatter + intent + acceptance
  criteria, ~30-50 LOC each):
  - Phase 1 (2): universe-refresh, yfinance-debug
  - Phase 2 (3): fundamentals-cache-warm, xbrl-tag-debug,
    sec-api-health-check
  - Phase 3a (2): pillar-imputation-check,
    sector-neutralization-debug
  - Phase 3b (3): altman-debug, sloan-debug, nsi-debug
  - Phase 3c (3): ensemble-method-debug, outlier-detection-debug,
    mos-display-clamp-check
  - Phase 3d (3): tier2-deferred-mode-check, going-concern-fp-audit,
    fundamentals-coverage-report
  - Phase 3e (3): beneish-mscore-debug, dechow-fscore-debug,
    honest-limitations-section
  - Phase 4 (5): 8k-events-pre-cache, going-concern-phrase-refine,
    ipca-factor-fit, alpha158-fit, chronic-slow-ticker-special-case
  - Phase 5 (4): triple-barrier-label, meta-label, conformal-predict,
    shap-explain
  - Phase 6 (3): whisper-transcribe, finbert-score, lazy-prices-detect
  - Phase 7 (3): student-t-hmm-fit, nco-portfolio-allocate, tda-risk-off
  - Phase 8 (2): universe-expand-sp1500, microcap-skip

Each stub captures the intent, acceptance criteria, and references
to related modules / docs / issues so the implementation has a
clear target when that phase begins. Phase-specific stubs get
fleshed out as each phase starts work — they're not all needed today.

44 files, +2593 LOC. Pure docs — no Python or TypeScript changes;
no compute or schema impact. Independent of PR 3d (Phase 3d Tier-2
work).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
@vercel

vercel Bot commented May 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| quantrank | Ready | Preview, Comment | May 10, 2026 5:10pm |

@dackclup dackclup marked this pull request as ready for review May 10, 2026 17:14
@dackclup dackclup merged commit ecdac87 into main May 10, 2026
4 checks passed
@dackclup dackclup deleted the claude/skills-scaffolding branch May 10, 2026 17:14
dackclup added a commit that referenced this pull request May 13, 2026
Pre-registers the official anthropics/skills marketplace
(https://github.com/anthropics/skills) in repo-scoped
.claude/settings.json so contributors automatically get the catalog
available on clone. Lists all 17 plugins in enabledPlugins to
auto-install on next Claude Code session start.

Marketplace skills (general-purpose) complement, not replace, the
43 QuantRank-specific skills already at .claude/skills/ (committed
in PR #13 — verify-production-output, schema-check, defense-scorecard,
per-phase debuggers).

Settings.json shape per Claude Code docs
(https://code.claude.com/docs/en/discover-plugins.md):
- extraKnownMarketplaces["anthropics-skills"] → registers the
  source repo
- enabledPlugins → 17 entries '<name>@anthropics-skills' for
  auto-install

No skill-name collisions: the 7 cross-phase QuantRank skills
(verify-production-output, schema-check, defense-scorecard,
top5-rotation-audit, network-test-runner, phase-status-bump,
pr-iteration-flow) plus the 36 phase-specific skills don't overlap
with any of the 17 marketplace names.
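
The collision claim can be re-checked whenever either catalog changes. A minimal sketch, assuming names are compared case-insensitively (an assumption on my part); `find_collisions` is a hypothetical helper and the marketplace list must be supplied by the caller:

```python
CROSS_PHASE_SKILLS = [
    "verify-production-output", "schema-check", "defense-scorecard",
    "top5-rotation-audit", "network-test-runner", "phase-status-bump",
    "pr-iteration-flow",
]


def find_collisions(local_skills: list[str], marketplace_plugins: list[str]) -> list[str]:
    """Return names present in both catalogs, compared case-insensitively."""
    local = {name.lower() for name in local_skills}
    remote = {name.lower() for name in marketplace_plugins}
    return sorted(local & remote)
```

An empty result for the full 43-name local list against the 17 marketplace names would confirm the statement above.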

README updated with manual incantation for contributors whose
Claude Code doesn't auto-install on clone:
    /plugin marketplace add anthropics/skills
    /plugin install <name>@anthropics-skills

No production code touched. ruff clean; pytest 526 passed.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 15, 2026
….6.0 → 0.7.0 (#79)

Closes #14. Closes #17 (the "10-K + 8-K" log line is now accurate again).

## What this PR does

Flips `compute.scoring.tier2._EIGHT_K_DEFENSES_ENABLED` False → True.
That single boolean was the only thing standing between the PR 3d
wiring and an active 5th veto. The active-veto count goes 4 → 5.

| Defense | ItemFlag.fired effect | Active in production |
|---|---|---|
| `non_reliance_filing` (8-K Item 4.02, Schroeder 2024 SSRN) | hard VETO — suppresses `entered_top5` | **YES, after this PR** |
| `auditor_change` (8-K Item 4.01, Reg S-K Item 304) | ANNOTATE — emits `tier2_events.auditor_change.fired` | **YES, after this PR** |

## Why this is safe to flip now

The PR 3d deferral cited 3 workflow-timeout incidents (runs #12 / #13
/ #14). Every root cause has since been mitigated:

| PR 3d failure | Fix |
|---|---|
| Run #12 — `filing.text()` routed through `hybrid_section_detector` (~5-10s × 502 stocks) | PR 3d hotfix 226840d — `filing.html()` shortcut |
| Run #13 — `EightK.items` access also triggered the detector | PR 3d hotfix 12ad7ff — regex-on-raw-HTML extraction (mirrors edgartools Strategy 3 fallback) |
| Run #14 — SEC EDGAR throttling × overly-aggressive tenacity retry (60-90s/stuck-stock × ~40% failure rate) | PR 3d Part 1 — `stop_after_delay(30) \| stop_after_attempt(2)`, 45s per-stock orchestrator timeout, edgartools warning suppression |
| Cache state lost across CI runs (only `fundamentals` was being preserved) | PR 4a — workflow cache restore step expanded to all 6 paths, including `edgar_8k` |
| Weekly compute = 7-day recovery window on failure | PR 4f — daily Mon-Fri cron = 24h recovery window |
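
The run #14 fix composes two stop conditions so a stuck stock gives up after 30 seconds or 2 attempts, whichever comes first. The sketch below is a standard-library stand-in for that behavior, not the tenacity configuration used in the repo: `call_with_bounded_retry` is a hypothetical name, and waits between attempts are left out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_bounded_retry(
    fn: Callable[[], T],
    max_attempts: int = 2,
    max_delay_s: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Retry fn until it succeeds, the attempt cap is hit, or the time budget is spent."""
    start = clock()
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except Exception:
            out_of_attempts = attempt >= max_attempts
            out_of_time = clock() - start >= max_delay_s
            if out_of_attempts or out_of_time:
                raise  # either stop condition ends the retry loop
```

Capping both dimensions bounds the worst case per stock, which is what keeps a throttled EDGAR endpoint from consuming the whole workflow budget.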

Latency p95 has dropped from the 30+s regime that bit run #14 to
14.41s on the latest production run. Kill-switch capability
(`QR_SKIP_TIER2` env var + the feature flag itself) is preserved.

## Files changed

- `compute/scoring/tier2.py` — `_EIGHT_K_DEFENSES_ENABLED = True`;
  comment block updated with the post-flip rationale + kill-switch
  pointer
- `compute/config.py` — `SCHEMA_VERSION` `0.6.0-phase3d` →
  `0.7.0-phase4g`. Promoting a defense flag from deferred to active
  veto is a **minor** semver bump per
  `.claude/skills/phase-4/schema-versioning/PLAN.md`.
- `tests/test_config.py` — locked-constant test renamed
  `test_schema_version_is_phase3d` → `test_schema_version_is_phase4g`;
  asserts new version
- `tests/test_smoke.py` — `SCHEMA_VERSION.startswith("0.6.0")` →
  `startswith("0.7.0")`
- `tests/test_scoring/test_tier2.py` — `eight_k_disabled` fixture
  added; E1-E5 + F2 updated to flip the flag explicitly when
  exercising kill-switch behavior (was relying on the default).
  Same threshold-symbolic-test pattern as SKILL Rule 17 — tests stay
  green if the constant moves.
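
To make the "minor bump" claim checkable, the version strings can be compared component by component. This is a sketch under the assumption that `SCHEMA_VERSION` always has the form `MAJOR.MINOR.PATCH-label`; `classify_bump` is a hypothetical helper, not code from the repo, and it does not verify that the new version is greater than the old one.

```python
import re

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_version(version: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch), ignoring any '-phaseXX' label."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized schema version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups()[:3])
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Name the most significant component that differs between two versions."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    for label, before, after in zip(("major", "minor", "patch"), old_parts, new_parts):
        if before != after:
            return label
    return "none"
```

For this PR, `classify_bump("0.6.0-phase3d", "0.7.0-phase4g")` yields `"minor"`.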

## Scope NOT in this PR

- Pre-cache off-cycle workflow (issue #14 §1) — kept as an option
  for further perf headroom but not needed for correctness. If the
  first daily run with 8-K enabled comes in under the 90-min budget,
  the pre-cache layer becomes an optimization, not a requirement.
  File as follow-up if needed after monitoring the first 1-2 runs.
- Frontend updates — `tier2_events.non_reliance_filing` /
  `auditor_change` were already wired through to `Tier2EventCard`
  in PR 3d; values just flip from "always false" to "computed from
  EDGAR data" after this PR. No frontend code change required.

## Verification ladder
- ✅ ruff check — clean
- ✅ pytest -m "not network" — 772 passed (5 retitled / threshold-
  symbolic tests in tier2 + test_config + test_smoke)
- ✅ schema_check — Pydantic ↔ TypeScript snapshot still in sync
  (no field shape change; only the version string moved)
- ✅ tsc --noEmit — clean
- ✅ next build — 506 static pages

## Post-merge monitoring

After the first daily compute run with 8-K active:
- Section A schema field reads `0.7.0-phase4g`
- Section B `non_reliance_filing` count — expect 0-5 tickers in the
  S&P 500 universe per Schroeder 2024 base rate; > 10 deserves
  investigation
- Section B `auditor_change` count — expect 5-20 tickers per Reg S-K
  base rate over a 730-day window
- `tier2_coverage_pct` should remain ≥ 95% (was ~100% with 10-K only;
  partial 8-K fetch failures will drag it down some)
- Workflow runtime — expect +5-10 min on a cold cache, +1-2 min on a warm one
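
The thresholds above can be turned into a quick triage function. A minimal sketch: `review_first_run` and its argument names are hypothetical, and treating an `auditor_change` count outside 5-20 as a finding is my assumption, since the text only gives the expected range.

```python
def review_first_run(
    schema_version: str,
    non_reliance_count: int,
    auditor_change_count: int,
    tier2_coverage_pct: float,
) -> list[str]:
    """Return human-readable findings; an empty list means nothing to investigate."""
    findings = []
    if schema_version != "0.7.0-phase4g":
        findings.append(f"schema version is {schema_version}, expected 0.7.0-phase4g")
    if non_reliance_count > 10:
        findings.append("non_reliance_filing count above 10: investigate")
    if not 5 <= auditor_change_count <= 20:
        findings.append("auditor_change count outside the expected 5-20 band")
    if tier2_coverage_pct < 95.0:
        findings.append("tier2_coverage_pct below 95%")
    return findings
```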

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
dackclup pushed a commit that referenced this pull request May 18, 2026
Phase 4h commit 4 of N. Ships the cohort-aware wrapper that decides
which of the 100 candidate OSAP signals are accepted into the
``composite_score_osap_adjusted`` blend (commit 3). Wraps PR #60's
``compute/validation/pbo_dsr.py::factor_passes_gates`` — does NOT
reimplement PBO or DSR math.

Module layer (``compute/validation/osap_validation.py``, 220 LOC):

- ``GateResult(frozen dataclass)`` — per-signal verdict with
  ``accepted: bool``, ``pbo / dsr / sharpe: float | None``,
  ``n_observations: int``, ``rejection_reason: str | None`` in
  ``{None, 'high_pbo', 'low_dsr', 'gate_failed', 'insufficient_data'}``.
  Distinct ``'gate_failed'`` category for diagnostic clarity when both
  PBO AND DSR fail simultaneously (Bailey 2014 pure-noise cohorts fail
  this way — verified by tests #1, #4, #8).

- ``gate_osap_signals(long_short_returns, requested_signals=None,
  pbo_threshold=PBO_VETO_THRESHOLD, dsr_threshold=DSR_VETO_THRESHOLD,
  n_partitions=DEFAULT_N_PARTITIONS) -> dict[str, GateResult]`` —
  pivots commit-2's long-format DF to wide (date × signal), runs
  Bailey 2014 cohort framing (``n_trials = wide.shape[1]``), per-signal
  loop calling ``factor_passes_gates`` with the established defaults.
  Module constants imported from ``pbo_dsr`` — NOT redefined.

- ``compute_rolling_ic_12m(long_short_returns, signalname) -> float
  | None`` — observability-only Spearman lag-1 IC over the most
  recent 12 monthly observations. Pure pandas (no scipy — matches
  ``pbo_dsr.py``'s hand-rolled Beasley-Springer-Moro precedent for
  the inverse normal CDF). Never gates acceptance.

- ``filter_accepted_signals(gate_results) -> (accepted, excluded)`` —
  sorted-alphabetical split, feeds commit-5's metadata writer.
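
A sketch of how the per-signal verdict could be shaped, using the field list above. The thresholds are local stand-ins for the constants the real module imports from ``pbo_dsr`` (PBO ≤ 0.5 and DSR > 0 per the PR summary), and ``classify`` is a hypothetical helper: the actual wrapper delegates to ``factor_passes_gates`` rather than comparing values itself.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-ins for the constants imported from pbo_dsr in the real module.
PBO_VETO_THRESHOLD = 0.5
DSR_VETO_THRESHOLD = 0.0
MIN_OBS_PER_SIGNAL = 16


@dataclass(frozen=True)
class GateResult:
    accepted: bool
    pbo: Optional[float]
    dsr: Optional[float]
    sharpe: Optional[float]
    n_observations: int
    rejection_reason: Optional[str]


def classify(pbo: Optional[float], dsr: Optional[float],
             sharpe: Optional[float], n_observations: int) -> GateResult:
    """Map raw gate statistics to a verdict with a single rejection reason."""
    if n_observations < MIN_OBS_PER_SIGNAL or pbo is None or dsr is None:
        return GateResult(False, pbo, dsr, sharpe, n_observations, "insufficient_data")
    pbo_ok = pbo <= PBO_VETO_THRESHOLD
    dsr_ok = dsr > DSR_VETO_THRESHOLD
    if pbo_ok and dsr_ok:
        reason = None
    elif not pbo_ok and not dsr_ok:
        reason = "gate_failed"  # both gates failed at once
    elif not pbo_ok:
        reason = "high_pbo"
    else:
        reason = "low_dsr"
    return GateResult(reason is None, pbo, dsr, sharpe, n_observations, reason)
```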

**NaN policy — LOCKED, documented in module docstring**:

Source-verified asymmetry in ``compute/validation/pbo_dsr.py``:

- ``compute_pbo`` (L187-284) is **NaN-UNSAFE** — L234
  ``to_numpy(dtype=float)`` then L256-257 ``.mean(axis=0)`` /
  ``.std(axis=0)`` then L261 ``np.argmax`` silently corrupts on any
  NaN cell.
- ``compute_deflated_sharpe`` (L287-385) is **NaN-SAFE** — L323 strips
  internally via ``arr = arr[~np.isnan(arr)]``.

Because ``factor_passes_gates`` accepts ``factor_returns`` and
``returns_matrix`` independently, this wrapper feeds different NaN
treatments to each side:

1. ``factor_returns = wide[sig].dropna()`` — DSR's internal strip
   handles it. No information lost.
2. ``returns_matrix = wide.fillna(0.0)`` — zero-fill, NOT mean-fill,
   NOT ``dropna(how='any')``.

Zero-fill chosen over the two alternatives:

- ``dropna(how='any')`` would decimate the 100-signal × monthly
  matrix below ``n_partitions=16`` rows once any earnings-event-only
  signal is included, collapsing the Bailey 2014 multiple-testing
  ``n_trials = cohort_size`` correction. Test #13
  ``test_gate_osap_signals_sparse_cohort_zero_filled_not_decimated``
  is the regression guard against accidental revert.
- ``fillna(column_mean)`` would deflate per-signal variance, inflate
  Sharpe, bias PBO toward false acceptance — silently rewarding
  sparse signals for low coverage.
- ``fillna(0.0)`` is the honest OSAP-semantic: absence-of-coverage
  for ``(signal, month)`` means "no portfolio formed / no
  information generated" → zero return is the right proxy. Bailey
  2014 PBO is rank-based within each period; zero-imputation
  symmetrically pushes coverage-gap rows toward indeterminate rank.
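
The effect of each NaN treatment on row count can be seen on a toy cohort. This sketch uses plain lists as a stand-in for the pandas calls named above (``dropna(how='any')``, ``fillna(0.0)``, per-column ``dropna()``); the data is made up for illustration: 24 months, two dense signals and one event-only signal that reports every fourth month.

```python
import math

NAN = math.nan


def drop_rows_with_any_nan(matrix: list[list[float]]) -> list[list[float]]:
    """List-based analogue of DataFrame.dropna(how='any')."""
    return [row for row in matrix if not any(math.isnan(x) for x in row)]


def zero_fill(matrix: list[list[float]]) -> list[list[float]]:
    """List-based analogue of DataFrame.fillna(0.0)."""
    return [[0.0 if math.isnan(x) else x for x in row] for row in matrix]


def strip_nan(series: list[float]) -> list[float]:
    """List-based analogue of Series.dropna(), used on the DSR side."""
    return [x for x in series if not math.isnan(x)]


# Toy cohort: rows are months, columns are signals.
cohort = [[0.01, -0.02, 0.03 if month % 4 == 0 else NAN] for month in range(24)]

rows_after_drop = len(drop_rows_with_any_nan(cohort))   # only event months survive
rows_after_fill = len(zero_fill(cohort))                # every month is kept
event_only_obs = len(strip_nan([row[2] for row in cohort]))
```

Here the any-NaN drop leaves 6 rows, below the 16 needed for partitioning, while zero-fill keeps all 24; the event-only signal still contributes its 6 real observations to the DSR side.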

Acknowledged trade-off: sparse-coverage signals see their Sharpe
shrunk toward zero by the zero-fill, raising DSR rejection
probability. Cohort-fair but penalizes legitimate event-only
signals. Phase 4h scope accepts this — the Phase 5 backtest harness
(``defense-infrastructure/PLAN.md:270``) runs full walk-forward CV
per signal and supersedes this gate when it ships.

Standalone module discipline: zero imports from
``compute.features.osap_replicate`` (commit 2),
``compute.scoring.osap_blend`` (commit 3), or ``compute.main``.
Only ``compute.validation.pbo_dsr`` for primitives + constants.
Validation runs on the long-short returns DataFrame contract only.

Tests (``tests/test_validation/test_osap_validation.py``, 14
offline, exceeded plan's 13-test target):

1. ``random_noise_yields_high_pbo`` — Bailey 2014 invariant: pure-
   noise cohort → zero acceptances, all reasons in
   {'high_pbo', 'low_dsr', 'gate_failed'}, no 'insufficient_data'
2. ``low_sharpe_signal_rejected_for_dsr`` — near-zero σ signal →
   'low_dsr' or 'gate_failed' with DSR ≤ 0
3. ``strong_signal_accepted`` — monotone-drift signal beats noisy
   cohort → accepted=True, populated pbo/dsr/sharpe floats
4. ``insufficient_data`` — < ``MIN_OBS_PER_SIGNAL`` rows in cohort
   → all signals rejected with 'insufficient_data'
5. ``requested_signals_filter`` — subset filter applied pre-pivot
6. ``requested_none_uses_all_signals_in_df`` — default covers all
7. ``empty_input_returns_empty_dict`` — empty DF → {}, no crash
8. ``single_signal_cohort_rejects_with_insufficient_data`` — cohort
   size < 2 short-circuit
9. ``compute_rolling_ic_12m_known_signal`` — monotone series →
   Spearman = 1.0 ± 1e-9
10. ``compute_rolling_ic_12m_insufficient_history`` — < 13 obs →
    None
11. ``compute_rolling_ic_12m_nan_safe_with_gaps`` — NaN outside
    tail(13) window pruned cleanly; tail-13 strictly monotone
    Spearman = 1.0
12. ``filter_accepted_signals_splits_into_sorted_lists`` — sorted
    union round-trip
13. ``sparse_cohort_zero_filled_not_decimated`` — REGRESSION GUARD:
    3 of 10 signals with 25% NaN coverage at staggered offsets; all
    10 still get real PBO/DSR runs (none short-circuit) — fails
    immediately if cohort policy reverts to dropna(how='any')
14. ``module_load_constants_sourced_from_pbo_dsr`` —
    MIN_OBS_PER_SIGNAL == DEFAULT_N_PARTITIONS == 16; canonical Phase 4 gate

Verification:

- ``ruff check compute/validation/osap_validation.py
  tests/test_validation/test_osap_validation.py`` → clean
- ``pytest tests/ -m "not network"`` → 906 passed (892 prior + 14
  new)
- Import sanity: ``from compute.validation.osap_validation import
  gate_osap_signals, compute_rolling_ic_12m, filter_accepted_signals,
  GateResult, MIN_OBS_PER_SIGNAL, ROLLING_IC_WINDOW_MONTHS`` → OK
- No schema change (still 0.9.0-phase4h from commit 1)
- No ``PHASE3_WEIGHTS`` touched
- No imports from osap_replicate / osap_blend / main — standalone
  verified

Next: commit 5 — ``compute/main.py`` wiring +
``compute/ingest/osap.py`` kwargs (~70 LOC). Wires fetch → replicate
→ gate → blend end-to-end; integration ``@network`` test against real
OSAP fetch with 20-ticker compute slice + sanity-IC on Mom1m signal.

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU
dackclup added a commit that referenced this pull request May 18, 2026
…d + PBO/DSR gate (#112)

Phase 4h cluster — OSAP signal replication + PBO/DSR hard gate + Path-b composite × OSAP blend. Final shape: 5 commits, +2251 LOC / -16 across 19 files, 906 offline tests + 19 @network. Observability-only this phase: Top-5 still ranks raw composite_score per SKILL.md Rule 16; composite_score_osap_adjusted is informational on StockDetail.osap_blended_score.

5-commit cluster summary:

- 06bdac7: schema foundation — schema triple bump 0.8.0-phase4.5f → 0.9.0-phase4h + scout kwargs (signals/as_of) on fetch_osap_returns + workflow [factors] install
- b79983f: osap_replicate proxy — 100-signal manifest + compute_long_short_returns / compute_osap_signals / coverage_by_signal (factor-exposure proxy mode; full per-stock replication deferred to Phase 4h.1)
- a6760d9: osap_blend Path-b — aggregate_osap_signals + apply_osap_blend (50/50 default). STAYS OUTSIDE compute_composite() so PHASE3_WEIGHTS sum-to-1.0 invariant at composite.py:43-45 stays intact
- df4d9bd: osap_validation gate — gate_osap_signals + compute_rolling_ic_12m + filter_accepted_signals. Asymmetric NaN policy (zero-fill cohort + DSR-side strip) — locked by source-audit of pbo_dsr.py L234/L256-257 (NaN-UNSAFE) vs L323 (NaN-SAFE) + regression-guarded by test #13
- c9abb05: compute/main.py wiring + e2e — fetch → replicate → gate → IC → blend pipeline; try/except graceful degradation (all 6 OSAP fields → None on any failure); StockDetail/Metadata field writes; @network integration test (4-signal × 20-ticker slice with Mom1m IC sanity); CLAUDE.md + PHASE_STATUS.md + SKILL.md docs lockstep

Key architectural locks:
- Path-b blend OUTSIDE compute_composite() — PHASE3_WEIGHTS invariant intact, verified by 906 offline tests passing across all 5 commits
- PBO/DSR hard gate inherits PR #60's factor_passes_gates defaults (PBO ≤ 0.5 AND DSR > 0). Wraps the gate; does NOT reimplement
- Rolling-12m Spearman IC observability only — full walk-forward + purged-embargo CV is Phase 5's job per defense-infrastructure/PLAN.md:270
- Top-5 still ranks raw composite_score per Rule 16. Cutover to composite_score_osap_adjusted deferred until Phase 5 ML meta-learner provides IC evidence
- Universe-gap policy: NaN OSAP aggregate → composite passes through unchanged. NO impute (distinct from neutralize_missing=True on pillar side)
- Graceful degradation: OSAP fetch / library / network failure → all 6 fields None; weekly production continues unimpeded
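
The universe-gap and graceful-degradation locks can be sketched together. This is an illustration under stated assumptions, not the repo's `apply_osap_blend`: the function names are hypothetical, both scores are assumed to be on a comparable scale, and a single return value stands in for the six OSAP fields the commit mentions.

```python
import math
from typing import Callable, Optional


def blend_score(composite: float, osap_aggregate: Optional[float],
                osap_weight: float = 0.5) -> float:
    """50/50 blend by default; a missing OSAP aggregate leaves composite unchanged."""
    if osap_aggregate is None or math.isnan(osap_aggregate):
        return composite  # universe gap: pass through, no imputation
    return (1.0 - osap_weight) * composite + osap_weight * osap_aggregate


def blend_or_degrade(composite: float,
                     fetch_osap_aggregate: Callable[[], Optional[float]]) -> Optional[float]:
    """Return the blended score, or None if the OSAP side fails for any reason."""
    try:
        return blend_score(composite, fetch_osap_aggregate())
    except Exception:
        # Observability field degrades to None; ranking still uses raw composite.
        return None
```

Because the blend lives outside the composite calculation, a failure here cannot change the Top-5 ranking input.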

Defense layer unchanged at 17 (annotate-only blend, no new veto). Schema bump MINOR (0.8 → 0.9) per SKILL.md schema-versions convention.

Audit history:
- Commit 1 audit: clean
- Commit 2 audit: clean (proxy scope decision flagged + transparent docs; Phase 4h.1 follow-up issue queued for post-merge)
- Commit 3 audit: clean (Path-b implementation precise, NaN-safe via .where() idiom)
- Commit 4 audit: clean (NaN strategy source-verified at pbo_dsr.py L234/L323; 14 tests pass locally on auditor sandbox)
- Commit 5 audit: clean (wiring exemplary, graceful-degradation envelope verified, docs lockstep)

Post-merge plan: Section I delegation (Vercel MCP 4-call via sibling session + Playwright 4-ticker matrix with one zero-OSAP-coverage ticker as new failure mode); Phase 4h.1 follow-up issue auto-files on merge webhook event.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2