test(features): Add Hypothesis property-based tests for data-shape invariants (#126) by dackclup · Pull Request #127 · dackclup/quantrank

dackclup · 2026-05-20T01:25:17Z

Closes #126.

Summary

Process Hygiene Item #1 (parent epic #125). Adds Hypothesis property-based tests as the new defense line for "untested data-shape assumption" bugs — the class that hid the OSAP quintile/tercile silent-drop in PR #112's CI until production cron diagnostics caught it (subsequently fixed in PR #124 / Phase 4h.2 Part 2).

If a @given property over port_count ∈ {2, 3, 5, 10} had existed in Phase 4h, the hardcoded port=10 filter would have been falsified the first time CI ran.

Test-addition only. No scoring / feature behavior touched. No schema delta. No CI workflow changes.

Test count before / after

	Offline tests
Before (main @ `80c6641e`, post Phase 4h.2 Part 2)	945
After this PR	959 (+14 property tests)

Property tests landed (14)

Sub-task 2 — `osap_replicate.py` (7 tests, 394 LOC)

#	Property test	Catches
1	`test_compute_long_short_returns_handles_any_port_cardinality`	The headline. For port_count ∈ [2, 10], adapter produces LS rows. Would have caught PR #112's bug.
2	`test_signals_dropped_no_long_short_returns_sorted_unique`	Metadata field contract drift
3	`test_normalize_port_label_int_input_yields_2char_zfill`	int port idempotence + zfill
4	`test_normalize_port_label_str_input_yields_2char_zfill`	mixed '1' / '01' / '10' → uniform width
5	`test_part2_accounting_invariant_under_random_partition`	accounting equation `manifest = missing + dropped + gated + used` holds for any partition
6	`test_coverage_by_signal_returns_pct_in_0_to_100`	0..100 percent (NOT 0..1 fraction) confusion
7	`test_rank_signals_cross_sectional_returns_unit_interval`	ranks ∈ (0, 1]

Sub-task 3 — scoring transforms (7 tests, 340 LOC)

#	Property test	Module	Catches
A	`test_compute_composite_output_bounded_0_to_100`	composite	writer + Pydantic contract
B	`test_compute_composite_all_50_inputs_yield_composite_50`	composite	accidental weight-vector drift
C	`test_compute_composite_neutralize_missing_imputes_nan_to_50`	composite	NaN imputation regression
D	`test_compute_composite_constant_input_equals_input`	composite	`PHASE3_WEIGHTS` sum-to-1 invariant
E	`test_apply_osap_blend_output_bounded_and_nan_passthrough`	osap_blend	bound + NaN passthrough + interior-point property
F	`test_aggregate_osap_signals_finite_values_in_0_to_100`	osap_blend	rank × 100 multiplication
G	`test_apply_osap_blend_weight_zero_is_identity_on_composite`	osap_blend	Rule 16: weight=0 leaves composite unchanged (Phase 4h observability-only lock)

Sub-task 4 — CI integration + docs

.gitignore — already covers .hypothesis/ at L50 (Python default). No edit needed.
CLAUDE.md ## Gotchas — 1-line note that Hypothesis is the new defense line, with the @settings(deadline=None) anti-pattern flagged.
CI flaky behaviour — default profile makes flaky examples fail-fast (no retry); pytest -m "not network" inherits this. No @settings(deadline=None) used in any of the 14 properties.

Sanity verification (NOT committed)

Temporarily reverted compute/features/osap_replicate.py:143 (agg(["min", "max"]) → agg(["min", "min"])) and confirmed Property 1 fails with:

Falsifying example: test_compute_long_short_returns_handles_any_port_cardinality(
    port_count=2,
    n_dates=1,
)

Reverted the break before commit. Working tree matches main except for the 4 staged files.

No regression discovered

Property tests passed on first execution against current main (Phase 4h.2 Part 2 already merged). The fact that nothing falsified in 14 properties × ~100 examples is itself a quality signal — the multi-port adapter handles the [2, 10] cardinality region cleanly, the composite weight invariant holds, and the OSAP blend domain contract isn't violated under any (composite, osap, weight) triple in the unit interval.

Constraints honored

✅ NO modification to compute_composite / PHASE3_WEIGHTS sum=1.0 invariant — pure test-addition PR
✅ Rule 16: Top-5 still ranks raw composite_score; no scoring touched
✅ No push to main; no force-push; no --no-verify
✅ No workflow_dispatch trigger (compute-rankings.yml untouched)
✅ Schema triple untouched (no schemas.py / types.ts changes)
✅ NO @settings(deadline=None) — default deterministic deadline
✅ NO RuleBasedStateMachine (out of scope per issue Process hygiene #1 — Add Hypothesis property-based tests for data-shape invariants #126)

Files (4 changed, +747 / 0)

File	Change
`pyproject.toml`	`hypothesis>=6.92` in `[dev]` (+6)
`CLAUDE.md`	`## Gotchas` note (+7)
`tests/test_features/test_osap_replicate_properties.py`	NEW — 7 property tests (+394)
`tests/test_scoring/test_transforms_properties.py`	NEW — 7 property tests (+340)

Test plan

ruff check . → All checks passed
python -m pytest tests/ -m "not network" → 959 passed (1m46s)
python -m pytest tests/test_features/test_osap_replicate_properties.py tests/test_scoring/test_transforms_properties.py → 14 passed (5s)
python -m compute.output.schema_check → in sync (no schema delta)
Sanity break-revert confirmed property test catches a regression (Falsifying example: port_count=2, n_dates=1)
CI green on process-hygiene-1-hypothesis-property-tests
User audit + Mark-Ready authorization

https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

Generated by Claude Code

…variants (#126) Closes #126. Process Hygiene Item #1 (parent epic #125). Adds Hypothesis property- based tests as the new defense line for "untested data-shape assumption" bugs — the class that hid the OSAP quintile/tercile silent-drop in PR #112's CI until production cron diagnostics caught it (subsequently fixed in PR #124 / Phase 4h.2 Part 2). If a `@given` property over `port_count ∈ {2,3,5,10}` had existed in Phase 4h, the hardcoded `port=10` filter would have been falsified the first time the CI ran. Test-addition only. No scoring / feature behavior touched. No schema delta. No CI workflow changes. Sub-task 1 — Hypothesis added to [dev] extra (pyproject.toml) -------------------------------------------------------------- `hypothesis>=6.92` joins `pytest` + `ruff` in the `[dev]` optional extra. Pure-Python dep (no C extensions); CI footprint negligible. Sub-task 2 — Property tests for osap_replicate.py (7 tests, 394 LOC) --------------------------------------------------------------------- New file: tests/test_features/test_osap_replicate_properties.py 7 property tests covering data-shape invariants the Phase 4h.2 Part 2 multi-port adapter must satisfy: 1. `test_compute_long_short_returns_handles_any_port_cardinality` — for port_count ∈ [2, 10] and n_dates ∈ [1, 12], the adapter produces exactly n_dates LS rows with ls_return == port_count - 1. THE headline property — would have caught the PR #112 bug. 2. `test_signals_dropped_no_long_short_returns_sorted_unique` — contract for the Metadata.osap_signals_dropped_no_long_short field: sorted, no duplicates, single-port signals appear, two-port signals don't. 3. `test_normalize_port_label_int_input_yields_2char_zfill` — port=int(1..10) → '01'..'10' for any input list. Idempotent. 4. `test_normalize_port_label_str_input_yields_2char_zfill` — mixed '1' / '01' / '10' inputs normalize to a uniform 2-char width. 5. `test_part2_accounting_invariant_under_random_partition` — the Phase 4h.2 Part 2 accounting equation (manifest = missing + dropped + gated + used) holds for any 3-way partition of a synthetic manifest into the bucket set. Uses st.composite to draw disjoint partitions. 6. `test_coverage_by_signal_returns_pct_in_0_to_100` — domain contract for the coverage helper (0..100 percent, NOT 0..1 fraction). 7. `test_rank_signals_cross_sectional_returns_unit_interval` — ranks live in (0, 1] for any non-empty cross-section. Sub-task 3 — Property tests for scoring transforms (7 tests, 340 LOC) --------------------------------------------------------------------- New file: tests/test_scoring/test_transforms_properties.py 7 property tests covering composite (compute/scoring/composite.py) and OSAP blend (compute/scoring/osap_blend.py) — pure-numeric transforms whose output domains are contract-locked by the downstream Pydantic + TypeScript schemas. Composite tests (4): A. `test_compute_composite_output_bounded_0_to_100` — for any pillar input in [0, 100], composite ∈ [0, 100] (the writer + Pydantic contract) B. `test_compute_composite_all_50_inputs_yield_composite_50` — neutral-pillar input collapses to composite == 50 (catches accidental weight-vector drift) C. `test_compute_composite_neutralize_missing_imputes_nan_to_50` — NaN pillar inputs are imputed when neutralize_missing=True; all-NaN → composite == 50.0 D. `test_compute_composite_constant_input_equals_input` — constant-pillar input → composite == that constant (PHASE3 weight-sum-equals-1.0 invariant expressed as a property) OSAP blend tests (3): E. `test_apply_osap_blend_output_bounded_and_nan_passthrough` — blend ∈ [0, 100]; NaN OSAP → composite passthrough; finite OSAP → interior point between composite and osap F. `test_aggregate_osap_signals_finite_values_in_0_to_100` — finite aggregate values live in [0, 100]; NaN allowed for universe gaps G. `test_apply_osap_blend_weight_zero_is_identity_on_composite` — weight=0 leaves composite unchanged (locks the Phase 4h observability-only design property + Rule 16: Top-5 still ranks raw composite) Sub-task 4 — CI integration + .gitignore + docs ------------------------------------------------- - `.gitignore` already covers `.hypothesis/` at line 50 (Python's default boilerplate) — no edit needed. - CLAUDE.md ## Gotchas — 1-line note that Hypothesis is the new defense line for data-shape bugs (paired with example tests), with the `@settings(deadline=None)` anti-pattern flagged. - CI hypothesis.errors.Flaky behaviour: default profile makes flaky examples fail-fast (no retry); the `pytest -m "not network"` CI invocation inherits this. NO `@settings(deadline=None)` used in this PR — slow examples surface as honest failures. Sanity verification (NOT committed) ----------------------------------- As part of pre-push verification I temporarily reverted the multi- port adapter at compute/features/osap_replicate.py:143 (`agg(["min", "max"])` → `agg(["min", "min"])`) and confirmed `test_compute_long_short_returns_handles_any_port_cardinality` fails with "Falsifying example: port_count=2, n_dates=1". Reverted the break before commit. Constraints honored ------------------- - NO modification to compute_composite() / PHASE3_WEIGHTS sum=1.0 invariant (composite.py:43-45) — pure test-addition PR - Rule 16: Top-5 still ranks raw composite_score; no scoring touched - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) - Schema triple untouched (no schemas.py / types.ts changes) - NO @settings(deadline=None) — default deterministic deadline - NO RuleBasedStateMachine (out of scope per issue #126) Test count delta ---------------- Before: 945 passed (Phase 4h.2 Part 2 baseline) After: 959 passed (+14 property tests across 2 new files) Files (4 changed, +747 / 0) ---------------------------- - pyproject.toml — +6 (hypothesis>=6.92 in [dev]) - CLAUDE.md — +7 (## Gotchas note) - tests/test_features/test_osap_replicate_properties.py — +394 NEW - tests/test_scoring/test_transforms_properties.py — +340 NEW Verification ladder all green ------------------------------ - ruff check . → All checks passed - python -m pytest tests/ -m "not network" → 959 passed (1m46s) - python -m pytest tests/test_features/test_osap_replicate_properties.py tests/test_scoring/test_transforms_properties.py → 14 passed (5s) - python -m compute.output.schema_check → in sync (no schema delta) - Sanity break-revert confirmed property test catches a regression No regression discovered ------------------------ Property tests passed on first execution against current main (commit 80c6641, Phase 4h.2 Part 2 already merged). No hidden bugs surfaced beyond the 56-signal gap that PR #124 already fixed — which itself is a good signal that the multi-port adapter handles the [2, 10] cardinality region cleanly. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU

vercel · 2026-05-20T01:25:22Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
quantrank	Ready	Preview, Comment	May 20, 2026 1:25am

Part of epic #125 (Item #6 of 6). Pure tooling addition — no runtime / scoring / schema impact. Motivation ---------- PR #123 (2026-05-19, closed without merging): a worker session opened a Phase 4j + 4k scout duplicate on branch `claude/resume-quantrank-phase-4.5-Zh0pO` while the main session shipped the same work directly via PRs #119 (Qlib) + #121 (IPCA). Root cause: the worker session never inspected the `claude/*` branch list + recent PRs before writing code, producing 100% wasted effort. This change ships a preflight check that surfaces in-flight scope BEFORE any code is written, so the duplicate-PR failure mode is caught at the handoff-prompt entry rather than at PR review. Files (2 new, +271 LOC) ------------------------ - tools/check_branch_collisions.py (+149 LOC) — git-only preflight script. Lists active `claude/*` branches via `git ls-remote origin "refs/heads/claude/*"` and recent main-branch commits via `git log --since="48 hours ago" --oneline --no-merges origin/main`. Optional keyword args flag case-insensitive substring matches. Always exit 0 (informational only). - .claude/skills/branch-collision-check/SKILL.md (+122 LOC) — skill description with YAML frontmatter, trigger conditions (handoff prompts, Phase / issue / Item #N mentions, fresh worker sessions), skip conditions (doc-only chores, iteration #2+, user-authorized parallel work), sample output (clean + warning), and output-interpretation guidance pointing the caller to STOP + ask the user when any ⚠️ line surfaces. Design notes ------------ - Git-only data sources — no `gh` CLI / GitHub API auth required. Works in the QuantRank Claude Code Web sandbox where `gh` is unavailable, and on any contributor machine with bare git. - 48-hour window — matches typical worker ↔ main session handoff cadence; long enough to catch duplicate work, short enough to keep the output scannable. - Pure read-only — no destructive git ops, no branch creation, no push, no GitHub API mutation. Always returns exit 0; the caller decides whether to proceed. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_branch_collisions.py → lists 1 active claude/* branch + 16 recent commits (last 48h), exit 0 - python tools/check_branch_collisions.py "Alpha158" → fires ⚠️ on PR #119 commit "Alpha158 158-feature manifest", summary reports "1 potential scope collision(s) found", exit 0 - python tools/check_branch_collisions.py "Phase 99 nonsense" → no match, summary reports "No scope collisions detected", exit 0 - python tools/check_doc_test_counts.py → exit 0 (Item #2 guard still passes; new files don't introduce hardcoded counts) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - SKILL.md YAML frontmatter parses — confirmed via Claude Code's skill registry picking it up at module load Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ — tools/ + .claude/skills/ only - No network calls / no GitHub API auth — git remote ls + git log - No destructive actions — read-only preflight check - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger (compute-rankings.yml untouched) Epic #125 status after this PR ------------------------------- Item #1 ✅ Hypothesis property tests (PR #127) Item #2 ✅ Strip hardcoded test counts + CI guard (PR #128) Item #4 ✅ Observability-before-wiring pattern (PR #129) Item #6 ✅ Branch-collision preflight (this PR) Items #3, #5 remain — separate PRs per epic decomposition. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…ble skills (#132) 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU Co-authored-by: Claude <noreply@anthropic.com>

…sk C.1 recovery) (#135) * docs(skills): SKILL.md schema bump + worker-session-handoff + 4 portable skills 3-task housekeeping + tacit knowledge harvest. Docs/skills-only PR — no code, no schema delta, no test additions. Task A — SKILL.md schema-version table fixes --------------------------------------------- Two stale "in flight" entries flipped to merged + 1 new row inserted: - Row 0.9.0-phase4h: "(in flight in PR #112)" → "(PR #112 merged 2026-05-19)" - Row 0.9.1-phase4h.2: "(in flight in PR #<NEXT>)" → "(PR #118 merged 2026-05-19)" - NEW row 0.9.2-phase4h.2 (above 0.9.1) — PR #124 merged, multi-port OSAP adapter + osap_signals_dropped_no_long_short field, closing the 100-signal accounting equation; DSR sign-inversion deferred to Part 3 PHASE_STATUS.md row 4 ALSO has "Phase 4h.2 Part 2 in flight in this PR" staleness — confirmed via grep but DELIBERATELY not updated here per Task A explicit scope (SKILL.md only). Recommend a follow-up phase-status-bump PR after this lands. Task B — New worker-session-handoff skill ------------------------------------------ .claude/skills/worker-session-handoff/SKILL.md (+163 LOC). YAML frontmatter + 5 sections: - When to use vs inline (≤50 LOC single-file → inline; ≥2 files / new dep / code logic → handoff) - Constraint lock library (8 standard locks: composite/PHASE3, Rule 16, Rule 18, no-merge, no force-push, no --no-verify, no workflow_dispatch, schema triple) - Anti-pattern: paste-loop avoidance (single outer code-block fence; reference PR #123 as related-but-distinct paste-loop failure mode) - Template (paste-ready, single ```` outer code block with language tag ` text` so inner triple-backticks pass through) - Reference invocations + QuantRank precedents (PR #124, #127, #131) Codifies the handoff shape that appeared verbatim across PRs #123, #124, #127, #128, #129, #131 — user copies ONE block instead of editing 5 template snippets per handoff. Task C — Portable skills library (4 skills, +417 LOC) ----------------------------------------------------- Audit step (per spec): read CLAUDE.md + AGENTS.md + SKILL.md + WORKFLOW.md + PR descriptions of #112/#118/#124/#127/#128/#129/#131. Identified 7 candidate patterns; classified by portability: - ✅ scout-then-integrate (portable; vendoring pattern, no QR logic) - ✅ observability-before-wiring (portable; gate-diagnostic pattern) - ✅ drift-detector-manifest (portable; API surface lock pattern) - ✅ schema-triple-lockstep (portable; Python/TS JSON contract) - 🟡 annotate-before-veto (portable; progressive rollout — DEFERRED to follow-up issue, lower value vs the 4 shipped) - 🟡 pre-plan-investigations (subsumed by scout-then-integrate's Phase 1 § "Pre-plan investigations" — no separate skill needed) - 🟡 graceful-degradation-try-except (portable; error-handling pattern — DEFERRED to follow-up issue, the wrapper is generally 1-line so doesn't warrant a dedicated skill) 4 shipped (each ≤ 109 LOC): .claude/skills/portable-scout-then-integrate/SKILL.md (99 LOC) .claude/skills/portable-drift-detector-manifest/SKILL.md (109 LOC) .claude/skills/portable-schema-triple-lockstep/SKILL.md (103 LOC) .claude/skills/portable-observability-before-wiring/SKILL.md (106 LOC) Flat naming convention (`portable-<name>/SKILL.md` at depth 1 from `.claude/skills/`) because Claude Code's skill registry doesn't recurse into nested subdirectories per CLAUDE.md ## Conventions. Confirmed via session reload — all 4 portable + worker-session- handoff registered correctly. Each portable skill has: - YAML frontmatter (name + description + TRIGGER + SKIP) - ## Pattern section (generic, no QR business logic) - ## Trigger conditions + ## Skip conditions - ## QuantRank precedent (1 paragraph, clearly labeled as precedent not pattern definition) Task C constraint check: - All portable skills core pattern descriptions are project- agnostic (read `.claude/skills/portable-*/SKILL.md` ## Pattern sections — zero references to OSAP / IPCA / pillar / Top-5 inside the pattern body; only inside the labeled "QuantRank precedent" section at the bottom) - 3 of 4 portable skills are 103-109 LOC (slightly over the 100-LOC target — pattern + trigger + skip + precedent sections require ~25 LOC each, leaving ~25 LOC of unavoidable scaffold). The 99-LOC one (scout-then-integrate) shows the cap is achievable but tight. Files (6 changed, +580 LOC, no deletions) ------------------------------------------ - SKILL.md — schema-version table fixes (Task A) - 5 new SKILL.md files in .claude/skills/ (Tasks B + C) Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 - python tools/check_branch_collisions.py "skill" "portable" → expected ⚠️ on #131 (own adjacent work, not a duplicate) - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; tools/ + .claude/skills/ aren't imported by tests) - Claude Code skill registry pick-up verified via session reload — all 5 new skills (worker-session-handoff + 4 portable-*) appear in the available-skills list Constraints honored ------------------- - No touch to compute/ / frontend/ / tests/ - No touch to PHASE_STATUS.md / WORKFLOW.md (Task A scope = SKILL.md only; PHASE_STATUS.md staleness flagged for follow-up) - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Task C portable skills are project-agnostic in their pattern description (QR refs confined to labeled "precedent" sections) Follow-up issue (to file post-merge) ------------------------------------ Title: "Portable Skills Library — extract remaining tacit patterns" - annotate-before-veto (progressive rule rollout) - graceful-degradation-try-except (1-line wrapper guidance) - pre-plan-investigations as standalone (currently subsumed) - Anything else surfaced by future PR descriptions https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU * docs(skills): Vendor karpathy-guidelines (Task C.1 recovery) + THIRD_PARTY_NOTICES.md Recovers Task C.1 from the original handoff that was silent-dropped in the prior PR #132 commit (50da720). The handoff explicitly named "Vendor karpathy-guidelines (1 skill, ~70 LOC)" as part of the portable skills library; the auditor session caught the omission and authorized this follow-up commit on the existing branch. Files (2 new, +138 LOC) ------------------------ - .claude/skills/portable-karpathy-guidelines/SKILL.md (+82 LOC) — vendored content of upstream skills/karpathy-guidelines/SKILL.md (67 LOC, byte-for-byte preserved) + 15-line appended attribution block referencing the upstream source, commit SHA, and the Karpathy tweet that motivated the guidelines. - THIRD_PARTY_NOTICES.md (+56 LOC, NEW at repo root) — third-party license disclosures. Section "karpathy-guidelines (Claude Code skill)" carries source URL, license declaration, vendored path, vendored date, upstream commit SHA, upstream first-commit date, and the full standard MIT License text with copyright attributed to "multica-ai contributors" (upstream has no individual copyright line and no standalone LICENSE file; the `license: MIT` claim appears in upstream README.md § License and each skill's YAML frontmatter). Upstream provenance ------------------- - Source: https://github.com/multica-ai/andrej-karpathy-skills - Upstream HEAD SHA at vendoring: 2c606141936f1eeef17fa3043a72095b4765b9c2 - Upstream first commit: 2026-01-27 - Vendored date: 2026-05-20 - License: MIT (declared) Verbatim content preserved -------------------------- `diff /tmp/karpathy-src/skills/karpathy-guidelines/SKILL.md .claude/skills/portable-karpathy-guidelines/SKILL.md` shows ONLY the 15-line appended attribution block at lines 68-82. The upstream 67-line content (YAML frontmatter + "Karpathy Guidelines" heading + the 4 principles) is byte-for-byte unchanged. Per the spec constraint: "เก็บ 4 principles verbatim. แก้ได้แค่ 'เพิ่ม' attribution block ท้ายไฟล์". License-disclosure caveat ------------------------- Upstream `multica-ai/andrej-karpathy-skills` declares MIT via README + YAML frontmatter but does NOT ship a standalone LICENSE file. The `THIRD_PARTY_NOTICES.md` entry includes the standard MIT License template with copyright attributed to the GitHub org ("multica-ai contributors"), matching the principle that an MIT declaration without a formal copyright line still licenses to the redistributor; the attribution is conservative. Verification ladder all green ------------------------------ - ruff check . → All checks passed - python tools/check_doc_test_counts.py → exit 0 (no test-count drift introduced by this commit) - python tools/check_branch_collisions.py "karpathy" → no scope collisions detected - python -m compute.output.schema_check → in sync (no schema touch) - python -m pytest tests/ -m "not network" → 959 passed (unchanged; .claude/skills/ + THIRD_PARTY_NOTICES.md aren't imported by tests) - Skill registry pickup verified via session reload — `portable-karpathy-guidelines` appears in the available-skills list with the upstream description verbatim Constraints honored ------------------- - No squash / amend of the prior 50da720 commit — this is a fresh commit pushed on top of the existing branch (per spec "ห้าม squash old commit") - No touch to the 4 already-shipped portable skills in 50da720 - No touch to compute/ / frontend/ / tests/ - No push to main; no force-push; no --no-verify - No workflow_dispatch trigger - Karpathy SKILL.md upstream content preserved verbatim; only the attribution block appended below the original content PR description update will follow as a separate `gh pr edit` / MCP `update_pull_request` call so the new "License Compliance" section + the audit-table row for karpathy-guidelines land in the PR body. https://claude.ai/code/session_01T8FE3MAnmk6hcjvH4SgYNU --------- Co-authored-by: Claude <noreply@anthropic.com>

vercel Bot deployed to Preview May 20, 2026 01:25 View deployment

dackclup marked this pull request as ready for review May 20, 2026 02:17

dackclup merged commit 780650f into main May 20, 2026
4 checks passed

dackclup deleted the process-hygiene-1-hypothesis-property-tests branch May 20, 2026 02:17

dackclup mentioned this pull request May 20, 2026

feat(tooling): Cross-session branch-collision check skill (#125) #131

Merged

10 tasks

dackclup mentioned this pull request May 20, 2026

docs(skills): SKILL.md schema bump + worker-session-handoff + 3-5 portable skills #132

Merged

10 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(features): Add Hypothesis property-based tests for data-shape invariants (#126)#127

test(features): Add Hypothesis property-based tests for data-shape invariants (#126)#127
dackclup merged 1 commit into
mainfrom
process-hygiene-1-hypothesis-property-tests

dackclup commented May 20, 2026

Uh oh!

vercel Bot commented May 20, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dackclup commented May 20, 2026

Summary

Test count before / after

Property tests landed (14)

Sub-task 2 — osap_replicate.py (7 tests, 394 LOC)

Sub-task 3 — scoring transforms (7 tests, 340 LOC)

Sub-task 4 — CI integration + docs

Sanity verification (NOT committed)

No regression discovered

Constraints honored

Files (4 changed, +747 / 0)

Test plan

Uh oh!

vercel Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Sub-task 2 — `osap_replicate.py` (7 tests, 394 LOC)

vercel Bot commented May 20, 2026 •

edited

Loading