fix(verify-output): Section B Tier-2 post-PR-#79 stale expectations (closes #117) #149

Merged
dackclup merged 1 commit into main from claude/quantrank-handoff-track-selection-1Z0Hk
May 20, 2026

@dackclup
Owner

@dackclup dackclup commented May 20, 2026

Summary

Closes #117. verify-production-output/helper.py Section B was hard-failing every post-PR-#79 production scan with:

✗ non_reliance_filing: 1 fired (expected 0; flag broken?)
✗ auditor_change: 9 fired (expected 0; flag broken?)

But PR #79 (Phase 4g, 2026-05-15) re-enabled both 8-K Tier-2 defenses by flipping _EIGHT_K_DEFENSES_ENABLED = True. Non-zero fires in the normal cohort band are EXPECTED, not bugs. This cleanup has been pending since the first 0.9.0-phase4h scan, five schema versions ago.

What changed

section_b_tier2() rewrite

Replaces the hard-fail-on-any with soft-band checks against the academic priors each flag was calibrated against:

| Flag | Calibration source | Soft band | Hard fail |
| --- | --- | --- | --- |
| going_concern_disclosure | Mayew 2015 | WARN > 5% | |
| non_reliance_filing | Schroeder 2024 (4.02 base rate) | WARN > 2% | tier2_coverage_pct ≤ 5% + fires > 0 |
| auditor_change | Cohen-Malloy-Nguyen 2020 | WARN > 5% | tier2_coverage_pct ≤ 5% + fires > 0 |

Inverted regression guard

The original "feature flag must hold" contract stays intact, but it now fires in the right direction. If _EIGHT_K_DEFENSES_ENABLED ever flips back to False at compute time (proxied by tier2_coverage_pct ≤ 5%, since the compute layer doesn't currently emit an explicit tier2_enabled field), then any non-zero fire on an 8-K flag is a real bug and hard-fails. If Tier-2 is healthy, the soft-band check applies instead.
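
The band check and the inverted guard can be sketched together as below. This is a minimal illustration, not the code in helper.py: `check_tier2`, `SOFT_BAND_WARN_PCT`, `EIGHT_K_FLAGS`, and the `(status, flag, detail)` row shape are hypothetical names, and the `90.0` coverage value in the example is an assumed "healthy" figure. The thresholds and the fire counts come from the table and the verification run in this PR.

```python
from typing import Dict, List, Tuple

# Soft-band WARN thresholds in percent of the universe (from the table in this PR).
SOFT_BAND_WARN_PCT: Dict[str, float] = {
    "going_concern_disclosure": 5.0,
    "non_reliance_filing": 2.0,
    "auditor_change": 5.0,
}
# Assumption: only the two 8-K flags are subject to the regression guard.
EIGHT_K_FLAGS = {"non_reliance_filing", "auditor_change"}
# Coverage at or below this value is the proxy for "defenses disabled".
DISABLED_COVERAGE_PCT = 5.0


def check_tier2(
    fires: Dict[str, int], universe: int, tier2_coverage_pct: float
) -> List[Tuple[str, str, str]]:
    """Return (status, flag, detail) rows; status is OK, WARN, or FAIL."""
    disabled = tier2_coverage_pct <= DISABLED_COVERAGE_PCT
    rows = []
    for flag, warn_pct in SOFT_BAND_WARN_PCT.items():
        count = fires.get(flag, 0)
        pct = 100.0 * count / universe if universe else 0.0
        detail = f"{count} ({pct:.1f}%)"
        if disabled and flag in EIGHT_K_FLAGS and count > 0:
            # Defenses look disabled, yet an 8-K flag fired: the real bug.
            rows.append(("FAIL", flag, detail))
        elif pct > warn_pct:
            # Healthy run, but the fire rate is above the calibrated band.
            rows.append(("WARN", flag, detail))
        else:
            rows.append(("OK", flag, detail))
    return rows


if __name__ == "__main__":
    production = {
        "going_concern_disclosure": 5,
        "non_reliance_filing": 1,
        "auditor_change": 9,
    }
    for status, flag, detail in check_tier2(production, 502, 90.0):
        print(f"{status:4} {flag}: {detail}")
```

With the 502-stock counts from the verification run, every row comes back OK; the same counts with coverage at or below 5% would turn the two 8-K rows into FAIL.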

SKILL.md

Row 47 (the section-table) + lines 87-95 (Hard contract checks) updated to describe the post-PR-#79 reality.

CLAUDE.md + AGENTS.md

Lockstep update per the convention codified in PR #142. PR #148 also moved from "in flight" → "Recently merged" in CLAUDE.md.

Verification

=== Section B — Tier-2 fired-flag inventory ===
  ✓ going_concern_disclosure: 5 (1.0%)
  ✓ non_reliance_filing: 1 (0.2%)
  ✓ auditor_change: 9 (1.8%)
=== Summary: 0 failures, 0 warnings ===

Run on current production data (commit 3da995dc, 502 stocks). Previously: 2 hard failures on Section B from stale expectations.

Test plan

  • ruff check . — All checks passed
  • python .claude/skills/verify-production-output/helper.py on current production data — 0 failures, 0 warnings
  • Section A-H output sane (every section ✓)
  • CI green

Paired work (separate, not in this PR)

The 2026-05-19 anchor for issue #130 (Process Hygiene Item #5, quarterly cohort-threshold review) is also due. I'll post the quarterly fire-rate audit as a comment on #130 immediately after this PR — that's a comment + issue-body table update, no code change.

Constraints honored


Generated by Claude Code

…loses #117)

helper.py Section B was hard-failing on `non_reliance_filing` and
`auditor_change` fires with "expected 0; flag broken?" — but PR #79
(Phase 4g, 2026-05-15) re-enabled both 8-K Tier-2 defenses by flipping
`compute/scoring/tier2._EIGHT_K_DEFENSES_ENABLED = True`. Non-zero fires
in the normal cohort band are EXPECTED post-4g, not bugs.

Changes:
- `section_b_tier2()` now takes `metadata` as a second parameter and
  replaces the hard-fail-on-any with a soft-band check against the
  academic cohort priors that calibrated each flag:
    * going_concern_disclosure  — Mayew 2015: 1-3%; WARN > 5%
    * non_reliance_filing        — Schroeder 2024: rare 4.02s; WARN > 2%
    * auditor_change             — Cohen-Malloy-Nguyen 2020: 1-5%; WARN > 5%
- Regression guard inverts: if `tier2_coverage_pct` ≤ 5% (proxy for
  `_EIGHT_K_DEFENSES_ENABLED = False` at compute time) and a flag still
  fires, that's the real bug — keeps the original "feature flag must
  hold" contract intact without flipping it backwards on healthy runs.
- SKILL.md Section B description + Hard contract checks updated.
- CLAUDE.md + AGENTS.md lockstep update per Rule from PR #142.

Verification on current production data (commit `3da995dc`, 502 stocks):
  Section A-H run: 0 failures, 0 warnings (was: 2 failures pre-fix on
  the stale Section B expectations).
@vercel

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| quantrank | Ready | Preview, Comment | May 20, 2026 9:13am |

@dackclup dackclup marked this pull request as ready for review May 20, 2026 10:08
@dackclup dackclup merged commit 630b70f into main May 20, 2026
4 checks passed
@dackclup dackclup deleted the claude/quantrank-handoff-track-selection-1Z0Hk branch May 20, 2026 10:08
dackclup added a commit that referenced this pull request May 20, 2026
…ing cross-ref (#153)

Throwaway PR to dogfood the pre-merge-prod-sim workflow (PR #148 + #149).
The workflow's path filter triggers on `compute/scoring/**` or
`compute/features/**`, but neither prior PR touched those paths — so the
sticky-comment + diff-table composition has never run end-to-end in CI.

This PR adds a 3-line docstring cross-reference in `composite.py` to
epic #150 Phase 3 (pillar correlation analysis + Quality+Profitability
ROE double-counting). The cross-ref documents in code where the next
structural work lives — useful regardless of smoke-test outcome.

Composite logic is unchanged. PHASE3_WEIGHTS unchanged. sum-to-1.0
invariant lock at composite.py:43-45 unchanged.

CLAUDE.md + AGENTS.md lockstep update.

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…+ tests (#156)

Phase 1.5 — `section_j_annotate_audit()` auto-tabulates the annotate
surface (`valuation_warnings` list + boolean-True `tier2_events` keys)
across the universe with counts + universe-pct, sorted descending. The
2026-05-20 quarterly cohort audit (issue #130 comment) discovered 10
undocumented flags by manual inventory walk; this section automates
that walk so the next quarterly review (2026-08-19) reads the table
off the helper. Complements Section E (risk_flags veto totals) by
covering the annotate-surface complement; dual-nature flags like
`non_reliance_filing` intentionally appear in both.
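
The tabulation described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the body of `section_j_annotate_audit()`: the function name `tabulate_annotate_flags`, the per-stock dict shape, and the choice to count a flag at most once per stock are all hypothetical. Only the two input fields (`valuation_warnings`, `tier2_events`) and the descending sort come from the commit message.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def tabulate_annotate_flags(
    stocks: List[Dict[str, Any]]
) -> List[Tuple[str, int, float]]:
    """Return (flag, count, universe_pct) rows sorted by count, descending."""
    counts: Counter = Counter()
    for stock in stocks:
        # List-valued annotate flags.
        flags = set(stock.get("valuation_warnings") or [])
        # Only keys whose value is exactly True count as fired.
        events = stock.get("tier2_events") or {}
        flags |= {key for key, value in events.items() if value is True}
        # Assumption: each flag is counted at most once per stock.
        counts.update(flags)
    total = len(stocks)
    rows = [(flag, n, 100.0 * n / total) for flag, n in counts.items()]
    # Descending by count; ties broken alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows
```

An empty universe yields an empty table, which corresponds to the "empty" case the regression suite is said to cover.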

Phase 1.4 — `tests/test_verify_helper.py` adds a 9-test regression
suite covering Section A schema reporter (happy + missing-tier2-cov
warn + low-fundamentals-cov fail), the Section B 4-branch Tier-2
matrix from PR #149 (tier2_enabled × within-band vs over-band vs
no-fire vs has-fire-but-disabled), and the new Section J (empty +
populated with descending sort). Helper loads via importlib since it
lives outside any package.
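
For reference, loading a standalone script by path with the standard library usually follows the pattern below. The function name `load_module_from_path` and the module name passed to it are placeholders; the test suite's actual loader is not shown in this PR.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a .py file that is not part of any package."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path}")
    module = importlib.util.module_from_spec(spec)
    # Executes the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module
```

A test could then call, for example, `load_module_from_path("verify_helper", Path(".claude/skills/verify-production-output/helper.py"))` and invoke the section functions on fixture data.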

Real-data smoke (502-stock S&P 500 production output): 17 distinct
valuation_warnings flags + 3 tier2_events booleans tabulated, e.g.
value_trap_risk 35.1%, goodwill_heavy 17.9%, auditor_change 1.8%.

Closes epic #150 Phase 1.4 + 1.5 (Phase 1.6 tracker #155 filed
2026-05-20; Phase 1 closes on merge of this PR).

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…ield (closes #155) (#160)

Surfaces compute/scoring/tier2._EIGHT_K_DEFENSES_ENABLED into
Metadata.tier2_enabled so verify-production-output/helper.py Section B
branches on the explicit flag instead of inferring from
tier2_coverage_pct > 5%. A future emergency-disable PR will now show up
in the verifier output instead of silently masking itself.

Schema bump 0.9.2-phase4h.2 → 0.9.3-phase4h.3. Pydantic default True
for back-compat with legacy snapshots; TypeScript side optional+nullable
so the stale frontend/public/data/metadata.json snapshot still casts
cleanly. Helper falls back to coverage-based inference when the key is
absent. 3 new regression tests (writer round-trip + Section B explicit-
flag-overrides-coverage matrix).
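
The explicit-flag-with-fallback behavior can be sketched as follows. `resolve_tier2_enabled` and `COVERAGE_PROXY_PCT` are hypothetical names, and treating a null value the same as a missing key is an assumption based on the "optional+nullable" note; the field names and the 5% proxy come from the commit message.

```python
from typing import Any, Mapping

# Coverage above this percentage is the legacy proxy for "Tier-2 enabled".
COVERAGE_PROXY_PCT = 5.0


def resolve_tier2_enabled(metadata: Mapping[str, Any]) -> bool:
    """Prefer the explicit tier2_enabled field; otherwise infer from coverage."""
    explicit = metadata.get("tier2_enabled")
    if explicit is not None:
        # The explicit flag wins even when coverage would suggest otherwise.
        return bool(explicit)
    # Legacy snapshots (key absent, or null by assumption) use the coverage proxy.
    coverage = float(metadata.get("tier2_coverage_pct") or 0.0)
    return coverage > COVERAGE_PROXY_PCT
```

Under this sketch an emergency disable (`tier2_enabled` false with high coverage) resolves to disabled, which is the case the coverage proxy alone could not see.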

Closes the last open AC item carried forward from issue #117 (PR #149
deferred). Phase 1 of epic #150 fully closed.

https://claude.ai/code/session_01Nj5sMzisnqDmF46g5ckEJn

Co-authored-by: Claude <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

verify-production-output/helper.py — Section B Tier-2 expectations stale (expects pre-4g state)

2 participants