docs(phase4.5): add Earnings-Manipulation Defense Cluster to the v1.x roadmap by dackclup · Pull Request #86 · dackclup/quantrank

dackclup · 2026-05-16T09:13:25Z

Summary

User: "I want the system's defenses to be as airtight as possible; please propose a plan" → "Fold the plan into the app build plan".

Folds the earnings-manipulation-defense proposal into the existing phase-tracker triple as a new Phase 4.5 → v1.2.0 cluster, inserted between Phase 4 (factor consolidation → v1.1) and Phase 5 (ML meta-learner).

Phase 4.5 sub-PRs (~10-11 working weeks)

| Sub-PR | New defenses | Effort |
|---|---|---|
| 4.5a (3 sub-PRs in parallel) | Sector-relative Sloan + Beneish soft-veto (M > −1.78) + Dechow soft-veto (F > 3.0) + `manipulation_triple_flag` joint badge | ~180 LOC, 1-2w |
| 4.5b | `restatement_history` 10-K/A scan + `late_filing_notification` Form 12b-25 | ~270 LOC, 1w |
| 4.5c | Roychowdhury REM 3-proxy (`abnormal_CFO` + `abnormal_production` + `abnormal_discretionary_exp`) | ~250 LOC, 2w |
| 4.5d | `m_score_deteriorating` 3y momentum + Burgstahler-Dichev kink-at-zero `loss_avoidance_pattern` | ~180 LOC, 2w |
| 4.5e | New SEC Form 4 parser + `insider_sell_cluster` + `c_suite_unusual_sell` | ~420 LOC, 3w |
| 4.5f | `manipulation_index` 0-100 composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump → `0.8.0-phase4.5f` | ~250 LOC, 1w |

Defense layer after 4.5: 5 → 7 active vetoes; 4 → 11 annotates; 9 → 18 total layers.

Validation harness (cross-cutting)

SEC AAER list 2000-2024 (~600 confirmed manipulators per Dechow et al. 2011 + ongoing). Public.
Audit Analytics restatement subset (~1,200 firms 2000-2024).
PBO ≤ 0.5 AND DSR > 0 gate per addition (Bailey-López de Prado-Zhu 2014 CSCV — already in PR 4b §2 scope per issue #75).
Purged + embargoed walk-forward CV (López de Prado 2018).
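
A purged, embargoed walk-forward split can be sketched as follows. This is a minimal illustration, not the harness from PR 4b: the function name `purged_walk_forward_splits` and its parameters are hypothetical, and the embargo is treated simply as an extra gap before each test block, which simplifies the scheme López de Prado describes.

```python
from typing import Iterator, List, Tuple


def purged_walk_forward_splits(
    n_samples: int,
    n_folds: int,
    label_horizon: int,
    embargo: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for expanding-window walk-forward CV.

    Rows are assumed to be in time order. Training rows whose forward-looking
    label (spanning `label_horizon` rows) could overlap the test block are
    purged, and a further `embargo` rows are dropped as a safety gap.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least one row per block")
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        test_start = k * fold_size
        # The last fold absorbs any remainder rows.
        test_end = n_samples if k == n_folds else test_start + fold_size
        # Purge + embargo: stop training early so no label leaks into the test block.
        train_end = max(0, test_start - label_horizon - embargo)
        yield list(range(train_end)), list(range(test_start, test_end))
```

Each fold trains only on rows strictly before the test block, minus the purged and embargoed rows, so a factor evaluated on fold k never sees labels that overlap its own test window.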

Sequencing

PR 4b (defense-infrastructure) — MUST land first; provides the PBO/DSR + AAER cohort fixtures
v1.1.0-phase4: 4h/4i/4j/4k factor integrations (OSAP / JKP / Qlib / IPCA)
v1.2.0-phase4.5: 4.5a (3 parallel sub-PRs) → 4.5b + 4.5c (parallel) → 4.5d → 4.5e → 4.5f
Factor 4h/4i/4j/4k can overlap 4.5 (disjoint files, same harness)

Doc lockstep (per phase-status-bump skill)

| File | Change |
|---|---|
| `PHASE_STATUS.md` | New row in Phase Overview table + full "Phase 4.5 plan" detail section with all 6 sub-PRs + acceptance criteria + sequencing notes |
| `WORKFLOW.md` | New row in Phase Overview header table + full "PHASE 4.5" task section (~150 lines) with checkboxes per sub-PR + Defense Roadmap table extended with 4.5a-4.5f rows + updated calendar totals for v1.0 / v1.1 / v1.2 / v2.0 |
| `CLAUDE.md` | §Phase status: Phase 4.5 named as next-after-PR-4b; defense-layer count delta (9 → 18) called out; issue #7 (Sloan sector-relative) noted as folded into 4.5a.1 |

Not in this PR

SKILL.md — no current-state constant change yet (schema / veto count / rule additions land per sub-PR as they ship, not at plan time)
Code changes (compute/, frontend/, tests/) — plan only
.claude/skills/phase-4.5/ sub-skill PLAN.md stubs — land per sub-PR at 4.5a kickoff

Test plan

No code changes; CI runs only doc-level lint
Phase 4.5 row appears in PHASE_STATUS.md table
WORKFLOW.md "PHASE 4.5" task section parses cleanly (markdown checkboxes render)
CLAUDE.md cross-references resolve (PHASE_STATUS.md → §"Phase 4.5 plan")

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Generated by Claude Code

… roadmap

User: "I want the system's defenses to be as airtight as possible; please propose a plan" → "Fold the plan into the app build plan".

Folds the manipulation-defense proposal into the existing phase-tracker triple as a new **Phase 4.5 → v1.2.0** cluster inserted between Phase 4 (factor consolidation → v1.1.0) and Phase 5 (ML meta-learner).

## Why this is its own phase, not folded into Phase 4

- **Phase 4 = factor focus** (OSAP / JKP / Qlib / IPCA). Ships v1.1.
- **Phase 4.5 = manipulation focus** (Sloan/Beneish/Dechow/REM/insider). Touches disjoint code paths.
- Splitting keeps release themes clean and lets v1.1 ship sooner.
- Phase 4.5 sub-PRs **can run in parallel with Phase 4 factor integrations** (4h/4i/4j/4k) since they touch different layers and share the PR 4b PBO/DSR validation harness.

## Phase 4.5 sub-PRs (6 total, ~10-11 weeks full-time)

| Sub-PR | New defenses | Effort |
|---|---|---|
| 4.5a (3 sub-PRs in parallel) | Sector-relative Sloan + Beneish/Dechow soft-veto + `manipulation_triple_flag` joint badge | ~180 LOC, 1-2w |
| 4.5b | `restatement_history` (Hennes-Leone-Miller 2008) + `late_filing_notification` Form 12b-25 (Bartov-Lai-Yeung 2002) | ~270 LOC, 1w |
| 4.5c | Roychowdhury REM 3-proxy (`abnormal_CFO` + `abnormal_production` + `abnormal_discretionary_exp`) | ~250 LOC, 2w |
| 4.5d | `m_score_deteriorating` 3y momentum + Burgstahler-Dichev kink at zero | ~180 LOC, 2w |
| 4.5e | New SEC Form 4 parser + `insider_sell_cluster` (Cohen-Malloy-Pomorski 2012) + `c_suite_unusual_sell` | ~420 LOC, 3w |
| 4.5f | `manipulation_index` 0-100 composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump 0.7.x → 0.8.0-phase4.5f | ~250 LOC, 1w |

**Defense layer after 4.5**: 5 → 7 active vetoes; 4 → 11 annotates; 9 → **18 total layers** (verifiable via defense-scorecard skill).

## Validation cohort (used per sub-PR)

- SEC AAER list 2000-2024 (~600 confirmed manipulators per Dechow et al. 2011 dataset + ongoing). Public, free.
- Audit Analytics restatement subset (~1,200 firms 2000-2024) as second-source.
- PBO ≤ 0.5 AND DSR > 0 gate per addition (Bailey-López de Prado-Zhu 2014 CSCV harness, already in PR 4b §2 scope per issue #75).
- Purged + embargoed walk-forward CV (López de Prado 2018).

## Doc lockstep (per phase-status-bump skill)

| File | Change |
|---|---|
| `PHASE_STATUS.md` | New row in Phase Overview table + full "Phase 4.5 plan" section with all 6 sub-PRs + acceptance criteria + sequencing |
| `WORKFLOW.md` | New row in Phase Overview header table + full "PHASE 4.5 — Earnings-Manipulation Defense Cluster" task section with checkboxes per sub-PR + Defense Roadmap table extended with 4.5a-4.5f rows + updated calendar totals for v1.1 / v1.2 / v2.0 |
| `CLAUDE.md` | §Phase status updated: Phase 4.5 named as the next deliverable after PR 4b → v1.1.0; defense-layer count delta (9 → 18) called out; issue #7 noted as folded into 4.5a |

## Not touched

- `SKILL.md`: no current-state constant change yet (schema / veto count / rule additions land per sub-PR as they ship, not at plan time). Rule 16 (annotate-and-veto-Top-N) covers 4.5a's defense pattern; a new rule is added only if the `manipulation_index` composite-penalty becomes a Phase 5+ adopted pattern.
- Code (`compute/`, `frontend/`, `tests/`): plan only.
- `.claude/skills/phase-4.5/`: sub-skill PLAN.md stubs land per sub-PR at the start of 4.5a kickoff.

## Sequencing reminder

1. PR 4b (defense-infrastructure): MUST land first
2. v1.1.0-phase4: 4h/4i/4j/4k factor integrations
3. v1.2.0-phase4.5: 4.5a → 4.5b + 4.5c (parallel) → 4.5d → 4.5e → 4.5f
4. (Factor 4h/4i/4j/4k can also overlap 4.5, since the files are disjoint)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2


… only §3 polish remains (#87)

During run #45 verification (defense-scorecard skill) the `cross_source_disagreement` flag surfaced 23 stocks in production, which is evidence that PR 4b §1 had already shipped. A `git log` confirms **PR #60 ("feat(defense): cross-source validator + PBO/DSR + IC-decay infra (PR 4b)") merged on 2026-05-14**, before v1.0.0. Issue #75 was filed on 2026-05-15 (one day after #60 merged) to track the remaining acceptance criteria, but PR #81 and PR #86 mistakenly treated "PR 4b" as not yet started at all. This commit re-aligns the triple-doc with the actual state.

## What PR #60 actually shipped (already in production)

| Sub-section | Status | Evidence |
|---|---|---|
| **§1 Cross-source validator** | ✅ DONE | `compute/ingest/cross_source.py` exists; wired in `compute/main.py:979-988`; run #45 shows 23/502 stocks (4.6%) flagging `cross_source_disagreement` annotate |
| **§2 PBO/DSR library** | ✅ DONE | `compute/validation/pbo_dsr.py` exists with CSCV + DSR + Beasley-Springer-Moro inverse normal CDF; `factor_passes_gates()` entry point ready |
| **§3 IC-decay monitor** | 🟡 PARTIAL | `compute/validation/ic_decay.py` exists; **`decay_report.json` writer NOT wired** + no UI transparency surface, which are exactly the 2 unchecked acceptance criteria on issue #75 |

## Real "next deliverable": PR 4b §3 polish (~2-3 days)

1. **Writer wiring**: call `ic_decay.run()` from `compute/main.py` after pillar normalization; write the per-pillar decay table to `frontend/public/data/decay_report.json` via a new writer in `compute/output/writer.py` (atomic temp → rename pattern).
2. **UI transparency surface**: new `DecayReportCard.tsx` on the stock detail page below `PillarRadarChart`. Reads `decay_report.json` client-side (fail-soft if absent), shows 8-pillar 12m + 36m IC trend + decay-alert badges per pillar.

Effort: ~150 LOC writer + UI + ~80 LOC tests.

After §3 polish ships: → 4h/4i/4j/4k factor integrations (each gated by the now-complete PBO/DSR harness) → tag v1.1.0-phase4.

## File-by-file changes

| File | Change |
|---|---|
| `CLAUDE.md` | §Phase status: "Next deliverable" reframed as PR 4b §3 polish only. Production verification numbers cited (23 stocks, 4.6%). Issue #75 description updated to "remaining items: writer + UI surface". |
| `PHASE_STATUS.md` | Phase 4 table row reflects PR 4b §1+§2 merged via PR #60. The long "Next deliverable: PR 4b — defense-infrastructure" block is replaced by a 3-row sub-section status table (§1 ✅ / §2 ✅ / §3 🟡) + a tighter "PR 4b §3 polish" next-deliverable scope description. |
| `WORKFLOW.md` | Phase 4 Acceptance Criteria: 3 cross-source / PBO-DSR / IC-decay checkboxes flipped from `[ ]` to `[x]` (with footnotes on PR #60 + run #45 evidence) for §1+§2; §3 stays `[ ]` with the remaining-items breakdown. |

## Doc audit trail

This is the second triple-doc fix this week:

- PR #81: marked 4g ✅ DONE (correct)
- PR #86: added Phase 4.5 roadmap + named PR 4b as next (WRONG; this commit corrects it)
- This PR: PR 4b §1+§2 ✅ DONE; §3 polish is the real next

The cause of the PR #81/#86 error: PR #60 (merged 2026-05-14) used the literal title "PR 4b", but the `PHASE_STATUS.md` PR-label slot "4b" was already taken by the `_avg_3y_roe` fix (different content, same letter). Two unrelated PRs both calling themselves "PR 4b" created the confusion. Going forward we should disambiguate by always referring to it as "PR 4b defense-infrastructure" or "issue #75" when meaning the cross-source/PBO/IC-decay work.

## Verification

- No code changes; docs only
- `grep "PR 4b defense-infrastructure (issue #75) next"` returns 0 hits
- `grep "PR 4b §3 polish"` appears in CLAUDE.md + PHASE_STATUS.md (the new wording)
- WORKFLOW.md Phase 4 acceptance checklist: 3 items flipped to [x] with evidence

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
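
The "atomic temp → rename" writer pattern named in the writer-wiring step can be sketched like this. It is a minimal illustration only: `write_json_atomic` is a hypothetical helper, not the function in `compute/output/writer.py`, and the payload shape in the usage note is invented for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path, payload) -> None:
    """Write `payload` as JSON so readers never observe a half-written file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace swaps the file in a single step, overwriting any old copy.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A call such as `write_json_atomic("frontend/public/data/decay_report.json", {"pillars": {}})` would leave either the previous report or the complete new one on disk, which fits the fail-soft client-side read described above.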

…oses issue #7) (#89)

User: "I want the system's defenses to be as airtight as possible" → "Fold the plan into the app build plan". Phase 4.5 manipulation-defense cluster sub-PR 1 of 6.

## Problem (issue #7)

`sloan_accruals_top_decile` uses a **cross-sectional** top decile and ignores sector. Sloan accruals = (NI − CFO) / TotalAssets measures "non-cash earnings", which differ structurally by sector:

- **Financials** carry high non-cash items (loan-loss provisions, fair-value adjustments), which is not manipulation
- **REITs** carry high D&A, which is structural to the business model
- **Tech** has SBC + working-capital build from growth, which is not manipulation

The cross-sectional top decile therefore over-fires on Financials/REITs. Production run #45 confirms it: **Financials 16/75 = 21.3% flagged** vs expected ~10% (more than a 2× over-fire). IT/Comm/REITs also drift slightly above baseline; Staples/Utilities sit far below it (2.8% / 3.2%). The sector spread of 21.3% / 2.8% = 7.6× is more than the Sloan paper implies.

## Fix

`compute_risk_flags()` computes the Sloan top-decile threshold **per sector** when `sectors` is passed in (production always passes it). Sectors with fewer than `SLOAN_MIN_POPULATION_SECTOR=15` members (e.g. future S&P 1500 sub-sectors) fall back to the cross-sectional threshold.

This mirrors the existing NSI per-sector pattern: it adds a `sloan_thresholds_by_sector` dict parallel to `nsi_thresholds_by_sector` and changes the per-ticker check to prefer the sector threshold, then fall back to the cross-sectional one.

## Expected production impact

| Sector | Current | After 4.5a.1 (~10%) | Δ |
|---|---|---|---|
| Financials (75) | 16 (21.3%) | ~8 (10%) | **−8** |
| IT (73) | 8 (11.0%) | ~7 (10%) | −1 |
| Comm (23) | 3 (13.0%) | ~2 (10%) | −1 |
| Cons.Disc (48) | 5 (10.4%) | ~5 (10%) | 0 |
| Health (59) | 6 (10.2%) | ~6 (10%) | 0 |
| Real Estate (31) | 3 (9.7%) | ~3 (10%) | 0 |
| Industrials (79) | 6 (7.6%) | ~8 (10%) | +2 |
| Energy (21) | 1 (4.8%) | ~2 (10%) | +1 |
| Materials (26) | 1 (3.8%) | ~3 (10%) | +2 |
| Utilities (31) | 1 (3.2%) | ~3 (10%) | +2 |
| Cons.Staples (36) | 1 (2.8%) | ~4 (10%) | +3 |

Net: **51 → ~50** flagged (roughly stable total) but **per-sector rate stabilizes at ~10%** instead of the 7.6× spread today. The flag now means "this stock is in the worst Sloan decile **of its own sector peer group**", which is the correct earnings-quality signal per Sloan 1996.

Tier-3 defenses (Beneish + Dechow) catch the residual sector-agnostic manipulation cases that don't show up in sector-relative Sloan (4.5a.2 + 4.5a.3 will promote them to soft vetoes).

## Implementation

| File | Change |
|---|---|
| `compute/scoring/risk_overlay.py` | New `SLOAN_MIN_POPULATION_SECTOR = 15` constant. Module docstring updated to flag PR 4.5a.1 + issue #7 close. `compute_risk_flags()` builds `sloan_thresholds_by_sector` dict when `sectors` is supplied (mirrors NSI pattern). Per-ticker check prefers sector threshold → cross-sectional fallback → skip. |
| `tests/test_scoring/test_risk_overlay.py` | 3 new tests: `test_sloan_sector_relative_top_decile_when_sectors_supplied` (top in EACH of 2 sectors flagged; bottom in each NOT flagged), `test_sloan_sector_relative_skips_undersized_sector` (small-sector tickers fall back to cross-sectional), `test_sloan_sector_relative_floor_constant` (sanity: floor ≥ 10 and ≥ SLOAN_MIN_POPULATION). |

## Backward compat

- Existing tests that call `compute_risk_flags(snaps)` without `sectors` continue to work: the cross-sectional fallback is the same code path as v1.0.
- `sectors` arg was already optional + production already passes `sectors=sectors_dict` to `compute_risk_flags` (compute/main.py:830).
- No schema change. No new flag identifier: the flag name `sloan_accruals_top_decile` is unchanged (the threshold rubric is the implementation detail).

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/ -m "not network"`: **775 passed** (was 772)
- ✅ schema_check: not touched (no schema delta)
- ⏳ Production verification deferred to next weekly compute; will re-run `defense-scorecard` + `verify-production-output` Section E vs current 51-flag baseline after the next workflow_dispatch lands

## Closes / references

- Closes [issue #7](#7) (Sloan over-fire on growers + Financials)
- First sub-PR of Phase 4.5 manipulation-defense cluster per PR #86
- Defense layer count unchanged (5 active vetoes), defense **quality** improves: same number of flagged stocks but now correctly sector-distributed

## Future (not in this PR)

- 4.5a.2: Beneish soft-veto (M > −1.78 threshold)
- 4.5a.3: Dechow soft-veto + `manipulation_triple_flag` joint gate
- AAER cohort backtest via the PR 4b §2 PBO/DSR harness (deferred until cohort fixtures land in 4.5c kickoff)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
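
The sector-relative threshold with cross-sectional fallback described in this commit can be sketched as below. This is an illustration under stated assumptions, not the code in `compute/scoring/risk_overlay.py`: `top_decile_threshold` and `sloan_flags` are hypothetical names, the nearest-rank percentile is one of several reasonable choices, and only `SLOAN_MIN_POPULATION_SECTOR` and the dict name `sloan_thresholds_by_sector` come from the PR text.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

SLOAN_MIN_POPULATION_SECTOR = 15  # floor named in the PR


def top_decile_threshold(values: List[float]) -> float:
    """Nearest-rank 90th percentile (assumed method; the real one may differ)."""
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n)
    return ordered[rank - 1]


def sloan_flags(
    accruals: Dict[str, Optional[float]],  # ticker -> (NI - CFO) / TotalAssets
    sectors: Dict[str, str],               # ticker -> sector name
) -> Set[str]:
    """Return tickers above their sector's top-decile accrual threshold."""
    valid = {t: a for t, a in accruals.items() if a is not None}
    if not valid:
        return set()
    cross_sectional = top_decile_threshold(list(valid.values()))

    by_sector: Dict[str, List[float]] = defaultdict(list)
    for ticker, value in valid.items():
        if ticker in sectors:
            by_sector[sectors[ticker]].append(value)

    # Only sectors with enough members get their own threshold.
    sloan_thresholds_by_sector = {
        sector: top_decile_threshold(vals)
        for sector, vals in by_sector.items()
        if len(vals) >= SLOAN_MIN_POPULATION_SECTOR
    }

    flagged = set()
    for ticker, value in valid.items():
        # Prefer the sector threshold, else fall back to cross-sectional.
        threshold = sloan_thresholds_by_sector.get(sectors.get(ticker), cross_sectional)
        if value > threshold:
            flagged.add(ticker)
    return flagged
```

With two 20-member sectors whose accrual levels differ, the top two names of each sector are flagged, whereas a purely cross-sectional cut would flag only members of the higher-accrual sector.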

…tered_top5 suppression) (#90)

Phase 4.5 manipulation-defense cluster sub-PR 2 of 6 per PR #86. Promotes `beneish_high` (M > -2.22 annotate, PR 3e.1) to a **second tier** active veto at the stricter threshold `M > -1.78`.

## Threshold rationale

Beneish 1999 *FAJ* original cutoff is M > -2.22 (Type-I ~17%, Type-II ~24%). The PR 3e.1 annotate uses this threshold as the "is this stock worth a closer look" signal (low precision, high recall, doesn't suppress Top-5).

PR 4.5a.2 adds a STRICTER cutoff M > -1.78 for the **active veto** path. Beneish 1999 paper Table 4 shows positive-predictive-value crosses ~60% at M > -1.78 on the original 74-manipulator sample; below that the precision drops into FP-heavy territory. The stricter cutoff mirrors PR 3d's `non_reliance_filing` veto trade-off: high precision, narrower recall, won't dilute Top-5 with marginal annotators.

Tickers in the -2.22 to -1.78 band keep ONLY the annotate flag (no veto). Tickers above -1.78 get BOTH the annotate AND the veto.

## Production estimate (run #46, c=737d8efe)

| Threshold | Coverage | Flagged |
|---|---|---|
| M > -2.22 (annotate, existing) | 160/502 (32% covered) | 26 (16.2%) |
| **M > -1.78 (veto, NEW)** | same 160 | **11 (6.9%)** |

New vetoes that will fire (top 11 in M-score order): SMCI · WAT · PODD · WDC · NVDA · CAT · PLTR · SNDK · BG · STX · LLY

Most are growth tech (SMCI, NVDA, PLTR, SNDK, STX) where Beneish 1999 acknowledged growers can FP. The Tier-3 forensic posture documented in PR 3e.1 is exactly the right framing: the veto is high-precision-narrow-recall; growers that show up here usually warrant a closer look even if not all are confirmed manipulators.

## Architecture

| Layer | Before | After |
|---|---|---|
| Active vetoes | 5 (altman / sloan / NSI / non_reliance / data_quality) | **6** (+ `beneish_manipulation_veto`) |
| Annotate flags | `beneish_high` at M > -2.22 (unchanged) | + nothing new |

`compute_risk_flags(beneish_m_scores=...)` is the inject pattern, mirroring `non_reliance_by_ticker`. `compute/main.py` pre-computes all 502 Beneish results BEFORE the risk_flag pass so the veto can suppress entered_top5; the existing per-ticker loop (Step 8) reads the cached `beneish_results[ticker]` instead of recomputing (performance neutral: one compute per ticker, was already two before this refactor).

## Files changed

| File | Change |
|---|---|
| `compute/scoring/beneish.py` | + `BENEISH_VETO_THRESHOLD = -1.78` constant + docstring rationale (paper Table 4 PPV crossover, parallel to non_reliance_filing trade-off) |
| `compute/scoring/risk_overlay.py` | + `beneish_m_scores` kwarg on `compute_risk_flags`. New veto check at the end of the per-ticker loop emits `beneish_manipulation_veto` when M > threshold. Imports `BENEISH_VETO_THRESHOLD`. |
| `compute/main.py` | Pre-compute `beneish_results` dict + `beneish_m_scores` dict before `compute_risk_flags` call. Pass to compute_risk_flags. Per-ticker Step-8 loop reads from cached `beneish_results[ticker]` (no double-compute). |
| `tests/test_scoring/test_risk_overlay.py` | 4 new tests: veto fires above strict threshold, skipped on None m_score, strict inequality at exact threshold, disabled when dict not supplied (backward-compat). |

## Backward compat

- `beneish_m_scores` arg is **optional**. Existing callers without it (tests, future external users) see no behavior change.
- `beneish_high` annotate at M > -2.22 **unchanged**: old flag still fires for ranks below the veto band.
- `StockDetail.beneish_m_score` numeric field unchanged.

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/ -m "not network"`: **779 passed** (was 775; 4 new)
- ✅ schema_check: N/A (no schema delta)
- ⏳ Production verification deferred to next workflow_dispatch; expect:
  - 11 new tickers with `beneish_manipulation_veto` in `risk_flags`
  - Top-5 rotation re-shuffles if NVDA/SMCI/WAT/etc were in raw-top-5 (NVDA in run #46 raw-top-5 at #3, will be suppressed with Sloan + new Beneish veto)
  - Active veto count metadata 5 → 6 (verify via `defense-scorecard`)

## Sibling sub-PRs (Phase 4.5a wave)

- ✅ 4.5a.1 (Sloan sector-relative): merged PR #89
- **4.5a.2 (this PR)**: Beneish soft-veto
- ⬜ 4.5a.3 (Dechow soft-veto + `manipulation_triple_flag`): next, branches off this PR or off main after merge

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
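
The two-tier Beneish behaviour described above (annotate band vs. veto band, strict inequality, skip on missing score) can be sketched as follows. This is a hedged illustration: `beneish_tiers` and `BENEISH_ANNOTATE_THRESHOLD` are hypothetical names, and placing `beneish_high` in `valuation_warnings` is an assumption; only `BENEISH_VETO_THRESHOLD`, the flag identifiers, and the veto living in `risk_flags` come from the PR text.

```python
from typing import List, Optional, Tuple

BENEISH_ANNOTATE_THRESHOLD = -2.22  # hypothetical constant name for the annotate cutoff
BENEISH_VETO_THRESHOLD = -1.78      # constant name taken from the PR


def beneish_tiers(m_score: Optional[float]) -> Tuple[List[str], List[str]]:
    """Return (valuation_warnings, risk_flags) contributed by one M-score."""
    valuation_warnings: List[str] = []
    risk_flags: List[str] = []
    if m_score is None:
        # No score (e.g. missing inputs): emit nothing rather than guess.
        return valuation_warnings, risk_flags
    if m_score > BENEISH_ANNOTATE_THRESHOLD:
        valuation_warnings.append("beneish_high")
    if m_score > BENEISH_VETO_THRESHOLD:  # strict inequality, as in the new tests
        risk_flags.append("beneish_manipulation_veto")
    return valuation_warnings, risk_flags
```

A score of exactly -1.78 therefore stays annotate-only, which matches the "strict inequality at exact threshold" test listed under Files changed.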

Phase 4.5 manipulation-defense cluster sub-PR 3 of 6 per PR #86. Final 4.5a wave member. Branched off PR #90 (4.5a.2 Beneish veto) to integrate cleanly. Depends on 4.5a.2 merging first.

## Two additions

### 1. Dechow F-score soft-veto (F > 3.0)

Promotes `dechow_high` (F > 2.45 annotate, PR 3e.2) to a second-tier active veto at the stricter threshold F > 3.0. Mirrors PR 4.5a.2's Beneish veto pattern exactly.

Threshold rationale: Dechow 2011 *CAR* Table 7 shows that at F > 3.0 the AAER hit rate exceeds 4× baseline (vs ~2× at F > 2.45). The stricter cutoff matches the precision/recall trade-off PR 4.5a.2 locked for Beneish and PR 3d locked for non_reliance_filing: high precision, narrower recall, won't dilute Top-5.

### 2. `manipulation_triple_flag` joint-gate badge

Fires when Sloan + Beneish-high + Dechow-high all flag on the same ticker. Rare but high-confidence, with 2 tickers in run #46:

- **SMCI**: F=6.65 (Dechow veto fires too) + Sloan + Beneish high
- **WAT**: Sloan + Beneish high + Dechow high (annotates only)

UI-only badge in `valuation_warnings`; does NOT stack a third veto on top of the individual component vetoes. Per PR #86 plan §4.5a.3.

## Production estimate (run #46)

| Threshold | Coverage | Flagged |
|---|---|---|
| F > 2.45 (annotate, existing) | 157/502 (31% covered) | 2 (1.3%) |
| **F > 3.0 (veto, NEW)** | same 157 | **1 (0.6%)** |
| **manipulation_triple_flag** | full universe | **2** |

Expected state of the veto layer after this PR ships:

- 4.5a.1 (merged): Sloan sector-relative, 51 → ~56
- 4.5a.2 (PR #90): + Beneish veto, 11 new flags
- **4.5a.3 (this PR)**: + Dechow veto, **1** new flag (the Dechow veto on SMCI overlaps with its Beneish veto; Top-5 suppression stacks but the effective count of NEW suppressions is 1 unique ticker, since SMCI already loses entered_top5 from the Beneish veto)
- Active vetoes: **5 → 7** (Beneish + Dechow added)
- Annotates: + manipulation_triple_flag = **+1 reason taxonomy**

## Architecture

| Layer | Before 4.5a wave | After 4.5a wave |
|---|---|---|
| Active vetoes | 5 | **7** (+ beneish_manipulation_veto, dechow_manipulation_veto) |
| Tier-3 annotates | `beneish_high` + `dechow_high` at looser thresholds | unchanged (kept for the soft band) |
| Joint gates | none | **+ `manipulation_triple_flag`** (3-of-3 joint) |

## Files changed

| File | Change |
|---|---|
| `compute/scoring/dechow_f.py` | + `DECHOW_VETO_THRESHOLD = 3.0` constant + docstring rationale (Dechow 2011 Table 7 4× baseline crossover) |
| `compute/scoring/risk_overlay.py` | + `dechow_f_scores` kwarg on `compute_risk_flags`. New veto check at end of per-ticker loop, immediately after Beneish. Imports `DECHOW_VETO_THRESHOLD`. |
| `compute/main.py` | Pre-compute `dechow_results` dict + `dechow_f_scores` dict alongside Beneish (one combined pass). Pass to compute_risk_flags. Step-8 per-ticker loop reads from cached `dechow_results[ticker]` (no double-compute). + `manipulation_triple_flag` joint-gate logic appended after Dechow annotate emission. |
| `tests/test_scoring/test_risk_overlay.py` | 5 new tests: veto fires above threshold, skipped on None, strict inequality, disabled when dict not supplied, Beneish + Dechow co-firing independence. |

## Backward compat

- `dechow_f_scores` arg optional. Existing callers without it unchanged.
- `dechow_high` annotate at F > 2.45 unchanged.
- `StockDetail.dechow_f_score` numeric field unchanged.
- `manipulation_triple_flag` is in `valuation_warnings` (annotate) not `risk_flags`, so it doesn't change Top-N suppression on top of component vetoes. UI must opt in to render it.

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/ -m "not network"`: **784 passed** (was 779 on 4.5a.2 branch; +5 new)
- ✅ schema_check: N/A (no schema delta; `manipulation_triple_flag` is a string in existing `valuation_warnings: list[str]`)
- ⏳ Production verification deferred; expect:
  - 1 new `dechow_manipulation_veto` (SMCI)
  - 2 `manipulation_triple_flag` annotates (SMCI, WAT)
  - 7 active vetoes total

## Sibling sub-PRs (Phase 4.5a wave, COMPLETE after this PR)

- ✅ **4.5a.1** Sloan sector-relative: merged PR #89
- 🟡 **4.5a.2** Beneish soft-veto: open PR #90
- **4.5a.3 (this PR)**: Dechow soft-veto + manipulation_triple_flag

Next: **4.5b** (restatement_history + late_filing_notification), **4.5c** (Roychowdhury REM), **4.5d** (m-score momentum + Burgstahler kink), **4.5e** (Form 4 insider clustering), **4.5f** (composite manipulation_index + UI + schema bump).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
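
The 3-of-3 joint gate can be sketched as a simple set check. This is an illustration only: the helper `has_manipulation_triple` is hypothetical, and it assumes the caller has already collected every flag identifier emitted for a ticker (from both `risk_flags` and `valuation_warnings`) into one set.

```python
from typing import Iterable

# Component flags named in the PR text; all three must be present.
TRIPLE_COMPONENTS = frozenset(
    {"sloan_accruals_top_decile", "beneish_high", "dechow_high"}
)


def has_manipulation_triple(flags_for_ticker: Iterable[str]) -> bool:
    """True when Sloan, Beneish-high and Dechow-high all fire on one ticker."""
    return TRIPLE_COMPONENTS.issubset(set(flags_for_ticker))


def annotate_triple(valuation_warnings: list, flags_for_ticker: Iterable[str]) -> list:
    """Return a copy of valuation_warnings with the badge appended if earned.

    The badge is annotate-only: it is never added to risk_flags, so it does
    not stack another veto on top of the component vetoes.
    """
    out = list(valuation_warnings)
    if has_manipulation_triple(flags_for_ticker) and "manipulation_triple_flag" not in out:
        out.append("manipulation_triple_flag")
    return out
```

Because the gate keys on the looser annotate identifiers, a ticker such as WAT can earn the badge from annotates alone, while SMCI earns it and also carries the Dechow veto separately.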


#92)

Phase 4.5a manipulation-defense quick-wins shipped 2026-05-16 across 3 sub-PRs (#89/#90/#91). Production verified on run #47 (commit `8cdf4886`). This commit bumps the triple-doc lockstep so future sessions read the actual current state instead of the in-progress plan.

## What shipped (per-sub-PR)

| Sub-PR | PR | Delivered | Production effect |
|---|---|---|---|
| **4.5a.1** | #89 | Sloan accruals top-decile within sector; `SLOAN_MIN_POPULATION_SECTOR=15` floor; cross-sectional fallback for under-floor sectors. Closes issue #7. | Financials Sloan rate 21.3% → 11.7%. Cross-sector spread 7.7× → 1.4×. Total Sloan flagged 51 → 56. |
| **4.5a.2** | #90 | `beneish_manipulation_veto` active-veto path at M > −1.78 (Beneish 1999 Table 4 PPV crossover). | 11 new vetoed tickers: SMCI · WAT · PODD · WDC · NVDA · CAT · PLTR · SNDK · BG · STX · LLY. |
| **4.5a.3** | #91 | `dechow_manipulation_veto` active-veto path at F > 3.0 (Dechow 2011 Table 7 4× baseline crossover) + `manipulation_triple_flag` joint-gate annotate. | 1 Dechow veto (SMCI F=6.65); 2 triple_flag tickers (SMCI + WAT). |

## End-state defense layer

- **Active vetoes**: 5 → **7** (added `beneish_manipulation_veto`, `dechow_manipulation_veto`)
- **Annotate flags**: 4 → **5** (added `manipulation_triple_flag`)
- **Tier-3 forensic**: still 2 (Beneish + Dechow operating at two thresholds each: annotate + veto)
- **Reason taxonomy**: 24 → **29 stable identifiers**

No schema delta: new flag IDs are strings in the existing `risk_flags: list[str]` and `valuation_warnings: list[str]` arrays. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | §Phase status: "Next deliverable" reframed from "4.5a.1 sector-relative Sloan" to "4.5b disclosure-driven catches". Defense layer count "9 → 18 target" updated to "9 → 11 after 4.5a; target 18 after 4.5f". Issue #7 marked ✅ closed by 4.5a.1. |
| `PHASE_STATUS.md` | Phase Overview table: Phase 4.5 row flipped ⚪ → 🟡 IN PROGRESS with the 4.5a wave landed; duplicate ⚪ row removed. "Phase 4.5 plan" section §4.5a replaced "1-2 weeks, +2 active veto + 1 badge" header with "✅ DONE 2026-05-16" + a results table showing per-sub-PR production effect. Original plan text preserved below the results table for audit. |
| `WORKFLOW.md` | Phase 4.5 §Tasks §4.5a: all 4 checkboxes flipped [ ] → [x] with per-sub-PR PR-number citations (PR #89 / #90 / #91), LOC counts, test deltas, production verification numbers. |

## Audit trail

This is the 4th doc-correction PR in the post-v1.0 cleanup pattern, this time legitimate (the work actually shipped). Earlier ones in this session were correcting drift between intent and state:

- PR #81: 4g status correct (factual)
- PR #86: added Phase 4.5 roadmap (planning)
- PR #87: corrected "PR 4b next" → "§3 polish next" (was wrong)
- PR #88: corrected "§3 polish next" → "deferred to Phase 5" (was wrong)
- **This PR**: 4.5a wave ✅ DONE (factual, not drift-correction)

## Verification

- No code changes; docs only
- `grep "4.5a.1" CLAUDE.md PHASE_STATUS.md WORKFLOW.md` returns only DONE/closed references in active sections
- `grep "Next deliverable.*4.5a"` returns 0 hits (all moved to 4.5b)
- `grep "9 → 11"` appears in CLAUDE.md (new defense layer count)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>

…tatement_history + late_filing_notification) (#93)

Phase 4.5 manipulation-defense cluster sub-PR 4 of 6 per PR #86. Adds two ANNOTATE-only flags surfaced from the SEC EDGAR filing list.

## What's new

### `restatement_history` annotate (5y lookback)

- Module: `compute/scoring/restatement_filings.py` (~390 LOC + ~290 test LOC).
- Counts 10-K/A + 10-Q/A filings per CIK in the trailing 5 years.
- Paper: Hennes-Leone-Miller 2008 *TAR* — restating firms see -9% abnormal return on announcement; recurrent restaters compound.
- Lookback: `config.RESTATEMENT_HISTORY_LOOKBACK_DAYS = 1825` (5×365).
- ANNOTATE-only — base rate sector-agnostic, no veto without sector adjustment (which 4.5b doesn't include).

### `late_filing_notification` annotate (1y lookback)

- Same module.
- Detects SEC Form 12b-25 (NT 10-K / NT 10-Q) in the trailing 365d.
- Paper: Bartov-Lai-Yeung 2002 *JAR* — late filers see -5-7% abnormal returns.
- Lookback: `config.LATE_FILING_LOOKBACK_DAYS = 365`.

## Architecture (mirrors `compute.scoring.eight_k_events`)

- Per-CIK JSON cache (7d TTL) at `compute/cache/edgar_amendments/` and `compute/cache/edgar_late_filings/`.
- Cache shape: `{fetched_at, lookback_days, filings: [{accession, form, filing_date, filing_url}]}`.
- Fetch path: `edgar.Company(ticker).get_filings(form=...)` with per-form retry; merges results across multiple forms client-side + sorts desc by filing_date so `filings[0]` is the latest.
- Public entry points: `check_restatement_history(ticker, ..., filings_override=...)` and `check_late_filing(ticker, ..., filings_override=...)`. The override path is the test inject — bypasses EDGAR, keeps unit tests offline.

## Files touched

| File | Change |
|---|---|
| `compute/scoring/restatement_filings.py` | NEW. Cache + fetch + check_restatement_history + check_late_filing. |
| `compute/config.py` | + `RESTATEMENT_HISTORY_LOOKBACK_DAYS=1825` + `LATE_FILING_LOOKBACK_DAYS=365` + `EDGAR_AMENDMENTS_CACHE_DIR` + `EDGAR_LATE_FILINGS_CACHE_DIR`. |
| `compute/main.py` | + import `check_late_filing` + `check_restatement_history`. Per-ticker Step-8 loop appends `restatement_history` / `late_filing_notification` to `valuation_warnings` when the check fires. Slots immediately after the existing PR 4b §1 `cross_source_disagreement` block. |
| `.github/workflows/compute-rankings.yml` | + 2 new cache paths (`edgar_amendments`, `edgar_late_filings`) so weekly runs preserve the per-ticker JSON files. |
| `tests/test_scoring/test_restatement_filings.py` | NEW. 17 tests covering `_filing_date_within` boundaries, both check_* entry points (no filings, within window, outside window, multiple within window, lookback constants, fetch-failure graceful path). All offline via `filings_override`. |
| `tests/test_workflow_cache_coverage.py` | + 2 new parametrized cache-path assertions for the new directories (catches future workflow YAML drift). |

## Defense layer end-state (after this PR ships)

- Active vetoes: 7 (unchanged — 4.5b is annotate-only)
- Annotate flags: 5 → **7** (added `restatement_history`, `late_filing_notification`)
- Reason taxonomy: 29 → **31** stable identifiers

No schema delta — new flag IDs are strings in existing `valuation_warnings: list[str]`.

## Backward compat

- `filings_override` arg is opt-in (None default → fetches via EDGAR). Existing callers without it unchanged.
- `EDGAR_USER_AGENT` env var precondition matches the rest of the Tier-2 layer — fetcher returns None when unset (cleanly skipping the flag rather than crashing).
- Caches gitignored under `compute/cache/`.

## Verification ladder

- ✅ `ruff check .` — clean
- ✅ `pytest tests/ -m "not network"` — **803 passed** (was 784; +17 new restatement tests + 2 new workflow-cache parametrize entries)
- ⏳ Production verification deferred to next workflow_dispatch.

Expected fire rates on S&P 500 (rough — needs production run to confirm):

- `restatement_history` — 30-80 tickers (~6-16%) based on historical 10-K/A base rates 2020-2025
- `late_filing_notification` — 5-20 tickers (~1-4%) based on SEC Form 12b-25 filing data

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** complete (PRs #89/#90/#91 + #92 docs)
- **4.5b (this PR)** — disclosure-driven catches
- ⬜ **4.5c** — Roychowdhury REM 3-proxy
- ⬜ **4.5d** — M-score 3y momentum + Burgstahler-Dichev kink
- ⬜ **4.5e** — Form 4 insider clustering
- ⬜ **4.5f** — `manipulation_index` composite + UI + schema bump → v1.2.0

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
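
The two lookback checks above reduce to filtering a filing list by form type and date window. The sketch below shows that idea against the cache shape quoted in the Architecture section. It is not the code in `restatement_filings.py`: the helper name `filings_within`, the ISO `YYYY-MM-DD` date format, and the inclusive window boundaries are assumptions made for illustration.

```python
from datetime import date, timedelta

# Lookback constants as quoted in the PR text.
RESTATEMENT_HISTORY_LOOKBACK_DAYS = 1825  # 5 x 365
LATE_FILING_LOOKBACK_DAYS = 365

AMENDMENT_FORMS = {"10-K/A", "10-Q/A"}
LATE_FORMS = {"NT 10-K", "NT 10-Q"}


def filings_within(filings, forms, lookback_days, today):
    """Return filings of the given forms dated inside the window, newest first.

    Each filing is a dict with at least `form` and `filing_date` keys.
    Assumes ISO dates and a window that includes both end points.
    """
    earliest = today - timedelta(days=lookback_days)
    hits = [
        f for f in filings
        if f["form"] in forms
        and earliest <= date.fromisoformat(f["filing_date"]) <= today
    ]
    return sorted(hits, key=lambda f: f["filing_date"], reverse=True)


# Hypothetical filing list, standing in for what `filings_override` would inject.
sample = [
    {"form": "10-K/A", "filing_date": "2024-03-01"},
    {"form": "10-Q/A", "filing_date": "2019-01-15"},   # outside the 5y window
    {"form": "NT 10-K", "filing_date": "2026-02-20"},
    {"form": "10-K", "filing_date": "2026-02-25"},      # not an amendment
]
as_of = date(2026, 5, 16)
restatements = filings_within(sample, AMENDMENT_FORMS, RESTATEMENT_HISTORY_LOOKBACK_DAYS, as_of)
late = filings_within(sample, LATE_FORMS, LATE_FILING_LOOKBACK_DAYS, as_of)
```

Passing `today` explicitly, rather than reading the clock inside the helper, is what keeps boundary tests deterministic and offline.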

#94)

Phase 4.5b disclosure-driven catches shipped via PR #93. Production verified on run #48 (commit `849b7ca8`, workflow 2h08m due to cold-cache populating both new `edgar_amendments` + `edgar_late_filings` dirs; warm runs return to ~1h30m).

## What shipped

| Flag | Lookback | Production fire | Notes |
|---|---|---|---|
| `restatement_history` | 5y 10-K/A + 10-Q/A | **60 / 502 (12.0%)** | within expected 6-16% — AMD, DIS, CVX, BSX, EBAY etc. (mostly mature firms with periodic amendments) |
| `late_filing_notification` | 365d Form 12b-25 | **2 / 502 (0.4%)** | HAS + Q — slightly under expected 1-4% (S&P 500 firms tend to be more compliant than broader Bartov-Lai-Yeung sample) |

## End-state defense layer

- Active vetoes: **7** (unchanged — 4.5b is annotate-only)
- Annotate flags: 5 → **7** (+ `restatement_history`, `late_filing_notification`)
- Reason taxonomy: 29 → **31**
- **Defense layer 9 → 13 layers after 4.5a + 4.5b**

No schema delta — both new flags are strings in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | "Next deliverable" reframed from 4.5b to 4.5c (Roychowdhury REM). Defense layer count "9 → 11 after 4.5a" → "9 → 13 after 4.5a + 4.5b". 4.5b results summary appended. |
| `PHASE_STATUS.md` | Phase 4.5 table row updated with 4.5b results + tickers + 60/2 counts. §4.5b header flipped to ✅ DONE 2026-05-16 with results table + workflow time note (cold-cache 2h08m). Original plan text preserved below for audit. |
| `WORKFLOW.md` | §Tasks §4.5b — all 4 checkboxes [ ] → [x] with PR-number + LOC + test-count + production-verification citations. SEC Filing Roadmap table: 4 new rows for 10-K/A, 10-Q/A, NT 10-K, NT 10-Q (all ✅ active with PR #93 / 2026-05-16 production-fire-rate footnotes). Form 4 status flipped from "❌ not used" to "⬜ planned (Phase 4.5e)" to reflect the upcoming sub-PR. |

## Next deliverable

**Phase 4.5c — Real Earnings Management (Roychowdhury 2006 REM)**:

- 3 abnormal proxies per ticker:
  - `abnormal_CFO` = actual − model(Sales, ΔSales)
  - `abnormal_production` = actual − model(Sales, ΔSales, ΔSales_t−1)
  - `abnormal_discretionary_expenses` = actual − model(Sales_t−1)
- Flag `rem_suspect` fires when 2 of 3 proxies sit in worst decile within sector
- ~250 LOC + golden tests against Roychowdhury 2006 paper Table 6
- Catches REAL manipulation (cutting R&D, channel stuffing, deferring maintenance) — invisible to Sloan/Beneish/Dechow which target accrual manipulation

## Audit trail (post-v1.0 doc PRs)

| PR | Purpose |
|---|---|
| #81 | 4g ✅ DONE |
| #86 | Phase 4.5 roadmap added |
| #87 | "PR 4b next" → "§3 polish next" (was wrong) |
| #88 | "§3 polish next" → "Phase-5 blocked" (was wrong) |
| #92 | 4.5a wave ✅ DONE |
| **this PR** | 4.5b wave ✅ DONE |

## Verification

- No code changes; docs only
- `grep "Next deliverable.*4.5b"` returns 0 hits (all moved to 4.5c)
- `grep "9 → 13"` appears in CLAUDE.md (new defense layer count)
- `grep "10-K/A.*✅ active"` returns the new WORKFLOW.md filing-roadmap row

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>

…em_suspect annotate) (#95)

Phase 4.5 manipulation-defense cluster sub-PR 5 of 6 per PR #86. Catches REAL manipulation (cutting R&D, channel stuffing, deferring maintenance, overproduction) — invisible to the existing accrual-targeting defenses (Sloan / Beneish / Dechow).

## Model — Roychowdhury 2006 *JAE* 3-proxy

Three abnormal residuals from per-sector OLS regressions:

1. **Abnormal CFO** — residual of `CFO_t / A_{t-1}` on `[1, 1/A_{t-1}, Sales_t/A_{t-1}, ΔSales_t/A_{t-1}]`. **Low (negative)** = suspicious → firm front-loaded sales via channel stuffing / loose credit / discounts to inflate reported sales and earnings, which leaves CFO abnormally low for the level of sales.
2. **Abnormal Production** — residual of `(COGS_t + ΔInventory_t) / A_{t-1}` on `[1, 1/A_{t-1}, Sales_t, ΔSales_t, ΔSales_{t-1}]` (all over A_{t-1}). **High** = suspicious → overproduction spreads fixed costs over more units, deflating per-unit COGS and inflating gross margin.
3. **Abnormal Discretionary Expenses** — residual of `(R&D_t + SGA_t) / A_{t-1}` on `[1, 1/A_{t-1}, Sales_{t-1}/A_{t-1}]`. **Low (negative)** = suspicious → firm cut discretionary spending to boost current earnings. (Advertising omitted — SEC XBRL rarely tags it separately; per Roychowdhury 2006 footnote 7 the SGA-only adaptation is acceptable since advertising is usually subsumed in SGA.)

Flag `rem_suspect` fires when **≥ 2 of 3** residuals sit in their respective worst decile within the ticker's GICS sector. Mirrors the 4.5a.3 `manipulation_triple_flag` pattern but uses *real* (not accrual) signals.

## Architecture

| File | Change |
|---|---|
| `compute/scoring/rem.py` | **NEW** — `REMProxies` + `REMResult` dataclasses; `compute_proxies` (per-ticker input vector from snap + history); `_fit_sector_models` (per-sector OLS via `numpy.linalg.lstsq`); `compute_rem_flags` (two-pass: proxies → sector models → residuals → within-sector decile rank → fire). ~420 LOC. |
| `compute/main.py` | Pre-compute `rem_results` once via `compute_rem_flags(snapshots, histories=histories, sectors=sectors_dict)` right after `compute_risk_flags`. Per-ticker Step-8 loop appends `rem_suspect` to `valuation_warnings` when `rem_result.fired`. |
| `tests/test_scoring/test_rem.py` | **NEW** — 14 tests in three layers: (1) proxy construction (5 tests covering well-formed, missing snap, missing assets denominator, R&D fallback to SGA-only, inventory-missing PROD skip), (2) end-to-end `compute_rem_flags` (8 tests: empty, below floor, at floor, double-outlier fires, single-outlier triggers cfo axis, triple-outlier 3-trigger, normal-ticker H0 FP rate, constants), (3) **golden numerical test** verifying OLS recovers known-DGP coefficients. |

## No new dependencies

- `numpy.linalg.lstsq` for OLS (already in dep tree)
- No `sklearn`, no `statsmodels` — pure-numpy reimplementation keeps install surface tight (mirrors PR 4b §2 PBO/DSR decision)

## Defense-layer end-state (after this PR ships)

- Active vetoes: 7 (unchanged — 4.5c is annotate-only)
- Annotate flags: 7 → **8** (+ `rem_suspect`)
- Reason taxonomy: 31 → **32** stable identifiers

No schema delta — `rem_suspect` is a string in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Backward compat

- `compute_rem_flags(snapshots, histories=None, sectors=None)` — both kwargs optional. When sectors absent, no sector model can fit (every ticker's sector lookup returns None); all results have `fired=False`.
- Sectors below `REM_MIN_POPULATION_SECTOR = 15` (matches 4.5a.1 Sloan sector-relative floor) skip REM cleanly — those tickers get `REMResult(None, None, None, fired=False)`. No active-veto fallback (REM is annotate-only).
- DISEXP falls back to SGA-only when R&D is missing (financials / REITs / utilities) per Roychowdhury 2006 footnote 7.

## Verification ladder

- ✅ `ruff check .` — clean
- ✅ `pytest tests/test_scoring/test_rem.py` — **14 passed**
- ✅ `pytest tests/ -m "not network"` — **817 passed** (was 803; +14 new REM tests)
- ✅ schema_check — N/A (no schema delta)
- ⏳ Production verification deferred to next workflow_dispatch. Expected fire rate: 5-7% (~25-35 of 502 S&P 500 tickers) assuming moderate axis correlation. H0 (independent axes) FP rate is 2.8% per the 2-of-3 joint-probability calc.

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** complete (PRs #89 / #90 / #91 + #92 docs)
- ✅ **4.5b** complete (PR #93 + #94 docs)
- **4.5c (this PR)** — Roychowdhury REM
- ⬜ **4.5d** — M-score 3y momentum + Burgstahler-Dichev kink
- ⬜ **4.5e** — Form 4 insider clustering
- ⬜ **4.5f** — `manipulation_index` composite + UI + schema bump → v1.2.0

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
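
As a rough illustration of the pipeline described above (OLS residuals per proxy, then a worst-decile test, then the 2-of-3 gate), the sketch below fits the abnormal-CFO regression for a single sector with `numpy.linalg.lstsq`. It is not the code in `compute/scoring/rem.py`: the function names, the array-based inputs, and the use of `np.quantile` for the decile cut are assumptions.

```python
import numpy as np


def abnormal_cfo_residuals(cfo, sales, d_sales, lagged_assets):
    """Residuals of CFO_t/A_{t-1} on [1, 1/A_{t-1}, Sales_t/A_{t-1}, dSales_t/A_{t-1}].

    All inputs are equal-length sequences covering one sector.
    Returns (residuals, fitted coefficients).
    """
    a = np.asarray(lagged_assets, dtype=float)
    y = np.asarray(cfo, dtype=float) / a
    X = np.column_stack([
        np.ones_like(a),
        1.0 / a,
        np.asarray(sales, dtype=float) / a,
        np.asarray(d_sales, dtype=float) / a,
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta, beta


def worst_decile(residuals, low_is_bad):
    """Boolean mask of residuals in the suspicious tail (assumed quantile cut)."""
    r = np.asarray(residuals, dtype=float)
    if low_is_bad:
        return r <= np.quantile(r, 0.10)
    return r >= np.quantile(r, 0.90)


def rem_fired(abn_cfo, abn_prod, abn_disexp):
    """2-of-3 gate: low CFO, high production, low discretionary expenses."""
    hits = (
        worst_decile(abn_cfo, low_is_bad=True).astype(int)
        + worst_decile(abn_prod, low_is_bad=False).astype(int)
        + worst_decile(abn_disexp, low_is_bad=True).astype(int)
    )
    return hits >= 2
```

The production and discretionary-expense residuals would come from the same pattern with their own regressor columns. A known-DGP check in the spirit of the golden test is to generate `cfo` from chosen coefficients with no noise and confirm the fit returns them.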

#96)

Phase 4.5c Roychowdhury REM shipped via PR #95. Production verified on run #49 (commit `65097703`, warm-cache 6m25s — all 9 cache layers populated).

## What shipped

`rem_suspect` annotate via per-sector OLS regressions on 3 abnormal proxies (CFO, Production, Discretionary Expenses). Module `compute/scoring/rem.py` (~420 LOC, pure-numpy via `np.linalg.lstsq`, no sklearn/statsmodels dep). 14 offline tests including golden numerical test recovering known-DGP coefficients.

## Production verification

| Metric | Value |
|---|---|
| Fire rate | **16 / 502 (3.2%)** — within H0-to-correlation expected 2.8-7% |
| Tickers fired | SMCI · WAT · ADM · TSN · HRL · STLD · FSLR · JBL · COHR · LII · LDOS · POOL · OMC · WY · TECH · RVTY |
| Orthogonality check | NVDA / PLTR (Beneish-veto fired) **NOT** in REM list — confirms 4.5c captures real-manipulation signal orthogonal to accrual targets |
| Real-world coverage | ADM (2024 SEC investigation) · SMCI (2024 investigation) · TSN / HRL (periodic scrutiny) · FSLR (solar channel-stuffing history) |

## End-state defense layer

- Active vetoes: **7** (unchanged — 4.5c is annotate-only)
- Annotate flags: 7 → **8** (+ `rem_suspect`)
- Reason taxonomy: 31 → **32**
- **Defense layer 9 → 14 after 4.5a + 4.5b + 4.5c**

No schema delta — `rem_suspect` is a string in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | "Next deliverable" 4.5c → 4.5d. Defense layer "9 → 13 after 4.5a+4.5b" → "9 → 14 after 4.5a+4.5b+4.5c". 4.5c results + ticker list + orthogonality note inserted between 4.5b and the post-completion roadmap. |
| `PHASE_STATUS.md` | Phase 4.5 row updated with 4.5c production stats. §4.5c header flipped to ✅ DONE 2026-05-17 with results table + orthogonality note. Original plan text preserved below for audit. |
| `WORKFLOW.md` | §4.5c checkboxes [ ] → [x] with PR-number / LOC / test-count / production-verification citations + golden-test reference. |

## Next deliverable

**Phase 4.5d — earnings-quality time-series + Burgstahler-Dichev kink at zero** (~180 LOC, ~7 days):

- `m_score_deteriorating` annotate — Δ(Beneish M-score) > +0.5 over trailing 3y (manipulation gathering steam)
- `loss_avoidance_pattern` annotate — NI ∈ [0, $5M] OR EPS ∈ [0, $0.05] for 3+ consecutive years (Burgstahler-Dichev 1997 kink)

## Audit trail (post-v1.0 doc PRs)

| PR | Purpose |
|---|---|
| #81 | 4g ✅ DONE |
| #86 | Phase 4.5 roadmap added |
| #87 | "PR 4b next" → "§3 polish next" (was wrong) |
| #88 | "§3 polish next" → "Phase-5 blocked" (was wrong) |
| #92 | 4.5a wave ✅ DONE |
| #94 | 4.5b wave ✅ DONE |
| **this PR** | 4.5c wave ✅ DONE |

## Verification

- No code changes; docs only
- `grep "Next deliverable.*4.5c"` returns 0 hits (all moved to 4.5d)
- `grep "9 → 14"` appears in CLAUDE.md (new defense layer count)
- `grep "rem_suspect"` appears in PHASE_STATUS.md + WORKFLOW.md active-flags references

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
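
The 2.8% H0 figure quoted for `rem_suspect` is a binomial tail: if each of the three residuals independently has a 10% chance of landing in its worst decile, the chance that at least two do is 3 × 0.1² × 0.9 + 0.1³ = 0.028. The short sketch below reproduces that arithmetic. The helper name `prob_at_least` is hypothetical, and the independence and exact-10% assumptions are the H0 idealisation, not a claim about the real residuals.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(at least k successes in n independent trials with success probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


h0_rate = prob_at_least(2, 3, 0.10)
expected_h0_count = h0_rate * 502  # universe size from the production run

print(round(h0_rate, 4))            # 0.028
print(round(expected_h0_count, 1))  # 14.1
```

Against that baseline of about 14 names, the 16 observed fires sit just above the independent-axes floor and well below the 5-7% band that was projected for moderately correlated axes.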

…als_momentum_high + loss_avoidance_pattern) (#97)

Phase 4.5 manipulation-defense cluster sub-PR 6 of 6 (the last purely-defense sub before 4.5f composite + UI bundling). Two annotate-only flags derived from the per-ticker fundamentals history (annual XBRL).

## What's new

### `accruals_momentum_high` — Δ(TATA) over 3y > +0.05

- TATA = (NetIncome − OperatingCashFlow) / TotalAssets, the Sloan 1996 / Beneish 1999 accruals backbone.
- Threshold +0.05 ≈ Beneish 1999 ΔM > +0.5 via the β_TATA = 4.679 coefficient (ΔM ≈ 4.679 × ΔTATA → ΔM > 0.5 ⇔ ΔTATA > 0.107). We use 0.05 — more sensitive since TATA alone captures less than the full 8-ratio signal; standard practitioner adaptation when shortening to one ratio.
- Catches manipulation **gathering steam** — the snapshot-only Sloan + Beneish flags miss the trajectory entirely.

**Practical note on naming**: PR #86 plan §4.5d called this `m_score_deteriorating` (full Δ(Beneish M-score) > +0.5). We chose TATA momentum as a practical equivalent: building 3 historical 8-ratio Beneish snapshots from XBRL history would require expanding the annual-history coverage of 6+ supplementary ratios (DSRI / GMI / AQI / etc.) that often have gaps for prior years. TATA is the single Beneish component that's a level rather than a ratio-of-ratios, and Sloan 1996 established it as the standalone accruals signal — so this is a clean shortening, not a weakening.

### `loss_avoidance_pattern` — Burgstahler-Dichev 1997 kink at zero

- Fires when **3+ consecutive fiscal years** of tiny-positive earnings: NI ∈ [\$0, \$5M] **OR** EPS ∈ [\$0.00, \$0.05].
- Per-share band catches the high-share-count case where NI alone is above the absolute floor but per-share is still tiny.
- Empirical kink-at-zero signature of managers shading reported earnings just enough to clear the loss / loss-threshold.

## Architecture

| File | Change |
|---|---|
| `compute/scoring/earnings_quality.py` | **NEW** ~250 LOC — `check_accruals_momentum` + `check_loss_avoidance` + history-walk helpers (`_annual_values`, `_value_at_year`). Pure pandas; no new deps. |
| `compute/main.py` | + 2 import lines + 2 per-ticker annotate appends in the Step-8 loop, slotting after `rem_suspect`. |
| `tests/test_scoring/test_earnings_quality.py` | **NEW** ~225 LOC — 14 offline tests covering both flags (fires / doesn't fire / improves / threshold pins / EPS-band fallback / negative-NI rejection / large-NI rejection / multi-year streak / streak break / constants sanity). |

## Defense-layer end-state (after this PR ships)

- Active vetoes: **7** (unchanged — 4.5d is annotate-only)
- Annotate flags: 8 → **10** (+ `accruals_momentum_high`, `loss_avoidance_pattern`)
- Reason taxonomy: 32 → **34** stable identifiers
- Total defense layers: **9 → 16** after 4.5a + 4.5b + 4.5c + 4.5d

No schema delta — both flags are strings in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Backward compat

- Both check functions take `(snap, history)` — no caller changes elsewhere. Missing inputs (snap=None, no history, insufficient years) cleanly return fired=False.
- No new EDGAR fetches — both flags read from existing fundamentals + fundamentals_history caches.

## Verification ladder

- ✅ `ruff check .` — clean
- ✅ `pytest tests/test_scoring/test_earnings_quality.py` — **14 passed**
- ✅ `pytest tests/ -m "not network"` — **831 passed** (was 817; +14 new)
- ✅ schema_check — N/A (no schema delta)
- ⏳ Production verification deferred.

Expected fire rates on S&P 500:

- `accruals_momentum_high` ~3-8% (~15-40 tickers) — H0 from Δ(TATA) > 0.05 base rate
- `loss_avoidance_pattern` ~1-3% (~5-15 tickers) — S&P 500 firms rarely report tiny-positive earnings for 3+ years (mega-cap distribution); base rate higher on small-caps per Burgstahler-Dichev 1997 original sample

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** (PRs #89 / #90 / #91 + #92 docs)
- ✅ **4.5b** (PR #93 + #94 docs)
- ✅ **4.5c** (PR #95 + #96 docs)
- **4.5d (this PR)** — earnings-quality time-series
- ⬜ **4.5e** — Form 4 insider clustering (~420 LOC, ~12 days — needs new SEC Form 4 parser)
- ⬜ **4.5f** — `manipulation_index` composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump → **v1.2.0-phase4.5**

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
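
Both 4.5d flags are simple enough to sketch in a few lines. The version below is an illustration only, not the contents of `earnings_quality.py`: the function signatures, the tuple-based history, and the choice to fire on any qualifying run in the supplied history (rather than only the most recent years) are assumptions. The thresholds and the β_TATA coefficient are the ones quoted in the PR text.

```python
BETA_TATA = 4.679                   # Beneish 1999 TATA coefficient, as quoted above
ACCRUALS_MOMENTUM_THRESHOLD = 0.05  # delta-TATA threshold used by the flag
NI_BAND = (0.0, 5_000_000.0)        # net income band in dollars
EPS_BAND = (0.0, 0.05)              # EPS band in dollars per share
MIN_STREAK = 3


def tata(net_income, operating_cash_flow, total_assets):
    """Total accruals to total assets."""
    return (net_income - operating_cash_flow) / total_assets


def accruals_momentum_high(tata_now, tata_3y_ago, threshold=ACCRUALS_MOMENTUM_THRESHOLD):
    """True when TATA rose by more than `threshold` over the 3-year span."""
    return (tata_now - tata_3y_ago) > threshold


def loss_avoidance_pattern(years, min_streak=MIN_STREAK):
    """`years` is a chronological list of (net_income, eps) pairs; either may be None.

    Assumption: fires on any run of `min_streak` consecutive tiny-positive years.
    """
    streak = 0
    for ni, eps in years:
        tiny_ni = ni is not None and NI_BAND[0] <= ni <= NI_BAND[1]
        tiny_eps = eps is not None and EPS_BAND[0] <= eps <= EPS_BAND[1]
        streak = streak + 1 if (tiny_ni or tiny_eps) else 0
        if streak >= min_streak:
            return True
    return False


# Threshold arithmetic from the PR text: the delta-TATA implied by delta-M > 0.5.
implied_delta_tata = 0.5 / BETA_TATA
```

`implied_delta_tata` comes out near 0.107, so the 0.05 threshold fires at roughly half the TATA movement that a full ΔM > 0.5 rule would need from this component alone.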

vercel Bot deployed to Preview May 16, 2026 09:13 View deployment

dackclup marked this pull request as ready for review May 16, 2026 12:25

dackclup merged commit c59b8f5 into main May 16, 2026
4 checks passed

dackclup deleted the docs/phase4.5-manipulation-roadmap branch May 16, 2026 12:25

dackclup mentioned this pull request May 16, 2026

docs(phase4): correct PR 4b status — §1+§2 already shipped in PR #60; only §3 polish remains #87

Merged

4 tasks

This was referenced May 16, 2026

docs(phase4): defer PR 4b §3 to Phase 5 — surface 4.5a.1 + 4h as the real next tracks #88

Merged

feat(scoring): Sloan accruals top decile within-sector (PR 4.5a.1, closes issue #7) #89

Merged

This was referenced May 16, 2026

feat(scoring): Beneish soft-veto promotion (PR 4.5a.2, M > -1.78 → entered_top5 suppression) #90

Merged

feat(scoring): Dechow soft-veto + manipulation_triple_flag (PR 4.5a.3) #91

Merged

dackclup mentioned this pull request May 16, 2026

docs(phase4.5a): mark 4.5a wave ✅ DONE + bump next deliverable to 4.5b #92

Merged

5 tasks

dackclup mentioned this pull request May 16, 2026

feat(scoring): disclosure-driven manipulation defenses (PR 4.5b — restatement_history + late_filing_notification) #93

Merged

3 tasks

dackclup mentioned this pull request May 17, 2026

feat(scoring): Roychowdhury 2006 Real Earnings Management (PR 4.5c, rem_suspect annotate) #95

Merged

5 tasks

dackclup mentioned this pull request May 17, 2026

feat(scoring): earnings-quality time-series (PR 4.5d — accruals_momentum_high + loss_avoidance_pattern) #97

Merged

5 tasks

dackclup merged 1 commit into main from docs/phase4.5-manipulation-roadmap
