docs(phase4.5): add Earnings-Manipulation Defense Cluster to the v1.x roadmap #86
Merged
… roadmap

User: "I want the system's defenses to be as tight as possible; please propose a plan" → "Merge the plan into the app build plan".

Folds the manipulation-defense proposal into the existing phase-tracker triple as a new **Phase 4.5 → v1.2.0** cluster inserted between Phase 4 (factor consolidation → v1.1.0) and Phase 5 (ML meta-learner).

## Why this is its own phase, not folded into Phase 4

- **Phase 4 = factor focus** (OSAP / JKP / Qlib / IPCA). Ships v1.1.
- **Phase 4.5 = manipulation focus** (Sloan/Beneish/Dechow/REM/insider). Touches disjoint code paths.
- Splitting keeps release themes clean and lets v1.1 ship sooner.
- Phase 4.5 sub-PRs **can run in parallel with Phase 4 factor integrations** (4h/4i/4j/4k) since they touch different layers and share the PR 4b PBO/DSR validation harness.

## Phase 4.5 sub-PRs (6 total, ~10-11 weeks full-time)

| Sub-PR | New defenses | Effort |
|---|---|---|
| 4.5a (3 sub-PRs in parallel) | Sector-relative Sloan + Beneish/Dechow soft-veto + `manipulation_triple_flag` joint badge | ~180 LOC, 1-2w |
| 4.5b | `restatement_history` (Hennes-Leone-Miller 2008) + `late_filing_notification` Form 12b-25 (Bartov-Lai-Yeung 2002) | ~270 LOC, 1w |
| 4.5c | Roychowdhury REM 3-proxy (`abnormal_CFO` + `abnormal_production` + `abnormal_discretionary_exp`) | ~250 LOC, 2w |
| 4.5d | `m_score_deteriorating` 3y momentum + Burgstahler-Dichev kink at zero | ~180 LOC, 2w |
| 4.5e | New SEC Form 4 parser + `insider_sell_cluster` (Cohen-Malloy-Pomorski 2012) + `c_suite_unusual_sell` | ~420 LOC, 3w |
| 4.5f | `manipulation_index` 0-100 composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump 0.7.x → 0.8.0-phase4.5f | ~250 LOC, 1w |

**Defense layer after 4.5**: 5 → 7 active vetoes; 4 → 11 annotates; 9 → **18 total layers** (verifiable via defense-scorecard skill).

## Validation cohort (used per sub-PR)

- SEC AAER list 2000-2024 (~600 confirmed manipulators per Dechow et al. 2011 dataset + ongoing). Public, free.
- Audit Analytics restatement subset (~1,200 firms 2000-2024) as second-source.
- PBO ≤ 0.5 AND DSR > 0 gate per addition (Bailey-de Prado-Zhu 2014 CSCV harness, already in PR 4b §2 scope per issue #75).
- Purged + embargoed walk-forward CV (López de Prado 2018).

## Doc lockstep (per phase-status-bump skill)

| File | Change |
|---|---|
| `PHASE_STATUS.md` | New row in Phase Overview table + full "Phase 4.5 plan" section with all 6 sub-PRs + acceptance criteria + sequencing |
| `WORKFLOW.md` | New row in Phase Overview header table + full "PHASE 4.5 — Earnings-Manipulation Defense Cluster" task section with checkboxes per sub-PR + Defense Roadmap table extended with 4.5a-4.5f rows + updated calendar totals for v1.1 / v1.2 / v2.0 |
| `CLAUDE.md` | §Phase status updated: Phase 4.5 named as the next deliverable after PR 4b → v1.1.0; defense-layer count delta (9 → 18) called out; issue #7 noted as folded into 4.5a |

## Not touched

- `SKILL.md`: no current-state constant change yet (schema / veto count / rule additions land per sub-PR as they ship, not at plan time). Rule 16 (annotate-and-veto-Top-N) covers 4.5a's defense pattern; a new rule is added only if the `manipulation_index` composite penalty becomes a Phase 5+ adopted pattern.
- Code (`compute/`, `frontend/`, `tests/`): plan only.
- `.claude/skills/phase-4.5/`: sub-skill PLAN.md stubs land per sub-PR at the start of 4.5a kickoff.

## Sequencing reminder

1. PR 4b (defense-infrastructure): MUST land first
2. v1.1.0-phase4: 4h/4i/4j/4k factor integrations
3. v1.2.0-phase4.5: 4.5a → 4.5b + 4.5c (parallel) → 4.5d → 4.5e → 4.5f
4. (Factor 4h/4i/4j/4k can also overlap 4.5: disjoint files)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
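The per-addition gate in the validation cohort (PBO ≤ 0.5 AND DSR > 0) can be illustrated with a minimal sketch. The function name `passes_validation_gate`, the `GateResult` type, and the constant names are hypothetical; they are not the repository's `factor_passes_gates()` entry point, whose signature is not shown here.

```python
import math
from dataclasses import dataclass

# Gate limits as stated in the plan; the constant names are illustrative.
PBO_MAX = 0.5  # probability of backtest overfitting must be <= 0.5
DSR_MIN = 0.0  # deflated Sharpe ratio must be strictly > 0


@dataclass(frozen=True)
class GateResult:
    passed: bool
    reason: str


def passes_validation_gate(pbo: float, dsr: float) -> GateResult:
    """Return whether a candidate defense clears both gates (hypothetical helper)."""
    # Treat missing or NaN statistics as a failure rather than a silent pass.
    if math.isnan(pbo) or math.isnan(dsr):
        return GateResult(False, "missing statistic")
    if pbo > PBO_MAX:
        return GateResult(False, f"PBO {pbo:.2f} exceeds {PBO_MAX}")
    if dsr <= DSR_MIN:
        return GateResult(False, f"DSR {dsr:.2f} is not above {DSR_MIN}")
    return GateResult(True, "ok")
```

Both conditions must hold, so a strong DSR cannot compensate for a PBO above the limit.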
dackclup added a commit that referenced this pull request on May 16, 2026

… only §3 polish remains (#87)

During run #45 verification (defense-scorecard skill) the `cross_source_disagreement` flag surfaced 23 stocks in production, which is evidence that PR 4b §1 had already shipped. A `git log` confirms **PR #60 ("feat(defense): cross-source validator + PBO/DSR + IC-decay infra (PR 4b)") merged on 2026-05-14**, before v1.0.0. Issue #75 was filed on 2026-05-15 (one day after #60 merged) to track the remaining acceptance criteria, but PR #81 and PR #86 mistakenly treated "PR 4b" as not yet started at all. This commit re-aligns the triple-doc with the actual state.

## What PR #60 actually shipped (already in production)

| Sub-section | Status | Evidence |
|---|---|---|
| **§1 Cross-source validator** | ✅ DONE | `compute/ingest/cross_source.py` exists; wired in `compute/main.py:979-988`; run #45 shows 23/502 stocks (4.6%) flagging `cross_source_disagreement` annotate |
| **§2 PBO/DSR library** | ✅ DONE | `compute/validation/pbo_dsr.py` exists with CSCV + DSR + Beasley-Springer-Moro inverse normal CDF; `factor_passes_gates()` entry point ready |
| **§3 IC-decay monitor** | 🟡 PARTIAL | `compute/validation/ic_decay.py` exists; **`decay_report.json` writer NOT wired** + no UI transparency surface, which are exactly the 2 unchecked acceptance criteria on issue #75 |

## Real "next deliverable": PR 4b §3 polish (~2-3 days)

1. **Writer wiring**: call `ic_decay.run()` from `compute/main.py` after pillar normalization; write the per-pillar decay table to `frontend/public/data/decay_report.json` via a new writer in `compute/output/writer.py` (atomic temp → rename pattern).
2. **UI transparency surface**: new `DecayReportCard.tsx` on the stock detail page below `PillarRadarChart`. Reads `decay_report.json` client-side (fail-soft if absent), shows 8-pillar 12m + 36m IC trend + decay-alert badges per pillar.

Effort: ~150 LOC writer + UI + ~80 LOC tests.

After §3 polish ships: → 4h/4i/4j/4k factor integrations (each gated by the now-complete PBO/DSR harness) → tag v1.1.0-phase4.

## File-by-file changes

| File | Change |
|---|---|
| `CLAUDE.md` | §Phase status: "Next deliverable" reframed as PR 4b §3 polish only. Production verification numbers cited (23 stocks, 4.6%). Issue #75 description updated to "remaining items: writer + UI surface". |
| `PHASE_STATUS.md` | Phase 4 table row reflects PR 4b §1+§2 merged via PR #60. The long "Next deliverable: PR 4b — defense-infrastructure" block is replaced by a 3-row sub-section status table (§1 ✅ / §2 ✅ / §3 🟡) + a tighter "PR 4b §3 polish" next-deliverable scope description. |
| `WORKFLOW.md` | Phase 4 Acceptance Criteria: the cross-source and PBO-DSR checkboxes (§1+§2) are flipped from `[ ]` to `[x]` with footnotes on PR #60 + run #45 evidence; the IC-decay checkbox (§3) stays `[ ]` with the remaining-items breakdown. |

## Doc audit trail

This is the second triple-doc fix this week:

- PR #81: marked 4g ✅ DONE (correct)
- PR #86: added Phase 4.5 roadmap + named PR 4b as next (WRONG; this commit corrects it)
- This PR: PR 4b §1+§2 ✅ DONE; §3 polish is the real next

The cause of the PR #81/#86 error: PR #60, merged 2026-05-14, used the literal title "PR 4b", but the `PHASE_STATUS.md` PR-label slot "4b" was already taken by the `_avg_3y_roe` fix (different content, same letter). Two unrelated PRs both calling themselves "PR 4b" created the confusion. Going forward we should disambiguate by always referring to it as "PR 4b defense-infrastructure" or "issue #75" when meaning the cross-source/PBO/IC-decay work.

## Verification

- No code changes; docs only
- `grep "PR 4b defense-infrastructure (issue #75) next"` returns 0 hits
- `grep "PR 4b §3 polish"` appears in CLAUDE.md + PHASE_STATUS.md (the new wording)
- WORKFLOW.md Phase 4 acceptance checklist: §1+§2 items flipped to [x] with evidence

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
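The "atomic temp → rename pattern" named for the `decay_report.json` writer can be sketched as below. This is an illustration under assumptions, not the contents of `compute/output/writer.py`: the function name `write_json_atomic` and the payload shape in the usage note are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(payload: dict, target: Path) -> None:
    """Write JSON so readers never observe a half-written file (hypothetical helper)."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(target.parent), prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2, sort_keys=True)
            fh.flush()
            os.fsync(fh.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, target)  # atomic swap into place
    except BaseException:
        # On any failure, remove the temp file and leave the old target untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A call such as `write_json_atomic({"quality": {"ic_12m": 0.04}}, Path("frontend/public/data/decay_report.json"))` would then either fully replace the previous report or leave it as it was, which suits a client that reads the file fail-soft.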
dackclup added a commit that referenced this pull request on May 16, 2026

…oses issue #7) (#89)

User: "I want the system's defenses to be as tight as possible" → "Merge the plan into the app build plan". Phase 4.5 manipulation-defense cluster sub-PR 1 of 6.

## Problem (issue #7)

`sloan_accruals_top_decile` uses a **cross-sectional** top decile and ignores sector. Sloan accruals = (NI − CFO) / TotalAssets measure the non-cash part of earnings, which differs structurally by sector:

- **Financials** carry high non-cash items (loan-loss provisions, fair-value adjustments); this is not manipulation.
- **REITs** carry high D&A; this is structural to the business model.
- **Tech** carries SBC plus working-capital build from growth; this is not manipulation.

The cross-sectional top decile therefore over-fires on Financials/REITs. Production run #45 confirms: **Financials 16/75 = 21.3% flagged** vs an expected ~10% (more than a 2× over-fire). IT/Comm/REITs also sit slightly above baseline; Staples/Utilities sit far below (3% / 2.8%). The sector spread is 21.3% / 2.8% = 7.6×, more than the Sloan paper implies.

## Fix

`compute_risk_flags()` computes the Sloan top-decile threshold **per sector** when `sectors` is passed in (production always passes it). Sectors with fewer than `SLOAN_MIN_POPULATION_SECTOR=15` members (e.g. future S&P 1500 sub-sectors) fall back to the cross-sectional threshold.

This mirrors the existing NSI per-sector pattern: a `sloan_thresholds_by_sector` dict is added in parallel with `nsi_thresholds_by_sector`, and the per-ticker check now prefers the sector threshold before falling back to the cross-sectional one.

## Expected production impact

| Sector | Current | After 4.5a.1 (~10%) | Δ |
|---|---|---|---|
| Financials (75) | 16 (21.3%) | ~8 (10%) | **−8** |
| IT (73) | 8 (11.0%) | ~7 (10%) | −1 |
| Comm (23) | 3 (13.0%) | ~2 (10%) | −1 |
| Cons.Disc (48) | 5 (10.4%) | ~5 (10%) | 0 |
| Health (59) | 6 (10.2%) | ~6 (10%) | 0 |
| Real Estate (31) | 3 (9.7%) | ~3 (10%) | 0 |
| Industrials (79) | 6 (7.6%) | ~8 (10%) | +2 |
| Energy (21) | 1 (4.8%) | ~2 (10%) | +1 |
| Materials (26) | 1 (3.8%) | ~3 (10%) | +2 |
| Utilities (31) | 1 (3.2%) | ~3 (10%) | +2 |
| Cons.Staples (36) | 1 (2.8%) | ~4 (10%) | +3 |

Net: **51 → ~50** flagged (roughly stable total) but the **per-sector rate stabilizes at ~10%** instead of the 7.6× spread today. The flag now means "this stock is in the worst Sloan decile **of its own sector peer group**", which is the correct earnings-quality signal per Sloan 1996.

Tier-3 defenses (Beneish + Dechow) catch the residual sector-agnostic manipulation cases that don't show up in sector-relative Sloan (4.5a.2 + 4.5a.3 will soft-veto promote them).

## Implementation

| File | Change |
|---|---|
| `compute/scoring/risk_overlay.py` | New `SLOAN_MIN_POPULATION_SECTOR = 15` constant. Module docstring updated to flag PR 4.5a.1 + issue #7 close. `compute_risk_flags()` builds `sloan_thresholds_by_sector` dict when `sectors` is supplied (mirrors NSI pattern). Per-ticker check prefers sector threshold → cross-sectional fallback → skip. |
| `tests/test_scoring/test_risk_overlay.py` | 3 new tests: `test_sloan_sector_relative_top_decile_when_sectors_supplied` (top in EACH of 2 sectors flagged; bottom in each NOT flagged), `test_sloan_sector_relative_skips_undersized_sector` (small-sector tickers fall back to cross-sectional), `test_sloan_sector_relative_floor_constant` (sanity: floor ≥ 10 and ≥ SLOAN_MIN_POPULATION). |

## Backward compat

- Existing tests that call `compute_risk_flags(snaps)` without `sectors` continue to work; the cross-sectional fallback is the same code path as v1.0.
- The `sectors` arg was already optional, and production already passes `sectors=sectors_dict` to `compute_risk_flags` (compute/main.py:830).
- No schema change. No new flag identifier: the flag name `sloan_accruals_top_decile` is unchanged (the threshold rubric is an implementation detail).

## Verification ladder

- ✅ ruff check . (clean)
- ✅ pytest tests/ -m "not network": **775 passed** (was 772)
- ✅ schema_check: not touched (no schema delta)
- ⏳ Production verification deferred to the next weekly compute; will re-run `defense-scorecard` + `verify-production-output` Section E vs the current 51-flag baseline after the next workflow_dispatch lands

## Closes / references

- Closes [issue #7](#7) (Sloan over-fire on growers + Financials)
- First sub-PR of Phase 4.5 manipulation-defense cluster per PR #86
- Defense layer count unchanged (5 active vetoes), defense **quality** improves: same number of flagged stocks but now correctly sector-distributed

## Future (not in this PR)

- 4.5a.2: Beneish soft-veto (M > −1.78 threshold)
- 4.5a.3: Dechow soft-veto + `manipulation_triple_flag` joint gate
- AAER cohort backtest via the PR 4b §2 PBO/DSR harness (deferred until cohort fixtures land in 4.5c kickoff)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
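The sector-relative threshold with a minimum-population fallback, as described in this commit, might look roughly like the sketch below. It is not the code in `compute/scoring/risk_overlay.py`: the function names, the linear-interpolated 90th percentile, and the strict `>` comparison are assumptions, and the final "skip" branch of the real per-ticker check is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

SLOAN_MIN_POPULATION_SECTOR = 15  # floor named in the commit
TOP_DECILE_QUANTILE = 0.90        # assumption: the top decile starts at the 90th percentile


def quantile(values: List[float], q: float) -> float:
    """Linear-interpolated quantile of a non-empty list."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def flag_sloan_top_decile(
    accruals: Dict[str, float],
    sectors: Optional[Dict[str, str]] = None,
) -> Set[str]:
    """Flag tickers above their sector's threshold, else the cross-sectional one."""
    cross_threshold = quantile(list(accruals.values()), TOP_DECILE_QUANTILE)

    # Build per-sector thresholds only for sectors that meet the population floor.
    by_sector: Dict[str, List[float]] = defaultdict(list)
    if sectors:
        for ticker, value in accruals.items():
            if ticker in sectors:
                by_sector[sectors[ticker]].append(value)
    sector_thresholds = {
        sector: quantile(vals, TOP_DECILE_QUANTILE)
        for sector, vals in by_sector.items()
        if len(vals) >= SLOAN_MIN_POPULATION_SECTOR
    }

    flagged: Set[str] = set()
    for ticker, value in accruals.items():
        sector = sectors.get(ticker) if sectors else None
        # Prefer the sector threshold; undersized sectors use the fallback.
        threshold = sector_thresholds.get(sector, cross_threshold)
        if value > threshold:
            flagged.add(ticker)
    return flagged
```

With two 20-ticker sectors whose accrual levels differ structurally, the cross-sectional call flags only the high-accrual sector, while the sector-aware call flags the top names in each, which is the behavior change the commit describes.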
dackclup added a commit that referenced this pull request on May 16, 2026

…tered_top5 suppression) (#90)

Phase 4.5 manipulation-defense cluster sub-PR 2 of 6 per PR #86. Promotes `beneish_high` (M > -2.22 annotate, PR 3e.1) to a **second-tier** active veto at the stricter threshold `M > -1.78`.

## Threshold rationale

Beneish 1999 *FAJ* original cutoff is M > -2.22 (Type-I ~17%, Type-II ~24%). The PR 3e.1 annotate uses this threshold as the "is this stock worth a closer look" signal (low precision, high recall, doesn't suppress Top-5).

PR 4.5a.2 adds a STRICTER cutoff M > -1.78 for the **active veto** path. Beneish 1999 paper Table 4 shows positive-predictive-value crosses ~60% at M > -1.78 on the original 74-manipulator sample; below that the precision drops into FP-heavy territory. The stricter cutoff mirrors PR 3d's `non_reliance_filing` veto trade-off: high precision, narrower recall, won't dilute Top-5 with marginal annotators.

Tickers in the -2.22 to -1.78 band keep ONLY the annotate flag (no veto). Tickers above -1.78 get BOTH the annotate AND the veto.

## Production estimate (run #46, c=737d8efe)

| Threshold | Coverage | Flagged |
|---|---|---|
| M > -2.22 (annotate, existing) | 160/502 (32% covered) | 26 (16.2%) |
| **M > -1.78 (veto, NEW)** | same 160 | **11 (6.9%)** |

New vetoes that will fire (top 11 in M-score order): SMCI · WAT · PODD · WDC · NVDA · CAT · PLTR · SNDK · BG · STX · LLY

Most are growth tech (SMCI, NVDA, PLTR, SNDK, STX) where Beneish 1999 acknowledged growers can FP. The Tier-3 forensic posture documented in PR 3e.1 is exactly the right framing: the veto is high-precision-narrow-recall; growers that show up here usually warrant a closer look even if not all are confirmed manipulators.

## Architecture

| Layer | Before | After |
|---|---|---|
| Active vetoes | 5 (altman / sloan / NSI / non_reliance / data_quality) | **6** (+ `beneish_manipulation_veto`) |
| Annotate flags | `beneish_high` at M > -2.22 (unchanged) | + nothing new |

`compute_risk_flags(beneish_m_scores=...)` is the inject pattern; it mirrors `non_reliance_by_ticker`. `compute/main.py` pre-computes all 502 Beneish results BEFORE the risk_flag pass so the veto can suppress entered_top5; the existing per-ticker loop (Step 8) reads the cached `beneish_results[ticker]` instead of recomputing (performance neutral: one compute per ticker, was already two before this refactor).

## Files changed

| File | Change |
|---|---|
| `compute/scoring/beneish.py` | + `BENEISH_VETO_THRESHOLD = -1.78` constant + docstring rationale (paper Table 4 PPV crossover, parallel to non_reliance_filing trade-off) |
| `compute/scoring/risk_overlay.py` | + `beneish_m_scores` kwarg on `compute_risk_flags`. New veto check at the end of the per-ticker loop emits `beneish_manipulation_veto` when M > threshold. Imports `BENEISH_VETO_THRESHOLD`. |
| `compute/main.py` | Pre-compute `beneish_results` dict + `beneish_m_scores` dict before `compute_risk_flags` call. Pass to compute_risk_flags. Per-ticker Step-8 loop reads from cached `beneish_results[ticker]` (no double-compute). |
| `tests/test_scoring/test_risk_overlay.py` | 4 new tests: veto fires above strict threshold, skipped on None m_score, strict inequality at exact threshold, disabled when dict not supplied (backward-compat). |

## Backward compat

- `beneish_m_scores` arg is **optional**. Existing callers without it (tests, future external users) see no behavior change.
- `beneish_high` annotate at M > -2.22 **unchanged**; the old flag still fires for ranks below the veto band.
- `StockDetail.beneish_m_score` numeric field unchanged.

## Verification ladder

- ✅ `ruff check .` (clean)
- ✅ `pytest tests/ -m "not network"`: **779 passed** (was 775; 4 new)
- ✅ schema_check: N/A (no schema delta)
- ⏳ Production verification deferred to next workflow_dispatch. Expect:
  - 11 new tickers with `beneish_manipulation_veto` in `risk_flags`
  - Top-5 rotation re-shuffles if NVDA/SMCI/WAT/etc were in raw-top-5 (NVDA in run #46 raw-top-5 at #3; will be suppressed with Sloan + new Beneish veto)
  - Active veto count metadata 5 → 6 (verify via `defense-scorecard`)

## Sibling sub-PRs (Phase 4.5a wave)

- ✅ 4.5a.1 (Sloan sector-relative): merged PR #89
- **4.5a.2 (this PR)**: Beneish soft-veto
- ⬜ 4.5a.3 (Dechow soft-veto + `manipulation_triple_flag`): next, branches off this PR or off main after merge

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
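The two-band behavior described in this commit (annotate above -2.22, annotate plus veto above -1.78, strict inequality, nothing when the score is missing) can be sketched in a few lines. `BENEISH_VETO_THRESHOLD` and the two flag names come from the commit; `BENEISH_ANNOTATE_THRESHOLD` and `beneish_flags` are hypothetical names, and the sketch returns one flat list rather than modeling which output array each flag lands in.

```python
from typing import List, Optional

BENEISH_ANNOTATE_THRESHOLD = -2.22  # hypothetical constant name for the existing annotate cutoff
BENEISH_VETO_THRESHOLD = -1.78      # constant named in the commit


def beneish_flags(m_score: Optional[float]) -> List[str]:
    """Return the Beneish flags a single M-score would raise (illustrative only)."""
    if m_score is None:
        # No coverage for this ticker: raise nothing rather than guess.
        return []
    flags: List[str] = []
    if m_score > BENEISH_ANNOTATE_THRESHOLD:
        flags.append("beneish_high")
    # Strict inequality: a score exactly at the threshold does not veto.
    if m_score > BENEISH_VETO_THRESHOLD:
        flags.append("beneish_manipulation_veto")
    return flags
```

A score of -2.0 falls in the soft band and only annotates, while -1.0 raises both flags.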
dackclup added a commit that referenced this pull request on May 16, 2026

#91)

Phase 4.5 manipulation-defense cluster sub-PR 3 of 6 per PR #86. Final 4.5a wave member. Branched off PR #90 (4.5a.2 Beneish veto) to integrate cleanly. Depends on 4.5a.2 merging first.

## Two additions

### 1. Dechow F-score soft-veto (F > 3.0)

Promotes `dechow_high` (F > 2.45 annotate, PR 3e.2) to a second-tier active veto at the stricter threshold F > 3.0. Mirrors PR 4.5a.2's Beneish veto pattern exactly.

Threshold rationale: Dechow 2011 *CAR* Table 7 shows that at F > 3.0 the AAER hit rate exceeds 4× baseline (vs ~2× at F > 2.45). The stricter cutoff matches the precision/recall trade-off PR 4.5a.2 locked for Beneish and PR 3d locked for non_reliance_filing: high precision, narrower recall, won't dilute Top-5.

### 2. `manipulation_triple_flag` joint-gate badge

Fires when Sloan + Beneish-high + Dechow-high all flag on the same ticker. Rare but high-confidence; 2 tickers in run #46:

- **SMCI**: F=6.65 (Dechow veto fires too) + Sloan + Beneish high
- **WAT**: Sloan + Beneish high + Dechow high (annotates only)

UI-only badge in `valuation_warnings`; does NOT stack a third veto on top of the individual component vetoes. Per PR #86 plan §4.5a.3.

## Production estimate (run #46)

| Threshold | Coverage | Flagged |
|---|---|---|
| F > 2.45 (annotate, existing) | 157/502 (31% covered) | 2 (1.3%) |
| **F > 3.0 (veto, NEW)** | same 157 | **1 (0.6%)** |
| **manipulation_triple_flag** | full universe | **2** |

Expected veto-layer state after this PR ships:

- 4.5a.1 (merged): Sloan sector-relative, 51 → ~56
- 4.5a.2 (PR #90): + Beneish veto, 11 new flags
- **4.5a.3 (this PR)**: + Dechow veto, **1** new flag (SMCI, which overlaps with the Beneish veto; Top-5 suppression stacks but the effective count of NEW suppressions is 1 unique ticker, since SMCI already loses entered_top5 from the Beneish veto)
- Active vetoes: **5 → 7** (Beneish + Dechow added)
- Annotates: + manipulation_triple_flag = **+1 reason taxonomy**

## Architecture

| Layer | Before 4.5a wave | After 4.5a wave |
|---|---|---|
| Active vetoes | 5 | **7** (+ beneish_manipulation_veto, dechow_manipulation_veto) |
| Tier-3 annotates | `beneish_high` + `dechow_high` at looser thresholds | unchanged (kept for the soft band) |
| Joint gates | none | **+ `manipulation_triple_flag`** (3-of-3 joint) |

## Files changed

| File | Change |
|---|---|
| `compute/scoring/dechow_f.py` | + `DECHOW_VETO_THRESHOLD = 3.0` constant + docstring rationale (Dechow 2011 Table 7 4× baseline crossover) |
| `compute/scoring/risk_overlay.py` | + `dechow_f_scores` kwarg on `compute_risk_flags`. New veto check at end of per-ticker loop, immediately after Beneish. Imports `DECHOW_VETO_THRESHOLD`. |
| `compute/main.py` | Pre-compute `dechow_results` dict + `dechow_f_scores` dict alongside Beneish (one combined pass). Pass to compute_risk_flags. Step-8 per-ticker loop reads from cached `dechow_results[ticker]` (no double-compute). + `manipulation_triple_flag` joint-gate logic appended after Dechow annotate emission. |
| `tests/test_scoring/test_risk_overlay.py` | 5 new tests: veto fires above threshold, skipped on None, strict inequality, disabled when dict not supplied, Beneish + Dechow co-firing independence. |

## Backward compat

- `dechow_f_scores` arg optional. Existing callers without it unchanged.
- `dechow_high` annotate at F > 2.45 unchanged.
- `StockDetail.dechow_f_score` numeric field unchanged.
- `manipulation_triple_flag` is in `valuation_warnings` (annotate), not `risk_flags`, so it doesn't change Top-N suppression on top of component vetoes. UI must opt in to render it.

## Verification ladder

- ✅ `ruff check .` (clean)
- ✅ `pytest tests/ -m "not network"`: **784 passed** (was 779 on 4.5a.2 branch; +5 new)
- ✅ schema_check: N/A (no schema delta; `manipulation_triple_flag` is a string in existing `valuation_warnings: list[str]`)
- ⏳ Production verification deferred. Expect:
  - 1 new `dechow_manipulation_veto` (SMCI)
  - 2 `manipulation_triple_flag` annotates (SMCI, WAT)
  - 7 active vetoes total

## Sibling sub-PRs (Phase 4.5a wave, COMPLETE after this PR)

- ✅ **4.5a.1** Sloan sector-relative: merged PR #89
- 🟡 **4.5a.2** Beneish soft-veto: open PR #90
- **4.5a.3 (this PR)**: Dechow soft-veto + manipulation_triple_flag

Next: **4.5b** (restatement_history + late_filing_notification), **4.5c** (Roychowdhury REM), **4.5d** (m-score momentum + Burgstahler kink), **4.5e** (Form 4 insider clustering), **4.5f** (composite manipulation_index + UI + schema bump).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
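The 3-of-3 joint gate for `manipulation_triple_flag` can be illustrated as follows. The flag identifiers are taken from the commit; the function name `apply_joint_gate` is hypothetical, and because the commit does not spell out which array each component flag lives in, the sketch checks the union of both arrays.

```python
from typing import Iterable, List

# Component flags that must all be present (identifiers from the commit).
TRIPLE_COMPONENTS = frozenset(
    {"sloan_accruals_top_decile", "beneish_high", "dechow_high"}
)
TRIPLE_FLAG = "manipulation_triple_flag"


def apply_joint_gate(
    risk_flags: Iterable[str], valuation_warnings: Iterable[str]
) -> List[str]:
    """Return valuation_warnings with the joint badge appended when all three fire."""
    warnings = list(valuation_warnings)
    # Assumption: component flags may sit in either array, so check the union.
    present = set(risk_flags) | set(warnings)
    if TRIPLE_COMPONENTS <= present and TRIPLE_FLAG not in warnings:
        # Annotate only: the badge goes into valuation_warnings and
        # never adds a veto to risk_flags.
        warnings.append(TRIPLE_FLAG)
    return warnings
```

Because the function only ever returns a warnings list, the badge cannot affect Top-N suppression, which matches the "UI-only badge" intent stated in the commit.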
dackclup added a commit that referenced this pull request on May 16, 2026

#92)

Phase 4.5a manipulation-defense quick-wins shipped 2026-05-16 across 3 sub-PRs (#89/#90/#91). Production verified on run #47 (commit `8cdf4886`). This commit bumps the triple-doc lockstep so future sessions read the actual current state instead of the in-progress plan.

## What shipped (per-sub-PR)

| Sub-PR | PR | Delivered | Production effect |
|---|---|---|---|
| **4.5a.1** | #89 | Sloan accruals top-decile within sector; `SLOAN_MIN_POPULATION_SECTOR=15` floor; cross-sectional fallback for under-floor sectors. Closes issue #7. | Financials Sloan rate 21.3% → 11.7%. Cross-sector spread 7.7× → 1.4×. Total Sloan flagged 51 → 56. |
| **4.5a.2** | #90 | `beneish_manipulation_veto` active-veto path at M > −1.78 (Beneish 1999 Table 4 PPV crossover). | 11 new vetoed tickers: SMCI · WAT · PODD · WDC · NVDA · CAT · PLTR · SNDK · BG · STX · LLY. |
| **4.5a.3** | #91 | `dechow_manipulation_veto` active-veto path at F > 3.0 (Dechow 2011 Table 7 4× baseline crossover) + `manipulation_triple_flag` joint-gate annotate. | 1 Dechow veto (SMCI F=6.65); 2 triple_flag tickers (SMCI + WAT). |

## End-state defense layer

- **Active vetoes**: 5 → **7** (added `beneish_manipulation_veto`, `dechow_manipulation_veto`)
- **Annotate flags**: 4 → **5** (added `manipulation_triple_flag`)
- **Tier-3 forensic**: still 2 (Beneish + Dechow, each operating at two thresholds: annotate + veto)
- **Reason taxonomy**: 24 → **29 stable identifiers**

No schema delta: new flag IDs are strings in the existing `risk_flags: list[str]` and `valuation_warnings: list[str]` arrays. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | §Phase status: "Next deliverable" reframed from "4.5a.1 sector-relative Sloan" to "4.5b disclosure-driven catches". Defense layer count "9 → 18 target" updated to "9 → 11 after 4.5a; target 18 after 4.5f". Issue #7 marked ✅ closed by 4.5a.1. |
| `PHASE_STATUS.md` | Phase Overview table: Phase 4.5 row flipped ⚪ → 🟡 IN PROGRESS with the 4.5a wave landed; duplicate ⚪ row removed. "Phase 4.5 plan" section §4.5a replaced "1-2 weeks, +2 active veto + 1 badge" header with "✅ DONE 2026-05-16" + a results table showing per-sub-PR production effect. Original plan text preserved below the results table for audit. |
| `WORKFLOW.md` | Phase 4.5 §Tasks §4.5a: all 4 checkboxes flipped [ ] → [x] with per-sub-PR PR-number citations (PR #89 / #90 / #91), LOC counts, test deltas, production verification numbers. |

## Audit trail

This is the 4th doc-correction PR in the post-v1.0 cleanup pattern, this time legitimate (the work actually shipped). Earlier ones in this session were correcting drift between intent and state:

- PR #81: 4g status correct (factual)
- PR #86: added Phase 4.5 roadmap (planning)
- PR #87: corrected "PR 4b next" → "§3 polish next" (was wrong)
- PR #88: corrected "§3 polish next" → "deferred to Phase 5" (was wrong)
- **This PR**: 4.5a wave ✅ DONE (factual, not drift-correction)

## Verification

- No code changes; docs only
- `grep "4.5a.1" CLAUDE.md PHASE_STATUS.md WORKFLOW.md` returns only DONE/closed references in active sections
- `grep "Next deliverable.*4.5a"` returns 0 hits (all moved to 4.5b)
- `grep "9 → 11"` appears in CLAUDE.md (new defense layer count)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
dackclup
added a commit
that referenced
this pull request
May 16, 2026
…tatement_history + late_filing_notification) (#93)

Phase 4.5 manipulation-defense cluster sub-PR 4 of 6 per PR #86. Adds two ANNOTATE-only flags surfaced from the SEC EDGAR filing list.

## What's new

### `restatement_history` annotate (5y lookback)

- Module: `compute/scoring/restatement_filings.py` (~390 LOC + ~290 test LOC).
- Counts 10-K/A + 10-Q/A filings per CIK in the trailing 5 years.
- Paper: Hennes-Leone-Miller 2008 *TAR*. Restating firms see a -9% abnormal return on announcement; recurrent restaters compound.
- Lookback: `config.RESTATEMENT_HISTORY_LOOKBACK_DAYS = 1825` (5×365).
- ANNOTATE-only: the base rate is sector-agnostic, so no veto without sector adjustment (which 4.5b doesn't include).

### `late_filing_notification` annotate (1y lookback)

- Same module.
- Detects SEC Form 12b-25 (NT 10-K / NT 10-Q) in the trailing 365d.
- Paper: Bartov-Lai-Yeung 2002 *JAR*. Late filers see -5% to -7% abnormal returns.
- Lookback: `config.LATE_FILING_LOOKBACK_DAYS = 365`.

## Architecture (mirrors `compute.scoring.eight_k_events`)

- Per-CIK JSON cache (7d TTL) at `compute/cache/edgar_amendments/` and `compute/cache/edgar_late_filings/`.
- Cache shape: `{fetched_at, lookback_days, filings: [{accession, form, filing_date, filing_url}]}`.
- Fetch path: `edgar.Company(ticker).get_filings(form=...)` with per-form retry; merges results across multiple forms client-side and sorts descending by filing_date so `filings[0]` is the latest.
- Public entry points: `check_restatement_history(ticker, ..., filings_override=...)` and `check_late_filing(ticker, ..., filings_override=...)`. The override path is the test injection point: it bypasses EDGAR and keeps unit tests offline.

## Files touched

| File | Change |
|---|---|
| `compute/scoring/restatement_filings.py` | NEW. Cache + fetch + check_restatement_history + check_late_filing. |
| `compute/config.py` | + `RESTATEMENT_HISTORY_LOOKBACK_DAYS=1825` + `LATE_FILING_LOOKBACK_DAYS=365` + `EDGAR_AMENDMENTS_CACHE_DIR` + `EDGAR_LATE_FILINGS_CACHE_DIR`. |
| `compute/main.py` | + import `check_late_filing` + `check_restatement_history`. Per-ticker Step-8 loop appends `restatement_history` / `late_filing_notification` to `valuation_warnings` when the check fires. Slots immediately after the existing PR 4b §1 `cross_source_disagreement` block. |
| `.github/workflows/compute-rankings.yml` | + 2 new cache paths (`edgar_amendments`, `edgar_late_filings`) so weekly runs preserve the per-ticker JSON files. |
| `tests/test_scoring/test_restatement_filings.py` | NEW. 17 tests covering `_filing_date_within` boundaries, both check_* entry points (no filings, within window, outside window, multiple within window, lookback constants, fetch-failure graceful path). All offline via `filings_override`. |
| `tests/test_workflow_cache_coverage.py` | + 2 new parametrized cache-path assertions for the new directories (catches future workflow YAML drift). |

## Defense layer end-state (after this PR ships)

- Active vetoes: 7 (unchanged; 4.5b is annotate-only)
- Annotate flags: 5 → **7** (added `restatement_history`, `late_filing_notification`)
- Reason taxonomy: 29 → **31** stable identifiers

No schema delta: new flag IDs are strings in existing `valuation_warnings: list[str]`.

## Backward compat

- `filings_override` arg is opt-in (None default → fetches via EDGAR). Existing callers without it are unchanged.
- `EDGAR_USER_AGENT` env var precondition matches the rest of the Tier-2 layer: the fetcher returns None when it is unset (cleanly skipping the flag rather than crashing).
- Caches gitignored under `compute/cache/`.

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/ -m "not network"`: **803 passed** (was 784; +17 new restatement tests + 2 new workflow-cache parametrize entries)
- ⏳ Production verification deferred to next workflow_dispatch.

Expected fire rates on S&P 500 (rough; needs a production run to confirm):

- `restatement_history`: 30-80 tickers (~6-16%) based on historical 10-K/A base rates 2020-2025
- `late_filing_notification`: 5-20 tickers (~1-4%) based on SEC Form 12b-25 filing data

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** complete (PRs #89/#90/#91 + #92 docs)
- **4.5b (this PR)**: disclosure-driven catches
- ⬜ **4.5c**: Roychowdhury REM 3-proxy
- ⬜ **4.5d**: M-score 3y momentum + Burgstahler-Dichev kink
- ⬜ **4.5e**: Form 4 insider clustering
- ⬜ **4.5f**: `manipulation_index` composite + UI + schema bump → v1.2.0

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
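The two lookback checks described in the 4.5b commit above can be sketched roughly as follows. This is a minimal illustration, not the code in `compute/scoring/restatement_filings.py`: the helper names `filing_date_within` and `count_matching`, the inclusive window boundaries, and the sample filings are all assumptions made for the example. Only the two lookback constants, the form types, and the cache-entry keys come from the commit message.

```python
from datetime import date, timedelta

# Lookback constants as quoted in the commit message.
RESTATEMENT_HISTORY_LOOKBACK_DAYS = 1825  # 5 x 365
LATE_FILING_LOOKBACK_DAYS = 365

AMENDMENT_FORMS = {"10-K/A", "10-Q/A"}
LATE_FORMS = {"NT 10-K", "NT 10-Q"}


def filing_date_within(filing_date: str, as_of: date, lookback_days: int) -> bool:
    """True when an ISO date falls inside [as_of - lookback_days, as_of].

    Inclusive boundaries are an assumption of this sketch.
    """
    filed = date.fromisoformat(filing_date)
    return as_of - timedelta(days=lookback_days) <= filed <= as_of


def count_matching(filings: list[dict], forms: set[str], as_of: date, lookback_days: int) -> int:
    """Count filings of the given form types inside the lookback window."""
    return sum(
        1
        for f in filings
        if f["form"] in forms and filing_date_within(f["filing_date"], as_of, lookback_days)
    )


# Hypothetical filings in the cache-entry shape {accession, form, filing_date, filing_url}.
filings = [
    {"accession": "a1", "form": "10-K/A", "filing_date": "2024-03-01", "filing_url": ""},
    {"accession": "a2", "form": "NT 10-Q", "filing_date": "2025-11-10", "filing_url": ""},
    {"accession": "a3", "form": "10-Q/A", "filing_date": "2019-01-15", "filing_url": ""},
]
as_of = date(2026, 5, 16)

restatements = count_matching(filings, AMENDMENT_FORMS, as_of, RESTATEMENT_HISTORY_LOOKBACK_DAYS)
late = count_matching(filings, LATE_FORMS, as_of, LATE_FILING_LOOKBACK_DAYS)
print(restatements, late)
```

Passing a prepared list of filings instead of fetching them is the same idea as the `filings_override` test path: the window logic can be exercised without any network access.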
dackclup added a commit that referenced this pull request on May 17, 2026
#94)

Phase 4.5b disclosure-driven catches shipped via PR #93. Production verified on run #48 (commit `849b7ca8`; the workflow took 2h08m because the cold cache had to populate both new `edgar_amendments` + `edgar_late_filings` dirs; warm runs return to ~1h30m).

## What shipped

| Flag | Lookback | Production fire | Notes |
|---|---|---|---|
| `restatement_history` | 5y 10-K/A + 10-Q/A | **60 / 502 (12.0%)** | within expected 6-16%: AMD, DIS, CVX, BSX, EBAY etc. (mostly mature firms with periodic amendments) |
| `late_filing_notification` | 365d Form 12b-25 | **2 / 502 (0.4%)** | HAS + Q; slightly under expected 1-4% (S&P 500 firms tend to be more compliant than the broader Bartov-Lai-Yeung sample) |

## End-state defense layer

- Active vetoes: **7** (unchanged; 4.5b is annotate-only)
- Annotate flags: 5 → **7** (+ `restatement_history`, `late_filing_notification`)
- Reason taxonomy: 29 → **31**
- **Defense layer 9 → 13 layers after 4.5a + 4.5b**

No schema delta: both new flags are strings in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | "Next deliverable" reframed from 4.5b to 4.5c (Roychowdhury REM). Defense layer count "9 → 11 after 4.5a" → "9 → 13 after 4.5a + 4.5b". 4.5b results summary appended. |
| `PHASE_STATUS.md` | Phase 4.5 table row updated with 4.5b results + tickers + 60/2 counts. §4.5b header flipped to ✅ DONE 2026-05-16 with results table + workflow time note (cold-cache 2h08m). Original plan text preserved below for audit. |
| `WORKFLOW.md` | §Tasks §4.5b: all 4 checkboxes [ ] → [x] with PR-number + LOC + test-count + production-verification citations. SEC Filing Roadmap table: 4 new rows for 10-K/A, 10-Q/A, NT 10-K, NT 10-Q (all ✅ active with PR #93 / 2026-05-16 production-fire-rate footnotes). Form 4 status flipped from "❌ not used" to "⬜ planned (Phase 4.5e)" to reflect the upcoming sub-PR. |

## Next deliverable

**Phase 4.5c: Real Earnings Management (Roychowdhury 2006 REM)**:

- 3 abnormal proxies per ticker:
  - `abnormal_CFO` = actual − model(Sales, ΔSales)
  - `abnormal_production` = actual − model(Sales, ΔSales, ΔSales_t−1)
  - `abnormal_discretionary_expenses` = actual − model(Sales_t−1)
- Flag `rem_suspect` fires when 2 of 3 proxies sit in the worst decile within sector
- ~250 LOC + golden tests against Roychowdhury 2006 paper Table 6
- Catches REAL manipulation (cutting R&D, channel stuffing, deferring maintenance), which is invisible to Sloan/Beneish/Dechow because they target accrual manipulation

## Audit trail (post-v1.0 doc PRs)

| PR | Purpose |
|---|---|
| #81 | 4g ✅ DONE |
| #86 | Phase 4.5 roadmap added |
| #87 | "PR 4b next" → "§3 polish next" (was wrong) |
| #88 | "§3 polish next" → "Phase-5 blocked" (was wrong) |
| #92 | 4.5a wave ✅ DONE |
| **this PR** | 4.5b wave ✅ DONE |

## Verification

- No code changes; docs only
- `grep "Next deliverable.*4.5b"` returns 0 hits (all moved to 4.5c)
- `grep "9 → 13"` matches in CLAUDE.md (new defense layer count)
- `grep "10-K/A.*✅ active"` returns the new WORKFLOW.md filing-roadmap row

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request on May 17, 2026
…em_suspect annotate) (#95)

Phase 4.5 manipulation-defense cluster sub-PR 5 of 6 per PR #86. Catches REAL manipulation (cutting R&D, channel stuffing, deferring maintenance, overproduction), which is invisible to the existing accrual-targeting defenses (Sloan / Beneish / Dechow).

## Model: Roychowdhury 2006 *JAE* 3-proxy

Three abnormal residuals from per-sector OLS regressions:

1. **Abnormal CFO**: residual of `CFO_t / A_{t-1}` on `[1, 1/A_{t-1}, Sales_t/A_{t-1}, ΔSales_t/A_{t-1}]`. **Low (negative)** = suspicious → firm front-loaded sales via channel stuffing / loose credit / discounts, which lifts reported sales and earnings but leaves CFO low relative to sales.
2. **Abnormal Production**: residual of `(COGS_t + ΔInventory_t) / A_{t-1}` on `[1, 1/A_{t-1}, Sales_t, ΔSales_t, ΔSales_{t-1}]` (all over A_{t-1}). **High** = suspicious → overproduction spreads fixed costs over more units, deflating per-unit COGS and inflating gross margin.
3. **Abnormal Discretionary Expenses**: residual of `(R&D_t + SGA_t) / A_{t-1}` on `[1, 1/A_{t-1}, Sales_{t-1}/A_{t-1}]`. **Low (negative)** = suspicious → firm cut discretionary spending to boost current earnings. (Advertising omitted: SEC XBRL rarely tags it separately; per Roychowdhury 2006 footnote 7 the SGA-only adaptation is acceptable since advertising is usually subsumed in SGA.)

Flag `rem_suspect` fires when **≥ 2 of 3** residuals sit in their respective worst decile within the ticker's GICS sector. Mirrors the 4.5a.3 `manipulation_triple_flag` pattern but uses *real* (not accrual) signals.

## Architecture

| File | Change |
|---|---|
| `compute/scoring/rem.py` | **NEW**: `REMProxies` + `REMResult` dataclasses; `compute_proxies` (per-ticker input vector from snap + history); `_fit_sector_models` (per-sector OLS via `numpy.linalg.lstsq`); `compute_rem_flags` (two-pass: proxies → sector models → residuals → within-sector decile rank → fire). ~420 LOC. |
| `compute/main.py` | Pre-compute `rem_results` once via `compute_rem_flags(snapshots, histories=histories, sectors=sectors_dict)` right after `compute_risk_flags`. Per-ticker Step-8 loop appends `rem_suspect` to `valuation_warnings` when `rem_result.fired`. |
| `tests/test_scoring/test_rem.py` | **NEW**: 14 tests in three layers: (1) proxy construction (5 tests covering well-formed, missing snap, missing assets denominator, R&D fallback to SGA-only, inventory-missing PROD skip), (2) end-to-end `compute_rem_flags` (8 tests: empty, below floor, at floor, double-outlier fires, single-outlier triggers cfo axis, triple-outlier 3-trigger, normal-ticker H0 FP rate, constants), (3) **golden numerical test** verifying OLS recovers known-DGP coefficients. |

## No new dependencies

- `numpy.linalg.lstsq` for OLS (already in dep tree)
- No `sklearn`, no `statsmodels`: the pure-numpy reimplementation keeps the install surface tight (mirrors PR 4b §2 PBO/DSR decision)

## Defense-layer end-state (after this PR ships)

- Active vetoes: 7 (unchanged; 4.5c is annotate-only)
- Annotate flags: 7 → **8** (+ `rem_suspect`)
- Reason taxonomy: 31 → **32** stable identifiers

No schema delta: `rem_suspect` is a string in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Backward compat

- `compute_rem_flags(snapshots, histories=None, sectors=None)`: both kwargs optional. When sectors are absent, no sector model can fit (every ticker's sector lookup returns None); all results have `fired=False`.
- Sectors below `REM_MIN_POPULATION_SECTOR = 15` (matches 4.5a.1 Sloan sector-relative floor) skip REM cleanly; those tickers get `REMResult(None, None, None, fired=False)`. No active-veto fallback (REM is annotate-only).
- DISEXP falls back to SGA-only when R&D is missing (financials / REITs / utilities) per Roychowdhury 2006 footnote 7.

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/test_scoring/test_rem.py`: **14 passed**
- ✅ `pytest tests/ -m "not network"`: **817 passed** (was 803; +14 new REM tests)
- ✅ schema_check: N/A (no schema delta)
- ⏳ Production verification deferred to next workflow_dispatch. Expected fire rate: 5-7% (~25-35 of 502 S&P 500 tickers) assuming moderate axis correlation. H0 (independent axes) FP rate is 2.8% per the 2-of-3 joint-probability calc.

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** complete (PRs #89 / #90 / #91 + #92 docs)
- ✅ **4.5b** complete (PR #93 + #94 docs)
- **4.5c (this PR)**: Roychowdhury REM
- ⬜ **4.5d**: M-score 3y momentum + Burgstahler-Dichev kink
- ⬜ **4.5e**: Form 4 insider clustering
- ⬜ **4.5f**: `manipulation_index` composite + UI + schema bump → v1.2.0

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
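The two-pass REM logic in the 4.5c commit above (fit a sector regression, take residuals, rank within sector, fire on 2 of 3) might look roughly like the sketch below. It is not the code in `compute/scoring/rem.py`; the function names `sector_residuals`, `worst_decile`, and `rem_fired` are hypothetical, the decile cut-offs use `np.quantile` as an assumed ranking method, and the input arrays are synthetic. Only the use of `numpy.linalg.lstsq`, the direction of each axis, and the 2-of-3 rule come from the commit message.

```python
import numpy as np


def sector_residuals(y: np.ndarray, regressors: np.ndarray) -> np.ndarray:
    """Residuals of y regressed on [1, regressors] by ordinary least squares."""
    X = np.column_stack([np.ones(len(y)), regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def worst_decile(resid: np.ndarray, low_is_bad: bool) -> np.ndarray:
    """Boolean mask of the suspicious 10% tail within one sector."""
    if low_is_bad:
        return resid <= np.quantile(resid, 0.10)
    return resid >= np.quantile(resid, 0.90)


def rem_fired(cfo_res: np.ndarray, prod_res: np.ndarray, disexp_res: np.ndarray) -> np.ndarray:
    """Fire when at least 2 of the 3 axes sit in their worst decile."""
    hits = (
        worst_decile(cfo_res, low_is_bad=True).astype(int)        # low CFO is suspicious
        + worst_decile(prod_res, low_is_bad=False).astype(int)    # high production is suspicious
        + worst_decile(disexp_res, low_is_bad=True).astype(int)   # low discretionary spend is suspicious
    )
    return hits >= 2


# Noise-free synthetic sector: residuals should vanish (a tiny "golden" check).
x = np.linspace(0.1, 2.0, 20)
y = 0.02 + 0.5 * x
golden = sector_residuals(y, x.reshape(-1, 1))

# Hand-built residuals for 20 firms: firms 0 and 1 are outliers on two axes.
cfo_res = np.arange(20.0)
prod_res = -np.arange(20.0)
disexp_res = np.arange(20.0)[::-1]
fired = rem_fired(cfo_res, prod_res, disexp_res)
print(np.flatnonzero(fired))
```

In this toy sector, firms 18 and 19 are in the worst decile on the discretionary-expense axis only, so a single-axis outlier does not fire the flag.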
dackclup added a commit that referenced this pull request on May 17, 2026
#96)

Phase 4.5c Roychowdhury REM shipped via PR #95. Production verified on run #49 (commit `65097703`, warm-cache 6m25s; all 9 cache layers populated).

## What shipped

`rem_suspect` annotate via per-sector OLS regressions on 3 abnormal proxies (CFO, Production, Discretionary Expenses). Module `compute/scoring/rem.py` (~420 LOC, pure-numpy via `np.linalg.lstsq`, no sklearn/statsmodels dep). 14 offline tests including a golden numerical test recovering known-DGP coefficients.

## Production verification

| Metric | Value |
|---|---|
| Fire rate | **16 / 502 (3.2%)**, within the expected 2.8-7% range (H0 floor to moderate-correlation ceiling) |
| Tickers fired | SMCI · WAT · ADM · TSN · HRL · STLD · FSLR · JBL · COHR · LII · LDOS · POOL · OMC · WY · TECH · RVTY |
| Orthogonality check | NVDA / PLTR (Beneish-veto fired) **NOT** in REM list; confirms 4.5c captures a real-manipulation signal orthogonal to accrual targets |
| Real-world coverage | ADM (2024 SEC investigation) · SMCI (2024 investigation) · TSN / HRL (periodic scrutiny) · FSLR (solar channel-stuffing history) |

## End-state defense layer

- Active vetoes: **7** (unchanged; 4.5c is annotate-only)
- Annotate flags: 7 → **8** (+ `rem_suspect`)
- Reason taxonomy: 31 → **32**
- **Defense layer 9 → 14 after 4.5a + 4.5b + 4.5c**

No schema delta: `rem_suspect` is a string in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Triple-doc lockstep changes

| File | Change |
|---|---|
| `CLAUDE.md` | "Next deliverable" 4.5c → 4.5d. Defense layer "9 → 13 after 4.5a+4.5b" → "9 → 14 after 4.5a+4.5b+4.5c". 4.5c results + ticker list + orthogonality note inserted between 4.5b and the post-completion roadmap. |
| `PHASE_STATUS.md` | Phase 4.5 row updated with 4.5c production stats. §4.5c header flipped to ✅ DONE 2026-05-17 with results table + orthogonality note. Original plan text preserved below for audit. |
| `WORKFLOW.md` | §4.5c checkboxes [ ] → [x] with PR-number / LOC / test-count / production-verification citations + golden-test reference. |

## Next deliverable

**Phase 4.5d: earnings-quality time-series + Burgstahler-Dichev kink at zero** (~180 LOC, ~7 days):

- `m_score_deteriorating` annotate: Δ(Beneish M-score) > +0.5 over trailing 3y (manipulation gathering steam)
- `loss_avoidance_pattern` annotate: NI ∈ [0, $5M] OR EPS ∈ [0, $0.05] for 3+ consecutive years (Burgstahler-Dichev 1997 kink)

## Audit trail (post-v1.0 doc PRs)

| PR | Purpose |
|---|---|
| #81 | 4g ✅ DONE |
| #86 | Phase 4.5 roadmap added |
| #87 | "PR 4b next" → "§3 polish next" (was wrong) |
| #88 | "§3 polish next" → "Phase-5 blocked" (was wrong) |
| #92 | 4.5a wave ✅ DONE |
| #94 | 4.5b wave ✅ DONE |
| **this PR** | 4.5c wave ✅ DONE |

## Verification

- No code changes; docs only
- `grep "Next deliverable.*4.5c"` returns 0 hits (all moved to 4.5d)
- `grep "9 → 14"` matches in CLAUDE.md (new defense layer count)
- `grep "rem_suspect"` matches in PHASE_STATUS.md + WORKFLOW.md active-flags references

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
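The 2.8% lower bound quoted for the REM fire rate follows from a short binomial calculation: with three independent axes that each flag their worst 10%, the chance of at least two hits is 3 × 0.1² × 0.9 + 0.1³ = 0.028. A small sketch of that arithmetic, with the hypothetical helper name `at_least_k_of_n`, alongside the reported 16 / 502 production count:

```python
from math import comb


def at_least_k_of_n(p: float, n: int = 3, k: int = 2) -> float:
    """P(at least k of n independent events occur), each with probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


h0_rate = at_least_k_of_n(0.10)   # independent axes, worst decile on each
observed = 16 / 502               # production fire count reported in the commit

print(f"H0 rate: {h0_rate:.3f}, observed: {observed:.3f}")
```

The observed rate sitting just above the independence floor is consistent with the three axes being only weakly correlated in this universe, although a single production run is thin evidence either way.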
dackclup added a commit that referenced this pull request on May 17, 2026
…als_momentum_high + loss_avoidance_pattern) (#97)

Phase 4.5 manipulation-defense cluster sub-PR 6 of 6 (the last purely-defense sub before 4.5f composite + UI bundling). Two annotate-only flags derived from the per-ticker fundamentals history (annual XBRL).

## What's new

### `accruals_momentum_high`: Δ(TATA) over 3y > +0.05

- TATA = (NetIncome − OperatingCashFlow) / TotalAssets, the Sloan 1996 / Beneish 1999 accruals backbone.
- Threshold: +0.05. For reference, Beneish 1999 ΔM > +0.5 maps through the β_TATA = 4.679 coefficient (ΔM ≈ 4.679 × ΔTATA) to ΔTATA > 0.107. We use 0.05, roughly half that value, which is more sensitive since TATA alone captures less than the full 8-ratio signal; standard practitioner adaptation when shortening to one ratio.
- Catches manipulation **gathering steam**; the snapshot-only Sloan + Beneish flags miss the trajectory entirely.

**Practical note on naming**: PR #86 plan §4.5d called this `m_score_deteriorating` (full Δ(Beneish M-score) > +0.5). We chose TATA momentum as a practical equivalent: building 3 historical 8-ratio Beneish snapshots from XBRL history would require expanding the annual-history coverage of 6+ supplementary ratios (DSRI / GMI / AQI / etc.) that often have gaps for prior years. TATA is the single Beneish component that's a level rather than a ratio-of-ratios, and Sloan 1996 established it as the standalone accruals signal, so this is a clean shortening, not a weakening.

### `loss_avoidance_pattern`: Burgstahler-Dichev 1997 kink at zero

- Fires when there are **3+ consecutive fiscal years** of tiny-positive earnings: NI ∈ [\$0, \$5M] **OR** EPS ∈ [\$0.00, \$0.05].
- Per-share band catches the high-share-count case where NI alone is above the absolute floor but per-share is still tiny.
- Empirical kink-at-zero signature of managers shading reported earnings just enough to clear the loss / loss-threshold.

## Architecture

| File | Change |
|---|---|
| `compute/scoring/earnings_quality.py` | **NEW** ~250 LOC: `check_accruals_momentum` + `check_loss_avoidance` + history-walk helpers (`_annual_values`, `_value_at_year`). Pure pandas; no new deps. |
| `compute/main.py` | + 2 import lines + 2 per-ticker annotate appends in the Step-8 loop, slotting after `rem_suspect`. |
| `tests/test_scoring/test_earnings_quality.py` | **NEW** ~225 LOC: 14 offline tests covering both flags (fires / doesn't fire / improves / threshold pins / EPS-band fallback / negative-NI rejection / large-NI rejection / multi-year streak / streak break / constants sanity). |

## Defense-layer end-state (after this PR ships)

- Active vetoes: **7** (unchanged; 4.5d is annotate-only)
- Annotate flags: 8 → **10** (+ `accruals_momentum_high`, `loss_avoidance_pattern`)
- Reason taxonomy: 32 → **34** stable identifiers
- Total defense layers: **9 → 16** after 4.5a + 4.5b + 4.5c + 4.5d

No schema delta: both flags are strings in existing `valuation_warnings: list[str]`. `SCHEMA_VERSION` stays `0.7.1-phase4g`.

## Backward compat

- Both check functions take `(snap, history)`; no caller changes elsewhere. Missing inputs (snap=None, no history, insufficient years) cleanly return fired=False.
- No new EDGAR fetches: both flags read from existing fundamentals + fundamentals_history caches.

## Verification ladder

- ✅ `ruff check .`: clean
- ✅ `pytest tests/test_scoring/test_earnings_quality.py`: **14 passed**
- ✅ `pytest tests/ -m "not network"`: **831 passed** (was 817; +14 new)
- ✅ schema_check: N/A (no schema delta)
- ⏳ Production verification deferred.

Expected fire rates on S&P 500:

- `accruals_momentum_high` ~3-8% (~15-40 tickers): H0 from Δ(TATA) > 0.05 base rate
- `loss_avoidance_pattern` ~1-3% (~5-15 tickers): S&P 500 firms rarely report tiny-positive earnings for 3+ years (mega-cap distribution); base rate higher on small-caps per Burgstahler-Dichev 1997 original sample

## Sibling sub-PRs (Phase 4.5 cluster)

- ✅ **4.5a wave** (PRs #89 / #90 / #91 + #92 docs)
- ✅ **4.5b** (PR #93 + #94 docs)
- ✅ **4.5c** (PR #95 + #96 docs)
- **4.5d (this PR)**: earnings-quality time-series
- ⬜ **4.5e**: Form 4 insider clustering (~420 LOC, ~12 days; needs new SEC Form 4 parser)
- ⬜ **4.5f**: `manipulation_index` composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump → **v1.2.0-phase4.5**

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
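The two 4.5d rules described above reduce to a difference test and a streak test, sketched below under stated assumptions. This is not the code in `compute/scoring/earnings_quality.py`: the function names, the plain-dict history format, and the choice to fire on any run of 3 consecutive qualifying years (rather than only the most recent ones) are assumptions of the example. The TATA definition, the +0.05 threshold, the NI and EPS bands, and the 4.679 coefficient come from the commit message.

```python
ACCRUALS_MOMENTUM_THRESHOLD = 0.05
NI_BAND = (0.0, 5_000_000.0)   # net income in dollars
EPS_BAND = (0.0, 0.05)         # earnings per share in dollars
MIN_STREAK = 3


def tata(net_income: float, operating_cash_flow: float, total_assets: float) -> float:
    """Total accruals to total assets: (NI - CFO) / TA."""
    return (net_income - operating_cash_flow) / total_assets


def accruals_momentum_high(tata_now: float, tata_3y_ago: float) -> bool:
    """Fire when TATA rose by more than the threshold over three years."""
    return (tata_now - tata_3y_ago) > ACCRUALS_MOMENTUM_THRESHOLD


def loss_avoidance_pattern(years: list[dict]) -> bool:
    """Fire on MIN_STREAK consecutive years of tiny-positive NI or EPS.

    `years` is ordered oldest to newest, each item holding `net_income` and `eps`.
    """
    streak = 0
    for y in years:
        tiny_ni = NI_BAND[0] <= y["net_income"] <= NI_BAND[1]
        tiny_eps = EPS_BAND[0] <= y["eps"] <= EPS_BAND[1]
        streak = streak + 1 if (tiny_ni or tiny_eps) else 0
        if streak >= MIN_STREAK:
            return True
    return False


# Full-M-score equivalent of the threshold, from dM ~= 4.679 * dTATA.
implied_tata_cutoff = 0.5 / 4.679

history = [
    {"net_income": 1_000_000.0, "eps": 0.50},
    {"net_income": 2_000_000.0, "eps": 0.60},
    {"net_income": 3_000_000.0, "eps": 0.70},
]
broken = [history[0], {"net_income": -4_000_000.0, "eps": -0.90}, history[2]]
print(round(implied_tata_cutoff, 3), loss_avoidance_pattern(history), loss_avoidance_pattern(broken))
```

The comparison of `implied_tata_cutoff` with the 0.05 constant makes the sensitivity trade-off visible: the shipped threshold is a little under half of what a full M-score move of +0.5 would imply for TATA alone.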
Summary
User: "I want to make the system's defenses as tight as possible; please propose a plan." → "Merge the plan into the app development plan."
Folds the earnings-manipulation-defense proposal into the existing phase-tracker triple as a new Phase 4.5 → v1.2.0 cluster, inserted between Phase 4 (factor consolidation → v1.1) and Phase 5 (ML meta-learner).
Phase 4.5 sub-PRs (~10-11 working weeks)
- `manipulation_triple_flag` joint badge
- `restatement_history` 10-K/A scan + `late_filing_notification` Form 12b-25
- (`abnormal_CFO` + `abnormal_production` + `abnormal_discretionary_exp`)
- `m_score_deteriorating` 3y momentum + Burgstahler-Dichev kink-at-zero `loss_avoidance_pattern`
- `insider_sell_cluster` + `c_suite_unusual_sell`
- `manipulation_index` 0-100 composite + composite-score penalty + UI pillar card + README Honest Limitations + schema bump → `0.8.0-phase4.5f`

Defense layer after 4.5: 5 → 7 active vetoes; 4 → 11 annotates; 9 → 18 total layers.
Validation harness (cross-cutting)
Sequencing
Doc lockstep (per phase-status-bump skill)
- `PHASE_STATUS.md`
- `WORKFLOW.md`
- `CLAUDE.md`

Not in this PR
- `SKILL.md`: no current-state constant change yet (schema / veto count / rule additions land per sub-PR as they ship, not at plan time)
- (`compute/`, `frontend/`, `tests/`): plan only
- `.claude/skills/phase-4.5/` sub-skill PLAN.md stubs: land per sub-PR at 4.5a kickoff

Test plan
- `WORKFLOW.md` "PHASE 4.5" task section parses cleanly (markdown checkboxes render)
- `CLAUDE.md` cross-references resolve (`PHASE_STATUS.md` → §"Phase 4.5 plan")

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Generated by Claude Code