feat(phase-3c): fair price ensemble + price history + Tier-1 defenses #9

Merged
dackclup merged 21 commits into main from claude/phase-3c-fair-price-defenses
May 10, 2026

@dackclup dackclup commented May 9, 2026

feat(phase-3c): Fair price ensemble + price history + Tier-1 defenses

Takes the v1.0 milestone halfway: ships the fair-price ensemble, 6 numerical defenses, and the price-history files. After this merges, PR 3d (Tier-2 events) and PR 3e (Beneish/Dechow + v1.0 tag) close out v1.0.

Scope

Fair-price ensemble (6 methods)

  • Graham (defensive): √(22.5 × EPS_3y_avg × tangible_BVPS)
  • P/E multiples (sector peers, 4-tier walk + 5/95 winsorize)
  • P/B multiples (uses BVPS_reported, not TBVPS — peer comparability)
  • EV/EBITDA multiples (Financials excluded)
  • Residual Income (RIM) — Penman 2013, uses TBVPS for B0
  • DCF (2-stage) — Gordon terminal, g ≤ min(0.03, WACC − 100bp)

Aggregation: median of all applicable + max excluding outliers (values outside [0.2×, 5×] of current price). MoS = (median − current) / median.

Price history

Per-stock 1y OHLCV at stocks/history/{TICKER}.json (column-major, ~30 KB each, lazy-loaded by frontend).

Defense backbone

  • 3 vetoes: altman_distress (Z″ < 1.10), sloan_accruals_top_decile, net_issuance_top_decile (NEW; within-sector, Pontiff-Woodgate 2008)
  • 5 numerical guards: stale_filing (120 / 180 d), outlier 5×, terminal_g, sector_exclusions, data_quality_$10K_ceiling (Defense #7, added mid-PR after BKR=$105M spot-check)
  • 5+ annotate-only flags: goodwill_heavy, value_trap_risk, extreme_{method}_estimate (one per method slot — {method} is one of graham / multiples_pe / multiples_pb / multiples_ev_ebitda / rim / dcf; in practice 2–3 fire per stock), stale_filing_soft, data_quality_input_corruption

Production verification (commit 1a25d84, run #11)

  • Universe: 502 S&P 500 stocks
  • Schema: 0.4.0-phase3b → 0.5.0-phase3c
  • Fair-price coverage: 487 / 502 (97.0%)
  • Sanity guard activations: 8 tickers (AMCR, BKR, CHTR, ERIE, PSKY, RTX, SPG, VTRS) — all show null fair_price + warning
  • mos_trailing_ic_smoke: −0.2216 (consistent with current momentum-up regime; not a predictive claim — see compute/scoring/sanity.py docstring)
  • Top-5 unchanged from baseline: SNDK / EOG / CF / BKR / HST
  • Risk-flag totals stable: altman 54 / sloan 50 / nsi 37
  • Run-id 25618712829, last_update 2026-05-10T03:49:51Z

Tests

  • 118 (PR 3b baseline) → 410 (+292 in PR 3c)
  • All 4 CI jobs green: ruff, schema-snapshot guard, pytest, tsc
  • Schema snapshot CI guard (frontend/lib/schema-snapshot.json) prevents Python ↔ TypeScript drift

Reviewer checklist

  • CI green (4 jobs)
  • Vercel preview spot-checked (rankings + AAPL + JPM + BKR detail)
  • Defense scorecard matches v1.0 spec (3 vetoes + 5 guards + 5 annotate)
  • Schema snapshot in sync (no drift)
  • 21 reason taxonomy entries

Post-merge actions

  • Tag v0.5.0-phase3c
  • Delete branch claude/phase-3c-fair-price-defenses
  • File 3 staged issues from /tmp/issue_drafts/:
    1. shares_outstanding ingestion bug (~11 tickers)
    2. _avg_3y_roe denominator bug (221 false value_trap_risk flags)
    3. mos_pct frontend display clamping (already implemented in Step 10; decide whether to file for trace or skip)
  • Comment on Issue #7 ("Investigate NVDA Sloan accruals flag — genuine red flag or growth-stage artifact?") with FITB/HBAN finding (additional Sloan-bank semantic evidence for Phase 4 priority)
  • Begin PR 3d planning

Defense Playbook lineage

Prior research (docs PR #8, commit e7c418c) established the 26-defense roadmap. PR 3c implements Tier-1 (defenses 1-6 + bonus Defense #7). PR 3d implements Tier-2 (going-concern + 8-K events). PR 3e implements Tier-3 (Beneish M-Score + Dechow F-Score + Honest Limitations). v1.0 tags after PR 3e merges.


Generated with Claude Code · Tested with Anthropic API
Session: https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

…onstants + goodwill/intangibles ingest

Step 1 of PR-3c (foundation; no scoring or fair-price logic yet — those land in
subsequent steps). Per the WORKFLOW.md PR-3c spec and the kickoff message's
sequential-checkpoint discipline.

- compute/config.py: bump SCHEMA_VERSION 0.4.0-phase3b → 0.5.0-phase3c. Add
  13 valuation + defense constants (DISCOUNT_RATE, TERMINAL_GROWTH,
  COST_OF_EQUITY, DCF_FORECAST_YEARS, DCF_FCF_WINDOW_YEARS,
  FILING_STALE_SOFT_DAYS=120, FILING_STALE_HARD_DAYS=180,
  GOODWILL_HEAVY_RATIO=0.5, EXTREME_ESTIMATE_HIGH=5.0, EXTREME_ESTIMATE_LOW=0.2,
  MULTIPLES_MIN_PEERS=8, NSI_TOP_DECILE=0.90, NSI_LOOKBACK_DAYS=365).
- pyproject.toml: pin pandas to '>=2.2,<3' to avoid pandas 3.x silent
  semantic drift (groupby / pct_change behavior); flagged in PR-3b §C-LOW.
- compute/ingest/fundamentals.py: add 'goodwill' (single tag,
  us-gaap:Goodwill — 5/5 hit on probe) and 'intangibles_net' (3-tag
  fallback chain: IntangibleAssetsNetExcludingGoodwill →
  OtherIntangibleAssetsNet → FiniteLivedIntangibleAssetsNet — covers
  KO/JPM/BRK-B which don't tag the primary). Plumbed through ALL_METRIC_KEYS,
  FundamentalsSnapshot dataclass, and _build_snapshot. Feeds Tangible BVPS
  (Defense Playbook §PR 3c §2) in Step 3.
- tests/test_smoke.py: bump SCHEMA_VERSION assertion 0.4.0 → 0.5.0.
- tests/test_features/test_fundamentals.py: extend AAPL synthetic snapshot
  with goodwill+intangibles_net (so missing_fields()==[] still holds). Add
  @network golden test test_goodwill_intangibles_fallback_chain (5 tickers:
  AAPL, KO, PG, JPM, BRK-B) — asserts goodwill 5/5 non-null > $1B and the
  fallback chain pushes intangibles_net coverage to ≥3/5 (vs 2/5 with the
  primary tag alone).

Verification: ruff clean, pytest 118 passed / 7 skipped (+1 new network),
tsc --noEmit exit 0.

vercel Bot commented May 9, 2026

Project    Deployment  Actions           Updated (UTC)
quantrank  Ready       Preview, Comment  May 10, 2026 3:50am

…Woodgate 2008)

Adds the third active VETO joining altman_distress and sloan_accruals_top_decile.
Annotate-only per Rule 16; Top-5 rotation in compute/main.py is the only
enforcement layer (no composite mutation).

- compute/scoring/risk_overlay.py:
  * New helper _shares_at_lookback(history, asof_days, today=) — picks the
    period_end row closest to today − asof_days and requires that row to be
    at least max(asof_days/2, 90) days old, so a same-quarter or same-year
    row cannot leak in as the lookback value.
  * New helper _net_stock_issuance(snap, history, today=) — computes
    ln(shares_t / shares_{t-12m}); NaN when inputs missing.
  * compute_risk_flags signature extended with histories= and sectors=
    keyword-only args (backward-compatible — existing PR-3b callers see no
    behavior change).
  * NSI threshold computed within-sector with NSI_MIN_POPULATION=10 floor
    (consistent with SLOAN_MIN_POPULATION). When sectors is not provided,
    NSI is suppressed entirely — we don't degrade to cross-sectional, which
    was the lesson from issue #7's Sloan over-firing on REITs/banks.
  * Strict-positive guard: NSI ≤ 0 (buybacks / stable shares) never gets the
    dilution flag, even when the within-sector threshold collapses to 0 in
    populations with mostly-zero NSI. Documented inline.
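
A minimal sketch of the within-sector NSI rule described above, written with the standard library only. The function names, the `>=` comparison against the threshold, and the percentile helper are illustrative assumptions, not the code in compute/scoring/risk_overlay.py; only the two constants and the ln(shares_t / shares_{t-12m}) definition come from the commit message.

```python
import math
from collections import defaultdict

NSI_TOP_DECILE = 0.90      # named in compute/config.py (Step 1)
NSI_MIN_POPULATION = 10    # population floor named in this commit


def net_stock_issuance(shares_now, shares_12m_ago):
    """ln(shares_t / shares_{t-12m}); NaN when an input is missing or non-positive."""
    if not shares_now or not shares_12m_ago or shares_now <= 0 or shares_12m_ago <= 0:
        return math.nan
    return math.log(shares_now / shares_12m_ago)


def linear_percentile(sorted_vals, q):
    """Linear-interpolated quantile q in [0, 1] of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def nsi_flags(nsi_by_ticker, sectors=None):
    """Return tickers in the top NSI decile of their own sector (hypothetical helper)."""
    if sectors is None:
        return set()  # suppressed entirely: no cross-sectional fallback
    by_sector = defaultdict(list)
    for ticker, nsi in nsi_by_ticker.items():
        if ticker in sectors and not math.isnan(nsi):
            by_sector[sectors[ticker]].append((ticker, nsi))
    flagged = set()
    for members in by_sector.values():
        if len(members) < NSI_MIN_POPULATION:
            continue  # too few names for a meaningful decile
        threshold = linear_percentile(sorted(v for _, v in members), NSI_TOP_DECILE)
        for ticker, nsi in members:
            # Strict-positive guard: buybacks and flat share counts never flag,
            # even when the sector threshold collapses to 0.
            if nsi > 0 and nsi >= threshold:
                flagged.add(ticker)
    return flagged
```

With nine flat-share names and one diluter in a sector, only the diluter is flagged; with ten flat-share names the threshold is 0 and the strict-positive guard keeps the set empty.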

- compute/main.py: build sectors_dict from inputs, pass histories +
  sectors_dict into compute_risk_flags. Top-5 rotation already iterates
  risk_flags.get(ticker), so NSI joins the existing flagged-skip path
  automatically.

- tests/test_scoring/test_risk_overlay.py: 11 new tests covering
  _shares_at_lookback (correct value, too-recent rows skipped, empty/missing
  inputs, non-positive value rejection), _net_stock_issuance (zero on
  unchanged shares, positive on dilution, NaN on missing history), and
  compute_risk_flags integration (top-decile-within-sector flag fires on
  diluter, suppressed without sectors, suppressed below NSI_MIN_POPULATION,
  backward-compat without new kwargs).

Verification: ruff clean, pytest 129 passed (+11 NSI) / 7 skipped, 0 failed.

Defense scorecard at end of Step 2:
- VETOES: 3 (altman_distress + sloan_accruals_top_decile + net_issuance_top_decile)
- GUARDS: 0 (Step 5 lands stale_filing + outlier_5x + terminal_g_unsafe)
- ANNOTATE: 0 (Step 3 lands goodwill_heavy)
…ibles netting)

New compute/valuation/ package with the first defense-layer building block:
TBVPS = (equity − goodwill − intangibles_net) / shares. Pure-function module
with no production wiring yet; Step 4.2 (graham.py) and Step 4.4 (rim.py)
will consume it; Step 5 (ensemble.py) will surface goodwill_heavy as a
valuation_warning.

- compute/valuation/__init__.py: re-exports the public surface
  (tangible_book_value_per_share, goodwill_heavy_flag).
- compute/valuation/tangible_book.py:
  * tangible_book_value_per_share(snap) — full intangibles netting per
    Penman 2013. None for negative tangible book / missing equity / zero
    or missing shares. Missing goodwill or intangibles_net coerce to 0
    (consistent with Step 1's 3-tag fallback chain achieving ~85%
    intangibles coverage; the remaining ~15% are treated conservatively
    as "no intangibles to net" rather than refusing to compute).
  * goodwill_heavy_flag(snap, tbvps) — TBVPS / BVPS_reported < 0.5
    (config.GOODWILL_HEAVY_RATIO). Annotate-only flag; will be appended
    to valuation_warnings in Step 5. Strict inequality so 50/50 firms
    don't trip — only material acquirer-balance-sheets fire.
  * Module docstring documents the dual-implementation rationale: Value
    pillar's compute/features/value.py keeps the fast-TTM Graham
    intentionally; this module powers the user-facing fair-price
    ensemble where the over-paying-for-goodwill caveat matters.
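
The netting and the strict-inequality flag can be sketched as follows. The `Snapshot` class and its field names are hypothetical stand-ins for FundamentalsSnapshot, and two behaviors are assumptions of this sketch rather than statements from the commit: a tangible book of exactly zero returns 0.0, and the flag returns False when TBVPS is unavailable.

```python
from dataclasses import dataclass
from typing import Optional

GOODWILL_HEAVY_RATIO = 0.5  # config.GOODWILL_HEAVY_RATIO (Step 1)


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for FundamentalsSnapshot (field names assumed)."""
    equity: Optional[float]
    shares: Optional[float]
    goodwill: Optional[float] = None
    intangibles_net: Optional[float] = None


def tangible_book_value_per_share(snap: Snapshot) -> Optional[float]:
    """(equity - goodwill - intangibles_net) / shares, or None when not computable."""
    if snap.equity is None or not snap.shares or snap.shares <= 0:
        return None
    # Missing goodwill / intangibles coerce to 0 ("no intangibles to net").
    tangible = snap.equity - (snap.goodwill or 0.0) - (snap.intangibles_net or 0.0)
    if tangible < 0:
        return None
    return tangible / snap.shares


def goodwill_heavy_flag(snap: Snapshot, tbvps: Optional[float]) -> bool:
    """True when TBVPS / BVPS_reported is strictly below the configured ratio."""
    if tbvps is None or snap.equity is None or snap.equity <= 0 or not snap.shares:
        return False  # assumption: nothing to compare, so no annotation
    bvps_reported = snap.equity / snap.shares
    return tbvps / bvps_reported < GOODWILL_HEAVY_RATIO
```

A firm with equity 100, goodwill 50, and 10 shares sits exactly at the 0.5 ratio and does not trip the flag; raising goodwill to 60 does.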

- tests/test_valuation/__init__.py: package marker.
- tests/test_valuation/test_tangible_book.py: 15 cases covering
  - kickoff §3.3 (a)-(h) baseline + edge cases
  - bonus edges: None shares, zero tangible book, exact-threshold
    strict-inequality semantics, negative-equity guard
  - module public-surface invariant + config constant sanity check
  - @network golden test against AAPL FY2024 EDGAR (skipped without
    --run-network; lenient internal-consistency assertion rather than a
    pinned $/share value, because the intangibles tag varies across
    AAPL's annual taxonomy by multi-billion).

Verification: ruff clean, pytest 144 passed (+15 tangible_book) /
8 skipped (+1 new network), 0 failed. Frontend tsc --noEmit exit 0.

Defense scorecard at end of Step 3 (unchanged from Step 2 — flag will
fire only after Step 5 surfaces it as valuation_warning):
- VETOES: 3 (altman_distress + sloan_accruals_top_decile + net_issuance_top_decile)
- GUARDS: 0 (Step 5 lands stale_filing + outlier_5x + terminal_g_unsafe)
- ANNOTATE: 0 (Step 5 lands goodwill_heavy via valuation_warnings)
… primitives

First sub-step of Step 4 (valuation methods). Pure-function module — no I/O,
no globals, no production wiring yet. Subsequent sub-steps (4.2 graham,
4.3 multiples, 4.4 rim, 4.5 dcf) consume these gates; Step 5 ensemble
pattern-matches on the SKIP_REASONS taxonomy to populate
StockDetail.fair_price.methods.<method>.reason.

- compute/valuation/applicability.py:
  * MethodApplicability dataclass — frozen, validates that reason is
    None iff applicable=True.
  * SKIP_REASONS — stable snake_case identifier tuple. Renaming any
    breaks the JSON contract.
  * 6 per-method check functions (keyword-only signature so each method
    declares the inputs it needs):
      - check_dcf_applicability — skip Financials & Utilities; require
        positive median 5y FCF, positive shares, non-hard-stale filing
      - check_graham_applicability — positive eps_3y_avg + tbvps
      - check_rim_applicability — value-trap-risk gate (ROE > Ke);
        Financials/Utilities OK; cost_of_equity defaults to
        config.COST_OF_EQUITY
      - check_multiples_pe_applicability — positive eps_ttm + peer median
      - check_multiples_pb_applicability — uses BVPS_reported (NOT
        TBVPS) for consistency with peer median; documented inline
      - check_multiples_ev_ebitda_applicability — skip Financials only
        (per kickoff §B2 — Utilities have meaningful EBITDA above the
        D&A line)
  * Stale-filing primitives:
      - filing_lag_days(filing_date, asof) → int | None
      - stale_filing_status(lag_days) → "fresh" | "soft" | "hard" | "unknown"
        Boundary semantics use strict > (lag == 120 is fresh; lag == 180
        is soft); aligned with config.FILING_STALE_SOFT_DAYS=120 and
        FILING_STALE_HARD_DAYS=180.
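
The boundary semantics of the two stale-filing primitives can be illustrated with a short sketch. The signatures follow the ones listed above; the bodies are an assumption about how the strict `>` comparison is applied, not a copy of applicability.py.

```python
from datetime import date
from typing import Literal, Optional

FILING_STALE_SOFT_DAYS = 120
FILING_STALE_HARD_DAYS = 180

LagStatus = Literal["fresh", "soft", "hard", "unknown"]


def filing_lag_days(filing_date: Optional[date], asof: date) -> Optional[int]:
    """Days between the filing date and the as-of date; None when the date is missing."""
    if filing_date is None:
        return None
    return (asof - filing_date).days


def stale_filing_status(lag_days: Optional[int]) -> LagStatus:
    """Strict > at both thresholds: 120 is still fresh, 180 is still soft."""
    if lag_days is None:
        return "unknown"
    if lag_days > FILING_STALE_HARD_DAYS:
        return "hard"
    if lag_days > FILING_STALE_SOFT_DAYS:
        return "soft"
    return "fresh"
```

For example, a filing dated 2026-01-10 evaluated on 2026-05-10 has a lag of exactly 120 days and is still classified as fresh.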

- compute/valuation/__init__.py: re-export the new public surface
  (MethodApplicability, LagStatus, all 6 check_* functions, both
  stale-filing primitives) alongside the existing tangible_book exports.

- tests/test_valuation/test_applicability.py: 44 cases covering
  - DCF: financials/utilities exclusion, FCF median, missing shares,
    hard-stale skip, partial None FCF list, all-None FCF list (8 cases)
  - Graham: positive inputs, negative EPS, None tbvps, negative tbvps,
    hard-stale (5 cases)
  - RIM: financials OK + ROE > Ke, ROE < Ke value-trap-risk (with exact
    reason string), ROE == Ke strict-inequality, None ROE, None tbvps,
    hard-stale, default cost_of_equity from config (7 cases)
  - Multiples P/E: positive inputs, negative eps, None peer median,
    hard-stale (4 cases)
  - Multiples P/B: positive inputs, negative bvps, hard-stale (3 cases)
  - Multiples EV/EBITDA: financials skip, IT applicable, negative ebitda,
    hard-stale, utilities NOT excluded (5 cases)
  - Stale-filing primitives: 4 status branches + 4 boundary cases
    (120/121/180/181) + filing_lag_days None and 120-day calc (10 cases)
  - Invariants: MethodApplicability state validation; SKIP_REASONS no
    duplicates; "stale_filing_hard" + value-trap reason both present (2 cases)

Verification: ruff clean, pytest 188 passed (+44 applicability) /
8 skipped, 0 failed. Frontend tsc --noEmit exit 0.

Reason taxonomy stable (14 identifiers in SKIP_REASONS); Step 5
ensemble will string-match against these to surface
fair_price.methods.<method>.reason in StockDetail JSON.

Defense scorecard at end of Step 4.1: unchanged from Step 3 (gates
exist; flags fire only when Step 5 wires them into ensemble output).
Second sub-step of Step 4 (valuation methods). Pure-function module — no
production wiring yet (Step 5 ensemble consumes); no schema changes.

- compute/valuation/graham.py:
  * graham_fair_price(*, eps_3y_avg, tangible_book_value_per_share,
    lag_status) → tuple[float | None, MethodApplicability]
  * Uses Graham's canonical 22.5 multiplier (15 P/E × 1.5 P/B). NOT
    relocated to config — it's the textbook value, not a tuning knob.
  * Wraps check_graham_applicability from Step 4.1; on skip returns
    (None, applicability) with one of the stable reason identifiers
    (non_positive_eps_3y_avg / non_positive_or_missing_tangible_book /
    stale_filing_hard).
  * Soft + unknown lag_status are permissive — Graham computes; Step 5
    ensemble will append valuation_warnings separately.
  * Defensive post-gate math sanity check (mathematically unreachable
    given the gate, but kept for runtime safety on potential gate
    regressions). Uses module-local _POST_GATE_PRODUCT_NON_POSITIVE
    identifier rather than polluting the public SKIP_REASONS taxonomy.

- compute/valuation/__init__.py: re-export graham_fair_price.

- compute/features/value.py: graham_number() docstring updated with
  cross-reference to the new tangible-book-aware variant. NO logic
  change to the pillar function — it intentionally retains TTM EPS +
  reported BVPS for cross-sectional ranking responsiveness (kickoff §B4
  "intentional dual implementation").

- tests/test_valuation/test_graham.py: 16 cases covering
  - kickoff cases (1)-(13): synthetic golden values (3 deterministic),
    skip cases for negative/None/zero EPS, None/negative TBVPS,
    hard-stale; soft + unknown stale pass-through; sqrt round-trip
    stability; multiplier-ratio invariant
  - Bonus: both-inputs-None → EPS reason wins (applicability ordering)
  - (14) @network AAPL EDGAR — lenient: if TBVPS positive, Graham
    must compute and fair²/(22.5*eps*tbvps) ≈ 1; if TBVPS None,
    applicability skips with tangible-book reason
  - (15) @network breadth check — 5 reference tickers (KO/PG/MSFT/JNJ/MMM);
    require ≥3/5 produce positive Graham values
  - Module public-surface invariant

Verification: ruff clean, pytest 204 passed (+16 graham) / 10 skipped
(+2 new network), 0 failed. Frontend tsc --noEmit exit 0.

Local spot-check on cached AAPL snapshot:
  EPS_diluted (3y-avg proxy) = $4.85
  TBVPS = $7.26
  Graham fair price = $28.15 (= sqrt(22.5 × 4.85 × 7.26) = sqrt(792))
This is the Graham defensive-investor floor; AAPL trades at ~$200, so
Graham would say AAPL is far above its conservative anchor — correct
diagnosis for a high-multiple growth name. Spot-check confirms math.
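
The spot-check arithmetic can be reproduced with a small sketch. The signature and reason identifiers are taken from this commit message; the gate is inlined here instead of calling check_graham_applicability, and the position of the hard-stale check relative to the input checks is an assumption.

```python
import math
from dataclasses import dataclass
from typing import Optional

GRAHAM_MULTIPLIER = 22.5  # 15 P/E x 1.5 P/B, the textbook value


@dataclass(frozen=True)
class MethodApplicability:
    applicable: bool
    reason: Optional[str] = None

    def __post_init__(self):
        # reason must be None exactly when the method is applicable
        if self.applicable != (self.reason is None):
            raise ValueError("reason must be None iff applicable is True")


def graham_fair_price(*, eps_3y_avg, tangible_book_value_per_share, lag_status):
    """Return (fair price or None, MethodApplicability)."""
    if lag_status == "hard":  # soft and unknown are permissive
        return None, MethodApplicability(False, "stale_filing_hard")
    # EPS is checked first, so the EPS reason wins when both inputs are missing.
    if eps_3y_avg is None or eps_3y_avg <= 0:
        return None, MethodApplicability(False, "non_positive_eps_3y_avg")
    if tangible_book_value_per_share is None or tangible_book_value_per_share <= 0:
        return None, MethodApplicability(
            False, "non_positive_or_missing_tangible_book"
        )
    value = math.sqrt(GRAHAM_MULTIPLIER * eps_3y_avg * tangible_book_value_per_share)
    return value, MethodApplicability(True)


price, status = graham_fair_price(
    eps_3y_avg=4.85, tangible_book_value_per_share=7.26, lag_status="fresh"
)
print(round(price, 2), status.applicable)  # 28.15 True
```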

Defense scorecard at end of Step 4.2: unchanged (graham is a fair-price
method, not a defense flag — the goodwill_heavy_flag from Step 3 is
what surfaces tangible-book concerns to users via Step 5 ensemble).
…fair price

Third sub-step of Step 4. Pure-function module with the 4-tier peer-median
walk + 5/95 winsorization + 3 multiples methods. No production wiring
(Step 5 ensemble consumes); no schema changes.

- compute/valuation/multiples.py:
  * PeerTierUsed StrEnum: SUB_INDUSTRY → INDUSTRY → SECTOR →
    BROAD_EX_FIN_UTIL → INSUFFICIENT (5 values; first 4 are walked
    in priority order, last signals all 4 tiers fell short).
  * PeerMedian frozen dataclass: median, tier_used, peer_count,
    tier_thresholds_tried (per-tier audit trail for Step 5/UI).
  * compute_peer_medians(*, tickers_by_tier, metric_values,
    target_ticker, min_peers=config.MULTIPLES_MIN_PEERS) — pure
    function. Walks tiers; per tier filters target + None +
    non-positive, requires ≥ min_peers, winsorizes 5/95, takes
    median. Returns PeerMedian with full audit trail. Does NOT
    fetch data or classify GICS — Step 5 ensemble owns that.
  * _linear_percentile + _winsorize_5_95 helpers using stdlib only
    (no numpy dependency). Matches numpy.percentile default
    interpolation; verified by unit test.
  * multiples_pe_fair_price(*, eps_ttm, peer_pe_median,
    peer_tier_used, lag_status) → tuple[float | None, MethodApplicability].
    INSUFFICIENT tier short-circuits with insufficient_peers_all_tiers;
    otherwise delegates gating to check_multiples_pe_applicability.
  * multiples_pb_fair_price — same pattern; takes bvps_REPORTED
    (NOT TBVPS) per Step 4.1 peer-comparability rationale.
    Documented in module + function docstrings.
  * multiples_ev_ebitda_fair_price — extra inputs net_debt +
    shares_outstanding for the per-share conversion. New skip
    reasons: ev_ebitda_net_debt_unknown (when net_debt is None),
    ev_ebitda_negative_equity_post_debt (when EV − net_debt ≤ 0).
    Reuses missing_shares_outstanding from existing taxonomy.
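
A stdlib-only sketch of the tier walk with 5/95 winsorization follows. It is simplified relative to the description above: the audit trail (tier_thresholds_tried) is omitted, the enum string values are invented for illustration, and `class PeerTierUsed(str, Enum)` is used in place of StrEnum so the sketch also runs on Python versions before 3.11.

```python
import math
import statistics
from dataclasses import dataclass
from enum import Enum
from typing import Optional

MULTIPLES_MIN_PEERS = 8


class PeerTierUsed(str, Enum):
    SUB_INDUSTRY = "sub_industry"
    INDUSTRY = "industry"
    SECTOR = "sector"
    BROAD_EX_FIN_UTIL = "broad_ex_fin_util"
    INSUFFICIENT = "insufficient"


TIER_ORDER = (
    PeerTierUsed.SUB_INDUSTRY,
    PeerTierUsed.INDUSTRY,
    PeerTierUsed.SECTOR,
    PeerTierUsed.BROAD_EX_FIN_UTIL,
)


@dataclass(frozen=True)
class PeerMedian:
    median: Optional[float]
    tier_used: PeerTierUsed
    peer_count: int


def linear_percentile(sorted_vals, pct):
    """Percentile with linear interpolation between closest ranks."""
    rank = pct / 100 * (len(sorted_vals) - 1)
    lo = math.floor(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (rank - lo)


def winsorize_5_95(values):
    """Clip every value into the [5th, 95th] percentile band."""
    s = sorted(values)
    lo, hi = linear_percentile(s, 5), linear_percentile(s, 95)
    return [min(max(v, lo), hi) for v in values]


def compute_peer_median(*, tickers_by_tier, metric_values, target_ticker,
                        min_peers=MULTIPLES_MIN_PEERS):
    """Walk tiers in priority order; the first tier with enough usable peers wins."""
    for tier in TIER_ORDER:
        peers = [metric_values.get(t) for t in tickers_by_tier.get(tier, [])
                 if t != target_ticker]            # never compare a stock to itself
        usable = [v for v in peers if v is not None and v > 0]
        if len(usable) >= min_peers:
            return PeerMedian(statistics.median(winsorize_5_95(usable)),
                              tier, len(usable))
    return PeerMedian(None, PeerTierUsed.INSUFFICIENT, 0)
```

When the sub-industry tier holds only three usable peers, the walk falls through to the industry tier; if no tier reaches min_peers the result carries INSUFFICIENT, which the three multiples methods turn into insufficient_peers_all_tiers.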

- compute/valuation/applicability.py: SKIP_REASONS extended with 3
  new identifiers (ev_ebitda_negative_equity_post_debt,
  ev_ebitda_net_debt_unknown, insufficient_peers_all_tiers). Total
  taxonomy: 17 stable identifiers.
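
To show when the two new EV/EBITDA identifiers fire, here is a hedged sketch of the per-share conversion. The function name, the order of the checks, and the plain (value, reason) return shape are assumptions of this sketch; positive EBITDA and a positive peer median are treated as preconditions already enforced by check_multiples_ev_ebitda_applicability.

```python
def ev_ebitda_per_share(*, ebitda, peer_ev_ebitda_median, net_debt, shares_outstanding):
    """Return (per-share value or None, skip reason or None).

    Hypothetical helper: assumes ebitda and peer_ev_ebitda_median are positive.
    """
    if net_debt is None:
        # Never coerce unknown net debt to zero.
        return None, "ev_ebitda_net_debt_unknown"
    if not shares_outstanding or shares_outstanding <= 0:
        return None, "missing_shares_outstanding"
    implied_ev = peer_ev_ebitda_median * ebitda
    equity_value = implied_ev - net_debt      # negative net debt (cash-rich) adds value
    if equity_value <= 0:
        return None, "ev_ebitda_negative_equity_post_debt"
    return equity_value / shares_outstanding, None


# EBITDA 100, peer multiple 10x, net debt 200, 40 shares -> (1000 - 200) / 40
print(ev_ebitda_per_share(ebitda=100.0, peer_ev_ebitda_median=10.0,
                          net_debt=200.0, shares_outstanding=40.0))  # (20.0, None)
```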

- compute/valuation/__init__.py: re-export PeerMedian, PeerTierUsed,
  compute_peer_medians, and the 3 multiples_*_fair_price methods.

- tests/test_valuation/test_multiples.py: 32 cases covering
  - Group A (10 cases): peer-tier walk for all 4 tiers, INSUFFICIENT
    fallback, target exclusion, winsorization, None/zero/negative
    filtering, plus _linear_percentile + _winsorize_5_95 unit tests.
  - Group B (5 cases): P/E method full path including INSUFFICIENT
    short-circuit.
  - Group C (4 cases): P/B method, with explicit assertion that the
    function takes bvps_REPORTED (NOT tbvps).
  - Group D (8 cases): EV/EBITDA simple, cash-rich-negative-net-debt,
    high-leverage-negative-equity, Financials sector skip, missing
    EBITDA / net_debt / shares, INSUFFICIENT tier.
  - Group E (5 cases): tuple-shape consistency across all 3 methods,
    PeerMedian frozen+expected fields, new SKIP_REASONS additions,
    public-surface re-export, MULTIPLES_MIN_PEERS=8 invariant.

Verification: ruff clean, pytest 236 passed (+32 multiples) /
10 skipped, 0 failed. Frontend tsc --noEmit exit 0.

Reason taxonomy after Step 4.3: 17 stable identifiers
(was 14 after Step 4.1). Step 4.4 (rim.py) will not add to taxonomy;
Step 4.5 (dcf.py) may add 1-2 for terminal-g internal cases.
…ith TBVPS

Fourth sub-step of Step 4. Pure-function module — no production wiring
(Step 5 ensemble consumes); no schema changes.

- compute/config.py: add RIM_FORECAST_YEARS=5 (explicit forecast horizon).

- compute/valuation/rim.py:
  * rim_fair_price(*, tangible_book_value_per_share, avg_3y_roe,
    cost_of_equity=config.COST_OF_EQUITY,
    forecast_years=config.RIM_FORECAST_YEARS,
    lag_status) → tuple[float | None, MethodApplicability]
  * Implements Penman 2013 §5 RIM:
      V_0 = B_0 + Σ_{t=1..N} [(ROE − Ke) × B_{t−1}] / (1 + Ke)^t
    with constant-ROE forecast and full-retention book accumulation
    (B_t = B_{t−1} × (1 + ROE)). Zero terminal value beyond year N
    (conservative — Penman's terminal-RIM extensions are empirically
    fragile; truncating aligns with annotate-and-veto principle).
  * **Critical input choice — TBVPS, NOT BVPS_reported** (opposite of
    P/B in Step 4.3). Module docstring documents the rationale: P/B is
    peer-relative and demands consistent treatment on both sides; RIM
    is single-stock with no peer comparison, so the most conservative
    starting equity base is appropriate.
  * Defensive overflow guard: caps book value at 1e15 during iterative
    compounding. Plausible inputs (ROE up to 200%) over default 5y
    horizon stay finite (3^5 = 243× headroom). Pathological inputs
    (ROE=1000% × 50y) trigger the rim_book_value_overflow reason —
    module-local identifier per Step 4.2's _POST_GATE pattern.
  * Module docstring documents the "full retention" simplification and
    its conservative-for-fair-price implication; Phase 4+ may extend
    with payout_ratio.
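
The truncated RIM recursion can be sketched in a few lines. The function name and return shape are illustrative, and the ROE > Ke gate is assumed to have been applied by the caller. The reference inputs B_0 = 10, ROE = 15%, Ke = 10% are not stated in the commit; they are inferred because they reproduce both reference values quoted in the test list (12.4889 for 5 years, 10.4545 for 1 year).

```python
def rim_value(*, tbvps, roe, cost_of_equity, forecast_years=5, book_cap=1e15):
    """V_0 = B_0 + sum_{t=1..N} (ROE - Ke) * B_{t-1} / (1 + Ke)^t, zero terminal value.

    Returns (value or None, reason or None). Hypothetical helper name.
    """
    book = tbvps           # B_{t-1}, starting at B_0 = TBVPS
    value = tbvps
    for t in range(1, forecast_years + 1):
        residual_income = (roe - cost_of_equity) * book
        value += residual_income / (1 + cost_of_equity) ** t
        book *= 1 + roe    # full retention: B_t = B_{t-1} * (1 + ROE)
        if book > book_cap:
            return None, "rim_book_value_overflow"
    return value, None


v5, _ = rim_value(tbvps=10.0, roe=0.15, cost_of_equity=0.10, forecast_years=5)
v1, _ = rim_value(tbvps=10.0, roe=0.15, cost_of_equity=0.10, forecast_years=1)
print(round(v5, 4), round(v1, 4))  # 12.4889 10.4545
```

With ROE = 200% over 5 years the book value grows by 3^5 = 243× and stays far below the cap, while ROE = 1000% over 50 years crosses 1e15 within the first 15 iterations and returns the overflow reason.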

- compute/valuation/__init__.py: re-export rim_fair_price.

- tests/test_valuation/test_rim.py: 18 cases plus 2 @network (group H):
  - A (4): math correctness — hand-calculated 5-year reference yielding
    V_0 ≈ 12.4889; boundary ROE==Ke; tiny-spread bounds; high-ROE
    compounder bounds.
  - B (4): applicability delegated — ROE<Ke, None ROE, None TBVPS,
    hard-stale.
  - C (2): soft + unknown stale pass-through.
  - D (2): plausible-extreme ROE=200% × 5y stays finite; truly
    pathological ROE=1000% × 50y triggers overflow guard.
  - E (2): forecast_years longer → higher V_0; forecast_years=1
    boundary (V_0 ≈ 10.4545).
  - F (1): lower cost_of_equity → higher V_0.
  - G (3): tuple-shape consistency, config defaults pinned, public
    re-export.
  - H (2): @network AAPL + 5-ticker breadth (KO/PG/MSFT/JNJ/MMM); for
    skipped tickers the reason must be value_trap_risk_roe_below_cost_of_equity
    or non_positive_or_missing_tangible_book — never arbitrary.

Verification: ruff clean, pytest 254 passed (+18 RIM) / 12 skipped
(+2 new network), 0 failed. Frontend tsc --noEmit exit 0.

Local AAPL spot-check on cached snapshot:
  TBVPS = $7.26 (from Step 4.2)
  ROE_proxy = 1.151 (NI / equity — anomalously high due to AAPL's
    massive buybacks shrinking the equity denominator; the actual
    avg_3y_roe will be lower once Step 7 wires the history-based
    3y average through compute/main.py)
  RIM V_0 = $207.60

Compare ensemble inputs at AAPL current price ~$200:
  Graham      = $28   (deep conservative floor; rejects growth names)
  RIM         = $207  (nearly market — high ROE drives big residual)
  Multiples   = TBD (Step 5 will compute from peer panel)
  DCF         = TBD (Step 4.5)

This dispersion is exactly what the ensemble median + max + outlier-5×
guard (Step 5) is designed to reconcile. Spot-check confirms math is
producing economically sensible values, NOT a uniform anchor.

Reason taxonomy after Step 4.4: still 17 stable identifiers (no public
additions). Module-local _RIM_BOOK_OVERFLOW joins Step 4.2's
_POST_GATE_PRODUCT_NON_POSITIVE as defense-in-depth without taxonomy
pollution.
…nal-g cap

Final sub-step of Step 4. Pure-function module — no production wiring
(Step 5 ensemble consumes); no schema changes.

Step 4 valuation methods are NOW COMPLETE — Graham + Multiples (P/E,
P/B, EV/EBITDA) + RIM + DCF. Step 5 will orchestrate all 6 methods
(4 families) into the ensemble, apply outlier guard + stale-filing
handling, and write StockDetail.fair_price.

- compute/config.py: no new constants (DISCOUNT_RATE, TERMINAL_GROWTH,
  DCF_FORECAST_YEARS already added in Step 1).

- compute/valuation/applicability.py: SKIP_REASONS extended with 3
  new identifiers emitted by dcf.py (terminal_g_unsafe_g_too_close_to_wacc,
  dcf_net_debt_unknown, dcf_negative_equity_post_debt). Total taxonomy:
  20 stable identifiers (was 17 after Step 4.3).

- compute/valuation/dcf.py:
  * dcf_fair_price(*, sector, fcf_5y, shares_outstanding, net_debt,
    lag_status, wacc=DISCOUNT_RATE, terminal_growth=TERMINAL_GROWTH,
    forecast_years=DCF_FORECAST_YEARS) → tuple[float | None, MethodApplicability]
  * Two-stage: 5y flat-FCF explicit forecast + Gordon-growth terminal.
  * **Defense #5 — terminal-g HARD cap** validated against BOTH:
      - config.TERMINAL_GROWTH (0.03 long-run nominal-GDP cap, Damodaran)
      - WACC − 0.01 (100bp math-safety buffer)
    Either cap exceeded → terminal_g_unsafe_g_too_close_to_wacc skip.
  * Net-debt MANDATORY: None → dcf_net_debt_unknown skip. Will not
    silently coerce to zero (would materially overstate equity per
    share for leveraged firms).
  * Negative equity post-debt → dcf_negative_equity_post_debt skip.
  * FCF normalization: median of POSITIVES only (the gate uses median
    of ALL finite values; this is a refinement not a relaxation —
    documented in module docstring as the "conservative anchor"
    choice).
  * Defensive _DCF_NO_VALID_FCF_POST_GATE module-local identifier for
    the mathematically unreachable empty-positives branch (same pattern
    as Step 4.2's _POST_GATE_PRODUCT_NON_POSITIVE).
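
A sketch of the two-stage calculation with the terminal-g cap follows. The function name and return shape are illustrative, WACC is passed explicitly because the value of DISCOUNT_RATE is not stated here, and sector, stale-filing, and shares gating are assumed to be handled by the caller. The terminal value form FCF × (1 + g) / (WACC − g) is an inference: it is the form that reproduces the ~$129.27 reference value from the test list below.

```python
import statistics

TERMINAL_GROWTH_CAP = 0.03   # config.TERMINAL_GROWTH per the bullet above
WACC_BUFFER = 0.01           # 100bp math-safety buffer


def dcf_per_share(*, fcf_5y, shares_outstanding, net_debt, wacc,
                  terminal_growth=0.03, forecast_years=5):
    """Return (per-share value or None, skip reason or None). Hypothetical helper."""
    if terminal_growth > min(TERMINAL_GROWTH_CAP, wacc - WACC_BUFFER):
        return None, "terminal_g_unsafe_g_too_close_to_wacc"
    if net_debt is None:
        return None, "dcf_net_debt_unknown"   # never silently coerce to zero
    positives = [f for f in fcf_5y if f is not None and f > 0]
    if not positives:
        # The applicability gate is assumed to make this branch unreachable.
        raise ValueError("no positive FCF values; gate should have skipped")
    base_fcf = statistics.median(positives)   # median of positives only
    explicit = sum(base_fcf / (1 + wacc) ** t for t in range(1, forecast_years + 1))
    terminal = base_fcf * (1 + terminal_growth) / (wacc - terminal_growth)
    enterprise_value = explicit + terminal / (1 + wacc) ** forecast_years
    equity_value = enterprise_value - net_debt
    if equity_value <= 0:
        return None, "dcf_negative_equity_post_debt"
    return equity_value / shares_outstanding, None


value, reason = dcf_per_share(fcf_5y=[100e6] * 5, shares_outstanding=10e6,
                              net_debt=0.0, wacc=0.10)
print(round(value, 2), reason)  # 129.27 None
```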

- compute/valuation/__init__.py: re-export dcf_fair_price.

- tests/test_valuation/test_dcf.py: 24 cases plus 2 @network (group I):
  - A (4): math correctness — A1 hand-calculated reference
    (FCF=100M flat × 5y, WACC=10%, g=3%, no debt, 10M shares
    → per share ~$129.27 within 5¢); higher WACC → lower value;
    g ordering monotonic; net-debt impact (positive subtracts,
    negative adds — exact arithmetic verified).
  - B (3): terminal-g cap — g=0.09 with WACC=0.10 fails (0.09 > 0.03);
    default g=0.03 safe; pathological g=0.15 fails.
  - C (5): applicability delegated — Financials/Utilities skip,
    negative FCF median, None shares, hard-stale.
  - D (3): edge cases — high-leverage negative equity skip, None
    net_debt skip, zero net_debt = EV.
  - E (2): soft + unknown lag pass-through.
  - F (1): N=10 vs N=5 modest change (<20% relative).
  - G (2): mixed +/- FCF uses positives-only median; all-zero gate skip.
  - H (4): tuple shape, config defaults, public re-export, new
    SKIP_REASONS taxonomy entries.
  - I (2): @network AAPL + 5-ticker breadth (skipped without --run-network).

Verification: ruff clean, pytest 278 passed (+24 DCF) /
14 skipped (+2 new network), 0 failed. Frontend tsc --noEmit exit 0.

Local AAPL 3-method dispersion (cached snapshot):
  TBVPS = $7.26
  EPS = $4.85
  ROE proxy = 1.151 (buyback-shrunken-equity artifact)
  FCF TTM = $129.17B
  Net debt = -$45.57B (cash-rich)
  Shares = 14.67B

  Graham:    $28.15  (deep conservative floor — rejects growth names)
  RIM:       $207.60 (high ROE → big residual income contribution)
  DCF:       $116.95 (fcf×3.79 + terminal/1.61, conservative no-growth)
  Multiples: TBD (Step 5 ensemble computes peer panel)
  Market:    ~$200

The 3-method spread ($28 / $117 / $208) is exactly the "healthy
divergence" pattern that makes ensemble-median + outlier-guard
the right Step 5 architecture. Median of (28, 117, 208) = 117;
once Multiples lands in Step 5 (likely $150-200 range), the median
will land in the $115-150 band — meaningful, not noise.

Reason taxonomy after Step 4.5: 20 stable identifiers
(was 17 after Step 4.3). This is the FINAL state for PR-3c — Step 5
ensemble does NOT add new reasons (it pattern-matches against the
20 existing ones).
Largest single sub-step in PR-3c. Ensemble orchestrates the 6 fair-price
methods (Step 4) into the user-facing fair_price object; applies Defense
#3 (stale-filing hard/soft), Defense #4 (multi-method outlier guard), and
Defense #2 annotations (goodwill_heavy + value_trap_risk). Writer adds the
per-stock history JSON output. Schemas extend StockSummary/StockDetail/
RawMetrics/Metadata with all new fields; types.ts mirrors.

Sub-task 5.1 — compute/valuation/ensemble.py (NEW):
  * EnsembleResult + FairPriceMethodResult frozen dataclasses
  * METHOD_NAMES = ('graham', 'multiples_pe', 'multiples_pb',
    'multiples_ev_ebitda', 'rim', 'dcf') — 6 keys.
  * compute_fair_price_ensemble(*, ticker, snap, sector, sub_industry,
    industry, current_price, filing_lag_days_value, peer_panels,
    universe_metrics, historical_metrics) → (EnsembleResult, risk_flags).
  * Defense #3 stale: hard → all methods skip + risk_flag
    'stale_filing_hard' returned for caller to merge into the existing
    risk_flags from compute_risk_flags. Soft → annotates
    'stale_filing_soft' in valuation_warnings.
  * Defense #4 outlier: per-method values > 5×current OR < 0.2×current
    excluded from MAX but kept in MEDIAN (robust). Each outlier
    triggers an 'extreme_<method>_estimate' warning.
  * Defense #2 goodwill_heavy: tbvps/bvps_reported < 0.5 →
    'goodwill_heavy' warning.
  * RIM value_trap_risk → 'value_trap_risk' warning when RIM skips
    on ROE<Ke.
  * Aggregation: median = ALL applicable values (robust); max = non-
    outlier max only; low/high = full extremes; mos_pct = (median −
    current) / median × 100, None when current ≤ 0 or median ≤ 0.
  * Helpers: _all_methods_skipped, _classify_outliers, _aggregate_methods,
    _net_debt, _bvps_reported, _convert_peer_panel — all pure for direct
    testability.
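
The aggregation and outlier rules above can be sketched as a single pure function. The function name and the dict return shape are illustrative (the real code returns an EnsembleResult dataclass), and skipping outlier classification when current_price ≤ 0 is an assumption of this sketch.

```python
import statistics

EXTREME_ESTIMATE_HIGH = 5.0
EXTREME_ESTIMATE_LOW = 0.2


def aggregate_methods(method_values, current_price):
    """Median over all applicable values; max over non-outliers only."""
    applicable = {m: v for m, v in method_values.items() if v is not None}
    if not applicable:
        return {"median": None, "max": None, "low": None, "high": None,
                "mos_pct": None, "warnings": []}
    # Strict inequalities: a value exactly at 5x or 0.2x current is not an outlier.
    outliers = [
        m for m, v in applicable.items()
        if current_price > 0 and (v > EXTREME_ESTIMATE_HIGH * current_price
                                  or v < EXTREME_ESTIMATE_LOW * current_price)
    ]
    values = list(applicable.values())
    inliers = [v for m, v in applicable.items() if m not in outliers]
    median = statistics.median(values)       # outliers stay in the median
    mos_pct = None
    if current_price > 0 and median > 0:
        mos_pct = (median - current_price) / median * 100
    return {
        "median": median,
        "max": max(inliers) if inliers else None,
        "low": min(values),
        "high": max(values),
        "mos_pct": mos_pct,
        "warnings": [f"extreme_{m}_estimate" for m in outliers],
    }


result = aggregate_methods(
    {"graham": 28.15, "multiples_pe": None, "multiples_pb": None,
     "multiples_ev_ebitda": None, "rim": 207.60, "dcf": 116.95},
    current_price=200.0,
)
print(result["median"], result["max"], round(result["mos_pct"], 2), result["warnings"])
# 116.95 207.6 -71.01 ['extreme_graham_estimate']
```

The usage lines feed in the three applicable AAPL values from the local spot-check in this commit, with the multiples methods skipped.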

Sub-task 5.2 — compute/output/writer.py extension:
  * write_stock_history(*, ticker, prices_df, output_dir) → bool.
    Slices prices_df.tail(min(252, len)). Outputs column-major JSON
    with NaN→None coercion. Writes to
    output_dir/stocks/history/{TICKER}.json via existing
    atomic_write_json. Returns True on success; caller sets
    StockDetail.has_history accordingly.
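
To illustrate the column-major layout and the NaN→None coercion, here is a stdlib-only sketch. It departs from the real writer in labeled ways: it takes a list of row dicts instead of a pandas prices_df, the column names are assumed, and a temp-file plus os.replace stands in for the existing atomic_write_json helper.

```python
import json
import math
import os
from pathlib import Path

HISTORY_MAX_ROWS = 252  # roughly one trading year
COLUMNS = ("date", "open", "high", "low", "close", "volume")  # assumed key names


def to_column_major(rows):
    """Keep the last 252 rows and pivot them into one list per column."""
    tail = rows[-HISTORY_MAX_ROWS:]

    def clean(value):
        # JSON has no NaN literal, so NaN becomes null.
        return None if isinstance(value, float) and math.isnan(value) else value

    return {col: [clean(row.get(col)) for row in tail] for col in COLUMNS}


def write_stock_history(*, ticker, rows, output_dir) -> bool:
    """Write output_dir/stocks/history/{TICKER}.json; False when there is no data."""
    if not rows:
        return False
    path = Path(output_dir) / "stocks" / "history" / f"{ticker}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.parent / (path.name + ".tmp")
    tmp.write_text(json.dumps(to_column_major(rows), allow_nan=False))
    os.replace(tmp, path)  # atomic swap so readers never see a partial file
    return True
```

Column-major storage repeats each key once per file instead of once per row, which is one plausible reason the per-stock files stay small enough to lazy-load.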

Sub-task 5.3 — schemas.py + types.ts:
  * StockSummary: + valuation_warnings: list[str]
  * StockDetail: + valuation_warnings, has_history, tangible_book_value
  * RawMetrics: + goodwill: float | None
  * Metadata: + mos_trailing_ic_smoke: float | None
  * frontend/lib/types.ts: full mirror — adds FairPriceMethodResult,
    FairPriceEnsemble, StockHistory types, and all the field
    additions above. StockDetail.fair_price typed as
    FairPriceEnsemble | null (was Record<string, unknown>).

- compute/valuation/__init__.py: re-export EnsembleResult,
  FairPriceMethodResult, METHOD_NAMES, compute_fair_price_ensemble.

- tests/test_valuation/test_ensemble.py (NEW): 27 cases covering
  - A (4): aggregation arithmetic — 4 methods no outliers, single
    method, no applicable methods, MoS sign convention
  - B (3): outlier guard — 5×/0.2× boundary semantics, multiple
    outliers, strict-inequality at boundary
  - C (3): stale filing — hard short-circuit + risk_flag, soft
    annotation, fresh no-warning
  - D (2): goodwill_heavy — flag fires below 0.5 ratio, doesn't
    fire above
  - E (2): value_trap_risk — RIM applicable no warning, RIM skip
    → warning
  - F (4): shape invariants — 6 method keys, FairPriceMethodResult
    + EnsembleResult frozen, tier_used None for non-multiples
  - G (3): edge cases — zero/negative current_price → null MoS;
    negative method value triggers outlier guard
  - H (3): full-ensemble integration — IT/Financials/Utilities
    sector handling
  - Helpers: _net_debt arithmetic, _bvps_reported, EXTREME_*
    constants pinned
- tests/test_output/test_writer.py extension: 7 history-writer cases
  covering 252-row slice, shorter input, empty df, None input,
  missing columns, NaN coercion, payload schema keys.

Verification: ruff clean, pytest 312 passed (+34 Step 5) /
14 skipped, 0 failed. Frontend tsc --noEmit exit 0.

Local AAPL ensemble spot-check (current=$200, empty peer panels):
  graham:               $28.15  applicable
  multiples_pe:         SKIP    reason=insufficient_peers_all_tiers
  multiples_pb:         SKIP    reason=insufficient_peers_all_tiers
  multiples_ev_ebitda:  SKIP    reason=insufficient_peers_all_tiers
  rim:                  $207.60 applicable
  dcf:                  $116.95 applicable

  median:               $116.95
  max (excl outlier):   $207.60
  low / high:           $28.15 / $207.60
  MoS:                  -71.01% (overvalued vs median)
  warnings:             ['extreme_graham_estimate']
  risk_flags:           []

Outlier guard works end-to-end: Graham's $28.15 (0.14× current=$200,
below 0.2× floor) is excluded from max but kept in median. Warning
'extreme_graham_estimate' surfaces in valuation_warnings. Step 7
will populate the multiples once peer panels are built cross-
sectionally; expected median lifts into the $150-180 range for AAPL.

Reason taxonomy after Step 5: still 20 stable identifiers. The ensemble
emits 'stale_filing_hard' as a per-method reason when the filing is
hard-stale, but that string is already in SKIP_REASONS from Step 4.1.
…extended

Smallest defense step (~70 LOC module + 250 LOC tests). Extends
Greenblatt 2005's Magic Formula sector exclusion (Financials +
Utilities, where EBIT/EV is meaningless because the balance sheet is
reserves+regulated capital) to ALL Quality pillar metrics that depend
on EBIT, gross profit, or invested capital.

Pre-existing state surveyed before implementation:
- compute/scoring/sector_rules.py did NOT exist (created fresh).
- compute/features/quality.py ROIC and gross_profitability had NO
  sector gating despite the spec's claim of pre-existing exclusions.
- compute/features/profitability.py asset_turnover docstring CLAIMED
  to exclude Financials but the function had no actual gate.
  This step fixes the documentation-stub debt by wiring the actual
  gate at the pillar-wrapper layer.

Design decision — sector gate lives in pillar wrapper, not feature
function:
- Feature functions (compute/features/*) stay sector-agnostic and
  pure. Easier to unit test in isolation.
- Pillar wrappers (compute/scoring/pillars.py) apply the sector
  context. is_metric_excluded_for_sector() is consulted post-compute,
  with the metric value replaced by NaN when the rule fires.
- Existing pillar aggregation (compute.scoring.normalize.average_
  pillar_score with min_coverage=0.5) handles NaN gracefully — drops
  the gated metric, averages survivors. Per SKILL.md Rule 7.
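
To illustrate why NaN replacement is enough at the pillar layer, here is a minimal sketch of a coverage-aware average. It is not the repo's average_pillar_score: the `average_with_coverage` name, the dict input, and returning NaN when coverage falls below `min_coverage` are assumptions made for the example.

```python
import math


def average_with_coverage(scores: dict[str, float],
                          min_coverage: float = 0.5) -> float:
    """Average the non-NaN scores; NaN when too few metrics survive."""
    valid = [v for v in scores.values() if not math.isnan(v)]
    if not scores or len(valid) / len(scores) < min_coverage:
        return math.nan  # assumed behaviour below the coverage floor
    return sum(valid) / len(valid)
```

With one of three Quality metrics gated to NaN, coverage is 2/3 and the pillar score is the mean of the two survivors; with two of three gated, coverage drops to 1/3 and the sketch returns NaN.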

- compute/scoring/sector_rules.py (NEW):
  * SECTOR_BLACKLIST: dict[str, frozenset[str]] with 5 entries:
      magic_formula        → {Financials, Utilities, Real Estate}
      asset_turnover       → {Financials}
      ebit_based_roic      → {Financials, Utilities}     [NEW]
      gross_profitability  → {Financials}                [NEW]
      ev_ebitda_multiple   → {Financials}                [NEW]
  * is_metric_excluded_for_sector(*, metric, sector) → bool.
    Returns False for None sector, missing metric, or non-listed
    sector — never second-guesses callers.

- compute/scoring/pillars.py:
  * _quality_metrics: gate roic for ebit_based_roic; gate
    gross_profitability for gross_profitability rule.
  * _value_metrics: gate ev_ebitda for ev_ebitda_multiple.
  * _profitability_metrics: gate asset_turnover (existing rule, now
    enforced) and gross_p (Profitability pillar's GP/A — same metric
    as Quality.gross_profitability, both must gate).
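
A compact sketch of the lookup and the post-compute gate described above. The five blacklist entries and the `is_metric_excluded_for_sector` signature follow the commit text; the `gate` helper is a hypothetical stand-in for what the pillar wrappers do.

```python
import math
from typing import Optional

SECTOR_BLACKLIST: dict[str, frozenset[str]] = {
    "magic_formula": frozenset({"Financials", "Utilities", "Real Estate"}),
    "asset_turnover": frozenset({"Financials"}),
    "ebit_based_roic": frozenset({"Financials", "Utilities"}),
    "gross_profitability": frozenset({"Financials"}),
    "ev_ebitda_multiple": frozenset({"Financials"}),
}


def is_metric_excluded_for_sector(*, metric: str, sector: Optional[str]) -> bool:
    """False for a None sector, an unknown metric, or a non-listed sector."""
    if sector is None:
        return False
    return sector in SECTOR_BLACKLIST.get(metric, frozenset())


def gate(value: float, *, metric: str, sector: Optional[str]) -> float:
    """Hypothetical wrapper step: replace a gated metric with NaN."""
    if is_metric_excluded_for_sector(metric=metric, sector=sector):
        return math.nan
    return value
```

Keeping the gate outside the feature function means a call such as `gate(roic, metric="ebit_based_roic", sector="Financials")` is the only place sector context enters.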

- tests/test_scoring/test_sector_rules.py (NEW): 11 cases:
  - A1-A4: lookup primitives (existing entry → True; sector not
    blacklisted, metric not blacklisted, and None sector → False).
  - 5 spec-coverage assertions (each canonical key + its blacklist set).
  - REIT NOT excluded from ebit_based_roic (Phase 4 will add FFO).
  - All blacklist values are frozensets (immutable contract).

- tests/test_scoring/test_pillars.py extension: 14 cases:
  - B (3): JPM/NEE Financials/Utilities ROIC → NaN; AAPL IT → finite.
  - C (4): JPM gross_profitability NaN in Quality; same metric in
    Profitability (gross_p) also gated; AAPL finite for both.
  - D (3): JPM ev_ebitda NaN; AAPL finite; Utilities NOT excluded
    (mirrors Step 4.1 fair-price applicability semantics).
  - E (2): asset_turnover JPM NaN, AAPL finite.
  - F (2): pillar score remains finite when some Quality metrics
    are NaN; full universe with Financials does not crash.

Verification: ruff clean, pytest 338 passed (+26 Step 6: 11 sector_rules
+ 14 pillars + 1 fixed import) / 14 skipped, 0 failed. Frontend tsc
--noEmit exit 0.

Reason taxonomy: NO additions. Sector exclusion is a feature-pillar
concern, not a fair-price applicability concern. The Step 4.1
SKIP_REASONS taxonomy already has sector_excluded_financials and
sector_excluded_utilities for the FAIR-PRICE side; those operate on
applicability gates. Step 6 mirrors at the pillar layer with NaN
replacement (different mechanism). Two layers, one canonical sector
taxonomy.

Phase 4 follow-up: add REIT FFO/AFFO substitutes for the Quality
pillar on Real Estate stocks. Currently filed as a post-PR-3c-merge
issue; PR-3c keeps Real Estate fully in the Quality ranking with no
new exclusions.

Defense scorecard at end of Step 6:
- VETOES: 3 (altman_distress + sloan_accruals_top_decile +
            net_issuance_top_decile) — UNCHANGED from Step 2.
- GUARDS: 4 — stale_filing (3a from Step 5), outlier_5x (3a
            from Step 5), terminal_g (3a from Step 4.5),
            sector_exclusion (NEW this step, pillar layer).
- ANNOTATE: 2 active (goodwill_heavy, value_trap_risk; both from
            Step 5 ensemble) + 1 hidden (extreme_*_estimate; emitted
            by ensemble outlier guard).
…compute/main.py

Step 7 is THE BIG STEP per the PR-3c plan: production rankings.json now
contains fair_price values for the first time. Spot-checks scheduled for
Step 12 will verify behavior is sane.

Changes
-------

compute/main.py
- Add cross-sectional builders (one pass over the whole universe before
  the per-ticker loop):
    * _build_universe_metrics — per-ticker P/E TTM, P/B reported,
      EV/EBITDA TTM (feeds compute_peer_medians).
    * _build_peer_groupings — sub_industry / sector / broad-ex-Fin-Util
      tier dicts for the 4-tier peer-median walk. The "industry" GICS
      level-2 tier receives an empty list since Wikipedia exposes only
      level-1 + level-3 — this falls through to sector by design.
    * _build_historical_metrics — per-ticker eps_3y_avg, avg_3y_roe,
      fcf_5y from the annual fundamentals history. avg_3y_roe uses the
      current-period equity denominator (Phase 4 follow-up: backfill
      historical equity for true per-year ROE).
    * _filing_lag — days between asof and snapshot.latest_filed_date.
- Combine the previously separate StockSummary and StockDetail loops
  into a single per-ticker pass that:
    * computes the fair-price ensemble (compute_fair_price_ensemble),
    * merges its returned risk_flags (e.g. stale_filing_hard) into the
      existing risk_flags dict from compute_risk_flags,
    * writes the per-stock 1y price-history JSON via write_stock_history,
    * populates StockSummary.{fair_price, max_fair_price,
      margin_of_safety_pct, valuation_warnings},
    * populates StockDetail.{fair_price (full ensemble dict),
      valuation_warnings, has_history, tangible_book_value}.
- Update _build_raw_metrics to populate goodwill on RawMetrics.
- Add Metadata.mos_trailing_ic_smoke = None placeholder (Step 8 will
  compute the actual sanity-check value).
- Best-effort psutil RSS log at end of run (try/except ImportError so
  production keeps working without the optional dep).

compute/valuation/ensemble.py
- New ensemble_result_to_dict(r: EnsembleResult) -> dict serializer
  whose shape mirrors the FairPriceEnsemble TypeScript type. The
  returned valuation_warnings list is a copy (mutation-safe).

compute/valuation/__init__.py
- Re-export ensemble_result_to_dict.

tests/test_main.py (new, 21 cases)
- Cover _filing_lag, _build_universe_metrics, _build_peer_groupings,
  _build_historical_metrics, _eps_3y_avg, _avg_3y_roe, _fcf_5y.
- Lock the contract of the cross-sectional builders so future refactors
  surface input/output-shape changes loudly.

tests/test_valuation/test_ensemble.py (+3 cases I1/I2/I3)
- Verify ensemble_result_to_dict shape matches the TS type, handles the
  all-null case (every method skipped), and returns a defensive copy of
  valuation_warnings.

Verification
------------
- ruff check passes on all changed files.
- pytest tests/ -m "not network": 364 passed (was 343 before — +21 new
  test_main cases, +3 ensemble I-series cases; -3 reflects no removals).
- npx tsc --noEmit: clean.

What's NOT in this commit
-------------------------
- Real-data spot-checks (Step 12).
- mos_trailing_ic_smoke computation (Step 8).
- Schema snapshot guard (Step 9).
- Frontend PriceHistoryChart wiring (Step 10).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…(Defense #7)

Step 7 production verification on commit c13e4f7 (workflow #9) showed the
existing Phase-3 fundamentals ingest layer corrupts shares_outstanding for
~11 S&P 500 tickers (CVNA, HOOD, DDOG, CRWD, VTRS, BKR, PSKY, AMCR, SPG,
CHTR, RTX). The bug propagates into TBVPS / market_cap / multiples and
surfaced as user-visible nonsense — most prominently BKR fair_price = $105M
inside the production Top-5.

Step 7.5 adds an ensemble-level sanity ceiling so corrupted inputs produce
a clean null + warning instead of garbage. The upstream ingestion bug is
tracked separately (issue draft staged at /tmp/issue_drafts/) and will be
fixed in Phase 4.

Changes
-------

compute/config.py
- Add FAIR_PRICE_DATA_QUALITY_CEILING = 10000.0. Rationale: no S&P 500
  stock has a sensible per-share fair price > $10,000 (BRK-A trades
  ~$700K but is not in the index; BRK-B is). Any method value above
  this ceiling is treated as evidence of corrupted inputs.

compute/valuation/applicability.py
- Add data_quality_input_corruption to SKIP_REASONS taxonomy
  (20 -> 21 entries). Stable identifier surfaced via
  StockDetail.fair_price.methods.<method>.reason and inside
  StockDetail.valuation_warnings.

compute/valuation/ensemble.py
- New _has_corrupt_input(methods) helper: returns True iff any
  applicable method produced a value > the ceiling (strict >).
- New _data_quality_corrupt_result(methods) helper: builds the
  all-null EnsembleResult, preserving tier_used on the multiples
  methods for diagnostics, with a single
  data_quality_input_corruption warning.
- Wire the sanity sweep into compute_fair_price_ensemble after the 6
  methods compute but BEFORE outlier classification + aggregation. No
  risk_flags are appended (data quality is an upstream-ingest concern,
  not a ranking veto).
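
The sanity sweep can be sketched as below. `MethodResult` is a simplified stand-in for FairPriceMethodResult, and the two helpers mirror `_has_corrupt_input` and `_data_quality_corrupt_result` only in spirit; the field names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

FAIR_PRICE_DATA_QUALITY_CEILING = 10000.0  # value from compute/config.py above


@dataclass(frozen=True)
class MethodResult:  # simplified stand-in, field names assumed
    value: Optional[float]
    applicable: bool
    reason: Optional[str] = None
    tier_used: Optional[str] = None


def has_corrupt_input(methods: dict[str, MethodResult]) -> bool:
    """True iff any applicable method exceeds the ceiling (strict >)."""
    return any(
        m.applicable and m.value is not None
        and m.value > FAIR_PRICE_DATA_QUALITY_CEILING
        for m in methods.values()
    )


def corrupt_result(methods: dict[str, MethodResult]) -> dict[str, MethodResult]:
    """Null every method, keeping tier_used for diagnostics."""
    return {
        name: MethodResult(None, False, "data_quality_input_corruption", m.tier_used)
        for name, m in methods.items()
    }
```

Running the check before outlier classification matters: a $105M value would otherwise only be flagged as an extreme estimate and would still feed the median.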

frontend/lib/types.ts
- Document the reason taxonomy on FairPriceMethodResult, including the
  new data_quality_input_corruption entry. No schema shape change —
  it's a string member of the existing union.

tests/test_valuation/test_ensemble.py (+4 cases)
- test_data_quality_sanity_guard_triggers_on_extreme_method_value:
  one method at $10,001 nulls all 6 + emits warning + empty risk_flags
  + preserves tier_used.
- test_data_quality_guard_boundary_exactly_at_ceiling: $10,000 exactly
  does NOT trip (strict >).
- test_data_quality_guard_skipped_methods_dont_trigger: applicable=False
  methods bypass the check.
- test_data_quality_guard_end_to_end_via_full_ensemble: integration
  through compute_fair_price_ensemble with a corrupted snapshot
  (shares_outstanding=10) yields the canonical all-null payload.

Verification
------------
- ruff check passes on all changed files.
- pytest tests/ -m "not network": 368 passed (was 364 -> +4 new).
- npx tsc --noEmit: clean.
- All 9 F1 tickers (median > 10x current) from the Step 7 verification
  report should now show fair_price=null with the new warning. Will
  re-confirm via workflow_dispatch re-trigger.

Issue drafts staged (not yet filed)
-----------------------------------
/tmp/issue_drafts/issue_shares_outstanding_bug.md
/tmp/issue_drafts/issue_avg_3y_roe_denominator.md
/tmp/issue_drafts/issue_mos_display_clamping.md

What's NOT in this commit
-------------------------
- Fix to compute/ingest/fundamentals.py (deferred to Phase 4 — issue
  draft staged).
- Fix to compute/main.py::_avg_3y_roe denominator (Phase 4 follow-up).
- Frontend mos_pct clamping (Step 10).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Adds a single cross-sectional sanity smoke test that runs once per
weekly compute and surfaces a number on Metadata.mos_trailing_ic_smoke.
This is NOT a backtest — it's a same-day Spearman rank correlation
between StockSummary.margin_of_safety_pct and trailing 1-year return,
intended for operators to spot-check that the fair-price ensemble is
producing values with some signal-to-noise relative to historical
price moves. A non-trivial correlation (positive OR negative) is
informative; near-zero says "the field is essentially uncorrelated
with recent return drift."

Changes
-------

compute/scoring/sanity.py (NEW, ~95 LOC)
- compute_mos_trailing_ic(*, rankings, prices_by_ticker, lookback_days=252)
  -> float | None
- Skips per-ticker:
    * mos_pct is None (ensemble couldn't aggregate)
    * data_quality_input_corruption in valuation_warnings
      (Step 7.5 / Defense #7 fired — explicit guard against future
      regressions where the warning is surfaced even with a non-null
      mos_pct)
    * ticker missing from prices_by_ticker
    * len(prices) < lookback_days
    * lookback or trailing close <= 0 (defensive)
- Returns None when:
    * Fewer than MIN_SAMPLE=30 valid pairs after filtering
    * All mos_pct values identical (corr undefined)
    * All trailing returns identical (corr undefined)
    * Computed coefficient is non-finite (defensive)
- Spearman implemented as Pearson on the rank vectors via
  pd.Series.rank() — mathematically identical to scipy.stats.spearmanr
  but doesn't require adding scipy to project dependencies.
- Heavy-tail rationale documented: 143/502 stocks had mos_pct outside
  [-99%, +500%] in Step 7 verification; Pearson would be dominated
  by those outliers.

compute/main.py
- Wire compute_mos_trailing_ic into run_weekly_compute, called once
  after the per-ticker loop builds the summaries list, before
  Metadata construction.
- metadata.mos_trailing_ic_smoke = compute_mos_trailing_ic(...)
- INFO log of the result.

tests/test_scoring/test_sanity_smoke.py (NEW, 16 cases)
- A1/A2/A3: math correctness (perfect rank corr = 1.0; perfect
  inverse = -1.0; random pairs in [-1, 1]).
- B1/B2/B3: sample-size threshold — 29 → None, 30 → float, 100 → float.
- C1/C2: identical mos / identical returns → None.
- C3/C4: None mos / data_quality_input_corruption skipped.
- D1/D2: insufficient prices / missing ticker → skipped.
- E1: zero/negative lookback close → skipped.
- F1: integration — 50-stock universe with 5 corruption-warning
  tickers + 5 None-mos tickers + 40 valid pairs → finite Spearman.
- 2 module-constant regression tests (DEFAULT_LOOKBACK_DAYS=252,
  MIN_SAMPLE=30).

Verification
------------
- ruff check passes on all changed files.
- pytest tests/ -m "not network": 386 passed (was 368 -> +18:
  16 new sanity tests + 2 constant regressions).
- npx tsc --noEmit: clean.

What's NOT in this commit
-------------------------
- Production verification of the actual smoke value — deferred to
  Step 12 final verification along with the rest of the schema.
- METHODOLOGY.md note about smoke-test-not-backtest distinction —
  Step 11 documentation work.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Catches silent schema drift between the Pydantic output schemas in
compute/output/schemas.py and the TypeScript types in
frontend/lib/types.ts at CI time. PR-3c added 6 new fields + 3 new
types; with the guard in place, future drift fails CI instead of
merging silently.

Changes
-------

compute/output/schema_check.py (NEW, ~245 LOC)
- generate_snapshot() emits a deterministic dict
  {ModelName: {field_name: {type, required, default}}}, alphabetized
  at both levels, with human-readable type strings ("float | None",
  "list[str]", "dict | None") and JSON-safe default representations
  (None / scalars verbatim; <required> sentinel; <factory:list> /
  <factory:PillarScores> for default_factory; <repr:...> for
  anything else).
- check_snapshot() compares stored vs fresh, returns
  (in_sync: bool, diff: str | None) with a grouped-by-model diff:
  added / removed / changed fields per model.
- update_snapshot() writes frontend/lib/schema-snapshot.json with
  trailing newline.
- main(argv) — CLI entry point. Exit codes: 0 in-sync, 1 drift, 2
  unexpected error. Prints a human-readable diff + resolution
  instructions on drift.
- Tracks 6 models: DataQuality, Metadata, PillarScores, RawMetrics,
  StockDetail, StockSummary.
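
The comparison step can be sketched over plain dicts, independent of Pydantic. The snapshot shape `{ModelName: {field_name: {type, required, default}}}` comes from the commit text; the diff line format and this standalone `check_snapshot` signature are assumptions for illustration.

```python
from typing import Optional

Snapshot = dict[str, dict[str, dict]]


def check_snapshot(stored: Snapshot, fresh: Snapshot) -> tuple[bool, Optional[str]]:
    """Return (in_sync, diff) with added/removed/changed fields per model."""
    lines: list[str] = []
    for model in sorted(set(stored) | set(fresh)):
        old, new = stored.get(model, {}), fresh.get(model, {})
        added = sorted(set(new) - set(old))
        removed = sorted(set(old) - set(new))
        changed = sorted(f for f in set(old) & set(new) if old[f] != new[f])
        if not (added or removed or changed):
            continue
        lines.append(f"{model}:")
        lines += [f"  + {f}: {new[f].get('type')}" for f in added]
        lines += [f"  - {f}" for f in removed]
        lines += [
            f"  ~ {f}: {old[f].get('type')} -> {new[f].get('type')}"
            for f in changed
        ]
    return (not lines, "\n".join(lines) if lines else None)
```

A CLI wrapper would map `in_sync` to exit code 0 and drift to exit code 1, matching the exit codes listed above.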

frontend/lib/schema-snapshot.json (NEW, 7.6 KB / 374 lines)
- Initial snapshot generated from the schemas as of c346ed5.
- Confirms the 6 new Phase-3c fields are tracked: StockSummary
  fair_price / max_fair_price / margin_of_safety_pct /
  valuation_warnings; StockDetail fair_price / valuation_warnings /
  has_history / tangible_book_value; Metadata mos_trailing_ic_smoke;
  RawMetrics goodwill.

.github/workflows/ci.yml
- New "Schema snapshot guard" step in the python job, between Ruff
  and Pytest. Runs `python -m compute.output.schema_check` — fails
  the job on drift before pytest spends time.

tests/test_output/test_schema_check.py (NEW, 23 cases)
- A1-A5: snapshot generation contract (top-level keys, alphabetical
  ordering, required field-info keys, TRACKED_MODELS canary that
  catches new BaseModel subclasses missing from the registry).
- B1-B2: round-trip semantics (update then check is in-sync;
  mutating the file triggers drift).
- C1-C3: diff message quality (added/removed/changed fields appear
  with the model name; type changes show old → new).
- D1-D4: CLI surface (--update-snapshot writes; no-arg in-sync
  returns 0; drift returns 1 with resolution text; missing file
  surfaces a helpful message).
- E1-E3: critical Phase-3c fields are tracked (regression guard).
- F1: the committed snapshot matches the live schemas (a pytest-side
  mirror of the CI check, so test runs catch drift early too).
- 4 misc: factory/required normalization sentinels, invalid-JSON
  handling, parametrized smoke for both CLI modes.

Manual break/revert verification (Step 9.4)
-------------------------------------------
1. Injected a dummy field into a tracked Pydantic model.
2. Ran `python -m compute.output.schema_check` → exit 1, diff printed
   the model + field name + new type info, plus resolution
   instructions ("update types.ts then run --update-snapshot OR
   revert").
3. Restored schemas.py → re-ran → exit 0, "in sync" message.
4. `diff` confirms schemas.py is byte-identical to the original.

Verification
------------
- ruff check passes on the entire repo.
- pytest tests/ -m "not network": 409 passed (was 386 -> +23 new
  schema_check cases).
- npx tsc --noEmit: clean.
- python -m compute.output.schema_check: exit 0, "in sync".

What's NOT in this commit
-------------------------
- Production rankings.json change (this is a CI-time guard only).
- Frontend updates for Step 10 (PriceHistoryChart, mos clamping
  per Issue 3).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
… chart

First user-visible PR-3c step. Surfaces the new ensemble fields in the
rankings table + stock detail page, adds a lazy-loaded 1y price chart
backed by the per-stock history JSONs from Step 5, and clamps the
display of extreme MoS values per the Step-7 verification report
(Issue 3 acceptance criteria).

Changes
-------

frontend/lib/format.ts (NEW, ~55 LOC)
- formatMosPct(mos | null) → { display, tooltip, isClamped }:
    * null               → "—" (no tooltip)
    * mos < -99          → "< −99%" (tooltip shows raw value)
    * mos > 500          → "> +500%" (tooltip shows raw value)
    * else               → "+50.0%" / "-25.0%" with sign + 1 dp
- formatFairPrice(value | null):
    * null               → "—"
    * value >= 1_000_000 → "—" (defensive against future regressions
                                even after Step 7.5 sanity guard)
    * value < 0.01       → "< $0.01"
    * else               → "$NNN.NN"
- mosColorClass(mos): semantic palette tokens (emerald 700/600 for
  undervalued, slate 500 for near-fair, rose 600 for overvalued).

frontend/components/RankingTable.tsx
- Two new sortable columns: "Fair price" + "MoS".
- Sort comparator now places nulls last (rows without data no longer
  compete for the top of an ascending sort).
- New column-default sort: composite_score / fair_price /
  margin_of_safety_pct → desc; everything else → asc.
- Data-quality flag rendering: stocks with valuation_warnings
  including "data_quality_input_corruption" show "⚠ —" with a
  title tooltip naming the Step 7.5 sanity guard, instead of the
  fair-price number.
- Mobile cards: third row added for "Fair $X · MoS ±Y%".

frontend/components/PriceHistoryChart.tsx (NEW, ~125 LOC)
- "use client" component; lazy fetch via useEffect from the static
  /data/stocks/history/{TICKER}.json files written by Step 5.
- Honors NEXT_PUBLIC_BASE_PATH for sub-path deploys.
- Loading + error + empty states all render at h-64 to prevent
  layout shift on mount.
- Column-major → row-major transformation; null closes are dropped
  (preserves the gap rather than drawing through it).
- Recharts LineChart with monotone interpolation, $-axis ticks,
  tooltip showing date + close to 2 dp.

frontend/components/FairPriceCard.tsx (NEW, ~150 LOC)
- 4-stat headline grid: median fair / margin of safety / max
  (ex-outliers) / tangible BVPS.
- Per-method breakdown table: Graham, P/E, P/B, EV/EBITDA, RIM, DCF.
  Each row shows formatted value or italic "skipped" with the
  reason in the title attribute (so hovering surfaces the
  applicability gate). Multiples methods get a small label
  showing the peer tier used ("vs sub_industry peers" etc).
- Warning chips below the table for each entry in
  valuation_warnings — amber pills, snake_case → spaces.
- Renders gracefully when ensemble is null (snapshot missing) or
  data_quality_input_corruption fires (replaces median/max with
  em-dash).

frontend/app/stock/[ticker]/page.tsx
- Hero block adds an MoS chip below the price (small, color-coded
  via mosColorClass).
- New "Price (1y)" section above the fundamentals: renders
  PriceHistoryChart when detail.has_history is true, otherwise an
  h-64 placeholder.
- New FairPriceCard section between Price and the existing Raw
  fundamentals table.
- Footer note updated from Phase 3b → Phase 3c with one sentence
  describing the 6-method ensemble.

What's NOT in this commit
-------------------------
- Frontend test framework (none configured; CI's `tsc --noEmit`
  + `next build` are the type-correctness guarantees).
- New top-level npm dependencies (recharts already in package.json).
- Any compute/* changes — frontend-only commit.
- Issue 3 polish (sparkbar / shadcn Tooltip primitive) — the
  current implementation uses native title attributes per spec
  ("native title is acceptable for now").

Verification
------------
- `npx tsc --noEmit`: clean.
- `npm run build`: ✓ 506 routes pre-rendered (1 home + 1 not-found
  + 502 stock detail pages + 2 misc), route /stock/[ticker] is
  98.9 kB (recharts + ensemble card; same chunked JS shared across
  all 502 pages).
- `ruff check .`: clean (entire repo).
- `python -m compute.output.schema_check`: in-sync (no schema
  changes, just consumes existing fields).
- `pytest tests/ -m "not network"`: 409 passed (no Python-side
  changes; sanity check that nothing regressed).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Documentation-only commit. No code changes.

PHASE_STATUS.md (+69 lines)
- PR 3c row flipped to ✅ DONE 2026-05-09 with full description: 7
  defenses delivered (the original 6 + Defense #7 added mid-PR at
  Step 7.5), all schema additions, frontend wire-up scope, and
  production verification stats.
- Defense scorecard table updated: now-vs-v1.0 view across all 3
  defense layers.
- New "Phase 3c verified production stats" + "Phase 3c acceptance
  checklist" subsections (parallel to the existing Phase 1/2 blocks).
- Phase 3 sub-PR plan: 3c moved from 🟡 NEXT → ✅ DONE; 3d/3e ETAs
  updated.
- v1.0 ETA: ~2-3 days remaining (3d + 3e).

SKILL.md (net +3 lines, but conceptual change)
- Schema-versions section converted from bullet list to a 3-column
  Markdown table (Schema / Phase / What changed). Phase 3c row
  documents all 7 defenses, every new schema field, the
  schema-snapshot CI guard, and the 21-entry reason taxonomy.

docs/METHODOLOGY.md (+178 lines)
- New "Fair-price ensemble" section: per-method table, aggregation
  semantics, why-median-not-mean, why-dispersion-matters, sign
  convention citing Damodaran 2012.
- New "Defense layer" section: 3 vetoes / 5 numerical guards /
  5+ annotate-only flags as separate tables with sources cited
  (Altman, Sloan, Pontiff-Woodgate, Penman, Damodaran, Greenblatt).
  Annotate-vs-veto philosophy spelled out with a Q/A grid.
- New "Sanity tests" subsection: NOT-a-backtest disclaimer,
  Spearman-not-Pearson rationale, null-return semantics, references
  Phase 4+ for real predictive validation.
- Composite-weights table preserved; Realistic-expectations + Honest-
  limits sections preserved verbatim from the prior doc.

What's NOT in this commit
-------------------------
- WORKFLOW.md changes (Phase 3c was already documented as the
  roadmap target; the commit is the execution flip in PHASE_STATUS).
- README.md disclaimer block (Phase 3e adds the Honest Limitations
  section per the kickoff spec).
- stock_ranking_knowledge.md (separate authoritative reference,
  unchanged in PR 3c).

Verification
------------
- ruff check .  → clean
- python -m compute.output.schema_check → in-sync (no schema changes)
- pytest tests/ -m "not network" → 409 passed
- File sizes (lines): PHASE_STATUS 183→252; SKILL 569→572; METHODOLOGY 58→236

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…hema-table cleanup

Documentation-only follow-up to commit d648b1d. Four fixes:

1. SKILL.md — schema versions table, Phase 3c row.
   The "(×4)" count for extreme_<method>_estimate was wrong: the
   outlier guard applies to all 6 ensemble method slots (graham,
   multiples_pe, multiples_pb, multiples_ev_ebitda, rim, dcf),
   though in practice only 2-3 fire per stock. Replaced with
   "(×6 method slots — <method> is one of …; in practice 2-3
   fire per stock)" so the slot enumeration is explicit.
   (The literal `<method>` placeholder was already correct in the
   file; the earlier review paste lost it to markdown HTML parsing.)

2. METHODOLOGY.md — Annotate-only flags section.
   Expanded the extreme_<method>_estimate bullet to enumerate
   the 6 method names and spell out the [0.2x, 5x] outlier band
   + "kept in MEDIAN, excluded from MAX" semantics, matching
   Defense #4 in the Numerical guards table.

3. METHODOLOGY.md — Active vetoes table.
   Updated Altman citation from "Altman 1968" to "Altman 1968,
   Hotchkiss 2003 update for non-manufacturers". The Z″ < 1.10
   threshold comes from the 2003 update in Altman & Hotchkiss
   (Corporate Financial Distress and Bankruptcy, 3rd ed., Wiley),
   not the original 1968 paper. Matches the citation already in
   stock_ranking_knowledge.md §1.2.

4. SKILL.md — schema versions table, Phase 4-8 rows removed.
   Replaced with a one-line note pointing to WORKFLOW.md "Defense
   Roadmap" as the single source of truth for unshipped schemas.
   Reason: the prior Phase 4-8 descriptions reflected the
   pre-Defense-Playbook roadmap and didn't match the post-2026-05-08
   WORKFLOW.md updates (Issue #7 Sloan fix, shares_outstanding
   ingestion fix, REIT FFO/AFFO substitutes, cross-source validator,
   IC decay monitor, Bao-Ke ML fraud overlay, MAPIE conformal). The
   Phase 8 row also conflicted with the current "production hardening"
   target (S&P 1500 expansion deferred beyond v2.0). Schema table
   now documents shipped schemas only; roadmap doc owns the rest.

Verification
------------
- ruff check .  → clean
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 409 passed
- grep checks confirm both files now match the review spec.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
@dackclup dackclup marked this pull request as ready for review May 10, 2026 04:05
@dackclup dackclup merged commit 8a9d35f into main May 10, 2026
2 checks passed
@dackclup dackclup deleted the claude/phase-3c-fair-price-defenses branch May 10, 2026 04:28
@dackclup dackclup restored the claude/phase-3c-fair-price-defenses branch May 10, 2026 04:28
@dackclup dackclup deleted the claude/phase-3c-fair-price-defenses branch May 10, 2026 04:28
dackclup pushed a commit that referenced this pull request May 10, 2026
Step 3 of PR 3d. Adds the 8-K event scoring module that backs:
  - Defense #9 (Item 4.02 "Non-Reliance on Previously Issued Financial
    Statements") — HARD VETO. Joins altman / sloan / NSI as the 4th
    active veto at v1.0.
  - Defense #10 (Item 4.01 "Changes in Registrant's Certifying
    Accountant") — annotate-only. Reg S-K Item 304 mandates the same
    disclosure for benign reasons, so false-positive rate is too
    high for veto.

Both defenses surface in StockDetail.tier2_events (Pydantic field
lands in Step 4) and the user-visible flag list. Per SKILL.md
Rule 16, neither modifies the composite score.

Changes
-------

compute/config.py (+11 LOC)
- New constants:
    * EDGAR_8K_CACHE_DIR = CACHE_DIR / "edgar_8k"
    * EDGAR_8K_CACHE_TTL_SECONDS = 7 * 86400
    * EDGAR_8K_ITEM_TEXT_EXCERPT_CHARS = 500

compute/scoring/eight_k_events.py (NEW, ~310 LOC)
- ItemFlag frozen dataclass — return shape for both check_* funcs.
  Fields: fired (bool), filing_date (str|None), filing_url (str|None),
  raw_item_text (str|None, ≤ EXCERPT_CHARS).
- fetch_recent_8k_filings(ticker, lookback_days) -> list[dict] | None.
  Wraps edgartools' Company.get_filings(form="8-K", filing_date=...);
  parses each filing via filing.obj() (returns EightK with .items
  attribute returning list[str] like ["Item 5.02", "Item 9.01"]);
  extracts item-text excerpts from EightK.sections (best-effort —
  shape varies across edgartools versions, gracefully degrades to
  empty excerpts).
- Returns None on EDGAR rate-limit / network failure / missing
  identity / ticker-not-found. Returns [] on successful fetch with
  zero 8-Ks in window.
- check_non_reliance(ticker) — Item 4.02, 365-day lookback.
- check_auditor_change(ticker) — Item 4.01, 730-day lookback.
- Both accept optional `filings=` kwarg for unit-test injection.
- Most-recent match wins when multiple 4.02 / 4.01 fire in window.
- Item-number regex is word-boundary-anchored on both sides
  ("\bItem\s+4\.\s*02\b"), so the Item 4.02 pattern does NOT match
  "Item 4.020".
- _ensure_edgar_identity is lazy (logged warning, not RuntimeError)
  on missing EDGAR_USER_AGENT — Tier-2 features are non-fatal,
  unlike fundamentals.
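
A minimal sketch of the matching logic described above, not the module's
actual code. The filing-dict keys ("filing_date", "items", "url",
"excerpt"), the helper name check_item and the case-insensitive flag are
assumptions made for illustration; only the regex, the ItemFlag fields
and the 500-char excerpt cap come from the notes above.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

EXCERPT_CHARS = 500  # mirrors EDGAR_8K_ITEM_TEXT_EXCERPT_CHARS

# \b on both sides: "Item 4.020" must not match "Item 4.02".
# IGNORECASE is an assumption (the tests mention case variants).
ITEM_4_02 = re.compile(r"\bItem\s+4\.\s*02\b", re.IGNORECASE)


@dataclass(frozen=True)
class ItemFlag:
    fired: bool
    filing_date: Optional[str] = None
    filing_url: Optional[str] = None
    raw_item_text: Optional[str] = None


def check_item(filings, pattern, lookback_days, today):
    """Hypothetical helper: most recent in-window filing whose items match."""
    if not filings:  # None (fetch failed) and [] (no 8-Ks) both mean "not fired"
        return ItemFlag(fired=False)
    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    hits = [
        f for f in filings
        if f["filing_date"] >= cutoff
        and any(pattern.search(item) for item in f["items"])
    ]
    if not hits:
        return ItemFlag(fired=False)
    # ISO YYYY-MM-DD strings sort chronologically, so max() is "most recent".
    latest = max(hits, key=lambda f: f["filing_date"])
    text = latest.get("excerpt")
    return ItemFlag(
        fired=True,
        filing_date=latest["filing_date"],
        filing_url=latest.get("url"),
        raw_item_text=text[:EXCERPT_CHARS] if text else None,
    )
```

Passing the filing list in as an argument is what makes the `filings=`
injection path testable without touching EDGAR.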

Cache layer (inlined in eight_k_events.py, ~80 LOC)
- JSON-on-disk at compute/cache/edgar_8k/<ticker>.json (gitignored
  by existing compute/cache/ rule).
- 7-day TTL — safe because 4.02/4.01 events are sticky once filed
  (they don't disappear).
- Cache hit requires cached_lookback >= requested_lookback (so a
  365d entry can't serve a 730d request).
- Atomic write via tmp + os.replace.
- Corrupt JSON / unparseable timestamps treated as miss (logged warn).
- Filename ticker-sanitized via [^A-Za-z0-9_-] regex (BRK-B works,
  path-traversal attempts neutralized).
- invalidate_cache(ticker) — public helper, idempotent.

tests/test_scoring/test_eight_k_events.py (NEW, 28 cases — 25 unit/cache
+ 3 @network)
- A1-A14: synthetic Filing fixture tests (item parsing, lookback
  windows, multiple matches, case variants, excerpt truncation,
  frozen dataclass, item-number boundary precision).
- B1-B6: cache layer (miss → fetch, hit → no fetch, expired → refetch,
  invalidate, corrupt JSON, lookback-undersize miss).
- 2 ticker-path safety tests (BRK-B preservation, path traversal).
- C1-C3: @network smoke against real SEC EDGAR (skipped without
  EDGAR_USER_AGENT). Asserts 5 known-clean tickers (AAPL/MSFT/GOOGL/
  JPM/KO) have ≤1 fired flag, AAPL has neither 4.02 nor 4.01,
  cache effectiveness via timing.

Verification
------------
- ruff check . -> clean (1 pytest.raises(Exception) lint fix —
  switched to FrozenInstanceError specifically)
- python -m compute.output.schema_check -> in-sync (Step 4 adds
  the Pydantic tier2_events field)
- pytest tests/ -m "not network" -> 464 passed (was 439 -> +25 new
  unit/cache; +3 @network properly skipped)
- npx tsc --noEmit -> clean

Edgartools API notes (for Step 5 wire-up)
------------------------------------------
- Company.get_filings(form="8-K", filing_date=(start, end)) returns
  an EntityFilings iterable.
- Each Filing has .obj() that returns an EightK (for 8-K forms).
- EightK.items returns List[str] like ["Item 5.02", "Item 9.01"]
  via a 3-tier fallback parser (modern sections → chunked_document
  → text-pattern extraction). Handles SGML legacy filings (1999-2001).
- EightK.sections is the source for item-body excerpts but its
  shape varies (sometimes dict, sometimes list); the module guards
  with `if isinstance(sections, dict)` and degrades to empty
  excerpts if the shape doesn't match expectations.

What's NOT in this commit
-------------------------
- Pydantic schema additions (Step 4: StockDetail.tier2_events
  field + Metadata.tier2_coverage_pct)
- Risk-overlay integration (Step 4: non_reliance_filing flag joins
  the risk_flags list)
- compute/main.py wire-up (Step 5)
- Frontend Tier2EventCard (Step 6)
- New pip dependencies — uses existing edgartools + stdlib

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup pushed a commit that referenced this pull request May 10, 2026
Step 4 of PR 3d. Wires Defenses #8 / #9 / #10 into the JSON contract and
makes #9 (8-K Item 4.02 non-reliance) the **4th active hard veto** at v1.0.

Changes
-------

compute/output/schemas.py
- StockDetail.tier2_events: dict | None = None
  Display payload populated by Step 5; shape (when set):
    {"going_concern_disclosure": bool,
     "non_reliance_filing": bool,
     "auditor_change": bool,
     "latest_8k_filing_date": str | None,   # ISO YYYY-MM-DD
     "latest_8k_filing_url": str | None}
- Metadata.tier2_coverage_pct: float | None = None
  Population-level fetch-success rate. None when Tier-2 disabled
  (e.g., env var missing).

frontend/lib/types.ts
- New `Tier2Events` type mirroring the Python dict shape
- `StockDetail.tier2_events: Tier2Events | null`
- `Metadata.tier2_coverage_pct: number | null`
- Inline doc on Tier2Events explaining which fields are veto vs annotate.

compute/valuation/applicability.py
- SKIP_REASONS: 21 → 24 entries. New stable identifiers:
    going_concern_disclosure, non_reliance_filing, auditor_change.
  These are tracked here so the JSON-contract reason taxonomy is
  complete; the same strings also appear in StockDetail.tier2_events
  (display) and risk_flags (only non_reliance_filing — hard veto).

compute/scoring/risk_overlay.py
- Module docstring updated: "three vetoes" → "four vetoes" with
  non_reliance_filing entry citing eight_k_events.check_non_reliance.
- compute_risk_flags acquires a new optional kwarg
  `non_reliance_by_ticker: dict[str, bool] | None = None`:
    * Default (None): per-ticker fallback to
      check_non_reliance(ticker), which hits the 7-day on-disk
      EDGAR cache or returns ItemFlag(fired=False) when identity
      is unset (= test environment, sandbox).
    * Explicit dict: tests + Step 5 inject pre-computed results.
      Step 5 will share fetch work between this veto path and
      the StockDetail.tier2_events display path so the EDGAR fetch
      happens once per ticker per compute run, not twice.
  This is a slight extension of the spec's pure-inline
  `check_non_reliance(ticker)` call, but it keeps the function
  unit-testable without network mocking and avoids a duplicate fetch
  in production.
  The default behavior matches the spec exactly when the kwarg is
  omitted.
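
  A rough sketch of that kwarg contract, reduced to the non-reliance veto
  only. The function name non_reliance_flags, the `check` parameter and
  the stand-in default are hypothetical; the real compute_risk_flags also
  handles the other vetoes and has a different signature.

```python
from typing import Callable, Dict, List, Optional


def _default_check(ticker: str) -> bool:
    """Stand-in for check_non_reliance(ticker).fired (assumption)."""
    return False


def non_reliance_flags(
    tickers: List[str],
    non_reliance_by_ticker: Optional[Dict[str, bool]] = None,
    check: Callable[[str], bool] = _default_check,
) -> Dict[str, List[str]]:
    flags: Dict[str, List[str]] = {}
    for ticker in tickers:
        if non_reliance_by_ticker is None:
            # Default path: per-ticker lookup (cache or live fetch).
            fired = check(ticker)
        else:
            # Injected path: absent tickers default to "not fired",
            # and the per-ticker lookup is never called.
            fired = non_reliance_by_ticker.get(ticker, False)
        flags[ticker] = ["non_reliance_filing"] if fired else []
    return flags
```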

frontend/lib/schema-snapshot.json
- Regenerated via `python -m compute.output.schema_check
  --update-snapshot`. Diff: +tier2_coverage_pct under Metadata,
  +tier2_events under StockDetail. No collateral drift.

tests/test_output/test_tier2_schema.py (NEW, 13 cases)
- A1-A5: Pydantic field validation (StockDetail.tier2_events
  accepts dict / None; Metadata.tier2_coverage_pct accepts float / None;
  JSON round-trip preserves the dict shape).
- B1-B5: SKIP_REASONS taxonomy (3 new entries present, count = 24,
  all entries unique).
- D1-D3: schema-snapshot file (committed snapshot includes both
  new fields with correct type/required/default shape).

tests/test_scoring/test_risk_overlay.py (+6 cases)
- C1-C6: Defense #9 non_reliance integration:
    * inject {ticker: True} → flag appears
    * inject {ticker: False} → no flag
    * empty inject dict → no flag
    * default path with no EDGAR_USER_AGENT → no flag (existing
      PR-3c tests rely on this contract; tests use monkeypatch
      to ensure a clean cache + identity state)
    * additive with altman/sloan — all 4 vetoes can fire together
    * inject dict for ticker A doesn't pollute ticker B

Verification
------------
- ruff check . -> clean (1 import-sort fix auto-applied)
- python -m compute.output.schema_check -> in-sync after regen
- pytest tests/ -m "not network" -> 483 passed (was 464 -> +19 new)
- npx tsc --noEmit -> clean

What's NOT in this commit
-------------------------
- compute/main.py wire-up (Step 5 — pre-fetches Tier-2 data in
  parallel with fundamentals, populates tier2_events display
  dict + injects non_reliance_by_ticker into compute_risk_flags)
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
  components (Steps 6-8)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup pushed a commit that referenced this pull request May 10, 2026
Step 5 of PR 3d. Wires Defenses #8 / #9 / #10 into the production
weekly-compute pipeline. After this commit, rankings.json carries the
4th active veto (non_reliance_filing) and StockDetail.tier2_events on
every stock; metadata.json reports population-level Tier-2 coverage.

Architecture
------------

- New `compute/ingest/filing_text.py` (~210 LOC): 10-K text fetch with
  90-day on-disk cache. Mirrors the eight_k_events.py cache pattern
  (atomic write via tmp + os.replace; fetched_at TTL gate; safe_ticker
  filename sanitization). Returns None on every failure mode (rate
  limit, missing identity, no recent 10-K) — never raises.
- New `compute/scoring/tier2.py` (~180 LOC): Tier2Result frozen
  dataclass + fetch_tier2_for_ticker orchestrator + tier2_events_dict
  + coverage_pct helpers. The orchestrator catches every per-defense
  exception so one bad ticker can't crash the run.
- Reuses `fetch_recent_8k_filings` ONCE per ticker (with the larger
  730d lookback that covers both 4.02 and 4.01 windows) — both
  `check_non_reliance` and `check_auditor_change` operate on the same
  in-memory filing list. Avoids a duplicate EDGAR call per ticker.

compute/config.py
- New EDGAR_10K_TEXT_CACHE_DIR + EDGAR_10K_TEXT_CACHE_TTL_SECONDS
  (= 90 days). 10-K filings are annual, so a cache entry up to 89 days
  old will usually still be the latest filing.

compute/main.py
- New "Step 4b" between fundamentals + risk-flag computation: parallel
  Tier-2 fetch via ThreadPoolExecutor(max_workers=EDGAR_MAX_WORKERS=5).
  Same parallelism budget as fundamentals — well under SEC's 10/sec
  rate limit.
- non_reliance_by_ticker dict built from tier2_results, injected into
  compute_risk_flags. Avoids the duplicate fetch the inline default
  path would have triggered. Only fired tickers go in (per Step 4
  spec: dict.get(ticker, False) default).
- Per-ticker StockDetail loop populates tier2_events from
  tier2_events_dict(tier2_results.get(ticker)). Tickers absent from
  the dict get tier2_events=None — graceful "no Tier-2 data" surface.
- Metadata.tier2_coverage_pct populated from coverage_pct(tier2_results).
  None when universe is empty; 0.0 when all fetches failed; rounded
  to 2 decimal places otherwise.
- Added `Tier2Result` to imports for type clarity (linter wanted it
  in a separate `from .. import` line because of the `as` alias on
  coverage_pct — accepted).
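
The coverage contract above (None / 0.0 / rounded) could look roughly
like this. The two-argument signature is an assumption: the real helper
is called as coverage_pct(tier2_results), and how it learns the universe
size is not stated here. Expressing the value on a 0-100 scale is also an
assumption, inferred from the "49.80%" test case.

```python
from typing import Optional


def coverage_pct(results: dict, universe_size: int) -> Optional[float]:
    """Share of the universe with a successful Tier-2 fetch, in percent."""
    if universe_size == 0:
        return None  # empty universe: coverage is undefined
    # Failed tickers are simply absent from `results`.
    return round(100.0 * len(results) / universe_size, 2)
```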

Failure isolation
-----------------

Three layers of safety:
1. Each underlying fetcher (fetch_latest_10k_text, fetch_recent_8k_filings)
   returns None on any failure — never raises.
2. fetch_tier2_for_ticker wraps each per-defense call in try/except; one
   defense's failure doesn't abort the orchestrator.
3. The compute/main.py executor loop also catches exceptions from
   fut.result() — defensive, since the orchestrator already swallows
   everything.

A failed-fetch ticker simply won't appear in tier2_results; the
per-ticker loop's tier2_results.get(ticker) returns None, which builds
a StockDetail with tier2_events=None.
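
As an illustration only, the three layers might be arranged like this.
The function names fetch_tier2 and run, the result-dict keys and the
injected fetcher callables are hypothetical; the real orchestrator
returns a Tier2Result dataclass.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_tier2(ticker, fetch_10k, fetch_8k):
    """Per-ticker orchestrator: one failing defense never aborts the other."""
    result = {}
    for name, fetcher in (("ten_k", fetch_10k), ("eight_k", fetch_8k)):
        try:
            # Layer 1: fetchers are expected to return None on failure.
            result[name] = fetcher(ticker)
        except Exception:
            # Layer 2: guard each defense in case a fetcher raises anyway.
            result[name] = None
    if all(value is None for value in result.values()):
        return None  # total failure: caller drops the ticker
    return result


def run(tickers, fetch_10k, fetch_8k, max_workers=5):
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(fetch_tier2, t, fetch_10k, fetch_8k): t for t in tickers
        }
        for fut in as_completed(futures):
            ticker = futures[fut]
            try:
                outcome = fut.result()
            except Exception:
                continue  # Layer 3: defensive backstop in the executor loop
            if outcome is not None:
                results[ticker] = outcome
    return results
```

With this shape, `results.get(ticker)` is None for any ticker whose
fetches all failed, which maps directly onto tier2_events=None.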

tests/test_scoring/test_tier2.py (NEW, 17 cases)
- A1-A6: orchestration permutations (clean, partial 10-K fail, partial
  8-K fail, total fail, exception caught, both 8-K items present).
- B1-B4: tier2_events_dict shape + non_reliance > auditor_change
  preference for latest_8k_filing date/url + 5-key contract check.
- C1-C5: coverage_pct including 100% / 0% / 49.80% / empty / single.
- D1: end-to-end synthetic 10-ticker pipeline covering all 3 defenses.
- D2: Tier2Result frozen dataclass.

Verification
------------
- ruff check . -> clean
- python -m compute.output.schema_check -> in-sync
- pytest tests/ -m "not network" -> 500 passed (was 483 -> +17 new)
- npx tsc --noEmit -> clean
- main.py wire-up smoke-imports cleanly; sanity grep confirms
  tier2_results / tier2_coverage_pct / tier2_events / non_reliance
  inject all wired through.

Performance budget
------------------

Cold cache estimate (first run):
- 502 tickers × 2 EDGAR fetches each (10-K + 8-K) = 1,004 fetches; at
  ~5 parallel workers and roughly 1 s per fetch that is ~200s
  (~3.3 min). Well under the +10-15 min budget.
Subsequent weekly runs: mostly cache hits → +30-60s.
NO new asyncio / concurrency primitives — ThreadPoolExecutor
matches the existing fundamentals-fetch pattern.
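
The cold-cache figure can be reproduced with a few lines of arithmetic.
The ~1 s per fetch is an assumption implied by the numbers above, not a
measured value.

```python
import math

tickers = 502
fetches_per_ticker = 2      # one 10-K text fetch + one 8-K list fetch
workers = 5                 # EDGAR_MAX_WORKERS
seconds_per_fetch = 1.0     # assumption, not measured

total_fetches = tickers * fetches_per_ticker
# Each "wave" runs up to `workers` fetches concurrently.
waves = math.ceil(total_fetches / workers)
wall_seconds = waves * seconds_per_fetch
requests_per_second = workers / seconds_per_fetch

print(total_fetches, waves, round(wall_seconds / 60, 1), requests_per_second)
```

At these assumed numbers the request rate stays at about 5 per second,
which is consistent with the "well under SEC's 10/sec" remark earlier in
this commit.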

What's NOT in this commit
-------------------------
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
  (Steps 6-8)
- Documentation updates (Step 9)
- Production verification via workflow_dispatch (Step 10)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup pushed a commit that referenced this pull request May 10, 2026
…e wire

Step 6 of PR 3d. Adds the user-visible surface for Defenses #8 / #9 /
#10 — a card that lists fired regulatory events with severity coding
(HARD VETO red pill for non-reliance; Annotate amber pill for
going-concern + auditor-change). Renders nothing when there are no
events or when the StockDetail predates the schema (graceful
forward-compat for stocks/*.json files written under PR-3c schema).

Position: between PriceHistoryChart and FairPriceCard — regulatory
events affect investability more than valuation, so they sit higher
in the visual hierarchy.

Changes
-------

frontend/components/Tier2EventCard.tsx (NEW, ~165 LOC)
- "use client" component with strict TypeScript types (no `any`).
- Props: tier2_events: Tier2Events | null, ticker: string.
- Renders null when:
    * tier2_events is null OR undefined (loose-equality check —
      `undefined` is the runtime shape for stock JSONs written
      under pre-PR-3d schemas, before Step 10's compute regenerates
      them with the field populated)
    * All 3 flags are false (clean ticker)
- Otherwise renders rows in priority order: non_reliance_filing
  first (hard veto), then going_concern_disclosure, then
  auditor_change. Date footer (latest 8-K) shown only when an 8-K
  flag fired AND a date is present.
- "View filing" link with target="_blank" + rel="noopener noreferrer"
  for the 8-K rows; going-concern has no link (text scan, not 8-K).
- Inline SVG icons (lucide-react is NOT in package.json — spec's
  hard constraint says "NO new npm dependencies"). Three 24px
  stroke icons styled to match lucide's visual language:
  AlertOctagon (veto), AlertTriangle (going-concern),
  UserMinus (auditor-change), plus a small ExternalLink for the
  filing-link affordance.
- Light-theme palette matching existing components
  (rose/amber/slate ring-1 ring-inset badges) — the spec's
  bg-card/text-foreground tokens reference shadcn dark-theme but
  the project uses bg-white/text-slate-700.
- Accessibility: aria-label on section, role="status" on severity
  pills, aria-hidden on decorative icons.
- Mobile-first: stacked rows on <sm, side-by-side on sm+.

frontend/app/stock/[ticker]/page.tsx
- Import Tier2EventCard.
- Wired between the Price (1y) section and the FairPriceCard
  block, per spec ordering: chart → events → fair price →
  fundamentals.

Edge case fixed during build verification
-----------------------------------------

Initial implementation guarded with `tier2_events === null`. The
production stock JSONs committed under PR 3c lack the `tier2_events`
key entirely (the schema is forward-compatible: the field is
optional in Pydantic, so existing files just don't have it).
Reading an absent key from the JSON.parse result yields `undefined`,
not `null`, so `=== null` missed the case and the destructure crashed
during `next build` for all 502 stocks. Fixed to `== null` (loose
equality catches both null + undefined). Comment in the component
explains the forward-compat reasoning.

Tests (frontend)
----------------

The frontend has no test framework configured (no jest / vitest /
@testing-library in package.json). Per spec ("If neither has
component tests, skip in favor of visual regression"), no component
tests added. `tsc --noEmit` + `next build` are the type/build
correctness guarantees:

- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes pre-rendered cleanly

What's NOT in this commit
-------------------------
- Visual snapshot regression tests (no harness; would require
  adding playwright or storybook — out of scope)
- PillarRadarChart (Step 7)
- FairPriceBarChart (Step 8)

Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes ✓
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no Python touched;
  sanity-check that nothing regressed)

Visual spot-checks deferred to Vercel preview
---------------------------------------------

I cannot render the component locally; spot-checks happen on the
Vercel preview deploy after this commit lands. Spec scenarios:
1. Stock with no Tier-2 events (most production stocks at
   commit 9cd2c74) → card hidden ✓ (forward-compat null-check)
2. Stock with auditor_change only → amber Annotate row + link
3. Stock with non-reliance fired → red HARD VETO row + link
4. All 3 fired → 3-row card + 8-K date footer

Production stock JSONs at HEAD won't have tier2_events populated
(Step 10 workflow_dispatch is what triggers regeneration). So the
preview will show "no Tier-2 events" everywhere; full visual
verification of fired states happens at Step 10.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup pushed a commit that referenced this pull request May 10, 2026
…efer

Run #14 timeout root cause: SEC EDGAR API throttling amplified by
tenacity retry policy (max=30s × 3 attempts = 60-90s per failed
stock). Run #11 (2 days ago) finished in 23m on same code path;
Run #14 stuck 1h+ in fundamentals stage = 3-6x SEC API slowdown
during incident.

Mitigations:
1. Tighten retry on BOTH _build_snapshot and _build_annual_history:
   stop=(stop_after_delay(30) | stop_after_attempt(2)),
   wait=wait_exponential(min=2, max=8). Caps per-stock retry at ~30s.
2. Per-stock fundamentals + history fetch timeout (fut.result(timeout=45))
   — graceful skip on stuck-task. Defensive backstop; real cap is the
   inner tenacity stop_after_delay.
3. Suppress noisy edgartools concept-miss UserWarnings via
   facts._suppress_warnings = True after company.get_facts(). Skips
   the difflib fuzzy-match suggestion pass and frees stderr for triage.
4. Per-stock latency histogram (<5s / 5-15s / 15-30s / 30s+) with
   thresholds aligned to retry-policy tiers, plus p50/p95 + top-20
   slow tickers logged for Phase 4 throttling-detection visibility.
5. fundamentals_coverage_pct + fundamentals_latency_p50_seconds +
   fundamentals_latency_p95_seconds in Metadata mirror the existing
   tier2_coverage_pct.
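
A sketch of the latency observability in items 4 and 5, assuming
per-ticker latencies are collected into a dict of seconds. The function
names and the nearest-rank percentile method are illustrative choices,
not necessarily what main.py uses.

```python
import math

# Upper bounds aligned to the histogram tiers named above.
BUCKETS = (("<5s", 5.0), ("5-15s", 15.0), ("15-30s", 30.0))


def latency_histogram(latencies):
    """Count tickers per latency tier; `latencies` maps ticker -> seconds."""
    counts = {"<5s": 0, "5-15s": 0, "15-30s": 0, "30s+": 0}
    for seconds in latencies.values():
        for label, upper in BUCKETS:
            if seconds < upper:
                counts[label] += 1
                break
        else:
            counts["30s+"] += 1  # no tier matched: 30 s or slower
    return counts


def percentile(values, pct):
    """Nearest-rank percentile; None for empty input."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def slowest(latencies, n=20):
    """Top-n slow tickers as (ticker, seconds) pairs, slowest first."""
    return sorted(latencies.items(), key=lambda kv: kv[1], reverse=True)[:n]
```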

ALSO: defer 8-K event fetches (Defenses #9 + #10) to Phase 4. Three
workflow timeouts (#12, #13, #14) consumed budget; ship PR 3d with
going-concern (Defense #8) only. _EIGHT_K_DEFENSES_ENABLED feature
flag gates the 8-K branch — single-line flip in Phase 4 to re-enable
once the pre-cache layer lands. Schema unchanged; 8-K event fields
in tier2_events emit but always False/None until Phase 4. Active veto
count temporarily 3 (was planned 4); restored in Phase 4.
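
A minimal sketch of how such a flag can gate the 8-K branch while keeping
the five-key shape stable. The builder name, its parameters and the
callable it receives are hypothetical; only the flag name and the key
names come from this PR.

```python
_EIGHT_K_DEFENSES_ENABLED = False  # flip to True in Phase 4


def build_tier2_events(going_concern, fetch_8k_flags,
                       enabled=_EIGHT_K_DEFENSES_ENABLED):
    """Always emit all five keys; 8-K fields stay False/None while disabled."""
    events = {
        "going_concern_disclosure": bool(going_concern),
        "non_reliance_filing": False,
        "auditor_change": False,
        "latest_8k_filing_date": None,
        "latest_8k_filing_url": None,
    }
    if enabled:
        # Only touch EDGAR's 8-K endpoint when the flag is on.
        events.update(fetch_8k_flags())
    return events
```

Keeping the keys present in both modes is what lets the schema and the
frontend stay unchanged across the Phase 4 flip.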

Tests: 511 → 526 (+15: 5 deferred-mode tier2, 5 histogram/percentile/
tuple-return main, 1 retry-policy contract, 4 fixture-extended A/D
tests for 8-K wiring).

Tracked: /tmp/issue_drafts/issue_8k_events_phase4.md +
/tmp/issue_drafts/issue_fundamentals_resilience_phase4.md.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup added a commit that referenced this pull request May 10, 2026
…ience (#12)

feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience (#12)

Ships Tier-2 going-concern defense (Defense #8, annotate-only,
known 10.8% FP rate to refine in Phase 4) + 3 frontend UI
components (Tier2EventCard, PillarRadarChart, FairPriceBarChart)
+ fundamentals resilience (retry tightening + per-stock timeout +
latency observability).

8-K event defenses (#9 + #10) DEFERRED to Phase 4 due to SEC
API throttling cost during integration. Schema additions:
tier2_events, tier2_coverage_pct, fundamentals_coverage_pct,
latency p50/p95.

Schema 0.5.0-phase3c → 0.6.0-phase3d.
Tests 409 → 526.
Defense scorecard: 3 vetoes / 5 guards / 6 annotate.

v1.0 ETA: PR 3e (Tier-3 + Honest Limitations).

Generated with Claude Code · Tested with Anthropic API