
feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience#12

Merged
dackclup merged 19 commits into main from claude/phase-3d-tier2-events
May 10, 2026


@dackclup dackclup commented May 10, 2026

feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience

Brings the v1.0 milestone two-thirds of the way to completion. Ships the Tier-2 going-concern defense + 3 frontend UI components. The 8-K event defenses (Items 4.02 + 4.01) are deferred to Phase 4 because of SEC API throttling encountered during integration (3 workflow_dispatch attempts timed out).

After this merges, PR 3e ships Tier-3 (Beneish + Dechow + Honest Limitations) and tags v1.0.

Scope shipped

Tier-2 event defenses (1 of 3 originally planned)

Frontend UI

  • Tier2EventCard: severity-coded events (red HARD VETO / amber Annotate pills), filing links. Currently displays only going-concern flag (8-K fields hidden until Phase 4).
  • PillarRadarChart: 8-pillar polar radar with null-handling + Phase 5+ footer
  • FairPriceBarChart: 6-method horizontal bars + outlier graying + reference lines

Schema additions

  • StockDetail.tier2_events (dict | None, 5-field shape)
  • Metadata.tier2_coverage_pct (float | None)
  • Metadata.fundamentals_coverage_pct (float | None) — NEW
  • Metadata.fundamentals_latency_p50_seconds (float | None) — NEW
  • Metadata.fundamentals_latency_p95_seconds (float | None) — NEW
  • Tier2Events TypeScript interface mirror
  • Schema snapshot CI guard prevents drift
  • Reason taxonomy: 21 → 24 stable identifiers

Fundamentals resilience (added during PR 3d)

SEC API throttling encountered during workflow_dispatch attempts revealed retry-amplification cost (~60-90s per stuck stock under exponential backoff). Added:

  • Tightened tenacity retry: stop_after_delay(30) | stop_after_attempt(2), wait_exponential(max=8s)
  • Per-stock fetch timeout: 45s ceiling via fut.result(timeout=45)
  • Suppress edgartools concept-miss warnings (~1.2s + log cleanliness)
  • Per-stock latency observability: p50, p95, histogram, slow-ticker logging
  • Coverage + latency surfaced in Metadata for monitoring
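
The per-stock timeout and latency percentiles listed above can be sketched roughly as follows. This is a minimal standard-library illustration, not the project's code: `fetch_all`, `percentile`, and the `fetch_one` callable are hypothetical names, and the tenacity retry policy is left out.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def percentile(values, pct):
    """Nearest-rank percentile; returns None for an empty sample."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def _timed(fetch_one, ticker):
    # Measure latency inside the worker so queue wait time is excluded.
    start = time.perf_counter()
    value = fetch_one(ticker)
    return value, time.perf_counter() - start


def fetch_all(tickers, fetch_one, timeout_s=45.0, max_workers=5):
    """Fetch every ticker; a stuck or failing ticker yields None, not an exception."""
    results, latencies = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {t: pool.submit(_timed, fetch_one, t) for t in tickers}
        for ticker, fut in futures.items():
            try:
                # Ceiling on how long we wait for this ticker's result.
                results[ticker], elapsed = fut.result(timeout=timeout_s)
                latencies.append(elapsed)
            except FutureTimeout:
                results[ticker] = None  # the worker thread itself keeps running
            except Exception:
                results[ticker] = None
    return results, latencies
```

With the latencies in hand, `percentile(latencies, 50)` and `percentile(latencies, 95)` would give the p50/p95 values reported in Metadata. Note that `fut.result(timeout=...)` only bounds the wait; it does not cancel the underlying thread, and leaving the `with` block still waits for it to finish.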

Production verification (commit 4805741, Run #15)

Tests

  • 409 (PR 3c baseline) → 526 (+91 non-network + 3 @network + 15 fundamentals-resilience + 6 8-K-defer = 117 net additions in PR 3d)
  • All 4 CI jobs green: ruff, schema-snapshot, pytest, tsc
  • Frontend: 506/506 routes pre-render

Defense scorecard at v0.6.0-phase3d

  • 3 vetoes (was planned 4): altman, sloan, NSI
    • non_reliance_filing deferred to Phase 4
  • 5 guards (unchanged): stale_filing, outlier_5x, terminal_g, sector_excl, data_quality
  • 6 annotate flags (was planned 7): goodwill_heavy, value_trap_risk, extreme_{method}_estimate, stale_filing_soft, data_quality_input_corruption, going_concern_disclosure
    • auditor_change deferred to Phase 4

Reviewer checklist

  • CI green (4 jobs on the Run #15 commit)
  • Vercel preview spot-checked (rankings + stock detail with going_concern flag visible on at least 1 ticker)
  • Schema snapshot in sync (Pydantic ↔ TypeScript no drift)
  • 24 reason taxonomy entries
  • Feature flag _EIGHT_K_DEFENSES_ENABLED = False confirmed
  • Fundamentals coverage_pct + p50/p95 metadata surfaces

Post-merge actions

  • Tag v0.6.0-phase3d (pre-release; v1.0 ships in PR 3e)
  • Delete branch claude/phase-3d-tier2-events
  • File 4 staged Phase 4 issues:
    • 8-K events re-enablement (issue_8k_events_phase4.md)
    • Fundamentals resilience pre-cache (issue_fundamentals_resilience_phase4.md)
    • Going-concern phrase refinement (issue_going_concern_phrase_refinement.md, NEW)
    • Stale log message (issue_log_message_update.md, NEW low-priority)
  • Begin PR 3e planning (Tier-3 + v1.0 milestone)

Defense Playbook lineage (revised)


Generated with Claude Code · Tested with Anthropic API

…nstants

Step 1 of PR 3d (Tier-2 event defenses). Lays groundwork for the 3 new
defenses landing in Steps 2-3:
  - Defense #8 — going-concern phrase scan (annotate-only, Step 2)
  - Defense #9 — 8-K Item 4.02 hard veto (4th active veto, Step 3)
  - Defense #10 — 8-K Item 4.01 auditor change (annotate-only, Step 3)

Changes
-------

compute/config.py
- SCHEMA_VERSION: "0.5.0-phase3c" -> "0.6.0-phase3d"
- New constants in a Phase-3d-specific block:
    * EIGHT_K_LOOKBACK_DAYS_VETO = 365
      (Item 4.02 trailing-12-month window — Schroeder 2024 SSRN shows
      ~50% subsequent restatement rate within this window)
    * EIGHT_K_LOOKBACK_DAYS_ANNOTATE = 730
      (Item 4.01 2-year window per Reg S-K Item 304 disclosure)
    * GOING_CONCERN_FILING_LOOKBACK_DAYS = 400
      (1y + buffer to capture the most recent 10-K — calendar-year
      filers cluster ~75d after fiscal year-end)

PHASE_STATUS.md
- Test count fix: "118 → 410" was a transposition error in PR 3c's
  Step 11 docs (actual test count was 409 throughout). Off-by-one
  cosmetic; flagged in pre-PR-3d verification, fixed here while
  touching docs anyway.

tests/test_smoke.py
- test_phase0_scaffold_imports: bump expected SCHEMA_VERSION prefix
  from "0.5.0" to "0.6.0" (this assertion gets updated each phase
  bump; same pattern as PR 3c).

tests/test_config.py (NEW, ~35 LOC)
- 5 trivial smoke tests locking the values of the 3 phase-3d
  constants + the schema version. Catches accidental drift.
  - test_schema_version_is_phase3d
  - test_eight_k_lookback_veto_is_one_year
  - test_eight_k_lookback_annotate_is_two_years
  - test_going_concern_filing_lookback_is_one_year_plus_buffer
  - test_eight_k_annotate_window_outlasts_veto_window
    (annotate window must be >= veto window — surfaces a 4.01
    disclosure even after a 4.02 veto would have lapsed)
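
A rough sketch of what such value-locking tests could look like. The constant values are the ones stated in this commit message; the test bodies are illustrative guesses, and the constants are defined inline here rather than imported from compute/config.py so the snippet runs on its own.

```python
# Values copied from the commit message above; defined inline for illustration.
SCHEMA_VERSION = "0.6.0-phase3d"
EIGHT_K_LOOKBACK_DAYS_VETO = 365
EIGHT_K_LOOKBACK_DAYS_ANNOTATE = 730
GOING_CONCERN_FILING_LOOKBACK_DAYS = 400


def test_schema_version_is_phase3d():
    assert SCHEMA_VERSION == "0.6.0-phase3d"


def test_eight_k_lookback_veto_is_one_year():
    assert EIGHT_K_LOOKBACK_DAYS_VETO == 365


def test_going_concern_filing_lookback_is_one_year_plus_buffer():
    # "1y + buffer": strictly more than one year.
    assert GOING_CONCERN_FILING_LOOKBACK_DAYS > 365


def test_eight_k_annotate_window_outlasts_veto_window():
    # The annotate window must be at least as long as the veto window.
    assert EIGHT_K_LOOKBACK_DAYS_ANNOTATE >= EIGHT_K_LOOKBACK_DAYS_VETO
```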

Verification
------------
- ruff check .  -> clean
- python -m compute.output.schema_check  -> in-sync (no Pydantic
  schema changes in this step; tier2_events field lands in Step 4)
- pytest tests/ -m "not network"  -> 414 passed (was 409 -> +5 new)
- npx tsc --noEmit (frontend)  -> clean

What's NOT in this commit
-------------------------
- New scoring modules (Step 2 going_concern.py; Step 3 eight_k_events.py)
- Pydantic schema additions (Step 4 tier2_events field)
- Frontend components (Steps 6-8)
- Any new pip dependencies — beautifulsoup4 + lxml + edgartools +
  requests already cover everything needed for Tier-2

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

vercel Bot commented May 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
quantrank | Ready | Preview, Comment | May 10, 2026 4:42pm

…te-only)

Step 2 of PR 3d. Adds a pure scoring module that scans 10-K / 10-Q
filing text for going-concern indicator phrases drawn from the
Loughran-McDonald financial dictionary subset documented in
Mayew-Sethuraman-Venkatachalam 2015 *The Accounting Review*.

Annotate-only flag (Rule 16) — composite score is never modified.
Surfaces in StockDetail.tier2_events.going_concern_disclosure (the
field lands in Step 4) and the user-visible flag list on the detail
page (the UI lands in Step 6).

Changes
-------

compute/scoring/going_concern.py (NEW, ~120 LOC)
- GOING_CONCERN_PHRASES: tuple of 14 curated LM-dictionary phrases
  (locked at module level; tuple = immutable, prevents accidental
  runtime mutation).
- scan_going_concern(text) -> bool: pure function. Pre-compiled
  per-phrase regex tuple at module load (avoids per-call recompile
  on every ticker × every phrase).
- Each pattern:
    * uses re.escape to neutralize metacharacters in the phrase
    * replaces each escaped space with [\s\-]+ so multi-space, line
      breaks, and hyphens between words all match
    * anchors with \b at start and end so partial-word matches
      (e.g., "ongoing concerns", "discontinued") do NOT trip the
      flag. The spec doesn't list these likely false-positive
      vectors, but the test suite asserts both.
- Loughran-McDonald CC BY 4.0 attribution in module docstring.
- Returns False for None / empty (caller distinguishes "no signal"
  from "couldn't fetch" via tier2_coverage_pct in Metadata).
- No new dependencies — uses stdlib re only.
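
The pattern construction described above might look roughly like this. It is a sketch under stated assumptions: the two phrases are placeholders (the module's actual 14-phrase list is not reproduced here), `_compile_phrase` is a hypothetical helper name, and case-insensitive matching is assumed rather than confirmed by the commit message.

```python
import re

# Placeholder phrases for illustration only, not the module's curated list.
EXAMPLE_PHRASES = ("substantial doubt", "going concern")


def _compile_phrase(phrase):
    escaped = re.escape(phrase)                    # neutralize metacharacters
    flexible = escaped.replace(r"\ ", r"[\s\-]+")  # spaces, newlines, hyphens
    # \b on both sides rejects partial-word hits such as "ongoing concerns".
    return re.compile(r"\b" + flexible + r"\b", re.IGNORECASE)


# Compiled once at import time, not once per ticker.
_PATTERNS = tuple(_compile_phrase(p) for p in EXAMPLE_PHRASES)


def scan_going_concern(text):
    """Return True if any phrase appears; False for None or empty text."""
    if not text:
        return False
    return any(p.search(text) for p in _PATTERNS)
```

Because `re.escape` turns each space into a backslash-space pair, the `replace` call swaps exactly those pairs for the flexible separator class.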

tests/test_scoring/test_going_concern.py (NEW, 25 cases)
- A1-A8: primary phrases detected (parametrized over the 8 most
  common boilerplate variants).
- B1-B4: whitespace + punctuation flex (multi-space, newline,
  multi-double-space, hyphen).
- C1-C4: negative cases (clean text, "concern" alone, "doubt"
  without "substantial", empty string).
- D1-D3: edge cases (None, single char, multi-occurrence).
- E1-E2: phrase at start / end of text.
- F1-F2: module surface (tuple-ness, ≥12 entries).
- G1-G2: word-boundary safety — not in the original spec, but
  guards against the obvious false-positive vectors:
    * "ongoing concerns" must NOT match (it contains the
      substring "going concern" but both word boundaries fail
      under \b anchoring)
    * "discontinued operations" must NOT match (substring
      "continued" inside a longer word)

Verification
------------
- ruff check . -> clean (1 import-sort fix auto-applied)
- python -m compute.output.schema_check -> in-sync
  (no Pydantic schema changes in this step; tier2_events lands in Step 4)
- pytest tests/ -m "not network" -> 439 passed (was 414 -> +25 new)
- npx tsc --noEmit (frontend) -> clean

What's NOT in this commit
-------------------------
- 8-K event parsing (Step 3 module: eight_k_events.py)
- Pydantic schema additions (Step 4: StockDetail.tier2_events)
- compute/main.py wire-up (Step 5)
- Frontend Tier2EventCard (Step 6)
- 10-K filing fetch + cache layer (Step 5 — uses edgartools)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Step 3 of PR 3d. Adds the 8-K event scoring module that backs:
  - Defense #9 (Item 4.02 "Non-Reliance on Previously Issued Financial
    Statements") — HARD VETO. Joins altman / sloan / NSI as the 4th
    active veto at v1.0.
  - Defense #10 (Item 4.01 "Changes in Registrant's Certifying
    Accountant") — annotate-only. Reg S-K Item 304 mandates the same
    disclosure for benign reasons, so false-positive rate is too
    high for veto.

Both defenses surface in StockDetail.tier2_events (Pydantic field
lands in Step 4) and the user-visible flag list. Per SKILL.md
Rule 16, neither modifies the composite score.

Changes
-------

compute/config.py (+11 LOC)
- New constants:
    * EDGAR_8K_CACHE_DIR = CACHE_DIR / "edgar_8k"
    * EDGAR_8K_CACHE_TTL_SECONDS = 7 * 86400
    * EDGAR_8K_ITEM_TEXT_EXCERPT_CHARS = 500

compute/scoring/eight_k_events.py (NEW, ~310 LOC)
- ItemFlag frozen dataclass — return shape for both check_* funcs.
  Fields: fired (bool), filing_date (str|None), filing_url (str|None),
  raw_item_text (str|None, ≤ EXCERPT_CHARS).
- fetch_recent_8k_filings(ticker, lookback_days) -> list[dict] | None.
  Wraps edgartools' Company.get_filings(form="8-K", filing_date=...);
  parses each filing via filing.obj() (returns EightK with .items
  attribute returning list[str] like ["Item 5.02", "Item 9.01"]);
  extracts item-text excerpts from EightK.sections (best-effort —
  shape varies across edgartools versions, gracefully degrades to
  empty excerpts).
- Returns None on EDGAR rate-limit / network failure / missing
  identity / ticker-not-found. Returns [] on successful fetch with
  zero 8-Ks in window.
- check_non_reliance(ticker) — Item 4.02, 365-day lookback.
- check_auditor_change(ticker) — Item 4.01, 730-day lookback.
- Both accept optional `filings=` kwarg for unit-test injection.
- Most-recent match wins when multiple 4.02 / 4.01 fire in window.
- Item-number regex is word-boundary-anchored on both sides
  ("\bItem\s+4\.\s*02\b") so "Item 4.020" does NOT match as "Item 4.02".
- _ensure_edgar_identity is lazy (logged warning, not RuntimeError)
  on missing EDGAR_USER_AGENT — Tier-2 features are non-fatal,
  unlike fundamentals.
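
A minimal sketch of the item check with injected filings, assuming a hypothetical filing-dict shape (`filing_date`, `url`, `items`) and a generic `check_item` helper; the real functions are `check_non_reliance` / `check_auditor_change` and their internals may differ. Case-insensitive matching is also an assumption.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

ITEM_4_02 = re.compile(r"\bItem\s+4\.\s*02\b", re.IGNORECASE)


@dataclass(frozen=True)
class ItemFlag:
    fired: bool
    filing_date: Optional[str] = None
    filing_url: Optional[str] = None
    raw_item_text: Optional[str] = None


def check_item(filings, pattern, lookback_days, today):
    """Return a flag for the most recent in-window filing matching `pattern`."""
    if not filings:
        return ItemFlag(fired=False)
    cutoff = (today - timedelta(days=lookback_days)).isoformat()
    hits = [
        f for f in filings
        if f["filing_date"] >= cutoff  # ISO dates compare correctly as strings
        and any(pattern.search(item) for item in f["items"])
    ]
    if not hits:
        return ItemFlag(fired=False)
    latest = max(hits, key=lambda f: f["filing_date"])  # most-recent match wins
    return ItemFlag(True, latest["filing_date"], latest.get("url"))
```

Passing `filings` in directly is what keeps this unit-testable without touching EDGAR.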

Cache layer (inlined in eight_k_events.py, ~80 LOC)
- JSON-on-disk at compute/cache/edgar_8k/<ticker>.json (gitignored
  by existing compute/cache/ rule).
- 7-day TTL — safe because 4.02/4.01 events are sticky once filed
  (they don't disappear).
- Cache hit requires cached_lookback >= requested_lookback (so a
  365d entry can't serve a 730d request).
- Atomic write via tmp + os.replace.
- Corrupt JSON / unparseable timestamps treated as miss (logged warn).
- Filename ticker-sanitized via [^A-Za-z0-9_-] regex (BRK-B works,
  path-traversal attempts neutralized).
- invalidate_cache(ticker) — public helper, idempotent.
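
The cache rules above (TTL gate, lookback-size gate, atomic write, corrupt-file-as-miss, filename sanitization) can be sketched as below. Function names, the payload keys, and the `_` replacement character are assumptions made for illustration; only the regex, the 7-day TTL, and the tmp + `os.replace` pattern come from the commit message.

```python
import json
import os
import re
import time
from pathlib import Path

TTL_SECONDS = 7 * 86400


def safe_ticker(ticker):
    # Replacement character "_" is an assumption; the source only names the regex.
    return re.sub(r"[^A-Za-z0-9_-]", "_", ticker)


def write_cache(cache_dir, ticker, filings, lookback_days):
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{safe_ticker(ticker)}.json"
    tmp = path.with_suffix(".json.tmp")
    payload = {"fetched_at": time.time(), "lookback_days": lookback_days,
               "filings": filings}
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, path)  # atomic swap when tmp and target share a filesystem


def read_cache(cache_dir, ticker, lookback_days, now=None):
    """Return cached filings, or None on any kind of miss."""
    path = Path(cache_dir) / f"{safe_ticker(ticker)}.json"
    try:
        payload = json.loads(path.read_text())
        current = now if now is not None else time.time()
        if current - float(payload["fetched_at"]) > TTL_SECONDS:
            return None  # expired
        if payload["lookback_days"] < lookback_days:
            return None  # a 365d entry cannot serve a 730d request
        return payload["filings"]
    except (OSError, ValueError, KeyError, TypeError):
        return None  # missing, corrupt, or malformed entry counts as a miss
```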

tests/test_scoring/test_eight_k_events.py (NEW, 28 cases — 25 unit/cache
+ 3 @network)
- A1-A14: synthetic Filing fixture tests (item parsing, lookback
  windows, multiple matches, case variants, excerpt truncation,
  frozen dataclass, item-number boundary precision).
- B1-B6: cache layer (miss → fetch, hit → no fetch, expired → refetch,
  invalidate, corrupt JSON, lookback-undersize miss).
- 2 ticker-path safety tests (BRK-B preservation, path traversal).
- C1-C3: @network smoke against real SEC EDGAR (skipped without
  EDGAR_USER_AGENT). Asserts 5 known-clean tickers (AAPL/MSFT/GOOGL/
  JPM/KO) have ≤1 fired flag, AAPL has neither 4.02 nor 4.01,
  cache effectiveness via timing.

Verification
------------
- ruff check . -> clean (1 pytest.raises(Exception) lint fix —
  switched to FrozenInstanceError specifically)
- python -m compute.output.schema_check -> in-sync (Step 4 adds
  the Pydantic tier2_events field)
- pytest tests/ -m "not network" -> 464 passed (was 439 -> +25 new
  unit/cache; +3 @network properly skipped)
- npx tsc --noEmit -> clean

Edgartools API notes (for Step 5 wire-up)
------------------------------------------
- Company.get_filings(form="8-K", filing_date=(start, end)) returns
  an EntityFilings iterable.
- Each Filing has .obj() that returns an EightK (for 8-K forms).
- EightK.items returns List[str] like ["Item 5.02", "Item 9.01"]
  via a 3-tier fallback parser (modern sections → chunked_document
  → text-pattern extraction). Handles SGML legacy filings (1999-2001).
- EightK.sections is the source for item-body excerpts but its
  shape varies (sometimes dict, sometimes list); the module guards
  with `if isinstance(sections, dict)` and degrades to empty
  excerpts if the shape doesn't match expectations.

What's NOT in this commit
-------------------------
- Pydantic schema additions (Step 4: StockDetail.tier2_events
  field + Metadata.tier2_coverage_pct)
- Risk-overlay integration (Step 4: non_reliance_filing flag joins
  the risk_flags list)
- compute/main.py wire-up (Step 5)
- Frontend Tier2EventCard (Step 6)
- New pip dependencies — uses existing edgartools + stdlib

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Step 4 of PR 3d. Wires Defenses #8 / #9 / #10 into the JSON contract and
makes #9 (8-K Item 4.02 non-reliance) the **4th active hard veto** at v1.0.

Changes
-------

compute/output/schemas.py
- StockDetail.tier2_events: dict | None = None
  Display payload populated by Step 5; shape (when set):
    {"going_concern_disclosure": bool,
     "non_reliance_filing": bool,
     "auditor_change": bool,
     "latest_8k_filing_date": str | None,   # ISO YYYY-MM-DD
     "latest_8k_filing_url": str | None}
- Metadata.tier2_coverage_pct: float | None = None
  Population-level fetch-success rate. None when Tier-2 disabled
  (e.g., env var missing).

frontend/lib/types.ts
- New `Tier2Events` type mirroring the Python dict shape
- `StockDetail.tier2_events: Tier2Events | null`
- `Metadata.tier2_coverage_pct: number | null`
- Inline doc on Tier2Events explaining which fields are veto vs annotate.

compute/valuation/applicability.py
- SKIP_REASONS: 21 → 24 entries. New stable identifiers:
    going_concern_disclosure, non_reliance_filing, auditor_change.
  These are tracked here so the JSON-contract reason taxonomy is
  complete; the same strings also appear in StockDetail.tier2_events
  (display) and risk_flags (only non_reliance_filing — hard veto).

compute/scoring/risk_overlay.py
- Module docstring updated: "three vetoes" → "four vetoes" with
  non_reliance_filing entry citing eight_k_events.check_non_reliance.
- compute_risk_flags acquires a new optional kwarg
  `non_reliance_by_ticker: dict[str, bool] | None = None`:
    * Default (None): per-ticker fallback to
      check_non_reliance(ticker), which hits the 7-day on-disk
      EDGAR cache or returns ItemFlag(fired=False) when identity
      is unset (= test environment, sandbox).
    * Explicit dict: tests + Step 5 inject pre-computed results.
      Step 5 will share fetch work between this veto path and
      the StockDetail.tier2_events display path so the EDGAR fetch
      happens once per ticker per compute run, not twice.
  This is a slight extension of the spec's pure-inline `check_non_reliance(ticker)` call, but it keeps the function unit-testable
  without network mocking and avoids a duplicate fetch in production.
  The default behavior matches the spec exactly when the kwarg is
  omitted.
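
The injection-with-fallback behavior of the new kwarg can be illustrated in isolation. This is a deliberately reduced sketch: `non_reliance_flags` and its `check_non_reliance` parameter are hypothetical stand-ins, and the real `compute_risk_flags` takes other inputs and evaluates the other vetoes as well.

```python
from typing import Callable, Iterable, Optional, Set


def non_reliance_flags(
    tickers: Iterable[str],
    non_reliance_by_ticker: Optional[dict] = None,
    check_non_reliance: Callable[[str], bool] = lambda ticker: False,
) -> Set[str]:
    """Return the tickers that would receive the non_reliance_filing veto."""
    flagged = set()
    for ticker in tickers:
        if non_reliance_by_ticker is None:
            # Default path: per-ticker lookup (cache or fetch in production).
            fired = check_non_reliance(ticker)
        else:
            # Injected path: absent tickers default to "not fired".
            fired = non_reliance_by_ticker.get(ticker, False)
        if fired:
            flagged.add(ticker)
    return flagged
```

Passing an explicit dict, even an empty one, bypasses the lookup entirely, which is what lets tests run without network mocking.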

frontend/lib/schema-snapshot.json
- Regenerated via `python -m compute.output.schema_check
  --update-snapshot`. Diff: +tier2_coverage_pct under Metadata,
  +tier2_events under StockDetail. No collateral drift.

tests/test_output/test_tier2_schema.py (NEW, 13 cases)
- A1-A5: Pydantic field validation (StockDetail.tier2_events
  accepts dict / None; Metadata.tier2_coverage_pct accepts float / None;
  JSON round-trip preserves the dict shape).
- B1-B5: SKIP_REASONS taxonomy (3 new entries present, count = 24,
  all entries unique).
- D1-D3: schema-snapshot file (committed snapshot includes both
  new fields with correct type/required/default shape).

tests/test_scoring/test_risk_overlay.py (+6 cases)
- C1-C6: Defense #9 non_reliance integration:
    * inject {ticker: True} → flag appears
    * inject {ticker: False} → no flag
    * empty inject dict → no flag
    * default path with no EDGAR_USER_AGENT → no flag (existing
      PR-3c tests rely on this contract; tests use monkeypatch
      to ensure a clean cache + identity state)
    * additive with altman/sloan — all 4 vetoes can fire together
    * inject dict for ticker A doesn't pollute ticker B

Verification
------------
- ruff check . -> clean (1 import-sort fix auto-applied)
- python -m compute.output.schema_check -> in-sync after regen
- pytest tests/ -m "not network" -> 483 passed (was 464 -> +19 new)
- npx tsc --noEmit -> clean

What's NOT in this commit
-------------------------
- compute/main.py wire-up (Step 5 — pre-fetches Tier-2 data in
  parallel with fundamentals, populates tier2_events display
  dict + injects non_reliance_by_ticker into compute_risk_flags)
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
  components (Steps 6-8)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Step 5 of PR 3d. Wires Defenses #8 / #9 / #10 into the production
weekly-compute pipeline. After this commit, rankings.json carries the
4th active veto (non_reliance_filing) and StockDetail.tier2_events on
every stock; metadata.json reports population-level Tier-2 coverage.

Architecture
------------

- New `compute/ingest/filing_text.py` (~210 LOC): 10-K text fetch with
  90-day on-disk cache. Mirrors the eight_k_events.py cache pattern
  (atomic write via tmp + os.replace; fetched_at TTL gate; safe_ticker
  filename sanitization). Returns None on every failure mode (rate
  limit, missing identity, no recent 10-K) — never raises.
- New `compute/scoring/tier2.py` (~180 LOC): Tier2Result frozen
  dataclass + fetch_tier2_for_ticker orchestrator + tier2_events_dict
  + coverage_pct helpers. The orchestrator catches every per-defense
  exception so one bad ticker can't crash the run.
- Reuses `fetch_recent_8k_filings` ONCE per ticker (with the larger
  730d lookback that covers both 4.02 and 4.01 windows) — both
  `check_non_reliance` and `check_auditor_change` operate on the same
  in-memory filing list. Avoids a duplicate EDGAR call per ticker.

compute/config.py
- New EDGAR_10K_TEXT_CACHE_DIR + EDGAR_10K_TEXT_CACHE_TTL_SECONDS
  (= 90 days). 10-K filings are annual so an 89-day stale cache hit
  returns the same filing.

compute/main.py
- New "Step 4b" between fundamentals + risk-flag computation: parallel
  Tier-2 fetch via ThreadPoolExecutor(max_workers=EDGAR_MAX_WORKERS=5).
  Same parallelism budget as fundamentals — well under SEC's 10/sec
  rate limit.
- non_reliance_by_ticker dict built from tier2_results, injected into
  compute_risk_flags. Avoids the duplicate fetch the inline default
  path would have triggered. Only fired tickers go in (per Step 4
  spec: dict.get(ticker, False) default).
- Per-ticker StockDetail loop populates tier2_events from
  tier2_events_dict(tier2_results.get(ticker)). Tickers absent from
  the dict get tier2_events=None — graceful "no Tier-2 data" surface.
- Metadata.tier2_coverage_pct populated from coverage_pct(tier2_results).
  None when universe is empty; 0.0 when all fetches failed; rounded
  to 2 decimal places otherwise.
- Added `Tier2Result` to imports for type clarity (linter wanted it
  in a separate `from .. import` line because of the `as` alias on
  coverage_pct — accepted).

Failure isolation
-----------------

Three layers of safety:
1. Each underlying fetcher (fetch_latest_10k_text, fetch_recent_8k_filings)
   returns None on any failure — never raises.
2. fetch_tier2_for_ticker wraps each per-defense call in try/except; one
   defense's failure doesn't abort the orchestrator.
3. The compute/main.py executor loop also catches exceptions from
   fut.result() — defensive, since the orchestrator already swallows
   everything.

A failed-fetch ticker simply won't appear in tier2_results; the
per-ticker loop's tier2_results.get(ticker) returns None, which builds
a StockDetail with tier2_events=None.
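
The layering can be sketched roughly as follows. The names `run`, `fetch_10k_text`, and `fetch_8k_filings`, the dict-shaped result, and the choice to return None on total failure are assumptions for illustration; the actual orchestrator returns a Tier2Result dataclass.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tier2_for_ticker(ticker, fetch_10k_text, fetch_8k_filings):
    """Layer 2: each defense is wrapped separately so one failure spares the other."""
    result = {"text": None, "filings": None}
    try:
        result["text"] = fetch_10k_text(ticker)
    except Exception:
        pass  # going-concern input unavailable for this ticker
    try:
        result["filings"] = fetch_8k_filings(ticker)
    except Exception:
        pass  # 8-K input unavailable for this ticker
    if result["text"] is None and result["filings"] is None:
        return None  # total failure: nothing usable
    return result


def run(tickers, fetch_10k_text, fetch_8k_filings, max_workers=5):
    tier2_results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            t: pool.submit(fetch_tier2_for_ticker, t, fetch_10k_text, fetch_8k_filings)
            for t in tickers
        }
        for ticker, fut in futures.items():
            try:
                outcome = fut.result()  # Layer 3: defensive catch in the loop
            except Exception:
                continue
            if outcome is not None:
                tier2_results[ticker] = outcome
    return tier2_results
```

A downstream `tier2_results.get(ticker)` then returns None for any ticker that failed outright.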

tests/test_scoring/test_tier2.py (NEW, 17 cases)
- A1-A6: orchestration permutations (clean, partial 10-K fail, partial
  8-K fail, total fail, exception caught, both 8-K items present).
- B1-B4: tier2_events_dict shape + non_reliance > auditor_change
  preference for latest_8k_filing date/url + 5-key contract check.
- C1-C5: coverage_pct including 100% / 0% / 49.80% / empty / single.
- D1: end-to-end synthetic 10-ticker pipeline covering all 3 defenses.
- D2: Tier2Result frozen dataclass.
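
For orientation, here is a hedged sketch of the two helpers these tests exercise. The signatures are simplified guesses (the real helpers take Tier2Result objects and a results dict), `Flag` is a hypothetical stand-in for ItemFlag, and coverage is assumed to be a 0-100 percentage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flag:
    fired: bool = False
    filing_date: Optional[str] = None
    filing_url: Optional[str] = None


def tier2_events_dict(going_concern: bool, non_reliance: Flag, auditor_change: Flag):
    """Build the 5-key display payload."""
    # When both 8-K flags fired, the non-reliance filing supplies date and URL.
    if non_reliance.fired:
        source = non_reliance
    elif auditor_change.fired:
        source = auditor_change
    else:
        source = Flag()
    return {
        "going_concern_disclosure": going_concern,
        "non_reliance_filing": non_reliance.fired,
        "auditor_change": auditor_change.fired,
        "latest_8k_filing_date": source.filing_date,
        "latest_8k_filing_url": source.filing_url,
    }


def coverage_pct(n_succeeded: int, n_universe: int):
    """Percentage of the universe with a successful fetch; None if empty."""
    if n_universe == 0:
        return None
    return round(100.0 * n_succeeded / n_universe, 2)
```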

Verification
------------
- ruff check . -> clean
- python -m compute.output.schema_check -> in-sync
- pytest tests/ -m "not network" -> 500 passed (was 483 -> +17 new)
- npx tsc --noEmit -> clean
- main.py wire-up smoke-imports cleanly; sanity grep confirms
  tier2_results / tier2_coverage_pct / tier2_events / non_reliance
  inject all wired through.

Performance budget
------------------

Cold cache estimate (first run):
- 502 tickers × 2 EDGAR fetches each (10-K + 8-K) = 1,004 fetches;
  across ~5 parallel workers at an implied ~1s per fetch that is
  ~200s = ~3.3 min. Well under the +10-15 min budget.
Subsequent weekly runs: mostly cache hits → +30-60s.
NO new asyncio / concurrency primitives — ThreadPoolExecutor
matches the existing fundamentals-fetch pattern.

What's NOT in this commit
-------------------------
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
  (Steps 6-8)
- Documentation updates (Step 9)
- Production verification via workflow_dispatch (Step 10)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…e wire

Step 6 of PR 3d. Adds the user-visible surface for Defenses #8 / #9 /
#10 — a card that lists fired regulatory events with severity coding
(HARD VETO red pill for non-reliance; Annotate amber pill for
going-concern + auditor-change). Renders nothing when there are no
events or when the StockDetail predates the schema (graceful
forward-compat for stocks/*.json files written under PR-3c schema).

Position: between PriceHistoryChart and FairPriceCard — regulatory
events affect investability more than valuation, so they sit higher
in the visual hierarchy.

Changes
-------

frontend/components/Tier2EventCard.tsx (NEW, ~165 LOC)
- "use client" component with strict TypeScript types (no `any`).
- Props: tier2_events: Tier2Events | null, ticker: string.
- Renders null when:
    * tier2_events is null OR undefined (loose-equality check —
      `undefined` is the runtime shape for stock JSONs written
      under pre-PR-3d schemas, before Step 10's compute regenerates
      them with the field populated)
    * All 3 flags are false (clean ticker)
- Otherwise renders rows in priority order: non_reliance_filing
  first (hard veto), then going_concern_disclosure, then
  auditor_change. Date footer (latest 8-K) shown only when an 8-K
  flag fired AND a date is present.
- "View filing" link with target="_blank" + rel="noopener noreferrer"
  for the 8-K rows; going-concern has no link (text scan, not 8-K).
- Inline SVG icons (lucide-react is NOT in package.json — spec's
  hard constraint says "NO new npm dependencies"). Three 24px
  stroke icons styled to match lucide's visual language:
  AlertOctagon (veto), AlertTriangle (going-concern),
  UserMinus (auditor-change), plus a small ExternalLink for the
  filing-link affordance.
- Light-theme palette matching existing components
  (rose/amber/slate ring-1 ring-inset badges) — the spec's
  bg-card/text-foreground tokens reference shadcn dark-theme but
  the project uses bg-white/text-slate-700.
- Accessibility: aria-label on section, role="status" on severity
  pills, aria-hidden on decorative icons.
- Mobile-first: stacked rows on <sm, side-by-side on sm+.

frontend/app/stock/[ticker]/page.tsx
- Import Tier2EventCard.
- Wired between the Price (1y) section and the FairPriceCard
  block, per spec ordering: chart → events → fair price →
  fundamentals.

Edge case fixed during build verification
-----------------------------------------

Initial implementation guarded with `tier2_events === null`. The
production stock JSONs committed under PR 3c lack the `tier2_events`
key entirely (the schema is forward-compatible: the field is
optional in Pydantic, so existing files just don't have it).
In JavaScript, reading a key that is absent from the parsed JSON yields
`undefined`, not `null`, so `=== null` missed the case and the destructure crashed
during `next build` for all 502 stocks. Fixed to `== null` (loose
equality catches both null + undefined). Comment in the component
explains the forward-compat reasoning.

Tests (frontend)
----------------

The frontend has no test framework configured (no jest / vitest /
@testing-library in package.json). Per spec ("If neither has
component tests, skip in favor of visual regression"), no component
tests added. `tsc --noEmit` + `next build` are the type/build
correctness guarantees:

- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes pre-rendered cleanly

What's NOT in this commit
-------------------------
- Visual snapshot regression tests (no harness; would require
  adding playwright or storybook — out of scope)
- PillarRadarChart (Step 7)
- FairPriceBarChart (Step 8)

Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes ✓
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no Python touched;
  sanity-check that nothing regressed)

Visual spot-checks deferred to Vercel preview
---------------------------------------------

I cannot render the component locally; spot-checks happen on the
Vercel preview deploy after this commit lands. Spec scenarios:
1. Stock with no Tier-2 events (most production stocks at
   commit 9cd2c74) → card hidden ✓ (forward-compat null-check)
2. Stock with auditor_change only → amber Annotate row + link
3. Stock with non-reliance fired → red HARD VETO row + link
4. All 3 fired → 3-row card + 8-K date footer

Production stock JSONs at HEAD won't have tier2_events populated
(Step 10 workflow_dispatch is what triggers regeneration). So the
preview will show "no Tier-2 events" everywhere; full visual
verification of fired states happens at Step 10.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…age wire

Step 7 of PR 3d. Adds the 8-pillar polar radar visualization on the
stock detail page so users see composite-score breakdown at a glance.
Recharts (already in deps from PR 3c) provides the polar primitives;
no new npm dependencies.

Position in detail page: after FairPriceCard, before the existing
pillar-scores table. Final order is now PriceHistoryChart →
Tier2EventCard → FairPriceCard → PillarRadarChart → RawMetricsTable.

Changes
-------

frontend/components/PillarRadarChart.tsx (NEW, ~125 LOC)
- "use client" component, strict TypeScript types.
- Renders only the 8 active pillars (quality / value / growth /
  momentum / health / profitability / technical / risk). The two
  always-null Phase-5+ pillars (sentiment / ml) are explicitly
  named in the footer instead of being plotted as zero-axes.
- Null active-pillar handling: skipped from the dataset entirely.
  The footer separates "data quality issue this run" pillars from
  the permanent "Phase 5+" placeholder pair, so a degraded ticker
  isn't silently shrunk to fewer axes without explanation.
- Hard floor: <5 non-null active pillars → returns null. A 4-axis
  radar is degenerate (two chord pairs) and less useful than the
  raw pillar-scores table downstream.
- ACTIVE_PILLARS tuple typed with `as const satisfies
  ReadonlyArray<readonly [keyof PillarScores, string]>` so the
  pillar keys are checked at compile time against PillarScores.
- ResponsiveContainer (Recharts SSR convention) inside an h-72
  fixed-height wrapper — prevents layout shift on chart mount.
- Light-theme palette: slate-200 grid, slate-600 angle-axis labels,
  indigo-500 radar fill at 0.4 opacity. Matches the Step 6
  Tier2EventCard + the existing Step 10 (PR 3c) component palette.
- isAnimationActive=false for the Radar — pre-rendered routes
  shouldn't hint at client-side animation jank.

frontend/app/stock/[ticker]/page.tsx
- Import PillarRadarChart.
- Wired between FairPriceCard and RawMetricsTable.
- Spec ordering preserved: events → fair price → radar →
  fundamentals.

Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506/506 routes pre-rendered ✓
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no regression)

What's NOT in this commit
-------------------------
- Frontend component tests (no harness configured; build-time
  type + render verification covers the contract surface)
- FairPriceBarChart (Step 8)
- Docs updates (Step 9)
- Production verification via workflow_dispatch (Step 10)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…page wire

Step 8 of PR 3d (last frontend component). Adds the horizontal-bar
visual summary of the 6 fair-price ensemble methods on the stock
detail page. Renders just above the existing FairPriceCard tabular
breakdown, so users see the visual landscape first then can dig into
the per-method details.

Position in detail page (final order, top to bottom):
  1. PriceHistoryChart
  2. Tier2EventCard (renders null when no events)
  3. FairPriceBarChart (NEW) — visual ensemble summary
  4. FairPriceCard — tabular detail
  5. PillarRadarChart
  6. RawMetricsTable

Changes
-------

frontend/components/FairPriceBarChart.tsx (NEW, ~210 LOC)
- "use client" + strict TS. Recharts only (already in deps).
- Renders one horizontal bar per APPLICABLE method (skipped methods
  are simply absent — different from the FairPriceCard table which
  shows them with reasons).
- Outlier detection: each method whose extreme_<method>_estimate
  warning appears in fair_price.valuation_warnings is grayed
  (slate-400) instead of indigo-500. The 6-method warning convention
  was set in PR 3c Step 7.5.
- ReferenceLines for context:
    * Current price — rose-500 dashed, with "Current $X.XX" label
    * Median — indigo-700 solid (the headline ensemble value)
    * Max (excl. outliers) — indigo-300 solid (upside scenario)
- Domain capping: 1.2× max(non-outlier values + current_price +
  median + max). Outlier values are intentionally excluded from
  the cap so an extreme value extends off-chart on the right —
  visual signal that the value is extreme. Tooltip surfaces the
  unclamped raw_value.
- Negative method values clamped to 0 in the bar geometry (bars
  can't go negative visually); raw_value preserved in tooltip.
- Custom Tooltip component with method label + formatted price +
  outlier annotation when applicable.
- Inline legend below the chart explains the 5 visual elements
  (Applicable, Outlier when present, Median, Max, Current).
- Footer note about outlier 5×/0.2× cutoff shown only when at least
  one outlier is in the chart.
- Returns null when:
    * fair_price == null/undefined (forward-compat for older schema)
    * current_price == null/NaN (no reference line possible)
    * Zero applicable methods (e.g., BKR with all 6 nulled by the
      Step 7.5 data-quality sanity guard)
- Recharts naming gotcha documented: BarChart layout="vertical"
  produces HORIZONTAL bars. Inline comment so future readers don't
  flip it.
- Method order matches FairPriceCard table order (Graham → P/E →
  P/B → EV/EBITDA → RIM → DCF) for visual continuity between the
  two components.
- Accessibility: aria-label on the section, role="img" with descriptive
  aria-label on the chart container, aria-hidden on decorative
  legend swatches.
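
The outlier, clamping, and domain-cap rules above can be sketched language-neutrally as follows. This is a minimal Python sketch, not the component's TypeScript: the `Bar` type, the function names, and the method keys are hypothetical, and only the `extreme_<method>_estimate` convention and the 1.2× cap come from the notes above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Bar:
    method: str
    raw_value: float  # unclamped value, surfaced in the tooltip
    value: float      # clamped to >= 0 for bar geometry
    is_outlier: bool  # grayed when True


def build_bars(methods: Dict[str, Optional[float]], warnings: List[str]) -> List[Bar]:
    """One bar per applicable method; skipped (None) methods are simply absent."""
    bars = []
    for method, raw in methods.items():
        if raw is None:
            continue
        flagged = f"extreme_{method}_estimate" in warnings
        bars.append(Bar(method, raw, max(raw, 0.0), flagged))
    return bars


def x_domain_max(bars: List[Bar], current_price: float, median: float,
                 max_excl_outliers: float) -> float:
    """1.2x the largest non-outlier value or reference line.

    Outlier bars are left out on purpose, so they run off the right edge.
    """
    candidates = [b.value for b in bars if not b.is_outlier]
    candidates += [current_price, median, max_excl_outliers]
    return 1.2 * max(candidates)
```

Keeping `raw_value` next to the clamped `value` is what lets the tooltip report the true figure even when the bar itself is clamped or extends past the domain.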

frontend/app/stock/[ticker]/page.tsx
- Import FairPriceBarChart, wire between Tier2EventCard and
  FairPriceCard.

Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506/506 routes pre-rendered ✓ (route bundle
  shared chunks unchanged at 87.5 kB)
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no regression)

Decisions worth flagging
------------------------

1. Outlier graying via Recharts <Cell> children (the standard
   per-bar override pattern), not a fill function. Cell-level fill
   is the documented Recharts API.
2. Aggregate markers (median/max) implemented as ReferenceLines,
   NOT separate Bars or ReferenceDots. Cleaner for a layout="vertical"
   BarChart and avoids confusing them with method bars.
3. Tooltip is a custom React component because Recharts' default
   tooltip can't render the "outlier — excluded from MAX" affordance.
4. Method labels match the FairPriceCard's METHOD_LABELS object
   verbatim ("Graham (defensive)", "P/E multiples", etc.) — single
   source of truth in spirit, though duplicated in this component
   for now. If the labels diverge, both UIs need a sync.

What's NOT in this commit
-------------------------
- Documentation updates (Step 9)
- Production verification via workflow_dispatch (Step 10)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Documentation-only commit. No code changes; tests / lint / schema
unchanged at 500 passing.

PHASE_STATUS.md (+65 lines)
- 3d row: ⚪ NEXT → ✅ DONE 2026-05-10. Lists all 3 Tier-2
  defenses with their modules + sources + commit refs (Steps 2/3
  commits). Notes the orchestrator + 10-K text fetcher as
  supporting infrastructure. Frontend additions (Tier2EventCard,
  PillarRadarChart, FairPriceBarChart) listed.
- Defense scorecard table flipped from "Now (post-3c)" to
  "Now (post-3d)": 4 vetoes (was 3) + 5 guards (unchanged) +
  7 annotate flags (was 5+, with `going_concern_disclosure` and
  `auditor_change` now bolded as new in 3d).
- v1.0 ETA updated: ~1 day remaining (just PR 3e).
- New "Phase 3d verified production stats — DRAFT" subsection
  with placeholder values for Step 10 fill-in.
- New "Phase 3d acceptance checklist" subsection mirroring the
  PR-3c block. 8 items already checked (Steps 1-8 + Vercel
  spot-check); 7 still pending (Step 10 production verification).

SKILL.md (+1 net line — schema-versions table row added)
- New `0.6.0-phase3d` row before the existing `1.0.0` (Phase 3e)
  row. Single dense paragraph documents the 3 defenses + their
  sources (Mayew 2015 TAR, Schroeder 2024 SSRN, Reg S-K Item 304),
  the 4th veto, the orchestrator, the 10-K cache, the schema
  additions, the new UI components, and the 21→24 reason
  taxonomy expansion.
- Old `1.0.0` row trimmed: removed the now-shipped Tier-2 scope
  description; left only the Tier-3 + Honest Limitations
  description (unchanged behavior).

docs/METHODOLOGY.md (+64 lines)
- Schema reference: `0.5.0-phase3c` (2026-05-09) →
  `0.6.0-phase3d` (2026-05-10).
- Defense-count summary: "7 active defenses — 3 vetoes + 5
  guards + 5+ annotate" → "10 active defenses — 4 vetoes + 5
  guards + 7 annotate".
- Active-vetoes table: 3 → 4 rows. New row: `non_reliance_filing`
  (8-K Item 4.02 within 365d, Schroeder 2024 SSRN).
- Annotate-only flags list: 5+ → 7 entries. New entries:
  `going_concern_disclosure` (with FP-rate caveat per Mayew et
  al. methodology) and `auditor_change` (with Reg S-K Item 304
  + FP-rate-too-high-for-veto reasoning).
- New "Tier-2 events" subsection between the Defense layer
  and Sanity tests sections. Contents:
    * Source/defense/mode/lookback table (10-K + two 8-K paths)
    * Cache strategy (90d 10-K, 7d 8-K) with rationale
    * Failure semantics (per-fetch None, never raises)
    * Implementation modules (4 files: going_concern,
      eight_k_events, tier2 orchestrator, filing_text)
    * Why-veto-vs-annotate split for 4.02 (high precision,
      Schroeder 50%) vs 4.01 (audit-firm rotation drives FP rate
      too high)

docs/RESEARCH_FINDINGS.md (+38 lines)
- PR 3d Tier-2 section (#7-9): "next" → "✅ SHIPPED 2026-05-10".
  Each defense annotated with implementation module + commit ref
  (Step 2 commit fee4498 / Step 3 commit cedadca / Step 4
  commit b90930e / Step 5 commit 9cd2c74).
- LOC estimate updated: ~270 → ~520 (closer to actual delivered
  scope including the orchestrator + cache + frontend).
- Added a paragraph listing the supporting infrastructure
  (orchestrator, 10-K cache, schema additions, frontend) and
  the 423 → 500 test count delta with breakdown across new test
  files.
- PR 3e + Phase 4+ sections unchanged (forward-looking).

Cross-doc consistency check
---------------------------

Search confirmed all `current state` references flipped:
- "21 stable identifiers" → "24 stable identifiers" everywhere
  the doc describes the live taxonomy. Historical references
  (e.g., the PR-3c row in SKILL's schema table, the PR-3c
  verified-stats block in PHASE_STATUS) correctly retain the
  21 count — that's accurate history.
- "3 vetoes" / "5+ annotate" similarly flipped where they
  describe the post-3d state.
- "0.5.0-phase3c" remains in PR-3c historical references (its
  schema-table row, its DONE block) — correct.

What's NOT in this commit
-------------------------

- WORKFLOW.md — Defense Roadmap is the single source of truth
  for unshipped schemas; PR 3d items there will close in
  Step 10's final docs pass once `tier2_coverage_pct` lands in
  production.
- README.md — Honest Limitations + v1.0 marketing copy lands
  in PR 3e.
- stock_ranking_knowledge.md — formula reference, not phase-
  specific.

Verification
------------
- ruff check . → clean (no Python touched)
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 500 passed (unchanged)
- Doc line counts: 2552 → 2720 (+168 lines)

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…7→91)

Cleanup follow-up to commit 923dca5. Two interlocking errors in the
PR 3d test-count claim, caught during Step 9 verify-2:

1. The "PR 3c baseline" was wrong. Actual count at PR-3c-end was 409
   (verified via the per-step deltas in commit messages
   eea8644..c346ed5: 409 → 414 → 439 → 464 → 483 → 500). The docs
   claimed 423, which was off by 14.

2. The "+5 misc" line in the breakdown was the 5 tests in
   `tests/test_config.py` from Step 1 (schema-version smoke). Not
   actually misc — it's the foundation-step constants test.

Correction
----------

PR 3c baseline: 409 (was 423)
Current non-network: 500 (unchanged)
Net delta: +91 non-network tests + 3 @network = **+94 total added
in PR 3d**

Per-file breakdown (verified via pytest --collect-only):
- +25 going-concern (Step 2)
- +25 8-K-events non-network + 3 @network (Step 3)
- +17 tier2 orchestrator (Step 5)
- +13 tier2 schema (Step 4)
- +6 risk-overlay non-reliance C-tests (Step 4)
- +5 config smoke (Step 1)
Sum: 91 non-network ✓ matches 500 - 409 ✓

Files changed
-------------

PHASE_STATUS.md
- Line 84: "77 new tests (500 - 423 baseline)" →
  "91 new non-network tests (409 → 500) + 3 @network tests added"

docs/RESEARCH_FINDINGS.md
- PR 3d shipped block tests bullet rewritten with the corrected
  baseline + the 6-line per-file breakdown matching the actual
  per-step commits.

What was correct already
------------------------

SKILL.md and METHODOLOGY.md don't carry test-count references for
PR 3d (only the qualitative scope description), so neither needed
the cleanup.

Verification
------------

- ruff check . → clean (docs only)
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 500 passed (unchanged)
- grep confirms no lingering "423" / "+77" PR-3d references in
  any doc.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
edgartools' filing.text() routes through hybrid_section_detector,
which is CPU-bound at ~5-10s per 10-K. At ~19,000 section detections
× 502 stocks, the first cold-cache run blew past the 45-min
workflow_dispatch timeout.

Going-concern phrase scan only needs raw text — \b-anchored regex
with [\s\-]+ flex handles BS's get_text output identically to
rich_to_text output. Section structure is irrelevant for phrase
matching; the LM-dictionary phrases appear verbatim in the prose
body regardless of MD&A boundaries.
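
A minimal sketch of why raw-text extraction is enough for the phrase scan. Assumptions: the stdlib `html.parser` stands in for BeautifulSoup's `get_text(separator=" ")` so the example has no dependencies, and the phrase pattern is a hypothetical one, not the project's actual phrase list.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes only; tags and section structure are ignored."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return " ".join(parser.chunks)


# Hypothetical pattern: \b anchors plus [\s\-]+ between words, so extra
# spaces, newlines, or hyphens left behind by tag stripping still match.
GOING_CONCERN = re.compile(
    r"\bsubstantial[\s\-]+doubt\b.{0,200}?\bgoing[\s\-]+concern\b",
    re.IGNORECASE | re.DOTALL,
)


def scan_html(html: str) -> bool:
    return GOING_CONCERN.search(html_to_text(html)) is not None
```

Inline tags such as `<b>` or `<br>` only add whitespace between words after stripping, which the `[\s\-]+` flex absorbs.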

Cold-run estimate: 50+ min (timed out) → ~25-30 min (saves 40-75 min).
Subsequent runs: unchanged (90-day 10-K cache hits).

No new dependencies (beautifulsoup4 + lxml already in pyproject.toml).
No test changes to scan_going_concern (synthetic mocking unaffected).

Changes
-------

compute/ingest/filing_text.py (+22 net LOC)
- Add `from bs4 import BeautifulSoup` import.
- Replace the 13-line `text_attr = getattr(most_recent, "text", None)`
  block with a 14-line `html_attr = getattr(...) → BeautifulSoup(html,
  "lxml").get_text(separator=" ")` block. Preserves the existing
  callable-vs-property handling for `html` (same dual-shape API
  caveat as `text` had).
- Updated function docstring to document the perf decision: explains
  why we deliberately skip `filing.text()`, names the section
  detector cost, references the workflow_dispatch timeout, and
  confirms correctness equivalence for the going-concern scan.
- New "10-K HTML fetch returned empty" warning code path: when
  `filing.html()` returns falsy, log + return None gracefully.

tests/test_ingest/test_filing_text.py (NEW, 5 cases)
- test_bs_extraction_preserves_going_concern_phrase_inline_b_tags:
  phrase split by `<b>` tags survives extraction.
- test_bs_extraction_handles_table_layouts: SEC table-formatted
  MD&A still yields detectable phrases.
- test_bs_extraction_collapses_inline_styling_whitespace:
  inline spans / `<br>` tags don't break `[\s\-]+`-flex regex.
- test_bs_extraction_negative_clean_filing: inverse sanity —
  clean text after BS strip still scans False.
- test_bs_extraction_empty_html_yields_empty_text: edge case.

Locks the contract: if edgartools' html() output ever changes
shape, or BS extraction starts losing whitespace boundaries, this
test file catches it before production.

Verification
------------
- ruff check . → clean
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 505 passed (was 500 → +5 new
  BS-contract tests)
- npx tsc --noEmit → clean
- python -c "from compute.ingest.filing_text import
  fetch_latest_10k_text" → imports cleanly

edgartools API edge cases discovered
------------------------------------

Confirmed via `inspect.getsource`:

- `Filing.text()` is a `@lru_cache(maxsize=4)` method that fetches
  HTML via `self.html()`, then runs `HTMLParser(ParserConfig(form=
  ...)).parse(html_content)` → `rich_to_text(document, width=500)`.
  The HTMLParser path is what calls `hybrid_section_detector`.
- `Filing.html()` is also a method (not a property — confirmed by
  inspect). Returns `sgml.html()` directly with no parsing. Same
  dual-shape (callable vs property) defensive handling preserved
  in our code in case future edgartools versions change.
- `Filing.full_text_submission()` exists but downloads the entire
  .txt submission (primary + exhibits + headers); much larger
  payload than just html() of the primary document. Not used.
- `Filing.text_url` (property) just returns the URL; we'd download
  + strip ourselves. Equivalent path but with extra HTTP round
  trip. Not used.

Cache shape unchanged — still stores stripped text under the same
JSON key; existing 90-day TTL applies.

What's NOT in this commit
-------------------------
- Changes to compute/scoring/going_concern.py (regex-side is
  already correctness-equivalent on the new input)
- Changes to fetch_tier2_for_ticker (orchestrator is unchanged;
  only the underlying text-source path changed)
- Workflow timeout bump (Option A backup) — not needed if Option B
  succeeds; can be reconsidered if next workflow run still hits
  the 45-min ceiling for unrelated reasons

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
claude added 2 commits May 10, 2026 12:34
edgartools' EightK.items + EightK.sections route through
hybrid_section_detector via the @cached_property document parsing
(HTMLParser(ParserConfig(form='8-K')).parse(html)).

Cost: ~3,500 detector invocations across 502 stocks × ~7 8-K
filings each. This dominated the 45-min workflow timeout in run
#13 even after Option B fixed the 10-K path (commit 226840d).

Mirror edgartools' own Strategy 3 regex fallback for legacy SGML
filings (current_report.py:51 _extract_items_from_text) — proven
correct on the same input distribution. Use filing.html() +
BeautifulSoup raw text + regex Item detection.
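
A minimal sketch of regex-based Item detection on already-stripped text. The pattern and function name are hypothetical and do not reproduce edgartools' `_extract_items_from_text`; the sketch only illustrates canonicalization, dedup, and the 4.020 false-match guard that the new tests cover.

```python
import re
from typing import List

# Matches "Item 4.02", "ITEM 4.02." or "Item  4.02:".
# The (?!\d) lookahead rejects longer numbers such as "4.020".
ITEM_RE = re.compile(r"\bitem\s+(\d{1,2})\.(\d{2})(?!\d)", re.IGNORECASE)


def extract_items(text: str) -> List[str]:
    """Return canonical item numbers ("4.02") in first-seen order, deduplicated."""
    seen: List[str] = []
    for match in ITEM_RE.finditer(text):
        canonical = f"{int(match.group(1))}.{match.group(2)}"
        if canonical not in seen:
            seen.append(canonical)
    return seen
```

A caller could then check `"4.02" in extract_items(text)` without ever constructing a parsed 8-K document.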

Adds tests A15-A20 (+6) covering canonicalization, dedup, excerpt
cap, empty HTML, and 4.020 false-match guard. Test count: 505 → 511.

Cold-run time: 45-min timeout → expected 25-30 min.
Subsequent runs: unchanged (cache hits).

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Defense-in-depth safety net for cold-cache scenarios. With the
Scenario A perf fix in 12ad7ff (skip 8-K parser via raw HTML +
regex), expected cold run is 25-30 min — well under the previous
45-min ceiling. But:

1. We hit the 45-min ceiling twice already (runs #12, #13).
   Operating on the cliff edge is bad practice.
2. Phase 4+ defenses (Beneish, Dechow, REIT FFO/AFFO,
   cross-source validator) will each add fetch + compute time.
   Better to have headroom than re-fight this battle later.
3. GitHub Actions billing is by minute and timeout-minutes only
   matters on FAILED runs (which terminate at the cap). Successful
   runs bill actual time. Bumping the cap costs nothing if the
   perf fix works.
4. 90 min = comfortable 3x baseline. Future expansion has room
   without further ceiling fights.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…efer

Run #14 timeout root cause: SEC EDGAR API throttling amplified by
tenacity retry policy (max=30s × 3 attempts = 60-90s per failed
stock). Run #11 (2 days ago) finished in 23m on same code path;
Run #14 stuck 1h+ in fundamentals stage = 3-6x SEC API slowdown
during incident.

Mitigations:
1. Tighten retry on BOTH _build_snapshot and _build_annual_history:
   stop=(stop_after_delay(30) | stop_after_attempt(2)),
   wait=wait_exponential(min=2, max=8). Caps per-stock retry at ~30s.
2. Per-stock fundamentals + history fetch timeout (fut.result(timeout=45))
   — graceful skip on stuck-task. Defensive backstop; real cap is the
   inner tenacity stop_after_delay.
3. Suppress noisy edgartools concept-miss UserWarnings via
   facts._suppress_warnings = True after company.get_facts(). Skips
   the difflib fuzzy-match suggestion pass and frees stderr for triage.
4. Per-stock latency histogram (<5s / 5-15s / 15-30s / 30s+) with
   thresholds aligned to retry-policy tiers, plus p50/p95 + top-20
   slow tickers logged for Phase 4 throttling-detection visibility.
5. fundamentals_coverage_pct + fundamentals_latency_p50_seconds +
   fundamentals_latency_p95_seconds in Metadata mirror the existing
   tier2_coverage_pct.
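
The latency reporting in mitigation 4 could look roughly like this. A minimal sketch using only the standard library: the function names and the returned dict shape are hypothetical, and only the bucket thresholds, p50/p95, and the top-20 list come from the description above.

```python
import statistics
from typing import Dict

BUCKETS = [(5.0, "<5s"), (15.0, "5-15s"), (30.0, "15-30s")]


def bucket(seconds: float) -> str:
    for upper, label in BUCKETS:
        if seconds < upper:
            return label
    return "30s+"


def summarize_latencies(latencies: Dict[str, float]) -> dict:
    """Histogram, p50/p95, and slowest tickers from per-ticker seconds.

    Needs at least two data points (statistics.quantiles requirement).
    """
    histogram = {"<5s": 0, "5-15s": 0, "15-30s": 0, "30s+": 0}
    for seconds in latencies.values():
        histogram[bucket(seconds)] += 1
    # n=100 yields 99 cut points; index 49 is p50 and index 94 is p95.
    cuts = statistics.quantiles(sorted(latencies.values()), n=100, method="inclusive")
    slowest = sorted(latencies, key=latencies.get, reverse=True)[:20]
    return {"histogram": histogram, "p50": cuts[49], "p95": cuts[94], "slowest": slowest}
```

A rising "30s+" bucket or a p95 near the 30 s retry cap would be the kind of signal that suggests upstream throttling rather than a slow code path.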

ALSO: defer 8-K event fetches (Defenses #9 + #10) to Phase 4. Three
workflow timeouts (#12, #13, #14) consumed budget; ship PR 3d with
going-concern (Defense #8) only. _EIGHT_K_DEFENSES_ENABLED feature
flag gates the 8-K branch — single-line flip in Phase 4 to re-enable
once the pre-cache layer lands. Schema unchanged; 8-K event fields
in tier2_events emit but always False/None until Phase 4. Active veto
count temporarily 3 (was planned 4); restored in Phase 4.
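
A minimal sketch of the deferred-mode gating described above. Only `_EIGHT_K_DEFENSES_ENABLED` is named in this commit; the function, its parameters, and the five field names are hypothetical placeholders for the real tier2_events shape.

```python
from typing import Callable, Optional

_EIGHT_K_DEFENSES_ENABLED = False  # deferred; flip to True to re-enable


def build_tier2_events(
    ticker: str,
    scan_going_concern: Callable[[str], Optional[bool]],
    fetch_8k_flags: Callable[[str], dict],
) -> dict:
    """Always emit the same keys; 8-K fields stay False/None while deferred."""
    events = {
        "going_concern": scan_going_concern(ticker),  # hypothetical field names
        "non_reliance_8k": False,
        "non_reliance_filing_url": None,
        "auditor_change_8k": False,
        "auditor_change_filing_url": None,
    }
    if _EIGHT_K_DEFENSES_ENABLED:
        # The 8-K fetch is never invoked while the flag is off.
        events.update(fetch_8k_flags(ticker))
    return events
```

Emitting the full key set either way is what keeps the schema stable while the flag is off.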

Tests: 511 → 526 (+15: 5 deferred-mode tier2, 5 histogram/percentile/
tuple-return main, 1 retry-policy contract, 4 fixture-extended A/D
tests for 8-K wiring).

Tracked: /tmp/issue_drafts/issue_8k_events_phase4.md +
/tmp/issue_drafts/issue_fundamentals_resilience_phase4.md.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
@dackclup dackclup changed the title feat(phase-3d): Tier-2 event defenses + UI polish feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience May 10, 2026
@dackclup dackclup marked this pull request as ready for review May 10, 2026 15:43
@dackclup dackclup marked this pull request as draft May 10, 2026 15:56
Three Recharts label-clipping issues found in production Vercel
preview spot-check on iPhone-class viewports:

1. FairPriceBarChart 'Current $X' ReferenceLine label clipped
   above chart frame. Moved from position='top' to
   'insideBottomRight' with offset=8.
2. FairPriceBarChart rightmost x-axis tick ('$1238' for NVDA)
   clipped at right edge. Bumped BarChart margin from
   {top:8,right:24,bottom:8,left:0} to
   {top:10,right:30,bottom:10,left:10}.
3. PillarRadarChart 'Technical' axis label clipped at left edge.
   Added explicit cx/cy/outerRadius props; reduced outerRadius
   from default 80% to 70% to give label breathing room on all
   8 axes.

No component logic changes; CSS/margin/position adjustments only.
Frontend build still passes 506/506 routes; tsc clean.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Two additional UI quality issues found in deeper spot-check:

1. Rankings cards with null fair_price (~15 of 502 stocks
   including SPG #1, BKR #6) rendered visibly shorter than
   populated cards, breaking visual rhythm in the mobile scroll
   list. Added min-h-[112px] for uniform card height + h-full on
   the inner Link so the hover area fills the card. Improved
   placeholder typography: 'Fair ⚠ N/A' (slate-400, with title
   tooltip) instead of bare '—', matching the visual weight of
   populated rows.

2. FairPriceBarChart small-value bars (e.g., Graham defensive ~$28,
   Residual Income ~$23 on NVDA) crushed to ~6 pixels when chart
   spans $0-$1238 due to outlier EV/EBITDA $1031. Added
   minPointSize=5 to Bar component — small bars stay visible
   regardless of value magnitude. Linear scale preserved.

No component logic changes; CSS/Tailwind/Recharts prop adjustments
only. Frontend build still passes 506/506 routes; tsc clean.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Three layout iterations from continued user spot-check:

1. FairPriceBarChart 'Current $X' label moved back to position='top'
   with offset=12 + chart top margin bumped from 10 to 35 to prevent
   clipping. The previous insideBottomRight position overlapped the
   y-axis 'Residual Income' label. Median + max ReferenceLines have
   no labels so don't interfere.

2. FairPriceBarChart visual rhythm:
   - container height 18rem (288px) → 300px (more vertical room)
   - barCategoryGap='25%' (cleaner method separation)
   - y-tick fontSize 11 → 10, width 140 → 110 (label fit at smaller
     font + more horizontal bar space)

3. RankingTable mobile card layout restructured from horizontal-flex
   to flex-col with three rows:
   - Row 1 (items-start flex): ticker+rank+name on left, score badge
     top-aligned on right (was vertically centered, floating mid-card)
   - Row 2 (justify-between): sector | price (tabular-nums)
   - Row 3 (justify-between): fair | MoS (tabular-nums)
   Cleaner visual rhythm across all 502 cards regardless of name/sector
   length or null fair_price.

No component logic changes; Tailwind/Recharts layout adjustments only.
Frontend build still passes 506/506 routes; tsc clean.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Three PR-3d components carried sm:max-w-2xl that capped their width at
672px on viewports ≥640px (Tailwind sm breakpoint), while existing cards
on the same stock detail page (FairPriceCard, RawMetricsTable) have no
width cap and span the full container. On tablet/large-phone viewports
this read as visual inconsistency: the new cards looked narrower than
the surrounding sections, breaking the page rhythm.

Removed sm:max-w-2xl from:
- FairPriceBarChart
- PillarRadarChart
- Tier2EventCard

All three now match the full-width treatment used by FairPriceCard +
RawMetricsTable. No layout changes inside the cards — the charts'
ResponsiveContainer already adapts to the parent width.

Frontend build still passes 506/506 routes; tsc clean.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
@dackclup dackclup marked this pull request as ready for review May 10, 2026 17:18
@dackclup dackclup merged commit cd32810 into main May 10, 2026
4 checks passed
@dackclup dackclup deleted the claude/phase-3d-tier2-events branch May 10, 2026 17:20
dackclup pushed a commit that referenced this pull request May 14, 2026
Independent verification of PR #49's TTM flow-item fix surfaced a
related but distinct bug — 26 of 502 S&P 500 tickers shipped with
shares_outstanding=None (and consequently market_cap=None, PE=NaN,
PB=NaN). Affected tickers include front-page names: META (rank #12),
ACN, MA, BRK-B, CMCSA, DASH, LEN, ABNB.

Root cause: the `_BALANCE_TAGS["shares_outstanding"]` chain only had
`us-gaap:CommonStockSharesOutstanding` + `CommonStockSharesIssued`.
META / ACN etc. don't tag either concept — they only file shares via
`us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding` (the EPS
denominator).

Two distinct patterns surfaced when probing more candidate tags:

1. DEI-current pattern (WMT, META, ACN). The cover-page tag
   `dei:EntityCommonStockSharesOutstanding` reflects splits / buybacks
   within ~1 quarter of the filing date. WMT's DEI tag holds the correct
   post-Feb-2024-split 8B shares; the us-gaap tag held a stale 3.42B.

2. DEI-stale pattern (MA, BRK-B). DEI tag frozen at 2010-2011 for some
   legacy filers — MA shows 122M (2010-10-27) vs the correct 893M from
   `WeightedAverageNumberOfDilutedSharesOutstanding` (2026-03-31).
   First-non-null chain ordering can't distinguish "current DEI" from
   "stale DEI".

Fix:

- New `_try_balance_tags_most_recent(facts, tags)` helper picks the
  candidate concept with the most recent `period_end` across the entire
  chain instead of taking the first non-null. Falls back to None when
  all candidates are missing.
- `_build_snapshot` routes `shares_outstanding` through the new helper
  (other balance items keep first-non-null semantics — they don't have
  the multi-concept-divergence issue).
- Chain expanded: DEI + WeightedAverageDiluted + WeightedAverageBasic
  added; legacy us-gaap tags retained.
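
A minimal sketch of the most-recent selection rule. Assumption: facts are modelled here as a plain dict of tag to (value, period_end); the real helper works on edgartools facts objects, whose API is not shown in this commit.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

Fact = Tuple[float, date]  # (value, period_end)


def try_balance_tags_most_recent(
    facts: Dict[str, Fact], tags: Iterable[str]
) -> Optional[float]:
    """Pick the value whose period_end is latest across the whole chain.

    Ties keep the earlier tag in the chain; returns None if no tag is present.
    """
    best: Optional[Fact] = None
    for tag in tags:
        fact = facts.get(tag)
        if fact is None:
            continue
        if best is None or fact[1] > best[1]:
            best = fact
    return None if best is None else best[0]
```

With MA's figures from the message above, a DEI fact dated 2010-10-27 loses to the weighted-average fact dated 2026-03-31 regardless of chain order.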

Validation against 6 tickers:
  Ticker  Before            After                Expected
  META    None              2.564B               ~2.5B ✓
  WMT     3.42B (stale)     7.97B                ~8B post-split ✓
  ACN     None              624M                 ~637M ✓
  MA      122M (DEI stale)  893M                 ~893M ✓
  BRK-B   None              1.64M (multi-class)  ~2.16B (still wide)
  AAPL    14.69B            14.69B               control, unchanged ✓

BRK-B's dual-class A/B structure remains an edge case — no standard
concept captures the consolidated share count. Caught by the existing
`data_quality_input_corruption` veto (TBVPS would exceed $10K/share with
~$560B equity and only ~1.6M reported shares), so BRK-B is correctly
excluded from Top-5 ranking.

Regression guard: test_shares_outstanding_fallback_chain_covers_dei_and_weighted_avg
ensures future ingest edits can't silently drop these alternative tags.

Full offline suite: 646 / 646 pass.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
dackclup added a commit that referenced this pull request May 14, 2026
…#49)

* fix(ingest): TTM flow items + pe_ratio formula (audit #6, deep clean)

Pre-v1.0 audit #6 found systemic single-period bugs in 9 income-statement
flow items. The `_NORMALIZED_LATEST` dict pulled values via
`facts.get_concept('xyz')` which returns latest single-period — a value
that can be Q1, H1 YTD, Q3 YTD, or full annual depending on each filer's
reporting cadence. Probed across 4 tickers in May 2026:

  TSLA  operating_income  $941M (single Q1-2026) vs $4.9B (TTM)  (0.19× of TTM)
  AAPL  operating_income  $87B (H1 YTD)          vs $147B (TTM)  (0.59× of TTM)
  GOOGL gross_profit      $96B (H1 YTD)          vs $217B (TTM)  (0.44× of TTM)

Universe-wide impact:

- pe_ratio (= price / eps_diluted) broken on 381 of 430 S&P 500 tickers
  (88.6%) with PE off by > 30%. Median production PE = 77.5, correct
  median = 26.2 (3× artificial inflation).
- Profitability pillar: gross_margin, operating_margin, gross_profitability
  all mixed single-period numerator with TTM revenue → systematically
  understates margins.
- Health pillar: interest_coverage, debt_to_ebitda, Altman Z'' EBIT
  proxy all mixed single-period with balance items.
- Value pillar EV/EBITDA: broken via EBITDA = op_income + D&A both
  single-period.
- Fair-price multiples PE method, Beneish DEPI / TATA, Dechow Δroa all
  consume these snap fields downstream.

Fix:

A. New `_TTM_FLOW_TAGS` dict with US-GAAP tag chains for the 9 income-
   statement flow items (operating_income, gross_profit, cost_of_revenue,
   sga_expense, depreciation_and_amortization, interest_expense,
   income_tax_expense, research_and_development, income_before_tax)
   plus dividends_paid. Each chain is ordered most-general / modern first so the
   MAX-of-fresh heuristic (added in PR #48) picks consolidated totals
   over segment-level disclosures.

   interest_expense chain includes `InterestExpenseOperating` and
   `InterestExpenseNonoperating` (newer concepts) ahead of legacy
   `InterestExpense` — AAPL / MSFT / JPM / TSLA all probed stale on
   the legacy tag post-2024.

B. `_build_snapshot` now walks `_TTM_FLOW_TAGS` via the existing
   `_try_ttm_max_fresh` helper. `_NORMALIZED_LATEST` reduced to just
   `eps_basic` / `eps_diluted` (per-share figures with no clean TTM-via-
   tag substitute; consumers derive TTM EPS from NI_TTM / shares instead).

C. `pe_ratio` rewritten to derive EPS as `NI_TTM / shares_outstanding`
   instead of reading `snap.eps_diluted`. Validation against 8 diverse
   tickers post-fix (6 listed here): AAPL 35.8, MSFT 24.0, NVDA 45.7,
   TSLA 425.9, GOOGL 30.4, JPM 13.7. All are within reasonable industry
   ranges (vs prior 61.6, 30.8, 46.1, 3425.2, 78.8, 50.5 with the
   mixed-period bug).
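
A minimal sketch of the rewritten formula, including the NaN behaviour that the regression guards below name. The function signature is an assumption; the real `pe_ratio` reads its inputs from the snapshot object.

```python
import math
from typing import Optional


def pe_ratio(
    price: Optional[float],
    net_income_ttm: Optional[float],
    shares_outstanding: Optional[float],
) -> float:
    """P/E from trailing-twelve-month net income, not a single-period EPS."""
    if price is None or net_income_ttm is None or shares_outstanding is None:
        return math.nan  # missing inputs
    if shares_outstanding <= 0 or net_income_ttm <= 0:
        return math.nan  # P/E is not meaningful for zero or negative earnings
    eps_ttm = net_income_ttm / shares_outstanding
    return price / eps_ttm
```

Deriving EPS inside the function keeps numerator and denominator on the same TTM basis, which is the point of the fix.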

Regression guards added:
  - test_ttm_flow_tags_replace_normalized_latest_for_income_statement
  - test_pe_returns_nan_for_negative_earnings + test_pe_returns_nan_when_inputs_missing
  - test_pe_ratio (now expects PE=40 for fixture, was 20 under broken formula)

Full offline suite: 645 / 645 pass. Frontend tsc + next build clean.
Validation against EDGAR-direct ground truth pending workflow_dispatch
re-run + audit shortlist.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

* fix(ingest): smart shares_outstanding fallback (audit #6 follow-up)

* fix: complete eps_diluted → NI_TTM/shares migration (audit #6 follow-up)

Re-audit found 4 additional consumers of snap.eps_diluted (across 3
files) that PR #49's initial fix missed:

1. `compute/main.py::_build_universe_metrics` — pe_ttm used in
   raw_metrics.pe_ratio_ttm display + peer median computation. Switched
   to NI_TTM/shares.
2. `compute/main.py` per-ticker loop — raw_metrics.pe_ratio_ttm field
   that feeds the StockSummary JSON. Same fix.
3. `compute/valuation/ensemble.py::multiples_pe_fair_price` call site —
   eps_ttm input was snap.eps_diluted (single-period). Switched to
   NI_TTM/shares so the fair-price PE method is consistent with the
   value-pillar PE.
4. `compute/features/value.py::graham_number` (TTM variant used as a
   pillar factor, distinct from the fair-price Graham). Switched to
   NI_TTM/shares.

Without these, the fair-price PE method would have continued using
the wrong (quarterly/YTD/annual) EPS, affecting the peer medians used
for relative valuation: the PE pillar would have been fixed while the
fair price stayed corrupt.

Tests updated:
- test_main.py _snap fixture: add net_income=50 (matches eps_diluted=5
  with shares=10 so existing test invariants hold).
- test_build_universe_metrics_pe_uses_diluted_eps → renamed
  test_build_universe_metrics_pe_uses_ttm_eps_not_single_period,
  expectation switched to NI-derived.
- test_build_universe_metrics_negative_eps_yields_null_pe → renamed
  test_build_universe_metrics_negative_ni_yields_null_pe, uses
  net_income=-10 override.
- test_graham_number expectation updated: √(22.5 × 1.0 × 12) ≈ 16.43
  (vs prior √540 ≈ 23.24 under broken formula).
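
The updated expectation follows from the standard Graham number formula,
√(22.5 × EPS × BVPS). A short sketch, where the function signature and
the None-on-non-positive guard are assumptions and not necessarily what
`compute/features/value.py` does:

```python
import math
from typing import Optional


def graham_number(eps: float, bvps: float) -> Optional[float]:
    """Graham number: sqrt(22.5 * EPS * book value per share).

    Returns None when either input is non-positive, since the square
    root would be undefined or meaningless (assumed behavior).
    """
    if eps <= 0 or bvps <= 0:
        return None
    return math.sqrt(22.5 * eps * bvps)


# EPS 1.0 and BVPS 12 give sqrt(270), roughly 16.43.
expected = graham_number(1.0, 12.0)
```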

Full offline suite: 646 / 646 pass.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

* fix(ingest): expand TTM revenue + NI chains for utilities, tech, BKNG (audit #6 deeper)

Deeper audit found 3 more tag-coverage gaps causing None revenue / NI
across multiple tickers:

1. **BKNG NI = None**: edgartools picks `us-gaap:NetIncomeLoss` which
   is frozen at 2012 for BKNG. Fresh data lives under
   `us-gaap:NetIncomeLossAvailableToCommonStockholdersBasic` ($6.15B
   2026-03-31). Added to `_TTM_NET_INCOME_TAGS` chain.

2. **DUK / utilities revenue = None**: `us-gaap:Revenues` frozen at
   2017 for DUK. Fresh data lives under
   `us-gaap:RegulatedAndUnregulatedOperatingRevenue` ($33.17B). This
   is the standard utility-sector revenue concept. Added.

3. **CRWD / tech revenue = None**: CrowdStrike + some other modern
   tech filers tag revenue under
   `us-gaap:RevenueFromContractWithCustomerIncludingAssessedTax` (the
   "Including" assessed-tax variant of ASC 606) rather than the more
   common "Excluding" variant. Added.

The MAX-of-fresh heuristic from PR #48 handles concept selection — we
just need every relevant concept to be in the chain.
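
One plausible reading of the MAX-of-fresh heuristic, sketched under
stated assumptions: drop every concept whose newest fact is older than a
freshness window, then take the largest value among the survivors. PR #48
is not shown here, so the function name, the 200-day window, and the
tuple layout are all hypothetical, and the stale 2017 value and both
dates are placeholders rather than DUK's real figures.

```python
from datetime import date, timedelta
from typing import Optional

# (concept, period_end, value_usd)
Fact = tuple[str, date, float]


def pick_max_of_fresh(facts: list[Fact], as_of: date,
                      max_age_days: int = 200) -> Optional[Fact]:
    """Keep concepts with a recent period end, then return the largest."""
    cutoff = as_of - timedelta(days=max_age_days)
    fresh = [f for f in facts if f[1] >= cutoff]
    if not fresh:
        return None
    return max(fresh, key=lambda f: f[2])


duk_facts: list[Fact] = [
    # Frozen concept: placeholder value, last reported in 2017.
    ("us-gaap:Revenues", date(2017, 12, 31), 1.0e9),
    # Fresh utility concept: $33.17B per the commit message; date assumed.
    ("us-gaap:RegulatedAndUnregulatedOperatingRevenue",
     date(2026, 3, 31), 33.17e9),
]
picked = pick_max_of_fresh(duk_facts, as_of=date(2026, 5, 10))
```

Under this reading a concept missing from the chain can never be picked,
which is why the fix is to extend the chain and leave the selection logic
alone.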

Validation results (universe sweep):
  Before this commit: 4 NI=None + 25 revenue=None (~6% universe)
  After this commit:  expected to drop to ~7 revenue=None
                      (banks WFC/GS/etc. need interest+noninterest
                      aggregation — Phase 4) + APA (energy-specific
                      tagging — Phase 4 too).

Per-ticker fixes verified locally:
  Ticker  Before     After
  DUK     None       $33.2B revenue
  CRWD    None       $4.8B revenue
  BA      None       $92.2B revenue (was None due to old edgartools
                                     pre-3-fresh chain — now works
                                     since `Revenues` is the only
                                     fresh concept)
  LMT     None       $75.1B revenue
  BKNG    NI=None    NI=$6.15B

Control unchanged: AAPL revenue $451.4B / NI $122.6B.

Full offline suite: 646 / 646 pass.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

* fix(ingest): add bank-specific RevenuesNetOfInterestExpense concept

Continued deep audit found WFC + GS shipping with revenue=None. Both are
diversified banks that don't tag `us-gaap:Revenues` for consolidated
revenue — instead they file under `us-gaap:RevenuesNetOfInterestExpense`
(industry-standard for banks reporting net-interest income + noninterest
income aggregated).

  WFC: was None → $85.0B fresh (Q1 2026)
  GS:  was None → $60.4B fresh (Q1 2026)

Adds the concept to `_TTM_REVENUE_TAGS` in the appropriate position
(after the utility-specific concept, before the legacy SalesRevenueNet
fallback). The MAX-of-fresh heuristic picks correctly: BAC already had
fresh `us-gaap:Revenues` so it sticks with that; WFC/GS fall to the
bank concept.

Remaining None-revenue tickers post-this-fix:
  - HBAN (regional bank — uses InterestIncome / NoninterestIncome
    separately, needs aggregation logic — Phase 4)
  - APA (Apache energy — uses srt: oil/gas-specific tags rather than
    us-gaap — Phase 4)

Full offline suite: 646 / 646 pass.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

* fix(ingest): mirror TTM tag chain expansion to _ANNUAL_TAGS

Continued sweep found annual history gaps for the same tickers that
needed snapshot-tag fixes:

  Before annual-chain fix:
    BKNG: 6y revenue, 0y NI  (NI tag stale)
    WFC:  0y revenue, 6y NI  (revenue tag missing)
    DUK:  0y revenue, 6y NI  (utilities concept missing)
  After:
    All four (AAPL control + BKNG + WFC + DUK): 6y / 6y

Impact: `_avg_3y_roe` (uses NI history), `revenue_cagr` and
`fcf_5y` were silently skipping for ~50 utility / bank / multi-class
filers because their annual `_ANNUAL_TAGS["revenue"]` or
`_ANNUAL_TAGS["net_income"]` returned empty. With both chains expanded
to mirror the TTM-chain fixes:

- revenue: + RevenueFromContractWithCustomerIncludingAssessedTax (CRWD-class)
           + RegulatedAndUnregulatedOperatingRevenue (utilities)
           + RevenuesNetOfInterestExpense (banks)
- net_income: + NetIncomeLossAvailableToCommonStockholdersBasic (BKNG-class)
              + NetIncomeLossAvailableToCommonStockholdersDiluted
              + ProfitLoss

The pre-existing `_avg_3y_roe` known-issue (#11 — uses current equity
as denominator, not per-year equity) is NOT addressed here. That's a
separate Phase 4 fix; the chain expansion just lets it COMPUTE for the
~50 affected tickers instead of returning None.

Full offline suite: 646 / 646 pass.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

* ci(compute-rankings): bump fundamentals cache key to v2

Critical finding from auditing the post-PR-#48 workflow_dispatch run on
main: the fix in PR #48 (TTM concept-pick + data_quality guard) is
CORRECT in code but the production output still shows the old wrong
values for AVB / CPT / ESS / UDR and many others.

Root cause: `actions/cache@v5` restores `compute/cache/fundamentals/`
across runs within the same quarter (cache key
`fundamentals-2026Q2-ubuntu-latest`). The pre-PR-#48 cached parquets
hold the wrong $7.1M (AVB) / $12.6M (CPT) / etc. revenue values, AND
their `latest_filed_date` is recent enough that the `_is_fresh` check
(45-day rolling) passes, so `fetch_fundamentals` returns the cached
snapshot WITHOUT re-fetching from EDGAR.
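
The failure mode is easier to see in code. This is a minimal sketch of a
45-day rolling freshness check; `is_fresh` here is a hypothetical
stand-in for the real `_is_fresh`, whose signature is not shown in this
message, and the dates are illustrative.

```python
from datetime import date, timedelta


def is_fresh(latest_filed_date: date, today: date,
             window_days: int = 45) -> bool:
    """True when the cached snapshot's latest filing is within the window.

    The check looks only at dates. It cannot tell whether the cached
    values were computed by buggy or fixed ingest code.
    """
    return today - latest_filed_date <= timedelta(days=window_days)


# A parquet written by pre-fix code still passes, so it is served as-is.
stale_but_wrong_is_served = is_fresh(date(2026, 4, 20), date(2026, 5, 10))
```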

Local probe confirms: with PR #48 + #49 code, AVB returns $3.07B
revenue (correct). Production keeps shipping $7.1M because the stale
cached parquet is served instead of a fresh fetch.

Fix: bump the cache key from `fundamentals-{quarter}-{os}` to
`fundamentals-v2-{quarter}-{os}`. Forces one cache miss on the next
workflow_dispatch run, which triggers fresh fetches for all 502
tickers (~50 min cold run vs 30 min warm). After that one run, the v2
cache repopulates with corrected data.
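
The key format can be illustrated in Python. The real key is built in the
workflow YAML, so this function and its names are assumptions that only
mirror the `fundamentals-v2-{quarter}-{os}` pattern described above.

```python
from datetime import date

CACHE_SCHEMA_VERSION = "v2"  # bump when ingest tag chains change


def fundamentals_cache_key(today: date, runner_os: str,
                           version: str = CACHE_SCHEMA_VERSION) -> str:
    """Build a cache key of the form fundamentals-v2-2026Q2-ubuntu-latest."""
    quarter = (today.month - 1) // 3 + 1
    return f"fundamentals-{version}-{today.year}Q{quarter}-{runner_os}"


key = fundamentals_cache_key(date(2026, 5, 10), "ubuntu-latest")
```

Because the version is part of the key, changing it guarantees a miss
against every parquet written under the old key, without waiting for the
quarter to roll over.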

Bump the v-number whenever any tag-chain dict changes schema:
- `_BALANCE_TAGS["shares_outstanding"]` chain (PR #49 added DEI +
  weighted-average fallbacks + MAX-by-period logic)
- `_TTM_REVENUE_TAGS` (PR #49 added utility / bank / ASC-606 variants)
- `_TTM_NET_INCOME_TAGS` (PR #49 added BKNG-class concepts)
- `_TTM_FLOW_TAGS` (entirely new in PR #49)
- `_BALANCE_TAGS["property_plant_equipment"]` (PR #43)

Without this bump, the deep-clean fixes don't actually reach production.

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

---------

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 15, 2026
….6.0 → 0.7.0 (#79)

Closes #14. Closes #17 (the "10-K + 8-K" log line is now accurate again).

## What this PR does

Flips `compute.scoring.tier2._EIGHT_K_DEFENSES_ENABLED` False → True.
That single boolean was the only thing standing between the PR 3d
wiring and an active 4th veto. Active vetoes count goes 4 → 5.

| Defense | ItemFlag.fired effect | Active in production |
|---|---|---|
| `non_reliance_filing` (8-K Item 4.02, Schroeder 2024 SSRN) | hard VETO — suppresses `entered_top5` | **YES, after this PR** |
| `auditor_change` (8-K Item 4.01, Reg S-K Item 304) | ANNOTATE — emits `tier2_events.auditor_change.fired` | **YES, after this PR** |

## Why this is safe to flip now

The PR 3d deferral cited 3 workflow-timeout incidents (runs #12 / #13
/ #14). Every root cause has since been mitigated:

| PR 3d failure | Fix |
|---|---|
| Run #12 — `filing.text()` routed through `hybrid_section_detector` (~5-10s × 502 stocks) | PR 3d hotfix 226840d — `filing.html()` shortcut |
| Run #13 — `EightK.items` access also triggered the detector | PR 3d hotfix 12ad7ff — regex-on-raw-HTML extraction (mirrors edgartools Strategy 3 fallback) |
| Run #14 — SEC EDGAR throttling × overly-aggressive tenacity retry (60-90s/stuck-stock × ~40% failure rate) | PR 3d Part 1 — `stop_after_delay(30) | stop_after_attempt(2)`, 45s per-stock orchestrator timeout, edgartools warning suppression |
| Cache state lost across CI runs (only `fundamentals` was being preserved) | PR 4a — workflow cache restore step expanded to all 6 paths, including `edgar_8k` |
| Weekly compute = 7-day recovery window on failure | PR 4f — daily Mon-Fri cron = 24h recovery window |

Latency p95 has dropped from the 30+s regime that bit run #14 to
14.41s on the latest production run. Kill-switch capability
(`QR_SKIP_TIER2` env var + the feature flag itself) is preserved.
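
For readers unfamiliar with tenacity's `|` operator on stop conditions:
it stops retrying as soon as either condition holds. The stdlib-only
sketch below approximates that policy (2 attempts or 30s total, with
exponential waits capped at 8s). It is not the project's code, and the
helper name and injectable `sleep`/`clock` parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_bounded_retry(fn: Callable[[], T],
                            max_attempts: int = 2,
                            max_total_seconds: float = 30.0,
                            max_wait_seconds: float = 8.0,
                            sleep: Callable[[float], None] = time.sleep,
                            clock: Callable[[], float] = time.monotonic) -> T:
    """Retry fn until it succeeds, the attempt cap, or the time cap."""
    start = clock()
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except Exception:
            elapsed = clock() - start
            # Give up when EITHER limit is reached, like stop_a | stop_b.
            if attempt >= max_attempts or elapsed >= max_total_seconds:
                raise
            # Exponential backoff: 1s, 2s, 4s, ... capped at max_wait_seconds.
            sleep(min(2.0 ** (attempt - 1), max_wait_seconds))
```

With these caps a stuck stock costs at most one retry, rather than the
60-90s per stock seen under the earlier backoff settings.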

## Files changed

- `compute/scoring/tier2.py` — `_EIGHT_K_DEFENSES_ENABLED = True`;
  comment block updated with the post-flip rationale + kill-switch
  pointer
- `compute/config.py` — `SCHEMA_VERSION` `0.6.0-phase3d` →
  `0.7.0-phase4g`. Promoting a defense flag from deferred to active
  veto is a **minor** semver bump per
  `.claude/skills/phase-4/schema-versioning/PLAN.md`.
- `tests/test_config.py` — locked-constant test renamed
  `test_schema_version_is_phase3d` → `test_schema_version_is_phase4g`;
  asserts new version
- `tests/test_smoke.py` — `SCHEMA_VERSION.startswith("0.6.0")` →
  `startswith("0.7.0")`
- `tests/test_scoring/test_tier2.py` — `eight_k_disabled` fixture
  added; E1-E5 + F2 updated to flip the flag explicitly when
  exercising kill-switch behavior (was relying on the default).
  Same threshold-symbolic-test pattern as SKILL Rule 17 — tests stay
  green if the constant moves.

## Scope NOT in this PR

- Pre-cache off-cycle workflow (issue #14 §1) — kept as an option
  for further perf headroom but not needed for correctness. If the
  first daily run with 8-K enabled comes in under the 90-min budget,
  the pre-cache layer becomes an optimization, not a requirement.
  File as follow-up if needed after monitoring the first 1-2 runs.
- Frontend updates — `tier2_events.non_reliance_filing` /
  `auditor_change` were already wired through to `Tier2EventCard`
  in PR 3d; values just flip from "always false" to "computed from
  EDGAR data" after this PR. No frontend code change required.

## Verification ladder
- ✅ ruff check — clean
- ✅ pytest -m "not network" — 772 passed (5 retitled / threshold-
  symbolic tests in tier2 + test_config + test_smoke)
- ✅ schema_check — Pydantic ↔ TypeScript snapshot still in sync
  (no field shape change; only the version string moved)
- ✅ tsc --noEmit — clean
- ✅ next build — 506 static pages

## Post-merge monitoring

After the first daily compute run with 8-K active:
- Section A schema field reads `0.7.0-phase4g`
- Section B `non_reliance_filing` count — expect 0-5 tickers in the
  S&P 500 universe per Schroeder 2024 base rate; > 10 deserves
  investigation
- Section B `auditor_change` count — expect 5-20 tickers per Reg S-K
  base rate over a 730-day window
- `tier2_coverage_pct` should remain ≥ 95% (was ~100% with 10-K only;
  partial 8-K fetch failures will drag it down some)
- Workflow runtime: expect +5-10 min on a cold cache, +1-2 min warm
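
The count and coverage thresholds above could be checked mechanically
after each run. A minimal sketch, assuming a hypothetical
`monitoring_alerts` helper and assuming `tier2_coverage_pct` is expressed
in percentage points (0-100); neither is confirmed by this PR.

```python
def monitoring_alerts(non_reliance_count: int,
                      auditor_change_count: int,
                      tier2_coverage_pct: float) -> list[str]:
    """Return a message for every post-merge expectation that is violated."""
    alerts: list[str] = []
    if non_reliance_count > 10:
        alerts.append("non_reliance_filing count above 10: investigate")
    if not 5 <= auditor_change_count <= 20:
        alerts.append("auditor_change count outside expected 5-20 range")
    if tier2_coverage_pct < 95.0:
        alerts.append("tier2_coverage_pct below 95%")
    return alerts
```

An empty list means the run matched every expectation listed above.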

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>