feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience#12
Merged
…nstants
Step 1 of PR 3d (Tier-2 event defenses). Lays groundwork for the 3 new
defenses landing in Steps 2-3:
- Defense #8 — going-concern phrase scan (annotate-only, Step 2)
- Defense #9 — 8-K Item 4.02 hard veto (4th active veto, Step 3)
- Defense #10 — 8-K Item 4.01 auditor change (annotate-only, Step 3)
Changes
-------
compute/config.py
- SCHEMA_VERSION: "0.5.0-phase3c" -> "0.6.0-phase3d"
- New constants in a Phase-3d-specific block:
* EIGHT_K_LOOKBACK_DAYS_VETO = 365 (Item 4.02 trailing-12-month
window — Schroeder 2024 SSRN shows ~50% subsequent restatement
rate within this window)
* EIGHT_K_LOOKBACK_DAYS_ANNOTATE = 730 (Item 4.01 2-year window per
Reg S-K Item 304 disclosure)
* GOING_CONCERN_FILING_LOOKBACK_DAYS = 400 (1y + buffer to capture
the most recent 10-K — calendar-year filers cluster ~75d after
fiscal year-end)
PHASE_STATUS.md
- Test count fix: "118 → 410" was a transposition error in PR 3c's
Step 11 docs (actual test count was 409 throughout). Off-by-one
cosmetic; flagged in pre-PR-3d verification, fixed here while
touching docs anyway.
tests/test_smoke.py
- test_phase0_scaffold_imports: bump expected SCHEMA_VERSION prefix
from "0.5.0" to "0.6.0" (this assertion gets updated each phase
bump; same pattern as PR 3c).
tests/test_config.py (NEW, ~35 LOC)
- 5 trivial smoke tests locking the values of the 3 new phase-3d
constants + the schema version. Catches accidental drift.
- test_schema_version_is_phase3d
- test_eight_k_lookback_veto_is_one_year
- test_eight_k_lookback_annotate_is_two_years
- test_going_concern_filing_lookback_is_one_year_plus_buffer
- test_eight_k_annotate_window_outlasts_veto_window (annotate window
must be >= veto window — surfaces a 4.01 disclosure even after a
4.02 veto would have lapsed)
Verification
------------
- ruff check . -> clean
- python -m compute.output.schema_check -> in-sync
(no Pydantic schema changes in this step; tier2_events field lands
in Step 4)
- pytest tests/ -m "not network" -> 414 passed (was 409 -> +5 new)
- npx tsc --noEmit (frontend) -> clean
What's NOT in this commit
-------------------------
- New scoring modules (Step 2 going_concern.py; Step 3
eight_k_events.py)
- Pydantic schema additions (Step 4 tier2_events field)
- Frontend components (Steps 6-8)
- Any new pip dependencies — beautifulsoup4 + lxml + edgartools +
requests already cover everything needed for Tier-2
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
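The drift-guard tests described for Step 1 might look roughly like the sketch below. The constant values are the ones stated in the commit message; inlining them (rather than importing from compute/config.py) and the exact assertion bodies are assumptions made so the example runs standalone.

```python
# Values as stated in the commit message, inlined so the sketch runs on
# its own. The real tests would presumably import them from compute.config.
SCHEMA_VERSION = "0.6.0-phase3d"
EIGHT_K_LOOKBACK_DAYS_VETO = 365
EIGHT_K_LOOKBACK_DAYS_ANNOTATE = 730
GOING_CONCERN_FILING_LOOKBACK_DAYS = 400


def test_schema_version_is_phase3d():
    assert SCHEMA_VERSION == "0.6.0-phase3d"


def test_going_concern_filing_lookback_is_one_year_plus_buffer():
    # "1y + buffer": strictly more than a calendar year.
    assert GOING_CONCERN_FILING_LOOKBACK_DAYS > 365


def test_eight_k_annotate_window_outlasts_veto_window():
    # A 4.01 annotation should still surface after a 4.02 veto lapses.
    assert EIGHT_K_LOOKBACK_DAYS_ANNOTATE >= EIGHT_K_LOOKBACK_DAYS_VETO
```

Pinning literal values this way is deliberately redundant: a change to any window then has to be made in two places, which is what surfaces accidental drift.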
…te-only)
Step 2 of PR 3d. Adds a pure scoring module that scans 10-K / 10-Q
filing text for going-concern indicator phrases drawn from the
Loughran-McDonald financial dictionary subset documented in
Mayew-Sethuraman-Venkatachalam 2015 *The Accounting Review*.
Annotate-only flag (Rule 16) — composite score is never modified.
Surfaces in StockDetail.tier2_events.going_concern_disclosure (the
field lands in Step 4) and the user-visible flag list on the detail
page (the UI lands in Step 6).
Changes
-------
compute/scoring/going_concern.py (NEW, ~120 LOC)
- GOING_CONCERN_PHRASES: tuple of 14 curated LM-dictionary phrases
(locked at module level; tuple = immutable, prevents accidental
runtime mutation).
- scan_going_concern(text) -> bool: pure function. Pre-compiled
per-phrase regex tuple at module load (avoids per-call recompile
on every ticker × every phrase).
- Each pattern:
* uses re.escape to neutralize metacharacters in the phrase
* replaces each escaped space with [\s\-]+ so multi-space, line
breaks, and hyphens between words all match
* anchors with \b at start and end so partial-word matches
(e.g., "ongoing concerns", "discontinued") do NOT trip the
flag. The spec doesn't list these likely false-positive vectors,
but the test suite asserts both.
- Loughran-McDonald CC BY 4.0 attribution in module docstring.
- Returns False for None / empty (caller distinguishes "no signal"
from "couldn't fetch" via tier2_coverage_pct in Metadata).
- No new dependencies — uses stdlib re only.
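The pattern construction described above can be sketched as follows. This is a minimal illustration, not the module itself: the two-phrase tuple is a hypothetical stand-in for the 14 curated phrases, `_compile_phrase` is an invented helper name, and case-insensitive matching is an assumption the commit message does not confirm.

```python
import re

# Hypothetical two-phrase subset; the real module locks 14 phrases.
GOING_CONCERN_PHRASES = (
    "substantial doubt",
    "going concern",
)


def _compile_phrase(phrase):
    # re.escape neutralizes metacharacters and escapes each space as "\ ".
    escaped = re.escape(phrase)
    # Let any run of whitespace or hyphens separate the words.
    flexible = escaped.replace(r"\ ", r"[\s\-]+")
    # \b anchors reject partial-word hits such as "ongoing concerns".
    # IGNORECASE is an assumption, not something the source states.
    return re.compile(r"\b" + flexible + r"\b", re.IGNORECASE)


# Compiled once at import time rather than on every call.
_PATTERNS = tuple(_compile_phrase(p) for p in GOING_CONCERN_PHRASES)


def scan_going_concern(text):
    """Return True if any phrase matches; False for None or empty text."""
    if not text:
        return False
    return any(p.search(text) for p in _PATTERNS)
```

With these patterns, "going-concern" and "going\nconcern" both match, while "ongoing concerns" does not, because neither end of the embedded substring sits on a word boundary.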
tests/test_scoring/test_going_concern.py (NEW, 25 cases)
- A1-A8: primary phrases detected (parametrized over the 8 most
common boilerplate variants).
- B1-B4: whitespace + punctuation flex (multi-space, newline,
multi-double-space, hyphen).
- C1-C4: negative cases (clean text, "concern" alone, "doubt"
without "substantial", empty string).
- D1-D3: edge cases (None, single char, multi-occurrence).
- E1-E2: phrase at start / end of text.
- F1-F2: module surface (tuple-ness, ≥12 entries).
- G1-G2: word-boundary safety — not in the original spec, but
guards against the obvious false-positive vectors:
* "ongoing concerns" must NOT match (it contains the
substring "going concern" but both word boundaries fail
under \b anchoring)
* "discontinued operations" must NOT match (substring
"continued" inside a longer word)
Verification
------------
- ruff check . -> clean (1 import-sort fix auto-applied)
- python -m compute.output.schema_check -> in-sync
(no Pydantic schema changes in this step; tier2_events lands in Step 4)
- pytest tests/ -m "not network" -> 439 passed (was 414 -> +25 new)
- npx tsc --noEmit (frontend) -> clean
What's NOT in this commit
-------------------------
- 8-K event parsing (Step 3 module: eight_k_events.py)
- Pydantic schema additions (Step 4: StockDetail.tier2_events)
- compute/main.py wire-up (Step 5)
- Frontend Tier2EventCard (Step 6)
- 10-K filing fetch + cache layer (Step 5 — uses edgartools)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Step 3 of PR 3d. Adds the 8-K event scoring module that backs:
- Defense #9 (Item 4.02 "Non-Reliance on Previously Issued Financial
Statements") — HARD VETO. Joins altman / sloan / NSI as the 4th
active veto at v1.0.
- Defense #10 (Item 4.01 "Changes in Registrant's Certifying
Accountant") — annotate-only. Reg S-K Item 304 mandates the same
disclosure for benign reasons, so false-positive rate is too high
for veto.
Both defenses surface in StockDetail.tier2_events (Pydantic field
lands in Step 4) and the user-visible flag list. Per SKILL.md Rule 16,
neither modifies the composite score.
Changes
-------
compute/config.py (+11 LOC)
- New constants:
* EDGAR_8K_CACHE_DIR = CACHE_DIR / "edgar_8k"
* EDGAR_8K_CACHE_TTL_SECONDS = 7 * 86400
* EDGAR_8K_ITEM_TEXT_EXCERPT_CHARS = 500
compute/scoring/eight_k_events.py (NEW, ~310 LOC)
- ItemFlag frozen dataclass — return shape for both check_* funcs.
Fields: fired (bool), filing_date (str|None), filing_url
(str|None), raw_item_text (str|None, ≤ EXCERPT_CHARS).
- fetch_recent_8k_filings(ticker, lookback_days) -> list[dict] | None.
Wraps edgartools' Company.get_filings(form="8-K", filing_date=...);
parses each filing via filing.obj() (returns EightK with .items
attribute returning list[str] like ["Item 5.02", "Item 9.01"]);
extracts item-text excerpts from EightK.sections (best-effort —
shape varies across edgartools versions, gracefully degrades to
empty excerpts).
- Returns None on EDGAR rate-limit / network failure / missing
identity / ticker-not-found. Returns [] on successful fetch with
zero 8-Ks in window.
- check_non_reliance(ticker) — Item 4.02, 365-day lookback.
- check_auditor_change(ticker) — Item 4.01, 730-day lookback.
- Both accept optional `filings=` kwarg for unit-test injection.
- Most-recent match wins when multiple 4.02 / 4.01 fire in window.
- Item-number regex is word-boundary-anchored on both sides, with the
dot escaped ("\bItem\s+4\.\s*02\b"), so "Item 4.020" does NOT match
as "Item 4.02".
- _ensure_edgar_identity is lazy (logged warning, not RuntimeError)
on missing EDGAR_USER_AGENT — Tier-2 features are non-fatal, unlike
fundamentals.
Cache layer (inlined in eight_k_events.py, ~80 LOC)
- JSON-on-disk at compute/cache/edgar_8k/<ticker>.json (gitignored by
existing compute/cache/ rule).
- 7-day TTL — safe because 4.02/4.01 events are sticky once filed
(they don't disappear).
- Cache hit requires cached_lookback >= requested_lookback (so a 365d
entry can't serve a 730d request).
- Atomic write via tmp + os.replace.
- Corrupt JSON / unparseable timestamps treated as miss (logged warn).
- Filename ticker-sanitized via [^A-Za-z0-9_-] regex (BRK-B works,
path-traversal attempts neutralized).
- invalidate_cache(ticker) — public helper, idempotent.
tests/test_scoring/test_eight_k_events.py (NEW, 28 cases — 25
unit/cache + 3 @network)
- A1-A14: synthetic Filing fixture tests (item parsing, lookback
windows, multiple matches, case variants, excerpt truncation,
frozen dataclass, item-number boundary precision).
- B1-B6: cache layer (miss → fetch, hit → no fetch, expired →
refetch, invalidate, corrupt JSON, lookback-undersize miss).
- 2 ticker-path safety tests (BRK-B preservation, path traversal).
- C1-C3: @network smoke against real SEC EDGAR (skipped without
EDGAR_USER_AGENT). Asserts 5 known-clean tickers (AAPL/MSFT/GOOGL/
JPM/KO) have ≤1 fired flag, AAPL has neither 4.02 nor 4.01, cache
effectiveness via timing.
Verification
------------
- ruff check . -> clean (1 pytest.raises(Exception) lint fix —
switched to FrozenInstanceError specifically)
- python -m compute.output.schema_check -> in-sync
(Step 4 adds the Pydantic tier2_events field)
- pytest tests/ -m "not network" -> 464 passed (was 439 -> +25 new
unit/cache; +3 @network properly skipped)
- npx tsc --noEmit -> clean
Edgartools API notes (for Step 5 wire-up)
------------------------------------------
- Company.get_filings(form="8-K", filing_date=(start, end)) returns
an EntityFilings iterable.
- Each Filing has .obj() that returns an EightK (for 8-K forms).
- EightK.items returns List[str] like ["Item 5.02", "Item 9.01"] via
a 3-tier fallback parser (modern sections → chunked_document →
text-pattern extraction). Handles SGML legacy filings (1999-2001).
- EightK.sections is the source for item-body excerpts but its shape
varies (sometimes dict, sometimes list); the module guards with
`if isinstance(sections, dict)` and degrades to empty excerpts if
the shape doesn't match expectations.
What's NOT in this commit
-------------------------
- Pydantic schema additions (Step 4: StockDetail.tier2_events field +
Metadata.tier2_coverage_pct)
- Risk-overlay integration (Step 4: non_reliance_filing flag joins
the risk_flags list)
- compute/main.py wire-up (Step 5)
- Frontend Tier2EventCard (Step 6)
- New pip dependencies — uses existing edgartools + stdlib
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
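The Item 4.02 matching rules described in this commit (boundary-anchored item regex, lookback window, most-recent match wins, truncated excerpt) can be sketched as below. This is a simplified illustration under stated assumptions: the function name `check_non_reliance_from_filings`, the dict keys on each filing, the explicit `today` parameter, and the IGNORECASE flag are all hypothetical; only the regex text, the ItemFlag field names, and the 365-day / 500-char values come from the commit message.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

EXCERPT_CHARS = 500  # value of EDGAR_8K_ITEM_TEXT_EXCERPT_CHARS

# Regex text as quoted in the commit message; IGNORECASE is an assumption.
ITEM_402 = re.compile(r"\bItem\s+4\.\s*02\b", re.IGNORECASE)


@dataclass(frozen=True)
class ItemFlag:
    fired: bool
    filing_date: Optional[str] = None
    filing_url: Optional[str] = None
    raw_item_text: Optional[str] = None


def check_non_reliance_from_filings(filings, today, lookback_days=365):
    """Return the most recent Item 4.02 hit inside the lookback window.

    Each filing is a dict with hypothetical keys: "filing_date" (ISO
    YYYY-MM-DD), "url", "items" (list of str) and "item_text" (str).
    """
    cutoff = today - timedelta(days=lookback_days)
    hits = [
        f for f in filings
        if date.fromisoformat(f["filing_date"]) >= cutoff
        and any(ITEM_402.search(item) for item in f["items"])
    ]
    if not hits:
        return ItemFlag(fired=False)
    # ISO dates sort lexically, so max() picks the latest filing.
    latest = max(hits, key=lambda f: f["filing_date"])
    return ItemFlag(
        fired=True,
        filing_date=latest["filing_date"],
        filing_url=latest["url"],
        raw_item_text=(latest.get("item_text") or "")[:EXCERPT_CHARS],
    )
```

Passing the filing list in directly, as the `filings=` kwarg in the commit does, keeps the matching logic testable without touching EDGAR.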
Step 4 of PR 3d. Wires Defenses #8 / #9 / #10 into the JSON contract
and makes #9 (8-K Item 4.02 non-reliance) the **4th active hard veto**
at v1.0.
Changes
-------
compute/output/schemas.py
- StockDetail.tier2_events: dict | None = None
Display payload populated by Step 5; shape (when set):
{"going_concern_disclosure": bool,
"non_reliance_filing": bool,
"auditor_change": bool,
"latest_8k_filing_date": str | None, # ISO YYYY-MM-DD
"latest_8k_filing_url": str | None}
- Metadata.tier2_coverage_pct: float | None = None
Population-level fetch-success rate. None when Tier-2 disabled
(e.g., env var missing).
frontend/lib/types.ts
- New `Tier2Events` type mirroring the Python dict shape
- `StockDetail.tier2_events: Tier2Events | null`
- `Metadata.tier2_coverage_pct: number | null`
- Inline doc on Tier2Events explaining which fields are veto vs
annotate.
compute/valuation/applicability.py
- SKIP_REASONS: 21 → 24 entries. New stable identifiers:
going_concern_disclosure, non_reliance_filing, auditor_change.
These are tracked here so the JSON-contract reason taxonomy is
complete; the same strings also appear in StockDetail.tier2_events
(display) and risk_flags (only non_reliance_filing — hard veto).
compute/scoring/risk_overlay.py
- Module docstring updated: "three vetoes" → "four vetoes" with
non_reliance_filing entry citing eight_k_events.check_non_reliance.
- compute_risk_flags acquires a new optional kwarg
`non_reliance_by_ticker: dict[str, bool] | None = None`:
* Default (None): per-ticker fallback to check_non_reliance(ticker),
which hits the 7-day on-disk EDGAR cache or returns
ItemFlag(fired=False) when identity is unset (= test environment,
sandbox).
* Explicit dict: tests + Step 5 inject pre-computed results. Step 5
will share fetch work between this veto path and the
StockDetail.tier2_events display path so the EDGAR fetch happens
once per ticker per compute run, not twice.
This is a slight extension of the spec's pure-inline
`check_non_reliance(ticker)` call, but it keeps the function
unit-testable without network mocking and avoids a duplicate fetch
in production. The default behavior matches the spec exactly when
the kwarg is omitted.
frontend/lib/schema-snapshot.json
- Regenerated via
`python -m compute.output.schema_check --update-snapshot`.
Diff: +tier2_coverage_pct under Metadata, +tier2_events under
StockDetail. No collateral drift.
tests/test_output/test_tier2_schema.py (NEW, 13 cases)
- A1-A5: Pydantic field validation (StockDetail.tier2_events accepts
dict / None; Metadata.tier2_coverage_pct accepts float / None; JSON
round-trip preserves the dict shape).
- B1-B5: SKIP_REASONS taxonomy (3 new entries present, count = 24,
all entries unique).
- D1-D3: schema-snapshot file (committed snapshot includes both new
fields with correct type/required/default shape).
tests/test_scoring/test_risk_overlay.py (+6 cases)
- C1-C6: Defense #9 non_reliance integration:
* inject {ticker: True} → flag appears
* inject {ticker: False} → no flag
* empty inject dict → no flag
* default path with no EDGAR_USER_AGENT → no flag (existing PR-3c
tests rely on this contract; tests use monkeypatch to ensure a
clean cache + identity state)
* additive with altman/sloan — all 4 vetoes can fire together
* inject dict for ticker A doesn't pollute ticker B
Verification
------------
- ruff check . -> clean (1 import-sort fix auto-applied)
- python -m compute.output.schema_check -> in-sync after regen
- pytest tests/ -m "not network" -> 483 passed (was 464 -> +19 new)
- npx tsc --noEmit -> clean
What's NOT in this commit
-------------------------
- compute/main.py wire-up (Step 5 — pre-fetches Tier-2 data in
parallel with fundamentals, populates tier2_events display dict +
injects non_reliance_by_ticker into compute_risk_flags)
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
components (Steps 6-8)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
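The optional-injection pattern described for compute_risk_flags could be sketched along these lines. The function below is a stripped-down stand-in, not the real signature: `compute_risk_flags_sketch`, the `fallback` parameter, and `_default_check` are hypothetical names introduced only to show the None-versus-dict branching.

```python
from typing import Callable, Dict, List, Optional


def _default_check(ticker: str) -> bool:
    # Stand-in for check_non_reliance(ticker).fired. The real default
    # consults the on-disk EDGAR cache; this one never fires.
    return False


def compute_risk_flags_sketch(
    tickers: List[str],
    non_reliance_by_ticker: Optional[Dict[str, bool]] = None,
    fallback: Callable[[str], bool] = _default_check,
) -> Dict[str, List[str]]:
    flags: Dict[str, List[str]] = {}
    for ticker in tickers:
        if non_reliance_by_ticker is None:
            # Kwarg omitted: per-ticker fallback lookup.
            fired = fallback(ticker)
        else:
            # Explicit dict: missing tickers default to "not fired".
            fired = non_reliance_by_ticker.get(ticker, False)
        flags[ticker] = ["non_reliance_filing"] if fired else []
    return flags
```

Note that an empty dict is an explicit injection, not a request for the fallback, which is why the check is `is None` rather than a truthiness test.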
Step 5 of PR 3d. Wires Defenses #8 / #9 / #10 into the production
weekly-compute pipeline. After this commit, rankings.json carries the
4th active veto (non_reliance_filing) and StockDetail.tier2_events on
every stock; metadata.json reports population-level Tier-2 coverage.
Architecture
------------
- New `compute/ingest/filing_text.py` (~210 LOC): 10-K text fetch
with 90-day on-disk cache. Mirrors the eight_k_events.py cache
pattern (atomic write via tmp + os.replace; fetched_at TTL gate;
safe_ticker filename sanitization). Returns None on every failure
mode (rate limit, missing identity, no recent 10-K) — never raises.
- New `compute/scoring/tier2.py` (~180 LOC): Tier2Result frozen
dataclass + fetch_tier2_for_ticker orchestrator + tier2_events_dict
+ coverage_pct helpers. The orchestrator catches every per-defense
exception so one bad ticker can't crash the run.
- Reuses `fetch_recent_8k_filings` ONCE per ticker (with the larger
730d lookback that covers both 4.02 and 4.01 windows) — both
`check_non_reliance` and `check_auditor_change` operate on the same
in-memory filing list. Avoids a duplicate EDGAR call per ticker.
compute/config.py
- New EDGAR_10K_TEXT_CACHE_DIR + EDGAR_10K_TEXT_CACHE_TTL_SECONDS
(= 90 days). 10-K filings are annual so an 89-day stale cache hit
returns the same filing.
compute/main.py
- New "Step 4b" between fundamentals + risk-flag computation:
parallel Tier-2 fetch via
ThreadPoolExecutor(max_workers=EDGAR_MAX_WORKERS=5). Same
parallelism budget as fundamentals — well under SEC's 10/sec rate
limit.
- non_reliance_by_ticker dict built from tier2_results, injected into
compute_risk_flags. Avoids the duplicate fetch the inline default
path would have triggered. Only fired tickers go in (per Step 4
spec: dict.get(ticker, False) default).
- Per-ticker StockDetail loop populates tier2_events from
tier2_events_dict(tier2_results.get(ticker)). Tickers absent from
the dict get tier2_events=None — graceful "no Tier-2 data" surface.
- Metadata.tier2_coverage_pct populated from
coverage_pct(tier2_results). None when universe is empty; 0.0 when
all fetches failed; rounded to 2 decimal places otherwise.
- Added `Tier2Result` to imports for type clarity (linter wanted it
in a separate `from .. import` line because of the `as` alias on
coverage_pct — accepted).
Failure isolation
-----------------
Three layers of safety:
1. Each underlying fetcher (fetch_latest_10k_text,
fetch_recent_8k_filings) returns None on any failure — never
raises.
2. fetch_tier2_for_ticker wraps each per-defense call in try/except;
one defense's failure doesn't abort the orchestrator.
3. The compute/main.py executor loop also catches exceptions from
fut.result() — defensive, since the orchestrator already swallows
everything.
A failed-fetch ticker simply won't appear in tier2_results; the
per-ticker loop's tier2_results.get(ticker) returns None, which builds
a StockDetail with tier2_events=None.
tests/test_scoring/test_tier2.py (NEW, 17 cases)
- A1-A6: orchestration permutations (clean, partial 10-K fail,
partial 8-K fail, total fail, exception caught, both 8-K items
present).
- B1-B4: tier2_events_dict shape + non_reliance > auditor_change
preference for latest_8k_filing date/url + 5-key contract check.
- C1-C5: coverage_pct including 100% / 0% / 49.80% / empty / single.
- D1: end-to-end synthetic 10-ticker pipeline covering all 3
defenses.
- D2: Tier2Result frozen dataclass.
Verification
------------
- ruff check . -> clean
- python -m compute.output.schema_check -> in-sync
- pytest tests/ -m "not network" -> 500 passed (was 483 -> +17 new)
- npx tsc --noEmit -> clean
- main.py wire-up smoke-imports cleanly; sanity grep confirms
tier2_results / tier2_coverage_pct / tier2_events / non_reliance
inject all wired through.
Performance budget
------------------
Cold cache estimate (first run):
- 502 tickers × 2 EDGAR fetches each (10-K + 8-K) = 1,004 fetches;
at ~5 parallel workers that is ~200 fetches per worker, so ~200s
(~3.3 min) if each fetch takes ~1s. Well under the +10-15 min
budget.
Subsequent weekly runs: mostly cache hits → +30-60s.
NO new asyncio / concurrency primitives — ThreadPoolExecutor matches
the existing fundamentals-fetch pattern.
What's NOT in this commit
-------------------------
- Frontend Tier2EventCard / PillarRadarChart / FairPriceBarChart
(Steps 6-8)
- Documentation updates (Step 9)
- Production verification via workflow_dispatch (Step 10)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
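The parallel fetch with layered failure isolation, plus the coverage figure, might be sketched as below. Several details are assumptions: `fetch_all` and the `fetch_one` callable are hypothetical names, `coverage_pct_sketch` takes the universe size explicitly (the real helper's inputs are not shown in the commit), and the 0-100 percent scale is inferred from the "49.80%" test case rather than stated.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

EDGAR_MAX_WORKERS = 5  # value stated in the commit message


def fetch_all(tickers, fetch_one, max_workers=EDGAR_MAX_WORKERS):
    """Run fetch_one(ticker) in a thread pool; drop failures silently."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, t): t for t in tickers}
        for fut in as_completed(futures):
            ticker = futures[fut]
            try:
                result = fut.result()
            except Exception:
                # Outer safety net: one bad ticker must not crash the run.
                continue
            if result is not None:
                results[ticker] = result
    return results


def coverage_pct_sketch(results, universe_size):
    """Percent of the universe with a successful fetch, 2 decimals."""
    if universe_size == 0:
        return None
    return round(100.0 * len(results) / universe_size, 2)
```

Tickers whose fetch raised or returned None are simply absent from the result dict, so a later `results.get(ticker)` yields None for them.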
…e wire
Step 6 of PR 3d. Adds the user-visible surface for Defenses #8 / #9 /
#10 — a card that lists fired regulatory events with severity coding
(HARD VETO red pill for non-reliance; Annotate amber pill for
going-concern + auditor-change). Renders nothing when there are no
events or when the StockDetail predates the schema (graceful
forward-compat for stocks/*.json files written under PR-3c schema).
Position: between PriceHistoryChart and FairPriceCard — regulatory
events affect investability more than valuation, so they sit higher in
the visual hierarchy.
Changes
-------
frontend/components/Tier2EventCard.tsx (NEW, ~165 LOC)
- "use client" component with strict TypeScript types (no `any`).
- Props: tier2_events: Tier2Events | null, ticker: string.
- Renders null when:
* tier2_events is null OR undefined (loose-equality check —
`undefined` is the runtime shape for stock JSONs written under
pre-PR-3d schemas, before Step 10's compute regenerates them with
the field populated)
* All 3 flags are false (clean ticker)
- Otherwise renders rows in priority order: non_reliance_filing first
(hard veto), then going_concern_disclosure, then auditor_change.
Date footer (latest 8-K) shown only when an 8-K flag fired AND a
date is present.
- "View filing" link with target="_blank" + rel="noopener noreferrer"
for the 8-K rows; going-concern has no link (text scan, not 8-K).
- Inline SVG icons (lucide-react is NOT in package.json — spec's hard
constraint says "NO new npm dependencies"). Three 24px stroke icons
styled to match lucide's visual language: AlertOctagon (veto),
AlertTriangle (going-concern), UserMinus (auditor-change), plus a
small ExternalLink for the filing-link affordance.
- Light-theme palette matching existing components (rose/amber/slate
ring-1 ring-inset badges) — the spec's bg-card/text-foreground
tokens reference shadcn dark-theme but the project uses
bg-white/text-slate-700.
- Accessibility: aria-label on section, role="status" on severity
pills, aria-hidden on decorative icons.
- Mobile-first: stacked rows on <sm, side-by-side on sm+.
frontend/app/stock/[ticker]/page.tsx
- Import Tier2EventCard.
- Wired between the Price (1y) section and the FairPriceCard block,
per spec ordering: chart → events → fair price → fundamentals.
Edge case fixed during build verification
-----------------------------------------
Initial implementation guarded with `tier2_events === null`. The
production stock JSONs committed under PR 3c lack the `tier2_events`
key entirely (the schema is forward-compatible: the field is optional
in Pydantic, so existing files just don't have it). Reading an absent
key from a JSON.parse result yields `undefined`, not `null` — so
`=== null` missed the case and the destructure crashed during
`next build` for all 502 stocks. Fixed to `== null` (loose equality
catches both null + undefined). Comment in the component explains the
forward-compat reasoning.
Tests (frontend)
----------------
The frontend has no test framework configured (no jest / vitest /
@testing-library in package.json). Per spec ("If neither has component
tests, skip in favor of visual regression"), no component tests added.
`tsc --noEmit` + `next build` are the type/build correctness
guarantees:
- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes pre-rendered cleanly
What's NOT in this commit
-------------------------
- Visual snapshot regression tests (no harness; would require adding
playwright or storybook — out of scope)
- PillarRadarChart (Step 7)
- FairPriceBarChart (Step 8)
Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506 / 506 routes ✓
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no Python touched;
sanity-check that nothing regressed)
Visual spot-checks deferred to Vercel preview
---------------------------------------------
I cannot render the component locally; spot-checks happen on the
Vercel preview deploy after this commit lands. Spec scenarios:
1. Stock with no Tier-2 events (most production stocks at commit
9cd2c74) → card hidden ✓ (forward-compat null-check)
2. Stock with auditor_change only → amber Annotate row + link
3. Stock with non-reliance fired → red HARD VETO row + link
4. All 3 fired → 3-row card + 8-K date footer
Production stock JSONs at HEAD won't have tier2_events populated
(Step 10 workflow_dispatch is what triggers regeneration). So the
preview will show "no Tier-2 events" everywhere; full visual
verification of fired states happens at Step 10.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…age wire
Step 7 of PR 3d. Adds the 8-pillar polar radar visualization on the
stock detail page so users see composite-score breakdown at a glance.
Recharts (already in deps from PR 3c) provides the polar primitives;
no new npm dependencies.
Position in detail page: after FairPriceCard, before the existing
pillar-scores table. Final order is now PriceHistoryChart →
Tier2EventCard → FairPriceCard → PillarRadarChart → RawMetricsTable.
Changes
-------
frontend/components/PillarRadarChart.tsx (NEW, ~125 LOC)
- "use client" component, strict TypeScript types.
- Renders only the 8 active pillars (quality / value / growth /
momentum / health / profitability / technical / risk). The two
always-null Phase-5+ pillars (sentiment / ml) are explicitly named
in the footer instead of being plotted as zero-axes.
- Null active-pillar handling: skipped from the dataset entirely. The
footer separates "data quality issue this run" pillars from the
permanent "Phase 5+" placeholder pair, so a degraded ticker isn't
silently shrunk to fewer axes without explanation.
- Hard floor: <5 non-null active pillars → returns null. A 4-axis
radar is degenerate (two chord pairs) and less useful than the raw
pillar-scores table downstream.
- ACTIVE_PILLARS tuple typed with `as const satisfies
ReadonlyArray<readonly [keyof PillarScores, string]>` so the pillar
keys are checked at compile time against PillarScores.
- ResponsiveContainer (Recharts SSR convention) inside an h-72
fixed-height wrapper — prevents layout shift on chart mount.
- Light-theme palette: slate-200 grid, slate-600 angle-axis labels,
indigo-500 radar fill at 0.4 opacity. Matches the Step 6
Tier2EventCard + the existing Step 10 (PR 3c) component palette.
- isAnimationActive=false for the Radar — pre-rendered routes
shouldn't hint at client-side animation jank.
frontend/app/stock/[ticker]/page.tsx
- Import PillarRadarChart.
- Wired between FairPriceCard and RawMetricsTable.
- Spec ordering preserved: events → fair price → radar →
fundamentals.
Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506/506 routes pre-rendered ✓
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no regression)
What's NOT in this commit
-------------------------
- Frontend component tests (no harness configured; build-time type +
render verification covers the contract surface)
- FairPriceBarChart (Step 8)
- Docs updates (Step 9)
- Production verification via workflow_dispatch (Step 10)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…page wire
Step 8 of PR 3d (last frontend component). Adds the horizontal-bar
visual summary of the 6 fair-price ensemble methods on the stock
detail page. Renders just above the existing FairPriceCard tabular
breakdown, so users see the visual landscape first then can dig into
the per-method details.
Position in detail page (final order, top to bottom):
1. PriceHistoryChart
2. Tier2EventCard (renders null when no events)
3. FairPriceBarChart (NEW) — visual ensemble summary
4. FairPriceCard — tabular detail
5. PillarRadarChart
6. RawMetricsTable
Changes
-------
frontend/components/FairPriceBarChart.tsx (NEW, ~210 LOC)
- "use client" + strict TS. Recharts only (already in deps).
- Renders one horizontal bar per APPLICABLE method (skipped methods
are simply absent — different from the FairPriceCard table which
shows them with reasons).
- Outlier detection: each method whose extreme_<method>_estimate
warning appears in fair_price.valuation_warnings is grayed
(slate-400) instead of indigo-500. The 6-method warning convention
was set in PR 3c Step 7.5.
- ReferenceLines for context:
* Current price — rose-500 dashed, with "Current $X.XX" label
* Median — indigo-700 solid (the headline ensemble value)
* Max (excl. outliers) — indigo-300 solid (upside scenario)
- Domain capping: 1.2× the largest of the non-outlier values,
current_price, median, and max. Outlier values are intentionally excluded from
the cap so an extreme value extends off-chart on the right —
visual signal that the value is extreme. Tooltip surfaces the
unclamped raw_value.
- Negative method values clamped to 0 in the bar geometry (bars
can't go negative visually); raw_value preserved in tooltip.
- Custom Tooltip component with method label + formatted price +
outlier annotation when applicable.
- Inline legend below the chart explains the 5 visual elements
(Applicable, Outlier when present, Median, Max, Current).
- Footer note about outlier 5×/0.2× cutoff shown only when at least
one outlier is in the chart.
- Returns null when:
* fair_price == null/undefined (forward-compat for older schema)
* current_price == null/NaN (no reference line possible)
* Zero applicable methods (e.g., BKR with all 6 nulled by the
Step 7.5 data-quality sanity guard)
- Recharts naming gotcha documented: BarChart layout="vertical"
produces HORIZONTAL bars. Inline comment so future readers don't
flip it.
- Method order matches FairPriceCard table order (Graham → P/E →
P/B → EV/EBITDA → RIM → DCF) for visual continuity between the
two components.
- Accessibility: aria-label on the section, role="img" with descriptive
aria-label on the chart container, aria-hidden on decorative
legend swatches.
frontend/app/stock/[ticker]/page.tsx
- Import FairPriceBarChart, wire between Tier2EventCard and
FairPriceCard.
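The domain-capping and clamping arithmetic described for FairPriceBarChart can be illustrated with a small calculation. The component itself is TypeScript; the sketch below is in Python purely to show the numbers, and the function names, argument shapes, and sample values are all hypothetical.

```python
def chart_domain_max(values, outliers, current_price, median,
                     max_excl_outliers, headroom=1.2):
    """Upper bound of the price axis: 1.2x the largest non-outlier mark.

    `values` maps method name -> fair-price estimate; `outliers` is the
    set of method names flagged as extreme. Outliers are left out of the
    cap on purpose, so their bars run off the right edge of the chart.
    """
    inliers = [v for name, v in values.items() if name not in outliers]
    candidates = inliers + [current_price, median, max_excl_outliers]
    return headroom * max(candidates)


def bar_length(raw_value):
    # Bars cannot extend left of zero; the raw value stays available
    # separately for the tooltip.
    return max(0.0, raw_value)


# Made-up sample: DCF is flagged as an outlier and ignored by the cap.
sample = {"graham": 80.0, "pe": 120.0, "dcf": 900.0}
domain = chart_domain_max(sample, {"dcf"}, current_price=100.0,
                          median=100.0, max_excl_outliers=120.0)
```

In this sample the cap is driven by the 120.0 estimate, so the 900.0 outlier bar would extend past the axis limit, which is the intended visual cue.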
Verification
------------
- npx tsc --noEmit -> clean
- npm run build -> 506/506 routes pre-rendered ✓ (route bundle
shared chunks unchanged at 87.5 kB)
- ruff check . -> clean (no Python touched)
- pytest tests/ -m "not network" -> 500 passed (no regression)
Decisions worth flagging
------------------------
1. Outlier graying via Recharts <Cell> children (the standard
per-bar override pattern), not a fill function. Cell-level fill
is the documented Recharts API.
2. Aggregate markers (median/max) implemented as ReferenceLines,
NOT separate Bars or ReferenceDots. Cleaner for a layout="vertical"
BarChart and avoids confusing them with method bars.
3. Tooltip is a custom React component because Recharts' default
tooltip can't render the "outlier — excluded from MAX" affordance.
4. Method labels match the FairPriceCard's METHOD_LABELS object
verbatim ("Graham (defensive)", "P/E multiples", etc.) — single
source of truth in spirit, though duplicated in this component
for now. If the labels diverge, both UIs need a sync.
What's NOT in this commit
-------------------------
- Documentation updates (Step 9)
- Production verification via workflow_dispatch (Step 10)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Documentation-only commit. No code changes; tests / lint / schema
unchanged at 500 passing.
PHASE_STATUS.md (+65 lines)
- 3d row: ⚪ NEXT → ✅ DONE 2026-05-10. Lists all 3 Tier-2
defenses with their modules + sources + commit refs (Steps 2/3
commits). Notes the orchestrator + 10-K text fetcher as
supporting infrastructure. Frontend additions (Tier2EventCard,
PillarRadarChart, FairPriceBarChart) listed.
- Defense scorecard table flipped from "Now (post-3c)" to
"Now (post-3d)": 4 vetoes (was 3) + 5 guards (unchanged) +
7 annotate flags (was 5+, with `going_concern_disclosure` and
`auditor_change` now bolded as new in 3d).
- v1.0 ETA updated: ~1 day remaining (just PR 3e).
- New "Phase 3d verified production stats — DRAFT" subsection
with placeholder values for Step 10 fill-in.
- New "Phase 3d acceptance checklist" subsection mirroring the
PR-3c block. 8 items already checked (Steps 1-8 + Vercel
spot-check); 7 still pending (Step 10 production verification).
SKILL.md (+1 net line — schema-versions table row added)
- New `0.6.0-phase3d` row before the existing `1.0.0` (Phase 3e)
row. Single dense paragraph documents the 3 defenses + their
sources (Mayew 2015 TAR, Schroeder 2024 SSRN, Reg S-K Item 304),
the 4th veto, the orchestrator, the 10-K cache, the schema
additions, the new UI components, and the 21→24 reason
taxonomy expansion.
- Old `1.0.0` row trimmed: removed the now-shipped Tier-2 scope
description; left only the Tier-3 + Honest Limitations
description (unchanged behavior).
docs/METHODOLOGY.md (+64 lines)
- Schema reference: `0.5.0-phase3c` (2026-05-09) →
`0.6.0-phase3d` (2026-05-10).
- Defense-count summary: "7 active defenses — 3 vetoes + 5
guards + 5+ annotate" → "10 active defenses — 4 vetoes + 5
guards + 7 annotate".
- Active-vetoes table: 3 → 4 rows. New row: `non_reliance_filing`
(8-K Item 4.02 within 365d, Schroeder 2024 SSRN).
- Annotate-only flags list: 5+ → 7 entries. New entries:
`going_concern_disclosure` (with FP-rate caveat per Mayew et
al. methodology) and `auditor_change` (with Reg S-K Item 304
+ FP-rate-too-high-for-veto reasoning).
- New "Tier-2 events" subsection between the Defense layer
and Sanity tests sections. Contents:
* Source/defense/mode/lookback table (10-K + two 8-K paths)
* Cache strategy (90d 10-K, 7d 8-K) with rationale
* Failure semantics (per-fetch None, never raises)
* Implementation modules (4 files: going_concern,
eight_k_events, tier2 orchestrator, filing_text)
* Why-veto-vs-annotate split for 4.02 (high precision,
Schroeder 50%) vs 4.01 (audit-firm rotation drives FP rate
too high)
docs/RESEARCH_FINDINGS.md (+38 lines)
- PR 3d Tier-2 section (#7-9): "next" → "✅ SHIPPED 2026-05-10".
Each defense annotated with implementation module + commit ref
(Step 2 commit fee4498 / Step 3 commit cedadca / Step 4
commit b90930e / Step 5 commit 9cd2c74).
- LOC estimate updated: ~270 → ~520 (closer to actual delivered
scope including the orchestrator + cache + frontend).
- Added a paragraph listing the supporting infrastructure
(orchestrator, 10-K cache, schema additions, frontend) and
the 423 → 500 test count delta with breakdown across new test
files.
- PR 3e + Phase 4+ sections unchanged (forward-looking).
Cross-doc consistency check
---------------------------
Search confirmed all `current state` references flipped:
- "21 stable identifiers" → "24 stable identifiers" everywhere
the doc describes the live taxonomy. Historical references
(e.g., the PR-3c row in SKILL's schema table, the PR-3c
verified-stats block in PHASE_STATUS) correctly retain the
21 count — that's accurate history.
- "3 vetoes" / "5+ annotate" similarly flipped where they
describe the post-3d state.
- "0.5.0-phase3c" remains in PR-3c historical references (its
schema-table row, its DONE block) — correct.
What's NOT in this commit
-------------------------
- WORKFLOW.md — Defense Roadmap is the single source of truth
for unshipped schemas; PR 3d items there will close in
Step 10's final docs pass once `tier2_coverage_pct` lands in
production.
- README.md — Honest Limitations + v1.0 marketing copy lands
in PR 3e.
- stock_ranking_knowledge.md — formula reference, not phase-
specific.
Verification
------------
- ruff check . → clean (no Python touched)
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 500 passed (unchanged)
- Doc line counts: 2552 → 2720 (+168 lines)
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…7→91)
Cleanup follow-up to commit 923dca5. Two interlocking errors in the
PR 3d test-count claim, caught during Step 9 verify-2:
1. The "PR 3c baseline" was wrong. Actual count at PR-3c-end was 409
(verified via the per-step deltas in commit messages eea8644 →
c346ed5: 409 → 414 → 439 → 464 → 483 → 500). The docs claimed 423,
which was off by 14.
2. The "+5 misc" line in the breakdown was the 5 tests in
`tests/test_config.py` from Step 1 (schema-version smoke). Not
actually misc — it's the foundation-step constants test.
Correction
----------
PR 3c baseline: 409 (was 423)
Current non-network: 500 (unchanged)
Net delta: +91 non-network tests + 3 @network
= **+94 total added in PR 3d**
Per-file breakdown (verified via pytest --collect-only):
- +25 going-concern (Step 2)
- +25 8-K-events non-network + 3 @network (Step 3)
- +17 tier2 orchestrator (Step 5)
- +13 tier2 schema (Step 4)
- +6 risk-overlay non-reliance C-tests (Step 4)
- +5 config smoke (Step 1)
Sum: 91 non-network ✓ matches 500 - 409 ✓
Files changed
-------------
PHASE_STATUS.md
- Line 84: "77 new tests (500 - 423 baseline)" → "91 new non-network
tests (409 → 500) + 3 @network tests added"
docs/RESEARCH_FINDINGS.md
- PR 3d shipped block tests bullet rewritten with the corrected
baseline + the 6-line per-file breakdown matching the actual
per-step commits.
What was correct already
------------------------
SKILL.md and METHODOLOGY.md don't carry test-count references for
PR 3d (only the qualitative scope description), so neither needed
the cleanup.
Verification
------------
- ruff check . → clean (docs only)
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 500 passed (unchanged)
- grep confirms no lingering "423" / "+77" PR-3d references in any
doc.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
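The corrected arithmetic in the commit above can be re-checked with a few lines of Python. This is only a sanity-check sketch: the dictionary keys are illustrative labels rather than names from the repository, and the counts are copied from the per-file breakdown.

```python
# Per-file non-network test counts, copied from the breakdown above.
# The keys are illustrative labels, not identifiers from the repo.
added_non_network = {
    "going_concern": 25,
    "eight_k_events": 25,
    "tier2_orchestrator": 17,
    "tier2_schema": 13,
    "risk_overlay_non_reliance": 6,
    "config_smoke": 5,
}
added_network = 3            # @network tests from Step 3
baseline, current = 409, 500  # PR 3c end vs. PR 3d non-network count

non_network_total = sum(added_non_network.values())
total_added = non_network_total + added_network

# The per-file sum must equal the baseline-to-current delta.
assert non_network_total == current - baseline
```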
edgartools' filing.text() routes through hybrid_section_detector which
CPU-bounds at ~5-10s per 10-K (~19,000 section detections × 502 stocks
blew past the 45-min workflow_dispatch timeout on the first cold-cache
run).
Going-concern phrase scan only needs raw text — \b-anchored regex with
[\s\-]+ flex handles BS's get_text output identically to rich_to_text
output. Section structure is irrelevant for phrase matching; the
LM-dictionary phrases appear verbatim in the prose body regardless of
MD&A boundaries.
Cold-run estimate: 50+ min (timed out) → ~25-30 min (saves 40-75 min).
Subsequent runs: unchanged (90-day 10-K cache hits).
No new dependencies (beautifulsoup4 + lxml already in pyproject.toml).
No test changes to scan_going_concern (synthetic mocking unaffected).
Changes
-------
compute/ingest/filing_text.py (+22 net LOC)
- Add `from bs4 import BeautifulSoup` import.
- Replace the 13-line `text_attr = getattr(most_recent, "text", None)`
block with a 14-line `html_attr = getattr(...) →
BeautifulSoup(html, "lxml").get_text(separator=" ")` block.
Preserves the existing callable-vs-property handling for `html`
(same dual-shape API caveat as `text` had).
- Updated function docstring to document the perf decision: explains
why we deliberately skip `filing.text()`, names the section detector
cost, references the workflow_dispatch timeout, and confirms
correctness equivalence for the going-concern scan.
- New "10-K HTML fetch returned empty" warning code path: when
`filing.html()` returns falsy, log + return None gracefully.
tests/test_ingest/test_filing_text.py (NEW, 5 cases)
- test_bs_extraction_preserves_going_concern_phrase_inline_b_tags:
phrase split by `<b>` tags survives extraction.
- test_bs_extraction_handles_table_layouts: SEC table-formatted MD&A
still yields detectable phrases.
- test_bs_extraction_collapses_inline_styling_whitespace: inline
spans / `<br>` tags don't break `[\s\-]+`-flex regex.
- test_bs_extraction_negative_clean_filing: inverse sanity — clean
text after BS strip still scans False.
- test_bs_extraction_empty_html_yields_empty_text: edge case.
Locks the contract: if edgartools' html() output ever changes shape,
or BS extraction starts losing whitespace boundaries, this test file
catches it before production.
Verification
------------
- ruff check . → clean
- python -m compute.output.schema_check → in-sync
- pytest tests/ -m "not network" → 505 passed (was 500 → +5 new
BS-contract tests)
- npx tsc --noEmit → clean
- python -c "from compute.ingest.filing_text import
fetch_latest_10k_text" → imports cleanly
edgartools API edge cases discovered
------------------------------------
Confirmed via `inspect.getsource`:
- `Filing.text()` is a `@lru_cache(maxsize=4)` method that fetches
HTML via `self.html()`, then runs `HTMLParser(ParserConfig(form=
...)).parse(html_content)` → `rich_to_text(document, width=500)`.
The HTMLParser path is what calls `hybrid_section_detector`.
- `Filing.html()` is also a method (not a property — confirmed by
inspect). Returns `sgml.html()` directly with no parsing. Same
dual-shape (callable vs property) defensive handling preserved in
our code in case future edgartools versions change.
- `Filing.full_text_submission()` exists but downloads the entire
.txt submission (primary + exhibits + headers); much larger payload
than just html() of the primary document. Not used.
- `Filing.text_url` (property) just returns the URL; we'd download +
strip ourselves. Equivalent path but with extra HTTP round trip.
Not used.
Cache shape unchanged — still stores stripped text under the same
JSON key; existing 90-day TTL applies.
What's NOT in this commit
-------------------------
- Changes to compute/scoring/going_concern.py (regex-side is already
correctness-equivalent on the new input)
- Changes to fetch_tier2_for_ticker (orchestrator is unchanged; only
the underlying text-source path changed)
- Workflow timeout bump (Option A backup) — not needed if Option B
succeeds; can be reconsidered if next workflow run still hits the
45-min ceiling for unrelated reasons
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
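The extraction-then-scan idea described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: it uses the standard library's `html.parser` as a stand-in for `BeautifulSoup(html, "lxml").get_text(separator=" ")`, and the phrase list, `html_to_text`, and this `scan_going_concern` signature are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, discarding all tags (stdlib stand-in for bs4)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Join text nodes with a single space, like get_text(separator=' ')."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical phrase list; the real scan uses its own dictionary.
_PHRASES = ("substantial doubt about its ability to continue as a going concern",)

# Words are joined with [\s\-]+ so extra whitespace or hyphens left by
# tag stripping do not break the match; \b anchors both ends.
_PATTERNS = [
    re.compile(
        r"\b" + r"[\s\-]+".join(re.escape(w) for w in phrase.split()) + r"\b",
        re.IGNORECASE,
    )
    for phrase in _PHRASES
]


def scan_going_concern(text: str) -> bool:
    """Return True if any phrase pattern appears in the raw text."""
    return any(p.search(text) for p in _PATTERNS)
```

A phrase split by inline tags, such as `a <b>going</b>-concern`, still matches because the separator inserted between text nodes is absorbed by the `[\s\-]+` flex.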
edgartools' EightK.items + EightK.sections route through hybrid_section_detector via the @cached_property document parsing (HTMLParser(ParserConfig(form='8-K')).parse(html)). Cost: ~3,500 detector invocations across 502 stocks × ~7 8-K filings each. This dominated the 45-min workflow timeout in run #13 even after Option B fixed the 10-K path (commit 226840d). Mirror edgartools' own Strategy 3 regex fallback for legacy SGML filings (current_report.py:51 _extract_items_from_text) — proven correct on the same input distribution. Use filing.html() + BeautifulSoup raw text + regex Item detection. Adds tests A15-A20 (+6) covering canonicalization, dedup, excerpt cap, empty HTML, and 4.020 false-match guard. Test count: 505 → 511. Cold-run time: 45-min timeout → expected 25-30 min. Subsequent runs: unchanged (cache hits). https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
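A regex-based Item detector of the kind described above might look like the sketch below. The pattern and the `extract_items` helper are assumptions for illustration; they are not copied from edgartools or from `eight_k_events`, and they operate on text that has already been stripped of HTML.

```python
import re

# Matches headings such as "Item 4.02" or "ITEM  4.01".
# The (?!\d) lookahead rejects "4.020", the false match that the
# commit's guard test is described as covering.
_ITEM_RE = re.compile(r"\bItem\s+(\d\.\d{2})(?!\d)", re.IGNORECASE)


def extract_items(text: str) -> list[str]:
    """Return canonical item numbers in first-seen order, deduplicated."""
    seen: list[str] = []
    for match in _ITEM_RE.finditer(text):
        item = match.group(1)
        if item not in seen:
            seen.append(item)
    return seen
```

Keeping first-seen order makes the output deterministic, which is convenient when the result is cached or compared in tests.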
Defense-in-depth safety net for cold-cache scenarios. With the
Scenario A perf fix in 12ad7ff (skip 8-K parser via raw HTML + regex),
expected cold run is 25-30 min — well under the previous 45-min
ceiling. But:
1. We hit the 45-min ceiling twice already (runs #12, #13). Operating
on the cliff edge is bad practice.
2. Phase 4+ defenses (Beneish, Dechow, REIT FFO/AFFO, cross-source
validator) will each add fetch + compute time. Better to have
headroom than re-fight this battle later.
3. GitHub Actions billing is by minute and timeout-minutes only
matters on FAILED runs (which terminate at the cap). Successful
runs bill actual time. Bumping the cap costs nothing if the perf
fix works.
4. 90 min = comfortable 3x baseline. Future expansion has room
without further ceiling fights.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
…efer
Run #14 timeout root cause: SEC EDGAR API throttling amplified by
tenacity retry policy (max=30s × 3 attempts = 60-90s per failed
stock). Run #11 (2 days ago) finished in 23m on same code path;
Run #14 stuck 1h+ in fundamentals stage = 3-6x SEC API slowdown
during incident.
Mitigations:
1. Tighten retry on BOTH _build_snapshot and _build_annual_history:
stop=(stop_after_delay(30) | stop_after_attempt(2)),
wait=wait_exponential(min=2, max=8). Caps per-stock retry at ~30s.
2. Per-stock fundamentals + history fetch timeout
(fut.result(timeout=45)) — graceful skip on stuck-task. Defensive
backstop; real cap is the inner tenacity stop_after_delay.
3. Suppress noisy edgartools concept-miss UserWarnings via
facts._suppress_warnings = True after company.get_facts(). Skips
the difflib fuzzy-match suggestion pass and frees stderr for triage.
4. Per-stock latency histogram (<5s / 5-15s / 15-30s / 30s+) with
thresholds aligned to retry-policy tiers, plus p50/p95 + top-20
slow tickers logged for Phase 4 throttling-detection visibility.
5. fundamentals_coverage_pct + fundamentals_latency_p50_seconds +
fundamentals_latency_p95_seconds in Metadata mirror the existing
tier2_coverage_pct.
ALSO: defer 8-K event fetches (Defenses #9 + #10) to Phase 4. Three
workflow timeouts (#12, #13, #14) consumed budget; ship PR 3d with
going-concern (Defense #8) only. _EIGHT_K_DEFENSES_ENABLED feature
flag gates the 8-K branch — single-line flip in Phase 4 to re-enable
once the pre-cache layer lands. Schema unchanged; 8-K event fields in
tier2_events emit but always False/None until Phase 4. Active veto
count temporarily 3 (was planned 4); restored in Phase 4.
Tests: 511 → 526 (+15: 5 deferred-mode tier2, 5 histogram/percentile/
tuple-return main, 1 retry-policy contract, 4 fixture-extended A/D
tests for 8-K wiring).
Tracked: /tmp/issue_drafts/issue_8k_events_phase4.md +
/tmp/issue_drafts/issue_fundamentals_resilience_phase4.md.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
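The latency histogram and p50/p95 figures from mitigation 4 could be computed along these lines. This is a sketch under assumptions: the function names are hypothetical, bucket boundaries are treated as half-open (a value of exactly 5s lands in the 5-15s bucket), and the nearest-rank percentile method is chosen here for simplicity since the commit does not say which method is used.

```python
import math

# Upper bounds (exclusive) for the first three buckets; anything at or
# above 30s falls into the final "30s+" bucket.
_BUCKETS = (("<5s", 5.0), ("5-15s", 15.0), ("15-30s", 30.0))


def latency_histogram(latencies):
    """Count per-stock latencies (seconds) into the four buckets."""
    counts = {"<5s": 0, "5-15s": 0, "15-30s": 0, "30s+": 0}
    for value in latencies:
        for label, upper in _BUCKETS:
            if value < upper:
                counts[label] += 1
                break
        else:
            counts["30s+"] += 1
    return counts


def percentile(latencies, q):
    """Nearest-rank percentile (an assumed method); None when empty."""
    if not latencies:
        return None
    ordered = sorted(latencies)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]
```

Aligning the bucket edges with the retry tiers means a spike in the 30s+ bucket points directly at stocks that exhausted the retry budget.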
Three Recharts label-clipping issues found in production Vercel
preview spot-check on iPhone-class viewports:
1. FairPriceBarChart 'Current $X' ReferenceLine label clipped
above chart frame. Moved from position='top' to
'insideBottomRight' with offset=8.
2. FairPriceBarChart rightmost x-axis tick ('$1238' for NVDA)
clipped at right edge. Bumped BarChart margin from
{top:8,right:24,bottom:8,left:0} to
{top:10,right:30,bottom:10,left:10}.
3. PillarRadarChart 'Technical' axis label clipped at left edge.
Added explicit cx/cy/outerRadius props; reduced outerRadius
from default 80% to 70% to give label breathing room on all
8 axes.
No component logic changes; CSS/margin/position adjustments only.
Frontend build still passes 506/506 routes; tsc clean.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Two additional UI quality issues found in deeper spot-check:
1. Rankings cards with null fair_price (~15 of 502 stocks including
SPG #1, BKR #6) rendered visibly shorter than populated cards,
breaking visual rhythm in the mobile scroll list. Added
min-h-[112px] for uniform card height + h-full on the inner Link so
the hover area fills the card. Improved placeholder typography:
'Fair ⚠ N/A' (slate-400, with title tooltip) instead of bare '—',
matching the visual weight of populated rows.
2. FairPriceBarChart small-value bars (e.g., Graham defensive ~$28,
Residual Income ~$23 on NVDA) crushed to ~6 pixels when chart spans
$0-$1238 due to outlier EV/EBITDA $1031. Added minPointSize=5 to
Bar component — small bars stay visible regardless of value
magnitude. Linear scale preserved.
No component logic changes; CSS/Tailwind/Recharts prop adjustments only.
Frontend build still passes 506/506 routes; tsc clean.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Three layout iterations from continued user spot-check:
1. FairPriceBarChart 'Current $X' label moved back to position='top'
with offset=12 + chart top margin bumped from 10 to 35 to prevent
clipping. The previous insideBottomRight position overlapped the
y-axis 'Residual Income' label. Median + max ReferenceLines have
no labels so don't interfere.
2. FairPriceBarChart visual rhythm:
- container height 18rem (288px) → 300px (more vertical room)
- barCategoryGap='25%' (cleaner method separation)
- y-tick fontSize 11 → 10, width 140 → 110 (label fit at smaller
font + more horizontal bar space)
3. RankingTable mobile card layout restructured from horizontal-flex
to flex-col with three rows:
- Row 1 (items-start flex): ticker+rank+name on left, score badge
top-aligned on right (was vertically centered, floating mid-card)
- Row 2 (justify-between): sector | price (tabular-nums)
- Row 3 (justify-between): fair | MoS (tabular-nums)
Cleaner visual rhythm across all 502 cards regardless of name/sector
length or null fair_price.
No component logic changes; Tailwind/Recharts layout adjustments only.
Frontend build still passes 506/506 routes; tsc clean.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
Three PR-3d components carried sm:max-w-2xl that capped their width at
672px on viewports ≥640px (Tailwind sm breakpoint), while existing
cards on the same stock detail page (FairPriceCard, RawMetricsTable)
have no width cap and span the full container. On tablet/large-phone
viewports this read as visual inconsistency: the new cards looked
narrower than the surrounding sections, breaking the page rhythm.
Removed sm:max-w-2xl from:
- FairPriceBarChart
- PillarRadarChart
- Tier2EventCard
All three now match the full-width treatment used by FairPriceCard +
RawMetricsTable. No layout changes inside the cards — the charts'
ResponsiveContainer already adapts to the parent width.
Frontend build still passes 506/506 routes; tsc clean.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
5 tasks
dackclup pushed a commit that referenced this pull request on May 14, 2026
Independent verification of PR #49's TTM flow-item fix surfaced a
related but distinct bug — 26 of 502 S&P 500 tickers shipped with
shares_outstanding=None (and consequently market_cap=None, PE=NaN,
PB=NaN). Affected tickers include front-page names: META (rank #12),
ACN, MA, BRK-B, CMCSA, DASH, LEN, ABNB.
Root cause: the `_BALANCE_TAGS["shares_outstanding"]` chain only had
`us-gaap:CommonStockSharesOutstanding` + `CommonStockSharesIssued`.
META / ACN etc. don't tag either concept — they only file shares via
`us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding` (the EPS
denominator).
Two distinct patterns surfaced when probing more candidate tags:
1. DEI-current pattern (WMT, META, ACN). The cover-page tag
`dei:EntityCommonStockSharesOutstanding` reflects splits / buybacks
within ~1 quarter of the filing date. WMT's DEI tag holds the
correct post-Feb-2024-split 8B shares; the us-gaap tag held a stale
3.42B.
2. DEI-stale pattern (MA, BRK-B). DEI tag frozen at 2010-2011 for some
legacy filers — MA shows 122M (2010-10-27) vs the correct 893M from
`WeightedAverageNumberOfDilutedSharesOutstanding` (2026-03-31).
First-non-null chain ordering can't distinguish "current DEI" from
"stale DEI".
Fix:
- New `_try_balance_tags_most_recent(facts, tags)` helper picks the
candidate concept with the most recent `period_end` across the
entire chain instead of taking the first non-null. Falls back to
None when all candidates are missing.
- `_build_snapshot` routes `shares_outstanding` through the new helper
(other balance items keep first-non-null semantics — they don't have
the multi-concept-divergence issue).
- Chain expanded: DEI + WeightedAverageDiluted + WeightedAverageBasic
added; legacy us-gaap tags retained.
Validation against 6 tickers:
Ticker   Before              → After                 Expected
META:    None                → 2.564B                ~2.5B ✓
WMT:     3.42B (stale)       → 7.97B                 ~8B post-split ✓
ACN:     None                → 624M                  ~637M ✓
MA:      122M (DEI stale)    → 893M                  ~893M ✓
BRK-B:   None                → 1.64M (multi-class)   ~2.16B (still wide)
AAPL:    14.69B              → 14.69B                (control, unchanged) ✓
BRK-B's dual-class A/B structure remains an edge case — no standard
concept captures the consolidated share count. Caught by the existing
`data_quality_input_corruption` veto (TBVPS would exceed $10K/share
with ~$560B equity and only ~1.6M reported shares), so BRK-B is
correctly excluded from Top-5 ranking.
Regression guard:
test_shares_outstanding_fallback_chain_covers_dei_and_weighted_avg
ensures future ingest edits can't silently drop these alternative
tags.
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
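The "most recent `period_end` wins" selection described in the fix above can be illustrated with plain data structures. This sketch assumes each candidate tag has already been resolved to either `None` or a `(value, period_end)` pair; `pick_most_recent` and that input shape are hypothetical and do not reflect the actual signature of `_try_balance_tags_most_recent` or the edgartools facts API.

```python
from datetime import date
from typing import Optional


def pick_most_recent(
    candidates: dict[str, Optional[tuple[float, date]]],
) -> Optional[float]:
    """Return the value whose period_end is latest across all tags.

    Unlike first-non-null ordering, a stale tag that appears early in
    the chain cannot shadow a fresher tag that appears later.
    """
    best: Optional[tuple[float, date]] = None
    for fact in candidates.values():
        if fact is None:
            continue  # tag not filed by this company
        if best is None or fact[1] > best[1]:
            best = fact
    return None if best is None else best[0]
```

With the MA figures quoted above, a DEI value of 122M dated 2010-10-27 loses to the 893M weighted-average value dated 2026-03-31, regardless of chain order.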
dackclup added a commit that referenced this pull request on May 14, 2026
…#49)
* fix(ingest): TTM flow items + pe_ratio formula (audit #6, deep clean)
Pre-v1.0 audit #6 found systemic single-period bugs in 9
income-statement flow items. The `_NORMALIZED_LATEST` dict pulled
values via `facts.get_concept('xyz')` which returns latest
single-period — a value that can be Q1, H1 YTD, Q3 YTD, or full annual
depending on each filer's reporting cadence. Probed across 4 tickers
in May 2026:
TSLA operating_income $941M (single Q1-2026) vs $4.9B (TTM) — 5× off
AAPL operating_income $87B (H1 YTD) vs $147B (TTM) — 0.6× off
GOOGL gross_profit $96B (H1 YTD) vs $217B (TTM) — 0.4× off
Universe-wide impact:
- pe_ratio (= price / eps_diluted) broken on 381 of 430 S&P 500
tickers (88.6%) with PE off by > 30%. Median production PE = 77.5,
correct median = 26.2 (3× artificial inflation).
- Profitability pillar: gross_margin, operating_margin,
gross_profitability all mixed single-period numerator with TTM
revenue → systematically understates margins.
- Health pillar: interest_coverage, debt_to_ebitda, Altman Z'' EBIT
proxy all mixed single-period with balance items.
- Value pillar EV/EBITDA: broken via EBITDA = op_income + D&A both
single-period.
- Fair-price multiples PE method, Beneish DEPI / TATA, Dechow Δroa all
consume these snap fields downstream.
Fix:
A. New `_TTM_FLOW_TAGS` dict with US-GAAP tag chains for 9
income-statement flow items: operating_income, gross_profit,
cost_of_revenue, sga_expense, depreciation_and_amortization,
interest_expense, income_tax_expense, research_and_development,
income_before_tax, dividends_paid. Each chain ordered most-general /
modern first so the MAX-of-fresh heuristic (added in PR #48) picks
consolidated totals over segment-level disclosures. interest_expense
chain includes `InterestExpenseOperating` and
`InterestExpenseNonoperating` (newer concepts) ahead of legacy
`InterestExpense` — AAPL / MSFT / JPM / TSLA all probed stale on the
legacy tag post-2024.
B. `_build_snapshot` now walks `_TTM_FLOW_TAGS` via the existing
`_try_ttm_max_fresh` helper. `_NORMALIZED_LATEST` reduced to just
`eps_basic` / `eps_diluted` (per-share figures with no clean
TTM-via-tag substitute; consumers derive TTM EPS from NI_TTM / shares
instead).
C. `pe_ratio` rewritten to use `NI_TTM / shares_outstanding` directly
instead of `snap.eps_diluted`.
Validation against 8 diverse tickers post-fix: AAPL 35.8, MSFT 24.0,
NVDA 45.7, TSLA 425.9, GOOGL 30.4, JPM 13.7 — all within reasonable
industry ranges (vs prior 61.6, 30.8, 46.1, 3425.2, 78.8, 50.5 with
mixed-period bug).
Regression guards added:
- test_ttm_flow_tags_replace_normalized_latest_for_income_statement
- test_pe_returns_nan_for_negative_earnings +
test_pe_returns_nan_when_inputs_missing
- test_pe_ratio (now expects PE=40 for fixture, was 20 under broken
formula)
Full offline suite: 645 / 645 pass. Frontend tsc + next build clean.
Validation against EDGAR-direct ground truth pending
workflow_dispatch re-run + audit shortlist.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* fix(ingest): smart shares_outstanding fallback (audit #6 follow-up)
Independent verification of PR #49's TTM flow-item fix surfaced a
related but distinct bug — 26 of 502 S&P 500 tickers shipped with
shares_outstanding=None (and consequently market_cap=None, PE=NaN,
PB=NaN). Affected tickers include front-page names: META (rank #12),
ACN, MA, BRK-B, CMCSA, DASH, LEN, ABNB.
Root cause: the `_BALANCE_TAGS["shares_outstanding"]` chain only had
`us-gaap:CommonStockSharesOutstanding` + `CommonStockSharesIssued`.
META / ACN etc. don't tag either concept — they only file shares via
`us-gaap:WeightedAverageNumberOfDilutedSharesOutstanding` (the EPS
denominator).
Two distinct patterns surfaced when probing more candidate tags:
1. DEI-current pattern (WMT, META, ACN). The cover-page tag
`dei:EntityCommonStockSharesOutstanding` reflects splits / buybacks
within ~1 quarter of the filing date. WMT's DEI tag holds the
correct post-Feb-2024-split 8B shares; the us-gaap tag held a stale
3.42B.
2. DEI-stale pattern (MA, BRK-B). DEI tag frozen at 2010-2011 for some
legacy filers — MA shows 122M (2010-10-27) vs the correct 893M from
`WeightedAverageNumberOfDilutedSharesOutstanding` (2026-03-31).
First-non-null chain ordering can't distinguish "current DEI" from
"stale DEI".
Fix:
- New `_try_balance_tags_most_recent(facts, tags)` helper picks the
candidate concept with the most recent `period_end` across the
entire chain instead of taking the first non-null. Falls back to
None when all candidates are missing.
- `_build_snapshot` routes `shares_outstanding` through the new helper
(other balance items keep first-non-null semantics — they don't have
the multi-concept-divergence issue).
- Chain expanded: DEI + WeightedAverageDiluted + WeightedAverageBasic
added; legacy us-gaap tags retained.
Validation against 6 tickers:
Ticker   Before              → After                 Expected
META:    None                → 2.564B                ~2.5B ✓
WMT:     3.42B (stale)       → 7.97B                 ~8B post-split ✓
ACN:     None                → 624M                  ~637M ✓
MA:      122M (DEI stale)    → 893M                  ~893M ✓
BRK-B:   None                → 1.64M (multi-class)   ~2.16B (still wide)
AAPL:    14.69B              → 14.69B                (control, unchanged) ✓
BRK-B's dual-class A/B structure remains an edge case — no standard
concept captures the consolidated share count. Caught by the existing
`data_quality_input_corruption` veto (TBVPS would exceed $10K/share
with ~$560B equity and only ~1.6M reported shares), so BRK-B is
correctly excluded from Top-5 ranking.
Regression guard:
test_shares_outstanding_fallback_chain_covers_dei_and_weighted_avg
ensures future ingest edits can't silently drop these alternative
tags.
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* fix: complete eps_diluted → NI_TTM/shares migration (audit #6 follow-up)
Re-audit found 4 additional consumers of snap.eps_diluted that
PR #49's initial fix missed:
1. `compute/main.py::_build_universe_metrics` — pe_ttm used in
raw_metrics.pe_ratio_ttm display + peer median computation. Switched
to NI_TTM/shares.
2. `compute/main.py` per-ticker loop — raw_metrics.pe_ratio_ttm field
that feeds the StockSummary JSON. Same fix.
3. `compute/valuation/ensemble.py::multiples_pe_fair_price` call
site — eps_ttm input was snap.eps_diluted (single-period). Switched
to NI_TTM/shares so the fair-price PE method is consistent with the
value-pillar PE.
4. `compute/features/value.py::graham_number` (TTM variant used as a
pillar factor, distinct from the fair-price Graham). Switched to
NI_TTM/shares.
Without these, the fair-price PE method would have continued using the
wrong (quarterly/YTD/annual) EPS — affecting peer medians used for
relative valuation. PE pillar fixed but fair-price still corrupt.
Tests updated:
- test_main.py _snap fixture: add net_income=50 (matches
eps_diluted=5 with shares=10 so existing test invariants hold).
- test_build_universe_metrics_pe_uses_diluted_eps → renamed
test_build_universe_metrics_pe_uses_ttm_eps_not_single_period,
expectation switched to NI-derived.
- test_build_universe_metrics_negative_eps_yields_null_pe → renamed
test_build_universe_metrics_negative_ni_yields_null_pe, uses
net_income=-10 override.
- test_graham_number expectation updated: √(22.5 × 1.0 × 12) ≈ 16.43
(vs prior √540 ≈ 23.24 under broken formula).
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* fix(ingest): expand TTM revenue + NI chains for utilities, tech, BKNG (audit #6 deeper)
Deeper audit found 3 more tag-coverage gaps causing None revenue / NI
across multiple tickers:
1. **BKNG NI = None**: edgartools picks `us-gaap:NetIncomeLoss` which
is frozen at 2012 for BKNG. Fresh data lives under
`us-gaap:NetIncomeLossAvailableToCommonStockholdersBasic` ($6.15B
2026-03-31). Added to `_TTM_NET_INCOME_TAGS` chain.
2. **DUK / utilities revenue = None**: `us-gaap:Revenues` frozen at
2017 for DUK. Fresh data lives under
`us-gaap:RegulatedAndUnregulatedOperatingRevenue` ($33.17B). This is
the standard utility-sector revenue concept. Added.
3. **CRWD / tech revenue = None**: CrowdStrike + some other modern
tech filers tag revenue under
`us-gaap:RevenueFromContractWithCustomerIncludingAssessedTax` (the
"Including" assessed-tax variant of ASC 606) rather than the more
common "Excluding" variant. Added.
The MAX-of-fresh heuristic from PR #48 handles concept selection — we
just need every relevant concept to be in the chain.
Validation results (universe sweep):
Before this commit: 4 NI=None + 25 revenue=None (~6% universe)
After this commit: expected to drop to ~7 revenue=None (banks
WFC/GS/etc. need interest+noninterest aggregation — Phase 4) + APA
(energy-specific tagging — Phase 4 too).
Per-ticker fixes verified locally:
Ticker   Before    After
DUK      None      $33.2B revenue
CRWD     None      $4.8B revenue
BA       None      $92.2B revenue (was None due to old edgartools
                   pre-3-fresh chain — now works since `Revenues` is
                   the only fresh concept)
LMT      None      $75.1B revenue
BKNG     NI=None   NI=$6.15B
Control unchanged: AAPL revenue $451.4B / NI $122.6B.
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* fix(ingest): add bank-specific RevenuesNetOfInterestExpense concept
Continued deep audit found WFC + GS shipping with revenue=None. Both
are diversified banks that don't tag `us-gaap:Revenues` for
consolidated revenue — instead they file under
`us-gaap:RevenuesNetOfInterestExpense` (industry-standard for banks
reporting net-interest income + noninterest income aggregated).
WFC: was None → $85.0B fresh (Q1 2026)
GS: was None → $60.4B fresh (Q1 2026)
Adds the concept to `_TTM_REVENUE_TAGS` in the appropriate position
(after the utility-specific concept, before the legacy SalesRevenueNet
fallback). The MAX-of-fresh heuristic picks correctly: BAC already had
fresh `us-gaap:Revenues` so it sticks with that; WFC/GS fall to the
bank concept.
Remaining None-revenue tickers post-this-fix:
- HBAN (regional bank — uses InterestIncome / NoninterestIncome
separately, needs aggregation logic — Phase 4)
- APA (Apache energy — uses srt: oil/gas-specific tags rather than
us-gaap — Phase 4)
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* fix(ingest): mirror TTM tag chain expansion to _ANNUAL_TAGS
Continued sweep found annual history gaps for the same tickers that
needed snapshot-tag fixes:
Before annual-chain fix:
BKNG: 6y revenue, 0y NI (NI tag stale)
WFC: 0y revenue, 6y NI (revenue tag missing)
DUK: 0y revenue, 6y NI (utilities concept missing)
After:
All four (AAPL control + BKNG + WFC + DUK): 6y / 6y
Impact: `_avg_3y_roe` (uses NI history), `revenue_cagr` and `fcf_5y`
were silently skipping for ~50 utility / bank / multi-class filers
because their annual `_ANNUAL_TAGS["revenue"]` or
`_ANNUAL_TAGS["net_income"]` returned empty. With both chains expanded
to mirror the TTM-chain fixes:
- revenue:
+ RevenueFromContractWithCustomerIncludingAssessedTax (CRWD-class)
+ RegulatedAndUnregulatedOperatingRevenue (utilities)
+ RevenuesNetOfInterestExpense (banks)
- net_income:
+ NetIncomeLossAvailableToCommonStockholdersBasic (BKNG-class)
+ NetIncomeLossAvailableToCommonStockholdersDiluted
+ ProfitLoss
The pre-existing `_avg_3y_roe` known-issue (#11 — uses current equity
as denominator, not per-year equity) is NOT addressed here. That's a
separate Phase 4 fix; the chain expansion just lets it COMPUTE for the
~50 affected tickers instead of returning None.
Full offline suite: 646 / 646 pass.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
* ci(compute-rankings): bump fundamentals cache key to v2
Critical finding from auditing the post-PR-#48 workflow_dispatch run
on main: the fix in PR #48 (TTM concept-pick + data_quality guard) is
CORRECT in code but the production output still shows the old wrong
values for AVB / CPT / ESS / UDR and many others.
Root cause: `actions/cache@v5` restores `compute/cache/fundamentals/`
across runs within the same quarter (cache key
`fundamentals-2026Q2-ubuntu-latest`). The pre-PR-#48 cached parquets
hold the wrong $7.1M (AVB) / $12.6M (CPT) / etc. revenue values, AND
their `latest_filed_date` is recent enough that the `_is_fresh` check
(45-day rolling) passes, so `fetch_fundamentals` returns the cached
snapshot WITHOUT re-fetching from EDGAR.
Local probe confirms: with PR #48 + #49 code, AVB returns $3.07B
revenue (correct). Production keeps shipping $7.1M because cache.
Fix: bump the cache key from `fundamentals-{quarter}-{os}` to
`fundamentals-v2-{quarter}-{os}`. Forces one cache miss on the next
workflow_dispatch run, which triggers fresh fetches for all 502
tickers (~50 min cold run vs 30 min warm). After that one run, the v2
cache repopulates with corrected data.
Bump the v-number whenever any tag-chain dict changes schema:
- `_BALANCE_TAGS["shares_outstanding"]` chain (PR #49 added DEI +
weighted-average fallbacks + MAX-by-period logic)
- `_TTM_REVENUE_TAGS` (PR #49 added utility / bank / ASC-606 variants)
- `_TTM_NET_INCOME_TAGS` (PR #49 added BKNG-class concepts)
- `_TTM_FLOW_TAGS` (entirely new in PR #49)
- `_BALANCE_TAGS["property_plant_equipment"]` (PR #43)
Without this bump, the deep-clean fixes don't actually reach
production.
https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2
---------
Co-authored-by: Claude <noreply@anthropic.com>
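The `pe_ratio` rewrite described in item C of the first squashed commit (deriving EPS from `NI_TTM / shares_outstanding` and returning NaN for negative earnings or missing inputs) can be sketched as below. The function signature and the numbers in the usage note are illustrative assumptions, not the repository's actual code or fixture.

```python
import math
from typing import Optional


def pe_ratio(
    price: Optional[float],
    net_income_ttm: Optional[float],
    shares_outstanding: Optional[float],
) -> float:
    """Price / (TTM net income per share); NaN when not meaningful."""
    # Missing inputs (or zero shares) cannot yield a meaningful ratio.
    if price is None or net_income_ttm is None or not shares_outstanding:
        return math.nan
    # Non-positive earnings yield NaN rather than a negative PE.
    if net_income_ttm <= 0:
        return math.nan
    eps_ttm = net_income_ttm / shares_outstanding
    return price / eps_ttm
```

For example, a hypothetical price of 200 with TTM net income of 50 and 10 shares gives an EPS of 5 and a PE of 40.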
dackclup added a commit that referenced this pull request on May 15, 2026
….6.0 → 0.7.0 (#79) Closes #14. Closes #17 (the "10-K + 8-K" log line is now accurate again). ## What this PR does Flips `compute.scoring.tier2._EIGHT_K_DEFENSES_ENABLED` False → True. That single boolean was the only thing standing between the PR 3d wiring and an active 4th veto. Active vetoes count goes 4 → 5. | Defense | ItemFlag.fired effect | Active in production | |---|---|---| | `non_reliance_filing` (8-K Item 4.02, Schroeder 2024 SSRN) | hard VETO — suppresses `entered_top5` | **YES, after this PR** | | `auditor_change` (8-K Item 4.01, Reg S-K Item 304) | ANNOTATE — emits `tier2_events.auditor_change.fired` | **YES, after this PR** | ## Why this is safe to flip now The PR 3d deferral cited 3 workflow-timeout incidents (runs #12 / #13 / #14). Every root cause has since been mitigated: | PR 3d failure | Fix | |---|---| | Run #12 — `filing.text()` routed through `hybrid_section_detector` (~5-10s × 502 stocks) | PR 3d hotfix 226840d — `filing.html()` shortcut | | Run #13 — `EightK.items` access also triggered the detector | PR 3d hotfix 12ad7ff — regex-on-raw-HTML extraction (mirrors edgartools Strategy 3 fallback) | | Run #14 — SEC EDGAR throttling × overly-aggressive tenacity retry (60-90s/stuck-stock × ~40% failure rate) | PR 3d Part 1 — `stop_after_delay(30) | stop_after_attempt(2)`, 45s per-stock orchestrator timeout, edgartools warning suppression | | Cache state lost across CI runs (only `fundamentals` was being preserved) | PR 4a — workflow cache restore step expanded to all 6 paths, including `edgar_8k` | | Weekly compute = 7-day recovery window on failure | PR 4f — daily Mon-Fri cron = 24h recovery window | Latency p95 has dropped from the 30+s regime that bit run #14 to 14.41s on the latest production run. Kill-switch capability (`QR_SKIP_TIER2` env var + the feature flag itself) is preserved. 
## Files changed

- `compute/scoring/tier2.py`: `_EIGHT_K_DEFENSES_ENABLED = True`; comment block updated with the post-flip rationale + kill-switch pointer
- `compute/config.py`: `SCHEMA_VERSION` `0.6.0-phase3d` → `0.7.0-phase4g`. Promoting a defense flag from deferred to active veto is a **minor** semver bump per `.claude/skills/phase-4/schema-versioning/PLAN.md`.
- `tests/test_config.py`: locked-constant test renamed `test_schema_version_is_phase3d` → `test_schema_version_is_phase4g`; asserts new version
- `tests/test_smoke.py`: `SCHEMA_VERSION.startswith("0.6.0")` → `startswith("0.7.0")`
- `tests/test_scoring/test_tier2.py`: `eight_k_disabled` fixture added; E1-E5 + F2 updated to flip the flag explicitly when exercising kill-switch behavior (was relying on the default). Same threshold-symbolic-test pattern as SKILL Rule 17: tests stay green if the constant moves.

## Scope NOT in this PR

- Pre-cache off-cycle workflow (issue #14 §1): kept as an option for further perf headroom but not needed for correctness. If the first daily run with 8-K enabled comes in under the 90-min budget, the pre-cache layer becomes an optimization, not a requirement. File as follow-up if needed after monitoring the first 1-2 runs.
- Frontend updates: `tier2_events.non_reliance_filing` / `auditor_change` were already wired through to `Tier2EventCard` in PR 3d; values just flip from "always false" to "computed from EDGAR data" after this PR. No frontend code change required.
## Verification ladder

- ✅ ruff check: clean
- ✅ pytest -m "not network": 772 passed (5 retitled / threshold-symbolic tests in tier2 + test_config + test_smoke)
- ✅ schema_check: Pydantic ↔ TypeScript snapshot still in sync (no field shape change; only the version string moved)
- ✅ tsc --noEmit: clean
- ✅ next build: 506 static pages

## Post-merge monitoring

After the first daily compute run with 8-K active:

- Section A schema field reads `0.7.0-phase4g`
- Section B `non_reliance_filing` count: expect 0-5 tickers in the S&P 500 universe per Schroeder 2024 base rate; > 10 deserves investigation
- Section B `auditor_change` count: expect 5-20 tickers per Reg S-K base rate over a 730-day window
- `tier2_coverage_pct` should remain ≥ 95% (was ~100% with 10-K only; partial 8-K fetch failures will drag it down some)
- Workflow runtime: expect +5-10 min cold cache hit, +1-2 min warm

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2

Co-authored-by: Claude <noreply@anthropic.com>
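The post-merge monitoring expectations above can be turned into a mechanical check. The following Python sketch is one way to do that; the function name `post_merge_warnings` and its argument names are hypothetical, and only the thresholds are taken from the list in the commit message. How the values are read from the run output is left out.

```python
from typing import List

EXPECTED_SCHEMA_VERSION = "0.7.0-phase4g"


def post_merge_warnings(schema_version: str,
                        non_reliance_count: int,
                        auditor_change_count: int,
                        tier2_coverage_pct: float) -> List[str]:
    """Compare one run's headline numbers with the expectations listed above."""
    warnings: List[str] = []
    if schema_version != EXPECTED_SCHEMA_VERSION:
        warnings.append(f"schema version is {schema_version!r}, "
                        f"expected {EXPECTED_SCHEMA_VERSION!r}")
    # 0-5 is the expected range; only counts above 10 are called out for investigation.
    if non_reliance_count > 10:
        warnings.append(f"non_reliance_filing fired on {non_reliance_count} tickers (> 10)")
    # 5-20 tickers expected over the 730-day annotate window.
    if not 5 <= auditor_change_count <= 20:
        warnings.append(f"auditor_change fired on {auditor_change_count} tickers "
                        "(expected 5-20)")
    if tier2_coverage_pct < 95.0:
        warnings.append(f"tier2_coverage_pct is {tier2_coverage_pct:.1f} (< 95)")
    return warnings


# A run inside every expected range produces no warnings.
healthy = post_merge_warnings("0.7.0-phase4g", 2, 12, 97.5)
# A run that misses every expectation produces one warning per check.
unhealthy = post_merge_warnings("0.6.0-phase3d", 11, 3, 90.0)
```

Returning a list of messages rather than raising keeps the check usable as a report after each daily run; a caller that wants a hard failure can assert that the list is empty.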
feat(phase-3d): Tier-2 going-concern + UI polish + fundamentals resilience
Closes two-thirds of the v1.0 milestone. Ships the Tier-2 going-concern defense + 3 frontend UI components. The 8-K event defenses (4.02 + 4.01) are deferred to Phase 4 due to SEC API throttling encountered during integration (3 workflow_dispatch attempts hit the timeout).
After this merges, PR 3e ships Tier-3 (Beneish + Dechow + Honest Limitations) and tags v1.0.
Scope shipped
Tier-2 event defenses (1 of 3 originally planned)
- Defense #8 going_concern_disclosure ✅ shipped (annotate-only)
  - `\b` boundaries + `[\s\-]+` flex
- Defense #9 non_reliance_filing 🟡 deferred to Phase 4
- Defense #10 auditor_change 🟡 deferred to Phase 4

Deferral rationale: 8-K parsing on 502 stocks × ~7 filings each (~3,500 fetches per cold run) hit the 45-min workflow timeout in three attempts (#12 commit a16f4e3, #13 commit 226840d Option B 10-K parser bypass, #14 commit 6b3c565 Scenario A 8-K parser bypass + 90-min timeout). An architectural change to a background pre-cache layer is needed. Tracked: Issue (Phase 4) for re-enablement.
Frontend UI
Schema additions
- `StockDetail.tier2_events` (dict | None, 5-field shape)
- `Metadata.tier2_coverage_pct` (float | None)
- `Metadata.fundamentals_coverage_pct` (float | None), NEW
- `Metadata.fundamentals_latency_p50_seconds` (float | None), NEW
- `Metadata.fundamentals_latency_p95_seconds` (float | None), NEW
- `Tier2Events` TypeScript interface mirror

Fundamentals resilience (added during PR 3d)
SEC API throttling encountered during workflow_dispatch attempts revealed retry-amplification cost (~60-90s per stuck stock under exponential backoff). Added:
- `stop_after_delay(30) | stop_after_attempt(2)`, `wait_exponential(max=8s)`
- `fut.result(timeout=45)`

Production verification (commit 4805741, Run #15)
- `0.5.0-phase3c` → `0.6.0-phase3d`
- mos_trailing_ic_smoke: -0.2216 (momentum regime artifact, same characteristic as run #11)
- going_concern_disclosure: 54 stocks (high FP rate, Phase 4 will refine; see Issue C)
- non_reliance_filing: 0 (deferred, feature flag working)
- auditor_change: 0 (deferred, feature flag working)
- data_quality_input_corruption: 8 (AMCR/BKR/CHTR/ERIE/PSKY/RTX/SPG/VTRS; exact match to run #11)

Tests
Defense scorecard at v0.6.0-phase3d
- non_reliance_filing: deferred to Phase 4
- extreme_{method}_estimate, stale_filing_soft, data_quality_input_corruption, going_concern_disclosure
- auditor_change: deferred to Phase 4

Reviewer checklist
- `_EIGHT_K_DEFENSES_ENABLED = False` confirmed

Post-merge actions
- `claude/phase-3d-tier2-events`
- (`issue_8k_events_phase4.md`)
- (`issue_fundamentals_resilience_phase4.md`)
- (`issue_going_concern_phrase_refinement.md`, NEW)
- (`issue_log_message_update.md`, NEW low-priority)

Defense Playbook lineage (revised)
Generated with Claude Code · Tested with Anthropic API