diff --git a/.claude/agents/README.md b/.claude/agents/README.md index 13420f86a..0847f1058 100644 --- a/.claude/agents/README.md +++ b/.claude/agents/README.md @@ -22,7 +22,7 @@ skills are loaded each session, so the main agent already has the trigger map. Subagents add value where context isolation or parallelism specifically helps. -## The current set (14) +## The current set (15) Organized into four tiers — **core** (narrow project invariants), **lifecycle** (engineering-org roles for PR / release / phase @@ -31,7 +31,7 @@ project knowledge), and **operations** (orchestrators + ops roles). This is the "full enterprise dev team" topology — every tier maps to roles a 20-person engineering org would have: -### Tier 1 — Core (4) +### Tier 1 — Core (5) | Subagent | Enterprise role analogue | Trigger | Model | Tools | |---|---|---|---|---| @@ -39,6 +39,7 @@ roles a 20-person engineering org would have: | [`schema-sentinel`](schema-sentinel.md) | API / contract governance | When `schemas.py` / `types.ts` / `schema-snapshot.json` changes; CI schema-drift failures | sonnet | Read, Bash, Grep | | [`defense-layer-auditor`](defense-layer-auditor.md) | QA / data observability | After scoring / valuation changes; after weekly cron lands; before PR Ready-flip on scoring touches | sonnet | Read, Bash, Grep, Glob | | [`edgar-debugger`](edgar-debugger.md) | On-call for downstream dep | SEC EDGAR ingest test failures; live-run hangs; rate-limit / edgartools drift errors | sonnet | Read, Bash, Grep, Glob | +| [`stock-detail-auditor`](stock-detail-auditor.md) | Data-correctness reviewer | Post-cron; pre-release; "ตรวจ data หุ้น" / "check stock data correctness" / "audit the output"; prefilter caps LLM-judgment at ≤ 20 tickers | sonnet | Read, Bash, Grep, Glob | ### Tier 2 — Lifecycle (4) @@ -118,6 +119,7 @@ User: "tag release v1.3.0" / "ตัด release" [release-captain] (opus) drives the ladder; spawns in parallel: ├─ schema-sentinel ──► no schema drift on release commit ├─ 
defense-layer-auditor ──► Section A-J PASS on latest output + ├─ stock-detail-auditor ──► per-stock data correctness (prefilter + ≤ 20 LLM verdicts) ├─ security-reviewer ──► CVE + secrets baseline ├─ performance-engineer ──► cron latency within budget ├─ dependency-auditor ──► no new CVEs since last tag diff --git a/.claude/agents/stock-detail-auditor.md b/.claude/agents/stock-detail-auditor.md new file mode 100644 index 000000000..6a740a970 --- /dev/null +++ b/.claude/agents/stock-detail-auditor.md @@ -0,0 +1,177 @@ +--- +name: stock-detail-auditor +description: Data-correctness auditor for the per-stock JSON the frontend renders (frontend/public/data/stocks/.json + rankings.json + metadata.json). Pre-filters the universe deterministically for outliers (range / consistency / Rule 16 invariant / known-issue overlap), then does LLM-judgment review on ≤ 20 flagged tickers. Read-only. Fires at hand-off moments (post-cron, pre-release, "ตรวจ data หุ้น"), not on every code edit. Covers OUTPUT correctness; FORMULA correctness is the methodology-scientist slot. +tools: Read, Bash, Grep, Glob +model: sonnet +--- + +You audit QuantRank's per-stock output JSON for data-correctness +bugs that would render incorrect details on the app's `/stock/ +[ticker]` pages. Your job is to find broken or suspicious data +BEFORE users see it — not to validate the underlying formulas (that +is `methodology-scientist`'s slot — see escalation table below). + +## Read these first (every invocation) + +1. `CLAUDE.md` §Phase status — current schema version, active + veto / annotate count, known gotchas (issues #7 / #10 / #11 / + #16 / #18) +2. `compute/output/schemas.py` — authoritative shape for + `StockDetail` / `Metadata` / `RawMetrics` / `PillarScores` / + `DataQuality` +3. 
The cron output: + - `frontend/public/data/metadata.json` + - `frontend/public/data/rankings.json` + - sample of `frontend/public/data/stocks/*.json` + +## Workflow + +### Step 1 — Recon (always) + +```bash +python3 -c " +import json, glob +md = json.load(open('frontend/public/data/metadata.json')) +rk = json.load(open('frontend/public/data/rankings.json')) +print('schema_version:', md.get('version') or md.get('schema_version')) +print('universe_size:', md.get('universe_size')) +print('git_commit:', md.get('git_commit')) +print('generated_at:', md.get('generated_at') or md.get('cron_ts')) +print('ranking count:', len(rk)) +print('files:', len(glob.glob('frontend/public/data/stocks/*.json'))) +" +``` + +### Step 2 — Deterministic outlier prefilter + +Walk all stock JSON files. Flag every ticker that violates any of +the rules below. Output a tight table grouped by severity. **No +LLM in this loop.** + +#### Range / shape rules (schema violations → always flag) + +- `composite_score` outside `[0, 100]` +- Any non-null entry in `pillar_scores.{quality, value, growth, + momentum, health, profitability, technical, risk, sentiment, + ml}` outside `[0, 100]` +- `current_price` ≤ 0 or None when `has_history` is True +- `market_cap` ≤ 0 or None +- `fair_price.median` ≤ 0 or > 10000 (the $10K ceiling guard + from `compute/valuation/ensemble.py`) +- `rank` ≤ 0 or > `metadata.universe_size` + +#### Consistency rules (input corruption → always flag) + +- `abs(market_cap - current_price * raw_metrics.shares_outstanding) + / market_cap > 0.05` — > 5% gap is **issue #10 + `shares_outstanding` territory**; expect overlap with the + `data_quality_input_corruption` flag (~12 tickers known affected) +- `raw_metrics.revenue < 0` (impossible for revenue) +- `raw_metrics.free_cash_flow != raw_metrics.operating_cash_flow - + raw_metrics.capex` within ±$1M tolerance, when all three present +- `abs(raw_metrics.eps_diluted) > 500` — likely XBRL fact unit + mis-parse (per-share value > $500 
is essentially never real) +- `fair_price.mos_pct` outside `[-500, 500]` (absolute % — > 5× MoS + is data error, not signal) + +#### Rule 16 invariant (annotate-and-veto-Top-N) + +- `entered_top5 == True` AND `risk_flags` is non-empty → **Rule 16 + violation**, see `SKILL.md` Rule 16. The annotate-and-veto + contract requires a flagged top-5 stock to lose the badge. + +#### Known-issue overlap (don't double-report, note for context) + +- Ticker carries `data_quality_input_corruption` in `risk_flags` → + already caught by Step 7.5 sanity guard (issues #10 / #18) +- Ticker in Financials sector with `sloan_accruals_top_decile` + flag → known **issue #7** (Sloan over-fires on Financials) +- Ticker with `value_trap_risk` flag → may be **issue #11** noise + (single-period equity denominator) — cross-check whether RIM was + the only method dropped + +### Step 3 — LLM-judgment review (cap ≤ 20 tickers) + +Take the top-20 most-suspicious tickers from Step 2 (one row per +ticker; dedup if a ticker hit multiple rules; rank by severity +SCHEMA > CONSISTENCY > RULE_16 > KNOWN_ISSUE). For each: + +- Read the full `frontend/public/data/stocks/.json` +- Cross-reference `risk_flags`, `valuation_warnings`, and + `pillar_scores` to decide: **real_outlier** (data is plausible, + flag is informative) vs **broken_data** (something upstream + mis-parsed) +- For the `broken_data` verdict, point at the most likely upstream + cause: + - XBRL fact extraction → `compute/ingest/fundamentals.py` + - Price / market_cap → `compute/ingest/prices.py` + - 10-K narrative parse → `compute/ingest/filing_text.py` + - Sector classification → universe source (Wikipedia scrape) + +## Output discipline + +Reply with exactly this structure — terse. Under 400 words total. + +``` +Stock Detail Audit — + +Cron grounding: +- schema_version: +- universe_size: <502> +- git_commit: +- generated_at: + +Deterministic prefilter (Step 2): +- SCHEMA_VIOLATION: + · · · + ... +- CONSISTENCY_BUG: + · · · + ... 
+- RULE_16_VIOLATION: + · · entered_top5=True · risk_flags=[] +- KNOWN_ISSUE_OVERLAP: (deduped from above) + · · + +LLM-judgment (Step 3, ≤ 20): +- · · · +... + +Summary: / / / + violations. +Top suspicion: (). + +Next: . +``` + +## What you do NOT do + +- DO NOT modify `frontend/public/data/*.json` — frontend output is + CI-job-only per `AGENTS.md` §Boundaries. +- DO NOT propose threshold recalibrations — that's the methodology + layer's job, not yours. +- DO NOT validate the underlying formulas (Altman Z weights, Beneish + M coefficients, etc.) — scope is "is the data internally + consistent + within sane ranges", not "is the formula right". +- DO NOT touch more than 20 individual stock files in Step 3 — the + prefilter exists exactly to bound LLM-judgment cost. +- DO NOT spawn other agents from inside this agent — escalate via + the table below and let the user pick the next step. +- DO NOT re-derive the verification ladder; if the user wants the + full Section A-H scan, point them at + `python .claude/skills/verify-production-output/helper.py`. 
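A minimal sketch of how the Step 2 range, consistency, and Rule 16 checks might be expressed deterministically. The field names are taken from the rule list above; the function name `prefilter_stock`, the helper `_outside`, the finding labels, and the assumption that each stock file loads into a plain dict are illustrative only and are not confirmed by `compute/output/schemas.py`.

```python
from typing import Optional


def _outside(value: Optional[float], low: float, high: float) -> bool:
    """True when a non-null value falls outside the closed range [low, high]."""
    return value is not None and not (low <= value <= high)


def prefilter_stock(stock: dict, universe_size: int) -> list:
    """Return (severity, reason) pairs for one stock record (hypothetical layout)."""
    findings = []
    raw = stock.get("raw_metrics") or {}
    fair = stock.get("fair_price") or {}
    price = stock.get("current_price")
    cap = stock.get("market_cap")

    # Range / shape rules
    if _outside(stock.get("composite_score"), 0, 100):
        findings.append(("SCHEMA_VIOLATION", "composite_score outside [0, 100]"))
    for pillar, value in (stock.get("pillar_scores") or {}).items():
        if _outside(value, 0, 100):
            findings.append(("SCHEMA_VIOLATION", f"pillar_scores.{pillar} outside [0, 100]"))
    if stock.get("has_history") and (price is None or price <= 0):
        findings.append(("SCHEMA_VIOLATION", "current_price missing or <= 0"))
    if cap is None or cap <= 0:
        findings.append(("SCHEMA_VIOLATION", "market_cap missing or <= 0"))
    median = fair.get("median")
    if median is not None and (median <= 0 or median > 10_000):
        findings.append(("SCHEMA_VIOLATION", "fair_price.median <= 0 or > 10000"))
    rank = stock.get("rank")
    if rank is not None and (rank <= 0 or rank > universe_size):
        findings.append(("SCHEMA_VIOLATION", "rank outside universe"))

    # Consistency rules
    shares = raw.get("shares_outstanding")
    if cap is not None and cap > 0 and price is not None and shares is not None:
        if abs(cap - price * shares) / cap > 0.05:
            findings.append(("CONSISTENCY_BUG", "market_cap vs price * shares gap > 5%"))
    revenue = raw.get("revenue")
    if revenue is not None and revenue < 0:
        findings.append(("CONSISTENCY_BUG", "negative revenue"))
    fcf = raw.get("free_cash_flow")
    ocf = raw.get("operating_cash_flow")
    capex = raw.get("capex")
    if fcf is not None and ocf is not None and capex is not None:
        if abs(fcf - (ocf - capex)) > 1_000_000:  # $1M tolerance
            findings.append(("CONSISTENCY_BUG", "free_cash_flow != ocf - capex"))
    eps = raw.get("eps_diluted")
    if eps is not None and abs(eps) > 500:
        findings.append(("CONSISTENCY_BUG", "abs(eps_diluted) > 500"))
    if _outside(fair.get("mos_pct"), -500, 500):
        findings.append(("CONSISTENCY_BUG", "fair_price.mos_pct outside [-500, 500]"))

    # Rule 16 invariant: a flagged stock must not keep the top-5 badge
    if stock.get("entered_top5") is True and stock.get("risk_flags"):
        findings.append(("RULE_16_VIOLATION", "entered_top5 with non-empty risk_flags"))
    return findings
```

Findings come back in severity order (schema, then consistency, then Rule 16), so a caller could take each ticker's first finding as its Step 3 ranking key. The known-issue overlap rules are left out here because they only annotate existing flags rather than detect new ones.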
+ +## Escalation paths + +If a finding falls outside this agent's scope, surface it in the +"Next" line and let the user spawn the specialist: + +| Finding category | Escalate to | +|---|---| +| Formula derivation looks wrong (e.g., Altman Z coefficients drift) | `methodology-scientist` | +| Schema shape mismatch (field missing / type wrong) | `schema-sentinel` | +| Defense-layer count vs prior run regressed | `defense-layer-auditor` | +| Specific ticker hangs SEC fetch / 429 / 403 | `edgar-debugger` | +| Multi-ticker pattern suggesting cron-wide corruption | `incident-commander` | +| Frontend rendering bug given correct data | `frontend-design-reviewer` | diff --git a/AGENTS.md b/AGENTS.md index 0f50858e5..3802aaeb4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -88,7 +88,7 @@ frontend/ # Next.js static site (read/write OK) tests/ # pytest suite docs/ # Academic methodology + research findings .claude/skills/ # 42 loaded skills + phase-N/ planning docs -.claude/agents/ # 14 project-specific subagents in 4 tiers (core + lifecycle + specialized + operations; Claude Code only — Copilot / Cursor / Devin do not auto-route to these; full enterprise dev-team topology with 6 codified coordination flows) +.claude/agents/ # 15 subagents — Tier 1 Core 5 (incl. stock-detail-auditor for per-stock JSON correctness) + Tier 2 Lifecycle 4 + Tier 3 Specialized 4 + Tier 4 Operations 2; Claude Code only — Copilot / Cursor / Devin do not auto-route to these .claude/hooks/ # PostToolUse Bash hooks (log-bash.sh, schema-reminder.sh) wired by .claude/settings.json (Claude Code only — Copilot / Cursor / Devin ignore) .claude/settings.json # Claude Code harness config (hooks, permissions). Per-user overrides go in .claude/settings.local.json (gitignored) .github/workflows/ # ⚠️ ask before editing @@ -551,6 +551,17 @@ configuration. Two PostToolUse Bash hooks ship today: `schema-sentinel` subagent) before commit. Closes the local pre-commit gap left by the schema-drift CI guard. 
+The 15 subagents under `.claude/agents/` follow the **gate-moment +auto-routing policy** in [`CLAUDE.md`](CLAUDE.md) §Auto-routing +policy — most cues fire at "ready to push" / explicit ask / signal +event, not on every edit. This is the reduced-token policy +introduced after the original "spawn-on-every-diff" rule proved +too expensive. Notable Tier 1 addition: `stock-detail-auditor` for +data correctness of per-stock JSON the frontend renders (range / +consistency / Rule 16 / known-issue overlap; prefilter caps +LLM-judgment at ≤ 20 tickers per run; fires post-cron + pre-release ++ "ตรวจ data หุ้น"). + Both hooks are bash + `jq` only, 5-second timeout, fail-open on missing dependencies / unwritable filesystem / empty stdin. Copilot / Cursor / Devin do NOT execute `.claude/hooks/` — those tools diff --git a/CLAUDE.md b/CLAUDE.md index 648607fc4..b899f7ec8 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -31,7 +31,7 @@ backing. | `frontend/public/data/` | Compute output: `metadata.json` + `rankings.json` + `stocks/.json` | | `tests/` | pytest suite (offline + `@network` gated; see CI for current count) | | `.claude/skills/` | 42 invocation-triggerable skills + phase planning docs. See [`THIRD_PARTY_NOTICES.md`](THIRD_PARTY_NOTICES.md) for vendoring / license posture per source. | -| `.claude/agents/` | 14 project-specific subagents in 4 tiers: **Tier 1 Core** (quantrank-reviewer · schema-sentinel · defense-layer-auditor · edgar-debugger), **Tier 2 Lifecycle** (security-reviewer · frontend-design-reviewer · release-captain · phase-coordinator), **Tier 3 Specialized** (test-engineer · methodology-scientist · performance-engineer · dependency-auditor), **Tier 4 Operations** (docs-reviewer · incident-commander). 
Spawned via the `Agent` tool with a separate context window; see [`.claude/agents/README.md`](.claude/agents/README.md) for the routing matrix + 6 coordination flows (pre-push gate / release ladder / new-defense flow / incident response / review escalation / quarterly audit). |
+| `.claude/agents/` | 15 project-specific subagents in 4 tiers, including the new data-correctness reviewer in Tier 1: **Tier 1 Core** (quantrank-reviewer · schema-sentinel · defense-layer-auditor · edgar-debugger · **stock-detail-auditor**), **Tier 2 Lifecycle** (security-reviewer · frontend-design-reviewer · release-captain · phase-coordinator), **Tier 3 Specialized** (test-engineer · methodology-scientist · performance-engineer · dependency-auditor), **Tier 4 Operations** (docs-reviewer · incident-commander). Spawned via the `Agent` tool with a separate context window; see [`.claude/agents/README.md`](.claude/agents/README.md) for the routing matrix + 6 coordination flows (pre-push gate / release ladder / new-defense flow / incident response / review escalation / quarterly audit). |
 | `.claude/hooks/` | Bash hook scripts wired by `.claude/settings.json`. 2 PostToolUse hooks: `log-bash.sh` (append every Bash command to gitignored `.claude/session.log`) + `schema-reminder.sh` (inject reminder when any file in the Pydantic↔TS↔snapshot triple is touched via Write/Edit). Both fail-open (missing `jq` / unwritable FS / empty stdin → exit 0). 5-second timeout each. |

 ## Commands
@@ -115,53 +115,70 @@ for the full 4-step pattern + Section I forcing example.

 ## Auto-routing policy

-Subagents under [`.claude/agents/`](.claude/agents/) auto-spawn on the
-cues below. The main agent MUST spawn them without asking for
-confirmation first — they are all read-only and their cost is bounded.
-Only destructive commands a subagent *proposes* require user
-authorization.
-
-| Cue / situation | Auto-spawn | Mode |
+Subagents under [`.claude/agents/`](.claude/agents/) auto-spawn on
+the cues below — **lean-by-design**.
Most cues fire at GATE moments +(`ready to push` / explicit ask / signal event), **not on every +edit**. Each spawn costs a separate context window; the policy keeps +that cost bounded while preserving the safety net at decision +points. The hook layer covers per-edit reminders that don't need +LLM judgment. + +The main agent MUST spawn without asking for confirmation — all +subagents are read-only. Only destructive commands a subagent +*proposes* require user authorization. + +**Edits alone do NOT auto-spawn.** Editing `schemas.py` / +`compute/scoring/` / `frontend/components/` / docs no longer fires +an agent on the edit. The schema-triple hook covers Pydantic ↔ TS +reminders; everything else batches into one parallel review at +"ready to push". This is the change vs the original wide policy +that fired per-diff. + +| When | Auto-spawn | Notes | |---|---|---| -| Edit to `compute/output/schemas.py` / `frontend/lib/types.ts` / `frontend/lib/schema-snapshot.json` | `schema-sentinel` | Parallel with the edit; report PASS/FAIL before commit | -| Edit to anything under `compute/scoring/` or `compute/valuation/` | `quantrank-reviewer` after the edit set stabilizes; `defense-layer-auditor` if `frontend/public/data/` is committed in the same PR | Sequential — reviewer first, auditor on output | -| Edit to `frontend/components/` / `frontend/app/` | `frontend-design-reviewer` | Parallel; emits Playwright matrix for user spot-check | -| Test failure under `tests/test_ingest/` OR live-run hang OR `429`/`403` from SEC | `edgar-debugger` | On-demand, only when the failure signal appears | -| Edit to `.github/workflows/*.yml` OR new dep added to `pyproject.toml` / `frontend/package.json` OR new env-var read introduced | `security-reviewer` | Pre-push | -| User says "ก่อน push" / "ready to push" / "open PR" / "mark ready" / "ตรวจก่อน push" | `quantrank-reviewer` + `phase-coordinator` Mode B | Parallel pre-push gate | -| User says "tag release" / "cut a release" / "release vX.Y.Z" 
/ "ตัด release" / phase-epic PR just merged | `release-captain` (acts as orchestrator and spawns the others as the ladder demands) | Owns the release ladder | +| Test failure under `tests/test_ingest/` OR live-run hang OR `429`/`403` from SEC | `edgar-debugger` | Signal-driven, on-demand | +| Weekly cron warm-cache > 10 min OR p95 latency > 20s | `performance-engineer` | Signal-driven, on detection | +| Dependabot alert lands OR new dep added to `pyproject.toml` / `frontend/package.json` | `dependency-auditor` + `security-reviewer` | Signal-driven, parallel | +| Production cron fails / hangs / produces corrupt output, OR Vercel deploy breaks, OR schema-snapshot CI fails, OR user says "production is broken" / "site is down" / "incident" | `incident-commander` (P1; orchestrator that fans out to relevant specialists) | Immediate | +| `workflow_dispatch` on `compute-rankings.yml` lands green | `defense-layer-auditor` Section A-J + Section I (Playwright) + `stock-detail-auditor` (per-stock data audit) | Auto post-cron, parallel | +| Quarterly cohort audit scheduled date reached (next 2026-08-19) | `methodology-scientist` Mode C + `defense-layer-auditor` | Scheduled, sequential | +| New defense flag proposed (new risk_flag in `compute/scoring/`) | `methodology-scientist` (validate paper anchor) + `test-engineer` (positive + negative tests) | Rare; sequential — methodology first | +| Threshold / weight constant changed in `compute/scoring/manipulation_index.py` or `earnings_quality.py` | `methodology-scientist` Mode B | Rare; on the edit | +| User says "ก่อน push" / "ready to push" / "open PR" / "mark ready" / "ตรวจก่อน push" | `quantrank-reviewer` + `phase-coordinator` Mode B. 
Conditional batch-mates on the same gate: `schema-sentinel` if schema triple touched · `defense-layer-auditor` if `compute/scoring/` or `compute/valuation/` touched · `frontend-design-reviewer` if `frontend/components/` or `frontend/app/` touched · `docs-reviewer` if any of the 7 docs modified · `security-reviewer` if `.github/workflows/` or new env-var or new dep touched · `test-engineer` if production code added without a test | Parallel pre-push gate; one report cycle | +| User says "ตรวจ data หุ้น" / "check stock data correctness" / "audit the output" / "verify the output" / "ตรวจ output" / pre-release | `stock-detail-auditor` (deterministic prefilter caps LLM-judgment at ≤ 20 tickers) | One sonnet spawn, bounded | +| User says "tag release" / "cut a release" / "release vX.Y.Z" / "ตัด release" / phase-epic PR just merged | `release-captain` (orchestrator; spawns ladder agents as needed) | Owns release ladder | | User asks to create a new `claude/*` branch from a handoff prompt | `phase-coordinator` Mode A | Before first non-trivial edit | | Phase / sub-PR marked complete on this branch | `phase-coordinator` Mode C | After merge / on close | -| `workflow_dispatch` on `compute-rankings.yml` lands green | `defense-layer-auditor` Section A-J + Section I (Playwright) | Automatic post-cron | -| New defense flag proposed (new risk_flag in `compute/scoring/`) | `methodology-scientist` (validate paper anchor) + `test-engineer` (positive + negative tests) | Sequential — methodology first | -| Threshold / weight constant changed in `compute/scoring/manipulation_index.py` or `earnings_quality.py` | `methodology-scientist` Mode B | On the edit | -| Production code added without a corresponding test in the same PR | `test-engineer` | Pre-push | -| Weekly cron warm-cache exceeds 10 min OR p95 latency > 20s | `performance-engineer` | On detection | -| Dependabot alert lands OR new dep added to `pyproject.toml` / `frontend/package.json` | `dependency-auditor` + 
`security-reviewer` | Parallel | -| Any of CLAUDE.md / AGENTS.md / SKILL.md / WORKFLOW.md / PHASE_STATUS.md / README.md / METHODOLOGY.md modified | `docs-reviewer` (substance check; complements `phase-coordinator` Mode B which checks file-touch) | Parallel with the edit | -| Production cron fails / hangs / produces corrupt output, OR Vercel deploy breaks, OR schema-snapshot CI fails, OR user says "production is broken" / "site is down" / "incident" | `incident-commander` (P1; orchestrator that fans out to relevant specialists) | Immediate | -| Quarterly cohort audit scheduled date reached (next 2026-08-19) | `methodology-scientist` Mode C + `defense-layer-auditor` | Sequential — scientist drives | +| Diff > 200 lines on `compute/scoring/` OR user says "full review" / "deep review" | `quantrank-reviewer` with `model: opus` override | Rare; user authorization required | ### Spawn discipline -- **Spawn without asking** for read-only subagents (all 8 are - read-only). Do not pause the user's flow with "should I spawn X?" - — just spawn and report back. +- **Default model = sonnet.** Opus only for cross-domain + orchestration (`incident-commander`, `release-captain`), + literature-heavy validation (`methodology-scientist` on new flag + / threshold), or large-diff reviews (`quantrank-reviewer` with + explicit user authorization for the opus override). 5 agents + currently default to opus by design; the rest are sonnet. +- **Spawn without asking** for read-only subagents — just spawn + and report back. Do not pause the user's flow with "should I + spawn X?". - **Ask before authorizing the destructive command** a subagent proposes (e.g., `release-captain` emits `git tag` + `git push - origin ` — that command needs explicit user authorization per - §Executing actions with care). -- **Skip auto-spawn** if the user explicitly says "skip the X agent", - "don't review this one", "I'll handle it manually" — note the skip - in chat and proceed. 
+ origin ` — that command needs explicit user authorization + per §Executing actions with care). +- **Skip auto-spawn** if the user explicitly says "skip the X + agent", "don't review this one", "I'll handle it manually" — + note the skip in chat and proceed. - **De-duplicate**: if a subagent ran on the same diff within the - last ~10 minutes and the diff hasn't moved, don't re-spawn — point - to the prior result instead. -- **Parallel by default**: when multiple cues fire on the same edit - (e.g., `schemas.py` + `compute/scoring/*` together), spawn the - agents in parallel — they each have their own context window. -- **Disable per-session**: user can `/agents` → toggle off any agent - they don't want auto-routing this session. + last ~10 minutes and the diff hasn't moved, don't re-spawn — + point to the prior result instead. +- **Parallel at gate moments, not on every edit**: when multiple + conditional batch-mates fire at the "ready to push" gate, spawn + them in parallel — they each have their own context window, and + the user gets one consolidated report cycle. +- **Disable per-session**: user can `/agents` → toggle off any + agent they don't want auto-routing this session, or say "spawn + only on explicit ask this session" to force the strictest mode. ## Gotchas @@ -537,6 +554,34 @@ WARNs from the audit, not FAILs): `FairPriceBarChart.tsx` tabular-nums + verdict-badge shape; `RawMetricsTable` + `PillarRadarChart` loose-null 5 instances; `RankingTable` toolbar-search aria-label. +**Lean auto-routing + stock-detail-auditor in flight (this PR)** — +two coupled changes to the agent layer. **(a) Auto-routing policy +rewrite**: 17-row table collapsed and reshaped so most cues fire at +GATE moments (`ready to push` / explicit ask / signal event) instead +of on every edit. The schema-triple hook covers per-edit reminders +that don't need LLM judgment; everything else batches into one +parallel pre-push review. 
Edits to `compute/scoring/` /
+`compute/valuation/` / `frontend/components/` / docs no longer
+spawn an agent; they now ride the next "ready to push" gate as
+conditional batch-mates. Spawn discipline updated: default model
+sonnet (opus only for cross-domain orchestration / methodology /
+large-diff with user authorization). Token economy paragraph added
+to the §Auto-routing intro. **(b) New 15th agent**: Tier 1
+`stock-detail-auditor` (sonnet, read-only) audits the per-stock JSON
+the frontend renders. Step 2 deterministic prefilter walks the
+~502-ticker universe for range / consistency (issue #10
+`shares_outstanding` gap > 5%, |eps_diluted| > 500 XBRL parse
+errors, |mos_pct| > 500%) / Rule 16 invariant (entered_top5 +
+risk_flags) / known-issue overlap (#7 Sloan-Financials, #11
+value_trap noise). Step 3 LLM-judgment review capped at ≤ 20
+tickers (real_outlier vs broken_data + upstream cause). Fires
+post-cron, pre-release, "ตรวจ data หุ้น"; folded into release
+ladder Flow 2 alongside `defense-layer-auditor`. Covers OUTPUT
+correctness; FORMULA correctness remains methodology-scientist's
+slot. No compute / schema / scoring / valuation / frontend code
+change. Closes the gap left when the previous "wide" policy's
+spawn cost compounded across multi-file edits.
+
 **Next deliverables** (pick by appetite):
 - **Phase 4.5e** — Form 4 insider clustering (~3w → v1.3.0; weight slots already declared in `FLAG_WEIGHTS`)