feat(defense): add share_count_extraction_missing annotate (closes #176) #181
Merged
New annotate-only defense flag closing the visibility gap surfaced by the stock-detail-auditor dry-run on PR #175. STZ on the 2026-05-14 cron shipped with market_cap=null and risk_flags=[] because shares_outstanding failed XBRL extraction despite revenue + balance sheet extracting cleanly. The new flag share_count_extraction_missing fires when `shares_outstanding is None AND revenue > 0 AND total_assets > 0` — narrow guard distinguishing partial XBRL extraction from "entire extraction broken" (issue #15 territory). Annotate-only per portable-annotate-before-veto: the existing data_quality_input_corruption veto keeps its shares_outstanding=None silence contract (issue #18 / test_D3) so the two pathways stay coherent. Asymmetry tests lock the None-vs-zero behavior since shares_outstanding=0 is a legitimate edge (not extraction failure). STZ is rank 308 so no Top-5 impact either way; promotion to veto deferred to the Q3 2026-08-19 quarterly cohort audit. Schema 0.9.5-phase4h.5 → 0.9.6-phase4h.6 for the new diagnostic Metadata.share_count_extraction_missing_count: int | None (Rule 18 observability shipped in the same PR as the flag emission). Defense layer headline 28 → 29 emitted boolean flags. Tests 1031 → 1040 (+9). The deeper XBRL-manifest fix (extend _FUNDAMENTALS_REQUIRED_ATTRS with share-class-scoped fact names + cover-page fallback) is a follow-up needing SEC live access. https://claude.ai/code/session_01HHo4UHKc9iKKytkKfxfVnA
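The guard described above can be sketched as follows. This is a minimal illustration, not the project's code: the `Snapshot` dataclass is a hypothetical stand-in for the real per-ticker snapshot type, and treating a missing revenue or total-assets value as "not positive" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """Hypothetical stand-in for the project's per-ticker snapshot."""
    shares_outstanding: Optional[float]
    revenue: Optional[float]
    total_assets: Optional[float]


def check_share_count_extraction_missing(snap: Snapshot) -> bool:
    """True only for partial extraction: share count missing, the rest present."""
    if snap.shares_outstanding is not None:
        # Covers shares_outstanding == 0, which stays silent by design.
        return False
    # Assumption: a missing (None) revenue or total_assets counts as not positive.
    revenue_ok = snap.revenue is not None and snap.revenue > 0
    assets_ok = snap.total_assets is not None and snap.total_assets > 0
    return revenue_ok and assets_ok
```

Checking `is None` rather than truthiness is what keeps the None-vs-zero asymmetry: a reported share count of zero never triggers the annotate.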
Pre-merge production simulation
Diff vs main
Main baseline: Top-10 movers (sorted by |Δcomposite_score|)
dackclup added a commit that referenced this pull request on May 21, 2026
…shares_outstanding (closes #176) (#182)

Issue #176 ships in two PRs:

- PR #181 (visibility): annotate `share_count_extraction_missing` surfaces tickers where shares_outstanding extraction fails.
- This PR (root cause): actually recovers the missing share count.

Live SEC probe (2026-05-21) confirmed STZ files `dei:EntityCommonStockSharesOutstanding` only with Class A / Class B share-class dimensions. The SEC `companyfacts` aggregate API filters out dimensional facts (companyconcept API returns HTTP 404 on STZ for both `dei:EntityCommonStockSharesOutstanding` and `us-gaap:CommonStockSharesOutstanding`), so the primary extraction path via `Company.facts.get_concept` returns None even though revenue + balance sheet extract cleanly.

New `_fetch_shares_from_per_filing_xbrl(company)` pulls the most recent 10-K (falls back to 10-Q if none on file), aggregates `dei:EntityCommonStockSharesOutstanding` across all dimensional contexts at the most-recent `period_instant` (share-count facts are instant-type, not flow-type), and returns the sum. Falls back to `us-gaap:CommonStockSharesIssued` if the dei concept is empty. Wrapped in graceful-degradation try/except: any failure returns None and the upstream PR-#181 annotate keeps firing as the safety net.

Triggered ONLY when the primary extraction returns None AND revenue > 0 AND total_assets > 0 (the PR-#181 signature), bounding universe-wide HTTP cost to ~1-3 tickers per cron (blast radius = 1 on the 2026-05-14 baseline).

Live verification:

- STZ: 172.20M shares (Class A 172.17M + Class B 26K) ✓
- AAPL: 14.78B ✓
- WMT: 7.97B ✓

Tests 1040 → 1049: +9 offline mock tests covering positive STZ-signature aggregation, most-recent period_instant filter, us-gaap fallback chain, six graceful-degradation paths, plus 1 @network STZ live drift-detector (locks the period_instant column + get_facts_by_concept shape against future edgartools API drift).

No schema change: operates at the snapshot-construction layer.

https://claude.ai/code/session_01HHo4UHKc9iKKytkKfxfVnA

Co-authored-by: Claude <noreply@anthropic.com>
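The aggregation step in this commit message (sum the share-count facts across share-class contexts at the most recent instant, fall back to the us-gaap concept, return None on any failure) can be sketched in isolation. This sketch deliberately avoids the edgartools API: it assumes the facts have already been flattened into plain dicts with hypothetical keys `concept`, `period_instant` (ISO date string), and `value`, and the helper names are illustrative.

```python
from typing import Iterable, Mapping, Optional

DEI_SHARES = "dei:EntityCommonStockSharesOutstanding"
USGAAP_SHARES = "us-gaap:CommonStockSharesIssued"


def sum_latest_instant(facts: Iterable[Mapping], concept: str) -> Optional[float]:
    """Sum every context of `concept` reported at its most recent instant."""
    rows = [f for f in facts if f["concept"] == concept and f.get("value") is not None]
    if not rows:
        return None
    # ISO-8601 date strings sort chronologically, so max() picks the latest instant.
    latest = max(r["period_instant"] for r in rows)
    return sum(float(r["value"]) for r in rows if r["period_instant"] == latest)


def shares_from_filing_facts(facts: Iterable[Mapping]) -> Optional[float]:
    """Graceful degradation: any malformed input yields None instead of raising."""
    try:
        facts = list(facts)
        total = sum_latest_instant(facts, DEI_SHARES)
        if total is None:
            # dei concept empty: fall back to the us-gaap concept.
            total = sum_latest_instant(facts, USGAAP_SHARES)
        return total
    except Exception:
        return None
```

Summing at a single instant matters because share counts are point-in-time values: adding Class A and Class B from the same date gives the total, while mixing dates would double count.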
This was referenced May 23, 2026
dackclup added a commit that referenced this pull request on May 23, 2026
…ing (#220)

Two cron-#3 audit follow-ups (2026-05-23) folded into one PR. Both surfaced by stock-detail-auditor + root-caused by edgar-debugger.

## Bug 1: DD eps_diluted XBRL single-period mis-parse

DD shipped 2026-05-23 with `eps_diluted=0.39` against `NI/shares = $7M/410M = $0.017` (~23× off). Root cause: `compute/main.py` `_build_raw_metrics` was passing `snapshot.eps_diluted` raw to RawMetrics. The XBRL `EarningsPerShareDiluted` concept returns the **latest single-period value** per `fundamentals.py:114-117`; for a quarterly filer that's one quarter's EPS, not TTM. `pe_ratio_ttm` was already on the NI/shares path since audit #6 / PR #49, so the valuation chain held internal consistency, but the EPS display field on /stock/DD rendered the wrong number to users.

Fix: compute `ttm_eps = NI / shares` once and use for both `eps_basic` + `eps_diluted` display fields. The basic-vs-diluted spread on the S&P 500 is typically < 1-3%, within display precision. Negative net_income preserves sign (loss-year tickers show "−$0.42 EPS" not "null"). `pe_ratio_ttm` formula unchanged.

## Bug 2: STZ shares_outstanding fallback silent failure

STZ on 2026-05-23 ran with `shares_outstanding=null` + `market_cap=null` despite PR #182's per-filing XBRL fallback. 2026-05-21 live SEC probe confirmed the fallback works (returns 172.20M). Two days later under cron load it returned None silently. Root cause: PR #182's outer `except: return None` was bare, with no log line on the failure path. The operator couldn't distinguish transient SEC 429 from structural XBRL drift without re-running a live probe. PR-3d amplification pattern parallel: graceful degradation correct, but silence kills observability.

Fix:

- Thread `ticker` arg into `_fetch_shares_from_per_filing_xbrl` (optional kwarg, back-compat preserved; existing 8 offline tests still pass without the kwarg)
- Emit `logger.warning("shares_outstanding fallback FAILED for %s — %s: %s", ticker, type(e).__name__, e)` on the outer except; the message includes a note that `share_count_extraction_missing` annotate will fire as the safety net
- Inner two `except` blocks (filings.head() + get_facts_by_concept) log at DEBUG so the failure mode is distinguishable in verbose mode without spamming default-level logs

Annotate `share_count_extraction_missing` (PR #181) keeps firing upstream; this PR is observability-only, not a new recovery path.

## Tests (+7)

`tests/test_main.py` (+5):

- `test_build_raw_metrics_eps_diluted_derived_from_ni_not_xbrl_singleperiod` pins the DD case: snap.eps_diluted=0.39 + NI=7M + shares=410M → RawMetrics.eps_diluted = 7M/410M ≈ $0.017 (NOT 0.39)
- `test_build_raw_metrics_eps_preserves_negative_sign_on_loss_year` loss-year shows signed EPS, pe_ttm null
- `test_build_raw_metrics_eps_null_when_shares_outstanding_missing` STZ regression case: eps fields null, no exception
- `test_build_raw_metrics_eps_null_when_zero_shares` defensive guard
- `test_build_raw_metrics_pe_ttm_unchanged_by_eps_fix` audit-#6 / PR #49 regression guard: pe_ttm logic preserved

`tests/test_ingest/test_fundamentals_xbrl_fallback.py` (+2):

- `test_per_filing_fallback_emits_warning_on_outer_except` pins the logger.warning emission with caplog when get_filings raises
- `test_per_filing_fallback_ticker_arg_optional_for_back_compat` pins the back-compat path: no ticker kwarg → "?" sentinel in the log message, existing call sites unbroken

Tests 1049 → 1056 (+7). Pre-existing optional-deps skips (ipca / qlib / OSAP, `.[factors]` extra) unaffected.
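The Bug 1 derivation pinned by the `tests/test_main.py` cases above reduces to one guarded division. The sketch below assumes a standalone helper named `ttm_eps` with the argument names shown; the real logic lives inside `_build_raw_metrics`, whose signature is not given here, and rejecting negative share counts alongside zero is an added assumption.

```python
from typing import Optional


def ttm_eps(net_income_ttm: Optional[float],
            shares_outstanding: Optional[float]) -> Optional[float]:
    """Trailing-twelve-month EPS as net income divided by shares outstanding."""
    if net_income_ttm is None or shares_outstanding is None:
        # STZ regression case: missing share count yields null EPS, no exception.
        return None
    if shares_outstanding <= 0:
        # Defensive guard against division by zero (negatives rejected by assumption).
        return None
    # The sign of net income is preserved, so loss years produce negative EPS.
    return net_income_ttm / shares_outstanding


# DD case from the commit message: 7M / 410M, far from the single-period 0.39.
dd_eps = ttm_eps(7_000_000, 410_000_000)
```

Using the same value for both `eps_basic` and `eps_diluted` trades the small basic-vs-diluted spread for consistency with the existing `pe_ratio_ttm` path.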
## Verification

- `ruff check .`: clean
- `python -m pytest tests/ -m "not network" --ignore=tests/test_features/test_osap_e2e_integration.py --ignore=tests/test_ingest/test_osap.py`: **1103 passed, 7 skipped, 20 deselected**
- No schema / Pydantic / TypeScript / snapshot triple touch
- No frontend touch

## Issues filed in parallel (cron-#3 audit follow-ups)

- **#217** stock-detail-auditor factor-exposure proxy heuristic (OSAP false-positive prevention)
- **#218** verify-helper Section L OSAP invariant assertion

## Cross-references

- Issue #176 / PR #181 / PR #182: STZ shares_outstanding recovery ladder this PR completes
- Audit #6 / PR #49: `pe_ratio_ttm` NI/shares pattern this PR extends to `eps_diluted` / `eps_basic` display fields
- 2026-05-23 cron #3 stock-detail-auditor + edgar-debugger reports

Co-authored-by: Claude <noreply@anthropic.com>
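The Bug 2 observability fix in this commit (keep returning None on failure, but log the exception type and ticker) can be sketched as a wrapper. The wrapper name `fetch_shares_with_logging`, the `fetch` callable, and the logger name are hypothetical; the format string is modeled on the one quoted in the commit message, and the "?" sentinel follows the back-compat test description.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("shares_fallback_sketch")  # hypothetical logger name


def fetch_shares_with_logging(
    fetch: Callable[[], Optional[float]],
    ticker: Optional[str] = None,
) -> Optional[float]:
    """Run a fallback fetch; on failure log a WARNING and degrade to None."""
    label = ticker if ticker is not None else "?"  # sentinel when no ticker is passed
    try:
        return fetch()
    except Exception as e:
        # The exception type lets an operator tell a transient HTTP error
        # apart from a structural parsing problem without a live re-probe.
        logger.warning(
            "shares_outstanding fallback FAILED for %s - %s: %s "
            "(share_count_extraction_missing annotate will fire)",
            label, type(e).__name__, e,
        )
        return None
```

The return contract is unchanged from the silent version, so callers that relied on None as the failure signal keep working.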
dackclup pushed a commit that referenced this pull request on May 24, 2026
Addresses release-captain BLOCKED-ON-PRE-FLIGHT blocker #3 from the v1.3.0 tag attempt: PHASE_STATUS.md / SKILL.md / WORKFLOW.md were 3 days + ~32 PRs stale (last touched PR #171, 2026-05-21). Brings all three docs current to main HEAD 1ff6c11 so the release-captain ladder can re-attempt cleanly.

PHASE_STATUS.md

- Header date 2026-05-21 → 2026-05-24
- Current state table: schema 0.9.4-phase4h.4 → 0.10.2-phase4.5e; defense layer 27 → 32 emitted flags; subagent inventory 14 → 18 (named tier roster: 4 opus / 14 sonnet); skill inventory 42 → 43; production run a16c887 → 9015748 (cron #3 2026-05-23); release-tag line annotated with v1.3.0 target pending
- Recently-merged block: refreshed to PR #170 → PR #237 (~36 entries with commit shas, chronological), drops the stale PR #147-#169 block
- Next-deliverables list: 5 items updated (Phase 4.5e PR 5 cluster weight promotion / Issue #67 sector-CoE flip / v1.3.0 release tag gate / Phase 4i.1-4j.1-4k.1 factor integrations / Phase 5 ML meta-learner)
- Open issues line: drops resolved #155 (closed via PR #160), refreshes #41 (15 open advisories, zero exploitability on static-export), refreshes #67 (data-collection merged PR #204)

SKILL.md

- Schema-version table: 7 new rows added in reverse-chronological order (matches existing 0.9.x convention) for `0.9.5` → `0.9.6` → `0.9.7` → `0.9.8` → `0.10.0` → `0.10.1` → `0.10.2` covering PRs #180/#181/#183/#204/#205/#222/#224. Each row carries PR # + 1-line scope + backward-compat note + literature anchor.

WORKFLOW.md

- Phase Overview table 4.5 row marked ✅ DONE 2026-05-23 + 10b5-1 filter scope note
- SEC Filing Roadmap Form 4 row flipped "planned" → "active" with 4-PR ladder reference (#167/#205/#222/#224 + 100% coverage on cron #3)
- Phase 4.5e task list: 5 items flipped `[ ]` → `[x]` with per-PR commits + methodology-scientist Mode B verdicts inline + Aboody et al. 2010 §3.2 weight-promotion gate noted
- Phase 4.5 Acceptance Criteria: all 9 items flipped to `[x]` with completion evidence (cron #3 / methodology verdicts / PR refs)
- Phase 4.5f tag item: flipped `[ ]` → `[x]` (`v1.2.0-phase4.5` cut 2026-05-17 at 6d414a9)

PHASE_STATUS_INFLIGHT.md

- Append new "(this PR)" entry under In-flight section per the PR #237 side-file convention. Documents the doc-refresh scope + cross-refs to release-captain blockers 1/2/4/5 still pending.

Lockstep

- PR #237's PHASE_STATUS_INFLIGHT.md side-file pattern handles the §Conventions "ship with every PR" rule for this doc-only PR
- No CLAUDE.md / AGENTS.md substantive change required: the in-flight entry lives in the side-file per the new convention
- No compute / schema / scoring / valuation / frontend / Python / TypeScript code change
- Unblocks v1.3.0 tag blocker #3; blockers 1 (wrong-branch), 2 (pyproject.toml 0.3.0 → 1.3.0), 4 (production output 1 cron cycle behind code), and 5 (release notes draft scope) still need resolution before tag cut
dackclup added a commit that referenced this pull request on May 24, 2026
…239) Co-authored-by: Claude <noreply@anthropic.com>
This was referenced May 25, 2026
This was referenced Jun 2, 2026
Summary
Closes the visibility gap raised by issue #176 (STZ on the 2026-05-14 cron shipped with `market_cap: null` and `risk_flags: []` because `shares_outstanding` failed XBRL extraction despite revenue + balance sheet extracting cleanly; likely cause: Constellation Brands files share-count via a share-class-scoped XBRL fact name that `_FUNDAMENTALS_REQUIRED_ATTRS` does not yet cover).

This PR is the surface-the-bug fix (issue #176 step 4). The deeper root-cause fix, extending the XBRL manifest with share-class-scoped fact names + cover-page fallback, is a follow-up that needs SEC live access to discover the actual fact name STZ files under.
Approach trade-off (decided up-front)
- Annotate-only (chosen): keeps the `data_quality_input_corruption` veto's existing `shares_outstanding=None` silence contract (issue #18 / `test_D3`). Composite rank untouched. Per `portable-annotate-before-veto`.
- Veto promotion (Pattern 4 inside `_data_quality_input_corruption`): `_snap()` fixture updates across ~30 existing tests. Rejected.

STZ is currently rank 308 so the choice does not affect Top-5 either way. Promotion to veto deferred to the Q3 2026-08-19 cohort audit after the first cron's firing rate confirms the pattern is narrow.
What changed
- `compute/scoring/risk_overlay.py`: new public detector `check_share_count_extraction_missing(snap) -> bool`. Pattern: fires iff `shares_outstanding is None AND revenue > 0 AND total_assets > 0` (narrow guard distinguishing partial extraction from full-blackout).
- `compute/main.py`: emit `share_count_extraction_missing` to `valuation_warnings` in the per-ticker loop; increment `share_count_extraction_missing_count`; thread into `Metadata(...)`.
- `compute/output/schemas.py` + `frontend/lib/types.ts` + `frontend/lib/schema-snapshot.json`: triple-lockstep, new `Metadata.share_count_extraction_missing_count: int | None = None`.
- `compute/config.py`: schema bump `0.9.5-phase4h.5` → `0.9.6-phase4h.6` (additive optional field → PATCH per schema-versioning convention).
- `tests/test_scoring/test_risk_overlay.py`: +9 unit tests covering positive STZ-signature + 7 silence guards + 1 explicit None-vs-zero asymmetry lock.
- `tests/test_config.py`: bump schema-version pin.
- `CLAUDE.md` + `AGENTS.md` + `docs/METHODOLOGY.md`: lockstep entries per §Conventions ship-with-every-PR rule.

Defense layer + diagnostics
- `Metadata.share_count_extraction_missing_count` ships in the SAME PR as the flag emission so the next cron's firing rate is visible at-a-glance.
- `shares_outstanding == 0` is a legitimate edge (not extraction failure) and stays silent; only `None` triggers the annotate.

Blast radius
Scan of `frontend/public/data/stocks/*.json` on the 2026-05-14 cron: 1/502 tickers match the signature (STZ, rank 308). Single-stock blast radius confirms the narrow guard.

Test plan
- `ruff check compute/ tests/` clean
- `python -m compute.output.schema_check` reports in-sync
- `python -m pytest -m "not network" -q`: 1040 passed (1031 → 1040, +9 new tests)
- `cd frontend && npx --no -- tsc --noEmit` clean
- … `valuation_warnings`; everyone else unchanged; raw `composite_score` per Rule 16 untouched)

Constraints upheld
- … `composite_score` untouched)
- `portable-annotate-before-veto` (annotate-first per project discipline)

Follow-ups (out of scope)
- Extend `_FUNDAMENTALS_REQUIRED_ATTRS` with `dei:EntityCommonStockSharesOutstanding` + share-class-scoped variants; add 10-Q cover-page fallback. Needs SEC live access.

https://claude.ai/code/session_01HHo4UHKc9iKKytkKfxfVnA
Generated by Claude Code
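The annotate-only emission and its diagnostic counter, as described under "What changed", can be sketched as a single pass over the universe. The `TickerResult` dataclass and `annotate_universe` function are hypothetical names for illustration; only the flag name, the `valuation_warnings` list, and the guard condition come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional

FLAG = "share_count_extraction_missing"


@dataclass
class TickerResult:
    """Hypothetical per-ticker record; field names are illustrative."""
    ticker: str
    shares_outstanding: Optional[float]
    revenue: Optional[float]
    total_assets: Optional[float]
    composite_score: float
    valuation_warnings: list = field(default_factory=list)


def annotate_universe(results: list) -> int:
    """Append the flag where the signature matches; return the firing count."""
    count = 0
    for r in results:
        partial = (
            r.shares_outstanding is None
            and (r.revenue or 0) > 0
            and (r.total_assets or 0) > 0
        )
        if partial:
            # Annotate only: composite_score is never modified here.
            r.valuation_warnings.append(FLAG)
            count += 1
    return count
```

The returned count is the kind of value that would feed a field like `Metadata.share_count_extraction_missing_count`, making the firing rate visible per run.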