Symptom
frontend/public/data/stocks/ERIE.json (Sat 2026-05-23 9015748) reports raw_metrics.shares_outstanding = 2542.0 for Erie Indemnity Company. Real share count is ~57M (Class A 46M + Class B 2.5K, source: 10-K + Yahoo Finance). The extracted value is off by ~22,000×.
{
  "shares_outstanding": 2542.0,
  "market_cap": 569992.65,
  "risk_flags": ["data_quality_input_corruption"],
  "fair_price": {
    "methods": { /* all 6 methods skip with reason="data_quality_input_corruption" */ },
    "valuation_warnings": ["data_quality_input_corruption"]
  },
  "composite_score": 60.42,
  "rank": 69
}
Defense layer behavior
✅ Veto chain works as designed: data_quality_input_corruption fires on risk_flags, every valuation method skips, median fair price is None.
⚠️ But ranking is still emitted — ERIE shows up at rank #69 with composite 60.42, because the composite is built from non-valuation pillars that don't observe shares_outstanding. Per Rule 16 (annotate-and-veto-Top-N), the rank itself is correct (the stock just can't enter Top-5); but a composite_score=60.42 with no valid valuation is a misleading data shape for downstream consumers (frontend RankingTable renders the number normally).
Root cause hypothesis
Same class as #176 (STZ) + #182 (per-filing XBRL fallback): SEC companyfacts aggregate API filters out dimensional facts. Erie's Class A + Class B share-count facts likely live only with dei:EntityCommonStockSharesOutstanding dimensional contexts, which the aggregate API returns as None. The PR #182 fallback _fetch_shares_from_per_filing_xbrl() should have caught this — but didn't, likely because:
- (Hypothesis 1) ERIE's most recent 10-K has the share count in a dimensional context the fallback doesn't enumerate, OR
- (Hypothesis 2) The fallback ran and returned 2542 from a fragment (e.g., one class only at a partial reporting period), not the sum across all class contexts.
The 2542 value is suspiciously close to the Class B share count (~2,541 per recent filings) — strongly suggests the fallback IS running but returning ONE class only, not summing across both. This is a different bug from STZ (which extracted None and triggered the fallback to compute the sum correctly).
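To make the suspected failure mode concrete, here is a minimal sketch contrasting a first-match extraction with a sum across class contexts. Everything in it is hypothetical: the ShareFact type, the member names, the period date, and both helpers are illustrations, not the code in _fetch_shares_from_per_filing_xbrl, and the Class A figure is a rounded placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShareFact:
    """One dimensional share-count fact (hypothetical shape, illustration only)."""
    class_member: str  # e.g. "CommonClassAMember" (assumed member name)
    period_end: str    # ISO date of the context (placeholder value below)
    shares: float


def first_match(facts: list[ShareFact]) -> Optional[float]:
    """Suspected buggy behavior: return whichever class context comes first."""
    return facts[0].shares if facts else None


def sum_latest_classes(facts: list[ShareFact]) -> Optional[float]:
    """Sum one fact per share class, restricted to the most recent period."""
    if not facts:
        return None
    # ISO dates compare correctly as strings.
    latest = max(f.period_end for f in facts)
    per_class: dict[str, float] = {}
    for f in facts:
        if f.period_end == latest:
            per_class[f.class_member] = f.shares  # last fact wins per class
    return sum(per_class.values())


# Placeholder inputs shaped like the ERIE case described above.
facts = [
    ShareFact("CommonClassBMember", "2026-01-01", 2_542),
    ShareFact("CommonClassAMember", "2026-01-01", 46_000_000),
]
```

With these inputs, first_match returns the Class B count alone, while sum_latest_classes returns the combined raw count. How Class B shares should be weighted relative to Class A is a separate question this sketch does not address.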
Proposed fix scope
edgar-debugger to scope. Likely changes in compute/ingest/fundamentals.py:_fetch_shares_from_per_filing_xbrl:
- Verify the sum-across-dimensions logic actually enumerates all contexts of the dei concept for filers with Class A + Class B (Erie, BRK-B siblings, Class-A/B/C stocks like GOOGL).
- Add a sanity threshold: if extracted shares × current_price < $10M, log a warning + return None (let the upstream share_count_extraction_missing annotate fire instead of propagating a clearly-wrong value through the veto).
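The sanity threshold in the last bullet could look roughly like the sketch below. The function name, its signature, and the constant name are assumptions made for illustration; only the $10M floor and the "warn and return None" behavior come from the proposal above.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# The $10M floor proposed above; the constant name is hypothetical.
MIN_PLAUSIBLE_MARKET_CAP_USD = 10_000_000


def sanity_check_shares(
    ticker: str,
    shares: Optional[float],
    current_price: Optional[float],
) -> Optional[float]:
    """Return shares unchanged, or None if the implied market cap is implausible."""
    if shares is None or current_price is None:
        # Nothing to check against; pass the value through untouched.
        return shares
    implied_cap = shares * current_price
    if implied_cap < MIN_PLAUSIBLE_MARKET_CAP_USD:
        logger.warning(
            "%s: implied market cap %.2f below floor %d; discarding shares=%s",
            ticker, implied_cap, MIN_PLAUSIBLE_MARKET_CAP_USD, shares,
        )
        return None
    return shares
```

For the ERIE case, a call such as sanity_check_shares("ERIE", 2542.0, 224.23) would discard the value, where 224.23 is an illustrative price roughly matching the snippet's market_cap divided by shares_outstanding. Returning None rather than raising keeps the decision with the upstream annotate, as the bullet suggests.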
Acceptance criteria
- python -m compute.main with the ERIE-only universe.
- shares_outstanding ≈ 57M (Class A sum, or Class A + B in shares-equivalent if accounting for the 2400:1 conversion ratio).
- tests/test_ingest/test_fundamentals.py with a synthetic Class A + Class B XBRL fixture mimicking ERIE's shape.
- @network-gated drift-detector for ERIE alongside the existing STZ + AAPL + WMT pins.
Sibling tickers to spot-check post-fix
GOOGL (Class A + C), GOOG (Class C), BRK-B (Class B), LBRDA/LBRDK (Liberty Broadband Class A/K), DISCK/DISCA (legacy Discovery classes), FOX/FOXA, NWS/NWSA. All multi-class S&P 500 constituents are at risk of the same partial-extraction pattern.
Related
- market_cap: null (visibility annotate)
- share_count_extraction_missing annotate (the annotate that would fire if fallback returned None instead of 2542)
cc edgar-debugger