ERIE shares_outstanding=2,542 — XBRL dimensional-fact extraction failure (STZ-class pattern, sibling of #176) #246

@dackclup

Symptom

frontend/public/data/stocks/ERIE.json (Sat 2026-05-23 9015748) reports raw_metrics.shares_outstanding = 2542.0 for Erie Indemnity Company. The real share count is ~57M (Class A 46M + Class B 2.5K; source: 10-K and Yahoo Finance), so the extracted value is too low by a factor of ~22,000. The derived market_cap below is understated by the same factor.

{
  "shares_outstanding": 2542.0,
  "market_cap": 569992.65,
  "risk_flags": ["data_quality_input_corruption"],
  "fair_price": {
    "methods": { /* all 6 methods skip with reason="data_quality_input_corruption" */ },
    "valuation_warnings": ["data_quality_input_corruption"]
  },
  "composite_score": 60.42,
  "rank": 69
}

Defense layer behavior

The veto chain works as designed: data_quality_input_corruption fires on risk_flags, every valuation method skips, and the median fair price is None.

⚠️ But a ranking is still emitted: ERIE shows up at rank #69 with composite 60.42, because the composite is built from non-valuation pillars that don't observe shares_outstanding. Per Rule 16 (annotate-and-veto-Top-N), the rank itself is correct (the stock just can't enter the Top-5). However, a composite_score=60.42 with no valid valuation is a misleading data shape for downstream consumers: the frontend RankingTable renders the number normally.

Root cause hypothesis

Same class as #176 (STZ) and #182 (per-filing XBRL fallback): the SEC companyfacts aggregate API filters out dimensional facts. Erie's Class A and Class B share-count facts likely exist only as dei:EntityCommonStockSharesOutstanding facts in dimensional contexts, so the aggregate API returns None for the concept. The PR #182 fallback _fetch_shares_from_per_filing_xbrl() should have caught this but didn't, likely because:

  1. (Hypothesis 1) ERIE's most recent 10-K has the share count in a dimensional context the fallback doesn't enumerate, OR
  2. (Hypothesis 2) The fallback ran and returned 2542 from a fragment (e.g., one class only at a partial reporting period), not the sum across all class contexts.

The 2542 value is suspiciously close to the Class B share count (~2,541 per recent filings). This strongly suggests the fallback is running but returning one class only rather than summing across both, which is consistent with Hypothesis 2. This is a different bug from STZ, where the aggregate extraction yielded None and the fallback then computed the sum correctly.
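To make the suspected failure mode concrete, here is a minimal sketch contrasting a fallback that keeps a single dimensional fact with one that sums across class members. Both functions and the fact layout are hypothetical illustrations, not the actual code in compute/ingest/fundamentals.py, and the Class A figure is the rounded 46M from this issue.

```python
from typing import Optional

# Hypothetical fact shape: (class member name, reported share count).
Fact = tuple[str, float]


def last_fact_only(facts: list[Fact]) -> Optional[float]:
    """Suspected buggy behavior: keep whichever single fact was seen last."""
    return facts[-1][1] if facts else None


def sum_all_classes(facts: list[Fact]) -> Optional[float]:
    """Intended behavior: add up the share counts of every class member."""
    return sum(value for _, value in facts) if facts else None


# Illustrative values only: rounded Class A count plus the Class B count.
erie_like = [
    ("CommonClassAMember", 46_000_000.0),
    ("CommonClassBMember", 2_542.0),
]

print(last_fact_only(erie_like))   # the single Class B fact
print(sum_all_classes(erie_like))  # raw sum across both classes
```

If the real fallback behaves like last_fact_only, the result would depend on the order in which contexts are enumerated, which would explain why STZ happened to work while ERIE did not. Note that the raw sum does not apply any Class B conversion ratio.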

Proposed fix scope

edgar-debugger to scope. Likely changes in compute/ingest/fundamentals.py:_fetch_shares_from_per_filing_xbrl:

  • Verify the sum-across-dimensions logic actually enumerates all contexts of the dei concept for filers with Class A + Class B (Erie, BRK-B siblings, Class-A/B/C stocks like GOOGL).
  • Add a sanity threshold: if extracted shares × current_price < $10M, log a warning + return None (let the upstream share_count_extraction_missing annotate fire instead of propagating a clearly-wrong value through the veto).
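The sanity threshold in the second bullet could look roughly like the sketch below. The function name, signature, and logger are assumptions made for illustration; only the $10M cutoff and the "warn and return None" behavior come from this issue.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Threshold proposed in this issue; below it the share count is treated as implausible.
MIN_PLAUSIBLE_MARKET_CAP_USD = 10_000_000


def sanity_check_shares(
    shares: Optional[float],
    current_price: Optional[float],
    ticker: str,
) -> Optional[float]:
    """Return shares unchanged, or None when the implied market cap is implausibly small."""
    if shares is None or current_price is None:
        # Nothing to validate against; leave the decision to upstream checks.
        return shares
    implied_market_cap = shares * current_price
    if implied_market_cap < MIN_PLAUSIBLE_MARKET_CAP_USD:
        logger.warning(
            "%s: shares=%s implies market cap %.2f USD, below threshold; dropping value",
            ticker, shares, implied_market_cap,
        )
        return None
    return shares
```

With the payload above, 2542 shares at the implied price of about 224.23 (market_cap / shares_outstanding) gives roughly $570K, well under the cutoff, so the value would be dropped and the missing-share-count path would fire instead.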

Acceptance criteria

  • Reproduce on a fresh python -m compute.main with the ERIE-only universe.
  • Identify which dimensional context the fallback is pulling.
  • Patch so ERIE reports shares_outstanding ≈ 57M (Class A sum, or Class A + B in shares-equivalent if accounting for the 2400:1 conversion ratio).
  • Add regression test under tests/test_ingest/test_fundamentals.py with a synthetic Class A + Class B XBRL fixture mimicking ERIE's shape.
  • Add a @network-gated drift-detector for ERIE alongside the existing STZ + AAPL + WMT pins.
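A rough shape for the regression test is sketched below. Because the real signature of _fetch_shares_from_per_filing_xbrl is not shown in this issue, the test targets a hypothetical stand-in, extract_total_shares; the fixture layout, member names, dates, and the rounded Class A count are all illustrative assumptions.

```python
# Hypothetical stand-in for the real fallback under test.
def extract_total_shares(facts: list[dict]) -> float:
    """Sum share counts across all class members at the most recent instant."""
    latest = max(f["instant"] for f in facts)  # ISO dates sort lexicographically
    return sum(f["value"] for f in facts if f["instant"] == latest)


# Synthetic Class A + Class B fixture mimicking ERIE's shape (values are illustrative).
ERIE_LIKE_FACTS = [
    {"member": "us-gaap:CommonClassAMember", "instant": "2026-02-20", "value": 46_000_000.0},
    {"member": "us-gaap:CommonClassBMember", "instant": "2026-02-20", "value": 2_542.0},
    # Stale fact from an earlier period; it must not leak into the total.
    {"member": "us-gaap:CommonClassBMember", "instant": "2025-02-21", "value": 2_542.0},
]


def test_multi_class_shares_are_summed_not_truncated():
    total = extract_total_shares(ERIE_LIKE_FACTS)
    # Guard against the ERIE failure mode: a lone Class B count in the thousands.
    assert total > 1_000_000
    assert total == 46_002_542.0
```

The first assertion encodes the failure observed here (a four-digit share count for a large-cap filer), while the second pins the exact raw sum so an accidental double count of the stale fact would also fail.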

Sibling tickers to spot-check post-fix

GOOGL (Class A + C), GOOG (Class C), BRK-B (Class B), LBRDA/LBRDK (Liberty Broadband Class A/K), DISCK/DISCA (legacy Discovery classes), FOX/FOXA, NWS/NWSA. All multi-class S&P 500 constituents are at risk of the same partial-extraction pattern.

Related

cc edgar-debugger
