feat(scoring): Issue #261 PR-A — multi_class_aggregate_shares_suspected annotate (CIK-collision detector)#264
Merged
…ed annotate (CIK-collision detector)

Closes the observability half of issue #261 (GOOG/GOOGL multi-class shares overcount; $4.6T market_cap displayed for ~$1T per class). Methodology-scientist Mode B verdict 2026-05-26: NEEDS-MORE-CALIBRATION, split into PR-A (annotate-only, this PR) + PR-B (reverse-allowlist per-class XBRL extraction, next PR).

New module compute/scoring/multi_class_shares.py:
- detect_multi_class_aggregate_shares_suspected(cik_by_ticker, market_cap_by_ticker) -> set[str]
- Trigger: ticker's CIK collides with ≥ 1 other ticker AND market_cap > 10% × universe-median(market_cap)
- MARKET_CAP_FLOOR_RATIO = 0.10 (Damodaran 2019 Ch. 16 + Mode B verdict; recalibration target Q3 2026-08-19)
- Pure function; graceful on None CIK / None market_cap inputs

Wired into compute/main.py:
- Pre-compute cik_by_ticker + market_cap_by_ticker dicts BEFORE the per-ticker scoring loop (universe-level scan)
- Per-ticker emit inside loop alongside cross_source_disagreement
- Counter wired to Metadata.multi_class_aggregate_shares_suspected_count

Schema bump 0.10.4 → 0.10.5-phase4.5e (PATCH; additive Metadata field). Triple lockstep: schemas.py + types.ts + snapshot.json.

ZERO behavior change for 496 non-colliding S&P 500 tickers: composite / risk_flags / fair_price / top5 unchanged. 6 multi-class tickers (GOOG / GOOGL / NWS / NWSA / FOX / FOXA) gain the annotate in valuation_warnings.

Edgar-debugger 2026-05-26 live probe (Alphabet 10-K accession 0001652044-26-000018) confirms per-class dimensional contexts ARE available in XBRL for the PR-B structural fix. Critical gotcha for PR-B: GOOG Class C uses the filer-specific namespace `goog:CapitalClassCMember`, NOT the standard `us-gaap:CommonClassCMember`; the allowlist must key on the filer-namespace member.

Tests: 1216 passing offline (+13 new in test_multi_class_shares.py: empty universe / no-collision / canonical GOOG-GOOGL / micro-class below-floor / partial-above-floor / None-CIK / None-mcap / all-None / 3-way collision / threshold-boundary / Hypothesis subset property + test_config.py schema version pin updated).

Verification:
- ruff check . : All checks passed
- python -m compute.output.schema_check : clean
- pytest tests/ -m "not network" : 1216 passed, 7 skipped (ipca/qlib not installed), 24 deselected

PHASE_STATUS_INFLIGHT.md side-file entry satisfies §Conventions "ship with every PR" lockstep per PR #237 convention. No CLAUDE.md / AGENTS.md substance change: the annotate doesn't introduce a new invariant; the pattern is already covered in §Gotchas under shares-extraction.

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
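The trigger described in this commit message can be sketched as below. This is a minimal illustration under stated assumptions, not the contents of compute/scoring/multi_class_shares.py: the function name, signature, and MARKET_CAP_FLOOR_RATIO value come from the commit message, while the handling of None values (skipped when computing the median and when grouping by CIK) and the toy CIK labels and market caps in the usage lines are assumptions.

```python
from collections import defaultdict
from statistics import median
from typing import Optional

MARKET_CAP_FLOOR_RATIO = 0.10  # value stated in the commit message


def detect_multi_class_aggregate_shares_suspected(
    cik_by_ticker: dict[str, Optional[str]],
    market_cap_by_ticker: dict[str, Optional[float]],
) -> set[str]:
    """Flag tickers that share a CIK with another ticker AND sit above the floor."""
    # Assumption: None market caps are ignored when computing the universe median.
    known_caps = [mc for mc in market_cap_by_ticker.values() if mc is not None]
    if not known_caps:
        return set()
    floor = MARKET_CAP_FLOOR_RATIO * median(known_caps)

    # Group tickers by CIK; tickers with a None CIK can never collide.
    tickers_by_cik: dict[str, list[str]] = defaultdict(list)
    for ticker, cik in cik_by_ticker.items():
        if cik is not None:
            tickers_by_cik[cik].append(ticker)

    flagged: set[str] = set()
    for tickers in tickers_by_cik.values():
        if len(tickers) < 2:
            continue  # no collision for this CIK
        for ticker in tickers:
            mc = market_cap_by_ticker.get(ticker)
            if mc is not None and mc > floor:  # strict ">" per the stated trigger
                flagged.add(ticker)
    return flagged


# Toy universe: CIK labels and market caps are illustrative, not real data.
ciks = {"GOOG": "CIK_A", "GOOGL": "CIK_A", "AAA": "CIK_B", "BBB": None}
caps = {"GOOG": 4.6e12, "GOOGL": 4.6e12, "AAA": 3.0e12, "BBB": None}
suspected = detect_multi_class_aggregate_shares_suspected(ciks, caps)
```

Because the function only reads its two input dicts and returns a new set, it can run once before the per-ticker loop and be consulted inside it, which matches the wiring the commit message describes.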
CI ci-triage-engineer verdict: ruff I001 fired in CI on tests/test_scoring/test_multi_class_shares.py:14 because `from hypothesis import given, strategies as st` was written as one statement, which ruff 0.4 splits and reorders into two separate `from hypothesis import ...` lines per the project's isort config. Local `ruff check .` passed before push (likely cached/diff scope), whereas CI runs a clean ruff invocation on the full tree. Applied via `ruff check --fix tests/test_scoring/test_multi_class_shares.py`: a purely mechanical sort/split with no behavior change. 12 tests still pass (test_multi_class_shares.py).

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
8 tasks
dackclup
added a commit
that referenced
this pull request
May 26, 2026
…utput_anomalous + writer-parity for veto cohort UI (#265)

Closes issue #262 (DQIC dual-surface emission inconsistency) per methodology-scientist Mode B verdict 2026-05-26 (APPROVED-AS-ANNOTATE, Path 3 = rename).

The bug: data_quality_input_corruption was emitted from TWO independent check sites with different trigger conditions:
- Site 1 (risk_overlay.py:411): INPUT-level corruption; risk_flags VETO
- Site 2 (ensemble.py:545): OUTPUT-level anomaly; valuation_warnings ANNOTATE

Site 2's check is strictly broader. Universe scan 2026-05-23 cron #3:
- 2 tickers fire BOTH (ERIE rank 69, BRK-B rank 223)
- 4 tickers fire Site 1 ONLY (MTB/CPT/MRNA/HBAN): UI explainability gap
- 1 ticker fires Site 2 ONLY (NVR rank 267): Top-5-safety gap if its rank rose

A bigger smell than the NVR Top-5 risk is the UI gap: FairPriceCard.tsx:82 reads only valuation_warnings, so MTB/CPT/MRNA/HBAN render an all-null fair-price with NO explanation chip.

Path 3 fix:
- Rename Site 2 (ensemble.py:533+545) → valuation_output_anomalous
- Writer-parity in compute/main.py: when the DQIC veto fires in risk_flags, ALSO emit valuation_output_anomalous to valuation_warnings (closes the UI gap for the veto-only cohort)
- applicability.py SKIP_REASONS taxonomy gains valuation_output_anomalous (count 25 → 26); legacy data_quality_input_corruption retained for backward-compat on pre-rename JSON snapshots
- sanity.py:83 IC-smoke exclusion ORs both identifiers
- FairPriceCard.tsx:82 dataQualityIssue check ORs both identifiers

Tests:
- test_ensemble.py 4 assertions: assert new identifier on Site 2 emit
- test_tier2_schema.py::test_B4_skip_reasons_count: 25 → 26
- test_sanity_smoke.py / test_recommendation.py unchanged (legacy-snapshot path verified via OR check; veto identifier unchanged)

No schema bump (string-identifier rename only). SCHEMA_VERSION stays 0.10.5-phase4.5e. Triple lockstep unchanged.

ZERO composite-rank impact: composite scores / risk_flags VETO identifiers / Top-5 rotation unchanged.

Verification: ruff clean / schema_check clean / pytest 1216 passed (unchanged from PR #264 baseline).

PHASE_STATUS_INFLIGHT.md side-file entry satisfies §Conventions "ship with every PR" lockstep per PR #237 convention. No CLAUDE.md / AGENTS.md substance change: the rename doesn't introduce a new invariant; the methodology verdict is documented in the INFLIGHT entry.

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

Co-authored-by: Claude <noreply@anthropic.com>
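The two behaviors this commit describes, the OR check over both identifiers and the writer-parity emit, could look roughly like the sketch below. It is an illustration only: the two identifier strings come from the commit message, but the helper names and the assumption that risk_flags and valuation_warnings are lists of identifier strings are hypothetical.

```python
LEGACY_ID = "data_quality_input_corruption"   # pre-rename identifier, still the veto name
RENAMED_ID = "valuation_output_anomalous"     # Site 2 identifier after the rename


def has_data_quality_issue(valuation_warnings: list[str]) -> bool:
    """OR both identifiers so pre-rename JSON snapshots still match (hypothetical helper)."""
    return LEGACY_ID in valuation_warnings or RENAMED_ID in valuation_warnings


def apply_writer_parity(risk_flags: list[str], valuation_warnings: list[str]) -> list[str]:
    """When the veto fires in risk_flags, also surface the annotate (hypothetical helper)."""
    warnings = list(valuation_warnings)  # copy; never mutate the caller's list
    if LEGACY_ID in risk_flags and RENAMED_ID not in warnings:
        warnings.append(RENAMED_ID)
    return warnings


# Veto-only cohort: the veto fired but no warning was emitted before the fix.
veto_only = apply_writer_parity([LEGACY_ID], [])
# Legacy snapshot: the old identifier is still recognised by the OR check.
legacy_ok = has_data_quality_issue([LEGACY_ID])
```

Keeping the legacy identifier in the OR check is what lets readers such as the fair-price card handle snapshots written before the rename without a migration.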
4 tasks
dackclup
added a commit
that referenced
this pull request
May 26, 2026
…Craft frontend (#266) Cuts the v1.3.0-phase4.5e release tag, closing the Phase 4.5e Form-4 insider-clustering ladder (PRs #167+#205+#222+#224+#238) and shipping the LedgerCraft frontend reskin (A1-A3+B1-B4+animation PRs 1-3+#244 polish+dark-mode tooltip fixes through PR #263) since v1.2.0-phase4.5 (6d414a9, 2026-05-17). Scope (3 files): - pyproject.toml — version 0.3.0 → 1.3.0 - docs/release-notes/v1.3.0-phase4.5e.md (NEW) — release body grouped by Form-4 cluster / data-quality / defense layer / frontend / methodology + agent infra / CI hygiene; ~800 words - PHASE_STATUS.md — Current state schema 0.10.4 → 0.10.5-phase4.5e, defense layer headline 32 → 33 declared flags, production-run pointer refreshed to 26423296287 Pre-flight ladder verified by release-captain (opus): - ruff clean - pytest 1216 passed (offline) - schema_check in sync at 0.10.5-phase4.5e - verify-production-output Section A-G + I-L PASS; Section H 1 known FAIL (orphan BK.json legacy snapshot, pre-existing) - frontend build verified via vercel-preview-auditor (sonnet) on main HEAD e6013ba — 506/506 routes compiled, types validated, runtime clean, 3-route UA probe PASS Defense scorecard: 7 active vetoes unchanged (altman_distress / sloan_accruals_top_decile / net_issuance_top_decile / non_reliance_filing / beneish_manipulation_veto / dechow_manipulation_veto / data_quality_input_corruption). Headline 32 → 33 declared boolean flags (adds multi_class_aggregate_shares_suspected per PR #264; PR #265 DQIC rename is identifier-shape, not new flag). Production output: metadata.json reports 0.10.4-phase4.5e from cron #4 (2026-05-26T01:12); next weekday cron Wed 2026-05-27 22:00 UTC re-renders at full 0.10.5-phase4.5e semantics. Tag is anchored to code, not last committed snapshot per release-tag SKILL.md §Gotchas. CVE baseline 25 → 15 open (0C/6H/7M/2L); all 15 are next@14.x SSR advisories with zero exploitability on static-export. 
Post-merge: tag command + GitHub Release creation require explicit user authorization per CLAUDE.md §Executing actions with care. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 Co-authored-by: Claude <noreply@anthropic.com>
11 tasks
dackclup
added a commit
that referenced
this pull request
May 26, 2026
… fix for GOOG/GOOGL $4.6T overcount) (#269)

Closes the structural half of issue #261: the OVERCOUNT pattern where SEC companyfacts returns Alphabet's 12.12B total shares to both per-class tickers, producing $4.6T market_cap per ticker vs real ~$1.05T per class. PR-A (PR #264, merged) shipped the multi_class_aggregate_shares_suspected annotate; this PR-B ships the actual fix. Per methodology-scientist Mode B 2026-05-26 (Path 1 reverse-allowlist) + edgar-debugger live probe (Alphabet 10-K accession 0001652044-26-000018).

compute/config.py:
- New MULTI_CLASS_OVERCOUNT_ALLOWLIST: dict[str, str]:
  - GOOGL → "us-gaap:CommonClassAMember" (standard namespace)
  - GOOG → "goog:CapitalClassCMember" (FILER-SPECIFIC namespace gotcha, caught by edgar-debugger probe)
- SCHEMA_VERSION bump 0.10.5 → 0.10.6-phase4.5e

compute/ingest/fundamentals.py:
- Extended _fetch_shares_from_per_filing_xbrl with a target_class_member parameter (None = sum-all, the PR #182 STZ pattern; set = filter to a specific class member via xbrl.contexts[ref].dimensions lookup)
- New elif branch in _build_snapshot fires when ticker in allowlist + primary plausible + QR_SKIP_FUNDAMENTALS not set; overrides primary IFF per_class < primary; defensive mc_reconcile_failure counter for sanity-check failures (per_class >= primary OR per_class fraction outside 5-95% of primary)

compute/output/schemas.py + frontend/lib/types.ts + snapshot:
- Triple lockstep, 2 additive Metadata fields:
  - multi_class_per_class_override_count (expected steady-state ≈ 2)
  - multi_class_mc_reconcile_failure_count (defensive Rule-18 guard)

compute/main.py: wire counters from _FALLBACK_STATS

Tests +9 (1216 → 1225):
- test_config.py: schema pin update, allowlist membership pin, disjoint-allowlist invariant test
- test_ingest/test_fundamentals.py: GOOG override (filer-namespace), GOOGL override (standard namespace), non-allowlist ticker doesn't fire, QR_SKIP_FUNDAMENTALS escape-hatch, per_class >= primary sanity skip, mc_reconcile warning on <5% fraction, None return silently skipped. Plus 3 existing _FALLBACK_STATS tests updated to the new 5-key dict shape (was 3 keys).

ZERO behavior change for 500 non-allowlist tickers. 2 allowlist tickers (GOOG/GOOGL) gain corrected shares_outstanding (~5.4B/~5.8B from the prior 12.12B overcount); this flows through to market_cap (~$4.6T → ~$1.05T per class), pe_ratio_ttm, and the fair-price ensemble. The multi_class_aggregate_shares_suspected annotate (PR-A) continues to fire correctly (CIK collision invariant holds).

Verification:
- ruff clean
- python -m compute.output.schema_check : triple in sync at 0.10.6
- pytest 1225 passed (offline), 7 skipped (factors extras), 24 deselected (@network; GOOG/GOOGL live drift-detector deferred)

PHASE_STATUS_INFLIGHT.md side-file satisfies §Conventions lockstep per PR #237 convention.

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

Co-authored-by: Claude <noreply@anthropic.com>
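The override rule and its sanity guard, as stated in this commit message, can be sketched as a small decision function. This is a hedged illustration, not the code in _build_snapshot: the function and type names are hypothetical, the primary share count is assumed to be positive ("primary plausible"), and the reading that a sub-5% fraction still overrides while counting a reconcile failure is an assumption drawn from the "warning on <5% fraction" test description.

```python
from typing import NamedTuple, Optional


class OverrideDecision(NamedTuple):
    shares: float            # share count to use downstream
    overridden: bool         # True when the per-class value replaced primary
    reconcile_failure: bool  # True when the sanity check failed


def decide_per_class_override(
    primary: float,
    per_class: Optional[float],
    low: float = 0.05,
    high: float = 0.95,
) -> OverrideDecision:
    """Hypothetical helper: override IFF per_class < primary, and flag sanity failures."""
    if per_class is None:
        # "None return silently skipped": keep primary, count nothing.
        return OverrideDecision(primary, False, False)
    fraction = per_class / primary  # assumes primary > 0
    reconcile_failure = per_class >= primary or not (low <= fraction <= high)
    overridden = per_class < primary
    return OverrideDecision(per_class if overridden else primary, overridden, reconcile_failure)


# Illustrative share counts based on the figures quoted in the commit message.
normal = decide_per_class_override(12.12e9, 5.4e9)
too_large = decide_per_class_override(12.12e9, 12.12e9)
tiny = decide_per_class_override(12.12e9, 0.3e9)
```

Returning the flag alongside the chosen value keeps counter increments in the caller, so the decision itself stays a pure function that is easy to pin in tests.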
7 tasks
dackclup
added a commit
that referenced
this pull request
May 27, 2026
…AUDE.md (#271) Refactors a user-shared research report (Master Prompt + 6 phase sub-prompts + CLAUDE.md template) into the existing doc surface without creating a new .claude/skills/agentic-6-phase/ skill. The report's underlying logic is already implemented in the 18 subagents + CLAUDE.md §Auto-routing; what was genuinely missing was a 6-phase mapping table a new session can scan in < 30 sec on top of the 9 phases. Scope (2 substance files + 1 INFLIGHT entry): - WORKFLOW.md — new section "Agentic 6-Phase Cadence" between §"Tools You'll Use Daily" and §"Phase Overview". Mapping table (Step × Fire trigger × Subagent(s) × Done when) over Planning → Code Gen → Integration → Test → Deploy → Monitor + 5 cadence invariants. Reuses the 18 standing subagents — no new agent files. Session-start protocol cites schema 0.10.5-phase4.5e (PRs #264 + #265; cron #4 still at 0.10.4, next cron Wed 2026-05-27 re-renders at 0.10.5), defense layer 33 declared = 7 vetoes + 26 annotates, tag v1.3.0-phase4.5e, CVE baseline 15 open (0C / 6H / 7M / 2L) after PR #194 patch + PR #226 triage. - CLAUDE.md — new §Conventions bullet "Session-start phase identification" (~5 lines) pointing readers at PHASE_STATUS.md §"Current state" + WORKFLOW.md §"Agentic 6-Phase Cadence" using the standing 18 subagents. - PHASE_STATUS_INFLIGHT.md — new in-flight entry per PR #237 side-file lockstep convention. Out of scope (deliberately NOT done per user direction 2026-05-27): - NO .claude/skills/agentic-6-phase/ — overhead exceeds benefit - NO Master Prompt / phase sub-prompts copied into the repo - NO edits to any of the 18 subagent files under .claude/agents/ - NO AGENTS.md substance edit — the cadence is Claude-Code-subagent- specific; cross-tool agents would route differently. INFLIGHT entry satisfies §Conventions "ship with every PR" lockstep. docs-reviewer verdict (2026-05-27, agent id a2c87ed3679f55fe5): NEEDS-CROSS-REF-FIX — both items applied in this commit: 1. 
CVE attribution: "after PR #226 triage" → "after PR #194 patch + PR #226 triage" (PR #194 closed the 10 advisories; PR #226 documented the resulting state) 2. Step 4 fire-trigger col: "Sections A-J" → "Sections A-L" (Section L added by PR #221 OSAP proxy invariant; internal match with the same row's Done-when col) All else passes: 4 cited numbers, 18 agent names, 3 cross-refs, token budget (WORKFLOW ≤ 1 page, CLAUDE ≤ 5 lines), Rule 16 + Rule 18 no contradiction. Pre-existing SKILL.md schema-version table gap (rows for 0.10.5-phase4.5e PR #264 + valuation_output_anomalous rename PR #265 missing) escalated to schema-sentinel as separate doc-only PR per docs-reviewer recommendation — not blocking on this scope. Verification: - ruff check . — N/A (no Python) - python -m compute.output.schema_check — N/A (no schemas) - pytest tests/ -m "not network" — N/A (no test surface) - docs-reviewer subagent — PASS after the 2 fixes above Co-authored-by: Claude <noreply@anthropic.com>
2 tasks
dackclup
pushed a commit
that referenced
this pull request
May 28, 2026
Addresses 3 findings from docs-reviewer (sonnet) substance review of the housekeeping PR (commit e060cb9): 1. PHASE_STATUS.md:98 — "v1.3.0 release tag" entry in §Next deliverables was stale (v1.3.0 shipped 2026-05-26, v1.4.0 shipped 2026-05-27). Replaced with forward-look for v1.5.0 gated on Phase 4.5e PR 5 + Issue #67 sector-CoE flip. 2. WORKFLOW.md:730 — Phase 4.5 historical task list trailing clause "v1.3.0 target pending release-captain ladder (LedgerCraft A-B series + this doc-refresh)" was stale. Replaced with the actual landing dates + SHAs for both v1.3.0 (5db3b97) and v1.4.0 (bbca9ca). 3. docs/METHODOLOGY.md — annotate-only flag count out of sync with declared flags: - Line 16: "21 annotate-only flags" → "23 annotate-only flags" - Line 164: section heading "(21)" → "(23)" - Added 2 new bullets at end of §Annotate-only flags section: - `multi_class_aggregate_shares_suspected` (Issue #261 PR-A, PR #264) — CIK-collision detector; identity-equation check per Damodaran 2019 Ch. 16; no academic prior (data-quality) - `valuation_output_anomalous` (Issue #262, PR #265) — Site-2 output-anomaly detector renamed from `data_quality_input_corruption`; no academic prior (data-quality); semantic distinction from Site-1 input-corruption veto documented inline Both new METHODOLOGY entries are data-quality detectors with no new academic claim — `multi_class_aggregate_shares_suspected` cites the existing Damodaran 2019 anchor (already in the file) for the per-class market-cap identity equation, and `valuation_output_anomalous` is the renamed Site-2 emission of the existing `data_quality_input_corruption` defense. Neither requires methodology-scientist verdict per the "Internal — data-quality" pattern shared with `goodwill_heavy` / `data_quality_input_corruption` Site-1 already in the section.
dackclup
added a commit
that referenced
this pull request
May 28, 2026
…v1.4.0 (#286) * chore(docs): housekeeping PR-B — drain INFLIGHT + bump pointers post-v1.4.0 Phase B post-tag housekeeping. Drains 7 stale `(in flight, ...)` markers from PHASE_STATUS_INFLIGHT.md (PRs #269, #267, #271, #280, #281, #282, #285) to `(merged YYYY-MM-DD, <SHA>)`, and bumps stale schema/tag pointers across CLAUDE.md / PHASE_STATUS.md / SKILL.md / WORKFLOW.md to reflect the v1.4.0-phase4.6 release at `bbca9cac`. Changes: - CLAUDE.md §Phase status — schema `0.10.2-phase4.5e` → `0.10.7-phase4.6`, tag `v1.3.0-phase4.5e` → `v1.4.0-phase4.6` (2026-05-27, `bbca9cac`), "Recently merged" list refreshed from PRs #147-#154 → PRs #264-#285 - PHASE_STATUS.md §Current state — mirrored pointer bump; production-run pointer → `559c5269` (cron-#5 2026-05-27 chore commit); "Recently merged" prepended with 22 entries since v1.3.0, legacy list relabeled as "Earlier" - SKILL.md schema-version table — 3 new rows: `0.10.7-phase4.6` (PR #283 release), `0.10.6-phase4.5e` (PR #269 GOOG/GOOGL per-class XBRL fix), `0.10.5-phase4.5e` (PR #264 multi-class CIK detector) - WORKFLOW.md §Agentic 6-Phase Cadence Session-start protocol — pointer block bumped to current state - PHASE_STATUS_INFLIGHT.md — 7 stale markers drained + new entry for this PR appended at end AGENTS.md substance untouched per the existing delegation pattern (line 372-375: "Canonical 'current state' lives in CLAUDE.md §Phase status. Schema-version history table is in SKILL.md."). Cross-tool agents reading state pull from CLAUDE.md as the source of truth. Doc-only PR — `ruff` / `schema_check` trivially pass; no compute / schema / scoring / valuation / frontend / Python / TS code change. * docs(review-fix): docs-reviewer NEEDS-CROSS-REF-FIX — 3 items Addresses 3 findings from docs-reviewer (sonnet) substance review of the housekeeping PR (commit e060cb9): 1. PHASE_STATUS.md:98 — "v1.3.0 release tag" entry in §Next deliverables was stale (v1.3.0 shipped 2026-05-26, v1.4.0 shipped 2026-05-27). 
Replaced with forward-look for v1.5.0 gated on Phase 4.5e PR 5 + Issue #67 sector-CoE flip. 2. WORKFLOW.md:730 — Phase 4.5 historical task list trailing clause "v1.3.0 target pending release-captain ladder (LedgerCraft A-B series + this doc-refresh)" was stale. Replaced with the actual landing dates + SHAs for both v1.3.0 (5db3b97) and v1.4.0 (bbca9ca). 3. docs/METHODOLOGY.md — annotate-only flag count out of sync with declared flags: - Line 16: "21 annotate-only flags" → "23 annotate-only flags" - Line 164: section heading "(21)" → "(23)" - Added 2 new bullets at end of §Annotate-only flags section: - `multi_class_aggregate_shares_suspected` (Issue #261 PR-A, PR #264) — CIK-collision detector; identity-equation check per Damodaran 2019 Ch. 16; no academic prior (data-quality) - `valuation_output_anomalous` (Issue #262, PR #265) — Site-2 output-anomaly detector renamed from `data_quality_input_corruption`; no academic prior (data-quality); semantic distinction from Site-1 input-corruption veto documented inline Both new METHODOLOGY entries are data-quality detectors with no new academic claim — `multi_class_aggregate_shares_suspected` cites the existing Damodaran 2019 anchor (already in the file) for the per-class market-cap identity equation, and `valuation_output_anomalous` is the renamed Site-2 emission of the existing `data_quality_input_corruption` defense. Neither requires methodology-scientist verdict per the "Internal — data-quality" pattern shared with `goodwill_heavy` / `data_quality_input_corruption` Site-1 already in the section. --------- Co-authored-by: Claude <noreply@anthropic.com>
This was referenced May 28, 2026
dackclup
added a commit
that referenced
this pull request
May 28, 2026
…ket_cap` 2.2× inflated) (#292)

* fix(ingest): Issue #288 - GOOG/GOOGL XBRL concept-name omission

Closes #288. `multi_class_per_class_override_count = 0` on every production cron since PR #269 landed 2026-05-26: both GOOG and GOOGL rendered inflated `market_cap` (~$4.66T / $4.71T) instead of correct per-class values (~$2.09T / $2.59T).

Root cause (edgar-debugger verdict 2026-05-28): `compute/ingest/fundamentals.py:735` `_fetch_shares_from_per_filing_xbrl` filter mode queried only 2 XBRL concepts (`dei:EntityCommonStockSharesOutstanding` + `us-gaap:CommonStockSharesIssued`). Alphabet's 10-K files per-class share counts under `us-gaap:CommonStockSharesOutstanding`, the missing 3rd concept. The primary path at lines 115-124 already queries all 3 in this order; the XBRL fallback path drifted out of parity.

Existing tests at `test_fundamentals.py:822-857` mock `_fetch_shares_from_per_filing_xbrl` entirely (`return_value=per_class`): they confirm the `_build_snapshot` Branch-3 trigger but never exercise the actual concept-lookup path. The bug survived the test suite.

Fix (9 files):
- `compute/ingest/fundamentals.py:735-749`: add `us-gaap:CommonStockSharesOutstanding` to the concept tuple (between the 2 existing entries, matching primary path order); fix misleading docstring at lines 686-687
- `compute/ingest/fundamentals.py:48-71`: `_FALLBACK_STATS` dict gains `"per_class_attempt": 0`; reset wired in `reset_fallback_stats()`
- `compute/ingest/fundamentals.py:~1030`: increment `per_class_attempt` AT TOP of Branch 3 elif (before XBRL call), so the counter captures "branch entered" regardless of XBRL success
- `compute/config.py:30`: schema PATCH bump `0.10.7-phase4.6 → 0.10.8-phase4.6`
- `compute/output/schemas.py:~340`: new `Metadata.multi_class_per_class_attempt_count: int | None = None` field (Rule 18 disambiguator)
- `compute/main.py:~2023`: wire `multi_class_per_class_attempt_count` to Metadata construction
- `frontend/lib/types.ts:~233`: mirror TS field
- `frontend/lib/schema-snapshot.json`: regenerated via `--update-snapshot`
- `tests/test_config.py`: schema version pin `0.10.7 → 0.10.8`; docstring updated to reference Issue #288
- `tests/test_ingest/test_issue288_xbrl_concept_tuple.py` (NEW): 4 regression tests: GOOG class-C lookup, GOOGL class-A lookup, concept-tuple inclusion pin (`assert "us-gaap:CommonStockSharesOutstanding" in concept_list`, an explicit guard against re-omission), and Branch-3 attempt-counter wiring. Would have FAILED on pre-fix code.
Rule 18 disambiguation (the new counter):
- `attempt == override == 0` → Branch 3 never triggered
- `attempt > 0`, `override = 0` → XBRL lookup returned None (regression #288)
- `attempt == override > 0` → normal operation; post-fix steady-state = 2

Impact (display-only, NOT a scoring regression):
- Composite scores / rankings / Rule 16 / Top-5 rotation UNAFFECTED (`market_cap` not an 8-pillar input)
- `multi_class_aggregate_shares_suspected` annotate safety net continues firing (PR #264)
- `/stock/GOOG` + `/stock/GOOGL` UI renders correct per-class market_cap on next cron
- `pe_ratio_ttm` re-derives from corrected shares

Verification:
- `ruff check .` : PASS
- `python -m compute.output.schema_check` : PASS (triple in sync at 0.10.8-phase4.6)
- `schema-sentinel` verdict: TRIPLE-IN-SYNC
- `python -m pytest tests/test_ingest/test_issue288_xbrl_concept_tuple.py` : 4 passed
- `python -m pytest tests/test_config.py tests/test_output/ -q -m "not network"` : 70 passed

Deferred (NOT in this PR):
- @network GOOG/GOOGL drift-detector test (live SEC, EDGAR_USER_AGENT required)
- Issue #289 NVR DQIC fix (Option C per methodology-scientist): separate PR

* fix(test): update 3 _FALLBACK_STATS pins for new per_class_attempt key

CI failure on PR #292: 3 existing `_FALLBACK_STATS` pin-tests in `test_fundamentals.py` (lines 419 / 450 / 712) hardcoded the dict to exactly 5 keys. This PR's Issue #288 fix added a 6th key (`per_class_attempt`) for the Rule 18 disambiguator. The pin-tests correctly caught the schema change; they just needed updates.
Changes:
- `test_reset_fallback_stats_zeros_counters`: set `per_class_attempt=5` in the non-zero preamble; expect `per_class_attempt: 0` post-reset
- `test_get_fallback_stats_returns_copy_not_reference`: expect 6-key shape after reset
- `test_get_fallback_stats_returns_five_keys_after_dimensional_path` → renamed to `_returns_six_keys_after_dimensional_path` + updated docstring to reference Issue #288's `per_class_attempt` addition; pin updated to 6-key shape (dimensional path doesn't touch per_class_attempt, so it stays at 0)

Verification:
- `python -m pytest tests/test_ingest/test_fundamentals.py tests/test_ingest/test_issue288_xbrl_concept_tuple.py` : 32 passed
- All 3 previously-failing tests now pass post-fix

* fix(test): remove unused pytest + SimpleNamespace imports (ruff F401/I001)

PR #292 Python (lint + test) CI failed on commit c428fe6 due to ruff F401 (unused imports) + I001 (import ordering) in the new regression test file. Auto-fixable via `ruff check --fix`.

Leftover imports from draft iterations:
- `from types import SimpleNamespace`: never referenced (final version uses `unittest.mock.MagicMock` instead)
- `import pytest`: leftover from a draft that used @pytest.mark decorators; final version uses plain `assert` statements with no markers

Local `python -m pytest tests/test_ingest/` confirms 32 passed post-fix; ci-triage-engineer 2026-05-28 verdict: ruff-F401-unused-import + compounding ruff-I001-import-ordering (HIGH confidence).

---------

Co-authored-by: Claude <noreply@anthropic.com>
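The Rule 18 disambiguation listed in this commit message maps a pair of counters to a diagnosis, which can be sketched as a small classifier. This is an illustration under assumptions: the three documented states come from the commit message, while the function name, the returned labels, and the "unclassified" fallback for combinations the message does not describe are hypothetical.

```python
def classify_per_class_counters(attempt: int, override: int) -> str:
    """Map (attempt, override) counter values to the states listed in the commit message."""
    if attempt == 0 and override == 0:
        return "branch_never_triggered"
    if attempt > 0 and override == 0:
        return "xbrl_lookup_returned_none"  # the Issue #288 regression signature
    if attempt == override:
        return "normal_operation"           # both > 0 here; stated steady-state is 2
    # Assumption: any other combination is not covered by the commit message.
    return "unclassified"


steady_state = classify_per_class_counters(attempt=2, override=2)
regression = classify_per_class_counters(attempt=2, override=0)
```

The value of the extra counter is that `override = 0` alone cannot distinguish "the branch never ran" from "the branch ran and the lookup failed"; pairing it with `attempt` separates the two.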
2 tasks
dackclup
added a commit
that referenced
this pull request
May 28, 2026
…+ bump pointers (#295) End-of-day Track-A2 housekeeping. After 6 PRs landed on main today (#286 / #290 / #291 / #292 / #293 / #294), the CLAUDE.md / PHASE_STATUS.md / SKILL.md pointers drifted again — schema bumped via PR #292 (0.10.7 → 0.10.8-phase4.6); USE_SECTOR_COE flipped via PR #294. This PR closes the doc-drift loop so session N+1 reads correct state. Changes (4 files, doc-only): - CLAUDE.md §Phase status — schema `0.10.7-phase4.6 → 0.10.8-phase4.6`; defense layer narrative notes `USE_SECTOR_COE = True` post-#294; new "Post-tag production patches" subsection citing PRs #292 / #293 / #294. "Recently merged" list prepended with 6 same-day entries; legacy "Earlier (PR #264 → PR #285)" subsection relabeled. - PHASE_STATUS.md §Current state — schema mirror; new "Post-tag production patches" row; Production-run pointer `559c5269 → 0ad1d57` (cron #69 chore-commit). "Recently merged" prepended. - SKILL.md schema-version table — new top row for `0.10.8-phase4.6` (PR #292 GOOG/GOOGL XBRL fix + Rule 18 disambiguator). - PHASE_STATUS_INFLIGHT.md — 6 stale `(in flight, 2026-05-28)` markers drained to `(merged 2026-05-28, <SHA>)` (PRs #286 / #290 / #291 / #292 / #293 / #294). Bodies preserved. Doc-only PR — `ruff` / `schema_check` pass; no compute / schema / scoring / valuation / frontend / Python / TS change. CLAUDE.md substance touched (pointer block + Recently merged refresh). AGENTS.md substance unchanged per the delegation-pattern (PR #291 already bumped this morning). Co-authored-by: Claude <noreply@anthropic.com>
5 tasks
Summary

Closes the observability half of issue #261: the GOOG/GOOGL multi-class shares overcount, where both tickers store Alphabet's 12.12B total shares (companyfacts aggregate-only filer), producing $4.6T market_cap per ticker vs real ~$1.05T per class.

Per methodology-scientist Mode B verdict 2026-05-26 (NEEDS-MORE-CALIBRATION), the work is split in two:
- PR-A (this PR): annotate-only `multi_class_aggregate_shares_suspected` (CIK-collision detector)
- PR-B (next PR): reverse-allowlist per-class XBRL extraction, following the `edgar-debugger` probe that confirmed per-class dimensional contexts are XBRL-available.

Detector

New `compute/scoring/multi_class_shares.py` exports a pure universe-level function, `detect_multi_class_aggregate_shares_suspected(cik_by_ticker, market_cap_by_ticker) -> set[str]`.

Trigger (methodology-scientist Mode B):
- the ticker's CIK collides with ≥ 1 other ticker, AND
- market_cap > `MARKET_CAP_FLOOR_RATIO × universe-median(market_cap)` (`MARKET_CAP_FLOOR_RATIO = 0.10`)

Annotate-only per `portable-annotate-before-veto`; composite rank unchanged.

Schema bump

`0.10.4-phase4.5e` → `0.10.5-phase4.5e` (PATCH; one additive `Metadata` field). Triple lockstep:
- `compute/output/schemas.py`: `Metadata.multi_class_aggregate_shares_suspected_count: int | None`
- `frontend/lib/types.ts`: TypeScript mirror
- `frontend/lib/schema-snapshot.json`: regenerated

Wiring in `compute/main.py`

- Pre-compute `cik_by_ticker` + `market_cap_by_ticker` dicts (universe-level scan needs full data upfront)
- Emit `multi_class_aggregate_shares_suspected` to `valuation_warnings` when ticker in flagged set; increment counter
- Counter wired to `Metadata(...)`

Tests (1196 → 1216 passing, +13 new + 1 updated)

`tests/test_scoring/test_multi_class_shares.py`: empty universe / no-collision / canonical GOOG-GOOGL / micro-class below-floor / partial-above-floor / None-CIK / None-mcap / all-None / 3-way collision / threshold-boundary (strict `>`) / Hypothesis subset property, plus a `MARKET_CAP_FLOOR_RATIO` constant pin.

`tests/test_config.py::test_schema_version_is_phase4_5e` updated `0.10.4` → `0.10.5-phase4.5e`.

Edgar-debugger findings (live probe, 2026-05-26)

Filing inspected: Alphabet 10-K accession `0001652044-26-000018`, FY2025.

VERDICT: `PER-CLASS-AVAILABLE-IN-XBRL` ✅

Per-class share counts ARE present as dimensional facts on `us-gaap:CommonStockSharesOutstanding` at the balance-sheet date, under these `StatementClassOfStockAxis` members:
- `us-gaap:CommonClassAMember`
- `us-gaap:CommonClassBMember`
- `goog:CapitalClassCMember`

Per-class sum = 12.088B ≈ aggregate (perfect reconcile per Damodaran 2019 Ch. 16 `Σ MC_class = MC_total` identity).

CRITICAL gotcha for PR-B: GOOG Class C uses the filer-specific namespace `goog:CapitalClassCMember`, NOT the standard `us-gaap:CommonClassCMember`. An allowlist keyed on the standard namespace would silently return zero rows for GOOG. PR-B's allowlist will need to key on the filer-namespace member.

Behavior impact

ZERO behavior change for 496 non-colliding S&P 500 tickers: composite / risk_flags / fair_price / top5 rotation unchanged. The 6 multi-class tickers (GOOG / GOOGL / NWS / NWSA / FOX / FOXA) gain the new annotate in `valuation_warnings`; composite rank unaffected.

Expected Metadata fingerprint post first cron: `multi_class_aggregate_shares_suspected_count ≈ 6`

Verification ladder

- `ruff check .`: All checks passed
- `python -m compute.output.schema_check`: clean (triple in sync)
- `pytest tests/ -m "not network"`: 1216 passed, 7 skipped (ipca/qlib not installed), 24 deselected (@network)

Deferred follow-ups (not in this PR)

- PR-B: reverse-allowlist per-class XBRL extraction keyed on the filer-namespace `goog:` member for GOOG.
- `|Σ MC_per_class − MC_aggregate| / MC_aggregate < 0.05`: methodology-scientist Mode B Q3 suggestion.
- `multi_class_aggregate_shares_suspected_count` history: decide whether to retire after ≥ 2 crons of clean reconcile, and recalibrate the 10% floor.

Lockstep

- `PHASE_STATUS_INFLIGHT.md` entry appended per the PR #237 side-file convention ("docs(workflow): adopt PHASE_STATUS_INFLIGHT.md side-file (structural fix for parallel-PR collision)")

Test plan

- One additive `Metadata` field; no consumer migration
- `/stock/GOOG` and `/stock/GOOGL` should now show `multi_class_aggregate_shares_suspected` in valuation_warnings on the next cron-#4

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
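The reconcile tolerance listed under the deferred follow-ups can be checked with a few lines of arithmetic. The sketch below is a hypothetical helper, not part of this PR; it applies the same relative-gap formula to the share counts quoted in this description (per-class sum 12.088B vs the 12.12B aggregate) as a stand-in, since the description gives no per-class prices from which market caps could be computed.

```python
def relative_gap(per_class_values: list[float], aggregate: float) -> float:
    """|sum(per-class) - aggregate| / aggregate, the quantity compared against 0.05."""
    if aggregate <= 0:
        raise ValueError("aggregate must be positive")
    return abs(sum(per_class_values) - aggregate) / aggregate


def reconciles(per_class_values: list[float], aggregate: float, tolerance: float = 0.05) -> bool:
    """True when the per-class total sits within the tolerance of the aggregate."""
    return relative_gap(per_class_values, aggregate) < tolerance


# Share counts from the description: per-class sum 12.088B, aggregate 12.12B.
gap = relative_gap([12.088e9], 12.12e9)
within_tolerance = reconciles([12.088e9], 12.12e9)
```

On these share counts the gap is roughly 0.26%, well inside the 5% tolerance; a market-cap version would additionally need per-class prices, which differ between share classes.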
Generated by Claude Code