docs(claude-md): token diet — auto-loaded file 236 → 172 lines (Optimization PR B)#142

Merged · dackclup merged 1 commit into main from chore/claude-md-token-diet · May 20, 2026

@dackclup
Owner

Summary

PR B in the .md optimization sequence (Option D — 7-PR overhaul). PR A shipped via #141; this is the CLAUDE.md token diet.

CLAUDE.md is the file Claude Code auto-loads every session, so every line is "spent" in every turn. The bloated parts were:

  • §Multi-session audit pattern (28 lines of procedure that's rarely invoked)
  • §Layout .claude/skills/ row (one massive line listing every vendor / license bucket)
  • §Phase status (mixing transient PR-in-flight detail with structural facts)
  • §Conventions "ships with every PR" bullet (12 lines of justification on a 1-line rule)

Numbers

| File | Before | After | Δ |
| --- | --- | --- | --- |
| CLAUDE.md | 236 lines | 172 lines | -27% |
| AGENTS.md | 344 lines | 386 lines | +42 (hosts relocated content) |

Net token impact: 64 fewer lines × ~12 tokens/line ≈ 750 fewer auto-loaded tokens per session. The saving compounds across every session in a day.
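
The estimate above is simple arithmetic, and a minimal sketch makes the assumption explicit. The function name is hypothetical, and the ~12 tokens/line figure is this PR's own rough average, not a measured value.

```python
def estimate_token_savings(lines_before: int, lines_after: int,
                           tokens_per_line: float = 12.0) -> dict:
    """Rough per-session saving from trimming an auto-loaded file.

    tokens_per_line is an assumed average taken from the PR description.
    """
    removed = lines_before - lines_after
    return {
        "lines_removed": removed,
        "pct_change": round(-100 * removed / lines_before, 1),
        "tokens_saved": removed * tokens_per_line,
    }


savings = estimate_token_savings(236, 172)
print(savings)
# {'lines_removed': 64, 'pct_change': -27.1, 'tokens_saved': 768.0}
```

The exact product is 768, which the description rounds to roughly 750 because the per-line figure is itself approximate.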

What moved where

| Content | Was in | Now in |
| --- | --- | --- |
| §Multi-session audit pattern (4-step procedure + Section I forcing example) | CLAUDE.md (28 lines) | AGENTS.md §new section (full procedure preserved) + CLAUDE.md 5-line reference back |
| Skill inventory vendor / license breakdown | CLAUDE.md §Layout (1 massive line) | THIRD_PARTY_NOTICES.md (canonical source, referenced) |
| Phase 4h history paragraph | CLAUDE.md §Phase status | PHASE_STATUS.md + SKILL.md schema-version table (already there) |
| Karpathy LLM-Wiki detail | CLAUDE.md §Phase status | .claude/skills/karpathy-llm-wiki/SKILL.md (already there) |

Section-by-section trim

| Section | Lines before → after | Change |
| --- | --- | --- |
| §Layout | 14 → 14 | Skill row 1 long line → 1 short line + link |
| §Commands | 32 → 25 | Merged workflow_dispatch + connector-aware check (was split + redundant) |
| §Connectors | 28 → 18 | Full audit pattern moved out; verbose Vercel/Supabase/Sentry descriptions tightened |
| §Conventions | 42 → 29 | "ships with every PR" bullet 12 → 6 |
| §Gotchas | 24 → 22 | Hypothesis bullet 7 → 5 |
| §Phase status | 47 → 28 | Epic #125 + Karpathy detail → "Recently merged" + "Next deliverables" lists |
| §Companion files | 10 → 10 | +THIRD_PARTY_NOTICES.md, −claude-Creator.md (rarely in-session useful) |

Risk notes

  • The multi-session audit pattern is now ONE link away (in AGENTS.md). Non-Claude agents reading AGENTS.md see the full procedure unchanged.
  • All cross-references verified: Rules 1-18 (SKILL.md actually has Rule 18 — confirmed) · §"Multi-session audit pattern" (anchor exists in AGENTS.md L345).
  • No information deleted — only relocated to better-fitting docs OR de-duplicated where it lived in 3+ places.

What this PR does NOT touch

  • Code (compute/, frontend/, tests/)
  • Schemas
  • CI workflows
  • pre-merge-prod-sim.yml won't trigger (path filter excludes .md)
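
Why a docs-only PR skips the workflow can be sketched as follows, assuming the filter behaves like an ignore list: the run is skipped only when every changed file matches an ignored pattern. The pattern list and function name are hypothetical; the actual filter in pre-merge-prod-sim.yml is not shown in this PR.

```python
from fnmatch import fnmatch

# Hypothetical ignore patterns; the real workflow's filter may differ.
IGNORED_PATTERNS = ("*.md",)


def workflow_would_trigger(changed_files, ignored_patterns=IGNORED_PATTERNS):
    """Return True if at least one changed file is not covered by an ignore pattern."""
    return any(
        not any(fnmatch(path, pattern) for pattern in ignored_patterns)
        for path in changed_files
    )


# This PR touches only markdown files, so the simulated filter skips the run.
print(workflow_would_trigger(["CLAUDE.md", "AGENTS.md"]))                 # False
# A mixed PR would still trigger it.
print(workflow_would_trigger(["CLAUDE.md", "compute/scoring/tier2.py"]))  # True
```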

Next in sequence (NOT this PR)

PR C (AGENTS.md sync + dedup) · PR D (WORKFLOW.md archive Phase 0-3) · PR E (SKILL.md restructure) · PR F (skill description audit ×38) · PR G (PHASE_STATUS.md "Current State" summary).

Test plan

  • CI green (lint + tests + frontend build)
  • pre-merge-prod-sim.yml does not trigger
  • Visual scan of CLAUDE.md confirms structure still makes sense at a glance
  • All anchor links resolve
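
The "all anchor links resolve" check can be partly automated. The sketch below is one possible approach, not the check used for this PR: it approximates GitHub's heading-slug rule (lowercase, drop punctuation, spaces to hyphens) and works on in-memory text, so every function name and the sample documents are hypothetical.

```python
import re

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]*)#([^)\s]+)\)")
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def github_slug(heading: str) -> str:
    """Approximate GitHub's heading anchor: lowercase, strip punctuation, hyphenate spaces."""
    text = re.sub(r"[^\w\- ]", "", heading.strip().lower())
    return text.replace(" ", "-")


def heading_slugs(markdown: str) -> set[str]:
    """Collect the anchor slugs of every ATX heading in a markdown document."""
    return {github_slug(m.group(1)) for m in HEADING_RE.finditer(markdown)}


def broken_anchors(docs: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (source file, target file, anchor) for each link whose anchor is missing."""
    slugs = {name: heading_slugs(text) for name, text in docs.items()}
    broken = []
    for name, text in docs.items():
        for target, anchor in LINK_RE.findall(text):
            target_file = target or name  # "#anchor" alone points at the same file
            if anchor not in slugs.get(target_file, set()):
                broken.append((name, target_file, anchor))
    return broken


# Illustrative stand-ins for the real files.
docs = {
    "CLAUDE.md": "## Connectors\nSee [audit pattern](AGENTS.md#multi-session-audit-pattern).\n",
    "AGENTS.md": "## Multi-session audit pattern\nFull procedure here.\n",
}
print(broken_anchors(docs))  # []
```

Duplicate headings and non-ASCII edge cases are not handled, so a manual spot check remains worthwhile.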

https://claude.ai/code/session_015649aRyi2bvciQYZVNACd2


Generated by Claude Code

PR B in the .md optimization sequence (Option D). CLAUDE.md is the
file Claude Code auto-loads every session, so every line is "spent"
in every turn — high-leverage trim target.

CLAUDE.md (236 → 172 lines):
- §Layout: skill inventory row (1 massive line) → 1 line + link to
  THIRD_PARTY_NOTICES.md (the canonical source for vendor / license
  breakdown). Future skill add/remove no longer needs to re-edit a
  multi-clause inline parenthetical.
- §Commands: merged "After every workflow_dispatch" + "Connector-
  aware first-line check" into one 5-line block (was 14 lines split
  across two subsections that overlapped).
- §Connectors: full §Multi-session audit pattern (28 lines) → 5-line
  reference + link to the moved-to-AGENTS.md version. Connector
  table descriptions tightened (Vercel / Supabase / Sentry / Gmail
  rows lost verbose justifications already implicit elsewhere).
- §Conventions: "CLAUDE.md + AGENTS.md ship with every PR" bullet
  12 → 6 lines (the rule is the rule — the four-paragraph
  justification was filler).
- §Gotchas: Hypothesis property-based tests bullet 7 → 5 lines.
- §Phase status: Epic #125 Item 3 + Karpathy LLM-Wiki detail (18 +
  6 = 24 lines) → "Recently merged" + "Next deliverables" lists
  (12 lines). Detail lives in PHASE_STATUS.md and the skill's own
  SKILL.md.
- §Companion files: added THIRD_PARTY_NOTICES.md; dropped
  claude-Creator.md self-reference (rarely useful in-session).

AGENTS.md (344 → 386 lines):
- New §"Multi-session audit pattern" hosts the full 4-step procedure
  + Section I forcing example (28 lines moved from CLAUDE.md).
  Non-Claude agents (Copilot / Cursor / Devin) read AGENTS.md so
  the pattern stays accessible cross-tool.
- §"Phase + version state": added bullet documenting the PR B token
  diet + the multi-PR optimization sequence (PR A shipped #141 ·
  PR B this one · PR C-G planned).

CLAUDE.md + AGENTS.md edits ship per the lockstep rule. No code
touched, no schema touched — pre-merge-prod-sim.yml won't trigger.

Token impact estimate: CLAUDE.md is auto-loaded at session start +
referenced from every system prompt assembly. -64 lines × ~12
tokens/line ≈ -750 tokens per session. Across many sessions per day
this compounds.

Next in sequence (TBD): PR C (AGENTS.md sync + dedup) · PR D
(WORKFLOW.md archive) · PR E (SKILL.md restructure) · PR F (skill
description audit ×38) · PR G (PHASE_STATUS.md restructure).
@vercel

vercel Bot commented May 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| quantrank | Ready | Preview, Comment | May 20, 2026 6:02am |

@dackclup dackclup marked this pull request as ready for review May 20, 2026 06:08
@dackclup dackclup merged commit f24d437 into main May 20, 2026
4 checks passed
@dackclup dackclup deleted the chore/claude-md-token-diet branch May 20, 2026 06:08
dackclup added a commit that referenced this pull request May 20, 2026
Third PR in the .md optimization sequence (Option D). PR B (#142)
trimmed CLAUDE.md from 236 → 172 lines and moved the multi-session
audit pattern out. PR C now closes the loop: AGENTS.md sections that
duplicate CLAUDE.md become reference pointers; unique cross-tool
content stays.

AGENTS.md (386 → 342 lines):

Dedup'd to reference CLAUDE.md as canonical:
- §Tech stack — was 10 lines repeating CLAUDE.md §Stack with two
  extra deps. Now 6 lines: link to CLAUDE.md + note the two extras
  (pyarrow / yfinance) relevant for local build/test work.
- §Commands — was 19 rows of full command table mirroring CLAUDE.md.
  Now 8 rows of cross-tool setup + dev-loop commands not in
  CLAUDE.md (install with extras, ruff --fix, single-module test,
  npm run dev, npm run lint). CLAUDE.md's verification ladder is
  the canonical command surface.
- §Project structure — was 54-line tree with inline annotations
  duplicating CLAUDE.md §Gotchas (stale: "3 active vetoes" /
  "_EIGHT_K_DEFENSES_ENABLED = False until Phase 4"). Now 42 lines
  with file-purpose annotations only; bugs / drift live exclusively
  in CLAUDE.md §Gotchas.
- §Phase + version state — was 46 lines duplicating CLAUDE.md
  §Phase status. Now 15 lines: reference to CLAUDE.md as canonical
  + cross-tool-specific bits only (production-verified-run baseline
  for local validation, open issue list, optimization PR sequence
  tracker).
- §Companion files — refreshed to match CLAUDE.md's updated list
  (added THIRD_PARTY_NOTICES.md, dropped agent-Creator.md self-ref).

Unique cross-tool content kept verbatim (no Claude-only-context
equivalent exists):
- §Testing (19 lines) — pytest @network marker + EDGAR_USER_AGENT
  requirement + where-to-put-tests guidance
- §Code style (73 lines) — Python + TypeScript ✅ Good / ❌ Avoid
  examples; rationale why type hints + tabular-nums etc.
- §Git workflow (29 lines) — branch naming + commit format + PR
  Draft↔Ready discipline + no-direct-main-push
- §Boundaries (51 lines) — ✅ Always OK / ⚠️ Ask first / 🚫 Never;
  GOLD content for non-Claude agents
- §Security considerations (8 lines)
- §Claude-Code-specific tooling (16 lines) — graceful-degradation
  note for Copilot / Cursor / Devin
- §Multi-session audit pattern (30 lines) — moved here in PR B,
  full 4-step procedure + Section I forcing example

CLAUDE.md (172 → 180 lines):
- §Phase status "Recently merged" — added PR #142 (B token diet)
- §Phase status — added "PR C in flight" note per lockstep convention

Lockstep CLAUDE.md + AGENTS.md edit per the per-PR convention.

Cumulative result for PR A + B + C on the agent-doc surface:
- CLAUDE.md: 236 (pre-A) → 180 lines (today) = -24%
- AGENTS.md: 344 (pre-A) → 342 lines = roughly flat, but the
  internal information density went up substantially (drift removed,
  duplication removed, multi-session pattern added)
- Combined: 580 → 522 lines (-10%) with strictly more useful
  signal-to-token ratio

Next in sequence: PR D (WORKFLOW.md archive Phase 0-3, 1732 → ~1450
lines) · PR E (SKILL.md restructure) · PR F (skill description
audit ×38) · PR G (PHASE_STATUS.md "Current State" summary).

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
… + top-10 movers) (#148)

Epic #125 Item 3 PR 2 — closes the substantive remainder of Epic #125
after PR 1 (#140) shipped the workflow skeleton. Adds the per-ticker
composite-score diff vs main, top-10 movers table, universe-size delta,
and failure-path comment (PR 1 fell through silently on red checks).

Baseline source: main's COMMITTED `frontend/public/data/` via
`git show origin/main:...`, not a fresh re-run on main. The diff
answers "did this PR change what production shows users?" — the right
anchor is the last-committed main output, not a counterfactual.
Free + no doubled EDGAR rate-limit pressure. Stale-baseline (>7 days)
warning shows inline when the weekly cron hasn't run recently.
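
A minimal pure-stdlib sketch of the diff described above, assuming both runs are already loaded as ticker-to-composite-score mappings. This is not the code in `tools/pre_merge_diff.py`; the function names, sample scores, and dates are hypothetical, and fetching the baseline via `git show` is left out.

```python
from datetime import datetime, timedelta, timezone


def top_movers(baseline, candidate, n=10):
    """Largest absolute score changes among tickers present in both runs."""
    deltas = {t: candidate[t] - baseline[t] for t in baseline.keys() & candidate.keys()}
    # Sort by magnitude (descending), then ticker, so ties are deterministic.
    return sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))[:n]


def universe_delta(baseline, candidate):
    """Change in universe size between the baseline and the PR run."""
    return len(candidate) - len(baseline)


def is_stale(baseline_generated_at, now, max_age_days=7):
    """True when the committed baseline is older than the allowed age."""
    return now - baseline_generated_at > timedelta(days=max_age_days)


baseline = {"AAA": 71.0, "BBB": 55.5, "CCC": 40.0}
candidate = {"AAA": 68.5, "BBB": 55.5, "DDD": 62.0}

movers = top_movers(baseline, candidate)
print(movers)                               # [('AAA', -2.5), ('BBB', 0.0)]
print(universe_delta(baseline, candidate))  # 0
print(is_stale(datetime(2026, 5, 10, tzinfo=timezone.utc),
               now=datetime(2026, 5, 20, tzinfo=timezone.utc)))  # True
```

Tickers that appear in only one run are excluded from the movers list here; a fuller version would likely report them separately.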

Files:
- `tools/pre_merge_diff.py` (~150 LOC, pure stdlib)
- `tests/test_pre_merge_diff.py` (18 offline tests, no pandas/numpy)
- `.github/workflows/pre-merge-prod-sim.yml` — adds 2 steps
  (fetch + diff), extends sticky comment, wraps post step with
  `if: !cancelled()` for failure-path comment
- `CLAUDE.md` + `AGENTS.md` — lockstep update per Rule from PR #142

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…loses #117) (#149)

helper.py Section B was hard-failing on `non_reliance_filing` and
`auditor_change` fires with "expected 0; flag broken?" — but PR #79
(Phase 4g, 2026-05-15) re-enabled both 8-K Tier-2 defenses by flipping
`compute/scoring/tier2._EIGHT_K_DEFENSES_ENABLED = True`. Non-zero fires
in the normal cohort band are EXPECTED post-4g, not bugs.

Changes:
- `section_b_tier2()` now takes `metadata` as a second parameter and
  replaces the hard-fail-on-any with a soft-band check against the
  academic cohort priors that calibrated each flag:
    * going_concern_disclosure  — Mayew 2015: 1-3%; WARN > 5%
    * non_reliance_filing        — Schroeder 2024: rare 4.02s; WARN > 2%
    * auditor_change             — Cohen-Malloy-Nguyen 2020: 1-5%; WARN > 5%
- Regression guard inverts: if `tier2_coverage_pct` ≤ 5% (proxy for
  `_EIGHT_K_DEFENSES_ENABLED = False` at compute time) and a flag still
  fires, that's the real bug — keeps the original "feature flag must
  hold" contract intact without flipping it backwards on healthy runs.
- SKILL.md Section B description + Hard contract checks updated.
- CLAUDE.md + AGENTS.md lockstep update per Rule from PR #142.
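
The soft-band check and inverted regression guard can be sketched as below. Only the WARN thresholds and the 5% coverage proxy come from this commit message; the function name, its inputs, and the choice to apply the guard only to the two 8-K flags are assumptions, not the actual `section_b_tier2()` code.

```python
# WARN thresholds (% of universe) as listed in the commit message.
WARN_ABOVE_PCT = {
    "going_concern_disclosure": 5.0,
    "non_reliance_filing": 2.0,
    "auditor_change": 5.0,
}
# Assumption: only the two 8-K defenses are governed by the feature flag.
EIGHT_K_FLAGS = ("non_reliance_filing", "auditor_change")
COVERAGE_FLOOR_PCT = 5.0  # proxy for the defenses being disabled at compute time


def check_tier2_flags(fire_counts, universe_size, tier2_coverage_pct):
    """Return (failures, warns) for the Tier-2 flag fire rates."""
    failures, warns = [], []
    defenses_look_disabled = tier2_coverage_pct <= COVERAGE_FLOOR_PCT
    for flag, warn_above in WARN_ABOVE_PCT.items():
        fires = fire_counts.get(flag, 0)
        fire_pct = 100.0 * fires / universe_size
        if defenses_look_disabled and flag in EIGHT_K_FLAGS and fires > 0:
            # Regression guard: a flag fired although the defenses appear off.
            failures.append(f"{flag}: {fires} fires at {tier2_coverage_pct:.1f}% coverage")
        elif fire_pct > warn_above:
            # Soft band: unusually high, but not a hard failure.
            warns.append(f"{flag}: {fire_pct:.1f}% exceeds the {warn_above:.0f}% band")
    return failures, warns


healthy = check_tier2_flags(
    {"going_concern_disclosure": 8, "non_reliance_filing": 3, "auditor_change": 12},
    universe_size=502,
    tier2_coverage_pct=90.0,
)
print(healthy)  # ([], [])
```

With the guard keyed on coverage rather than on any fire, a healthy post-4g run passes cleanly while a fire under near-zero coverage still fails.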

Verification on current production data (commit `3da995dc`, 502 stocks):
  Section A-H run: 0 failures, 0 warnings (was: 2 failures pre-fix on
  the stale Section B expectations).

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…label clarification (#151)

Closes Phase 0 of foundation reconciliation roadmap (epic #150). Adds
honesty surface for analytical claims the rest of the docs glossed over.

docs/METHODOLOGY.md — new §"Known limitations" section covering:
- Survivorship bias (Wikipedia current SP500, not point-in-time)
- Score semantics (percentile rank, not absolute quality)
- Pillar correlation (Quality + Profitability ROE double-count)
- extreme_*_estimate as method-applicability, not manipulation
- Pillar weight rationale (empirical, not academic-derived)
- Top-decile vetoes fire on top 10% by construction
- Known calibration drift cross-refs (#11, #16, Phase 4.5d, #130)

frontend/components/PillarRadarChart.tsx — sub-header now reads
"0-100 percentile rank against current S&P 500 (sector-relative for
Quality/Value/Growth/Profitability)" instead of generic "against
the universe".

CLAUDE.md + AGENTS.md — lockstep update per Rule from PR #142.

Co-authored-by: Claude <noreply@anthropic.com>
dackclup added a commit that referenced this pull request May 20, 2026
…17 → 27 (#154)

The CLAUDE.md `## Phase status` headline claimed "defense layer 17"
(7 vetoes + 10 annotates). The 2026-05-20 quarterly audit on issue #130
discovered the actual emit surface is 27 boolean flags:

- 7 active vetoes (rank-suppressors) — unchanged
- 10 annotate flags (the documented set) — unchanged
- 5 method-applicability flags (extreme_<dcf,ev_ebitda,pb,pe,rim>_estimate
  — currently mis-aggregated into `manipulation_index`, scheduled for
  semantic split in epic #150 Phase 2)
- 5 additional informational flags (cross_source_disagreement,
  late_filing_notification, manipulation_triple_flag, rem_suspect,
  restatement_history)
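
The count reconciles as follows. This is an illustrative tally only: the 7 veto and 10 annotate flags are given as counts because their names are not listed here, and the variable names are hypothetical.

```python
# Flag names as listed in the audit summary above.
METHOD_APPLICABILITY = [
    f"extreme_{method}_estimate" for method in ("dcf", "ev_ebitda", "pb", "pe", "rim")
]
INFORMATIONAL = [
    "cross_source_disagreement",
    "late_filing_notification",
    "manipulation_triple_flag",
    "rem_suspect",
    "restatement_history",
]

group_sizes = {
    "active_vetoes": 7,      # count only; names not listed in this summary
    "annotate_flags": 10,    # count only; names not listed in this summary
    "method_applicability": len(METHOD_APPLICABILITY),
    "informational": len(INFORMATIONAL),
}

declared = group_sizes["active_vetoes"] + group_sizes["annotate_flags"]
total = sum(group_sizes.values())
print(declared, total - declared, total)  # 17 10 27
```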

Phase 1.2 closure: this PR updates the headline summary in CLAUDE.md to
reflect reality (17 declared veto+annotate flags PLUS 10 additional
flags), links to the audit comment as canonical evidence, and notes
that Phase 2 of epic #150 will reorganize the taxonomy properly.

No compute logic change. No schema change. The 10 additional flags
already emit in production — this PR just acknowledges them in the
agent-facing summary docs.

Also reframes the "in flight" section to reflect Phase 0 + 1.3 closed,
1.2 in progress, 1.4-1.6 + 2-3 remaining.

CLAUDE.md + AGENTS.md lockstep update per Rule from PR #142.

Co-authored-by: Claude <noreply@anthropic.com>