A Claude Code plugin that displays per-operation token usage when agents and subagents complete. Only outputs in debug mode (claude --debug). Tracks Claude Code 2.1.85–2.1.108.
After each Claude Code response (in debug mode), a compact unicode-bordered report appears in the terminal showing:
- Token counts — fresh input + cache-write (what counts toward rate limits) and output
- Cache breakdown — cache-write (included in limits) and cache-read (excluded) shown separately
- Cache efficiency — percentage of total input that came from cache
- Cache invalidation detection: detects and flags cache busts, classified by cause:
  - TTL expiry ("hey!" effect: idle >5 min, millions of tokens burned on resume)
  - File change (Edit/Write triggers loadChangedFiles())
  - Bash side effect (linter/formatter modifies files → file watcher invalidates)
  - External change (file watcher detects non-Claude modifications)
  - Penalty cost and batching opportunity analysis
- Compact boundary markers: v2.1.90 ground-truth system/compact_boundary events with preTokens from auto-compaction
- CLAUDE.md / rule reloads: v2.1.101 InstructionsLoaded events, broken down by load_reason (session_start, nested_traversal, path_glob_match, include, compact), with a red warning for compact-triggered reloads
- PostCompact / PermissionDenied / TaskCreated: v2.1.90+ lifecycle events rolled into the next Stop report
- tool-results/ spillover: v2.1.90 large tool outputs spilled to ~/.claude/projects/<project>/<session>/tool-results/ are credited to the originating tool
- Worktree sub-agent breakdown: dedicated box showing per-agent cache dynamics when skills spawn agent swarms in worktrees
- Duration — elapsed time from first to last message in the operation
- Per-tool attribution — input, output, and result tokens for each tool used
- Per-skill attribution: v2.1.108+ built-in slash commands route through the Skill tool. Each skill invocation shows invocation count, result→input bytes (the skill content loaded into context), output bytes, and an estimated cost per skill. Skills are sorted by cost so the biggest drains float to the top
- Sub-agent aggregation by type: when the same agent type is spawned multiple times (e.g. Explore x5), an aggregated row shows the total count, tokens, and cost per type in addition to the individual per-instance drill-down
- Standalone Skills box (opt-in via SKILLS_BOX): when enabled, the per-skill breakdown is rendered in its own dedicated unicode box instead of as an inline section in the main report
- Per-section truncation (MAX_ENTRIES_PER_SECTION, default 12): long lists (skills, bash, web, files, sub-agents) are capped to keep inline output under Claude Code's hook output limit, with a "⋯ +N more — see HTML report" indicator
- Full HTML report (always on in debug mode): every Stop/SubagentStop event writes a self-contained HTML file to <main-repo-root>/reports/token-reporter/<YYYYMMDD_HHMMSS±HHMM>-<event>-<session>.html containing every section without truncation. The path resolves via git worktree list so linked worktrees still write to the main checkout's reports/ folder. The inline output ends with a Full report: <path> footer line so you can open the complete record in a browser. Add both reports/ and reports_dev/ to your project's .gitignore per the agent-reports-location rule; reports often contain private data (session IDs, file paths, tool outputs) that must never be committed.
- Cost estimate: based on published Anthropic API pricing, scoped to lifetime (agents) or current operation (session)
- Agent identity: agent type/name (via v2.1.101 agent_id/agent_type hook input fields), model, message count, duration
- Bash commands: every shell command executed (full list in HTML, capped inline)
- Web fetches — every URL fetched (full list in HTML, capped inline)
- Files touched — read, edited, and written files (full list in HTML, capped inline)
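
The per-section truncation described in the list above can be pictured with a small sketch. This is an illustration only, not the plugin's code: the function name `cap_entries` is hypothetical, and the only details taken from this README are the default of 12, the indicator text, and the rule that 0 disables truncation.

```python
from typing import List

# Indicator text as quoted in this README; everything else here is an assumption.
MORE_TEMPLATE = "⋯ +{n} more — see HTML report"


def cap_entries(entries: List[str], max_entries: int = 12) -> List[str]:
    """Return at most max_entries items plus an indicator line if any were cut.

    A max_entries of 0 disables truncation, mirroring the documented setting.
    """
    if max_entries <= 0 or len(entries) <= max_entries:
        return list(entries)
    hidden = len(entries) - max_entries
    return list(entries[:max_entries]) + [MORE_TEMPLATE.format(n=hidden)]
```

For example, twenty bash commands with the default cap would yield twelve entries followed by a single indicator line.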
╭──────────────────────────────────────────────────────────────╮
│ Subagent Explore ae3d16d9 | haiku-4-5 | 3 messages | 12s │
├──────────────────────────────────────────────────────────────┤
│ Tokens │ 34.8K input / 573 output │
│ L cache-write (included): 2.1K │
│ L cache-read (excluded): 12.4K │
│ L cache efficiency: 36% of input from cache │
│ Cost │ $0.04 (lifetime) │
│ Tools │ WebFetch x1 / Bash x2 │
│ L WebFetch x1: 144 out / 85.2K result→input │
│ L Bash x2: 429 out / 312 result→input │
│ Bash │ 2 commands │
│ $ git status │
│ $ ls -la src/ │
│ Files │ 3 read │
│ · README.md │
│ · src/index.ts │
│ · package.json │
╰──────────────────────────────────────────────────────────────╯
╭──────────────────────────────────────────────────────────────────────────────╮
│ Session 2779c422 | opus-4-6 | 15 messages | 2m34s │
├──────────────────────────────────────────────────────────────────────────────┤
│ Tokens │ 367.5K input / 1.1K output │
│ L cache-write (included): 54.8K │
│ L cache-read (excluded): 528.5K │
│ L cache efficiency: 56% of input from cache │
│ Cost │ $2.59 (this op) │
│ Tools │ Bash x12 / Edit x3 / Read x2 │
│ L Bash x12: 2.0K out / 1.4K result→input │
│ L Edit x3: 890 out / 245 result→input │
│ L Read x2: 251 out / 6.3K result→input │
│ MCP │ 3 tools / x10 calls │
│ L chrome-devtools:take_screenshot x3: 2.1K result→input │
│ L chrome-devtools:navigate_page x2: 89 out / 1.2K r→input │
│ L grepika:search x5: 200 out / 3.1K result→input │
│ L total result→input: 97.1K │
│ Bash │ 12 commands │
│ $ git status │
│ $ npm test │
│ $ ... │
│ Web │ 1 fetches │
│ → https://api.github.com/repos/... │
│ Files │ 2 read / 1 edited │
│ · README.md │
│ · src/index.ts │
│ * scripts/token-reporter.py │
╰──────────────────────────────────────────────────────────────────────────────╯
- in: input tokens attributed to the API call where the model invoked this tool
- out: output tokens the model generated to call this tool (the tool_use JSON block)
- result→input (or r→in for MCP tools): tokens in the tool's result that got fed back as input on the next API turn (tokenized with tiktoken cl100k_base). This is where tools like WebFetch, Read, and Bash consume the most tokens
MCP tool names (e.g. mcp__chrome-devtools__take_screenshot) are shortened to server:tool format (e.g. chrome-devtools:take_screenshot) for display:
- MCP row shows total tool count and call count
- Each tool listed below with shortened name, call count, and token breakdown
- Long lines wrap within the 80-column box with content-aligned continuation
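
The name shortening above amounts to a simple string transformation. A minimal sketch follows, assuming names always follow the mcp__<server>__<tool> pattern and that the server name itself contains no double underscore; the function name `shorten_mcp_name` is hypothetical and not taken from the plugin.

```python
def shorten_mcp_name(tool_name: str) -> str:
    """Turn mcp__<server>__<tool> into <server>:<tool>; leave other names as-is."""
    prefix = "mcp__"
    if not tool_name.startswith(prefix):
        return tool_name
    # Split on the first double underscore after the prefix (assumption:
    # the server name contains none of its own).
    server, sep, tool = tool_name[len(prefix):].partition("__")
    if not sep or not server or not tool:
        return tool_name
    return f"{server}:{tool}"
```

Built-in tool names such as Bash or Read pass through unchanged.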
- uv: the script runs via uv run --with tiktoken to manage the tiktoken dependency automatically
- Python 3.8+: any Python 3.8+ accessible to uv
- Claude Code 2.1.85+: for the v2.1.85 JSONL format (agentId removed, toolUseResult.agentId added). StopFailure requires 2.1.78+. TeammateIdle/TaskCompleted require 2.1.69+. PostCompact/TaskCreated/PermissionDenied require 2.1.90+. InstructionsLoaded and the agent_id/agent_type hook input fields require 2.1.101+. Lower versions silently ignore the hook registrations for events they don't support.
Install uv if you don't have it:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Plugin name: token-reporter (the name in plugin.json and what you use with claude plugin install)
- GitHub repo: Emasoft/token-reporter-plugin (where the source code lives)
The plugin name and repo name are intentionally different. When installing or referencing the plugin, always use token-reporter (the plugin name), not token-reporter-plugin (the repo name).
claude plugin install token-reporter@emasoft-plugins
If you haven't added the marketplace yet:
claude plugin marketplace add Emasoft/emasoft-plugins
Then install:
claude plugin install token-reporter@emasoft-plugins
Restart Claude Code to activate.
Add the marketplace and enable the plugin in ~/.claude/settings.json:
{
  "pluginMarketplaces": [
    "Emasoft/emasoft-plugins"
  ],
  "enabledPlugins": {
    "token-reporter@emasoft-plugins": true
  }
}
Restart Claude Code or run /reload-plugins to activate.
# Clone the plugin repo directly
git clone https://github.com/Emasoft/token-reporter-plugin.git /tmp/token-reporter-plugin
# Install from local path
claude plugin install /tmp/token-reporter-plugin/token-reporter
Or copy to a local marketplace:
mkdir -p ~/.claude/plugins/marketplaces/local-marketplace/plugins/
cp -r /tmp/token-reporter-plugin/token-reporter ~/.claude/plugins/marketplaces/local-marketplace/plugins/token-reporter
Then enable in ~/.claude/settings.json:
{
  "enabledPlugins": {
    "token-reporter@local-marketplace": true
  }
}
Restart Claude Code to activate.
token-reporter/
  .claude-plugin/
    plugin.json               # Plugin manifest (userConfig: OUTPUT_LIMIT_CHARS)
  .github/
    workflows/
      notify-marketplace.yml  # Auto-notify emasoft-plugins on version bump
  bin/
    token-report.py           # v2.1.91+ bin/ helper: on-demand report (cross-platform Python wrapper)
  hooks/
    hooks.json                # Hook event → command mapping (9 events)
  scripts/
    token-reporter.py         # Main hook script (~2400 LOC)
    bump_version.py           # Semver bumper for plugin.json
    publish.py                # Release pipeline (lint, CPV validation, bump, tag, push, gh release)
    pre-push                  # Git pre-push quality gate (symlinked to .git/hooks/pre-push)
  pyproject.toml              # Python project metadata and tool config
  cliff.toml                  # Changelog generation config (git-cliff)
The plugin only outputs reports in debug mode. Start Claude Code with:
claude --debug
Run any command. When the response completes, you should see the token report box in the terminal. Look for lines prefixed with [token-reporter] in stderr output (visible in ~/.claude/debug/).
Without --debug, the hook exits immediately with no output.
The plugin registers nine hook events in hooks/hooks.json. The first five produce reports; the remaining four are lightweight event-loggers that fold into the next Stop/SubagentStop report (so they don't spam the terminal).
Report-emitting hooks:
| Hook Event | When it fires | What the script does | Cost label |
|---|---|---|---|
| Stop | Main session response complete | Parses session transcript (since last user prompt), collects any saved subagent reports, displays all together | (this op) |
| StopFailure | Turn ends due to API error (rate limit, auth) | Same as Stop — tokens are still consumed before the error | (this op) |
| SubagentStop | Subagent (Explore, Plan, etc.) finished | Parses agent's full lifetime transcript | (lifetime) |
| TeammateIdle | Teammate agent paused/waiting | Same as SubagentStop | (lifetime) |
| TaskCompleted | Background task finished | Same as SubagentStop | (lifetime) |
Lightweight event-log hooks (v2.1.90+):
| Hook Event | Claude Code version | Recorded to ${CLAUDE_PLUGIN_DATA}/events/ and surfaced in the next Stop report |
|---|---|---|
| InstructionsLoaded | 2.1.101+ | file_path, memory_type, load_reason (incl. compact), globs, trigger_file_path, parent_file_path |
| PostCompact | 2.1.90+ | preTokens, trigger, compactMetadata |
| TaskCreated | 2.1.90+ | agent_id, agent_type for agents spawned in the session |
| PermissionDenied | 2.1.90+ | Permission-denial count surfaces as a red row in the report |
Debug gate: The hook first walks the process tree (getppid() → ps -o args=) looking for a parent claude process started with the --debug flag. If none is found, the report-emitting hooks exit immediately without parsing or output. The lightweight hooks still log events when --debug is off, so the next debug-mode Stop report includes the full history.
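A process-tree walk of this kind might look like the sketch below. It is a hedged illustration for POSIX systems, not the plugin's implementation: the helper names (`ps_lookup`, `is_debug_claude`, `parent_has_debug`), the depth limit, and the rule for recognising a claude command line are all assumptions. The lookup is injectable so the walk can be exercised without spawning ps.

```python
import os
import shlex
import subprocess
from typing import Callable, Optional, Tuple

ProcInfo = Optional[Tuple[int, str]]  # (parent pid, command line), or None if unknown


def ps_lookup(pid: int) -> ProcInfo:
    """Ask ps for the parent pid and full command line of pid (POSIX only)."""
    try:
        out = subprocess.run(
            ["ps", "-o", "ppid=", "-o", "args=", "-p", str(pid)],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
    except OSError:
        return None
    ppid, _, args = out.partition(" ")
    try:
        return int(ppid), args.strip()
    except ValueError:
        return None


def is_debug_claude(args: str) -> bool:
    """Assumption: a claude process has 'claude' as the basename of one of its
    first two argv entries (covering an interpreter prefix) and carries --debug."""
    try:
        tokens = shlex.split(args)
    except ValueError:
        tokens = args.split()
    names = {os.path.basename(token) for token in tokens[:2]}
    return "claude" in names and "--debug" in tokens


def parent_has_debug(start_pid: int,
                     lookup: Callable[[int], ProcInfo] = ps_lookup,
                     max_depth: int = 32) -> bool:
    """Walk upward from start_pid until a debug-mode claude process is found."""
    pid = start_pid
    for _ in range(max_depth):
        info = lookup(pid)
        if info is None:
            return False
        ppid, args = info
        if is_debug_claude(args):
            return True
        if ppid <= 1 or ppid == pid:
            return False
        pid = ppid
    return False
```

A hook could call `parent_has_debug(os.getppid())` at startup and return early when it yields False.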
The plugin ships bin/token-report.py, a Python wrapper that prints a report on demand without waiting for the Stop hook. Because v2.1.91+ adds bin/ executables to the Bash tool's PATH while the plugin is enabled, you can ask Claude Code to run it directly:
Please run: token-report.py
Or from any shell inside the project:
cd /path/to/project && /path/to/plugin/bin/token-report.py
The helper reads the current session transcript (found via ${CLAUDE_PROJECT_DIR} / newest JSONL under ~/.claude/projects/<slug>/) and prints the same report format as the Stop hook, but to stdout as plain text instead of as a systemMessage. Useful for mid-session snapshots.
Python was chosen over Bash so the helper is cross-platform (macOS, Linux, Windows) as long as uv is on PATH. CPV (claude-plugins-validation) flags extensionless executables and .sh files as platform-specific; .py is recognized as cross-platform.
The plugin exposes three userConfig entries in plugin.json. Each one is also overridable via a plain env var (prefix TOKEN_REPORTER_) for local development or older Claude Code versions that lack userConfig support.
| Key | Type | Default | Purpose |
|---|---|---|---|
| OUTPUT_LIMIT_CHARS | number | 10000 | Max characters injected into the transcript. The Claude Code binary hardcodes a 10,000 char cap on hook output (additionalContext / systemMessage / stdout); output exceeding this is silently saved to disk and replaced with an opaque preview stub, destroying the inline box. Source: official hooks docs. Keep at 10000 unless Anthropic raises the cap in a future Claude Code version. The plugin enforces this cap itself (drops oldest sub-agent reports first, then hard-truncates) so the unicode box stays renderable. |
| SKILLS_BOX | boolean | false | When true, the per-skill cost breakdown is rendered in its own dedicated unicode box instead of as an inline section in the main report. Useful for sessions with many skill invocations where the inline section would crowd the main box. |
| MAX_ENTRIES_PER_SECTION | number | 12 | Caps the number of entries shown per inline list section (skills, bash commands, web fetches, files, sub-agents). Lists exceeding this length show a "⋯ +N more — see HTML report" indicator. The full untruncated data is always available in the HTML report. Set to 0 to disable truncation entirely. |
Env var overrides (dev/local): TOKEN_REPORTER_OUTPUT_LIMIT_CHARS, TOKEN_REPORTER_SKILLS_BOX, TOKEN_REPORTER_MAX_ENTRIES_PER_SECTION. Boolean values accept 1/true/yes/on and 0/false/no/off (case-insensitive).
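The env var path can be sketched as follows. This covers only the TOKEN_REPORTER_ overrides and the accepted boolean spellings listed above; how userConfig values reach the script is not shown, and the function names and the fall-back-to-default behaviour on unparsable values are assumptions.

```python
import os
from typing import Mapping, Optional

TRUE_WORDS = {"1", "true", "yes", "on"}
FALSE_WORDS = {"0", "false", "no", "off"}
PREFIX = "TOKEN_REPORTER_"


def parse_bool(raw: Optional[str], default: bool) -> bool:
    """Case-insensitive boolean parsing; unknown spellings keep the default (assumption)."""
    if raw is None:
        return default
    word = raw.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    return default


def parse_int(raw: Optional[str], default: int) -> int:
    """Integer parsing; invalid values keep the default (assumption)."""
    try:
        return int(raw) if raw is not None else default
    except ValueError:
        return default


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve the three documented settings from env vars, using documented defaults."""
    return {
        "OUTPUT_LIMIT_CHARS": parse_int(env.get(PREFIX + "OUTPUT_LIMIT_CHARS"), 10000),
        "SKILLS_BOX": parse_bool(env.get(PREFIX + "SKILLS_BOX"), False),
        "MAX_ENTRIES_PER_SECTION": parse_int(env.get(PREFIX + "MAX_ENTRIES_PER_SECTION"), 12),
    }
```

Passing a plain dict instead of os.environ makes the resolution easy to check in isolation.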
When the hook fires (which it only does in claude --debug mode), it always writes a full HTML report containing every section without truncation to:
<main-repo-root>/reports/token-reporter/<YYYYMMDD_HHMMSS±HHMM>-<event>-<session>.html
<main-repo-root> is resolved via git worktree list — when the hook fires inside a linked worktree, the report still lands in the main checkout's reports/ folder so nothing is lost when a worktree branch is pruned. When the hook's cwd isn't a git repo, the plugin falls back to writing under that cwd.
The timestamp embeds the GMT offset (%z, compact ±HHMM form — e.g. 20260421_183012+0200) so plain glob/ls -t ordering works across timezones.
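The path resolution and timestamp format described above could be combined roughly as follows. This is a sketch under stated assumptions: it relies on git worktree list --porcelain listing the main worktree first, and the helper names (`first_worktree`, `main_repo_root`, `report_filename`, `report_path`) are hypothetical.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import Optional


def first_worktree(porcelain: str) -> Optional[str]:
    """Return the path on the first 'worktree <path>' line (the main checkout)."""
    for line in porcelain.splitlines():
        if line.startswith("worktree "):
            return line[len("worktree "):]
    return None


def main_repo_root(cwd: str) -> Path:
    """Resolve the main checkout; fall back to cwd outside a git repo."""
    try:
        result = subprocess.run(
            ["git", "worktree", "list", "--porcelain"],
            cwd=cwd, capture_output=True, text=True, check=False,
        )
    except OSError:
        return Path(cwd)
    root = first_worktree(result.stdout) if result.returncode == 0 else None
    return Path(root) if root else Path(cwd)


def report_filename(event: str, session: str, now: Optional[datetime] = None) -> str:
    """Build <YYYYMMDD_HHMMSS±HHMM>-<event>-<session>.html.

    'now' should be timezone-aware; a naive datetime renders %z as empty.
    """
    moment = now if now is not None else datetime.now().astimezone()
    return f"{moment.strftime('%Y%m%d_%H%M%S%z')}-{event}-{session}.html"


def report_path(cwd: str, event: str, session: str) -> Path:
    """Full destination path under the main checkout's reports/ folder."""
    return main_repo_root(cwd) / "reports" / "token-reporter" / report_filename(event, session)
```

Keeping the porcelain parsing separate from the subprocess call means the worktree logic can be checked against captured output.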
The path is appended to the inline output as a Full report: <path> footer line. The HTML uses an inline dark-theme CSS, summary cards, and per-section tables — open it in any browser, no external assets required.
Add both /reports/ and /reports_dev/ to your project's .gitignore per the agent-reports-location rule — reports often contain private data (session IDs, file paths, tool outputs) that must never be committed. The convention is shared across all of this author's plugins: every plugin saves its reports under <main-repo-root>/reports/<plugin-name>/.
Why the temp file pattern? Claude Code only renders systemMessage output to the terminal for Stop events. SubagentStop/TeammateIdle/TaskCompleted output is consumed as system context but not displayed. So the script saves child agent reports to temp files, and the Stop hook collects and displays them all together.
Why the retry loop? The Stop hook fires before the current response is fully written to the JSONL transcript file. The script retries up to 6 times with exponential backoff (1s → 5s) until assistant messages appear.
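A retry loop of this shape can be sketched as below. The README states only the attempt count and the 1s → 5s range; the doubling schedule, the function names, and the injectable sleep are assumptions made for illustration.

```python
import time
from typing import Callable, List


def backoff_delays(attempts: int = 6, first: float = 1.0, cap: float = 5.0) -> List[float]:
    """Doubling delays capped at 'cap' (assumed schedule: 1, 2, 4, 5, 5, 5)."""
    return [min(first * (2 ** i), cap) for i in range(attempts)]


def wait_until(ready: Callable[[], bool],
               attempts: int = 6,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll 'ready' once up front, then after each backoff delay.

    Returns True as soon as 'ready' reports success, False if every retry fails.
    """
    if ready():
        return True
    for delay in backoff_delays(attempts):
        sleep(delay)
        if ready():
            return True
    return False
```

In the hook's setting, `ready` would be a check that assistant messages are present in the transcript.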
The script reads Claude Code's JSONL transcript files and tracks:
- Per-message usage: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens from each assistant message's usage field
- Per-tool output: output tokens divided among tools in each assistant message
- Per-tool result→input: the script matches tool_use_id from assistant messages to tool_result blocks in the following user messages, tokenizes the result content with tiktoken, and attributes those tokens back to the originating tool
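
The tool_use_id matching described above can be illustrated with a short sketch. It assumes messages have already been unwrapped into dicts with role and content keys (the transcript's JSONL envelope is not modelled here), and it uses a chars/4 estimate in place of tiktoken so the example stays dependency-free. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable


def estimate_tokens(text: str) -> int:
    """Stand-in tokenizer (chars/4) used only to keep this sketch self-contained."""
    return len(text) // 4


def result_text(content: Any) -> str:
    """Flatten a tool_result's content, which may be a string or a list of blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "".join(b.get("text", "") for b in content if isinstance(b, dict))
    return ""


def attribute_results(messages: Iterable[Dict[str, Any]],
                      count: Callable[[str], int] = estimate_tokens) -> Dict[str, int]:
    """Credit each tool_result's tokens to the tool whose tool_use block produced it."""
    id_to_tool: Dict[str, str] = {}
    totals: Dict[str, int] = defaultdict(int)
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            if msg.get("role") == "assistant" and block.get("type") == "tool_use":
                id_to_tool[block.get("id", "")] = block.get("name", "unknown")
            elif msg.get("role") == "user" and block.get("type") == "tool_result":
                tool = id_to_tool.get(block.get("tool_use_id", ""))
                if tool is not None:
                    totals[tool] += count(result_text(block.get("content")))
    return dict(totals)
```

Results whose tool_use_id was never seen are skipped rather than guessed at.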
- Input counted toward limits: input_tokens + cache_creation_input_tokens
- NOT counted toward limits: cache_read_input_tokens
The report labels these as (included) and (excluded) respectively.
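
As a worked example of this accounting, the sketch below sums the usage fields into the included and excluded buckets. The cache-efficiency formula (cache reads divided by all input, cached or not) is one plausible reading of "percentage of total input that came from cache" and is an assumption, as is the function name.

```python
from typing import Dict


def summarize_usage(usage: Dict[str, int]) -> Dict[str, float]:
    """Split input tokens into rate-limit-counted and excluded buckets."""
    fresh = usage.get("input_tokens", 0)
    cache_write = usage.get("cache_creation_input_tokens", 0)
    cache_read = usage.get("cache_read_input_tokens", 0)
    counted = fresh + cache_write           # (included)
    total = counted + cache_read            # everything the model read
    # Assumed definition: share of all input tokens that was served from cache.
    efficiency = 100.0 * cache_read / total if total else 0.0
    return {
        "counted_input": counted,
        "excluded_input": cache_read,
        "cache_efficiency_pct": efficiency,
    }
```

With 1,000 fresh, 500 cache-write, and 1,500 cache-read tokens, 1,500 tokens count toward limits, 1,500 are excluded, and efficiency under this definition is 50%.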
Each hook runs:
uv run --with tiktoken python3 ${CLAUDE_PLUGIN_ROOT}/scripts/token-reporter.py
- uv run --with tiktoken provides the tiktoken dependency in a cached virtual environment (first run ~3s, subsequent runs ~3ms overhead)
- ${CLAUDE_PLUGIN_ROOT} is expanded by Claude Code to the plugin's install directory
- The script reads hook input from stdin (JSON) and writes {"systemMessage": "..."} to stdout
If tiktoken is not available (e.g., running the script directly without uv), token counts fall back to a chars/4 estimate and a warning is printed to stderr.
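The fallback can be sketched like this. It is an illustration, not the script's code: `make_counter` is a hypothetical name, the warning text is invented for the example, and catching every exception (so that a failed encoding download also falls back) is an assumption.

```python
import sys
from typing import Callable


def make_counter() -> Callable[[str], int]:
    """Return a token counter: tiktoken cl100k_base if usable, else chars/4."""
    try:
        import tiktoken
        encoding = tiktoken.get_encoding("cl100k_base")
    except Exception as exc:  # missing package or encoding data unavailable
        # Illustrative warning text; the real script's wording may differ.
        print(f"warning: tiktoken unavailable ({exc}); using chars/4 estimate",
              file=sys.stderr)
        return lambda text: len(text) // 4
    # disallowed_special=() makes special-token strings encode as ordinary text
    # instead of raising, which matters for arbitrary tool output.
    return lambda text: len(encoding.encode(text, disallowed_special=()))
```

Building the counter once and reusing it avoids reloading the encoding for every tool result.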
| Model | Input $/M | Output $/M | Cache Write $/M | Cache Read $/M |
|---|---|---|---|---|
| Claude Opus 4.6 / 4.5 | $5.00 | $25.00 | $6.25 | $0.50 |
| Claude Opus 4.1 / 4 | $15.00 | $75.00 | $18.75 | $1.50 |
| Claude Sonnet 4.6 / 4.5 / 4 | $3.00 | $15.00 | $3.75 | $0.30 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $1.25 | $0.10 |
| Claude Haiku 3.5 | $0.80 | $4.00 | $1.00 | $0.08 |
| Claude Haiku 3 | $0.25 | $1.25 | $0.30 | $0.03 |
Unknown models default to Sonnet pricing.
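
A cost estimate from this table reduces to a rate lookup and a weighted sum. In the sketch below the rates are copied from the table, while the lookup keys (for example "opus-4-6") and the longest-substring matching rule are hypothetical; the plugin may identify models differently.

```python
from typing import Dict, Tuple

# (input, output, cache write, cache read) in USD per million tokens.
# Keys are hypothetical short names chosen for this sketch.
PRICES: Dict[str, Tuple[float, float, float, float]] = {
    "opus-4-6": (5.00, 25.00, 6.25, 0.50),
    "opus-4-5": (5.00, 25.00, 6.25, 0.50),
    "opus-4-1": (15.00, 75.00, 18.75, 1.50),
    "opus-4": (15.00, 75.00, 18.75, 1.50),
    "sonnet": (3.00, 15.00, 3.75, 0.30),
    "haiku-4-5": (1.00, 5.00, 1.25, 0.10),
    "haiku-3-5": (0.80, 4.00, 1.00, 0.08),
    "haiku-3": (0.25, 1.25, 0.30, 0.03),
}


def rates_for(model: str) -> Tuple[float, float, float, float]:
    """Pick the longest key contained in the model name; default to Sonnet."""
    matches = [key for key in PRICES if key in model]
    return PRICES[max(matches, key=len)] if matches else PRICES["sonnet"]


def estimate_cost(model: str, input_tokens: int = 0, output_tokens: int = 0,
                  cache_write: int = 0, cache_read: int = 0) -> float:
    """Weighted sum of the four token buckets, in USD."""
    r_in, r_out, r_cw, r_cr = rates_for(model)
    total = (input_tokens * r_in + output_tokens * r_out
             + cache_write * r_cw + cache_read * r_cr)
    return total / 1_000_000
```

Matching on the longest key keeps "opus-4-6" from being priced as the older "opus-4" tier.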
The plugin requires debug mode to produce any output. Start Claude Code with:
claude --debug
Detailed stderr logs (visible in ~/.claude/debug/):
[token-reporter] hook invoked
[token-reporter] hook_event=Stop session=2779c422
[token-reporter] retry 1/5, waiting 1.0s for transcript flush...
[token-reporter] parsing session transcript: ... last_op_only=True
[token-reporter] messages=15 inp=367500 out=1100 cw=0 cr=528500
[token-reporter] tools={'Bash': 12, 'Edit': 3, 'Read': 2}
[token-reporter] tools_tokens={'Bash': {'input': ..., 'output': ..., 'result_tokens': ...}, ...}
[token-reporter] collected 1 subagent reports
[token-reporter] report built, length=1234
Without --debug, the hook detects the absence via process tree inspection and exits immediately (no transcript parsing, no output).
# Bump patch version, tag, push, create GitHub release
uv run scripts/publish.py
# Or specify bump level
uv run scripts/publish.py --minor
uv run scripts/publish.py --major
uv run scripts/publish.py --set 2.0.0
# Preview without changes
uv run scripts/publish.py --dry-run
The pre-push hook runs ruff lint and syntax checks before allowing pushes to main. Install it with:
ln -sf ../../scripts/pre-push .git/hooks/pre-push
Designed for dark terminal backgrounds:
| Color | Used for |
|---|---|
| Bright blue | Borders, labels, all static text |
| Bright yellow | Token values |
| Bright green | Cost values |
| Bright magenta | Tool counts |
| Bright cyan | Session/agent hash |
| Bright white | Model names, tool names, file names |
- Marketplace: Emasoft/emasoft-plugins
- Repository: Emasoft/token-reporter-plugin
MIT