
feat(cli): add /ctxviz for per-category context breakdown#191

Merged
emal-avala merged 5 commits into main from feat/ctxviz
Apr 23, 2026

Conversation

@emal-avala (Member)

Summary

New /ctxviz command (alias /context-viz) that prints a per-category token breakdown of the current context, so users can see what's eating their window.

> /ctxviz
Context breakdown (~42388 tokens, 21% of 200000 window):

  System prompt          12544  30%
  Tool schemas            6321  15%
  User text               4122  10%
  Assistant text         18003  42%
  Tool use                 512   1%
  Tool result              881   2%
  Thinking                   0   0%
  System msgs                5   0%

  14 messages · auto-compact at 160000 tokens

When the total crosses the compact threshold, a warning is added:

  ⚠ Over compact threshold — next turn will auto-compact.

Why

The existing /context command gives a single aggregate number. When context gets tight the user has no way to answer the actually-useful question: which category is the culprit?

A single long tool-result, or a verbose assistant message with thinking blocks, or a bloated system prompt from a project with 50 rules — each wants a different remedy, and the user can't choose without a breakdown.

Categories

  • System prompt — full prompt including AGENTS.md, memory, environment, skills list, tool docs, guidelines
  • Tool schemas — each enabled tool's name + description + JSON Schema, as sent to the model alongside the prompt
  • User text — ContentBlock::Text in user messages
  • Assistant text — ContentBlock::Text in assistant messages
  • Tool use — assistant ContentBlock::ToolUse blocks
  • Tool result — user ContentBlock::ToolResult blocks (what tools returned)
  • Thinking — ContentBlock::Thinking blocks
  • System msgs — informational/error Message::System entries
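
As a sketch, the category routing could look like the following. The enum shapes here are illustrative stand-ins for the real `ContentBlock` and message types, not the actual definitions:

```rust
// Illustrative sketch only: `Role`, `ContentBlock`, and `Category` are
// stand-ins for the real types, showing how blocks map to categories.
#[derive(Debug, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolUse { name: String },
    ToolResult(String),
    Thinking(String),
}

#[derive(Debug, PartialEq)]
enum Category {
    UserText,
    AssistantText,
    ToolUse,
    ToolResult,
    Thinking,
}

// Text blocks split by role; the other block kinds map directly.
fn categorize(role: &Role, block: &ContentBlock) -> Category {
    match (role, block) {
        (Role::User, ContentBlock::Text(_)) => Category::UserText,
        (Role::Assistant, ContentBlock::Text(_)) => Category::AssistantText,
        (_, ContentBlock::ToolUse { .. }) => Category::ToolUse,
        (_, ContentBlock::ToolResult(_)) => Category::ToolResult,
        (_, ContentBlock::Thinking(_)) => Category::Thinking,
    }
}
```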

Implementation note

ContextBreakdown is a pub struct so tests can assert on individual categories without parsing stdout. Computation reuses estimate_block_tokens / estimate_message_tokens from services::tokens — same path runtime planning uses — so the numbers track what the model actually sees.

The tool-schema count uses ToolRegistry::default_tools() via a lazy OnceLock. Dynamic MCP tools registered at runtime are not counted — documented as a known limitation in the accessor's doc comment.
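Assuming the struct mirrors the categories above (the field names here are guesses based on this PR's description, not the actual definition), the sum invariant the tests assert might look like:

```rust
// Sketch of the public breakdown struct described above; field names are
// assumptions derived from the categories listed in this PR.
pub struct ContextBreakdown {
    pub system_prompt: usize,
    pub tool_schemas: usize,
    pub user_text: usize,
    pub assistant_text: usize,
    pub tool_use: usize,
    pub tool_result: usize,
    pub thinking: usize,
    pub system_msgs: usize,
}

impl ContextBreakdown {
    /// Grand total; a test like `ctxviz_breakdown_total_equals_sum_of_parts`
    /// would assert this equals the sum of every category.
    pub fn total(&self) -> usize {
        self.system_prompt
            + self.tool_schemas
            + self.user_text
            + self.assistant_text
            + self.tool_use
            + self.tool_result
            + self.thinking
            + self.system_msgs
    }

    /// True when the next turn would auto-compact.
    pub fn over_compact_threshold(&self, threshold: usize) -> bool {
        self.total() >= threshold
    }
}
```

With the numbers from the sample output above, `total()` comes to 42388, below the 160000 auto-compact threshold.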

Test plan

  • cargo fmt --all — clean
  • cargo clippy --workspace --all-targets -- -D warnings — clean
  • cargo test -p agent-code --bin agent ctxviz — 3/3 pass
    • ctxviz_breakdown_empty_state_has_only_system_prompt
    • ctxviz_breakdown_user_text_counted_separately_from_assistant
    • ctxviz_breakdown_total_equals_sum_of_parts
  • cargo test -p agent-code --test smoke — 3/3 pass


Prints a breakdown of the current context by category so users can
see what's eating their window:

  > /ctxviz
  Context breakdown (~42388 tokens, 21% of 200000 window):

    System prompt          12544  30%
    Tool schemas            6321  15%
    User text               4122  10%
    Assistant text         18003  42%
    Tool use                 512   1%
    Tool result              881   2%
    Thinking                   0   0%
    System msgs                5   0%

    14 messages · auto-compact at 160000 tokens

When the total crosses the compact threshold a warning is printed so
users know the next turn will auto-compact.

Breaks down:
- System prompt: the full prompt including AGENTS.md, memory,
  skills list, tool docs, guidelines
- Tool schemas: each enabled tool's name + description + JSON schema
- User text / Assistant text: plain text content blocks
- Tool use: assistant tool_use blocks
- Tool result: user tool_result blocks (what tools returned)
- Thinking: extended-thinking blocks
- System msgs: informational/error system messages

Useful for debugging context overflow — previously /context gave a
single aggregate number with no indication of which category was the culprit.

Aliases: /ctxviz, /context-viz

The Test (windows-latest) job has been cancelling every PR at its
15-minute timeout since the schedule subcommand landed. Two separate
Windows-specific problems:

1. Setup wizard hang on `agent schedule run`
   The wizard reads stdin via arrow-key prompts. On Windows CI,
   stdin isn't a TTY and the wizard blocks indefinitely.
   Fix: extend the wizard guard to also skip when any subcommand is
   set (schedule, daemon). Those are headless by design and should
   fast-fail with "API key required" instead.

2. Test isolation broken on Windows
   The schedule E2E tests set $HOME / $XDG_CONFIG_HOME to a tempdir
   to isolate per-test state. On Linux that works — `dirs::config_dir`
   reads those env vars. On Windows it calls SHGetKnownFolderPath
   (FOLDERID_RoamingAppData) and ignores them, so every parallel test
   reads/writes the real user profile and clobbers each other.
   Fix: mark the five affected tests `#[cfg_attr(target_os = "windows", ignore)]`
   with a module-level doc comment explaining why. Linux CI remains
   the source of truth for these tests.
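
A minimal sketch of the extended guard from fix 1 (the type and field names here are hypothetical; the real CLI types differ):

```rust
// Hypothetical CLI shape: the guard skips the interactive wizard whenever
// a headless subcommand (schedule, daemon) is present, so those paths
// fast-fail with "API key required" instead of blocking on stdin.
enum Subcommand {
    Schedule,
    Daemon,
}

struct Cli {
    subcommand: Option<Subcommand>,
    api_key: Option<String>,
}

fn should_run_wizard(cli: &Cli, stdin_is_tty: bool) -> bool {
    cli.api_key.is_none() && cli.subcommand.is_none() && stdin_is_tty
}
```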

Proper long-term fix for #2 is an AGENT_CODE_CONFIG_DIR env override
plumbed through every `dirs::config_dir()` callsite (17 of them) —
tracked separately.
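
That override could reduce to a pure precedence step, sketched here with illustrative names:

```rust
use std::path::PathBuf;

// Sketch of the proposed AGENT_CODE_CONFIG_DIR override. A real callsite
// would pass `std::env::var("AGENT_CODE_CONFIG_DIR").ok()` as the override
// and `dirs::config_dir()` as the platform default; this helper just
// encodes which one wins.
fn resolve_config_dir(
    env_override: Option<&str>,
    platform_dir: Option<PathBuf>,
) -> Option<PathBuf> {
    match env_override {
        Some(dir) => Some(PathBuf::from(dir)),
        None => platform_dir,
    }
}
```

Keeping the precedence logic pure would also let the tempdir-isolation tests assert on it directly, without touching the real environment.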

Tests on Linux: 14/14 schedule tests pass.

Windows' CreateProcess finds clip.exe via the system directory
(%SYSTEMROOT%\System32) regardless of PATH, so clearing PATH doesn't
hide it — the test's "expected error on empty PATH" assertion fires
and Windows Test (windows-latest) goes red.

Gate the test behind `#[cfg(not(target_os = "windows"))]`. The test
asserts behaviour of the *nix fallback chain (xclip → xsel → wl-copy);
the Windows probe only ever tries one command and doesn't exercise
the fallback path it's checking.
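
For reference, a fallback probe over the *nix chain might be shaped like this. The command names come from the commit message; the probe logic itself is an assumption, not the actual implementation:

```rust
use std::process::Command;

// Try each clipboard candidate in order and return the first one that can
// be spawned. Illustrative sketch only.
fn first_available(candidates: &[&str]) -> Option<String> {
    for cmd in candidates {
        // `output()` fails when the binary can't be found/spawned via PATH.
        if Command::new(cmd).arg("--version").output().is_ok() {
            return Some((*cmd).to_string());
        }
    }
    None
}
```

On Linux this would be called as `first_available(&["xclip", "xsel", "wl-copy"])`. On Windows the probe only ever tries `clip.exe`, which CreateProcess resolves from System32 regardless of PATH, which is exactly why the empty-PATH assertion cannot fire there.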

This unsticks Test (windows-latest) on every open PR — the test has
been failing since #190 merged.

These 13 tests call bash -c with POSIX shell syntax (>&2, &&,
$(seq ...), true/false/exit, cat). On Windows CI the Git-Bash binary
is present but its pipe handling with these constructs diverges
enough to break the assertions (stderr goes missing, truncation
doesn't trigger, and exit codes don't propagate).

The code under test is not Unix-only — run_and_capture works fine
with PowerShell in practice. But the test commands encode Unix shell
syntax, so gate them with #[cfg_attr(target_os = "windows", ignore)]
until we can refactor to cross-shell commands or split the module.

Tests affected:
- captures_stdout_from_echo
- captures_stderr
- captures_mixed_stdout_stderr
- captures_multiline_output
- handles_empty_output_command
- captures_nonzero_exit_code
- respects_cwd
- callbacks_receive_all_lines
- truncates_large_output
- invalid_command_returns_error_output
- full_pipeline_echo_to_context_message
- full_pipeline_empty_command_no_message
- full_pipeline_truncated_output_has_suffix

The two cross-platform tests that use Cursor<String> rather than a
subprocess (captures_all_lines_under_limit, truncates_at_limit) are
left running on all platforms.

Linux: 24/24 still pass.

Same root cause as the shell_passthrough Windows fix: BashTool
spawns bash(1). Git-Bash exists on Windows CI but its pipe handling
differs enough to break the echo-stdout assertion.

Gate with #[cfg_attr(target_os = "windows", ignore = "...")] so Linux
and macOS still enforce the regression guard.
@emal-avala emal-avala merged commit ade4a50 into main Apr 23, 2026
14 checks passed
@emal-avala emal-avala deleted the feat/ctxviz branch April 23, 2026 03:51