Token-Saver

Universal token-saver extension for AI CLI tools. Compresses verbose command outputs (git, tests, builds, lint, ls...) without losing any critical information.

Compatible with Claude Code and Gemini CLI.

Why

CLI-based AI assistants spend tokens on every command output. A 500-line git diff, a pytest run with 200 passing tests, an npm install with 80 packages: all of it is sent as-is to the model, which only needs the actionable information (errors, modified files, results).

Token-Saver intercepts these outputs and compresses them before they reach the model, preserving 100% of useful information.

How It Works

Architecture

CLI command  -->  Specialized processor  -->  Compressed output
                        |
                  18 processors
                  (git, test, package_list,
                   build, lint, network,
                   docker, kubectl, terraform,
                   env, search, system_info,
                   gh, db_query, cloud_cli,
                   file_listing, file_content,
                   generic)

The engine (CompressionEngine) maintains a priority-ordered chain of processors. The first processor that can handle the command (can_handle()) produces the compressed output. GenericProcessor serves as a fallback and always matches last.

When a specialized processor doesn't achieve the minimum compression ratio (10%), the engine tries the generic processor as a fallback before returning uncompressed output.

After the specialized processor runs, a lightweight cleanup pass (clean()) strips residual ANSI codes and collapses consecutive blank lines.
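
The flow described above can be sketched as follows. This is a minimal illustration, not the project's actual code: only the names CompressionEngine, GenericProcessor, can_handle() and clean() come from this README, while the Processor base class, GitStatusProcessor, process(), gain() and the strict "greater than 10%" comparison are assumptions made for the example.

```python
import re
from abc import ABC, abstractmethod

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
MIN_RATIO = 0.10  # minimum gain before a compressed result is accepted


class Processor(ABC):
    priority = 100  # lower values are tried first (assumed convention)

    @abstractmethod
    def can_handle(self, command: str) -> bool: ...

    @abstractmethod
    def process(self, command: str, output: str) -> str: ...


class GitStatusProcessor(Processor):
    """Hypothetical specialized processor: drops git's '(use ...)' hint lines."""
    priority = 20

    def can_handle(self, command):
        return command.startswith("git status")

    def process(self, command, output):
        lines = output.splitlines()
        return "\n".join(l for l in lines if not l.lstrip().startswith("(use "))


class GenericProcessor(Processor):
    """Fallback: matches everything, removes consecutive duplicate lines."""
    priority = 999

    def can_handle(self, command):
        return True

    def process(self, command, output):
        kept = []
        for line in output.splitlines():
            if not kept or kept[-1] != line:
                kept.append(line)
        return "\n".join(kept)


def clean(text):
    """Strip residual ANSI codes and collapse runs of blank lines."""
    return re.sub(r"\n{3,}", "\n\n", ANSI_RE.sub("", text))


def gain(original, compressed):
    return 1 - len(compressed) / len(original) if original else 0.0


class CompressionEngine:
    def __init__(self, processors):
        self.chain = sorted(processors, key=lambda p: p.priority)

    def compress(self, command, output):
        first = next(p for p in self.chain if p.can_handle(command))
        result = clean(first.process(command, output))
        if gain(output, result) > MIN_RATIO:
            return result
        generic = self.chain[-1]  # GenericProcessor sorts last
        if first is not generic:
            result = clean(generic.process(command, output))
            if gain(output, result) > MIN_RATIO:
                return result
        return output  # not worth compressing: return the original untouched
```

The engine expects the processor list to include a GenericProcessor, so the chain always has a match.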

Platform Integration

The two platforms use different mechanisms:

Claude Code (PreToolUse hook):

1. Claude wants to run `git status`
2. PreToolUse hook intercepts the command
3. Rewrites to: python3 wrap.py 'git status'
4. wrap.py executes the original command
5. Compresses the output
6. Claude receives the compressed version

Claude Code's PreToolUse hook runs before the command executes, so it never sees the output and cannot modify it afterwards. The only way to reduce tokens is to rewrite the command so that it goes through a wrapper that executes it, compresses the output, and returns the result.
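
As a rough sketch of steps 3 to 6, assuming the wrapper is invoked as python3 wrap.py '<command>' as shown above: rewrite_command() and run_and_compress() are hypothetical names, the compress callable stands in for the engine, and merging stdout with stderr is an assumption. The 300-second timeout mirrors the wrap_timeout default.

```python
import shlex
import subprocess


def rewrite_command(command: str, wrapper: str = "wrap.py") -> str:
    """Build the command line that routes `command` through the wrapper."""
    # shlex.quote() makes the original command a single, inert shell argument.
    return f"python3 {shlex.quote(wrapper)} {shlex.quote(command)}"


def run_and_compress(command: str, compress, timeout: int = 300):
    """Execute the original command, then compress what it printed."""
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    raw = proc.stdout + proc.stderr
    try:
        output = compress(command, raw)
    except Exception:
        output = raw  # fail-open: on any compression error, return raw output
    return proc.returncode, output


print(rewrite_command("git status"))  # python3 wrap.py 'git status'
```

Because the whole command is quoted as one argument, shell metacharacters inside it are not interpreted until the wrapper runs the command itself.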

Gemini CLI (AfterTool hook):

1. Gemini executes the command
2. AfterTool hook receives the raw output
3. Compresses the output
4. Replaces it via {"decision": "deny", "reason": "<compressed output>"}

Gemini CLI allows direct output replacement through the deny/reason mechanism.
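
A minimal sketch of step 4, assuming the hook prints a JSON object of the shape shown above. The function name aftertool_response() and the choice to return None when nothing changed are illustrative assumptions, not the extension's actual behavior.

```python
import json


def aftertool_response(raw_output: str, compress):
    """Return the JSON payload that replaces the tool output, or None."""
    compressed = compress(raw_output)
    if compressed == raw_output:
        return None  # assumed: emit nothing and let the raw output through
    return json.dumps({"decision": "deny", "reason": compressed})
```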

Precision Guarantees

  • Short outputs (< 200 characters) are never modified
  • Compression is only applied if the gain exceeds 10%
  • All errors, stack traces, and actionable information are fully preserved
  • Only "noise" is removed: progress bars, passing tests, installation logs, ANSI codes, platform lines
  • 478 unit tests including precision-specific tests that verify every critical piece of data survives compression
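
The first two guarantees amount to a simple gate in front of the compressed result. The sketch below assumes that sizes are measured in characters and that the gain must be strictly greater than the threshold; apply_guarantees() is a hypothetical helper name.

```python
def apply_guarantees(original: str, compressed: str,
                     min_input_length: int = 200,
                     min_ratio: float = 0.10) -> str:
    """Return `compressed` only when both precision guarantees allow it."""
    if len(original) < min_input_length:
        return original  # short outputs are never modified
    gain = 1 - len(compressed) / len(original)
    return compressed if gain > min_ratio else original
```

For example, a 1000-character output compressed to 950 characters (5% gain) is returned unchanged, while one compressed to 500 characters is accepted.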

Installation

Prerequisites

  • Python 3.10+
  • Claude Code and/or Gemini CLI

Method 1: Claude Code Plugin (recommended)

From the self-hosted marketplace:

/plugin marketplace add ppgranger/token-saver
/plugin install token-saver

Or test directly from a local clone:

git clone https://github.com/ppgranger/token-saver.git
claude --plugin-dir ./token-saver

Method 2: Manual installation

git clone https://github.com/ppgranger/token-saver.git
cd token-saver
python3 install.py --target claude    # Claude Code only
python3 install.py --target gemini    # Gemini CLI only
python3 install.py --target both      # Both platforms

The manual installer registers token-saver as a native Claude Code plugin (equivalent to /plugin install). The plugin appears in /plugin list, and its hooks, skills, and commands are managed natively by Claude Code.

The repo/zip can be deleted after installation. Token-Saver copies everything it needs to ~/.token-saver/ and the platform plugin directories.

Development Mode

python3 install.py --target claude --link   # Symlinks instead of copies

Changes in the source directory are immediately applied. Do not delete the repo in this mode.

Uninstall

python3 install.py --uninstall              # Remove from all platforms
python3 install.py --uninstall --keep-data  # Keep stats DB

Updating

Plugin install: Claude Code handles updates automatically when you refresh the marketplace.

Manual install: Run token-saver update from anywhere, or:

cd token-saver && git pull && python3 install.py --target claude

GitHub releases: Both methods check for new releases via the GitHub API. The token-saver update CLI command and the SessionStart hook notification work regardless of install method.

Upgrading from v1.x to v2.0

If you previously installed token-saver v1.x:

cd token-saver
git pull
python3 install.py --target claude

The installer automatically:

  • Removes legacy hooks from ~/.claude/settings.json (no longer needed)
  • Removes the old ~/.claude/plugins/token-saver/ directory
  • Installs to the plugin cache as a native Claude Code plugin
  • Registers in enabledPlugins and installed_plugins.json

You can also run token-saver update from anywhere to auto-upgrade.

Avoid dual installation

Do not install token-saver through both /plugin install and python3 install.py at the same time, as this could register the plugin twice. Use one method or the other.

To switch from manual to marketplace:

python3 install.py --uninstall --target claude
/plugin marketplace add ppgranger/token-saver
/plugin install token-saver

What the Installer Does

  1. Copies (or symlinks) files to:
    • Core: ~/.token-saver/ (CLI, updater, shared source)
    • Claude Code: ~/.claude/plugins/cache/token-saver-marketplace/token-saver/
    • Gemini CLI: ~/.gemini/extensions/token-saver/
  2. Registers as a native Claude Code plugin in installed_plugins.json and enabledPlugins
  3. Installs token-saver CLI to ~/.local/bin/
  4. Stamps the current version into plugin manifests
  5. Cleans up any legacy token-saving or v1.x installation

CLI

After installation, the token-saver command is available:

token-saver version              # Print current version
token-saver stats                # Show savings statistics
token-saver stats --json         # JSON output for scripting
token-saver update               # Check for and apply updates

If ~/.local/bin is not in your PATH, the installer prints instructions.

Processors

Each processor handles a family of commands. The first one that matches (can_handle()) processes the output. Detailed documentation for each processor is in docs/processors/.

| # | Processor | Priority | Commands | Docs |
|---|-----------|----------|----------|------|
| 1 | Package List | 15 | pip list/freeze, npm ls, conda list, gem list, brew list | package_list.md |
| 2 | Git | 20 | status, diff, log, show, push/pull/fetch, branch, stash, reflog, blame, cherry-pick, rebase, merge | git.md |
| 3 | Test | 21 | pytest, jest, vitest, mocha, cargo test, go test, rspec, phpunit, bun test, npm/yarn/pnpm test, dotnet test, swift test, mix test | test_output.md |
| 4 | Build | 25 | npm/yarn/pnpm build/install, cargo build, make, cmake, gradle, mvn, pip install, tsc, webpack, vite, next build, turbo, nx, bazel, sbt, mix compile, docker build | build_output.md |
| 5 | Lint | 27 | eslint, ruff, flake8, pylint, clippy, mypy, prettier, biome, shellcheck, hadolint, rubocop, golangci-lint | lint_output.md |
| 6 | Network | 30 | curl, wget, http/https (httpie) | network.md |
| 7 | Docker | 31 | ps, images, logs, pull/push, inspect, stats, compose up/down/build/ps/logs | docker.md |
| 8 | Kubernetes | 32 | kubectl/oc get, describe, logs, top, apply, delete, create | kubectl.md |
| 9 | Terraform | 33 | terraform/tofu plan, apply, destroy, init, output, state list/show | terraform.md |
| 10 | Environment | 34 | env, printenv (with secret redaction) | env.md |
| 11 | Search | 35 | grep -r, rg, ag, fd, fdfind | search.md |
| 12 | System Info | 36 | du, wc, df | system_info.md |
| 13 | GitHub CLI | 37 | gh pr/issue/run list/view/diff/checks/status | gh.md |
| 14 | Database Query | 38 | psql, mysql, sqlite3, pgcli, mycli, litecli | db_query.md |
| 15 | Cloud CLI | 39 | aws, gcloud, az (JSON/table/text output compression) | cloud_cli.md |
| 16 | File Listing | 50 | ls, find, tree, exa, eza | file_listing.md |
| 17 | File Content | 51 | cat, head, tail, bat, less, more (content-aware: code, config, log, CSV) | file_content.md |
| 18 | Generic | 999 | Any command (fallback: ANSI strip, dedup, truncation) | generic.md |
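
To make the Generic fallback concrete, here is a rough sketch of its three steps (ANSI strip, dedup, truncation). The defaults mirror generic_truncate_threshold, generic_keep_head and generic_keep_tail from the configuration; the function name generic_compress() and the wording of the omission marker are assumptions, not the processor's real output.

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def generic_compress(output: str, threshold: int = 500,
                     keep_head: int = 200, keep_tail: int = 100) -> str:
    """Strip ANSI codes, drop consecutive duplicates, truncate the middle."""
    lines = ANSI_RE.sub("", output).splitlines()

    deduped = []
    for line in lines:
        if not deduped or deduped[-1] != line:  # only *consecutive* repeats
            deduped.append(line)

    if len(deduped) > threshold:
        omitted = len(deduped) - keep_head - keep_tail
        marker = f"... ({omitted} lines omitted) ..."  # hypothetical marker
        deduped = deduped[:keep_head] + [marker] + deduped[-keep_tail:]

    return "\n".join(deduped)
```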

Configuration

Thresholds are configurable via JSON file or environment variables.

Configuration File

~/.token-saver/config.json:

{
  "enabled": true,
  "min_input_length": 200,
  "min_compression_ratio": 0.10,
  "max_diff_hunk_lines": 150,
  "max_log_entries": 20,
  "max_file_lines": 300,
  "generic_truncate_threshold": 500,
  "debug": false
}

Environment Variables

Every key can be overridden with the TOKEN_SAVER_ prefix:

export TOKEN_SAVER_MAX_LOG_ENTRIES=50
export TOKEN_SAVER_DEBUG=true

# Disable compression entirely (bypass mode)
export TOKEN_SAVER_ENABLED=false
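
One way to picture how the two sources combine: defaults first, then the JSON file, then TOKEN_SAVER_ environment variables. The sketch below is an assumption about the mechanics, not the contents of src/config.py; load_config(), the subset of keys shown, and the accepted boolean spellings are illustrative.

```python
import json
import os

# A few of the documented defaults; the real list is longer.
DEFAULTS = {
    "enabled": True,
    "min_input_length": 200,
    "min_compression_ratio": 0.10,
    "max_log_entries": 20,
    "debug": False,
}


def _coerce(raw: str, default):
    """Convert an environment string to the type of the default value."""
    if isinstance(default, bool):
        return raw.strip().lower() in ("1", "true", "yes", "on")
    return type(default)(raw)


def load_config(path=os.path.expanduser("~/.token-saver/config.json"),
                environ=os.environ):
    config = dict(DEFAULTS)
    try:
        with open(path, encoding="utf-8") as fh:
            loaded = json.load(fh)
        if isinstance(loaded, dict):
            config.update({k: v for k, v in loaded.items() if k in DEFAULTS})
    except (OSError, ValueError):
        pass  # missing or malformed file: keep defaults

    for key, default in DEFAULTS.items():
        raw = environ.get(f"TOKEN_SAVER_{key.upper()}")
        if raw is not None:
            try:
                config[key] = _coerce(raw, default)
            except ValueError:
                pass  # invalid value: keep the previous setting
    return config
```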

Complete Parameter List

| Parameter | Default | Description |
|-----------|---------|-------------|
| enabled | true | Master switch -- set to false to bypass all compression |
| min_input_length | 200 | Minimum threshold (characters) to attempt compression |
| min_compression_ratio | 0.10 | Minimum gain (10%) to apply compression |
| wrap_timeout | 300 | Wrapper timeout in seconds |
| max_diff_hunk_lines | 150 | Max lines per hunk in git diff |
| max_diff_context_lines | 3 | Context lines kept before/after each change in diffs |
| max_log_entries | 20 | Max entries in git log/reflog |
| max_file_lines | 300 | Threshold before file content compression kicks in |
| file_keep_head | 150 | Lines kept from the start of file (fallback strategy) |
| file_keep_tail | 50 | Lines kept from the end of file (fallback strategy) |
| file_code_head_lines | 20 | Import/header lines to preserve in code files |
| file_code_body_lines | 3 | Body lines kept per function/class definition |
| file_log_context_lines | 2 | Context lines around errors in log files |
| file_csv_head_rows | 5 | Data rows kept from start of CSV files |
| file_csv_tail_rows | 3 | Data rows kept from end of CSV files |
| generic_truncate_threshold | 500 | Generic truncation threshold |
| generic_keep_head | 200 | Lines kept from the start (generic) |
| generic_keep_tail | 100 | Lines kept from the end (generic) |
| ls_compact_threshold | 20 | Items before ls compaction |
| find_compact_threshold | 30 | Results before find compaction |
| tree_compact_threshold | 50 | Lines before tree truncation |
| lint_example_count | 2 | Examples shown per lint rule |
| lint_group_threshold | 3 | Occurrences before grouping by rule |
| search_max_per_file | 3 | Max match lines shown per file |
| search_max_files | 20 | Max files shown in search results |
| kubectl_keep_head | 10 | Lines kept from start of kubectl logs |
| kubectl_keep_tail | 20 | Lines kept from end of kubectl logs |
| docker_log_keep_head | 10 | Lines kept from start of docker logs |
| docker_log_keep_tail | 20 | Lines kept from end of docker logs |
| git_branch_threshold | 30 | Branches before compaction |
| git_stash_threshold | 10 | Stash entries before truncation |
| max_traceback_lines | 30 | Max traceback lines before truncation |
| db_prune_days | 90 | Stats retention in days |
| debug | false | Enable debug logging |

Savings Tracking

Token-Saver records every compression in a local SQLite database:

~/.token-saver/savings.db

Tables

  • savings: each individual compression (timestamp, command, processor, sizes, platform)
  • sessions: aggregated totals per session (first/last activity, total original/compressed, command count)

Automatic Stats

On every session start, the SessionStart hook displays a summary:

[token-saver] Lifetime: 342 cmds, 307.2k tokens saved (67.3%) | Session: 5 cmds, 11.3k tokens saved (72.1%)

If a newer version is available, the notification is appended:

[token-saver] Lifetime: 342 cmds, 307.2k tokens saved (67.3%) | Update available: v1.0.1 -> v1.1.0 -- Run: token-saver update

Manual Stats

token-saver stats
token-saver stats --json

Example output of token-saver stats:

Token-Saver Statistics
========================================

Session
----------------------------------------
  Commands compressed:  12
  Original tokens:      61.3k tokens
  Compressed tokens:    15.5k tokens
  Saved:                45.8k tokens (74.7%)

Lifetime
----------------------------------------
  Sessions:             47
  Commands compressed:  342
  Original tokens:      461.0k tokens
  Compressed tokens:    147.3k tokens
  Saved:                307.2k tokens (67.3%)

Top Processors
----------------------------------------
  git                    142 cmds, 121.8k tokens saved
  test                    89 cmds, 78.0k tokens saved
  build                   45 cmds, 49.7k tokens saved

Maintenance

  • Auto-pruning of records older than 90 days (configurable)
  • Automatic recovery on database corruption
  • Thread-safe (reentrant lock on all operations)
  • WAL mode for concurrent write performance
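
A minimal sketch of how pruning, locking and WAL mode can fit together, under assumptions: the Tracker class, its method names and the simplified single-table layout are hypothetical, and corruption recovery is left out. The 90-day default mirrors db_prune_days.

```python
import sqlite3
import threading
import time


class Tracker:
    """Illustrative tracker: one connection guarded by a reentrant lock."""

    def __init__(self, path: str, prune_days: int = 90):
        self._lock = threading.RLock()
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute("PRAGMA journal_mode=WAL")  # write-ahead logging
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS savings ("
            "timestamp REAL, original_size INTEGER, compressed_size INTEGER)"
        )
        self.prune_days = prune_days

    def record(self, original: int, compressed: int, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        with self._lock, self._conn:  # lock first, then one transaction
            self._conn.execute(
                "INSERT INTO savings VALUES (?, ?, ?)", (ts, original, compressed)
            )

    def prune(self, now=None) -> int:
        """Delete records older than prune_days; return how many were removed."""
        now = time.time() if now is None else now
        cutoff = now - self.prune_days * 86400
        with self._lock, self._conn:
            cur = self._conn.execute(
                "DELETE FROM savings WHERE timestamp < ?", (cutoff,)
            )
            return cur.rowcount

    def count(self) -> int:
        with self._lock:
            return self._conn.execute("SELECT COUNT(*) FROM savings").fetchone()[0]
```

A reentrant lock lets a method that already holds the lock call another locked method (for example, record() triggering prune()) without deadlocking.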

Security

  • No shell injection: commands are passed through shlex.quote() when rewriting
  • Fail-open: if the hook fails (Python error, missing file), the original command executes normally
  • No sensitive data: only sizes are stored, not output content
  • Secret redaction: the env processor automatically redacts values of variables matching *KEY*, *SECRET*, *TOKEN*, *PASSWORD*, *CREDENTIAL* patterns, preventing accidental leakage into AI context windows
  • Signal forwarding: the wrapper propagates SIGINT/SIGTERM to the child process
  • Exclusions: commands with complex pipes, redirections, sudo, editors, ssh are never intercepted
  • Safe trailing pipes: simple trailing pipes (| head, | tail, | wc, | grep, | sort, | uniq, | cut) are allowed
  • Chained commands: && and ; chains are supported — each segment is validated individually
  • Self-protection: commands containing token-saver or wrap.py are not intercepted (prevents recursion)
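
The secret-redaction rule can be illustrated with a short sketch. It assumes env-style NAME=value lines, case-insensitive matching on the variable name, and a "***" placeholder; redact_env() is a hypothetical name and the real env processor may differ on all three points.

```python
from fnmatch import fnmatchcase

SECRET_PATTERNS = ("*KEY*", "*SECRET*", "*TOKEN*", "*PASSWORD*", "*CREDENTIAL*")


def redact_env(output: str, placeholder: str = "***") -> str:
    """Replace the value of every variable whose name looks like a secret."""
    redacted = []
    for line in output.splitlines():
        name, sep, _value = line.partition("=")
        if sep and any(fnmatchcase(name.upper(), p) for p in SECRET_PATTERNS):
            line = f"{name}={placeholder}"
        redacted.append(line)
    return "\n".join(redacted)
```

The variable name is kept so the model can still see that the variable is set, while its value never reaches the context window.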

Project Structure

token-saver/
├── .claude-plugin/                  # Plugin metadata
│   ├── plugin.json                  # Plugin manifest
│   └── marketplace.json             # Marketplace catalog for distribution
├── hooks/                           # Native hook declarations
│   └── hooks.json                   # Claude Code reads this automatically
├── skills/                          # Agent skills
│   └── token-saver-config/
│       └── SKILL.md
├── commands/                        # Slash commands
│   └── token-saver-stats.md
├── scripts/                         # Python hook scripts
│   ├── __init__.py                  # Package init (prevents namespace conflicts)
│   ├── hook_pretool.py              # PreToolUse hook (Claude Code)
│   ├── wrap.py                      # CLI wrapper (Claude Code)
│   └── hook_session.py              # SessionStart hook wrapper
├── gemini/                          # Gemini CLI specific files
│   ├── gemini-extension.json        # Gemini extension metadata
│   ├── hooks.json                   # Gemini hook definitions
│   └── hook_aftertool.py            # AfterTool hook (Gemini CLI)
├── bin/                             # CLI executables
│   ├── token-saver                  # Unix CLI wrapper
│   └── token-saver.cmd              # Windows CLI wrapper
├── src/                             # Shared source code
│   ├── __init__.py                  # Version (__version__)
│   ├── chain_utils.py               # Chained command splitting (&&, ;)
│   ├── cli.py                       # CLI entry point (version/stats/update)
│   ├── config.py                    # Configuration system
│   ├── engine.py                    # Compression engine (orchestrator)
│   ├── hook_session.py              # SessionStart hook (stats + update notif)
│   ├── platforms.py                 # Platform detection + I/O abstraction
│   ├── stats.py                     # Stats display
│   ├── tracker.py                   # SQLite tracking
│   ├── version_check.py             # GitHub update check
│   └── processors/                  # 18 auto-discovered processors
│       ├── __init__.py
│       ├── base.py                  # Abstract Processor class
│       ├── utils.py                 # Shared utilities (diff compression)
│       ├── package_list.py          # pip list/freeze, npm ls, conda list
│       ├── git.py                   # git status/diff/log/show/blame/push/pull
│       ├── test_output.py           # pytest/jest/cargo/go/dotnet/swift/mix test
│       ├── build_output.py          # npm/cargo/make/webpack/tsc/turbo/nx/docker build
│       ├── lint_output.py           # eslint/ruff/pylint/clippy/mypy/shellcheck/hadolint
│       ├── network.py               # curl/wget/httpie
│       ├── docker.py                # docker ps/images/logs/inspect/stats/compose
│       ├── kubectl.py               # kubectl get/describe/logs/apply/delete/create
│       ├── terraform.py             # terraform plan/apply/init/output/state
│       ├── env.py                   # env/printenv (with secret redaction)
│       ├── search.py                # grep/rg/ag/fd/fdfind
│       ├── system_info.py           # du/wc/df
│       ├── gh.py                    # gh pr/issue/run list/view/diff/checks
│       ├── db_query.py              # psql/mysql/sqlite3/pgcli/mycli/litecli
│       ├── cloud_cli.py             # aws/gcloud/az
│       ├── file_listing.py          # ls/find/tree/exa/eza
│       ├── file_content.py          # cat/bat (content-aware compression)
│       └── generic.py               # Universal fallback
├── docs/
│   └── processors/                  # Per-processor documentation
│       ├── build_output.md
│       ├── cloud_cli.md
│       ├── db_query.md
│       ├── docker.md
│       ├── env.md
│       ├── file_content.md
│       ├── file_listing.md
│       ├── generic.md
│       ├── gh.md
│       ├── git.md
│       ├── kubectl.md
│       ├── lint_output.md
│       ├── network.md
│       ├── package_list.md
│       ├── search.md
│       ├── system_info.md
│       ├── terraform.md
│       └── test_output.md
├── installers/                      # Modular installer package
│   ├── common.py                    # Shared constants + utilities
│   ├── claude.py                    # Claude Code installer (native plugin registration)
│   └── gemini.py                    # Gemini CLI installer
├── install.py                       # Installer entry point
├── CLAUDE.md                        # Plugin instructions
├── tests/
│   ├── test_engine.py               # Engine + registry tests (28)
│   ├── test_processors.py           # Per-processor tests (263)
│   ├── test_hooks.py                # Hook pattern + integration tests (77)
│   ├── test_precision.py            # Precision preservation tests (44)
│   ├── test_tracker.py              # SQLite + concurrency tests (20)
│   ├── test_config.py               # Configuration tests (6)
│   ├── test_version_check.py        # Version check + fail-open tests (12)
│   ├── test_cli.py                  # CLI subcommand tests (7)
│   └── test_installers.py           # Installer utility tests (21)
├── audit_compression.py             # Deep audit tool for compression analysis
├── pyproject.toml                   # Python project config + Ruff rules
├── CONTRIBUTING.md                  # Developer guide
├── LICENSE                          # Apache 2.0
└── README.md

Tests

python3 -m pytest tests/ -v

478 tests covering:

  • test_engine.py (28 tests): compression thresholds, processor priority, ANSI cleanup, generic fallback, hook pattern coverage for 73 commands
  • test_processors.py (263 tests): each processor with nominal and edge cases, chained command routing, all subcommands (blame, inspect, stats, compose, apply/delete, init/output/state, fd, exa, httpie, dotnet/swift/mix test, shellcheck/hadolint/biome, traceback truncation)
  • test_hooks.py (77 tests): matching patterns for all supported commands, exclusions (pipes, sudo, editors, redirections), subprocess integration, global options (git, docker, kubectl), chained commands, safe trailing pipes
  • test_precision.py (44 tests): verification that every critical piece of data survives compression (filenames, hashes, error messages, stack traces, line numbers, rule IDs, diff changes, warning types, secret redaction, unhealthy pods, terraform changes, unmet dependencies)
  • test_tracker.py (20 tests): CRUD, concurrency (4 threads), corruption recovery, session tracking, stats CLI
  • test_config.py (6 tests): defaults, env overrides, invalid values
  • test_version_check.py (12 tests): version parsing, comparison, fail-open on errors
  • test_cli.py (7 tests): version/stats/help subcommands, bin script execution
  • test_installers.py (21 tests): version stamping, legacy migration, CLI install/uninstall

Debugging

To diagnose issues:

# Test compression on a command without replacing the output
python3 scripts/wrap.py --dry-run 'git status'

# Enable debug logging
export TOKEN_SAVER_DEBUG=true

# Check stats
token-saver stats

# Check version
token-saver version

Known Limitations

  • Does not compress commands with complex pipelines, redirections (> file), or || chains
  • Simple trailing pipes are supported (| head, | tail, | wc, | grep, | sort, | uniq, | cut)
  • Chained commands (&&, ;) are supported — each segment is validated individually
  • sudo, ssh, vim commands are never intercepted
  • Long diff compression truncates per-hunk, not per-file: a diff with many small hunks is not reduced
  • The generic processor only deduplicates consecutive identical lines, not similar lines
  • Gemini CLI: the deny/reason mechanism may have side effects if other extensions use the same hook
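
The interception rules above can be approximated in a few lines. This is a deliberately naive sketch, not the hook's real matcher: it ignores shell quoting, allows at most one trailing pipe, and all function names are hypothetical.

```python
import re

SAFE_PIPE_TARGETS = {"head", "tail", "wc", "grep", "sort", "uniq", "cut"}
NEVER_INTERCEPT = {"sudo", "ssh", "vim"}


def split_chain(command: str):
    """Split on && and ; (naive: does not understand quoted separators)."""
    return [seg.strip() for seg in re.split(r"&&|;", command) if seg.strip()]


def segment_ok(segment: str) -> bool:
    if ">" in segment or "<" in segment:
        return False  # redirections are excluded
    stages = [s.strip() for s in segment.split("|")]
    if any(not s for s in stages) or len(stages) > 2:
        return False  # malformed or complex pipeline
    if stages[0].split()[0] in NEVER_INTERCEPT:
        return False
    if len(stages) == 2 and stages[1].split()[0] not in SAFE_PIPE_TARGETS:
        return False  # only simple trailing pipes are allowed
    return True


def should_intercept(command: str) -> bool:
    if "token-saver" in command or "wrap.py" in command:
        return False  # self-protection against recursion
    if "||" in command:
        return False  # || chains are not supported
    segments = split_chain(command)
    return bool(segments) and all(segment_ok(s) for s in segments)
```

Erring on the side of not intercepting is the safe default here: a skipped command simply runs uncompressed.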

About

Content-aware output compression for AI coding assistants. Replaces blind truncation with intelligent strategies per file type: structural summaries for code, schema extraction for configs, error-focused filtering for logs, and smart sampling for CSVs. Saves tokens while preserving what the model actually needs.
