One extension to rule your tokens.
Token counter · Usage bars · Cache timer · Caveman mode · Context compressor
TokenCave is a browser extension for claude.ai that combines two things:
- Token intelligence — see exactly how much context you're using, when it expires, and how close you are to limits
- Caveman mode — cut your output tokens by ~75% by making Claude answer with radical brevity, without losing technical accuracy
An approximate token count for your current conversation, with a mini progress bar against Claude's 200k context limit. Counting uses the bundled o200k_base tokenizer.
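In the browser, the count comes from the bundled o200k_base tokenizer. For a rough offline estimate without any tokenizer, a characters-per-token heuristic gets close for English prose — a sketch only, not the extension's actual code:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: English prose averages about 4 characters
    per token under BPE tokenizers like o200k_base. Ballpark only; the
    extension counts exactly with the bundled tokenizer."""
    return max(1, round(len(text) / 4))

print(estimate_tokens("How do I reverse a linked list in Python?"))  # → 10
```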
Live countdown showing how long your conversation remains cached. Cached messages cost significantly less — this timer tells you when to keep going vs. when to start fresh.
Session (5-hour) and weekly (7-day) usage with progress bars and reset countdowns. More accurate than Claude's /usage page because it reads exact utilization fractions from the SSE stream, not rounded percentages.
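The idea of reading exact fractions rather than rounded percentages can be sketched as follows. The JSON shape and field names here are illustrative, not Claude's actual SSE schema:

```python
import json

def parse_usage_event(sse_line: str) -> dict:
    """Extract utilization fractions from an SSE 'data:' line and
    convert them to percentages. Field names are hypothetical."""
    payload = json.loads(sse_line.removeprefix("data: "))
    return {window: round(info["utilization"] * 100, 1)
            for window, info in payload["usage"].items()}

line = 'data: {"usage": {"five_hour": {"utilization": 0.3125}, "seven_day": {"utilization": 0.62}}}'
print(parse_usage_event(line))  # {'five_hour': 31.2, 'seven_day': 62.0}
```

A rounded "31%" from a UI hides whether you are at 30.5% or 31.4%; the raw fraction does not.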
Make Claude respond like a brilliant, extremely terse caveman. Same technical substance — roughly 75% fewer output tokens.
| Level | Example |
|---|---|
| 🪶 Lite | Drop filler and hedging, keep full sentences |
| 🪨 Full | Drop articles, use fragments, short synonyms |
| 🔥 Ultra | Abbreviate everything, arrows for causality (X→Y), one word when sufficient |
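To make the Lite level concrete, here is a toy approximation that drops filler words while keeping full sentences. The real mode works by instructing Claude, not by post-processing text, and this word list is illustrative:

```python
# Illustrative stand-in for the Lite level: drop filler and hedging,
# keep full sentences. Word list is hypothetical, not the extension's.
FILLER = {"basically", "actually", "just", "really", "very", "quite",
          "perhaps", "arguably", "essentially"}

def lite_compress(sentence: str) -> str:
    words = [w for w in sentence.split() if w.lower().strip(",.") not in FILLER]
    return " ".join(words)

print(lite_compress("You basically just need to really check the null case."))
# → "You need to check the null case."
```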
Commands (type in chat):

```text
/cave         → toggle on/off
/cave lite    → switch to lite mode
/cave full    → switch to full mode
/cave ultra   → switch to ultra mode
/cave off     → disable
/cave status  → show current state
```
Or use the extension popup to click between levels.
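The command grammar above is small enough to sketch as a pure state transition. This parser is hypothetical — the extension's actual implementation lives in `src/content/caveman.js`:

```python
# Hypothetical parser mirroring the /cave commands listed above.
LEVELS = {"lite", "full", "ultra"}

def parse_cave_command(text: str, state: dict) -> dict:
    parts = text.strip().lower().split()
    if not parts or parts[0] != "/cave":
        return state                          # not a caveman command
    arg = parts[1] if len(parts) > 1 else None
    if arg is None:                           # bare /cave toggles
        return {**state, "enabled": not state["enabled"]}
    if arg == "off":
        return {**state, "enabled": False}
    if arg in LEVELS:                         # switching a level also enables
        return {**state, "enabled": True, "level": arg}
    return state                              # /cave status: no state change

state = {"enabled": False, "level": "full"}
print(parse_cave_command("/cave ultra", state))  # {'enabled': True, 'level': 'ultra'}
```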
Compress your CLAUDE.md, project notes, and documentation files to cut input tokens by ~46% — without losing meaning. Runs locally via Python.
```bash
pip install -r requirements.txt
python -m skills.compress.scripts path/to/CLAUDE.md
```

To install the extension in Chrome:

- Download the latest release ZIP (or clone this repo)
- Go to `chrome://extensions` and enable Developer mode
- Drag and drop the ZIP onto the page (or click Load unpacked)

In Firefox:

- Go to `about:debugging#/runtime/this-firefox`
- Click Load Temporary Add-on → select `manifest.json`
Same fix. ~75% fewer tokens. Brain still big.
- ✅ All data stays in your browser — no external servers, no tracking
- ✅ Only communicates with `claude.ai`
- ✅ Reads the `lastActiveOrg` cookie only to query Claude's own `/usage` endpoint
- ✅ `postMessage` restricted to `window.location.origin` (not `'*'`)
- ✅ Caveman state stored in `sessionStorage` — clears when you close the tab
- ✅ Compression scripts refuse to process secrets, `.env` files, keys, or credentials
See SECURITY.md for full details.
When caveman mode is active, TokenCave prepends a bracketed instruction to your message before it's sent:
```text
[RESPONSE STYLE: full caveman — drop articles, fragments OK, short synonyms…]
Your actual question here
```
Claude follows this instruction and responds tersely to that message. The prefix is invisible in the UI — you just see the compressed response.
The instruction is NOT stored anywhere, NOT sent to any server, and clears when you turn caveman mode off.
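The mechanic is simple enough to sketch. The prefix wording below mirrors the example above but is illustrative, not the extension's exact instruction text:

```python
# Sketch of the prefix mechanic; wording is illustrative.
STYLE_PREFIXES = {
    "lite":  "lite caveman — drop filler and hedging, keep full sentences",
    "full":  "full caveman — drop articles, fragments OK, short synonyms",
    "ultra": "ultra caveman — abbreviate everything, arrows for causality",
}

def wrap_message(message: str, level: str) -> str:
    """Prepend the style instruction; the return value is what gets sent."""
    return f"[RESPONSE STYLE: {STYLE_PREFIXES[level]}]\n{message}"

print(wrap_message("Why is my Docker build slow?", "full"))
```

The wrapped string exists only in the outgoing request, matching the no-storage behavior described above.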
The compression CLI (skills/compress/scripts/) uses Claude's API to:
- Detect whether a file is natural language (compressible) vs. code/config (skip)
- Refuse to process any file that looks like it contains secrets or credentials
- Compress natural-language files to ~50% of original tokens
- Validate the output preserves all technical content
- Retry up to 2 times if quality check fails
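The compress → validate → retry loop above can be sketched as a small driver. `compress_fn` and `validate_fn` stand in for the CLI's Claude API calls; both names are illustrative:

```python
# Minimal sketch of the retry loop; the real CLI calls Claude's API
# for both compression and validation.
def compress_with_retries(text, compress_fn, validate_fn, max_retries=2):
    attempt = compress_fn(text)
    for _ in range(max_retries):
        if validate_fn(text, attempt):
            return attempt
        attempt = compress_fn(text)       # retry after a failed quality check
    # Fall back to the original text if every attempt fails validation
    return attempt if validate_fn(text, attempt) else text

# Toy stand-ins: "compress" by squeezing whitespace, "validate" by shrinkage
squeeze = lambda t: " ".join(t.split())
shrank = lambda orig, out: len(out) <= len(orig)
print(compress_with_retries("lots   of    spaces", squeeze, shrank))  # → "lots of spaces"
```

Falling back to the original file on repeated failure means a bad compression pass can never silently destroy content.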
Before: CLAUDE.md — 8,200 tokens
After: CLAUDE.md — 4,400 tokens (~46% reduction)
```text
TokenCave/
├── manifest.json            # Extension manifest (MV3)
├── popup.html               # Extension popup UI
├── src/
│   ├── content/
│   │   ├── constants.js     # Shared constants & colors
│   │   ├── bridge-client.js
│   │   ├── tokens.js        # Token counting & caching
│   │   ├── caveman.js       # 🆕 Caveman mode manager
│   │   ├── ui.js            # UI components & DOM injection
│   │   └── main.js          # Orchestrator
│   ├── injected/
│   │   └── bridge.js        # Page-world fetch interceptor
│   ├── styles.css
│   └── vendor/
│       └── o200k_base.js    # Bundled tokenizer
├── icons/                   # Extension icons
├── skills/
│   ├── caveman/SKILL.md     # AI coding assistant skill (Claude Code / Cursor)
│   └── compress/
│       ├── SKILL.md
│       └── scripts/         # Python compression CLI
├── requirements.txt
├── SECURITY.md
├── CONTRIBUTING.md
└── LICENSE
```
- Token counting via gpt-tokenizer (MIT)
- Caveman speech compression concept — inspired by the viral LLM observation that terse prompts preserve accuracy