⚡ Defluff — Token Minimizer

Cut LLM token costs by 15–55% before your prompts ever leave your machine.

Defluff works three ways — pick any or all:

Method	Best for
Chrome Extension	Compressing prompts in ChatGPT, Claude.ai, Gemini, Perplexity & more
Claude Code Hooks	Automatic compression inside every Claude Code conversation
Python CLI / SDK	Pipelines, scripts, API cost reduction in your own apps

1. Chrome Extension

Install the Extension

Requires: Google Chrome, Brave, Arc, or any Chromium-based browser.

Step 1 — Download / clone this repo

git clone https://github.com/subrahmanyabhat/defluff.git
# or just download the ZIP and unzip it

Step 2 — Open Chrome Extensions

Paste this in your address bar and press Enter:

chrome://extensions

Step 3 — Enable Developer Mode

Toggle Developer mode ON in the top-right corner.

Step 4 — Load the extension

Click Load unpacked → navigate to the defluff folder (the one with manifest.json) → click Select Folder.

The ⚡ Defluff icon appears in your Chrome toolbar. Done.

Auto-update: Because this is a local extension, it won't auto-update from the Chrome Web Store. Pull the repo and reload the extension to get updates.

Using the Extension

Floating badge

When you open any supported LLM chat site, a small badge appears near the input box:

⚡ 1,234 tokens  [Minimize ↓]  📋

Element	What it does
⚡ 1,234 tokens	Live token count as you type
Minimize ↓	Compress your prompt in place
📋	Copy minimized version to clipboard (without changing the input)

After minimizing:

⚡ 847 tokens  -31%  [↩ Restore]  📋

Click ↩ Restore to undo and get your original text back.

Diff preview (default: ON)

When you click Minimize, a preview panel shows before/after so you can confirm before applying:

┌─────────────────────────────────────────┐
│  Preview changes                        │
│                                         │
│  Before                                 │
│  I would like you to please explain in  │
│  order to understand...                 │
│                                         │
│  After                                  │
│  Please explain...                      │
│                                         │
│  Tokens: 1,234 → 847  Saved: 387 (31%) │
│                                         │
│  [Apply ↓]  [Cancel]                   │
└─────────────────────────────────────────┘
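
The summary line at the bottom of the preview is plain arithmetic on the two token counts. A minimal Python sketch, assuming both counts are already known; the function name `savings_summary` is hypothetical and not part of Defluff:

```python
def savings_summary(tokens_before: int, tokens_after: int) -> str:
    """Format a 'Tokens: a → b  Saved: n (p%)' line like the preview panel."""
    saved = tokens_before - tokens_after
    # Guard against division by zero for an empty prompt.
    percent = round(100 * saved / tokens_before) if tokens_before else 0
    return f"Tokens: {tokens_before:,} → {tokens_after:,}  Saved: {saved:,} ({percent}%)"


print(savings_summary(1234, 847))
# → Tokens: 1,234 → 847  Saved: 387 (31%)
```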

Popup stats

Click the ⚡ icon in the toolbar to see:

Session tokens saved + cost saved
All-time tokens saved + cost saved
Average compression ratio
Quick controls for level and model

Settings

Click ⚙ Options in the popup to open the full settings page.

Setting	Default	Description
Extension enabled	ON	Global on/off switch
Show live token counter	ON	Show the floating badge
Show diff before applying	ON	Preview before/after
Auto-minimize on send	OFF	Compress automatically when you press Enter
Compression level	Medium	Light / Medium / Aggressive
Model	Claude Sonnet 4	Used for cost calculation
Custom price	—	USD per 1M tokens (if your model isn't listed)
Excluded sites	—	Hostnames where Defluff should not run

Live preview — paste any prompt into the Options page to see exactly what gets compressed before using it in a real chat.

2. Claude Code Hooks

Defluff includes three Claude Code hooks. Each runs automatically — no clicks needed.

Hook	Event	What it does
`claude_hook.py`	`UserPromptSubmit`	Compresses your message before Claude reads it
`claude_code_hook.py`	`PreToolUse`	Compresses `prompt`/`content` fields in tool inputs
`post_tool_hook.py`	`PostToolUse`	Compresses large tool outputs (Bash, Grep, MCP) before Claude reads them

Install All Hooks

Option A — Automated installer (recommended)

cd defluff
chmod +x hooks/install.sh
bash hooks/install.sh

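This copies the standalone hook, hooks/defluff_hook.py (zero dependencies), to ~/.claude/ and registers it in ~/.claude/settings.json. Once installed, the same script can test, inspect, configure, and remove the hook: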

# Test the hook with a sample prompt
bash hooks/install.sh --test

# Test with your own text
bash hooks/install.sh --test "I would like you to please explain in order to understand..."

# Check installation status + all-time stats
bash hooks/install.sh --status

# Change compression level
bash hooks/install.sh --config level=aggressive

# Remove the hook
bash hooks/install.sh --uninstall

Option B — Python package hooks (more powerful)

These hooks use the full defluff pipeline (deduplication, JSON minification, URL compression, etc.) and require the Python package to be installed.

pip install -e ".[dev]"   # install in editable mode from this repo

Then add all three hooks to ~/.claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/claude_hook.py"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/claude_code_hook.py"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/post_tool_hook.py"
          }
        ]
      }
    ]
  }
}

Replace /absolute/path/to/defluff with the actual path (find it with pwd inside the repo).
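
If you would rather not edit the paths by hand, the block can be generated. The following is a sketch only: it assumes the repo layout shown in this README, and the helper name `build_hooks_block` is made up for illustration. It prints the JSON and leaves merging into an existing settings.json to you.

```python
import json
from pathlib import Path

# Hook events and script names as listed in this README.
HOOK_SCRIPTS = {
    "UserPromptSubmit": "claude_hook.py",
    "PreToolUse": "claude_code_hook.py",
    "PostToolUse": "post_tool_hook.py",
}


def build_hooks_block(repo_root: Path) -> dict:
    """Return the "hooks" settings block with absolute script paths."""
    integrations = repo_root.resolve() / "src" / "defluff" / "integrations"
    return {
        "hooks": {
            event: [
                {
                    "matcher": "",
                    "hooks": [
                        {"type": "command", "command": f"python3 {integrations / script}"}
                    ],
                }
            ]
            for event, script in HOOK_SCRIPTS.items()
        }
    }


if __name__ == "__main__":
    # Run from inside the repo, then paste the output into ~/.claude/settings.json.
    print(json.dumps(build_hooks_block(Path.cwd()), indent=2))
```

Paths containing spaces are not quoted in this sketch, so check the generated command strings before using them.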

UserPromptSubmit Hook (compresses your messages)

What it compresses:

Before: "I would like you to please explain, due to the fact that I am
         confused, how in order to write a sort function. Please note
         that I am basically a beginner."

After:  "Please explain how to write a sort function. I'm a beginner."

Output in Claude Code UI (stderr):

⚡ defluff  65 → 50 tokens  ↓23.1%  saved 15  [medium]

Configure (src/defluff/integrations/claude_hook.py top of file):

PRESET  = "medium"            # "light" | "medium" | "aggressive"
MODEL   = "claude-sonnet-4-6" # used for stats only
MIN_TOKENS = 30               # skip if prompt is shorter than this
SHOW_STATS = True             # show savings in the UI
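
To make the role of these settings concrete, here is a rough sketch of how MIN_TOKENS and SHOW_STATS could gate a compression step. It is not the code in claude_hook.py: `maybe_compress` is a hypothetical name, the `compress` and `count_tokens` callables are supplied by the caller, and keeping the original when nothing is saved is an assumption of this sketch.

```python
import sys
from typing import Callable

PRESET = "medium"
MIN_TOKENS = 30
SHOW_STATS = True


def maybe_compress(
    prompt: str,
    compress: Callable[[str], str],
    count_tokens: Callable[[str], int],
) -> str:
    """Compress `prompt` unless it is shorter than MIN_TOKENS."""
    before = count_tokens(prompt)
    if before < MIN_TOKENS:
        return prompt  # too short to be worth compressing
    compressed = compress(prompt)
    after = count_tokens(compressed)
    if after >= before:
        return prompt  # assumption: keep the original when nothing is saved
    if SHOW_STATS:
        saved = before - after
        pct = 100 * saved / before
        print(
            f"⚡ defluff  {before} → {after} tokens  ↓{pct:.1f}%  saved {saved}  [{PRESET}]",
            file=sys.stderr,
        )
    return compressed
```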

PreToolUse Hook (compresses tool inputs)

Compresses large prompt, content, message, text, and query fields passed into tools.

Safe by design — never touches:

Read, Write, Edit, MultiEdit, NotebookRead, NotebookEdit inputs (Claude needs exact content to produce correct file edits)

Configure (src/defluff/integrations/claude_code_hook.py):

PRESET     = "medium"
MIN_TOKENS = 50      # skip short values
VERBOSE    = False   # set True to log per-field savings
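
The field-level behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of claude_code_hook.py: the function names are hypothetical, the `compress` callable is supplied by the caller, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
from typing import Callable

# Tools whose inputs must stay exact (from the list above).
PROTECTED_TOOLS = {"Read", "Write", "Edit", "MultiEdit", "NotebookRead", "NotebookEdit"}
# Free-text fields that are candidates for compression.
TEXT_FIELDS = ("prompt", "content", "message", "text", "query")
MIN_TOKENS = 50


def rough_token_count(text: str) -> int:
    """Crude estimate (assumption): about 4 characters per token."""
    return len(text) // 4


def compress_tool_input(
    tool_name: str, tool_input: dict, compress: Callable[[str], str]
) -> dict:
    """Return a copy of `tool_input` with long text fields compressed."""
    if tool_name in PROTECTED_TOOLS:
        return tool_input  # exact content matters for file edits
    result = dict(tool_input)
    for field in TEXT_FIELDS:
        value = result.get(field)
        if isinstance(value, str) and rough_token_count(value) >= MIN_TOKENS:
            result[field] = compress(value)
    return result
```

Non-string values and short strings pass through unchanged, so a tool that also takes numeric or boolean arguments is unaffected.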

PostToolUse Hook (compresses tool outputs)

Compresses large tool outputs before Claude reads them — the biggest savings opportunity since tool outputs can be huge.

Supported tools:

Tool	Compression
`mcp__filesystem__*`	✅ Full output replacement
`mcp__fetch__*`	✅ Full output replacement
`mcp__brave_search__*`	✅ Full output replacement
`mcp__github__*`	✅ Full output replacement
`mcp__postgres__*`	✅ Full output replacement
`mcp__*__*` (any MCP)	✅ Full output replacement
`Bash`	✅ stdout compressed
`Grep`	✅ output compressed
`WebFetch`	✅ content compressed
`WebSearch`	✅ results compressed
`Glob`, `LS`	✅ output compressed
`Read`, `Write`, `Edit`	❌ Never touched

Configure (src/defluff/integrations/post_tool_hook.py):

PRESET     = "tool_output"   # conservative preset safe for code/JSON
MIN_TOKENS = 100             # only compress outputs larger than this
SHOW_STATS = True
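
One way to express the table above in code is a glob match on the tool name plus a size check. The sketch below uses Python's standard `fnmatch` module; the function name `should_compress` and the pattern tuples are assumptions for illustration and may not match how post_tool_hook.py decides.

```python
from fnmatch import fnmatchcase

# Patterns taken from the table above; MCP tool names share the "mcp__" prefix.
COMPRESSIBLE = ("mcp__*", "Bash", "Grep", "WebFetch", "WebSearch", "Glob", "LS")
NEVER_TOUCH = ("Read", "Write", "Edit")
MIN_TOKENS = 100


def should_compress(tool_name: str, output_tokens: int) -> bool:
    """Decide whether a tool's output is eligible for compression."""
    if tool_name in NEVER_TOUCH:
        return False
    if output_tokens <= MIN_TOKENS:
        return False  # only outputs larger than MIN_TOKENS are compressed
    return any(fnmatchcase(tool_name, pattern) for pattern in COMPRESSIBLE)
```

Unknown tools fall through to False, which keeps the default conservative.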

Manual settings.json setup

If you prefer to edit ~/.claude/settings.json yourself:

# Find your absolute path
cd /path/to/defluff && pwd

Open ~/.claude/settings.json (create it if it doesn't exist) and add the hooks block. Example with all three hooks:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/claude_hook.py"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/claude_code_hook.py"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/post_tool_hook.py"
          }
        ]
      }
    ]
  }
}

Restart Claude Code after saving.

3. Python CLI

Install the CLI

From this repo (development):

pip install -e .

With optional ML tokenizers:

pip install -e ".[ml]"    # adds transformers + torch for HuggingFace tokenizers

Verify:

defluff --help

Commands

defluff compress      Compress text to reduce token usage
defluff count         Count tokens in text
defluff diff          Show diff between two files
defluff list-models   List supported models and pricing
defluff list-strategies  List all compression strategies
defluff stats         Show your all-time savings
defluff share         Generate a shareable savings summary

CLI Examples

Compress a string:

defluff compress --text "I would like you to please explain in order to understand why this is important"

# Output:
# Please explain why this is important
# Before: 18 tokens | After: 7 tokens | Saved: 11 (61%)

Compress a file:

defluff compress prompt.txt
defluff compress prompt.txt --output compressed.txt
defluff compress prompt.txt --preset aggressive

Compress from stdin (pipe):

echo "Please note that in order to do this you should..." | defluff compress -
cat big_prompt.txt | defluff compress - --preset light

Compress and show diff:

defluff compress prompt.txt --diff
defluff compress prompt.txt --diff --diff-format side-by-side

Choose compression preset:

defluff compress prompt.txt --preset light       # safe, whitespace only
defluff compress prompt.txt --preset medium      # default, removes filler phrases
defluff compress prompt.txt --preset aggressive  # maximum compression

Count tokens only (no compression):

defluff count --text "How many tokens is this sentence?"
defluff count prompt.txt --model claude-sonnet-4-6
defluff count prompt.txt --tokenizer anthropic

See all model prices:

defluff list-models

See all compression strategies:

defluff list-strategies

View your savings history:

defluff stats
defluff share    # generate a shareable text summary

JSON output (for scripts):

defluff compress prompt.txt --output-json
# Outputs:
# {
#   "original_tokens": 45,
#   "compressed_tokens": 28,
#   "tokens_saved": 17,
#   "ratio": 37.8,
#   "compressed_text": "..."
# }
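
In a script, the JSON output can be consumed with the standard library. A minimal sketch, assuming the CLI prints only the JSON object shown above to stdout; `parse_result` and `compress_file` are hypothetical helper names:

```python
import json
import subprocess


def parse_result(raw_json: str) -> dict:
    """Parse the JSON printed by `defluff compress --output-json`."""
    data = json.loads(raw_json)
    # Sanity check: the reported saving should match the two counts.
    expected = data["original_tokens"] - data["compressed_tokens"]
    if data["tokens_saved"] != expected:
        raise ValueError(f"tokens_saved is {data['tokens_saved']}, expected {expected}")
    return data


def compress_file(path: str) -> dict:
    """Run the CLI on `path` and return the parsed result."""
    completed = subprocess.run(
        ["defluff", "compress", path, "--output-json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_result(completed.stdout)
```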

4. Python SDK

Use defluff in your own Python code:

from defluff.pipeline import build_preset

# Build a pipeline
pipeline = build_preset("medium")  # "light" | "medium" | "aggressive" | "tool_output"

# Compress text
result = pipeline.run("I would like you to please explain in order to understand this topic.")

print(result.compressed_text)
# → "Please explain this topic."

print(result.original_text)
# → "I would like you to please explain in order to understand this topic."

# See step-by-step what each compressor did
for step in result.steps:
    print(f"{step.name}: {len(step.text_before)} → {len(step.text_after)} chars")

Count tokens accurately:

from defluff.tokenizers import AnthropicCounter

counter = AnthropicCounter(model="claude-sonnet-4-6")
tokens = counter.count("Your prompt text here")
print(tokens)  # e.g. 5

Available tokenizers:

from defluff.tokenizers import AnthropicCounter  # uses Anthropic's token counting
# tiktoken counter also available (for OpenAI models)

Track savings:

from defluff.stats import record, summarize

# Record a compression
record(
    tokens_before=45,
    tokens_after=28,
    model="claude-sonnet-4-6",
    preset="medium",
    source="my-app",
)

# Get summary
summary = summarize()
print(f"Total saved: {summary['total_tokens_saved']:,} tokens")
print(f"Avg ratio:   {summary['avg_ratio']}%")
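
For readers curious how a JSON Lines tracker of this kind can work, here is a standalone sketch. It is not the implementation in stats.py: the helper names, the explicit `path` argument, and the choice to report the overall ratio (total saved divided by total before) are all assumptions.

```python
import json
from pathlib import Path


def append_record(path: Path, tokens_before: int, tokens_after: int, **meta: str) -> None:
    """Append one compression event as a single JSON line."""
    entry = {"tokens_before": tokens_before, "tokens_after": tokens_after, **meta}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def summarize_file(path: Path) -> dict:
    """Aggregate all recorded events into totals."""
    before = after = 0
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():  # skip blank lines
                entry = json.loads(line)
                before += entry["tokens_before"]
                after += entry["tokens_after"]
    saved = before - after
    ratio = round(100 * saved / before, 1) if before else 0.0
    return {"total_tokens_saved": saved, "avg_ratio": ratio}
```

Appending one line per event means a crash mid-write can damage at most the last record, which is the usual reason to pick JSON Lines over a single JSON document.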

Integration with Anthropic SDK:

import anthropic
from defluff.pipeline import build_preset

client = anthropic.Anthropic()
pipeline = build_preset("medium")

def chat(user_message: str) -> str:
    # Compress before sending
    result = pipeline.run(user_message)
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": result.compressed_text}]
    )
    return response.content[0].text

print(chat("I would like you to please explain in order to understand how sorting works."))

Compression Levels

Light (5–15% savings)

Normalize line endings
Remove trailing whitespace per line
Collapse multiple blank lines → max 2
Collapse multiple spaces → single space
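
The four steps above map onto a few regular expressions. This is a sketch rather than Defluff's own code: `light_compress` is a hypothetical name, "max 2" is read as at most two consecutive blank lines, and leaving leading indentation untouched is a choice made here to keep code and lists intact.

```python
import re


def light_compress(text: str) -> str:
    """Whitespace-only cleanup mirroring the four Light steps."""
    # 1. Normalize line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # 2. Remove trailing spaces and tabs on every line.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # 3. Allow at most two consecutive blank lines.
    text = re.sub(r"\n{4,}", "\n\n\n", text)
    # 4. Collapse runs of spaces inside a line, leaving indentation alone.
    text = re.sub(r"(?<=\S) {2,}", " ", text)
    return text
```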

Medium (15–35% savings) — default

Everything in Light, plus:

Removes	Replaces with
`please note that`	(nothing)
`it is important to note that`	(nothing)
`in order to`	`to`
`due to the fact that`	`because`
`at this point in time`	`now`
`with respect to`	`regarding`
`in the event that`	`if`
`I would like you to`	`please`
`each and every`	`every`
`first and foremost`	`first`
`a large number of`	`many`
`Certainly! / Of course! / Absolutely!`	(nothing)
`As an AI language model`	(nothing)
`past history / future plans / free gift`	`history / plans / gift`
(60+ more rules)
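
A rule table like this is typically applied as case-insensitive phrase replacement followed by a cleanup pass. The sketch below copies a handful of rules from the table; the function name `apply_rules` and the cleanup steps are assumptions, and the re-capitalization is deliberately naive (it would also capitalize after an abbreviation such as "e.g. ").

```python
import re

# A handful of rules copied from the table above: (phrase, replacement).
RULES = [
    ("it is important to note that", ""),
    ("please note that", ""),
    ("due to the fact that", "because"),
    ("at this point in time", "now"),
    ("in the event that", "if"),
    ("in order to", "to"),
    ("each and every", "every"),
    ("a large number of", "many"),
]


def apply_rules(text: str) -> str:
    """Apply each phrase rule case-insensitively, then tidy the result."""
    for phrase, replacement in RULES:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    text = re.sub(r" {2,}", " ", text)           # spaces left by deleted phrases
    text = re.sub(r" +([,.;:!?])", r"\1", text)  # no space before punctuation
    text = text.strip()
    # Restore a capital letter at the start of each sentence.
    return re.sub(r"(^|[.!?] )([a-z])", lambda m: m.group(1) + m.group(2).upper(), text)


print(apply_rules("Please note that in order to win, you need a large number of points."))
# → To win, you need many points.
```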

Aggressive (30–55% savings)

Everything in Medium, plus:

Removes	Replaces with
`very / really / quite / rather`	(nothing)
`basically / literally / actually / simply / just`	(nothing)
`totally / extremely / incredibly`	(nothing)
`end result / forward progress / complete opposite`	`result / progress / opposite`
`I think that`	`I think`
`I believe that`	`I believe`
`there is no doubt that`	`certainly`
`it is clear that`	(nothing)

Supported Sites & Models

Chrome Extension — Supported Sites

Site	URL
ChatGPT	chat.openai.com, chatgpt.com
Claude	claude.ai
Gemini	gemini.google.com
Microsoft Copilot	copilot.microsoft.com
Perplexity	perplexity.ai
Poe	poe.com
Mistral	chat.mistral.ai
HuggingFace Chat	huggingface.co/chat

Model Prices (for cost estimation)

Model	Price per 1M input tokens
Claude Opus 4	$15.00
Claude Sonnet 4	$3.00
Claude Haiku 4	$0.80
GPT-4o	$2.50
GPT-4o mini	$0.15
GPT-4 Turbo	$10.00
Gemini 1.5 Pro	$3.50
Gemini 1.5 Flash	$0.35
Mistral Large	$8.00
Custom	Your price
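
Cost estimation from these prices is a single multiplication. A short sketch, with a few prices copied from the table; `cost_saved` is a hypothetical helper, and `custom_price` stands in for the Custom row:

```python
from typing import Optional

# Input prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION = {
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4": 3.00,
    "GPT-4o": 2.50,
    "GPT-4o mini": 0.15,
}


def cost_saved(tokens_saved: int, model: str, custom_price: Optional[float] = None) -> float:
    """Estimate USD saved on input tokens for `tokens_saved` tokens."""
    price = custom_price if custom_price is not None else PRICE_PER_MILLION[model]
    return tokens_saved * price / 1_000_000


# Saving 387 tokens on Claude Sonnet 4 is worth about $0.0012.
print(f"${cost_saved(387, 'Claude Sonnet 4'):.4f}")
```

Per-prompt savings are small, so the all-time total in the popup is the more meaningful figure.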

Troubleshooting

Chrome extension badge doesn't appear

Make sure the extension is enabled at chrome://extensions
Refresh the page after loading the extension
Check that "Show live token counter" is ON in Options
The site may not be in the supported list — check manifest.json host_permissions

Extension not minimizing text on Claude.ai

Claude.ai uses ProseMirror (a rich text editor). The extension inserts text with execCommand('insertText'), which only works while the input is focused. If you click Minimize and the text doesn't change:

Click directly inside the chat input first
Click Minimize again

Hook not running in Claude Code

# Check if hook is registered
bash hooks/install.sh --status

# Test the hook directly
bash hooks/install.sh --test "your prompt here"

# Check settings.json
cat ~/.claude/settings.json

Make sure the path in settings.json is an absolute path (starts with /).

Hook crashes / exits with error

The hooks are built to never crash Claude Code — they always exit 0 and fall back to passthrough on any error. If you see issues:

# Run the hook directly and check stderr
echo '{"hook_event_name":"UserPromptSubmit","prompt":"test prompt"}' \
  | python3 src/defluff/integrations/claude_hook.py
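
The "fall back to passthrough" behavior is a fail-open pattern: any error yields the original text instead of an exception. The sketch below shows only that pattern, under the assumption that the payload carries the text in a `prompt` key as in the command above. `process_payload` is a hypothetical name, and how a real hook hands its result back to Claude Code is not covered here.

```python
import json
import sys
from typing import Callable


def process_payload(raw: str, compress: Callable[[str], str]) -> str:
    """Return the compressed prompt, or the original text on any failure."""
    prompt = ""
    try:
        payload = json.loads(raw)
        prompt = payload.get("prompt", "")
        return compress(prompt)
    except Exception as exc:  # fail open: never block the conversation
        print(f"defluff: passthrough ({exc})", file=sys.stderr)
        return prompt
```

Logging to stderr keeps the diagnostic visible when you run the hook by hand without changing what is returned.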

Python package not found (ModuleNotFoundError: No module named 'defluff')

cd /path/to/defluff
pip install -e .

Or use the standalone hook (hooks/defluff_hook.py), which has zero dependencies.

pip not found

python3 -m pip install -e .

Compression changes meaning / breaks sentences

Lower the compression level: Options → Level → Light

Or exclude specific sites: Options → Excluded Sites → add the hostname.

Uninstall Everything

Chrome Extension:

Go to chrome://extensions
Find Defluff → click Remove

Claude Code Hooks:

bash hooks/install.sh --uninstall

Or manually remove the hook entries from ~/.claude/settings.json.

Python CLI:

pip uninstall defluff

Stats data:

rm -rf ~/.defluff          # CLI stats (JSON Lines file)
rm ~/.claude/defluff_stats.json    # Hook stats

Project Structure

defluff/
├── manifest.json              Chrome Extension manifest (MV3)
├── background/
│   └── service-worker.js      Stats persistence, settings, message hub
├── content/
│   ├── content.js             Injected into LLM sites — live counter + minimize UI
│   └── content.css            Styles for the floating badge
├── popup/
│   └── popup.html/js/css      Toolbar popup: stats + quick controls
├── options/
│   └── options.html/js/css    Full settings page with live preview
├── utils/
│   ├── tokenizer.js           Token counting (BPE approximation)
│   └── minimizer.js           60+ compression rules, 3 levels
├── hooks/
│   ├── defluff_hook.py         Standalone hook (zero dependencies)
│   └── install.sh             Automated hook installer/uninstaller
├── icons/
│   ├── icon16/48/128.png      Extension icons
│   └── generate-icons.py      Regenerate icons (no external deps)
└── src/defluff/                Python package
    ├── cli.py                 Typer CLI app
    ├── pipeline.py            Compression pipeline + presets
    ├── stats.py               JSON Lines stats tracker
    ├── tokenizers/            tiktoken + Anthropic token counters
    ├── compressors/           Individual compression strategies
    └── integrations/
        ├── claude_hook.py          UserPromptSubmit hook
        ├── claude_code_hook.py     PreToolUse hook
        └── post_tool_hook.py       PostToolUse hook

Quick Start (TL;DR)

# 1. Clone
git clone https://github.com/subrahmanyabhat/defluff.git
cd defluff

# 2. Chrome Extension: chrome://extensions → Developer mode → Load unpacked → select this folder

# 3. Claude Code hooks (automated)
bash hooks/install.sh

# 4. Python CLI
pip install -e .
defluff compress --text "I would like you to please explain in order to understand this"

That's it. Every prompt you send costs less from now on.
