⚡ Defluff — Token Minimizer

Cut LLM token costs by 15–55% before your prompts ever leave your machine.

Defluff works three ways — pick any or all:

| Method | Best for |
| --- | --- |
| Chrome Extension | Compressing prompts in ChatGPT, Claude.ai, Gemini, Perplexity & more |
| Claude Code Hooks | Automatic compression inside every Claude Code conversation |
| Python CLI / SDK | Pipelines, scripts, API cost reduction in your own apps |

Table of Contents

  1. Chrome Extension
  2. Claude Code Hooks
  3. Python CLI
  4. Python SDK
  5. Compression Levels
  6. Supported Sites & Models
  7. Troubleshooting
  8. Uninstall Everything

1. Chrome Extension

Install the Extension

Requires: Google Chrome, Brave, Arc, or any Chromium-based browser.

Step 1 — Download / clone this repo

git clone https://github.com/subrahmanyabhat/defluff.git
# or just download the ZIP and unzip it

Step 2 — Open Chrome Extensions

Paste this in your address bar and press Enter:

chrome://extensions

Step 3 — Enable Developer Mode

Toggle Developer mode ON in the top-right corner.

Step 4 — Load the extension

Click Load unpacked → navigate to the defluff folder (the one with manifest.json) → click Select Folder.

The ⚡ Defluff icon appears in your Chrome toolbar. Done.

Updates: Because the extension is loaded locally, it does not auto-update the way Chrome Web Store extensions do. To update, pull the repo and reload the extension at chrome://extensions.


Using the Extension

Floating badge

When you open any supported LLM chat site, a small badge appears near the input box:

⚡ 1,234 tokens  [Minimize ↓]  📋

| Element | What it does |
| --- | --- |
| ⚡ 1,234 tokens | Live token count as you type |
| Minimize ↓ | Compress your prompt in place |
| 📋 | Copy minimized version to clipboard (without changing the input) |

After minimizing:

⚡ 847 tokens  -31%  [↩ Restore]  📋

Click ↩ Restore to undo and get your original text back.

Diff preview (default: ON)

When you click Minimize, a preview panel shows before/after so you can confirm before applying:

┌─────────────────────────────────────────┐
│  Preview changes                        │
│                                         │
│  Before                                 │
│  I would like you to please explain in  │
│  order to understand...                 │
│                                         │
│  After                                  │
│  Please explain...                      │
│                                         │
│  Tokens: 1,234 → 847  Saved: 387 (31%) │
│                                         │
│  [Apply ↓]  [Cancel]                   │
└─────────────────────────────────────────┘

Popup stats

Click the ⚡ icon in the toolbar to see:

  • Session tokens saved + cost saved
  • All-time tokens saved + cost saved
  • Average compression ratio
  • Quick controls for level and model

Settings

Click ⚙ Options in the popup to open the full settings page.

| Setting | Default | Description |
| --- | --- | --- |
| Extension enabled | ON | Global on/off switch |
| Show live token counter | ON | Show the floating badge |
| Show diff before applying | ON | Preview before/after |
| Auto-minimize on send | OFF | Compress automatically when you press Enter |
| Compression level | Medium | Light / Medium / Aggressive |
| Model | Claude Sonnet 4 | Used for cost calculation |
| Custom price | | USD per 1M tokens (if your model isn't listed) |
| Excluded sites | | Hostnames where Defluff should not run |

Live preview — paste any prompt into the Options page to see exactly what gets compressed before using it in a real chat.


2. Claude Code Hooks

Defluff includes three Claude Code hooks. Each runs automatically — no clicks needed.

| Hook | Event | What it does |
| --- | --- | --- |
| claude_hook.py | UserPromptSubmit | Compresses your message before Claude reads it |
| claude_code_hook.py | PreToolUse | Compresses prompt/content fields in tool inputs |
| post_tool_hook.py | PostToolUse | Compresses large tool outputs (Bash, Grep, MCP) before Claude reads them |

Install All Hooks

Option A — Automated installer (recommended)

cd defluff
chmod +x hooks/install.sh
bash hooks/install.sh

This copies the standalone hook (hooks/defluff_hook.py, zero dependencies) to ~/.claude/ and registers it in ~/.claude/settings.json. The installer also accepts these options:

# Test the hook with a sample prompt
bash hooks/install.sh --test

# Test with your own text
bash hooks/install.sh --test "I would like you to please explain in order to understand..."

# Check installation status + all-time stats
bash hooks/install.sh --status

# Change compression level
bash hooks/install.sh --config level=aggressive

# Remove the hook
bash hooks/install.sh --uninstall

Option B — Python package hooks (more powerful)

These hooks use the full defluff pipeline (deduplication, JSON minification, URL compression, etc.) and require the Python package to be installed.

pip install -e ".[dev]"   # install in editable mode from this repo

Then add all three hooks to ~/.claude/settings.json:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/claude_hook.py"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/claude_code_hook.py"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /absolute/path/to/defluff/src/defluff/integrations/post_tool_hook.py"
          }
        ]
      }
    ]
  }
}

Replace /absolute/path/to/defluff with the actual path (find it with pwd inside the repo).
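If you would rather generate the block than edit the paths by hand, a small script can build it. This is a minimal sketch: `build_hooks_config` is a hypothetical helper (not part of defluff) that assumes the three script locations shown above and a POSIX-style absolute path.

```python
import json
from pathlib import PurePosixPath

# Hook event -> script file name, as listed in the hooks table.
HOOK_SCRIPTS = {
    "UserPromptSubmit": "claude_hook.py",
    "PreToolUse": "claude_code_hook.py",
    "PostToolUse": "post_tool_hook.py",
}


def build_hooks_config(repo_root: str) -> dict:
    """Return a settings.json-style dict pointing at the hooks under repo_root.

    Hypothetical helper for illustration only; it is not shipped with defluff.
    """
    integrations = PurePosixPath(repo_root) / "src" / "defluff" / "integrations"
    hooks = {}
    for event, script in HOOK_SCRIPTS.items():
        command = f"python3 {integrations / script}"
        hooks[event] = [
            {"matcher": "", "hooks": [{"type": "command", "command": command}]}
        ]
    return {"hooks": hooks}


# Print the block so it can be merged into ~/.claude/settings.json by hand.
print(json.dumps(build_hooks_config("/absolute/path/to/defluff"), indent=2))
```

The script only prints the JSON; merging it into an existing settings file is left manual so that other settings are not overwritten.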


UserPromptSubmit Hook (compresses your messages)

What it compresses:

Before: "I would like you to please explain, due to the fact that I am
         confused, how in order to write a sort function. Please note
         that I am basically a beginner."

After:  "Please explain how to write a sort function. I'm a beginner."

Output in Claude Code UI (stderr):

⚡ defluff  65 → 50 tokens  ↓23.1%  saved 15  [medium]

Configure (src/defluff/integrations/claude_hook.py top of file):

PRESET  = "medium"            # "light" | "medium" | "aggressive"
MODEL   = "claude-sonnet-4-6" # used for stats only
MIN_TOKENS = 30               # skip if prompt is shorter than this
SHOW_STATS = True             # show savings in the UI

PreToolUse Hook (compresses tool inputs)

Compresses large prompt, content, message, text, and query fields passed into tools.

Safe by design — never touches:

  • Read, Write, Edit, MultiEdit, NotebookRead, NotebookEdit inputs (Claude needs exact content to produce correct file edits)

Configure (src/defluff/integrations/claude_code_hook.py):

PRESET     = "medium"
MIN_TOKENS = 50      # skip short values
VERBOSE    = False   # set True to log per-field savings
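The filtering described above (skip file-editing tools, compress only long text fields) can be sketched as follows. This is an illustration, not the code in claude_code_hook.py: `compress_tool_input` and `min_chars` are hypothetical names, and a character threshold stands in for the MIN_TOKENS check.

```python
from typing import Callable

# Tools whose inputs are left untouched, per the list above.
SKIP_TOOLS = {"Read", "Write", "Edit", "MultiEdit", "NotebookRead", "NotebookEdit"}
# Field names considered for compression, per the description above.
TEXT_FIELDS = {"prompt", "content", "message", "text", "query"}


def compress_tool_input(
    tool_name: str,
    tool_input: dict,
    compress: Callable[[str], str],
    min_chars: int = 200,
) -> dict:
    """Return a copy of tool_input with long text fields compressed.

    Hypothetical sketch: `compress` is any str -> str function, and
    min_chars is an assumed stand-in for the MIN_TOKENS setting.
    """
    if tool_name in SKIP_TOOLS:
        return dict(tool_input)  # exact content must be preserved
    result = {}
    for key, value in tool_input.items():
        is_long_text = isinstance(value, str) and len(value) >= min_chars
        if key in TEXT_FIELDS and is_long_text:
            result[key] = compress(value)
        else:
            result[key] = value  # short values and other fields pass through
    return result
```

Returning a copy rather than mutating the input keeps the original available for a passthrough fallback.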

PostToolUse Hook (compresses tool outputs)

Compresses large tool outputs before Claude reads them — the biggest savings opportunity since tool outputs can be huge.

Supported tools:

| Tool | Compression |
| --- | --- |
| mcp__filesystem__* | ✅ Full output replacement |
| mcp__fetch__* | ✅ Full output replacement |
| mcp__brave_search__* | ✅ Full output replacement |
| mcp__github__* | ✅ Full output replacement |
| mcp__postgres__* | ✅ Full output replacement |
| mcp__*__* (any MCP) | ✅ Full output replacement |
| Bash | ✅ stdout compressed |
| Grep | ✅ output compressed |
| WebFetch | ✅ content compressed |
| WebSearch | ✅ results compressed |
| Glob, LS | ✅ output compressed |
| Read, Write, Edit | Never touched |

Configure (src/defluff/integrations/post_tool_hook.py):

PRESET     = "tool_output"   # conservative preset safe for code/JSON
MIN_TOKENS = 100             # only compress outputs larger than this
SHOW_STATS = True
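The MIN_TOKENS gate can be pictured as a simple size check before compression. The sketch below is an assumption about the general shape, not the hook's actual code: `estimate_tokens` uses a rough 4-characters-per-token heuristic, and `maybe_compress` is a hypothetical name.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def maybe_compress(output: str, compress, min_tokens: int = 100) -> str:
    """Compress a tool output only when it is estimated to exceed min_tokens."""
    if estimate_tokens(output) <= min_tokens:
        return output  # small outputs are not worth touching
    compressed = compress(output)
    # Keep the original if compression did not actually shrink it.
    return compressed if len(compressed) < len(output) else output
```

Falling back to the original when nothing is saved means a misbehaving compressor can never make an output larger.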

Manual settings.json setup

If you prefer to edit ~/.claude/settings.json yourself:

# Find your absolute path
cd /path/to/defluff && pwd

Open ~/.claude/settings.json (create it if it doesn't exist) and add the hooks block. Example with all three hooks:

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/claude_hook.py"
          }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/claude_code_hook.py"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "python3 /Users/yourname/defluff/src/defluff/integrations/post_tool_hook.py"
          }
        ]
      }
    ]
  }
}

Restart Claude Code after saving.


3. Python CLI

Install the CLI

From this repo (development):

pip install -e .

With optional ML tokenizers:

pip install -e ".[ml]"    # adds transformers + torch for HuggingFace tokenizers

Verify:

defluff --help

Commands

defluff compress      Compress text to reduce token usage
defluff count         Count tokens in text
defluff diff          Show diff between two files
defluff list-models   List supported models and pricing
defluff list-strategies  List all compression strategies
defluff stats         Show your all-time savings
defluff share         Generate a shareable savings summary

CLI Examples

Compress a string:

defluff compress --text "I would like you to please explain in order to understand why this is important"

# Output:
# Please explain why this is important
# Before: 18 tokens | After: 7 tokens | Saved: 11 (61%)

Compress a file:

defluff compress prompt.txt
defluff compress prompt.txt --output compressed.txt
defluff compress prompt.txt --preset aggressive

Compress from stdin (pipe):

echo "Please note that in order to do this you should..." | defluff compress -
cat big_prompt.txt | defluff compress - --preset light

Compress and show diff:

defluff compress prompt.txt --diff
defluff compress prompt.txt --diff --diff-format side-by-side

Choose compression preset:

defluff compress prompt.txt --preset light       # safe, whitespace only
defluff compress prompt.txt --preset medium      # default, removes filler phrases
defluff compress prompt.txt --preset aggressive  # maximum compression

Count tokens only (no compression):

defluff count --text "How many tokens is this sentence?"
defluff count prompt.txt --model claude-sonnet-4-6
defluff count prompt.txt --tokenizer anthropic

See all model prices:

defluff list-models

See all compression strategies:

defluff list-strategies

View your savings history:

defluff stats
defluff share    # generate a shareable text summary

JSON output (for scripts):

defluff compress prompt.txt --output-json
# Outputs:
# {
#   "original_tokens": 45,
#   "compressed_tokens": 28,
#   "tokens_saved": 17,
#   "ratio": 37.8,
#   "compressed_text": "..."
# }
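A script consuming this output might parse it and sanity-check the numbers. The sketch below works on the sample report above; it assumes that `ratio` is the percentage of tokens saved relative to `original_tokens` (which matches the sample: 17 / 45 ≈ 37.8%), and `parse_report` is a hypothetical helper.

```python
import json


def parse_report(raw: str) -> dict:
    """Parse an --output-json style report and check its arithmetic."""
    report = json.loads(raw)
    saved = report["original_tokens"] - report["compressed_tokens"]
    if saved != report["tokens_saved"]:
        raise ValueError("tokens_saved does not match the token counts")
    # Assumed meaning of "ratio": percent of original tokens saved.
    expected_ratio = round(100 * saved / report["original_tokens"], 1)
    if abs(expected_ratio - report["ratio"]) > 0.1:
        raise ValueError("ratio does not match the token counts")
    return report


sample = """
{
  "original_tokens": 45,
  "compressed_tokens": 28,
  "tokens_saved": 17,
  "ratio": 37.8,
  "compressed_text": "..."
}
"""
report = parse_report(sample)
print(f"saved {report['tokens_saved']} tokens ({report['ratio']}%)")
```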

4. Python SDK

Use defluff in your own Python code:

from defluff.pipeline import build_preset
from defluff.tokenizers import AnthropicCounter

# Build a pipeline
pipeline = build_preset("medium")  # "light" | "medium" | "aggressive" | "tool_output"

# Compress text
result = pipeline.run("I would like you to please explain in order to understand this topic.")

print(result.compressed_text)
# → "Please explain this topic."

print(result.original_text)
# → "I would like you to please explain in order to understand this topic."

# See step-by-step what each compressor did
for step in result.steps:
    print(f"{step.name}: {len(step.text_before)} → {len(step.text_after)} chars")

Count tokens accurately:

from defluff.tokenizers import AnthropicCounter

counter = AnthropicCounter(model="claude-sonnet-4-6")
tokens = counter.count("Your prompt text here")
print(tokens)  # e.g. 5

Available tokenizers:

from defluff.tokenizers import AnthropicCounter  # uses Anthropic's token counting
# tiktoken counter also available (for OpenAI models)

Track savings:

from defluff.stats import record, summarize

# Record a compression
record(
    tokens_before=45,
    tokens_after=28,
    model="claude-sonnet-4-6",
    preset="medium",
    source="my-app",
)

# Get summary
summary = summarize()
print(f"Total saved: {summary['total_tokens_saved']:,} tokens")
print(f"Avg ratio:   {summary['avg_ratio']}%")

Integration with Anthropic SDK:

import anthropic
from defluff.pipeline import build_preset

client = anthropic.Anthropic()
pipeline = build_preset("medium")

def chat(user_message: str) -> str:
    # Compress before sending
    result = pipeline.run(user_message)
    
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": result.compressed_text}]
    )
    return response.content[0].text

print(chat("I would like you to please explain in order to understand how sorting works."))

5. Compression Levels

Light (5–15% savings)

  • Normalize line endings
  • Remove trailing whitespace per line
  • Collapse multiple blank lines → max 2
  • Collapse multiple spaces → single space
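The four Light steps map naturally onto a few regular expressions. This is a minimal sketch rather than defluff's implementation; `light_compress` is a hypothetical name, and "max 2" is read here as "at most two consecutive blank lines".

```python
import re


def light_compress(text: str) -> str:
    """Apply the four Light-level steps listed above, in order."""
    # 1. Normalize Windows and old Mac line endings to a plain newline.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # 2. Remove trailing spaces and tabs on each line.
    text = re.sub(r"[ \t]+(?=\n|$)", "", text)
    # 3. Collapse runs of blank lines to at most two (assumed reading of "max 2").
    text = re.sub(r"\n{4,}", "\n\n\n", text)
    # 4. Collapse runs of spaces to a single space (this also flattens indentation).
    text = re.sub(r" {2,}", " ", text)
    return text
```

Because step 4 also collapses leading spaces, a sketch like this would not be safe for indentation-sensitive text such as Python code or YAML.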

Medium (15–35% savings) — default

Everything in Light, plus:

| Removes | Replaces with |
| --- | --- |
| please note that | (nothing) |
| it is important to note that | (nothing) |
| in order to | to |
| due to the fact that | because |
| at this point in time | now |
| with respect to | regarding |
| in the event that | if |
| I would like you to | please |
| each and every | every |
| first and foremost | first |
| a large number of | many |
| Certainly! / Of course! / Absolutely! | (nothing) |
| As an AI language model | (nothing) |
| past history / future plans / free gift | history / plans / gift |

(60+ more rules)
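Rules of this kind are essentially an ordered list of phrase substitutions. The sketch below applies a handful of the rules from the table with case-insensitive regular expressions; it is an illustration only (`RULES` and `apply_rules` are hypothetical names), and unlike the real pipeline it does not restore capitalization or handle overlapping rules.

```python
import re

# A few (pattern, replacement) pairs taken from the table above.
RULES = [
    (r"\bplease note that\b", ""),
    (r"\bit is important to note that\b", ""),
    (r"\bin order to\b", "to"),
    (r"\bdue to the fact that\b", "because"),
    (r"\bat this point in time\b", "now"),
    (r"\bin the event that\b", "if"),
    (r"\bI would like you to\b", "please"),
    (r"\ba large number of\b", "many"),
]


def apply_rules(text: str) -> str:
    """Apply each substitution in order, then tidy leftover spacing."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    text = re.sub(r" {2,}", " ", text)           # gaps left by deleted phrases
    text = re.sub(r" +([,.;:!?])", r"\1", text)  # no space before punctuation
    return text.strip()


print(apply_rules("I would like you to explain this due to the fact that it matters."))
# → please explain this because it matters.
```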

Aggressive (30–55% savings)

Everything in Medium, plus:

| Removes or shortens |
| --- |
| very / really / quite / rather |
| basically / literally / actually / simply / just |
| totally / extremely / incredibly |
| end result / forward progress / complete opposite |
| I think that → I think |
| I believe that → I believe |
| there is no doubt that → certainly |
| it is clear that → (removed) |
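Dropping intensifiers amounts to deleting whole words from a fixed list. A minimal sketch, assuming a plain word-boundary match (`drop_intensifiers` is a hypothetical name, not defluff's API):

```python
import re

# Words from the first three rows of the table above.
INTENSIFIERS = [
    "very", "really", "quite", "rather",
    "basically", "literally", "actually", "simply", "just",
    "totally", "extremely", "incredibly",
]
# Match any listed word as a whole word, plus the whitespace that follows it.
_PATTERN = re.compile(r"\b(?:" + "|".join(INTENSIFIERS) + r")\b\s*", re.IGNORECASE)


def drop_intensifiers(text: str) -> str:
    """Delete every listed intensifier from text."""
    return _PATTERN.sub("", text).strip()


print(drop_intensifiers("This is really very important, and I just want a simple answer."))
# → This is important, and I want a simple answer.
```

A blind match like this also hits "rather than" or "a just cause", which illustrates why this level is the one most likely to change meaning.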

6. Supported Sites & Models

Chrome Extension — Supported Sites

| Site | URL |
| --- | --- |
| ChatGPT | chat.openai.com, chatgpt.com |
| Claude | claude.ai |
| Gemini | gemini.google.com |
| Microsoft Copilot | copilot.microsoft.com |
| Perplexity | perplexity.ai |
| Poe | poe.com |
| Mistral | chat.mistral.ai |
| HuggingFace Chat | huggingface.co/chat |

Model Prices (for cost estimation)

| Model | Price per 1M input tokens |
| --- | --- |
| Claude Opus 4 | $15.00 |
| Claude Sonnet 4 | $3.00 |
| Claude Haiku 4 | $0.80 |
| GPT-4o | $2.50 |
| GPT-4o mini | $0.15 |
| GPT-4 Turbo | $10.00 |
| Gemini 1.5 Pro | $3.50 |
| Gemini 1.5 Flash | $0.35 |
| Mistral Large | $8.00 |
| Custom | Your price |
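Given these prices, the cost saved is presumably tokens saved times the per-token input price. The sketch below shows that arithmetic for a few of the listed models; the formula is an assumption about how the estimate is derived, and `cost_saved` is a hypothetical helper.

```python
# USD per 1M input tokens, copied from the table above.
PRICE_PER_MILLION = {
    "Claude Opus 4": 15.00,
    "Claude Sonnet 4": 3.00,
    "GPT-4o": 2.50,
    "GPT-4o mini": 0.15,
}


def cost_saved(tokens_saved: int, model: str) -> float:
    """Estimated USD saved: tokens_saved / 1,000,000 * price per 1M tokens."""
    return tokens_saved / 1_000_000 * PRICE_PER_MILLION[model]


# The 387 tokens saved in the diff preview example, priced for Claude Sonnet 4.
print(f"${cost_saved(387, 'Claude Sonnet 4'):.6f}")
# → $0.001161
```

Savings per prompt are tiny; the estimate becomes meaningful only when summed over many requests.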

7. Troubleshooting

Chrome extension badge doesn't appear

  • Make sure the extension is enabled at chrome://extensions
  • Refresh the page after loading the extension
  • Check that "Show live token counter" is ON in Options
  • The site may not be in the supported list — check manifest.json host_permissions

Extension not minimizing text on Claude.ai

Claude.ai uses ProseMirror (a rich text editor). The extension writes text with execCommand('insertText'), which requires the input to be focused. If Minimize runs but the text doesn't change:

  1. Click directly inside the chat input first
  2. Click Minimize again

Hook not running in Claude Code

# Check if hook is registered
bash hooks/install.sh --status

# Test the hook directly
bash hooks/install.sh --test "your prompt here"

# Check settings.json
cat ~/.claude/settings.json

Make sure the path in settings.json is an absolute path (starts with /).

Hook crashes / exits with error

The hooks are built to never crash Claude Code — they always exit 0 and fall back to passthrough on any error. If you see issues:

# Run the hook directly and check stderr
echo '{"hook_event_name":"UserPromptSubmit","prompt":"test prompt"}' \
  | python3 src/defluff/integrations/claude_hook.py
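The "fall back to passthrough on any error" behavior can be sketched as a wrapper that catches everything and returns the original prompt. This illustrates the pattern only; `run_hook` is a hypothetical function, and the payload shape is taken from the test command above.

```python
import json


def run_hook(raw_input: str, compress) -> str:
    """Return the compressed prompt, or the original prompt on any failure."""
    prompt = ""
    try:
        payload = json.loads(raw_input)
        prompt = payload.get("prompt", "")
        return compress(prompt)
    except Exception:
        # Fail open: a compression bug must never block the conversation.
        return prompt
```

If the input is not valid JSON there is no prompt to pass through, so this sketch returns an empty string in that case.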

Python package not found (ModuleNotFoundError: No module named 'defluff')

cd /path/to/defluff
pip install -e .

Or use the standalone hook (hooks/defluff_hook.py), which has zero dependencies.

pip not found

python3 -m pip install -e .

Compression changes meaning / breaks sentences

Lower the compression level: Options → Level → Light

Or exclude specific sites: Options → Excluded Sites → add the hostname.


8. Uninstall Everything

Chrome Extension:

  1. Go to chrome://extensions
  2. Find Defluff → click Remove

Claude Code Hooks:

bash hooks/install.sh --uninstall

Or manually remove the hook entries from ~/.claude/settings.json.

Python CLI:

pip uninstall defluff

Stats data:

rm -rf ~/.defluff          # CLI stats (JSON Lines file)
rm ~/.claude/defluff_stats.json    # Hook stats

Project Structure

defluff/
├── manifest.json              Chrome Extension manifest (MV3)
├── background/
│   └── service-worker.js      Stats persistence, settings, message hub
├── content/
│   ├── content.js             Injected into LLM sites — live counter + minimize UI
│   └── content.css            Styles for the floating badge
├── popup/
│   └── popup.html/js/css      Toolbar popup — stats + quick controls
├── options/
│   └── options.html/js/css    Full settings page with live preview
├── utils/
│   ├── tokenizer.js           Token counting (BPE approximation)
│   └── minimizer.js           60+ compression rules, 3 levels
├── hooks/
│   ├── defluff_hook.py         Standalone hook (zero dependencies)
│   └── install.sh             Automated hook installer/uninstaller
├── icons/
│   ├── icon16/48/128.png      Extension icons
│   └── generate-icons.py      Regenerate icons (no external deps)
└── src/defluff/                Python package
    ├── cli.py                 Typer CLI app
    ├── pipeline.py            Compression pipeline + presets
    ├── stats.py               JSON Lines stats tracker
    ├── tokenizers/            tiktoken + Anthropic token counters
    ├── compressors/           Individual compression strategies
    └── integrations/
        ├── claude_hook.py          UserPromptSubmit hook
        ├── claude_code_hook.py     PreToolUse hook
        └── post_tool_hook.py       PostToolUse hook

Quick Start (TL;DR)

# 1. Clone
git clone https://github.com/subrahmanyabhat/defluff.git
cd defluff

# 2. Chrome Extension: chrome://extensions → Developer mode → Load unpacked → select this folder

# 3. Claude Code hooks (automated)
bash hooks/install.sh

# 4. Python CLI
pip install -e .
defluff compress --text "I would like you to please explain in order to understand this"

That's it. Prompts that pass through Defluff now use fewer input tokens.
