Icemage (icmg)

Stop burning tokens. Stop losing context. Ship faster.

A small helper app that makes AI coding assistants — Claude Code, Cursor, and friends — 70 – 98 % cheaper to run, without making them less helpful.

40 MCP tools · 1207/1207 tests · single-binary · 100 % local · pure-bash hooks (zero Python/jq dependency).

If you've ever watched a large chunk of your token budget evaporate on a single file read, paid for "thinking" you didn't need, or re-explained your project to the AI for the fifth time today, Icemage is for you.

🟢 Why Icemage

AI assistants are powerful but wasteful by default. Every time the AI opens a file, runs a command, or starts a new chat, it re-reads context it has seen many times and dumps full output into the conversation. Icemage sits quietly in the background and trims the noise before it ever reaches the AI:

Long files → only the relevant slice
Noisy command output → just the parts that matter
Web pages → cached + summarised
Past decisions → remembered across sessions so the AI doesn't ask twice
Repeated work → results reused instead of recomputed

The AI keeps its full intelligence. Your wallet keeps more of its money.

📊 Headline numbers

Metric	Typical	Best	Since
File-read savings	70 – 85 % fewer tokens	up to 92 %	v0.5
Test / build output	60 – 80 % shorter	up to 90 %	v0.5
Multi-file UI propagation (style-clone)	30 – 50× cheaper	up to 98 %	v1.22.0
Cross-project bundle (port)	8 – 12× cheaper	up to 95 %	v1.24.0
Compressed-Write (AI emit diff)	70 – 95 % fewer tokens	up to 98 %	v1.25.0
Web-fetch reduction	70 – 90 % smaller	up to 95 %	v0.4
Repeat-context recall	near-zero, < 5 ms cached	—	v1.21.8
Semantic atom recall (`recall --atoms`)	fact-level hits, not blobs	—	v1.79.0
Auto-bisect (`icmg bisect`)	first-bad commit in ~log2(N) tests	—	v1.81.0
Past-chat full-text search	< 10 ms across months	—	v1.21.7
Graph symbol lookup	256-slot in-RAM cache	—	v1.21.8
First-prompt warmup	< 1 s	—	v1.18
Cold build time (icmg itself)	~50 % faster (20 min → 9-10 min)	—	v1.26.0
MCP response filter (verbose plugins)	50 – 80 % smaller	up to 90 %	v1.30.0
Auto-thinking suppress (trivial prompts)	~1500 tok / call saved	—	v1.30.0
Sayless-auto (long-prose replies)	60 – 75 % compress	up to 85 %	v1.30.0
Service auto-start (UserPromptSubmit)	0-touch warm-up	—	v1.30.0
Path ambiguity warning (icmg context)	wrong-file lookups → loud	—	v1.29.0
rg-wrapper + brace glob (icmg grep/files)	flag-mirror, {a,b} expand	—	v1.29.0
Local AI model (built-in, opt-in)	0 cloud calls	privacy-first	v1.31.0
Smart router (REGEX vs LLM_LOCAL vs CACHE)	<100 us p99	hot-path forced regex	v1.31.0
HTTP streaming download (model fetch + SHA256)	400 MB - 2 GB safe-verify	tamper-detect	v1.31.0
icmg git wrapper (single ergonomic entry)	Tkil-filtered + safety-gated	enforces icmg-FIRST	v1.31.0
Python-free core (PRECOMPACT_PY dropped)	-200-500 ms boot saved	single-binary	v1.31.0
pack --rerank (LLM-reorder memory hits)	opt-in warm-path	router-gated	v1.32.0
PreCompact LLM summary (warm-pool Qwen 0.5B)	<15 s cold	regex fallback always	v1.32.0
icmg compact-bg (proactive memory worker)	<3 s warm	manual + future hook	v1.32.0
Smarter local AI memory	multi-prompt safe	no overflow	v1.32.0
Code graph viz + report (`icmg graph viz`)	interactive D3 + god-nodes	—	v1.71.0
Secret scanner (`icmg scan`)	21 detectors, CI-gate	redact-by-default	v1.68.0
MCP server hardening (token + rate-limit + path-guard)	abuse / RCE-safe	—	v1.72.0
Post-compact memory re-anchor	rules survive compaction	auto on `init`	v1.73.0
Scripted-safe `icmg run` (non-interactive guard)	no hang on destructive	`--yes`/env opt-in	v1.74.0
Clean self-upgrade (idempotent Defender step)	no phantom B: drive popup	`--no-defender` opt-out	v1.75.0
Encryption-at-rest (`icmg encrypt`, SQLCipher AES-256)	opt-in full-DB encrypt	BM25 recall intact	v1.76.0
Hot recall cache (RAM, daemon-shared)	< 5 ms repeat recall	self-governing RAM	v1.77.0
Init + upgrade hardening (no 30-min hang, no stale-proc lock)	`init` returns immediately	detached imports + lock-guard	v1.78.4
Cost per AI session	down 70 – 90 % vs. raw	up to 95 %	—

Measured on real-world sessions. Your mileage will vary with project size and habits — anyone running a busy AI agent for a day already sees meaningful savings.

✨ What's new

v1.81.0 - icmg memory atomize completes the dual-memory system, and icmg run now works on plain Windows. The semantic atom layer (v1.79) now has a full management CLI: icmg memory atomize run drains the pending queue on demand (capped with --max N); icmg memory atomize status shows the atom count and queue depth; ICMG_ATOMIZE=0 disables atomization project-wide. The queue also drains automatically on every compact-bg tick. The opt-in recall --atoms atom-FTS hybrid is now locked in by a round-trip test (enqueue -> drain -> FTS -> source node). Also: non-MSYS Windows now routes icmg run through pwsh (PS7+) or powershell (PS5 fallback) instead of cmd.exe, so Select-String, Get-ChildItem, and other PowerShell cmdlets work without MSYS2. Full automated suite passes (1207 checks).
v1.81.0 - icmg bisect: auto-find the commit that broke a test, plus atom-memory completion. New icmg bisect --good <ref> --bad <ref> --test "<cmd>" binary-searches your git history, checking out each midpoint and running your test to pinpoint the first commit where it starts failing, then restores your original HEAD. It refuses to run on a dirty working tree and never commits or rewrites history. It prints an estimate of the number of test runs up front. Separately, the v1.79.0 semantic-atom layer is completed: derived atoms now carry precomputed embeddings when an ONNX backend is available (enabling semantic atom matching, with a clean BM25 fallback when absent), and ICMG_ATOMIZE_LLM=1 opts into local-LLM fact extraction (self-contained, pronoun-resolved) with automatic heuristic fallback. Also folds in a rebrand fix so that icmg bug-report files its report to the correct repository. Full automated suite passes (1207 checks).
v1.79.0 - Semantic memory: recall can hit single facts instead of whole blobs — plus a real headless sub-agent. icmg now derives an atomic-fact layer from your stored memories. When you icmg store a multi-sentence decision, a background worker (icmg atomize run) splits it into atomic propositions — heuristic by default, opt-in local-LLM via ICMG_ATOMIZE_LLM=1 — so icmg recall "<query>" --atoms matches the exact fact and returns its source memory. Sharper hits on a large store, and zero latency added to store (it only enqueues; atomization happens off the hot path) and zero change to default recall (--atoms is opt-in). New icmg atomize status shows atom count + pending queue; opt out entirely with ICMG_ATOMIZE=0. Separately, icmg agent "<task>" --exec upgrades the LLM proxy into a real headless sub-agent with file-edit + shell tools (gated behind ICMG_AGENT_EXEC=1 so it never fires by accident). This release also folds in two Windows reliability fixes: icmg init no longer hangs for 30+ minutes (background imports now run fully detached) and icmg update --apply no longer gets blocked when stale icmg processes hold the binary (updating.lock sentinel + rename retry). Full automated suite passes (1196 checks).
v1.78.4 - Fix: icmg init no longer hangs, and self-upgrade no longer gets blocked by stale processes. Two Windows reliability fixes. (1) On some projects icmg init could appear to hang for 30+ minutes: it launched its background import helpers (claudemd import / plan import / skill index) through a shell whose stdout pipe was inherited by the spawned grandchild processes, so icmg blocked reading that pipe until those children exited — and they in turn stalled on a database lock. init now spawns the background work fully detached (no inherited handles), so it returns immediately while the imports finish on their own. (2) icmg update --apply could fail to swap the binary when other icmg processes (editor hooks, the background daemon) were still running and holding the .exe file lock. icmg now writes an updating.lock sentinel so any freshly-spawned icmg bails out during the brief swap window, and the upgrade retries the rename a few times to let in-flight processes exit cleanly. Full automated suite passes (1187 checks).
v1.78.3 - RAM cache now survives daemon restart. v1.77 introduced the in-memory recall cache that keeps repeat recalls under 5 ms; v1.78.2 shipped the write-through + warm-reload building blocks. v1.78.3 finally wires it end-to-end inside the daemon: every PUT is persisted asynchronously through a write queue (non-blocking on the hot path), and on first touch of a project's scope the daemon lazy-hydrates the top-256 hottest entries from disk into RAM. The cache is daemon-shared across sessions and projects, but each entry is tagged with a scope hash so different projects can't see each other's recalls. The RCACHE protocol gains a scope field on PUT/GET (older clients without it land in a back-compat empty bucket and continue to work). Persist is on by default; opt out with ICMG_RECALL_CACHE_PERSIST=0 or a per-project .icmg/cache-persist.off marker. Full automated suite passes (1182 atomic, +13 new daemon-wire tests).
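
The binary search behind the icmg bisect entry above can be illustrated with a short sketch. This is not Icemage's implementation: it works on an in-memory list of commit names rather than a real git history, and the `is_good` callback is a hypothetical stand-in for "check out this commit and run the test command". It only shows why the number of test runs grows as roughly log2(N).

```python
import math
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_good: Callable[[str], bool]) -> Tuple[str, int]:
    """Find the first failing commit by binary search.

    `commits` is ordered oldest to newest; commits[0] is assumed known-good
    and commits[-1] known-bad. Returns (first bad commit, test runs used).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    runs = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        runs += 1  # one (hypothetical) checkout + test run per midpoint
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi], runs


# Illustrative history: 100 commits, the regression lands in commit "c63".
history = [f"c{i}" for i in range(100)]
culprit, runs = first_bad(history, lambda c: int(c[1:]) < 63)
print(culprit, runs, math.ceil(math.log2(len(history))))
```

Each run halves the range that can still contain the first bad commit, which is where the ~log2(N) estimate in the headline table comes from.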

🧰 What you'll actually use day-to-day

After install, the only command most people type is icmg init once per project. Everything else happens automatically. A few useful commands when you want to peek under the hood:

Want to	Type
See how much you saved this month	`icmg savings`
See a chart in the terminal	`icmg savings --ascii`
Recall a past decision in this project	`icmg recall "<question>"`
Recall something from another project	`icmg cross-recall "<question>"`
Wake-up briefing for a fresh session	`icmg wake-up`
Update Icemage in place	`icmg update --apply`
Health-check the install	`icmg doctor`

For the full menu run icmg --help.

🤖 Works with

Claude Code (primary target — best-tested)
Cursor — drop-in via the same hooks
Cline, Windsurf, OpenCode — same approach, may need a small config nudge
Anything that exposes hooks or MCP — the MCP server bundled with Icemage is reusable

🛡️ Safety + privacy

100 % local. Everything Icemage knows about your projects lives in a small SQLite database next to your code. Nothing is sent to a remote server — not the project name, not the file paths, not the recalled snippets.
No telemetry. Icemage doesn't phone home.
Apache-2.0 licensed. You can freely audit the binary, the release notes, and the file structure. The source code itself is held privately to keep the bug surface manageable for a solo maintainer; the operating model is public reports plus private fixes.
Tamper-evident. Every release ships with a sha256 sidecar so you can verify the binary you downloaded.
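
As a rough illustration of the sidecar check, the sketch below hashes a downloaded file and compares it with the digest stored next to it. The sidecar layout is an assumption (hex digest as the first whitespace-separated token, the way `sha256sum` writes it), and the function names are made up for this example; check the actual release assets for the real file names and format.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(binary: Path, sidecar: Path) -> bool:
    """Return True when the binary's SHA-256 equals the digest in the sidecar.

    Assumed sidecar layout: the hex digest is the first token on the first line.
    """
    tokens = sidecar.read_text().split()
    if not tokens:
        return False  # empty sidecar: nothing to compare against
    return sha256_of(binary) == tokens[0].lower()
```

A `False` result means the download is corrupt or was altered and should not be run.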

🩹 Honest limits

Windows + Linux only for prebuilt binaries today. macOS users currently need to wait for a self-hosted runner build (planned).
First-time install on Windows with strict antivirus can be slow until you let Icemage run once. After that it's fast.
Not a replacement for the AI. Icemage is a token-trimming layer — it doesn't write code for you and it doesn't make a bad AI smart.

💖 Support

If Icemage saved you a few hours or a few dollars and you want to send a small thank-you, both routes work:

All revenue goes straight into more releases — there is no team behind this, just one maintainer and a long backlog of "make AI agents less wasteful" ideas.

❓ FAQ

Does Icemage send my code anywhere? No. Everything is local. The only network call is when you ask Icemage to update itself or fetch a URL through icmg fetch.

Can my company use it? Yes — Apache-2.0 licensed, free for any use including commercial. If you want a private support arrangement or a custom build, open a sponsorship.

Why is the source code repo private? One maintainer, no security team. Public bug reports + private fixes lets me ship hotfixes the same day without telegraphing exploitable details. The release binaries and reproducible build hash are still public.

Does it slow my AI down? No. Trimming happens before the AI reads anything, so the AI sees a smaller, cleaner version of the same context. End-to-end interactions get faster, not slower.

Where are the savings stored? In .icmg/data.db inside each project (small SQLite file). Run icmg savings to see the breakdown.
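
Because the store is an ordinary SQLite file, it can be inspected with standard tooling. The sketch below opens a database read-only and lists its table names. The schema is not documented here, so no table or column names are assumed, and a database encrypted with icmg encrypt (SQLCipher) will not open with Python's stock `sqlite3` module.

```python
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import List


def list_tables(db_path: Path) -> List[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro guarantees this inspection cannot modify the database.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

For example, `list_tables(Path(".icmg/data.db"))` run from a project root would print whatever tables that project's store contains.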

How do I report a bug or ask for a feature? Open an issue at the GitHub issues page. Real-world reproductions with icmg savings --json attached get triaged fastest.

📜 License

Apache-2.0.

📚 Other docs

CHANGELOG.md — full version history
SECURITY.md — vulnerability reporting
NOTICE — third-party attributions
