ncmonx/icemage

Icemage (icmg)

Stop burning tokens. Stop losing context. Ship faster.

A small helper app that makes AI coding assistants — Claude Code, Cursor, and friends — 70 – 98 % cheaper to run, without making them less helpful.

40 MCP tools · 1207/1207 tests · single-binary · 100 % local · pure-bash hooks (zero Python/jq dependency).

If you've ever watched a token budget evaporate on a single file read, paid for "thinking" you didn't need, or re-explained your project to the AI for the fifth time today, Icemage is for you.


🟢 Why Icemage

AI assistants are powerful but wasteful by default. Every time the AI opens a file, runs a command, or starts a new chat, it re-reads context it has seen many times and dumps full output into the conversation. Icemage sits quietly in the background and trims the noise before it ever reaches the AI:

  • Long files → only the relevant slice
  • Noisy command output → just the parts that matter
  • Web pages → cached + summarised
  • Past decisions → remembered across sessions so the AI doesn't ask twice
  • Repeated work → results reused instead of recomputed

The AI keeps its full intelligence. Your wallet keeps more of its money.


📊 Headline numbers

| Metric | Typical | Best | Since |
| --- | --- | --- | --- |
| File-read savings | 70 – 85 % fewer tokens | up to 92 % | v0.5 |
| Test / build output | 60 – 80 % shorter | up to 90 % | v0.5 |
| Multi-file UI propagation (style-clone) | 30 – 50× cheaper | up to 98 % | v1.22.0 |
| Cross-project bundle (port) | 8 – 12× cheaper | up to 95 % | v1.24.0 |
| Compressed-Write (AI emit diff) | 70 – 95 % fewer tokens | up to 98 % | v1.25.0 |
| Web-fetch reduction | 70 – 90 % smaller | up to 95 % | v0.4 |
| Repeat-context recall | near-zero, < 5 ms cached | | v1.21.8 |
| Semantic atom recall (recall --atoms) | fact-level hits, not blobs | | v1.79.0 |
| Auto-bisect (icmg bisect) | first-bad commit in ~log2(N) tests | | v1.81.0 |
| Past-chat full-text search | < 10 ms across months | | v1.21.7 |
| Graph symbol lookup | 256-slot in-RAM cache | | v1.21.8 |
| First-prompt warmup | < 1 s | | v1.18 |
| Cold build time (icmg itself) | ~50 % faster (20 min → 9-10 min) | | v1.26.0 |
| MCP response filter (verbose plugins) | 50 – 80 % smaller | up to 90 % | v1.30.0 |
| Auto-thinking suppress (trivial prompts) | ~1500 tok / call saved | | v1.30.0 |
| Sayless-auto (long-prose replies) | 60 – 75 % compress | up to 85 % | v1.30.0 |
| Service auto-start (UserPromptSubmit) | 0-touch warm-up | | v1.30.0 |
| Path ambiguity warning (icmg context) | wrong-file lookups → loud | | v1.29.0 |
| rg-wrapper + brace glob (icmg grep/files) | flag-mirror, {a,b} expand | | v1.29.0 |
| Local AI model (built-in, opt-in) | 0 cloud calls | privacy-first | v1.31.0 |
| Smart router (REGEX vs LLM_LOCAL vs CACHE) | < 100 µs p99 | hot-path forced regex | v1.31.0 |
| HTTP streaming download (model fetch + SHA256) | 400 MB – 2 GB safe-verify | tamper-detect | v1.31.0 |
| icmg git wrapper (single ergonomic entry) | Tkil-filtered + safety-gated | enforces icmg-FIRST | v1.31.0 |
| Python-free core (PRECOMPACT_PY dropped) | 200 – 500 ms boot time saved | single-binary | v1.31.0 |
| pack --rerank (LLM-reorder memory hits) | opt-in warm-path | router-gated | v1.32.0 |
| PreCompact LLM summary (warm-pool Qwen 0.5B) | < 15 s cold | regex fallback always | v1.32.0 |
| icmg compact-bg (proactive memory worker) | < 3 s warm | manual + future hook | v1.32.0 |
| Smarter local AI memory | multi-prompt safe | no overflow | v1.32.0 |
| Code graph viz + report (icmg graph viz) | interactive D3 + god-nodes | | v1.71.0 |
| Secret scanner (icmg scan) | 21 detectors, CI-gate | redact-by-default | v1.68.0 |
| MCP server hardening (token + rate-limit + path-guard) | abuse / RCE-safe | | v1.72.0 |
| Post-compact memory re-anchor | rules survive compaction | auto on init | v1.73.0 |
| Scripted-safe icmg run (non-interactive guard) | no hang on destructive | --yes/env opt-in | v1.74.0 |
| Clean self-upgrade (idempotent Defender step) | no phantom B: drive popup | --no-defender opt-out | v1.75.0 |
| Encryption-at-rest (icmg encrypt, SQLCipher AES-256) | opt-in full-DB encrypt | BM25 recall intact | v1.76.0 |
| Hot recall cache (RAM, daemon-shared) | < 5 ms repeat recall | self-governing RAM | v1.77.0 |
| Init + upgrade hardening (no 30-min hang, no stale-proc lock) | init returns immediately | detached imports + lock-guard | v1.78.4 |
| Cost per AI session | down 70 – 90 % vs. raw | up to 95 % | |

Measured on real-world sessions. Your mileage will vary with project size and habits — anyone running a busy AI agent for a day already sees meaningful savings.

✨ What's new

  • v1.81.0 - icmg memory atomize completes the dual-memory system, and icmg run now works on plain Windows. The semantic atom layer (v1.79) now has a full management CLI: icmg memory atomize run drains the pending queue on demand (capped with --max N); icmg memory atomize status shows the atom count and queue depth; ICMG_ATOMIZE=0 disables atomization project-wide. The queue also drains automatically on every compact-bg tick. The opt-in recall --atoms atom-FTS hybrid is locked in by a round-trip test (enqueue -> drain -> FTS -> source node). Also: on non-MSYS Windows, icmg run is now routed through pwsh (PS7+) or powershell (PS5 fallback) instead of cmd.exe, so Select-String, Get-ChildItem, and other PowerShell cmdlets work without MSYS2. Full automated suite passes (1207 checks).
  • v1.81.0 - icmg bisect: auto-find the commit that broke a test, plus atom-memory completion. New icmg bisect --good <ref> --bad <ref> --test "<cmd>" binary-searches your git history, checking out each midpoint and running your test, to pinpoint the first commit where the test starts failing, then restores your original HEAD (it refuses a dirty working tree and never commits or rewrites history). It prints an estimate of the number of test runs (~N) up front. Separately, the v1.79.0 semantic-atom layer is completed: derived atoms now carry precomputed embeddings when an ONNX backend is available (enabling semantic atom matching; clean BM25 fallback when absent), and ICMG_ATOMIZE_LLM=1 opts into local-LLM fact extraction (self-contained, pronoun-resolved) with automatic heuristic fallback. Also folds in a rebrand fix so icmg bug-report files to the correct repository. Full automated suite passes (1207 checks).
  • v1.79.0 - Semantic memory: recall can hit single facts instead of whole blobs — plus a real headless sub-agent. icmg now derives an atomic-fact layer from your stored memories. When you icmg store a multi-sentence decision, a background worker (icmg atomize run) splits it into atomic propositions — heuristic by default, opt-in local-LLM via ICMG_ATOMIZE_LLM=1 — so icmg recall "<query>" --atoms matches the exact fact and returns its source memory. Sharper hits on a large store, and zero latency added to store (it only enqueues; atomization happens off the hot path) and zero change to default recall (--atoms is opt-in). New icmg atomize status shows atom count + pending queue; opt out entirely with ICMG_ATOMIZE=0. Separately, icmg agent "<task>" --exec upgrades the LLM proxy into a real headless sub-agent with file-edit + shell tools (gated behind ICMG_AGENT_EXEC=1 so it never fires by accident). This release also folds in two Windows reliability fixes: icmg init no longer hangs for 30+ minutes (background imports now run fully detached) and icmg update --apply no longer gets blocked when stale icmg processes hold the binary (updating.lock sentinel + rename retry). Full automated suite passes (1196 checks).
  • v1.78.4 - Fix: icmg init no longer hangs, and self-upgrade no longer gets blocked by stale processes. Two Windows reliability fixes. (1) On some projects icmg init could appear to hang for 30+ minutes: it launched its background import helpers (claudemd import / plan import / skill index) through a shell whose stdout pipe was inherited by the spawned grandchild processes, so icmg blocked reading that pipe until those children exited — and they in turn stalled on a database lock. init now spawns the background work fully detached (no inherited handles), so it returns immediately while the imports finish on their own. (2) icmg update --apply could fail to swap the binary when other icmg processes (editor hooks, the background daemon) were still running and holding the .exe file lock. icmg now writes an updating.lock sentinel so any freshly-spawned icmg bails out during the brief swap window, and the upgrade retries the rename a few times to let in-flight processes exit cleanly. Full automated suite passes (1187 checks).
  • v1.78.3 - RAM cache now survives daemon restart. v1.77 introduced the in-memory recall cache that keeps repeat recalls under 5 ms; v1.78.2 shipped the write-through + warm-reload building blocks. v1.78.3 finally wires it end-to-end inside the daemon: every PUT is persisted asynchronously through a write queue (non-blocking on the hot path), and on first touch of a project's scope the daemon lazy-hydrates the top-256 hottest entries from disk into RAM. The cache is daemon-shared across sessions and projects, but each entry is tagged with a scope hash so different projects can't see each other's recalls. The RCACHE protocol gains a scope field on PUT/GET (older clients without it land in a back-compat empty bucket and continue to work). Persist is on by default; opt out with ICMG_RECALL_CACHE_PERSIST=0 or a per-project .icmg/cache-persist.off marker. Full automated suite passes (1182 atomic, +13 new daemon-wire tests).
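
The binary search that icmg bisect is described as performing (v1.81.0 above) can be illustrated with a small, self-contained sketch. This is not Icemage's implementation: the function names, the commit list, and the `is_good` callback are hypothetical stand-ins, and no git checkout happens here. It only shows why the number of test runs grows as roughly log2(N) rather than N.

```python
from typing import Callable, Sequence


def first_bad_commit(
    commits: Sequence[str], is_good: Callable[[str], bool]
) -> tuple[str, int]:
    """Find the first failing commit by binary search.

    `commits` is ordered oldest to newest; commits[0] is assumed known-good
    and commits[-1] known-bad. Returns (first bad commit, test runs used).
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    runs = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        runs += 1  # one test run per midpoint (a real tool would check out the commit)
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi], runs


def estimated_runs(n_commits: int) -> int:
    """Worst-case test runs: ceil(log2(span)), where span = n_commits - 1."""
    span = max(n_commits - 1, 1)
    return (span - 1).bit_length()


if __name__ == "__main__":
    history = [f"c{i}" for i in range(17)]  # hypothetical commit ids c0..c16
    bad, runs = first_bad_commit(history, lambda c: int(c[1:]) < 11)
    print(bad, runs, estimated_runs(len(history)))
```

With 17 commits between a good and a bad ref, the search needs at most 4 test runs, where a linear scan could need 15.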

🧰 What you'll actually use day-to-day

After install, the only command most people type is icmg init once per project. Everything else happens automatically. A few useful commands when you want to peek under the hood:

| Want to | Type |
| --- | --- |
| See how much you saved this month | icmg savings |
| See a chart in the terminal | icmg savings --ascii |
| Recall a past decision in this project | icmg recall "<question>" |
| Recall something from another project | icmg cross-recall "<question>" |
| Wake-up briefing for a fresh session | icmg wake-up |
| Update Icemage in place | icmg update --apply |
| Health-check the install | icmg doctor |

For the full menu run icmg --help.


🤖 Works with

  • Claude Code (primary target — best-tested)
  • Cursor — drop-in via the same hooks
  • Cline, Windsurf, OpenCode — same approach, may need a small config nudge
  • Anything that exposes hooks or MCP — the MCP server bundled with Icemage is reusable

🛡️ Safety + privacy

  • 100 % local. Everything Icemage knows about your projects lives in a small SQLite database next to your code. Nothing is sent to a remote server — not the project name, not the file paths, not the recalled snippets.
  • No telemetry. Icemage doesn't phone home.
  • Apache-2.0 licensed, source held privately. Audit the binary, the release notes, and the file structure freely. The source code is kept private to keep the bug surface manageable for a solo maintainer: public reports plus private fixes is the operating model.
  • Tamper-evident. Every release ships with a sha256 sidecar so you can verify the binary you downloaded.
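
Checking a download against its sha256 sidecar can be done with a few lines of standard-library Python. The sketch below is an illustration, not an official verification tool: the file names are hypothetical, and it assumes the sidecar's first whitespace-separated token is the hex digest (the layout `sha256sum` produces), which this README does not specify.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(binary: Path, sidecar: Path) -> bool:
    """Return True only if the binary's digest equals the one in the sidecar.

    Assumption: the sidecar's first token is the hex digest.
    """
    tokens = sidecar.read_text().split()
    if not tokens:
        return False  # an empty sidecar can never verify anything
    return hmac.compare_digest(sha256_of(binary), tokens[0].lower())


if __name__ == "__main__":
    # Hypothetical file names; substitute the actual release asset and sidecar.
    binary, sidecar = Path("icmg.exe"), Path("icmg.exe.sha256")
    if binary.exists() and sidecar.exists():
        print("OK" if matches_sidecar(binary, sidecar) else "MISMATCH: do not run")
```

A mismatch means the file was corrupted or altered in transit and should be downloaded again rather than executed.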

🩹 Honest limits

  • Windows + Linux only for prebuilt binaries today. macOS users currently need to wait for a self-hosted runner build (planned).
  • First-time install on Windows with strict antivirus can be slow until you let Icemage run once. After that it's fast.
  • Not a replacement for the AI. Icemage is a token-trimming layer — it doesn't write code for you and it doesn't make a bad AI smart.

💖 Support

If Icemage saved you a few hours or a few dollars and you want to send a small thank-you, both routes work: the sponsor link and Ko-fi.

All revenue goes straight into more releases — there is no team behind this, just one maintainer and a long backlog of "make AI agents less wasteful" ideas.


❓ FAQ

Does Icemage send my code anywhere? No. Everything is local. Icemage touches the network only when you ask it to update itself or fetch a URL through icmg fetch.

Can my company use it? Yes — Apache-2.0 licensed, free for any use including commercial. If you want a private support arrangement or a custom build, open a sponsorship.

Why is the source code repo private? One maintainer, no security team. Public bug reports + private fixes lets me ship hotfixes the same day without telegraphing exploitable details. The release binaries and reproducible build hash are still public.

Does it slow my AI down? No. Trimming happens before the AI reads anything, so the AI sees a smaller, cleaner version of the same context. End-to-end interactions get faster, not slower.

Where are the savings stored? In .icmg/data.db inside each project (small SQLite file). Run icmg savings to see the breakdown.
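
Because .icmg/data.db is described as a plain SQLite file, it can also be inspected with any SQLite client. The sketch below lists the table names using Python's standard library. It makes no assumption about Icemage's schema, which is not documented here; the `list_tables` helper is hypothetical, and a database encrypted with the opt-in icmg encrypt feature would presumably not open this way.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro guarantees that inspection cannot modify the database.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = Path(".icmg") / "data.db"  # path as given in the FAQ answer above
    if db.exists():
        for table in list_tables(db):
            print(table)
```

For everyday use, icmg savings remains the supported way to read the numbers; direct inspection is mainly useful for confirming that the data really does live on your own disk.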

How do I report a bug or ask for a feature? Open an issue at the GitHub issues page. Real-world reproductions with icmg savings --json attached get triaged fastest.


📜 License

Apache-2.0.


📚 Other docs