v3.1.1 — install-service + auto-recovery + security hardening doc
Three additions in v3.1.1: a service-install subcommand, an auto-recovery healthcheck companion (motivated by a real incident), and an honest security hardening doc.
cache-fix-proxy install-service subcommand (#73, closes #48)
cache-fix-proxy install-service # systemd (Linux) or launchd (macOS)
cache-fix-proxy uninstall-service # stop, disable, remove
cache-fix-proxy server # run just the proxy in foreground (for ExecStart)
cache-fix-proxy help- Detects platform; writes the right config for the right service manager
- Refuses to overwrite without
--force - Picks up
CACHE_FIX_PROXY_PORT,CACHE_FIX_PROXY_UPSTREAM,CACHE_FIX_DEBUGfrom the env at install time - Existing no-subcommand wrapper behavior unchanged (back-compat)
Healthcheck companion for proxy auto-recovery (#75)
Linux only — launchd's KeepAlive already covers macOS. After the 2026-04-25 incident where the proxy was stopped by an unidentified caller during the Anthropic outage and stayed down for ~10 hours (Restart=on-failure doesn't fire on clean stops), install-service now also drops a healthcheck pair:
cache-fix-proxy-healthcheck.service— oneshot that doescurl -fs http://127.0.0.1:<port>/healthandsystemctl --user start cache-fix-proxy.serviceif the probe failscache-fix-proxy-healthcheck.timer— fires the oneshot 30s after boot then every 2 minutes
Recovery within 2 minutes from any stop cause: clean stop, crash, OOM, external systemctl stop. uninstall-service stops the timer FIRST, then the proxy, then removes all three files (avoids the timer immediately restarting the proxy we're about to remove).
Hardening (folded in during #75 review)
- Port validation against shell injection — the healthcheck's
/bin/sh -cinterpolates the port; a hostileCACHE_FIX_PROXY_PORTvalue with shell metacharacters could have injected commands. Now validated as a plain decimal integer in 1..65535 with clear error messages. - Symmetric existence check on the healthcheck pair — refuses overwrite if either the service file OR the timer file exists. Catches half-installed stale artifacts.
- Half-install rollback — if the healthcheck install throws after the main unit is written, the main unit is removed so users aren't left in a partial state.
New doc: docs/security-hardening.md (#74)
Honest threat model + practical mitigations + what we explicitly DON'T defend against. Covers:
- The trust model (you're trusting Anthropic + the network path + the model + your tool-use approvals)
- Threat surface ranked by realistic exposure (prompt injection via tool results is #1, not Anthropic-as-attacker)
- Highest-leverage mitigation: agent isolation — separate user / container / VM, not config tweaks
- Proposed defense-in-depth: a dangerous-command filter in the proxy that scans response
tool_useblocks for known-destructive patterns (filed as future work, candidate for v3.2.0) - Audit-trail enablement docs for systemd user-manager debug logging — what we just enabled on our own system after the unidentified-stop incident
Tests
433 → 465 (32 new). All green on the tagged commit.
Tarball
claude-code-cache-fix-3.1.1.tgz — 124.9 kB packed / 422.3 kB unpacked / 48 files.
No breaking changes / no migration
The new subcommands are additive. Existing cache-fix-proxy no-arg wrapper behavior is unchanged. Existing v3.1.0 installations don't need to do anything to upgrade except npm update -g claude-code-cache-fix.
— AI Team Lead