Skip to content

v3.1.1 — install-service + auto-recovery + security hardening doc

Choose a tag to compare

@cnighswonger cnighswonger released this 25 Apr 13:32
· 121 commits to main since this release
8d702ad

Three additions in v3.1.1: a service-install subcommand, an auto-recovery healthcheck companion (motivated by a real incident), and an honest security hardening doc.

cache-fix-proxy install-service subcommand (#73, closes #48)

cache-fix-proxy install-service     # systemd (Linux) or launchd (macOS)
cache-fix-proxy uninstall-service   # stop, disable, remove
cache-fix-proxy server              # run just the proxy in foreground (for ExecStart)
cache-fix-proxy help
  • Detects platform; writes the right config for the right service manager
  • Refuses to overwrite without --force
  • Picks up CACHE_FIX_PROXY_PORT, CACHE_FIX_PROXY_UPSTREAM, CACHE_FIX_DEBUG from the env at install time
  • Existing no-subcommand wrapper behavior unchanged (back-compat)

Healthcheck companion for proxy auto-recovery (#75)

Linux only — launchd's KeepAlive already covers macOS. After the 2026-04-25 incident where the proxy was stopped by an unidentified caller during the Anthropic outage and stayed down for ~10 hours (Restart=on-failure doesn't fire on clean stops), install-service now also drops a healthcheck pair:

  • cache-fix-proxy-healthcheck.service — oneshot that does curl -fs http://127.0.0.1:<port>/health and systemctl --user start cache-fix-proxy.service if the probe fails
  • cache-fix-proxy-healthcheck.timer — fires the oneshot 30s after boot then every 2 minutes

Recovery within 2 minutes from any stop cause: clean stop, crash, OOM, external systemctl stop. uninstall-service stops the timer FIRST, then the proxy, then removes all three files (avoids the timer immediately restarting the proxy we're about to remove).

Hardening (folded in during #75 review)

  • Port validation against shell injection — the healthcheck's /bin/sh -c interpolates the port; a hostile CACHE_FIX_PROXY_PORT value with shell metacharacters could have injected commands. Now validated as a plain decimal integer in 1..65535 with clear error messages.
  • Symmetric existence check on the healthcheck pair — refuses overwrite if either the service file OR the timer file exists. Catches half-installed stale artifacts.
  • Half-install rollback — if the healthcheck install throws after the main unit is written, the main unit is removed so users aren't left in a partial state.

New doc: docs/security-hardening.md (#74)

Honest threat model + practical mitigations + what we explicitly DON'T defend against. Covers:

  • The trust model (you're trusting Anthropic + the network path + the model + your tool-use approvals)
  • Threat surface ranked by realistic exposure (prompt injection via tool results is #1, not Anthropic-as-attacker)
  • Highest-leverage mitigation: agent isolation — separate user / container / VM, not config tweaks
  • Proposed defense-in-depth: a dangerous-command filter in the proxy that scans response tool_use blocks for known-destructive patterns (filed as future work, candidate for v3.2.0)
  • Audit-trail enablement docs for systemd user-manager debug logging — what we just enabled on our own system after the unidentified-stop incident

Tests

433 → 465 (32 new). All green on the tagged commit.

Tarball

claude-code-cache-fix-3.1.1.tgz — 124.9 kB packed / 422.3 kB unpacked / 48 files.

No breaking changes / no migration

The new subcommands are additive. Existing cache-fix-proxy no-arg wrapper behavior is unchanged. Existing v3.1.0 installations don't need to do anything to upgrade except npm update -g claude-code-cache-fix.

— AI Team Lead