fix(cache): invalidate schema cache on contract change (#58)#59
Conversation
mcp2cli's per-credential/per-service schema cache only expired on a 24h TTL, so after an upstream contract bump (e.g. Open Brain memory-tools.v11 adding append_session_event.create_if_missing) `mcp2cli schema` and generate-skills served the OLD tool shape for up to a day, even though live tool calls already worked against the new contract. This lands the cache-coherence core (general, capability-tiered): - Schema fingerprint in cache metadata. writeCache now records a deterministic schemaFingerprint over the full tool surface (incl. inputSchema, so a new field like create_if_missing changes it). Accepts an authoritative contract hash when a server publishes one; otherwise derives it from the tool surface. New fingerprintSchemas() helper (order-independent, change-sensitive). - Invalidate ALL keys for a service. clearServiceCacheKeys() clears the bare entry AND every per-credential key (credential:<base64url([service,user])>), so a bump can't leave a credential-scoped read serving the old schema. - Daemon self-heal + invalidation. The drift-hook (already runs on connect via the open connection -- no self-call) now clears all cache keys for a service when it detects drift, then repopulates the base entry. - Client-side coherence via piggyback. The daemon stamps its current schemaFingerprint (per-service) onto /list-tools and /schema responses; the client compares it to its own cached fingerprint and drops its stale cache on mismatch -- closing the staleness on the client's ~/.cache layer with no extra round-trip. Cold/absent cache is left alone (refetches naturally). - cache warm --force: clear-then-refetch, the documented recovery after a bump. Deferred to a follow-up (noted on #58): the `cache warm` daemon self-call deadlock fix touches the #52 "route schema refresh through daemon" decision and warrants its own PR; the drift-hook repopulation here already routes around it. Verified: bun run typecheck clean; full suite 1069 pass / 0 fail, deterministic across repeated runs. New unit coverage for fingerprintSchemas, clearServiceCacheKeys (base + credential keys), reconcileClientCache (mismatch / match / cold / no-fingerprint / error), and cache warm --force. Refs #55 (nested-field schema discoverability -- worked together). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review swarm: BLOCKER found — do not merge yetRan a 3-reviewer adversarial swarm on this PR. All three independently converged on the same root cause. The client-side piggyback coherence is broken as written and would destroy the cache it's meant to keep coherent. BLOCKER — fingerprint comparison is across two different keys (never converges)
BLOCKER (B2) — divergent hash surface even on the bare keyDaemon writers hash MAJOR —
|
…58 review) Review-swarm blockers on PR #59 (all three reviewers converged): - B1 (key mismatch, false-positive invalidation): the daemon stamped the per-credential pool-key fingerprint while the client compared its bare-service fingerprint -- different files, never converge -> the client cleared its cache on every call after a warm. The daemon now stamps readCacheFingerprint( body.service) (bare) on /list-tools and /schema. The drift-hook already maintains the bare entry on every connect (checkDriftOnConnect runs with baseServiceName), so both sides reference the same, credential-independent key. - B2 (description-default divergence): the daemon hashed `description ?? ""` while the client warm path hashed the already-defaulted "(no description)", producing different fingerprints for the same tool. hashToolSchema now normalizes "" and "(no description)" to one canonical value, so every writer derives the same fingerprint (proven: daemon-style and client-style hashes are now identical; real descriptions stay distinct). - M1 (silent under-delete): clearServiceCacheKeys incremented its count even when unlink was swallowed, reporting success while a stale credential entry survived -- re-opening the staleness. It now counts only resolved unlinks and logs failures. - Added a real daemon/client fingerprint-convergence round-trip test that exercises both write surfaces; it would have caught B1/B2 (the prior tests hand-injected fingerprints and never compared two independently-derived ones). Verified: bun run typecheck clean; full suite 1070 pass / 0 fail, deterministic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… guard) The re-review noted the B1 fix (daemon stamps the bare-service fingerprint, not the credential pool key) had no regression guard -- a revert to the pool key would ship silently. This adds a daemon /list-tools test that wires a per-user credential so the pool key is genuinely a `credential:` key distinct from the bare key, seeds the two with different fingerprints, and asserts the response carries the BARE one. Mutation-verified: reverting the stamp to listPoolKey fails this test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-review: blockers fixed, verified SAFE TO MERGEAdversarial re-verification (same reviewer) on the fixes:
Residual (MINOR, tracked as #61): bare-key fingerprint is last-writer-wins if an upstream returns different tool sets per credential. Self-heals, no loop, no data crossing; uncommon in practice. Out of scope here. Verified: typecheck clean; full suite 1071 pass / 0 fail, deterministic. CI green. Ready to merge. |
Problem
mcp2cli's per-credential/per-service schema cache only expired on a 24h TTL, so after an upstream contract bump (e.g. Open Brain
memory-tools.v11addingappend_session_event.create_if_missing)mcp2cli schemaandgenerate-skillsserved the old tool shape for up to a day — even though live tool calls already worked against the new contract. Fixes #58.What this lands (cache-coherence core — general, capability-tiered)
Schema fingerprint in cache metadata.
writeCacherecords a deterministicschemaFingerprintover the full tool surface (incl.inputSchema, so a new field likecreate_if_missingchanges it). Accepts an authoritative contract hash when a server publishes one (e.g. OB'sget_contract.schema_hash); otherwise derives it from the tool surface. NewfingerprintSchemas()helper — order-independent, change-sensitive.Invalidate ALL keys for a service.
clearServiceCacheKeys()clears the bare entry and every per-credential key (credential:<base64url([service,user])>), so a bump can't leave a credential-scoped read serving the old schema (the observed failure: client cache stale while another key looked refreshed).Daemon self-heal + invalidation. The drift-hook (already runs on connect via the open connection — no self-call) now clears all cache keys for a service on detected drift, then repopulates the base entry.
Client-side coherence via piggyback. The daemon stamps its current per-service
schemaFingerprintonto/list-toolsand/schemaresponses; the client compares it to its own cached fingerprint and drops its stale~/.cacheentries on mismatch — no extra round-trip. Cold/absent cache is left alone (refetches naturally).cache warm --force— clear-then-refetch, the documented recovery after a bump.Deferred to a follow-up (noted on #58)
The
cache warmdaemon self-call deadlock (Daemon failed to start within 10000mswhen warming a daemon-routed service) touches PR #52's "route schema refresh through daemon" decision and warrants its own focused PR. The drift-hook repopulation in this PR already routes around it (reuses the open connection), so the invalidation core is complete and deadlock-safe without it.Acceptance status
rm.credential:*keys.generate-skillsregenerates from current schemas (drift-hook auto-regen path, unchanged).cache warmsucceeds on the CT216 daemon — follow-up (deadlock fix + [codex] route schema refresh through daemon #52 reconciliation).Verification
bun run typecheckclean.fingerprintSchemas,clearServiceCacheKeys(base + credential keys, unrelated services untouched),reconcileClientCache(mismatch / match / cold / no-fingerprint / error response), andcache warm --force.Refs #55 (nested-field schema discoverability — worked together).
🤖 Generated with Claude Code