CacheSentry v0.1.0

CacheSentry is an offline prompt-cache hygiene guardrail for LLM applications.

It helps teams detect unstable prompt-prefix content such as UUIDs, timestamps, dynamic metadata, and template drift before those changes reduce prompt/KV-cache reuse.

Highlights

Trace-wide prompt-prefix stability audit
CI policy mode with exit codes 0/1/2
GitHub annotations for cache-breaking prompt patterns
Culprit detection for UUID/request IDs, timestamps, and dynamic metadata
Field-level attribution such as messages[0].content
Safe fix recommendations requiring human review
Redacted Markdown and JSON reports
Docker image support
GitHub Action packaging
Optional local benchmark mode for backend validation
Release checklist, positioning docs, and overclaiming tests

What CacheSentry does not do

It does not replace vLLM, SGLang, TensorRT-LLM, LMCache, LangSmith, Promptfoo, or semantic caches.
It does not call paid APIs in CI mode.
It does not guarantee latency improvement.
It does not measure backend TTFT/cache metrics in CI mode.
It does not automatically rewrite prompts.

Best first command

python -m cachesentry.cli ci examples/traces/mixed_cache_breakers.jsonl \
  --model Qwen/Qwen2.5-1.5B-Instruct \
  --fail-on-severity high \
  --redaction-mode mask \
  --github-annotations

Expected result: the mixed trace fails with cache-breaker violations for UUID/request ID and timestamp patterns.

Release status

This is a v0.1.0 first public MVP release. The CLI and reports are usable, but the project is still pre-1.0 and future versions may change APIs or schemas.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CacheSentry v0.1.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

CacheSentry v0.1.0

Highlights

What CacheSentry does not do

Best first command

Release status

Uh oh!