Releases: scrubr-dev/source-code
v1.1.0
The project is now SCRUBR — Sensitive Content Replacement, Unmasking, Brokering
& Rehydration. This is a naming release: there is no functional or wire change
(the sentinel format is unchanged), but the binary, container image, Helm chart, env
vars, and response headers are renamed, so existing deployments must update those
references (e.g. scrub → scrubr, ghcr.io/scrub-dev/scrub →
ghcr.io/scrubr-dev/scrubr, SCRUB_* → SCRUBR_*, x-scrub-* → x-scrubr-*).
Changed
- Renamed the project SCRUB → SCRUBR (Sensitive Content Replacement, Unmasking,
Brokering & Rehydration). This renames the binary (scrub → scrubr), the
crates (scrub / scrub-core → scrubr / scrubr-core), the container image
(ghcr.io/scrubr-dev/scrubr), the Helm chart (oci://ghcr.io/scrubr-dev/charts/scrubr),
the env vars (SCRUBR_*), the response headers (x-scrubr-*), default
file/paths (scrubr.yaml, /etc/scrubr, scrubr-audit.jsonl), and all repo/site
URLs (scrubr-dev/source-code, scrubr-dev.github.io). No functional change; the
sentinel format is unchanged.
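For deployments that keep their settings in an env file, the rename can be applied mechanically. The following is a minimal sketch, assuming plain KEY=VALUE lines; the function names and the sample keys are hypothetical, and only the SCRUB_ → SCRUBR_ and x-scrub- → x-scrubr- prefixes come from the notes above.

```python
import re

# Prefix renames taken from the release notes; everything else is illustrative.
_ENV_PREFIX = re.compile(r"^SCRUB_")
_HEADER_PREFIX = re.compile(r"^x-scrub-", re.IGNORECASE)


def rename_env_key(key: str) -> str:
    """Map an old SCRUB_* variable name to SCRUBR_*; leave other names alone."""
    return _ENV_PREFIX.sub("SCRUBR_", key)


def rename_header(name: str) -> str:
    """Map an old x-scrub-* header name to x-scrubr-*; leave other names alone."""
    return _HEADER_PREFIX.sub("x-scrubr-", name)


def migrate_env_lines(lines):
    """Rewrite KEY=VALUE lines, keeping comments, blanks, and values untouched."""
    out = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in line:
            out.append(line)
            continue
        key, value = line.split("=", 1)
        out.append(f"{rename_env_key(key.strip())}={value}")
    return out
```

Both patterns are anchored and require the old separator right after the prefix, so running the helper over already-migrated names leaves them unchanged.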
v1.0.2
Added
- AI-agent onboarding: a root SKILL.md (agent skill) describing how to
self-install and operate SCRUB, plus generated /llms.txt and /llms-full.txt
on the docs site (the llmstxt.org convention) so AI agents
can discover, install, and use SCRUB.
v1.0.1
Security-hardening release following a deep review of the data path. The only wire
change is the sentinel format (⟦S:TYPE·id·tag⟧); multi-node clusters must set
sessions.encryption_key (see below).
Security — sentinel authentication & masking coverage
- Authenticated sentinels: every sentinel now carries a per-vault keyed MAC
tag (⟦S:TYPE·id·tag⟧, HMAC-SHA256). Rehydration resolves an id only if its tag
matches, so a hostile/compromised upstream can no longer forge or blindly
enumerate sentinels (⟦S·0⟧, ⟦S·1⟧, …) to read a session vault. Cross-node
session tag keys derive from sessions.encryption_key (set it on clusters).
- Enforce mode fails closed on unparseable JSON: a JSON-typed request body
that doesn't parse is rejected (422) instead of forwarded unmasked.
- Opt-in comprehensive scanning: setting scan_paths: ["**"] (and stream_paths)
scans/rehydrates every string leaf, for deployments that want maximum coverage
over provider-aware minimalism.
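The tag check described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the notes only say the tag is a per-vault keyed HMAC-SHA256 in the form ⟦S:TYPE·id·tag⟧, so the MAC input, the hex encoding, the 16-character truncation, the TYPE character set, and the names mint, rehydrate, and vault are all hypothetical.

```python
import hashlib
import hmac
import re

# Assumption: the tag is a truncated hex HMAC over "TYPE·id". The MAC input,
# encoding, and tag length are illustrative, not taken from the release notes.
TAG_HEX_CHARS = 16

SENTINEL_RE = re.compile(r"⟦S:([A-Z_]+)·(\d+)·([0-9a-f]+)⟧")


def _tag(vault_key: bytes, kind: str, ident: int) -> str:
    msg = f"{kind}·{ident}".encode("utf-8")
    return hmac.new(vault_key, msg, hashlib.sha256).hexdigest()[:TAG_HEX_CHARS]


def mint(vault_key: bytes, kind: str, ident: int) -> str:
    """Build a sentinel whose tag binds the type and id to this vault's key."""
    return f"⟦S:{kind}·{ident}·{_tag(vault_key, kind, ident)}⟧"


def rehydrate(text: str, vault_key: bytes, vault: dict) -> str:
    """Replace sentinels with vault values, but only when the tag verifies."""

    def resolve(match):
        kind, ident, tag = match.group(1), int(match.group(2)), match.group(3)
        expected = _tag(vault_key, kind, ident)
        # Constant-time comparison; a forged or guessed tag is left as-is.
        if hmac.compare_digest(tag, expected) and ident in vault:
            return vault[ident]
        return match.group(0)

    return SENTINEL_RE.sub(resolve, text)
```

Because the id alone is no longer enough to resolve a value, an upstream that echoes guessed ids back gets its own text returned unchanged.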
Security — SSRF, isolation, DoS, and fail-closed
- Upstream redirects are no longer followed (proxy, interception, and Vault
clients). Prevents SSRF to internal/metadata endpoints and stops a malicious
upstream from getting a rehydrated (secret-bearing) response via a redirect; the
Vault connector can no longer leak X-Vault-Token to a redirect target.
- Session vaults are tenant-isolated by an unforgeable namespace scheme: a
flat-auth (non-tenant) client can no longer craft a session key that collides
with a tenant's vault.
- CONNECT-proxy tunnels refuse loopback / link-local targets (blocks the cloud
metadata endpoint 169.254.169.254 and localhost pivots) and connect to the
exact vetted IP (no DNS-rebind window).
- Certificate minting is restricted to configured interception hosts (+ a cache
cap), preventing a DoS from unbounded key-generation on arbitrary SNI values.
- Audit and transaction logs are created 0600 (owner-only) on Unix; the
transaction log can hold original content in dry-run mode.
- Proxy auth compares fixed-length SHA-256 digests in constant time, removing a
key-length timing side channel.
- Vault dedup is keyed by the exact original bytes, not a truncated 64-bit
hash: two distinct secrets can no longer be conflated into one sentinel (which
would mis-rehydrate one secret as another in a shared session vault). All
plaintext copies (forward keys + reverse values) are zeroized on drop.
- Per-leaf streaming rehydration: each response content leaf (e.g. each
choices[i].delta.content) gets its own rehydrator, so a partial sentinel's
carry can no longer bleed between leaves and leak an un-rehydrated sentinel when
n > 1 (multiple choices / content items) with a sentinel split across events.
- Id-space exhaustion is fail-safe: a session vault never mints an id past its
node's range (which would rehydrate to the wrong secret across nodes); on
exhaustion the value is masked with a reserved non-reversible id.
- Secret-source loading fails closed: if a configured source (file/Vault)
errors, reload keeps the previous good config and startup refuses, rather than
silently running with reduced masking coverage.
- Loud warning when sessions.node_id is unset with the Redis backend (random
fallback can collide across nodes); Redis read failures are logged (not silently
treated as an empty session); masking.ttl parsing no longer overflows.
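The per-leaf streaming fix is easiest to see in a small model. The sketch below assumes chunks arrive as already-decoded text, that the vault maps a full sentinel string to its original value, and that a sentinel never exceeds MAX_CARRY characters; the class names and that bound are hypothetical and say nothing about how the proxy itself is written.

```python
import re

SENTINEL_RE = re.compile(r"⟦S:[^⟦⟧]*⟧")
MAX_CARRY = 128  # Assumption: upper bound on a sentinel's length.


class LeafRehydrator:
    """Rehydrates one content leaf, holding back a sentinel split across chunks."""

    def __init__(self, vault):
        self.vault = vault  # maps full sentinel text -> original value
        self.carry = ""

    def feed(self, chunk: str) -> str:
        buf = self.carry + chunk
        cut = buf.rfind("⟦")
        # Hold back an opening bracket that has not been closed yet, unless the
        # held text is too long to be a sentinel.
        if cut != -1 and "⟧" not in buf[cut:] and len(buf) - cut <= MAX_CARRY:
            ready, self.carry = buf[:cut], buf[cut:]
        else:
            ready, self.carry = buf, ""
        # Unknown sentinels are passed through unchanged.
        return SENTINEL_RE.sub(
            lambda m: self.vault.get(m.group(0), m.group(0)), ready
        )

    def finish(self) -> str:
        """Flush whatever is still held when the stream ends."""
        tail, self.carry = self.carry, ""
        return tail


class PerLeafRehydrator:
    """Keeps one LeafRehydrator per leaf key so carries never mix."""

    def __init__(self, vault):
        self.vault = vault
        self.leaves = {}

    def feed(self, leaf_key, chunk: str) -> str:
        leaf = self.leaves.get(leaf_key)
        if leaf is None:
            leaf = self.leaves[leaf_key] = LeafRehydrator(self.vault)
        return leaf.feed(chunk)
```

Feeding interleaved events for two choices through a single LeafRehydrator splices the second choice's text into the first choice's held sentinel, so the lookup misses and the sentinel is emitted as-is. Keying by leaf (for example, the choice index) keeps each carry separate.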
v1.0.0
First production release. SCRUB is a single-binary forward proxy that masks
secrets / PII / sensitive data on outbound LLM-provider requests and rehydrates
them on responses (including streaming). The 0.x entries below built up to this;
from 1.0 the public CLI, config schema, and chart values follow SemVer.
Highlights
- Engine: reversible sentinel masking (⟦S:TYPE·id⟧), single-pass detection
(glossary + regex meta-engine + entropy + heuristic NER), provider-aware scan
paths, and streaming/SSE-correct rehydration.
- Secret sources: .env, file, and HashiCorp Vault (KV v2); a curated ruleset.
- Sessions: request- or session-scoped pseudonyms; in-memory or Redis backend
with node-disjoint ids and AES-256-GCM at-rest encryption.
- Policy: dry-run, per-route overrides, multi-tenant isolation, constant-time auth.
- Transport: TLS termination and interception (SNI-transparent + CONNECT proxy)
with on-the-fly per-host certs; usable as an OS HTTP proxy.
- Auditing: tamper-evident hash-chained audit log + full (masked) transaction log.
- Delivery: static multi-arch binaries, a multi-arch container image, and a
Helm chart (single-node + HA StatefulSet/Redis) published as an OCI artifact;
a documentation website.
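The idea behind a hash-chained audit log can be illustrated in a few lines. This is a generic sketch of the technique, not the project's record format: the field names (event, prev, hash), the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # Assumption: fixed starting value for the first record.


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same bytes.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def verify(log: list) -> bool:
    """Recompute the chain; an edited, reordered, or mid-chain-removed record fails."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this detects changes inside the log, but not removal of the most recent records; catching that needs the latest hash to be stored somewhere outside the log.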