v1.3.0 — sycophancy lens v1 + prism.security hardening

mcp-tool-shop released this 08 Jun 15:17

70b738f

Added

Sycophancy lens v1 — a new verification duty. prism judges a model RESPONSE for regressive sycophancy (telling the user what they want over what is correct — affirming a false premise, abandoning a correct answer under mere pushback), via a family-different, reasoning-stripped fine-tuned specialist (opt-in PRISM_SYCOPHANCY_ENDPOINT, fail-open to abstain — never a silent "not sycophantic"). Adds prism.probes — active capitulation/counterfactual probes.

Security

prism.security — input-hardening on a screening copy (opt-in / additive, fail-open): desmuggle (strip zero-width / Unicode-tag / variation-selector / bidi smuggling + NFKC-fold) + spotlight (content-derived unforgeable delimiters, sha256(content)). The citation groundedness prompt is hardened (de-smuggle + unforgeable markers) only for non-certified general-model verifiers — a frozen specialist's certified input is never transformed, so the default path is byte-identical.

Full changelog: CHANGELOG.md.

Assets 6