SecureVector Guardian 1.3.0

@mss04132020 released this 17 Jun 03:32
· 3 commits to main since this release
37f6cda

Adds encoded-payload and agent-era injection coverage, and hardens the model's evaluation and data-provenance guarantees. All training data remains 100% SecureVector-original — now enforced by automated tests.

Added

  • URL / percent-encoding decode-and-rescan — percent-decodes inline %xx payloads and rescans the plaintext (e.g. ignore%20all%20previous%20instructions). Gated so it only activates when %xx is present and decoding changes the text — benign prose and benign encoded URLs produce no false positives.
  • Broadened agent-era injection coverage via original training templates: tool/plugin misuse, RAG / retrieved-document indirect injection, and memory/conversation poisoning. (Concepts from OWASP LLM06/LLM08 and MITRE ATLAS; all example text authored by SecureVector.)
  • Honest, leak-proof evaluation — content-hash–frozen held-out test set, train/test near-duplicate (paraphrase) leak guard, recall-at-FPR frontier, 95% bootstrap CIs, and per-category support flags.
  • Adversarial red-team regression eval over a frozen 1,955-example corpus (held out of training, verified by the leak guard).
  • Provenance enforcement — internal-source + no-public-dataset-marker checks run during training; a static no-public-dataset-import guard runs in CI.
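
The gating described for the percent-encoding decode-and-rescan item can be sketched roughly as follows. This is only an illustration under stated assumptions, not SecureVector's implementation: `decoded_variant`, `scan`, and `toy_detector` are hypothetical names, and the detector is a trivial phrase match standing in for the real model. Only `re` and `urllib.parse.unquote` from the standard library are used.

```python
import re
from typing import Callable, Optional
from urllib.parse import unquote

# Matches a single percent-encoded byte such as %20 or %2F.
_PCT_BYTE = re.compile(r"%[0-9A-Fa-f]{2}")


def decoded_variant(text: str) -> Optional[str]:
    """Return percent-decoded text to rescan, or None if the gate stays closed.

    Hypothetical helper: the gate opens only when a %xx sequence is present
    AND decoding actually changes the text.
    """
    if not _PCT_BYTE.search(text):
        return None  # no %xx present, nothing to decode
    decoded = unquote(text)
    if decoded == text:
        return None  # decoding was a no-op, so a rescan would be redundant
    return decoded


def scan(text: str, detector: Callable[[str], bool]) -> bool:
    """Scan the raw text first, then the decoded variant if the gate opened."""
    if detector(text):
        return True
    variant = decoded_variant(text)
    return variant is not None and detector(variant)


def toy_detector(text: str) -> bool:
    """Stand-in detector (assumption): flags one known injection phrase."""
    return "ignore all previous instructions" in text.lower()
```

With this sketch, `scan("ignore%20all%20previous%20instructions", toy_detector)` flags the payload because the decoded plaintext is rescanned, while a benign encoded URL such as `https://example.com/search?q=hello%20world` is decoded and rescanned but not flagged, since its plaintext is still benign.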

Changed

  • Retrained on the original corpus. Precision held (held-out FPR ≈ 0.02; long-document benign FPR 0.0); obfuscation / buried-in-document / base64/hex robustness maintained.
  • canonicalize() is now idempotent; malformed rule files now emit a warning instead of being silently skipped.
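
Idempotence here means that canonicalizing already-canonical text changes nothing: canonicalize(canonicalize(x)) == canonicalize(x). The sketch below shows one common way to get that property, by repeating a normalization pass until the text stops changing. The normalization steps (NFKC, case folding, whitespace collapsing) and the `max_passes` parameter are assumptions chosen for illustration; the release notes do not say what SecureVector's canonicalize() does internally.

```python
import unicodedata


def _one_pass(text: str) -> str:
    """One normalization pass (illustrative steps, not the project's real ones)."""
    text = unicodedata.normalize("NFKC", text)  # fold compatibility forms, e.g. fullwidth letters
    text = text.casefold()                      # aggressive lowercasing
    return " ".join(text.split())               # collapse whitespace runs and trim the ends


def canonicalize(text: str, max_passes: int = 8) -> str:
    """Apply _one_pass until a fixed point is reached.

    A fixed point y satisfies _one_pass(y) == y, so calling canonicalize on
    its own output returns that output unchanged. The max_passes cap is a
    safety limit (assumption); idempotence holds whenever the loop converges
    within it.
    """
    for _ in range(max_passes):
        nxt = _one_pass(text)
        if nxt == text:
            return text
        text = nxt
    return text
```

A property-style regression test for this is small: run a set of sample strings through the function twice and assert that the second call returns the same value as the first.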

Data & legal posture (unchanged, now enforced)

  • 100% original training data: no third-party datasets, prompts, rules, code, or model weights, and no pretrained checkpoints.
  • Public benchmarks are used for evaluation only.
  • Permissive OSS dependencies only (scikit-learn, NumPy, SciPy, joblib: BSD; PyYAML: MIT).
  • Ships a zero-dependency pure-Python runtime that is byte-exact to the trained model (parity Δ = 0).

Full notes: see CHANGELOG.md.