v0.8.0

github-actions released this 14 May 10:47 · 89 commits to main since this release · fea58d5

TL;DR

The v0.8.0 wave shipped what v0.7 set up: ten national-ID recognizers
(seven checksum-validated: Aadhaar / NIR / Steuer-ID / BSN / CPF / CNPJ /
NHS; three cue-anchored regex: SSN / NINO / PAN), five new locale packs,
a unified core rulepack with a safety_tier field that machine-encodes
activation, versioned recognizer lineage on every audit row, the new
gaze-proxy crate that puts a PII chokepoint in front of OpenAI /
Anthropic / Gemini API calls, and a 150-scenario coverage corpus that
replaces the previous stochastic templates. The gaze-proxy 0.8.0 crate
is published alongside the rest of the workspace.

Highlights

  • Multi-provider HTTP proxy with daemon mode. PR #212 introduces
    the gaze-proxy crate. An adapter/driver pattern lets a single proxy
    serve OpenAI's /v1/chat/completions, Anthropic's /v1/messages, and
    Gemini's /v1beta/models/*:{generateContent,streamGenerateContent}
    without translation. SSE streaming and tool-call argument
    reconstruction are wired through gaze::Pipeline, so a PII span that
    arrives chunk-split inside a tool_calls.function.arguments field is
    accumulated and redacted before leaving the proxy. Daemon-mode
    subcommands — gaze proxy {serve,start,stop,status,logs,restart}
    plus opt-in install-launchd / install-systemd-user — mean the
    adopter never has to remember the bind flags after the first
    gaze proxy start.

  • Seven checksum-backed locale validators. PR #207 added
    ValidatorKind variants for Aadhaar (Verhoeff), French NIR (MOD-97
    variant), German Steuer-ID (MOD 11,10), Dutch BSN (MOD-11), Brazilian
    CPF + CNPJ (MOD-11), and UK NHS numbers (MOD-11), with five new
    locale packs at locale-fr, locale-nl, locale-br, locale-in,
    locale-uk. Every entity ships at safety_tier = "safe_default",
    so adopters in BR/FR/NL/IN/UK get coverage with a single --locale
    flag.

  • Three locale-gated regex recognizers. PR #208 added US SSN,
    UK NINO, and Indian PAN as safety_tier = "locale_gated" cue-anchored
    recognizers. They require an explicit locale to fire; no bare 9-digit
    or 10-character shapes activate without the cue context. PAN extends
    the locale-in pack from Tier 2 (PR #207) in place rather than adding
    a new pack.

  • Bundle unification + safety_tier field. PR #201 collapsed the
    core (6 recognizers) and core-extended (10 recognizers) bundles
    into a single core bundle. Each recognizer now declares
    safety_tier ∈ {SafeDefault, LocaleGated, OptIn}, replacing the
    v0.4.5 PR #58 surprise-activation behavior with a closed enum.
    --rulepack-bundled core-extended aliases to --rulepack-bundled core
    with a deprecation warning; it will be removed in v0.10.0.

  • Versioned recognizer lineage on every audit row. PR #203 added
    Candidate.recognizer_version_id plus RedactionEntry.recognizer_id and
    RedactionEntry.recognizer_version_id (all Option<String>, additive). The
    SQLite audit schema gains the two columns; pre-migration rows
    receive a legacy_unversioned tag. The NER recognizer's bare
    "ner" slug expanded to ner.<model>.<vN> so safety-net
    observers carry the model SHA + version inline.

  • Kiji DistilBERT SafetyNet backend. PR #202 introduced an
    observer-only Pass-3 backend that runs DistilBERT NER (26 PII classes,
    Apache-2.0) alongside the existing OpenAI Privacy Filter. The
    subprocess contract is identical: read clean text on stdin, emit a
    JSON span array on stdout, never mutate the manifest. Activate via
    gaze clean --safety-net-backend kiji-distilbert. The pinned-artifact
    contract fails closed if any of SHA256SUMS, labels.json,
    model.onnx, or tokenizer.json is missing.

  • Replaced the coverage corpus. PR #205 deleted the 61 stochastic
    status-quo templates and the fixture_variants mechanism, replacing
    them with 150 deliberate scenarios driven by the fake crate as an
    xtask-only dev dependency. Each scenario declares its expected
    emissions with recognizer_version_id from day one. The seed is
    pinned in a documented COVERAGE_CORPUS_SEED constant; baseline.json
    was fully re-snapped.

  • Two new repo-root docs. PR #206 added UPGRADE.md (a per-minor
    migration guide complementing CHANGELOG.md, covering v0.4 → v0.8).
    PR #211 added ARCHITECTURE.md — a 14.8 KiB overview of how the
    ten workspace crates fit together, with eight numbered Key Design
    Decisions and a one-diagram pipeline view of the redact/restore path.

  • Two new research docs. PR #210 is a coverage map of all 26 Kiji
    PII classes against gaze's recognizers (6 beat-via-Tier-2, 1
    beat-via-Tier-3, 16 observer-only-via-Tier-2.5, 3 parity, 0 deferred).
    PR #209 is a two-mode benchmark methodology (direct-detector +
    observer-residual, headlining strict span leak rate) with a
    rule-floor snapshot pinned to corpus + Gaze tag.
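
As a rough illustration of what "checksum-backed" means for two of the
validators listed above, here is a minimal Rust sketch of the publicly
documented Dutch BSN 11-test and the Brazilian CPF check-digit rule. The
function names (digits, bsn_is_valid, cpf_is_valid) are hypothetical and
are not gaze's ValidatorKind API. The all-zero and repeated-digit
rejections are conventions assumed here, and input is expected as bare
digits with punctuation already stripped.

```rust
/// Parses a string of exactly `N` ASCII digits; anything else is rejected.
fn digits<const N: usize>(s: &str) -> Option<[u32; N]> {
    let mut out = [0u32; N];
    let mut n = 0;
    for ch in s.chars() {
        let d = ch.to_digit(10)?; // non-digit (e.g. '.' or '-') => None
        if n == N {
            return None; // too long
        }
        out[n] = d;
        n += 1;
    }
    (n == N).then_some(out)
}

/// Dutch BSN "11-test": weights 9..2 on the first eight digits, -1 on the
/// last, and the weighted sum must be divisible by 11.
/// Assumption: the all-zero number is rejected even though it passes.
fn bsn_is_valid(s: &str) -> bool {
    let Some(d) = digits::<9>(s) else { return false };
    const W: [i32; 9] = [9, 8, 7, 6, 5, 4, 3, 2, -1];
    let sum: i32 = d.iter().zip(W).map(|(&x, w)| x as i32 * w).sum();
    sum % 11 == 0 && d.iter().any(|&x| x != 0)
}

/// Brazilian CPF: two MOD-11 check digits computed over the preceding
/// digits with descending weights ending at 2.
/// Assumption: repeated-digit numbers (e.g. 111...1) are rejected by
/// convention, although they satisfy the arithmetic.
fn cpf_is_valid(s: &str) -> bool {
    let Some(d) = digits::<11>(s) else { return false };
    if d.iter().all(|&x| x == d[0]) {
        return false;
    }
    // Check digit over the first `len` digits, weights (len + 1) down to 2.
    let check = |len: usize| {
        let sum: u32 = d[..len]
            .iter()
            .enumerate()
            .map(|(i, &x)| x * (len as u32 + 1 - i as u32))
            .sum();
        (sum * 10) % 11 % 10 // a remainder of 10 maps to check digit 0
    };
    check(9) == d[9] && check(10) == d[10]
}

fn main() {
    assert!(bsn_is_valid("111222333"));
    assert!(!bsn_is_valid("123456789"));
    assert!(cpf_is_valid("52998224725"));
    assert!(!cpf_is_valid("52998224726"));
}
```

Both rules are described as MOD-11 in the list above, yet they differ in
weights and in how the check digit is derived, which is presumably why
each identifier gets its own validator variant.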

Known limitations

  • The Kiji benchmark in PR #209 ships rule-floor numbers only. The
    Kiji direct-detector + observer-residual cells are marked not_run
    because no pinned model SHA / frozen snapshot exists in the repo
    yet; the supply-chain audit + pin is tracked as a v0.8.x follow-up.
  • gaze-proxy ships OpenAI / Anthropic / Gemini adapters at v0.8.0;
    certificate management, PAC mode, Electron integration, and
    transparent MITM stay out of scope (they belong in gaze-lens, not
    the core proxy).
  • LocaleAwareModel trait + per-locale NER routing flips were
    deferred from v0.8 Tier 2.5 to v0.8.1. The current per-locale
    routing logic (4-tier locale chain) is unchanged.

Adopter notes

  • The only action-required item is bundle unification:
    --rulepack-bundled core-extended users see a deprecation warning;
    adopters who relied on PR #58's no-policy surprise activation of
    phone.national.{de,us} + postal.{de,us} must now pass an
    explicit --locale flag. UPGRADE.md walks through this in detail.
  • All Tier 2 + Tier 3 entities are additive. Adopters in BR / FR / NL /
    IN / UK / US-only / UK-only see no behavior change unless they
    enable their locale.
  • gaze-proxy is opt-in: build gaze-cli with --features proxy, then
    run gaze proxy start once the initial configuration is done.
  • The published gaze-proxy crate joins the workspace at v0.8.0; the
    total published-crate count rises from 9 to 10.
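
To make the bundle-unification note concrete, the following Rust sketch
models how a closed safety_tier enum could gate activation. Only the
three variant names come from these notes. The Recognizer and Policy
structs, the is_active function, and the exact activation rules are
assumptions made for illustration and do not describe gaze's actual
policy engine.

```rust
/// The three tiers named in the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SafetyTier {
    SafeDefault,
    LocaleGated,
    OptIn,
}

/// Hypothetical recognizer descriptor.
struct Recognizer {
    id: &'static str,
    tier: SafetyTier,
    /// Locale the recognizer belongs to, if any.
    locale: Option<&'static str>,
}

/// Hypothetical adopter policy: locales passed explicitly (for example
/// via --locale) and recognizer ids enabled by hand.
struct Policy<'a> {
    locales: &'a [&'a str],
    opted_in: &'a [&'a str],
}

/// Assumed rules: SafeDefault fires whenever its pack is loaded,
/// LocaleGated needs its locale listed explicitly, OptIn needs its id
/// listed explicitly. Pack loading itself is out of scope here.
fn is_active(r: &Recognizer, p: &Policy<'_>) -> bool {
    match r.tier {
        SafetyTier::SafeDefault => true,
        SafetyTier::LocaleGated => r.locale.map_or(false, |l| p.locales.contains(&l)),
        SafetyTier::OptIn => p.opted_in.contains(&r.id),
    }
}

fn main() {
    let phone_de = Recognizer {
        id: "phone.national.de",
        tier: SafetyTier::LocaleGated,
        locale: Some("de"),
    };
    let no_locale = Policy { locales: &[], opted_in: &[] };
    let with_de = Policy { locales: &["de"], opted_in: &[] };

    // Without an explicit locale the gated recognizer stays silent.
    assert!(!is_active(&phone_de, &no_locale));
    // Passing the locale turns it on.
    assert!(is_active(&phone_de, &with_de));
}
```

Because the match is exhaustive over a closed enum, adding a fourth tier
would force every activation site to be revisited at compile time, which
is one plausible reading of "machine-encodes activation".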

Download

The ten workspace crates publish to crates.io as gaze-types,
gaze-recognizers, gaze-audit, gaze-pii, gaze-assembly,
gaze-mcp-core, gaze-mcp-rmcp, gaze-document, gaze-proxy, and
gaze-cli. Cargo dependency snippet:

[dependencies]
gaze-pii = "0.8"

For the proxy:

[dependencies]
gaze-pii = "0.8"
gaze-proxy = "0.8"

CLI install (with proxy feature):

cargo install gaze-cli --features proxy
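
For adopters evaluating the proxy, the chunk-split case mentioned in the
Highlights can be pictured with a small Rust sketch. It assumes a
hypothetical ArgAccumulator and a toy redact_ssn_like detector that
stands in for gaze::Pipeline; neither name is part of the gaze-proxy
API, and the real proxy's buffering strategy may differ.

```rust
/// Toy detector (assumption): masks every `ddd-dd-dddd` shape.
fn redact_ssn_like(text: &str) -> String {
    const SHAPE: &[u8] = b"ddd-dd-dddd";
    let bytes = text.as_bytes();
    let mut out = String::with_capacity(text.len());
    let mut i = 0;
    while i < bytes.len() {
        let rest = &bytes[i..];
        let hit = rest.len() >= SHAPE.len()
            && SHAPE.iter().zip(rest).all(|(&s, &c)| {
                if s == b'd' { c.is_ascii_digit() } else { c == s }
            });
        if hit {
            out.push_str("[REDACTED]");
            i += SHAPE.len(); // the shape is pure ASCII, so this stays on a char boundary
        } else {
            let ch = text[i..].chars().next().unwrap();
            out.push(ch);
            i += ch.len_utf8();
        }
    }
    out
}

/// Hypothetical accumulator for streamed tool-call argument fragments.
#[derive(Default)]
struct ArgAccumulator {
    buf: String,
}

impl ArgAccumulator {
    /// Buffers a fragment without emitting anything downstream.
    fn push(&mut self, fragment: &str) {
        self.buf.push_str(fragment);
    }

    /// Redacts the fully reassembled arguments before they are released.
    fn finish(self) -> String {
        redact_ssn_like(&self.buf)
    }
}

fn main() {
    // One identifier split across two streamed fragments.
    let chunks = [r#"{"note":"ssn 123-4"#, r#"5-6789 on file"}"#];

    // Redacting each fragment on its own misses the split span.
    let naive: String = chunks.iter().map(|c| redact_ssn_like(c)).collect();
    assert!(naive.contains("123-45-6789"));

    // Accumulating first lets the detector see the whole span.
    let mut acc = ArgAccumulator::default();
    for c in chunks {
        acc.push(c);
    }
    assert_eq!(acc.finish(), r#"{"note":"ssn [REDACTED] on file"}"#);
}
```

The trade-off in this sketch is latency: nothing from the argument field
is forwarded until the last fragment arrives.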

Full CHANGELOG

See CHANGELOG.md § [0.8.0] for the machine-readable changelog.