v0.8.0-rc.1
Pre-releaseTL;DR
This is the first v0.8 pre-release. It lights up three of the v0.8 tiers — versioned recognizer lineage (Tier 1), bundle unification with explicit safety tiers (Tier 1.5), and a second NER-style observer at the Pass-3 chokepoint (Tier 2.5). Tier 2 (checksum-backed Aadhaar / NIR / Steuer-ID / BSN / CPF / CNPJ / NHS) and Tier 3 (regex SSN / NINO / PAN) follow in rc.2 and rc.3. The contract surface for Pipeline, Session, Policy, and the rulepack schema is unchanged from the v0.7 line. Existing v0.7.x adopters can upgrade with no policy edits.
Highlights
Versioned recognizer lineage (Tier 1, #203)
Every Candidate and RedactionEntry now carries an optional recognizer_id plus recognizer_version_id. PR #203 (commit 3c95304) threaded the lineage through the audit boundary in pipeline.rs so the version no longer drops at source. The SQLite audit schema gained nullable recognizer_id and recognizer_version_id columns; pre-migration rows were retrofitted to legacy_unversioned so historical queries still join cleanly. NER recognizer emissions stamped themselves with ner.<model>.<vN> from the artifact config metadata (ner.unknown.v0 was the closed fallback when the artifact carried no version). docs/architecture/locale-chain.md picked up a coverage matrix listing every bundled recognizer × supported locale × ValidatorKind. North-star axes 4 (trust/auditable) + 5 (adopter ergonomics).
Bundle unification + SafetyTier enum (Tier 1.5, #201)
PR #201 (commit 8ab9daf) folded the core and core-extended bundled rulepacks into a single core bundle. Each recognizer now declares a safety_tier field — SafeDefault, LocaleGated, or OptIn — backed by a #[non_exhaustive] Rust enum. No-policy activation gates on SafeDefault only, replacing the prior dual-bundle activation model with a single closed enum. The --rulepack-bundled core-extended CLI surface is deprecated (removal scheduled for v0.10.0): it now aliases to --rulepack-bundled core and auto-activates the LocaleGated tier with a tracing::warn! deprecation notice, preserving the v0.4.5 adopter behavior. North-star axes 1 (reliability — the dual-bundle footgun is gone), 4 (trust — tier metadata is part of the rulepack source of truth), and 5 (one bundle, one mental model).
KijiDistilbertSafetyNet backend (Tier 2.5, #202)
PR #202 (commit 0cd9ccc) added a second observer at the Pass-3 SafetyNet chokepoint. The new --safety-net-backend kiji-distilbert flag selects a DistilBERT NER subprocess that runs alongside (or instead of) the existing OpenAI Privacy Filter; the default remains openai-filter for source compatibility. The pinned-artifact contract is identical to the existing OpenAI filter — model dir must carry SHA256SUMS, labels.json, model.onnx, and tokenizer.json with 0o700 directory + 0o600 file permissions on Unix; missing artifacts (including a missing SHA256SUMS) fail closed with a typed CliError::SafetyNetArtifactMissing { backend, path } exit 2 before the subprocess spawns. scripts/fetch-kiji-safetynet-model.sh mirrors the existing NER fetcher. North-star axis 1 (defense in depth — second NER opinion at the chokepoint, observer-only, never mutates the manifest).
Known limitations
- This is a pre-release. Cargo's caret operator excludes pre-releases, so adopters depending on the rc need an exact pin:
gaze-pii = "0.8.0-rc.1"(not"0.8"). The same applies togaze-assembly,gaze-recognizers, and the rest of the workspace. - Tier 2 entities (Aadhaar / NIR / Steuer-ID / BSN / CPF / CNPJ / NHS) are not yet shipped. They land in rc.2 with checksum-backed validators.
- Tier 3 regex locales (SSN / NINO / PAN) land in rc.3.
- The Kiji-DistilBERT artifact bundle ships with a placeholder Hugging Face commit SHA in
scripts/fetch-kiji-safetynet-model.shpending first sign-off; pin the upstream model hash before relying on it in production. --rulepack-bundled core-extendedstill works in v0.8 but emits a deprecation warning; plan the cutover before v0.10.0.
Adopter notes
- No policy edits required. Existing
policy.tomlfiles load unchanged. The newsafety_tierfield is internal to the bundled rulepack source. RedactionEntryJSON shape is additive. New fields use#[serde(skip_serializing_if = "Option::is_none")]and emit nothing when None, so existing JSON consumers see no shape change.- SQLite audit DBs migrate forward in place. Pre-migration redaction rows pick up
recognizer_id = "legacy_unversioned"andrecognizer_version_id = NULL. The migration is idempotent. - Adopters who depended on the v0.4.5 PR #58 no-policy surprise activation for
phone.national.*orpostal.*recognizers should pass--locale=de-DEor--locale=en-USexplicitly, or supply a policy with the recognizer enabled. - Adopters wiring the Kiji safety net must run
scripts/fetch-kiji-safetynet-model.shto populate the artifact directory; the SHA256SUMS check is hard-fail by design.
Download
- CLI binaries:
cargo install --version 0.8.0-rc.1 gaze-cli - Library:
[dependencies] gaze-pii = "0.8.0-rc.1" gaze-assembly = "0.8.0-rc.1"
- Source:
git checkout v0.8.0-rc.1
CHANGELOG
Full entry: CHANGELOG.md [0.8.0-rc.1].