
feat: add /v1/privacy/classify endpoint #584

Merged: Evrard-Nil merged 4 commits into main from feat/privacy-classify-endpoint on May 12, 2026

Conversation

@Evrard-Nil
Collaborator

Summary

  • Adds POST /v1/privacy/classify as a raw passthrough to a backend token-classification model (e.g. openai/privacy-filter)
  • Follows the existing embeddings_raw pattern: only the model field is deserialized for routing; the request body and response bytes are forwarded unchanged
  • Wires the new privacy_classify_raw trait method through InferenceProvider (vLLM impl + external/mock stubs), the inference provider pool, and CompletionServiceTrait with concurrent-slot guarding
  • Adds InferenceType::PrivacyClassify for usage tracking, billed as input_tokens × input_cost_per_token (summed from data[*].usage.input_tokens in the provider response) — same semantics as embedding/rerank
  • Registers the route on the text-inference router (in front of usage/auth/rate-limit middleware) and adds the path to the OpenAPI spec

Unblocks nearai/infra#86. The /v1/redact endpoint (nearai/infra#87) is a separate concern and will land in a follow-up PR.

The privacy-filter model itself is already deployed at port 8007 on gpu04 and registered at privacy-filter.completions.near.ai (see cvm-conf/small-models.yaml). The model still needs to be added to the cloud-api models DB via POST /v1/admin/models before this endpoint can be exercised end-to-end against the live backend.

Test plan

  • cargo check --workspace --all-targets
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo fmt --all -- --check
  • cargo test --lib --bins -p inference_providers -p services -p api (224 + 60 unit tests pass; test_openapi_spec_generation covers the new path)
  • After merge: admin-upsert openai/privacy-filter into cloud-api DB with input_modalities=["text"], output_modalities=["text"], provider_type=vllm, inference_url=https://privacy-filter.completions.near.ai, then curl -X POST $CLOUD_API/v1/privacy/classify -H 'Authorization: Bearer sk-...' -d '{"model":"openai/privacy-filter","input":"my SSN is 123-45-6789","threshold":0.5}'
  • E2E test against staging

Adds a raw passthrough endpoint that forwards requests to a backend
token-classification model (e.g. openai/privacy-filter) exposing
POST /v1/privacy/classify. Follows the embeddings_raw pattern: the
cloud API only deserializes the model field for routing, then forwards
the request body and proxies the response bytes back unchanged.

Billing follows rerank/embedding semantics: input_tokens summed from
data[*].usage.input_tokens, priced via input_cost_per_token. Adds a
new InferenceType::PrivacyClassify variant for usage tracking.

Unblocks nearai/infra#86. The /v1/redact endpoint (nearai/infra#87)
will be added separately.
Copilot AI review requested due to automatic review settings May 11, 2026 16:20
@Evrard-Nil Evrard-Nil temporarily deployed to Cloud API test env May 11, 2026 16:20 — with GitHub Actions Inactive
@claude

claude Bot commented May 11, 2026

Review — PR #584: /v1/privacy/classify passthrough

This PR cleanly follows the existing embeddings_raw pattern across all 5 layers (route → service → pool → provider trait → vLLM impl) and reuses the same anomaly/usage-tracking shape. I didn't find any blocking issues.

✅ Verified safe

  • Rolling deployment: inference_type is VARCHAR(50), no CHECK constraint — "privacy_classify" fits. InferenceType::from_str falls back to ChatCompletion for unknown variants (usage_repository_impl.rs:54), so usage logs written by new nodes are readable by old nodes.
  • No DB migration required — column is permissive.
  • Token billing math — summing data[*].usage.input_tokens matches the per-item response shape documented in the PR body, and usage/mod.rs (Embedding | Rerank | PrivacyClassify arm) bills input_tokens × input_cost_per_token with no cache pricing, consistent with Embedding.
  • Middleware ordering — route is registered in the text_inference_routes group with auth/rate-limit/usage middleware applied, same as embeddings.
  • Concurrent slot guard: try_acquire_concurrent_slot + ConcurrentSlotGuard enforces per-org/model concurrency limits.
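
The rolling-deployment point above depends on how the inference type round-trips through its string form. A minimal sketch, assuming a simplified enum: only the "privacy_classify" string and the ChatCompletion fallback come from the review; the other strings and the function names `parse_lenient` and `parse_on_old_node` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InferenceType {
    ChatCompletion,
    Embedding,
    Rerank,
    PrivacyClassify,
}

impl InferenceType {
    /// String stored in the inference_type column (VARCHAR(50)).
    fn as_str(self) -> &'static str {
        match self {
            InferenceType::ChatCompletion => "chat_completion", // assumed spelling
            InferenceType::Embedding => "embedding",            // assumed spelling
            InferenceType::Rerank => "rerank",                  // assumed spelling
            InferenceType::PrivacyClassify => "privacy_classify",
        }
    }

    /// Lenient parse on a node that knows the new variant.
    fn parse_lenient(s: &str) -> Self {
        match s {
            "embedding" => InferenceType::Embedding,
            "rerank" => InferenceType::Rerank,
            "privacy_classify" => InferenceType::PrivacyClassify,
            _ => InferenceType::ChatCompletion,
        }
    }
}

/// Models an older node that predates PrivacyClassify: the unknown
/// string falls back to ChatCompletion instead of failing the read.
fn parse_on_old_node(s: &str) -> InferenceType {
    match s {
        "embedding" => InferenceType::Embedding,
        "rerank" => InferenceType::Rerank,
        _ => InferenceType::ChatCompletion,
    }
}
```

An old node therefore reads a "privacy_classify" row as ChatCompletion rather than erroring, which is what makes mixed-version deployments safe for reads.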

Minor observations (non-blocking, pre-existing patterns)

These are all carried over from embeddings and don't need fixing in this PR, but worth noting for a future cleanup:

  1. Upstream 401/403 → 500 mapping is dead code in practice: inference_provider_pool::privacy_classify always wraps the last error into PrivacyClassifyError::RequestFailed(...) (inference_provider_pool/mod.rs:1834), so the 401|403 → "model unavailable" arm in try_privacy_classify (completions/mod.rs:1608) can never fire. The HttpError variant is lost at the pool boundary. Same bug as try_embeddings.

  2. Silent billing skip on response-parse failure — if the provider returns an unexpected JSON shape (e.g. data not an array), serde_json::from_slice::<UsageExtract>(...).ok().map(...).unwrap_or(0) silently yields 0 tokens, hitting the token_count == 0 warn path but still recording $0 usage. Consistent with embeddings; just worth keeping in mind when the privacy-filter model evolves its response shape.

  3. Provider-fallback retries on 4xx: pool::privacy_classify retries every error against the next provider, including client-caused 400s. Pre-existing pattern across all *_raw pool methods. Wastes a slot but isn't dangerous.

  4. extra.clone() per provider attempt: HashMap<String, Value> is cloned each iteration of the fallback loop. Negligible (encryption headers are small), same as embeddings.
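
Observation 1 can be illustrated with a small sketch of how an error variant gets lost at a layer boundary. The enum shape, the field names, and the functions `pool_wrap` and `service_status` are assumptions for illustration; only the variant names and the 401/403 → 500 and RequestFailed → 502 mappings are taken from the reviews in this thread.

```rust
#[derive(Debug, Clone)]
enum PrivacyClassifyError {
    HttpError { status: u16, message: String },
    RequestFailed(String),
}

/// Pool boundary as described above: whatever the provider returned,
/// the last error is flattened into RequestFailed.
fn pool_wrap(last: PrivacyClassifyError) -> PrivacyClassifyError {
    PrivacyClassifyError::RequestFailed(format!("{last:?}"))
}

/// Service-layer mapping to the status the caller sees.
fn service_status(err: &PrivacyClassifyError) -> u16 {
    match err {
        // Upstream auth failures are reported as "model unavailable".
        PrivacyClassifyError::HttpError { status: 401 | 403, .. } => 500,
        // Pass-through of other statuses is an assumption of this sketch.
        PrivacyClassifyError::HttpError { status, .. } => *status,
        PrivacyClassifyError::RequestFailed(_) => 502,
    }
}
```

Because every error passes through `pool_wrap` first, the service only ever sees RequestFailed, and the 401|403 arm is unreachable in practice.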

Nit

  • The _extra parameter in ExternalBackend::privacy_classify_raw default impl (external/backend.rs:389) is unused — fine since the default just errors out, but worth confirming external backends that do implement this in future will receive extra correctly (the wrapper in external/mod.rs:421 does forward it ✓).

Summary

No critical issues. The PR is a faithful extension of an existing, well-tested pattern. The mock returns a structurally-valid response so unit tests against the new path will exercise the token-summing code.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds a new /v1/privacy/classify endpoint for PII span detection, integrating it into the API, service layers, and provider pool. It includes updates to OpenAPI documentation, usage tracking for billing, and implementations for vLLM and mock providers. Feedback suggests addressing a potential integer overflow during token summation and consolidating retry logic to ensure consistency with other inference endpoints.

Comment thread crates/api/src/routes/completions.rs Outdated
Comment thread crates/services/src/inference_provider_pool/mod.rs
Contributor

Copilot AI left a comment

Pull request overview

Adds a new privacy-classification passthrough endpoint to the API (POST /v1/privacy/classify) and threads it through the inference provider abstraction, provider pool, concurrency limiting, OpenAPI registration, and usage billing (billed by input tokens, similar to embeddings/rerank).

Changes:

  • Introduces InferenceType::PrivacyClassify and bills it using input tokens × input_cost_per_token.
  • Adds a new raw passthrough method (privacy_classify_raw) across InferenceProvider implementations (vLLM/external/mock) and exposes it via InferenceProviderPool + CompletionServiceTrait with concurrent-slot guarding.
  • Adds the Axum route + OpenAPI schema/tag for /v1/privacy/classify, including token extraction from data[*].usage.input_tokens for usage recording.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| crates/services/src/usage/ports.rs | Adds PrivacyClassify inference type string mappings. |
| crates/services/src/usage/mod.rs | Bills PrivacyClassify like embeddings/rerank (input-token based). |
| crates/services/src/inference_provider_pool/mod.rs | Adds pool-level privacy_classify passthrough with provider fallback. |
| crates/services/src/completions/ports.rs | Extends completion service trait with try_privacy_classify. |
| crates/services/src/completions/mod.rs | Implements try_privacy_classify with concurrency slot guarding + error mapping. |
| crates/inference_providers/src/vllm/mod.rs | Implements vLLM privacy_classify_raw passthrough call. |
| crates/inference_providers/src/models.rs | Introduces PrivacyClassifyError. |
| crates/inference_providers/src/mock.rs | Adds mock implementation returning a minimal privacy-filter-like JSON shape. |
| crates/inference_providers/src/lib.rs | Exposes PrivacyClassifyError and adds privacy_classify_raw to the InferenceProvider trait. |
| crates/inference_providers/src/external/mod.rs | Wires passthrough method through ExternalProvider. |
| crates/inference_providers/src/external/backend.rs | Adds default “not supported” implementation for external backends. |
| crates/api/src/routes/completions.rs | Adds /v1/privacy/classify handler with routing-by-model + usage token extraction/recording. |
| crates/api/src/openapi.rs | Registers the new OpenAPI path and adds the “Privacy” tag. |
| crates/api/src/lib.rs | Registers the new Axum route under completion routes. |

Comment thread crates/api/src/routes/completions.rs Outdated
Comment thread crates/services/src/usage/mod.rs Outdated
Comment thread crates/api/src/routes/completions.rs Outdated
Address bot review feedback on #584:

- token sum: fold into i64 with saturating_add and clamp to i32,
  filtering negative values. Avoids release-build wrap when a malformed
  or malicious provider response has many data entries or oversized
  per-entry input_tokens (caught only after the MAX_REASONABLE_TOKENS
  cap, which checks the already-wrapped value).
- usage/mod.rs: extend the "rerank/embedding" comment on the input-token
  billing arm to also mention privacy_classify.
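
The token-sum fix described in this commit can be sketched as follows. This is a minimal illustration, assuming the per-entry input_tokens values have already been extracted from data[*].usage into a slice; the function name `sum_input_tokens` is hypothetical.

```rust
/// Sums per-entry token counts without wrapping: negative values are
/// dropped, the running total saturates at i64::MAX, and the result is
/// clamped into the i32 range used for usage recording.
fn sum_input_tokens(per_entry: &[i64]) -> i32 {
    let total = per_entry
        .iter()
        .copied()
        .filter(|tokens| *tokens >= 0)
        .fold(0i64, |acc, tokens| acc.saturating_add(tokens));
    total.clamp(0, i64::from(i32::MAX)) as i32
}
```

Because the sum never wraps, any later sanity cap such as MAX_REASONABLE_TOKENS sees a value that is at worst too large, never a small or negative number produced by overflow.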
@Evrard-Nil Evrard-Nil temporarily deployed to Cloud API test env May 11, 2026 17:47 — with GitHub Actions Inactive
Six tests against MockProvider exercise the full route path:
  - basic single-input → 200 with valid response shape
  - array input → 200
  - unknown model → 404
  - missing API key → 401
  - missing model field → 400
  - costs deducted (10 mock tokens × 1_000_000 = 10_000_000 units)

Also adds openai/privacy-filter to init_inference_providers_with_mocks
so the MockProvider answers privacy_classify_raw for it.
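
The cost assertion in the last test is plain input-token billing. A minimal sketch of that arithmetic, assuming the price is an integer number of units per token (the function name and the saturating behaviour are assumptions of this sketch, not the repository's code):

```rust
/// input_tokens × input_cost_per_token, with no output-token or cache
/// pricing. Negative token counts are treated as zero.
fn privacy_classify_cost(input_tokens: i32, input_cost_per_token: i64) -> i64 {
    i64::from(input_tokens.max(0)).saturating_mul(input_cost_per_token)
}
```

With the mock's fixed 10 tokens and a price of 1_000_000 units per token, this gives the 10_000_000 units the e2e test expects.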
@Evrard-Nil Evrard-Nil temporarily deployed to Cloud API test env May 11, 2026 19:09 — with GitHub Actions Inactive
lloydmak99

This comment was marked as duplicate.

@lloydmak99 lloydmak99 dismissed their stale review May 11, 2026 19:17

Re-posting as a comment-only review; not blocking on changes.

Contributor

@lloydmak99 lloydmak99 left a comment

Nice PR — shape mirrors embeddings_raw closely and the token-accounting math (saturating i64 sum, clamp to i32, negative-value filter, MAX_REASONABLE_TOKENS cap + Datadog anomaly metric) is genuinely the cleanest version of this pattern in the repo. A few items below before this lands.

Probably blocking

  • Body size limit inheritance: /privacy/classify is registered before the DefaultBodyLimit::max(AUDIO_TRANSCRIPTION_MAX_BODY_SIZE) layer in crates/api/src/lib.rs:982, so it inherits the audio-transcription cap (tens of MB). The privacy-filter model has a 512-token context per the test setup — that limit is wildly oversized for a text-classification endpoint. Suggest a per-route DefaultBodyLimit::max(...) with something more proportionate (a few hundred KB) so abuse / accidental flooding is less attractive.
  • Verify the upstream endpoint path actually exists: crates/inference_providers/src/vllm/mod.rs:1463 calls {base_url}/v1/privacy/classify. Standard vLLM/SGLang don't expose this path — this assumes the privacy-filter model server (port 8007 on gpu04) has a custom route or a custom server framework. Worth confirming against the small-models compose / model image before merging; if it's actually /v1/classify or /predict upstream, this 404s silently against the live backend and the E2E box on the test plan won't catch it because the mock answers anything.

Worth addressing

  • _body_hash extracted then discarded (crates/api/src/routes/completions.rs:109). Other passthrough routes consume this for body-hash propagation. If intentional, a one-line // privacy classify does not need body-hash propagation comment would prevent confusion; if not, this is silently dropping data downstream consumers may expect.
  • Pool's "no providers" path returns RequestFailed → 502 (crates/services/src/inference_provider_pool/mod.rs:1801). The route handler pre-resolves the model and returns 404, but if a model disappears between resolution and pool lookup the user sees a 502 instead of 404. Minor — consider a distinct error variant or status mapping for consistency with the pre-check.
  • 401/403 from upstream remapped to 500 "The model is currently unavailable" (crates/services/src/completions/mod.rs:1610). Matches existing patterns, but for a brand-new endpoint where upstream auth is still firming up, keep an eye on the tracing::warn! in the pool when debugging — the 500 the user sees won't tell you it's actually an auth problem.
  • Test coverage gaps: no test exercises the MAX_REASONABLE_TOKENS cap firing, the zero-tokens warning, or the 429/503 mappings out of try_privacy_classify. Given the token math is the most novel part of the diff, an explicit test that the cap clamps and emits METRIC_PROVIDER_TOKEN_ANOMALIES would be valuable.

Minor / nits

  • Model name openai/privacy-filter is misleading — not from OpenAI. Purely a naming choice for the admin upsert, but worth raising with whoever's deciding the public model ID before it ships.
  • OpenAPI doc types declare input: serde_json::Value and data: serde_json::Value — generates a generic object schema. Acceptable for a passthrough but unhelpful for consumers. Consider oneOf for input: string | string[] and a typed data entry.
  • Mock response hardcodes model: "mock-privacy-filter" (crates/inference_providers/src/mock.rs:1101) regardless of the requested model. test_privacy_classify_basic only asserts body.get("model").is_some() so it doesn't matter today — flagging as a future foot-gun for any test that asserts model-id round-trip.
  • ./svc.sh-style redundancy doesn't apply here, but the redundant 502 message in try_privacy_classify's 5xx arm ("server_error" → "Privacy classify request failed. Please try again later.") could be unified with the catch-all branch.
  • Two tokio::time::sleep(Duration::from_millis(200)) calls in the tests — fine, matches the rest of the suite, but worth double-checking they're necessary for this endpoint (some of those sleeps were originally tied to the 15-min provider refresh which is now 5min, and may not be related to this code path at all).

Approve once the body-size override and upstream-path verification are sorted. The token-accounting block here is solid enough that it could probably be a shared helper for the other passthrough routes in a follow-up.

Addresses review feedback on #584:

- Per-route DefaultBodyLimit::max(256 KB) on /v1/privacy/classify so it
  doesn't inherit the 25 MB audio-transcription cap from the shared
  text_inference_routes layer. Privacy filter is text-in/text-out with
  a small (e.g. 512-token) context; 25 MB is wildly oversized.
- Mock now echoes the requested model id instead of hardcoding
  "mock-privacy-filter", so any future test asserting model round-trip
  won't hit a foot-gun.
- New e2e test test_privacy_classify_body_size_limit confirms the
  per-route cap kicks in (413 on 300 KB body).
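
In axum the cap itself is applied with a per-route DefaultBodyLimit layer, as the commit says. The framework-free sketch below only models the resulting check, so that the numbers in the new test are easy to follow. The constant name, the function name, and reading "256 KB" as 256 × 1024 bytes are assumptions.

```rust
/// Assumed interpretation of the 256 KB per-route cap.
const PRIVACY_CLASSIFY_MAX_BODY: usize = 256 * 1024;

/// Returns Err(413) (Payload Too Large) when the body exceeds the cap.
fn check_body_size(body_len: usize) -> Result<(), u16> {
    if body_len > PRIVACY_CLASSIFY_MAX_BODY {
        Err(413)
    } else {
        Ok(())
    }
}
```

A 300 KB body is over the cap and is rejected, while a typical classification request of a few KB passes.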
@Evrard-Nil Evrard-Nil temporarily deployed to Cloud API test env May 11, 2026 19:30 — with GitHub Actions Inactive
@Evrard-Nil
Collaborator Author

Thanks @lloydmak99 — really thorough review. Pushed b446baa addressing the blocking items and a couple of the smaller ones. Walk-through:

Blocking items

Body size limit ✅ Fixed in b446baa. Added a per-route DefaultBodyLimit::max(256 KB) on /privacy/classify that overrides the 25 MB router-level audio cap. New test test_privacy_classify_body_size_limit confirms a 300 KB payload returns 413.

Upstream endpoint path verification ✅ Already verified before posting the test commit, but I should have flagged the evidence in the PR description. Two confirmations:

  1. Live curl against privacy-filter.completions.near.ai:
    curl -X POST .../v1/privacy/classify -H 'Authorization: Bearer m5N...' \
      -d '{"input":"My SSN is 123-45-6789 and email is alice@example.com","threshold":0.5}'
    {"data":[{"index":0,"spans":[{"category":"account_number","end":20,"score":0.99,...},{"category":"private_email",...}],"usage":{"input_tokens":17}}],"id":"pt-...","model":"openai/privacy-filter"}

  2. Server source in cvm-compose-files/small-models.yaml:171 defines @app.post("/v1/privacy/classify") on the FastAPI app. The container exposes exactly two routes: GET /v1/models and POST /v1/privacy/classify. No vLLM behind it: it's a small HF transformers + FastAPI server (the model is a token-classifier, not a generative LLM).

Worth-addressing items

Mock hardcoded model id ✅ Fixed in b446baa — mock now echoes body.model so round-trip assertions work.

_body_hash discarded — Intentional, matches the existing pattern in rerank (line 1814) and embeddings (2114). Body-hash propagation is used by the chat/image/audio handlers for TEE response signing, which isn't applicable to the stateless classification passthrough. Happy to add a clarifying comment to all three handlers in a follow-up if that's preferred, but didn't want to introduce a one-off inconsistency just on this one.

Pool 404→502 race — Pre-existing across all *_raw pool methods (embeddings, rerank, score, and now privacy_classify). The handler does pre-resolve via models_service.get_model_by_name, so the race window is small. Worth fixing as a single shared change across all passthrough pool methods, separately from this PR.

401/403→500 upstream remap — Same pre-existing pattern as try_embeddings (completions/mod.rs:1568). The claude-review bot flagged this as dead code (pool wraps HttpError as RequestFailed before service sees it), so the branch never fires in practice. Will be addressed once the pool-layer error mapping is fixed.

Test for MAX_REASONABLE_TOKENS cap — Skipped here because the current MockProvider returns a fixed response shape, so exercising the cap firing would need either a configurable mock or a wiremock-based integration test. Reasonable as a follow-up.

Nits

Model name openai/privacy-filter — That's the upstream Hugging Face model ID (huggingface.co/openai/privacy-filter). Changing the public ID would diverge from the HF namespace, so leaving as-is.

OpenAPI serde_json::Value types — Same as embeddings (EmbeddingsRequestDoc.input). The runtime is a true passthrough, so the schema is intentionally lossy — typing it precisely would make the doc lie about validation that the handler doesn't actually perform.

502 message redundancy / sleep(200ms) — Both pre-existing patterns inherited from the embeddings-style scaffolding. Out of scope here.

Test suite went from 6 → 7 tests, all pass serially. Let me know if the body-size override looks right and I'll squash if needed.

@Evrard-Nil Evrard-Nil merged commit 534b816 into main May 12, 2026
3 checks passed
@PierreLeGuen PierreLeGuen deleted the feat/privacy-classify-endpoint branch May 12, 2026 12:28
Evrard-Nil added a commit that referenced this pull request May 13, 2026
Critical / blocking fixes:

- redact_one is now fail-closed on malformed input. Previously a span
  whose offsets fell inside a multi-byte UTF-8 sequence or were
  out-of-range silently passed the original text through — leaking PII
  upstream when redaction was explicitly requested. Now returns
  AutoRedactError::Internal, propagated to a 500 in the handler.
  (gemini security-critical, Copilot)
- Streaming un-redact state is now keyed by choice index (HashMap<i64,
  StreamUnredact>) rather than a single shared instance, with separate
  maps for content / reasoning_content / reasoning. For n>1 completions
  the provider may interleave chunks across choice indices, which would
  have cross-contaminated the sliding 16-byte tail. (gemini high)
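
The fail-closed behaviour in the first fix can be sketched like this. It is an illustration only, assuming spans carry byte offsets into the UTF-8 text; `redact_span` and `RedactError` are hypothetical names, not the repository's redact_one or AutoRedactError.

```rust
#[derive(Debug, PartialEq)]
enum RedactError {
    InvalidSpan { start: usize, end: usize },
}

/// Replaces text[start..end] with `placeholder`. Any span that is
/// reversed, out of range, or cuts through a multi-byte character is an
/// error: the original text is never returned unredacted.
fn redact_span(
    text: &str,
    start: usize,
    end: usize,
    placeholder: &str,
) -> Result<String, RedactError> {
    let valid = start <= end
        && end <= text.len()
        && text.is_char_boundary(start)
        && text.is_char_boundary(end);
    if !valid {
        return Err(RedactError::InvalidSpan { start, end });
    }
    let mut out = String::with_capacity(text.len() + placeholder.len());
    out.push_str(&text[..start]);
    out.push_str(placeholder);
    out.push_str(&text[end..]);
    Ok(out)
}
```

The caller is expected to turn the error into a failed request rather than forwarding the untouched message.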

High-value fixes:

- End-of-stream flush: any bytes still held in a tail buffer when the
  upstream stream ends (e.g. mid-placeholder) are now emitted as a
  synthetic SSE chunk before [DONE], not silently dropped. (Copilot)
- The per-chunk `tracing::debug!("Completion stream event: ...")` line
  is suppressed when auto_redact_enabled. After un-redact the chunk
  holds the user's original PII; routing it to logs would defeat the
  privacy guarantee. (Copilot)

Other fixes:

- detect.rs: extend rather than overwrite when the detector returns
  multiple `data` entries for the same index. (gemini medium)
- Skip non-streaming response re-serialize when the redaction map is
  empty (request opted in but no PII detected). Preserves the
  raw_bytes/signing path for clean inputs. (Copilot)
- Doc comment in mod.rs no longer references a non-existent
  `docs/auto-redact.md`. (Copilot nit)
- MAX_PLACEHOLDER_LEN comment example matches what's actually minted
  (placeholder_prefix("account_number") -> "account", not
  "account_number"). (Copilot nit)
- Span.text field removed (it was only carried through from the
  detector response and never read; dead-code warning).
- MockProvider's privacy_classify_raw keeps usage.input_tokens=10 for
  backward compat with the privacy_classify e2e test from #584; only
  the spans field is computed from the input.
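
The detect.rs change (extend rather than overwrite) comes down to how spans are grouped by input index. A minimal sketch, assuming the detector response has already been parsed into (index, spans) pairs; the `Span` fields and the function name `group_spans` are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: usize,
    end: usize,
    category: String,
}

/// Groups spans by input index. When the detector returns several
/// entries for the same index, their spans are appended instead of the
/// later entry replacing the earlier one.
fn group_spans(entries: Vec<(usize, Vec<Span>)>) -> HashMap<usize, Vec<Span>> {
    let mut by_index: HashMap<usize, Vec<Span>> = HashMap::new();
    for (index, spans) in entries {
        by_index.entry(index).or_default().extend(spans);
    }
    by_index
}
```

With a plain `insert`, the second entry for an index would silently drop the spans from the first, leaving that PII unredacted.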

Tests:

- New unit tests in apply.rs for fail-closed-on-non-char-boundary and
  fail-closed-on-out-of-range-span.
- New e2e test
  auto_redact_skips_response_munging_when_no_pii_detected covers the
  empty-map short-circuit.
- All 260 unit + 8 auto_redact e2e + 7 privacy_classify e2e pass.
Evrard-Nil added a commit that referenced this pull request May 13, 2026
* feat: x-auto-redact for chat completions

Adds an opt-in header (`x-auto-redact: on`) and body field
(`auto_redact: true`) on /v1/chat/completions that:

  1. Detects PII in prompt messages by calling the privacy-filter model
     via the inference provider pool.
  2. Mints stable per-request placeholders (<email1>, <phone2>, …) and
     rewrites the messages so the provider only ever sees the redacted
     form. Provider scope is intentional: works for vLLM and external
     (Anthropic/OpenAI/Gemini) alike.
  3. Strips the auto_redact body field before forwarding so providers
     with strict JSON schemas don't 422.
  4. Un-redacts the response: walks message.content / reasoning fields
     for non-streaming, wraps SSE chunks in a sliding-window unredacter
     (16-byte tail, holds incomplete <…> tokens across chunks) for
     streaming.
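
Step 2 can be sketched as a small bidirectional map. This is a simplified illustration, assuming per-category ordinals and exact-match dedup; it omits the collision-safe minting mentioned further down, and `PlaceholderMap` with its methods is a hypothetical name rather than the module's actual API.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct PlaceholderMap {
    next_ordinal: HashMap<String, u32>,
    by_original: HashMap<(String, String), String>,
    by_placeholder: HashMap<String, String>,
}

impl PlaceholderMap {
    /// Returns the placeholder for (category, original), minting a new
    /// one with the next per-category ordinal on first sight.
    fn mint(&mut self, category: &str, original: &str) -> String {
        let key = (category.to_string(), original.to_string());
        if let Some(existing) = self.by_original.get(&key) {
            return existing.clone();
        }
        let n = {
            let counter = self.next_ordinal.entry(category.to_string()).or_insert(0);
            *counter += 1;
            *counter
        };
        let placeholder = format!("<{category}{n}>");
        self.by_original.insert(key, placeholder.clone());
        self.by_placeholder
            .insert(placeholder.clone(), original.to_string());
        placeholder
    }

    /// Reverse lookup used when un-redacting the response.
    fn original(&self, placeholder: &str) -> Option<&str> {
        self.by_placeholder.get(placeholder).map(String::as_str)
    }
}
```

Repeated PII maps to the same placeholder within a request, so the provider sees consistent references and the response can be restored unambiguously.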

Fails closed: if the PII detector is unavailable the request is
rejected with 503 `auto_redact_unavailable` rather than degrading
silently to send raw PII to the provider.

New module: services::auto_redact
  - placeholders.rs: bidirectional placeholder ↔ original map with
    monotonic per-category ordinals, dedup of repeated PII, and
    collision-safe minting when the user's own text contains a
    <categoryN>-shaped literal.
  - detect.rs: pool.privacy_classify invocation + response parse.
  - apply.rs: walks CompletionMessage content (string + content-parts
    arrays), redacts spans, writes back.
  - stream_unredact.rs: sliding 16-byte tail buffer + regex
    `<[a-z_]+\d+>` for streaming replacement that never splits a
    placeholder across an emitted chunk.
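
The idea behind stream_unredact.rs can be sketched as below. This is a simplified model, not the module's code: it holds back a trailing unclosed `<...` of fewer than 16 bytes and substitutes with plain string replacement instead of the regex, and the type, constant, and method names are assumptions.

```rust
use std::collections::HashMap;

/// Assumed bound matching the "16-byte tail" in the description.
const TAIL_LEN: usize = 16;

struct StreamUnredact {
    /// placeholder -> original text
    map: HashMap<String, String>,
    /// Bytes held back because they may be the start of a placeholder.
    tail: String,
}

impl StreamUnredact {
    fn new(map: HashMap<String, String>) -> Self {
        Self { map, tail: String::new() }
    }

    /// Feeds one streamed chunk and returns the text that is safe to emit.
    fn push(&mut self, chunk: &str) -> String {
        let mut buf = std::mem::take(&mut self.tail);
        buf.push_str(chunk);
        if let Some(pos) = buf.rfind('<') {
            let hold = {
                let candidate = &buf[pos..];
                !candidate.contains('>') && candidate.len() < TAIL_LEN
            };
            if hold {
                // Keep the unfinished "<..." for the next chunk.
                self.tail = buf.split_off(pos);
            }
        }
        self.replace_all(&buf)
    }

    /// Emits whatever is still held when the upstream stream ends.
    fn flush(&mut self) -> String {
        let rest = std::mem::take(&mut self.tail);
        self.replace_all(&rest)
    }

    fn replace_all(&self, text: &str) -> String {
        let mut out = text.to_string();
        for (placeholder, original) in &self.map {
            out = out.replace(placeholder.as_str(), original);
        }
        out
    }
}
```

A placeholder split across two chunks is emitted only once it is complete, and `flush` makes sure a dangling `<` at end of stream is still delivered.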

Scope decisions:
  - /v1/completions left out: handler currently returns 501. The
    maybe_redact helper is generic over ServiceCompletionRequest so the
    wire-in is a 5-line copy when that handler is enabled.
  - /v1/responses out of scope for v1 per design; stored-conversation
    semantics need their own decision (PR description).
  - No category opt-out filter (all-or-nothing).

Tests: 34 unit + 7 e2e covering header/body activation parity,
streaming chunk splits, fail-closed on missing detector, body-field
stripping, multi-PII (email/phone/SSN) round-trip, and PII passthrough
when off.

Builds on /v1/privacy/classify (#584). The MockProvider's
privacy_classify_raw now does shape-based PII detection (email, SSN,
phone) so e2e tests exercise the full redact→provider→unredact loop
without a live privacy-filter model.

* fix: address review feedback on auto-redact (#585)
4 participants