feat: enhance repair by asteier2026 · Pull Request #137 · NVIDIA-NeMo/Anonymizer

asteier2026 · 2026-04-27T15:27:37Z

Changes include:

Add evidence to privacy qa reanswer
Enhance privacy qa reanswer prompt
Enhance repair prompt

greptile-apps · 2026-04-27T15:30:08Z

Greptile Summary

This PR enhances the privacy repair pipeline by adding an evidence field to privacy QA re-answers and substantially rewriting both the reanswer and repair prompts to be more adversarially rigorous.

Evidence field: PrivacyAnswerItemSchema gains an evidence: list[str] field (default []) that the reanswer LLM is asked to populate with verbatim quotes; these quotes are then surfaced in the repair prompt's <privacy_issues> block to give the rewriter concrete anchors for what to fix.
Repair prompt refactor: Removes the <protection_decisions> and <replacement_map> sections (and all supporting code: _format_protection_block, _replacement_map_is_empty, COL_SENSITIVITY_DISPOSITION/COL_REPLACEMENT_MAP_FOR_PROMPT dependencies); replaces them with <adversarial_goal>, <inference_rules>, <critical_warnings>, and <success_criteria> sections focused on inference suppression.
Reanswer prompt expansion: Adds detailed rules for exhaustive evidence search, normalized/synonymous value matching, and a "best-supported guess" heuristic to reduce false negatives.
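To make the evidence flow concrete, here is a minimal sketch of how evidence quotes might be appended when leaked items are rendered for the repair prompt. The `LeakedItem` class, the `format_leaked_items` function, and the exact line layout are hypothetical stand-ins for illustration; they are not the PR's actual `_leaked_items_text` implementation.

```python
from dataclasses import dataclass, field


@dataclass
class LeakedItem:
    """Hypothetical stand-in for one privacy re-answer flagged as leaked."""

    question: str
    answer: str
    evidence: list[str] = field(default_factory=list)


def format_leaked_items(items: list[LeakedItem]) -> str:
    """Render leaked items as bullet lines, adding evidence quotes when present."""
    lines: list[str] = []
    for item in items:
        lines.append(f"- {item.question} -> {item.answer}")
        # Only emit the evidence line when the re-answer supplied quotes.
        if item.evidence:
            quoted = "; ".join(f'"{quote}"' for quote in item.evidence)
            lines.append(f"  Evidence: {quoted}")
    return "\n".join(lines)
```

Items without evidence render exactly as they would have before, so the extra line only appears when the reanswer step actually returned quotes.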

Confidence Score: 5/5

Safe to merge — the changes are prompt engineering and a schema field addition; no data loss or broken contracts introduced.

All production code paths are exercised by existing tests, the schema change is backward-compatible (default_factory=list), and the removed dead code was cleanly excised with matching test deletions. No logic errors were found in the active code paths.

No files require special attention beyond the minor schema and test gaps noted in the inline comments.
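The backward-compatibility point above can be illustrated with a small sketch. It uses a standard-library dataclass as a stand-in for the real schema class (an assumption made so the example has no dependencies); the field names follow the summary, while the types and the `load_item` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyAnswerItem:
    """Stand-in for PrivacyAnswerItemSchema; field types are assumptions."""

    answer: str
    confidence: float
    reason: str
    # default_factory gives each instance its own empty list.
    evidence: list[str] = field(default_factory=list)


def load_item(payload: dict) -> PrivacyAnswerItem:
    """Build an item from a payload that may predate the evidence field."""
    return PrivacyAnswerItem(**payload)
```

A payload produced before the change, with no `evidence` key, still loads and gets an empty list, and because the default comes from a factory, one item's list is never shared with another's.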

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/anonymizer/engine/rewrite/evaluate.py | Adds `evidence` field to the privacy reanswer skeleton and expands the prompt rules with detailed inference-detection guidance; fallback default also updated with `evidence: []`. |
| src/anonymizer/engine/rewrite/repair.py | Removes dead `_format_protection_block`, `_replacement_map_is_empty`, and all references to `COL_SENSITIVITY_DISPOSITION`/`COL_REPLACEMENT_MAP_FOR_PROMPT`; adds evidence rendering in `_leaked_items_text`; substantially rewrites the repair prompt with adversarial-goal, inference-rules, and success-criteria sections. |
| src/anonymizer/engine/schemas/rewrite.py | Adds `evidence: list[str]` field to `PrivacyAnswerItemSchema`; no per-item length constraint unlike the sibling `reason` field, which could allow unbounded strings through validation. |
| src/anonymizer/engine/rewrite/parsers.py | Adds a module-level `logger` and the `logging` import; no logic changes. |
| tests/engine/test_repair.py | Removes tests for deleted functions (`_format_protection_block`, numpy-array replacement-map path); updates remaining tests to drop now-unused columns; evidence formatting branch in `_leaked_items_text` has no test coverage. |
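The schema gap noted for `rewrite.py` (no per-item length limit on `evidence`) could be closed with a check along these lines. This is only a sketch: the `validate_evidence` helper and the 500-character cap are hypothetical and do not come from the PR, and the real schema would presumably express the constraint in its own validation layer.

```python
MAX_EVIDENCE_QUOTE_CHARS = 500  # hypothetical cap, not taken from the PR


def validate_evidence(
    evidence: list[str], max_chars: int = MAX_EVIDENCE_QUOTE_CHARS
) -> list[str]:
    """Return stripped, non-empty quotes; raise ValueError if any is too long."""
    cleaned: list[str] = []
    for index, quote in enumerate(evidence):
        text = quote.strip()
        if len(text) > max_chars:
            raise ValueError(
                f"evidence[{index}] has {len(text)} chars; limit is {max_chars}"
            )
        # Drop quotes that are empty after stripping whitespace.
        if text:
            cleaned.append(text)
    return cleaned
```

Rejecting oversized quotes rather than truncating them is a design choice in this sketch; truncation would keep the row alive but would mean the quote is no longer verbatim.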

Sequence Diagram

sequenceDiagram
    participant Row as Input Row
    participant RE as _render_privacy_reanswer_prompt
    participant LLM1 as Reanswer LLM
    participant PA as PrivacyAnswerItemSchema
    participant LIT as _leaked_items_text
    participant RP as _render_repair_prompt
    participant LLM2 as Repair LLM

    Row->>RE: COL_PRIVACY_QA + COL_REWRITTEN_TEXT
    RE->>RE: Build skeleton with evidence field
    RE->>LLM1: Prompt with answer_template
    LLM1-->>PA: answer, confidence, reason, evidence quotes
    PA->>LIT: privacy_answers with evidence
    LIT->>LIT: Format leaked items and append Evidence quotes
    LIT-->>RP: formatted leaked_items_text
    Row->>RP: COL_TEXT, COL_REWRITTEN_TEXT, COL_LEAKAGE_MASS, etc.
    RP->>RP: Render prompt with adversarial_goal and inference_rules
    RP->>LLM2: Repair prompt without replacement_map or protection_decisions
    LLM2-->>Row: COL_REWRITTEN_TEXT_NEXT

Reviews (3): Last reviewed commit: "fix: update tests for repair"

lipikaramaswamy

Btw, the failing test jobs seem expected from the current diff, since this PR removes repair prompt sections/helpers that the existing repair tests still assert on. If we truly want to remove those sections, the tests will need an update too. But let's discuss.

lipikaramaswamy

LGTM 🚢

Focusing repair on residual inference issues instead of carrying forward all first-pass rewrite guidance makes sense, and the tighter privacy-evaluation approach seems well justified 👍

asteier2026 added 2 commits April 24, 2026 07:36

feature: enhance repair

d329742

feature: repair enhancements

6f8dab0

asteier2026 requested a review from a team as a code owner April 27, 2026 15:27

greptile-apps Bot reviewed Apr 27, 2026

Comment thread src/anonymizer/engine/rewrite/parsers.py Outdated

asteier2026 requested a review from lipikaramaswamy April 27, 2026 15:31

fix: address review feedback on repair

00cafd4

lipikaramaswamy reviewed May 4, 2026

Comment thread src/anonymizer/engine/rewrite/repair.py

lipikaramaswamy reviewed May 4, 2026

Comment thread src/anonymizer/engine/rewrite/evaluate.py

lipikaramaswamy reviewed May 4, 2026

fix: update tests for repair

5caf588

lipikaramaswamy approved these changes May 5, 2026

asteier2026 merged commit 26b231e into main May 6, 2026
11 checks passed

asteier2026 deleted the asteier2026/feature/repair-improvements branch May 6, 2026 15:41

asteier2026 merged 4 commits into main from asteier2026/feature/repair-improvements
