-
Notifications
You must be signed in to change notification settings - Fork 7
Dev/steven/pii update #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the PII detection guardrail with advanced detection capabilities including Unicode normalization, encoded PII detection (Base64, URL-encoded, hex), and additional entity types (CVV, BIC_SWIFT, KR_RRN). The implementation refactors the detection logic from a simple pattern-matching approach to a more robust system supporting multiple patterns per entity and prioritized span replacement.
- Adds encoded PII detection with configurable
detect_encoded_piiflag - Introduces Unicode normalization (NFKC) and zero-width character stripping to prevent obfuscation bypasses
- Adds new PII entity types: CVV (credit card security codes), BIC_SWIFT (bank codes), and KR_RRN (Korean Resident Registration Numbers)
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/checks/pii.ts | Core implementation: refactored detection logic to support multiple patterns per entity, added encoded PII detection functions, Unicode normalization, and new entity patterns |
| src/tests/unit/checks/pii.test.ts | Comprehensive test coverage for new features including KR_RRN validation, Unicode normalization, encoded PII detection, and new entity types |
| examples/basic/pii_mask_example.ts | New interactive example demonstrating PII masking in pre-flight stage and blocking in output stage with encoded PII detection enabled |
| docs/ref/checks/pii.md | Updated documentation describing new detection capabilities, encoded PII feature, and expanded entity list |
| .gitignore | Minor comment clarification |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 36 out of 36 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 36 out of 36 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| anonymized_text: checkedText, // Legacy compatibility | ||
| checked_text: checkedText, // Primary field for preflight modifications | ||
| anonymized_text: checkedText, | ||
| checked_text: checkedText, |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checked_text field is being set unconditionally in the PII guardrail, but according to the PR's purpose (making checked_text optional and only including it when content is modified), this should only be included when hasPii is true. Currently, it's set to the original text when no PII is found, which contradicts the design goal of omitting checked_text when no modifications occur.
| checked_text: checkedText, | |
| ...(hasPii ? { checked_text: checkedText } : {}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WAI. The design is to always use the checked_text field
| if (typeof result.info.checked_text === 'string' && !maskedTextOverride) { | ||
| maskedTextOverride = result.info.checked_text; | ||
| } |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition !maskedTextOverride means only the first preflight result with checked_text will be used. If multiple PII guardrails run in sequence and both mask different entities, later masks will be ignored. This could lead to incomplete PII masking. Consider either merging masked texts or documenting this first-wins behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WAI. There will not be multiple PII guardrails running
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 36 out of 36 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
gabor-openai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Update PII guardrail to be more robust against adversarial inputs
checked_fieldfrom result object for guardrails that don't use it