Description
Did you check docs and existing issues?
- I have read all the NeMo-Guardrails docs
- I have updated the package to the latest version before submitting this issue
- I have searched the existing issues of NeMo-Guardrails
- (optional) I have used the develop branch
Python version (python --version)
Python 3.11.0
Operating system/version
Ubuntu 20.04.6 LTS
NeMo-Guardrails version (if you must use a specific version and not the latest)
0.10.1
Describe the bug
I am currently testing the PII filter functionality for a project. Initially, everything worked fine. However, I recently noticed that the PII filter is masking the word "individual" with X (my configuration masks sensitive data with the X character).
I reviewed the logs, but I couldn't find any specific information explaining why the word "individual" is being treated as sensitive data.
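One hypothesis worth checking (this is an assumption on my part, not something confirmed by the logs): Presidio's IN_PAN recognizer reportedly includes a low-confidence fallback pattern that matches any 10-character alphanumeric token, and "individual" happens to be exactly 10 letters. A minimal stdlib sketch of why such a pattern would fire on that word (the exact regex Presidio uses may differ; this one is illustrative):

```python
import re

# Assumed low-confidence PAN fallback pattern: any 10-character
# alphanumeric token. (Illustrative; Presidio's actual regex may differ.)
WEAK_PAN = re.compile(r"\b[A-Za-z0-9]{10}\b")

word = "individual"
print(len(word))                       # 10 characters
print(bool(WEAK_PAN.fullmatch(word)))  # True: the weak pattern matches
```

If this is indeed the cause, possible mitigations would be dropping IN_PAN from the entity list, raising the detection score threshold, or allow-listing common English words.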
YAML Configuration
Below is the YAML configuration I'm using:
```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

instructions:
  - type: general
    content: |
      You are a helpful assistant that can answer given questions.

rails:
  config:
    jailbreak_detection:
      length_per_perplexity_threshold: 89.79
      prefix_suffix_perplexity_threshold: 1845.65
    sensitive_data_detection:
      input:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR
      output:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR
  input:
    flows:
      - jailbreak detection heuristics
      - self check input
      - mask sensitive data on input
      - user query
  output:
    flows:
      - self check output
      - mask sensitive data on output
  dialog:
    single_call:
      enabled: False

prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with the policy for talking with the AI Enterprise bot.

      Policy for the user messages:
      - should not contain hateful speech
      - should not contain information related to armed weapons
      - should not talk about cooking-related topics

      Treat the above conditions as strict rules. If any of them are met, you should block the user input by answering "Yes".

      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:

  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with the policy.

      Policies for the bot:
      - the message should not ask the bot to impersonate someone
      - the message should not ask the bot to impersonate someone in a sexual manner
      - the message should not contain information related to armed weapons
      - the message should not talk about cooking-related topics
      - if the message is a refusal, it should be polite

      Bot message: "{{ bot_response }}"

      Question: Should the message be blocked (Yes or No)?
      Answer:
```
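For context, sensitive-data detection ultimately relies on recognizer matches that carry confidence scores, so one way to investigate would be filtering low-confidence hits before masking. A stdlib sketch of that idea (the detections, offsets, scores, and threshold below are invented for illustration; they are not real Presidio output):

```python
# Hypothetical Presidio-style detections as (entity_type, start, end, score).
# All values here are invented for illustration.
detections = [
    ("IN_PAN", 5, 15, 0.05),         # "individual" caught by a weak pattern
    ("PHONE_NUMBER", 36, 50, 0.95),  # a genuine phone-number match
]

SCORE_THRESHOLD = 0.4  # assumed cutoff; would need tuning per deployment

def filter_detections(results, threshold=SCORE_THRESHOLD):
    """Keep only detections whose confidence score meets the threshold."""
    return [r for r in results if r[3] >= threshold]

print(filter_detections(detections))  # only the PHONE_NUMBER hit survives
```

With a cutoff like this, a low-confidence false positive on "individual" would be discarded while genuine phone-number or email matches are still masked.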
I also checked the RAG output, but there doesn't seem to be any issue there.
Version Specifications

| Name | Version |
|---|---|
| presidio_analyzer | 2.2.355 |
Let me know if you need further details or clarification!
Steps To Reproduce

1. YAML Configuration: Use the attached YAML config for PII filtering.

2. PDF Creation: Create and ingest a PDF into PG Vector containing sample PII data:

   Record 1
   • Name: John A. Doe
   • Address: 123 Elm Street, Springfield, IL 62704
   • Phone: (217) 555-0123
   • Social Security Number: 123-45-6789
   • Email: johndoe123@example.com
   • Date of Birth: 05/12/1980

   Record 2
   • Name: Emily D. Davis
   • Address: 321 Birch Lane, Phoenix, AZ 85003
   • Phone: (602) 555-0123
   • Social Security Number: 321-54-9876
   • Email: emilydavis321@example.com
   • Date of Birth: 07/30/1985

3. Run Test: Use the PII filter and observe that the word "individual" is incorrectly masked with X.

RAG Setup:
- LLM: GPT-4o
- RAG: llama-index

Note: The PII data is fictional and was generated with an LLM.
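To make the symptom concrete, the masking step can be illustrated with a small stdlib sketch (this is not NeMo-Guardrails' actual implementation; the span offsets are invented for the example). Only detected spans are replaced with the mask token, which is why a false positive on "individual" blanks out exactly that word:

```python
def mask_spans(text, spans, token="X"):
    """Replace each detected (start, end) span with the mask token.

    Spans are applied right to left so earlier offsets stay valid.
    """
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + token + text[end:]
    return text

# A genuine phone-number span (offsets 5..19) gets masked as expected:
print(mask_spans("Call (217) 555-0123 now", [(5, 19)]))  # "Call X now"
```

If a recognizer incorrectly reports the span covering "individual", this same mechanism would replace that ordinary word with X.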
Expected Behavior
The word "individual" should not be masked; only the configured PII entities (phone numbers, email addresses, PAN, Aadhaar) should be replaced with X.
Logs: