bug: PII Filter Incorrectly Masks the Word 'individual' as Sensitive Data #818

### Did you check docs and existing issues?

- [x] I have read all the NeMo-Guardrails docs
- [x] I have updated the package to the latest version before submitting this issue
- [x] I have searched the existing issues of NeMo-Guardrails
- [ ] (optional) I have used the develop branch

### Python version (python --version)

Python 3.11.0

### Operating system/version

Ubuntu 20.04.6 LTS

### NeMo-Guardrails version (if you must use a specific version and not the latest)

0.10.1

### Describe the bug

I am testing the PII filter for a project. It initially worked as expected, but I recently noticed that it masks the word "individual" with `X` (my configuration masks sensitive data with the `X` character).

I reviewed the logs, but I couldn't find any specific information to explain why the word "individual" is being masked as sensitive data.
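
One hypothesis that fits the symptom, unconfirmed in this report: "individual" is exactly 10 characters long, so a loose fixed-length identifier pattern could match it whenever digits appear later in the same text. The sketch below uses only the standard library. `LOOSE_ID`, `STRICT_ID`, and `mask` are hypothetical names, and the patterns are written for illustration, not copied from any recognizer.

```python
import re

# Hypothetical loose pattern (an assumption, not a library's actual regex):
# a 10-character token whose lookaheads only require a letter and four
# consecutive digits somewhere later in the text, not inside the token.
LOOSE_ID = re.compile(r"\b(?=.*?[A-Za-z])(?=.*?[0-9]{4})[A-Za-z0-9]{10}\b")

# Stricter variant: the four digits must sit inside the 10-character token.
STRICT_ID = re.compile(r"\b[A-Za-z]{5}[0-9]{4}[A-Za-z]\b")


def mask(text: str, pattern: re.Pattern, mask_char: str = "X") -> str:
    """Replace every match of `pattern` with a run of `mask_char`."""
    return pattern.sub(lambda m: mask_char * len(m.group()), text)


sample = "The individual can be reached at (217) 555-0123."
print(mask(sample, LOOSE_ID))   # "individual" is masked by the loose pattern
print(mask(sample, STRICT_ID))  # the sentence is left unchanged
```

If a pattern of this kind is the cause, the false positive would be expected to disappear when the surrounding text contains no run of four digits.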

### YAML Configuration

Below is the YAML configuration I'm using:

```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

instructions:
  - type: general
    content: |
      You are a helpful assistant that can answer given questions.

rails:
  config: 
    jailbreak_detection:
      length_per_perplexity_threshold: 89.79
      prefix_suffix_perplexity_threshold: 1845.65   
    sensitive_data_detection:
      input:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR
      output:
        entities:
          - PHONE_NUMBER
          - EMAIL_ADDRESS
          - IN_PAN
          - IN_AADHAAR

  input:
    flows:
      - jailbreak detection heuristics
      - self check input
      - mask sensitive data on input
      - user query

  output:
    flows:
      - self check output
      - mask sensitive data on output

  dialog:
    single_call:
      enabled: False

prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with the policy for talking with the AI Enterprise bot.
      Policy for the user messages:      
      
      - should not contain hateful speech
      - Should not contain armed weapons related information.
      - Should not talk about cooking related information.

      Treat the above conditions as strict rules. If any of them are met, you should block the user input by saying "yes".
      
      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:

  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with the policy.

      Policies for the bot:     
      
      - message should not ask the bot to impersonate someone
      - message should not ask the bot to impersonate someone in a sexual manner.
      - message Should not contain armed weapons related information.
      - message Should not talk about cooking related information.
      - if a message is a refusal, it should be polite

      Bot message: "{{ bot_response }}"

      Question: Should the message be blocked (Yes or No)?
      Answer:
```

I also checked the RAG output, but there doesn't seem to be any issue there.
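
To narrow down which of the four configured entities triggers the mask, each one can be enabled on its own and the same question re-run. The sketch below shows only the bookkeeping. `find_culprits` and `fake_run` are hypothetical names, and `fake_run` is a stand-in that would need to be replaced with a real run of the rails using the given entity list.

```python
from typing import Callable, Iterable, List

# Entities taken from the sensitive_data_detection section of the config.
ENTITIES = ["PHONE_NUMBER", "EMAIL_ADDRESS", "IN_PAN", "IN_AADHAAR"]


def find_culprits(
    entities: Iterable[str],
    word_is_masked: Callable[[List[str]], bool],
) -> List[str]:
    """Return the entities that, when enabled alone, still mask the word."""
    return [entity for entity in entities if word_is_masked([entity])]


def fake_run(enabled: List[str]) -> bool:
    # Stand-in only: pretends a single entity is responsible so the helper
    # can be exercised. The choice of entity here is an assumption.
    return "IN_PAN" in enabled


print(find_culprits(ENTITIES, fake_run))
```

An empty result would suggest the mask comes from a combination of entities or from another part of the pipeline.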

### Version Specifications

| **Name**            | **Version** |
|---------------------|-------------|
| presidio_analyzer   | 2.2.355     |

---

Let me know if you need further details or clarification!

### Steps To Reproduce

1. **YAML Configuration**:  
   Use the attached YAML config for PII filtering.

2. **PDF Creation**:  
   Create and ingest a PDF into PG Vector containing sample PII data:
   ```
   Record 1
   • Name: John A. Doe
   • Address: 123 Elm Street, Springfield, IL 62704
   • Phone: (217) 555-0123
   • Social Security Number: 123-45-6789
   • Email: johndoe123@example.com
   • Date of Birth: 05/12/1980

   Record 2
   • Name: Emily D. Davis
   • Address: 321 Birch Lane, Phoenix, AZ 85003
   • Phone: (602) 555-0123
   • Social Security Number: 321-54-9876
   • Email: emilydavis321@example.com
   • Date of Birth: 07/30/1985
   ```

3. **Run Test**:  
   Use the PII filter and observe that the word "individual" is incorrectly masked with `X`.

4. **RAG Setup**:  
   - **LLM**: GPT-4o  
   - **RAG**: llama-index
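
Since the logs do not say which entity produced the mask, it may help to print the detected spans next to their entity type and score while running the test. The sketch below assumes detection objects that expose `entity_type`, `start`, `end`, and `score`. `Detection` and `explain` are hypothetical names, and the sample detections are invented values for illustration, not output from this setup.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Minimal stand-in for one analyzer result."""
    entity_type: str
    start: int
    end: int
    score: float


def explain(text: str, detections: List[Detection]) -> List[Tuple[str, str, float]]:
    """List (matched substring, entity type, score) in order of appearance."""
    ordered = sorted(detections, key=lambda d: d.start)
    return [(text[d.start:d.end], d.entity_type, d.score) for d in ordered]


text = "The individual can be reached at (217) 555-0123."
word, phone = "individual", "(217) 555-0123"
w, p = text.index(word), text.index(phone)

# Invented detections: entity types and scores are assumptions.
found = [
    Detection("PHONE_NUMBER", p, p + len(phone), 0.75),
    Detection("IN_PAN", w, w + len(word), 0.01),
]

for substring, entity_type, score in explain(text, found):
    print(f"{substring!r} flagged as {entity_type} (score {score})")
```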

---

**Note**: The PII data is fictional and was generated using an LLM.

### Expected Behavior

The word "individual" should not be masked.

RAG response:
![Image](https://github.com/user-attachments/assets/29729ebd-4114-4513-99f9-c66e2f642aa2)

---

Logs:

![Image](https://github.com/user-attachments/assets/27b521c4-dda3-4ed0-8a4a-12f0499d89f6)

![Image](https://github.com/user-attachments/assets/330b0dbf-5415-4364-a8f7-432238a79bb2)



### Actual Behavior

Answer from NeMo-Guardrails:
![Image](https://github.com/user-attachments/assets/dc033d22-0a95-4816-83b1-44c91278c621)

