Merged
4 changes: 1 addition & 3 deletions docs/ref/checks/competitors.md
@@ -30,11 +30,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 {
 "guardrail_name": "Competitor Detection",
 "competitors_found": ["competitor1"],
-"checked_competitors": ["competitor1", "rival-company.com"],
-"checked_text": "Original input text"
+"checked_competitors": ["competitor1", "rival-company.com"]
 }
 ```

 - **`competitors_found`**: List of competitors detected in the text
 - **`checked_competitors`**: List of competitors that were configured for detection
-- **`checked_text`**: Original input text
4 changes: 1 addition & 3 deletions docs/ref/checks/custom_prompt_check.md
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "guardrail_name": "Custom Prompt Check",
 "flagged": true,
 "confidence": 0.85,
-"threshold": 0.7,
-"checked_text": "Original input text"
+"threshold": 0.7
 }
 ```

 - **`flagged`**: Whether the custom validation criteria were met
 - **`confidence`**: Confidence score (0.0 to 1.0) for the validation
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text
4 changes: 1 addition & 3 deletions docs/ref/checks/hallucination_detection.md
@@ -113,8 +113,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "hallucination_type": "factual_error",
 "hallucinated_statements": ["Our premium plan costs $299/month"],
 "verified_statements": ["We offer customer support"],
-"threshold": 0.7,
-"checked_text": "Our premium plan costs $299/month and we offer customer support"
+"threshold": 0.7
 }
 ```

@@ -125,7 +124,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 - **`hallucinated_statements`**: Specific statements that are contradicted or unsupported
 - **`verified_statements`**: Statements that are supported by your documents
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text

 Tip: `hallucination_type` is typically one of `factual_error`, `unsupported_claim`, or `none`.

4 changes: 1 addition & 3 deletions docs/ref/checks/jailbreak.md
@@ -56,15 +56,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "guardrail_name": "Jailbreak",
 "flagged": true,
 "confidence": 0.85,
-"threshold": 0.7,
-"checked_text": "Original input text"
+"threshold": 0.7
 }
 ```

 - **`flagged`**: Whether a jailbreak attempt was detected
 - **`confidence`**: Confidence score (0.0 to 1.0) for the detection
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text

 ## Related checks

4 changes: 1 addition & 3 deletions docs/ref/checks/keywords.md
@@ -25,11 +25,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 {
 "guardrail_name": "Keyword Filter",
 "matched": ["confidential", "secret"],
-"checked": ["confidential", "secret", "internal only"],
-"checked_text": "This is confidential information that should be kept secret"
+"checked": ["confidential", "secret", "internal only"]
 }
 ```

 - **`matched`**: List of keywords found in the text
 - **`checked`**: List of keywords that were configured for detection
-- **`checked_text`**: Original input text
4 changes: 1 addition & 3 deletions docs/ref/checks/moderation.md
@@ -57,12 +57,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "violence": 0.12,
 "self-harm": 0.08,
 "sexual": 0.03
-},
-"checked_text": "Original input text"
+}
 }
 ```

 - **`flagged`**: Whether any category violation was detected
 - **`categories`**: Boolean flags for each category indicating violations
 - **`category_scores`**: Confidence scores (0.0 to 1.0) for each category
-- **`checked_text`**: Original input text
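
For readers consuming this result, the following is a minimal caller-side sketch of how `category_scores` might be inspected. It assumes the scores shown in the hunk above sit under the `category_scores` key of `info`; the helper name `categories_above` and the `min_score` cutoff are hypothetical and not part of the library.

```python
def categories_above(info: dict, min_score: float) -> list[str]:
    """Return category names scoring at least min_score, highest score first.

    `info` is assumed to be the guardrail's `info` dictionary; this helper is
    illustrative only and is not provided by the library.
    """
    scores = info.get("category_scores", {})
    selected = [name for name, score in scores.items() if score >= min_score]
    # Sort descending by score so the most severe category comes first.
    return sorted(selected, key=lambda name: -scores[name])


if __name__ == "__main__":
    sample_info = {
        "category_scores": {"violence": 0.12, "self-harm": 0.08, "sexual": 0.03}
    }
    print(categories_above(sample_info, 0.05))  # ['violence', 'self-harm']
```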
4 changes: 1 addition & 3 deletions docs/ref/checks/nsfw.md
@@ -44,15 +44,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "guardrail_name": "NSFW Text",
 "flagged": true,
 "confidence": 0.85,
-"threshold": 0.7,
-"checked_text": "Original input text"
+"threshold": 0.7
 }
 ```

 - **`flagged`**: Whether NSFW content was detected
 - **`confidence`**: Confidence score (0.0 to 1.0) for the detection
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text

 ### Examples

4 changes: 1 addition & 3 deletions docs/ref/checks/off_topic_prompts.md
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "guardrail_name": "Off Topic Prompts",
 "flagged": false,
 "confidence": 0.85,
-"threshold": 0.7,
-"checked_text": "Original input text"
+"threshold": 0.7
 }
 ```

 - **`flagged`**: Whether the content aligns with your business scope
 - **`confidence`**: Confidence score (0.0 to 1.0) for the prompt injection detection assessment
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text
51 changes: 45 additions & 6 deletions docs/ref/checks/pii.md
@@ -2,22 +2,33 @@

 Detects personally identifiable information (PII) such as SSNs, phone numbers, credit card numbers, and email addresses using Microsoft's [Presidio library](https://microsoft.github.io/presidio/). Will automatically mask detected PII or block content based on configuration.

+**Advanced Security Features:**
+
+- **Unicode normalization**: Prevents bypasses using fullwidth characters (@) or zero-width spaces
+- **Encoded PII detection**: Optionally detects PII hidden in Base64, URL-encoded, or hex strings
+- **URL context awareness**: Detects emails in query parameters (e.g., `GET /api?user=john@example.com`)
+- **Custom recognizers**: Includes CVV/CVC codes and BIC/SWIFT codes beyond Presidio defaults
+
 ## Configuration

 ```json
 {
 "name": "Contains PII",
 "config": {
-"entities": ["EMAIL_ADDRESS", "US_SSN", "CREDIT_CARD", "PHONE_NUMBER"],
-"block": false
+"entities": ["EMAIL_ADDRESS", "US_SSN", "CREDIT_CARD", "PHONE_NUMBER", "CVV", "BIC_SWIFT"],
+"block": false,
+"detect_encoded_pii": false
 }
 }
 ```

 ### Parameters

-- **`entities`** (required): List of PII entity types to detect. See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/).
+- **`entities`** (required): List of PII entity types to detect. Includes:
+- Standard Presidio entities: See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/)
+- Custom entities: `CVV` (credit card security codes), `BIC_SWIFT` (bank identification codes)
 - **`block`** (optional): Whether to block content or just mask PII (default: `false`)
+- **`detect_encoded_pii`** (optional): If `true`, detects PII in Base64/URL-encoded/hex strings (default: `false`)

 ## Implementation Notes

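The Unicode normalization feature listed in the hunk above can be illustrated with a minimal sketch. It assumes NFKC folding followed by removal of a small, hand-picked set of zero-width code points. The function name `normalize_for_pii_scan` and the `ZERO_WIDTH` set are hypothetical and are not taken from the guardrail's implementation, which may normalize differently.

```python
import unicodedata

# Zero-width / invisible code points often used to split a token.
# This set is an illustrative assumption, not the guardrail's actual list.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def normalize_for_pii_scan(text: str) -> str:
    """Fold compatibility characters to ASCII forms and drop zero-width characters."""
    # NFKC maps fullwidth forms such as U+FF20 (fullwidth commercial at) to '@'.
    folded = unicodedata.normalize("NFKC", text)
    # NFKC leaves zero-width characters in place, so remove them explicitly.
    return "".join(ch for ch in folded if ch not in ZERO_WIDTH)


if __name__ == "__main__":
    obfuscated = "john\uff20exam\u200bple.com"
    print(normalize_for_pii_scan(obfuscated))  # john@example.com
```

Normalizing before running the recognizers means a single pattern for `@` is enough, rather than one pattern per lookalike character.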
@@ -41,6 +52,8 @@ Detects personally identifiable information (PII) such as SSNs, phone numbers, c

 Returns a `GuardrailResult` with the following `info` dictionary:

+### Basic Example (Plain PII)
+
 ```json
 {
 "guardrail_name": "Contains PII",
@@ -55,8 +68,34 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 }
 ```

-- **`detected_entities`**: Detected entities and their values
+### With Encoded PII Detection Enabled
+
+When `detect_encoded_pii: true`, the guardrail also detects and masks encoded PII:
+
+```json
+{
+"guardrail_name": "Contains PII",
+"detected_entities": {
+"EMAIL_ADDRESS": [
+"user@email.com",
+"am9obkBleGFtcGxlLmNvbQ==",
+"%6a%6f%65%40domain.com",
+"6a6f686e406578616d706c652e636f6d"
+]
+},
+"entity_types_checked": ["EMAIL_ADDRESS"],
+"checked_text": "Contact <EMAIL_ADDRESS> or <EMAIL_ADDRESS_ENCODED> or <EMAIL_ADDRESS_ENCODED>",
+"block_mode": false,
+"pii_detected": true
+}
+```
+
+Note: Encoded PII is masked with `<ENTITY_TYPE_ENCODED>` to distinguish it from plain text PII.
+
+### Field Descriptions
+
+- **`detected_entities`**: Detected entities and their values (includes both plain and encoded forms when `detect_encoded_pii` is enabled)
 - **`entity_types_checked`**: List of entity types that were configured for detection
-- **`checked_text`**: Text with PII masked (if PII was found) or original text (if no PII was found)
+- **`checked_text`**: Text with PII masked. Plain PII uses `<ENTITY_TYPE>`, encoded PII uses `<ENTITY_TYPE_ENCODED>`
 - **`block_mode`**: Whether the check was configured to block or mask
-- **`pii_detected`**: Boolean indicating if any PII was found
+- **`pii_detected`**: Boolean indicating if any PII was found (plain or encoded)
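
As a rough illustration of the `<ENTITY_TYPE>` versus `<ENTITY_TYPE_ENCODED>` masking described above, the sketch below tries Base64, hex, and URL decoding on each whitespace-separated token and masks anything that decodes to an email address. It is a simplification under stated assumptions: it uses a basic email regex instead of Presidio, splits on whitespace only, and the helper names `decode_candidates` and `mask_emails` are hypothetical rather than the guardrail's own functions.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Simplified email pattern for illustration; the real check relies on Presidio.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def decode_candidates(token: str) -> list[str]:
    """Return the decodings of a token that succeed (Base64, hex, URL-encoding)."""
    decoded = []
    try:
        decoded.append(base64.b64decode(token, validate=True).decode("utf-8"))
    except (binascii.Error, ValueError):
        pass  # not valid Base64, or not valid UTF-8 after decoding
    try:
        decoded.append(bytes.fromhex(token).decode("utf-8"))
    except ValueError:
        pass  # not a hex string, or not valid UTF-8 after decoding
    if "%" in token:
        decoded.append(unquote(token))
    return decoded


def mask_emails(text: str) -> str:
    """Mask plain emails as <EMAIL_ADDRESS> and encoded ones as <EMAIL_ADDRESS_ENCODED>."""
    masked = []
    for token in text.split():
        if EMAIL_RE.fullmatch(token):
            masked.append("<EMAIL_ADDRESS>")
        elif any(EMAIL_RE.fullmatch(d) for d in decode_candidates(token)):
            masked.append("<EMAIL_ADDRESS_ENCODED>")
        else:
            masked.append(token)
    return " ".join(masked)


if __name__ == "__main__":
    text = "Contact user@email.com or am9obkBleGFtcGxlLmNvbQ== or %6a%6f%65%40domain.com"
    print(mask_emails(text))
    # Contact <EMAIL_ADDRESS> or <EMAIL_ADDRESS_ENCODED> or <EMAIL_ADDRESS_ENCODED>
```

Checking the plain form first keeps the two placeholders distinct, so a caller can tell from `checked_text` alone whether a value was hidden behind an encoding.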
4 changes: 1 addition & 3 deletions docs/ref/checks/prompt_injection_detection.md
@@ -73,8 +73,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "name": "get_weather",
 "arguments": "{'location': 'Tokyo'}"
 }
-],
-"checked_text": "[{'role': 'user', 'content': 'What is the weather in Tokyo?'}]"
+]
 }
 ```

@@ -84,7 +83,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 - **`threshold`**: The confidence threshold that was configured
 - **`user_goal`**: The tracked user intent from conversation
 - **`action`**: The list of function calls or tool outputs analyzed for alignment
-- **`checked_text`**: Serialized conversation history inspected during analysis

 ## Benchmark Results

4 changes: 1 addition & 3 deletions docs/ref/checks/secret_keys.md
@@ -34,10 +34,8 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 ```json
 {
 "guardrail_name": "Secret Keys",
-"detected_secrets": ["sk-abc123...", "Bearer xyz789..."],
-"checked_text": "Original input text"
+"detected_secrets": ["sk-abc123...", "Bearer xyz789..."]
 }
 ```

 - **`detected_secrets`**: List of potential secrets detected in the text
-- **`checked_text`**: Original input text (unchanged)
6 changes: 2 additions & 4 deletions docs/ref/checks/urls.md
@@ -64,8 +64,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 "detected": ["https://example.com", "https://user:pass@malicious.com"],
 "allowed": ["https://example.com"],
 "blocked": ["https://user:pass@malicious.com"],
-"blocked_reasons": ["https://user:pass@malicious.com: Contains userinfo (potential credential injection)"],
-"checked_text": "Visit https://example.com or login at https://user:pass@malicious.com"
+"blocked_reasons": ["https://user:pass@malicious.com: Contains userinfo (potential credential injection)"]
 }
 ```

@@ -76,5 +75,4 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 - **`detected`**: All URLs detected in the text using regex patterns
 - **`allowed`**: URLs that passed all security checks and allow list validation
 - **`blocked`**: URLs that were blocked due to security policies or allow list restrictions
-- **`blocked_reasons`**: Detailed explanations for why each URL was blocked
-- **`checked_text`**: Original input text that was scanned
+- **`blocked_reasons`**: Detailed explanations for why each URL was blocked
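
The userinfo reason shown in the example output can be reproduced with a short sketch built on the standard library's `urllib.parse.urlsplit`. It covers only the userinfo check and deliberately omits allow-list validation and the other security policies; the function names `userinfo_reason` and `classify_urls` are hypothetical and do not come from the guardrail's code.

```python
from typing import Optional
from urllib.parse import urlsplit


def userinfo_reason(url: str) -> Optional[str]:
    """Return a blocked-reason string if the URL embeds credentials, else None."""
    parts = urlsplit(url)
    # `username` / `password` are populated from the `user:pass@` part of the netloc.
    if parts.username is not None or parts.password is not None:
        return f"{url}: Contains userinfo (potential credential injection)"
    return None


def classify_urls(urls: list[str]) -> dict:
    """Split URLs into allowed and blocked using only the userinfo check."""
    result = {"detected": list(urls), "allowed": [], "blocked": [], "blocked_reasons": []}
    for url in urls:
        reason = userinfo_reason(url)
        if reason is None:
            result["allowed"].append(url)
        else:
            result["blocked"].append(url)
            result["blocked_reasons"].append(reason)
    return result


if __name__ == "__main__":
    print(classify_urls(["https://example.com", "https://user:pass@malicious.com"]))
```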
3 changes: 3 additions & 0 deletions examples/basic/pii_mask_example.py
@@ -33,8 +33,11 @@
 "PHONE_NUMBER",
 "US_SSN",
 "CREDIT_CARD",
+"CVV",
+"BIC_SWIFT",
 ],
 "block": False, # Default - won't block, just mask
+"detect_encoded_pii": True,
 },
 }
 ],
1 change: 1 addition & 0 deletions pyproject.toml
@@ -12,6 +12,7 @@ dependencies = [
 "openai-agents>=0.3.3",
 "pip>=25.0.1",
 "presidio-analyzer>=2.2.360",
+"presidio-anonymizer>=2.2.360",
 "thinc>=8.3.6",
 ]
 classifiers = [