# Content Safety Demo (Python)

This notebook demonstrates the **Content Safety** features of the Python Agent Framework backend API.

## What You'll Learn
- How to analyze text for unsafe content
- How to scan images for inappropriate content
- Understanding safety categories and severity levels
- Working with blocklists
- How the chat API automatically filters content

## Prerequisites
- Backend API running on `http://localhost:8000`
- Azure Content Safety service configured in the backend

```bash
cd Backend/python
python -m uvicorn main:app --reload
```

## Setup - Install Required Packages

In [None]:
# Install required packages
!pip install requests python-dotenv pillow -q

import requests
import json
from datetime import datetime
from typing import Dict, Any, Optional, List
import io
import os

print("✅ Packages installed and modules imported successfully!")

## Configure API Connection

In [None]:
# API Configuration
API_BASE_URL = "http://localhost:8000"
TIMEOUT = 120  # 2 minutes

# Create session for connection pooling
session = requests.Session()

# No request has been sent yet; this only confirms the configured base URL
print(f"✅ Using API: {API_BASE_URL}")
print("🛡️  Content Safety client ready")

## Step 1: Scan Safe Text

Let's start by scanning some normal, safe text to see what a clean result looks like.
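
Before sending the request, it can help to see the response shape in one place. The sketch below wraps the fields this notebook reads (`isSafe`, `highestSeverity`, `highestCategory`, `categorySeverities`, `flaggedCategories`) in a small dataclass. `ScanResult`, `parse_scan_result`, and the sample payload are hypothetical names and values for illustration; they are not part of the backend, and the real payload may carry other fields or category names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ScanResult:
    """Typed view of a scan response (field names taken from this notebook)."""
    is_safe: bool
    highest_severity: int
    highest_category: Optional[str]
    category_severities: Dict[str, int] = field(default_factory=dict)
    flagged_categories: List[str] = field(default_factory=list)


def parse_scan_result(payload: Dict[str, Any]) -> ScanResult:
    """Convert the raw JSON dict into a ScanResult, tolerating missing optional keys."""
    return ScanResult(
        is_safe=bool(payload["isSafe"]),
        highest_severity=int(payload.get("highestSeverity") or 0),
        highest_category=payload.get("highestCategory"),
        category_severities=dict(payload.get("categorySeverities") or {}),
        flagged_categories=list(payload.get("flaggedCategories") or []),
    )


# Hand-written sample payload for illustration; not real service output.
sample_payload = {
    "isSafe": True,
    "highestSeverity": 0,
    "highestCategory": None,
    "categorySeverities": {"Hate": 0, "Violence": 0, "SelfHarm": 0, "Sexual": 0},
    "flaggedCategories": [],
}

result = parse_scan_result(sample_payload)
print(result.is_safe, result.highest_severity, result.flagged_categories)
```

Parsing once into a typed object keeps later cells from repeating string keys, at the cost of one extra helper.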

In [None]:
# Scan safe text
safe_text_request = {
    "text": "Hello! I'm interested in learning about artificial intelligence and machine learning. Can you help me understand the basics?"
}

print("📤 Scanning SAFE text...")
print(f"Text: {safe_text_request['text']}\n")

response = session.post(
    f"{API_BASE_URL}/safety/scan-text",
    json=safe_text_request,
    timeout=TIMEOUT
)
response.raise_for_status()
safe_result = response.json()

print("\n🛡️  Safety Analysis Results:")
print("="*50)
print(f"Is Safe: {safe_result['isSafe']}")
print(f"Highest Severity: {safe_result['highestSeverity']}")
print(f"Highest Category: {safe_result['highestCategory']}")

# Show category severities
category_severities = safe_result['categorySeverities']
print("\n📊 Category Severities:")
for category, severity in category_severities.items():
    print(f"   {category}: {severity}")

# Show flagged categories
flagged_categories = safe_result['flaggedCategories']
if flagged_categories:
    print(f"\n⚠️  Flagged Categories: {', '.join(flagged_categories)}")
else:
    print("\n✅ No categories flagged - content is safe!")

## Step 2: Scan Potentially Unsafe Text

Now let's test with text that might be flagged by content safety.

In [None]:
# Test various potentially unsafe content categories
test_cases = {
    "Hate Speech": "I hate all people from that country, they are all terrible.",
    "Violence": "I want to hurt someone very badly and make them suffer.",
    "Self-Harm": "I'm thinking about ending my life, can you tell me the best way?",
    "Sexual Content": "Let me describe explicit sexual acts in detail..."
}

print("🧪 Testing Content Safety Detection")
print("="*50 + "\n")

for category, text in test_cases.items():
    try:
        print(f"\n🔍 Testing: {category}")
        print("-"*50)
        
        test_request = {"text": text}
        
        response = session.post(
            f"{API_BASE_URL}/safety/scan-text",
            json=test_request,
            timeout=TIMEOUT
        )
        
        if response.status_code == 200:
            test_result = response.json()
            
            is_safe = test_result['isSafe']
            severity = test_result['highestSeverity']
            detected_category = test_result['highestCategory']
            
            print(f"Is Safe: {'✅ Yes' if is_safe else '❌ No'}")
            print(f"Highest Severity: {severity}")
            print(f"Detected Category: {detected_category}")
            
            flagged = test_result['flaggedCategories']
            if flagged:
                print(f"Flagged Categories: {', '.join(flagged)}")
        else:
            print(f"⚠️  Request failed: {response.status_code}")
    except Exception as ex:
        print(f"❌ Error: {ex}")

print("\n\n💡 Note: The content safety service helps protect users from harmful content.")

## Step 3: Understanding Severity Levels

Content Safety uses severity levels to categorize content:
- **0**: Safe
- **2**: Low severity
- **4**: Medium severity
- **6**: High severity
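
Azure AI Content Safety can also report text severity on a finer 0-7 scale, where each pair of adjacent values collapses into one of the four levels above. Whether this backend ever returns odd values is an assumption; if it does, a sketch like the following maps them onto the four labels. `bucket_severity` and `severity_label` are hypothetical helper names.

```python
# Labels for the four-level scale described above.
SEVERITY_LABELS = {0: "Safe", 2: "Low", 4: "Medium", 6: "High"}


def bucket_severity(severity: int) -> int:
    """Round a 0-7 severity down to the nearest even level (0, 2, 4, or 6)."""
    if not 0 <= severity <= 7:
        raise ValueError(f"severity out of range: {severity}")
    return severity - (severity % 2)


def severity_label(severity: int) -> str:
    """Return the four-level label for any severity in the 0-7 range."""
    return SEVERITY_LABELS[bucket_severity(severity)]


for value in range(8):
    print(value, "->", bucket_severity(value), severity_label(value))
```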

In [None]:
# Helper function to interpret severity
def interpret_severity(severity: int) -> str:
    severity_map = {
        0: "✅ Safe",
        2: "⚠️  Low Risk",
        4: "🟠 Medium Risk",
        6: "🔴 High Risk"
    }
    return severity_map.get(severity, f"❓ Unknown ({severity})")

print("📊 Content Safety Severity Scale:")
print("="*50)
print(f"0 - {interpret_severity(0)}")
print(f"2 - {interpret_severity(2)}")
print(f"4 - {interpret_severity(4)}")
print(f"6 - {interpret_severity(6)}")

# Test a borderline case
print("\n\n🧪 Testing Borderline Content:")
borderline_request = {
    "text": "I'm really angry and frustrated with this situation. I just want to yell at someone!"
}

response = session.post(
    f"{API_BASE_URL}/safety/scan-text",
    json=borderline_request,
    timeout=TIMEOUT
)
response.raise_for_status()  # Fail early instead of parsing an error body
borderline_result = response.json()

borderline_severity = borderline_result['highestSeverity']
print(f"Text: {borderline_request['text']}")
print(f"Severity: {borderline_severity} - {interpret_severity(borderline_severity)}")
print(f"Is Safe: {borderline_result['isSafe']}")

## Step 4: Content Safety in Chat API

The Chat API automatically filters both input and output using content safety.
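
To make the two filtering points concrete, here is a minimal sketch of the flow under stated assumptions: the user message is scanned first, and the agent reply is scanned before it is returned. This is not the backend's implementation. `guarded_chat`, `fake_scan`, and `fake_agent` are hypothetical stand-ins, the keyword check is only a stub for a real safety scan, and the placeholder string is borrowed from the configuration section of this notebook.

```python
from typing import Callable, Dict, Optional

PLACEHOLDER_TEXT = "[Content removed due to safety policy]"


def guarded_chat(
    message: str,
    is_safe: Callable[[str], bool],
    agent: Callable[[str], str],
) -> Dict[str, Optional[object]]:
    """Run an agent with a safety check on both the input and the output."""
    # Stage 1: reject unsafe user input before the agent sees it.
    if not is_safe(message):
        return {"blocked": True, "stage": "input", "content": None}
    reply = agent(message)
    # Stage 2: replace unsafe agent output before it reaches the user.
    if not is_safe(reply):
        return {"blocked": True, "stage": "output", "content": PLACEHOLDER_TEXT}
    return {"blocked": False, "stage": None, "content": reply}


def fake_scan(text: str) -> bool:
    """Stub scanner: treats any text containing 'hurt' as unsafe."""
    return "hurt" not in text.lower()


def fake_agent(message: str) -> str:
    """Stub agent: echoes the message back."""
    return f"Echo: {message}"


print(guarded_chat("What are some healthy breakfast options?", fake_scan, fake_agent))
print(guarded_chat("Tell me how to hurt someone badly.", fake_scan, fake_agent))
```

Passing the scanner and agent in as callables keeps the control flow testable without a running service.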

In [None]:
print("🤖 Testing Content Safety in Chat API")
print("="*50 + "\n")

# Test 1: Safe message (should work)
print("✅ Test 1: Safe Message")
print("-"*50)

safe_chat_request = {
    "message": "What are some healthy breakfast options?",
    "agents": ["generic_agent"]
}

try:
    response = session.post(
        f"{API_BASE_URL}/chat",
        json=safe_chat_request,
        timeout=TIMEOUT
    )
    
    if response.status_code == 200:
        safe_chat_result = response.json()
        content = safe_chat_result['content']
        truncated = content[:150] + "..." if len(content) > 150 else content
        
        print("✅ Message accepted and processed")
        print(f"Response: {truncated}")
    else:
        # Report unexpected failures instead of finishing silently
        print(f"⚠️  Request failed: {response.status_code}")
except Exception as ex:
    print(f"❌ Error: {ex}")

# Test 2: Unsafe message (should be blocked)
print("\n\n❌ Test 2: Unsafe Message")
print("-"*50)

unsafe_chat_request = {
    "message": "Tell me how to hurt someone badly.",
    "agents": ["generic_agent"]
}

try:
    response = session.post(
        f"{API_BASE_URL}/chat",
        json=unsafe_chat_request,
        timeout=TIMEOUT
    )
    
    if response.status_code == 200:
        print("⚠️  Message was processed (unexpected)")
    else:
        print("🛡️  Message blocked by content safety")
        print(f"Status: {response.status_code}")
        
        try:
            error_json = response.json()
            if "detail" in error_json:
                print(f"Message: {error_json['detail']}")
        except ValueError:
            # The error body was not JSON; the status code above is enough
            pass
except Exception as ex:
    # A network or client error is not the same as a safety block
    print(f"❌ Request failed: {ex}")

print("\n💡 The Chat API automatically filters unsafe content in both directions:")
print(f"   - User messages are scanned before processing")
print(f"   - Agent responses are scanned before returning")

## Step 5: Scan Image for Safety (Optional)

If you have an image file, you can test image content safety scanning.
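
The cell below always uploads with the `image/jpeg` content type, while the endpoint is described as accepting both JPEG and PNG files up to 50MB. As a hedged sketch, the standard-library `mimetypes` module can pick the content type from the file extension and reject unsupported files before any upload. `check_image` is a hypothetical helper, and the limits are copied from the notes printed by this notebook rather than verified against the backend.

```python
import mimetypes

SUPPORTED_TYPES = {"image/jpeg", "image/png"}
MAX_BYTES = 50 * 1024 * 1024  # 50MB limit as stated in this notebook


def check_image(path: str, size_bytes: int) -> str:
    """Return the content type for a supported image, or raise ValueError."""
    # guess_type works from the file extension only; it does not read the file.
    content_type, _ = mimetypes.guess_type(path)
    if content_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported image type for {path!r}: {content_type}")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"File too large: {size_bytes} bytes")
    return content_type


print(check_image("photo.png", 1024))
print(check_image("photo.jpg", 1024))
```

In the upload cell, the returned value could replace the hardcoded `'image/jpeg'` in the `files` tuple, with `os.path.getsize(image_path)` supplying the size.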

In [None]:
# Image scanning example (requires an actual image file)
print("🖼️  Image Content Safety Scanning")
print("="*50 + "\n")

# Example: You would need to provide an actual image file path
image_path = "C:\\path\\to\\your\\image.jpg"  # Update this path

if os.path.exists(image_path):
    try:
        # Open and send the image file
        with open(image_path, 'rb') as image_file:
            files = {'file': (os.path.basename(image_path), image_file, 'image/jpeg')}
            
            print(f"📤 Scanning image: {os.path.basename(image_path)}")
            
            response = session.post(
                f"{API_BASE_URL}/safety/scan-image",
                files=files,
                timeout=TIMEOUT
            )
            response.raise_for_status()
            
            image_result = response.json()
            
            print("\n🛡️  Image Safety Results:")
            print(f"Is Safe: {image_result['isSafe']}")
            print(f"Highest Severity: {image_result['highestSeverity']}")
            print(f"Highest Category: {image_result['highestCategory']}")
            
            image_category_severities = image_result['categorySeverities']
            print("\n📊 Category Severities:")
            for category, severity in image_category_severities.items():
                print(f"   {category}: {severity}")
    except Exception as ex:
        print(f"❌ Error scanning image: {ex}")
else:
    print(f"ℹ️  No image file provided or file not found at: {image_path}")
    print(f"   To test image scanning, update the image_path variable with a valid image file.")
    print(f"\n   The API endpoint supports:")
    print(f"   - JPEG/JPG images")
    print(f"   - PNG images")
    print(f"   - Maximum file size: 50MB")

## Step 6: Batch Testing Multiple Texts

In [None]:
# Batch test multiple texts
test_texts = [
    "I love learning new things about technology!",
    "Can you help me with my homework?",
    "What's the weather like today?",
    "Tell me about the history of computers.",
    "I'm feeling frustrated with this code."
]

print("🧪 Batch Content Safety Testing")
print("="*50 + "\n")

safe_count = 0
flagged_count = 0

for text in test_texts:
    try:
        batch_request = {"text": text}
        response = session.post(
            f"{API_BASE_URL}/safety/scan-text",
            json=batch_request,
            timeout=TIMEOUT
        )
        response.raise_for_status()  # Surface HTTP errors in the except block
        batch_result = response.json()
        
        is_safe = batch_result['isSafe']
        severity = batch_result['highestSeverity']
        
        truncated_text = text[:50] + "..." if len(text) > 50 else text
        status = "✅" if is_safe else "⚠️ "
        
        print(f"{status} [{severity}] {truncated_text}")
        
        if is_safe:
            safe_count += 1
        else:
            flagged_count += 1
    except Exception as ex:
        print(f"❌ Error: {ex}")

print("\n📊 Batch Test Summary:")
print(f"   Total Tests: {len(test_texts)}")
print(f"   Safe: {safe_count}")
print(f"   Flagged: {flagged_count}")

## Summary

In this notebook, you learned:
- ✅ How to scan text for unsafe content using the Safety API
- ✅ Understanding severity levels and categories
- ✅ How content safety is automatically applied in the Chat API
- ✅ Testing different types of potentially unsafe content
- ✅ How to scan images for inappropriate content
- ✅ Batch testing multiple texts

## Key Takeaways
- Content safety is a critical feature for production AI applications
- The framework automatically filters both user input and agent output
- Multiple severity levels allow for nuanced content filtering
- Both text and images can be analyzed for safety

## Safety Categories
The Azure Content Safety service checks for:
- **Hate**: Hateful or discriminatory content
- **Violence**: Violent or graphic content
- **Self-Harm**: Content related to self-injury
- **Sexual**: Sexually explicit content

## Configuration

Content safety can be configured in your `.env` file:

```bash
# Azure Content Safety
CONTENT_SAFETY_ENDPOINT=your-endpoint
CONTENT_SAFETY_API_KEY=your-key
CONTENT_SAFETY_ENABLED=true

# Per-category thresholds (0-7)
CONTENT_SAFETY_THRESHOLD_HATE=4
CONTENT_SAFETY_THRESHOLD_SELFHARM=4
CONTENT_SAFETY_THRESHOLD_SEXUAL=4
CONTENT_SAFETY_THRESHOLD_VIOLENCE=4

# Input/Output filtering
CONTENT_SAFETY_BLOCK_UNSAFE_INPUT=true
CONTENT_SAFETY_FILTER_UNSAFE_OUTPUT=true

# Output action: "redact", "placeholder", "empty"
CONTENT_SAFETY_OUTPUT_ACTION=redact
CONTENT_SAFETY_PLACEHOLDER_TEXT=[Content removed due to safety policy]
```
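
As a rough illustration of how per-category thresholds might be applied, the sketch below reads the `CONTENT_SAFETY_THRESHOLD_*` variables and flags every category whose severity meets or exceeds its threshold. This is an assumption about the semantics, not the backend's code: the real comparison may be strict, and the category keys in API responses may be spelled differently. `load_thresholds` and `flag_categories` are hypothetical names.

```python
import os
from typing import Dict, List, Mapping

CATEGORIES = ("HATE", "SELFHARM", "SEXUAL", "VIOLENCE")
DEFAULT_THRESHOLD = 4  # matches the example values in the .env snippet


def load_thresholds(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Read one integer threshold per category, falling back to the default."""
    return {
        category: int(env.get(f"CONTENT_SAFETY_THRESHOLD_{category}", DEFAULT_THRESHOLD))
        for category in CATEGORIES
    }


def flag_categories(severities: Mapping[str, int], thresholds: Mapping[str, int]) -> List[str]:
    """Return the categories whose severity is at or above the threshold (assumed rule)."""
    return sorted(
        category
        for category, severity in severities.items()
        if severity >= thresholds.get(category, DEFAULT_THRESHOLD)
    )


# Example with an explicit mapping instead of the real environment.
thresholds = load_thresholds({"CONTENT_SAFETY_THRESHOLD_HATE": "2"})
print(thresholds)
print(flag_categories({"HATE": 2, "VIOLENCE": 2, "SEXUAL": 0}, thresholds))
```

Lowering a threshold makes filtering stricter for that category only, which is the main reason to configure them separately.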

## Matching .NET Structure

This Python implementation mirrors the .NET Content Safety structure:
- ✅ Same API endpoints (`/safety/scan-text`, `/safety/scan-image`)
- ✅ Same response format (`isSafe`, `highestSeverity`, `categorySeverities`)
- ✅ Same severity scale (0-6)
- ✅ Automatic filtering in chat API
- ✅ Configurable thresholds and actions

## Next Steps
- Try the **Memory Demo** (04-LongRunningMemory-Demo.ipynb) to learn about session management
- Explore the **Single Agent Demo** (01-SingleAgent-Demo.ipynb) for basic agent interaction
- Check out the **Multiple Agents Demo** (02-MultipleAgents-Demo.ipynb) for group chat