# Browser-Use PII Masking with AgentCore Tutorial

This tutorial demonstrates how **browser-use** integrates with **Amazon Bedrock AgentCore Browser Tool** to detect, mask, and securely handle Personally Identifiable Information (PII) during web automation tasks.

## What You'll Learn

1. **PII Detection**: How browser-use identifies different types of PII in web forms
2. **PII Masking**: Techniques for masking PII within AgentCore's isolated browser sessions
3. **Credential Security**: Secure handling of login credentials within AgentCore sessions
4. **Compliance Validation**: Checking PII handling against HIPAA, PCI-DSS, and GDPR requirements
5. **Session Isolation**: Leveraging AgentCore's micro-VM isolation for sensitive data

## Prerequisites

- Completed Tutorial 1: Browser-Use AgentCore Secure Connection
- Python 3.12+
- AWS credentials configured for AgentCore and Bedrock
- Required packages: `browser-use`, `bedrock-agentcore`, `langchain-aws`
- Amazon Bedrock model access (Claude models)

## Architecture Overview

```
Browser-Use Agent → PII Detection → AgentCore Micro-VM → Masked Data Processing
                         ↓                    ↓                    ↓
                  Pattern Matching    Session Isolation    Compliance Validation
                         ↓                    ↓                    ↓
                  Credential Security  Live View Monitoring  Audit Trail
```

## 1. Environment Setup and Imports

Import all required dependencies for PII detection and masking with browser-use and AgentCore.

In [None]:
# Core imports for browser-use and AgentCore integration
import asyncio
import logging
import os
import json
from datetime import datetime
from typing import Dict, Optional, Any, List

# AgentCore Browser Client - Real implementation
from bedrock_agentcore.tools.browser_client import BrowserClient

# Browser-Use framework - Real implementation
from browser_use import Agent
from browser_use.browser.session import BrowserSession

# Our PII masking and credential handling utilities
from tools.browseruse_pii_masking import (
    BrowserUsePIIMasking,
    BrowserUsePIIValidator,
    analyze_browser_page_pii,
    mask_browser_form_data,
    validate_browser_pii_handling
)
from tools.browseruse_credential_handling import (
    BrowserUseCredentialHandler,
    CredentialType,
    CredentialSecurityLevel,
    CredentialScope
)
from tools.browseruse_sensitive_data_handler import (
    BrowserUseSensitiveDataHandler,
    PIIType,
    ComplianceFramework,
    DataClassification
)
from tools.browseruse_agentcore_session_manager import (
    BrowserUseAgentCoreSessionManager,
    SessionConfig
)

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

print("✅ All imports successful - Ready for PII masking and credential security!")
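
Because logging is set to `INFO` above, any PII that reaches a log message would be written in clear text. One way to guard against that is a logging filter that redacts known patterns before a record is emitted. The sketch below is an assumption rather than part of the tutorial's `tools` package: `PIIRedactingFilter` and its three regular expressions are hypothetical and deliberately narrow.

```python
import logging
import re

# Illustrative patterns only; a real deployment needs a broader, tested set.
# Order matters: the SSN pattern runs before the more general card pattern.
_REDACTION_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
]


class PIIRedactingFilter(logging.Filter):
    """Hypothetical filter that rewrites a log message before it is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() applies %-style arguments, so PII passed as args is covered too.
        message = record.getMessage()
        for pattern, placeholder in _REDACTION_PATTERNS:
            message = pattern.sub(placeholder, message)
        record.msg = message
        record.args = None
        return True  # keep the (now redacted) record
```

Attach the filter to handlers rather than to a single logger, for example `for handler in logging.getLogger().handlers: handler.addFilter(PIIRedactingFilter())`, because logger-level filters are not applied to records that propagate up from child loggers.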

## 2. Amazon Bedrock LLM Configuration

Configure the LLM through **Amazon Bedrock only**, with no third-party APIs.

In [None]:
# Amazon Bedrock only: no third-party APIs
from langchain_aws import ChatBedrock

BEDROCK_REGION = "us-east-1"

# Candidate models in order of preference (Claude 3.5 Sonnet is the most capable for PII detection)
candidate_models = [
    ("Claude 3.5 Sonnet", "anthropic.claude-3-5-sonnet-20241022-v2:0"),
    ("Claude 3 Sonnet", "anthropic.claude-3-sonnet-20240229-v1:0"),
    ("Claude 3 Haiku", "anthropic.claude-3-haiku-20240307-v1:0"),
]

llm_model = None
for model_name, model_id in candidate_models:
    try:
        candidate = ChatBedrock(model_id=model_id, region_name=BEDROCK_REGION)
        # Constructing the client does not call the model, so send one tiny
        # request to confirm this account can actually invoke it.
        candidate.invoke("ping")
    except Exception as error:
        logger.warning("Model %s unavailable, trying the next one: %s", model_id, error)
        continue
    llm_model = candidate
    print(f"✅ Using Amazon Bedrock {model_name} for PII detection")
    break

if llm_model is None:
    raise RuntimeError("No candidate Claude model could be invoked; check Bedrock model access")

print("✅ LLM configured via Amazon Bedrock")
print("🔒 Using secure AWS infrastructure - no third-party API keys needed")

## 3. PII Detection Configuration

Configure PII detection with compliance frameworks for different types of sensitive data.

In [None]:
# Configure PII detection with multiple compliance frameworks
compliance_frameworks = [
    ComplianceFramework.HIPAA,    # Healthcare data
    ComplianceFramework.PCI_DSS,  # Payment card data
    ComplianceFramework.GDPR      # General data protection
]

# Initialize PII masking with comprehensive detection
pii_masking = BrowserUsePIIMasking(
    compliance_frameworks=compliance_frameworks,
    enable_screenshot_analysis=True,
    enable_dom_analysis=True
)

# Initialize PII validator
pii_validator = BrowserUsePIIValidator(pii_masking)

print("✅ PII Detection configured with compliance frameworks:")
for framework in compliance_frameworks:
    print(f"   - {framework.value.upper()}")
print("🔍 Ready to detect: SSN, Credit Cards, Email, Phone, Medical Records, and more")
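
The detector is described as recognizing SSNs, credit cards, emails, and more, and the architecture diagram names pattern matching as an early stage. How `BrowserUsePIIMasking` implements this is not shown, so the following is only a minimal sketch of the general technique: regular expressions propose candidates, and a Luhn checksum filters out card-number false positives. `Detection`, `PATTERNS`, `luhn_valid`, and `detect_pii` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    pii_type: str
    value: str


# Deliberately small, illustrative pattern set (assumption, not the tutorial's rules).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "credit_card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def detect_pii(text: str) -> list[Detection]:
    """Scan free text and return the PII candidates that survive validation."""
    found = []
    for pii_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group()
            if pii_type == "credit_card" and not luhn_valid(value):
                continue  # digit run that is not a plausible card number
            found.append(Detection(pii_type, value))
    return found
```

One consequence worth knowing: the sample card number used in this tutorial's financial form, `4532-1234-5678-9012`, does not pass the Luhn check, so a checksum-gated detector like this one would skip it. Common test numbers such as `4111 1111 1111 1111` do pass.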

## 4. AgentCore Session Configuration for PII Handling

Configure AgentCore Browser Tool with enhanced security for sensitive data processing.

In [None]:
# Configure AgentCore session for maximum security with PII handling
agentcore_config = SessionConfig(
    region='us-east-1',
    session_timeout=900,  # 15 minutes for complex PII processing
    enable_live_view=True,  # Monitor PII handling in real-time
    enable_session_replay=True,  # Full audit trail for compliance
    isolation_level="micro-vm",  # Maximum isolation for sensitive data
    compliance_mode="enterprise",  # Enterprise-grade security
    enable_pii_detection=True,  # Enable built-in PII detection
    enable_data_masking=True,   # Enable automatic data masking
    audit_level="comprehensive"  # Comprehensive audit logging
)

# Initialize session manager
session_manager = BrowserUseAgentCoreSessionManager(agentcore_config)

print("✅ AgentCore configured for secure PII handling:")
print(f"   - Micro-VM isolation: {agentcore_config.isolation_level}")
print(f"   - Live view monitoring: {agentcore_config.enable_live_view}")
print(f"   - Session replay: {agentcore_config.enable_session_replay}")
print(f"   - Compliance mode: {agentcore_config.compliance_mode}")
print("🔒 Maximum security enabled for sensitive data processing")

## 5. Demonstration: PII Detection in Different Form Types

Let's demonstrate how browser-use detects different types of PII in various web forms.

In [None]:
# Sample form data representing different scenarios
sample_forms = {
    "healthcare_form": {
        "form_id": "patient-intake",
        "form_action": "/submit-patient-info",
        "form_method": "POST",
        "fields": [
            {
                "id": "patient-ssn",
                "name": "social_security_number",
                "type": "text",
                "value": "123-45-6789",
                "xpath": "//input[@name='social_security_number']"
            },
            {
                "id": "patient-dob",
                "name": "date_of_birth",
                "type": "date",
                "value": "1985-03-15",
                "xpath": "//input[@name='date_of_birth']"
            },
            {
                "id": "patient-email",
                "name": "email_address",
                "type": "email",
                "value": "john.doe@email.com",
                "xpath": "//input[@name='email_address']"
            },
            {
                "id": "medical-record",
                "name": "medical_record_number",
                "type": "text",
                "value": "MRN-789456123",
                "xpath": "//input[@name='medical_record_number']"
            }
        ]
    },
    "financial_form": {
        "form_id": "payment-form",
        "form_action": "/process-payment",
        "form_method": "POST",
        "fields": [
            {
                "id": "credit-card",
                "name": "card_number",
                "type": "text",
                "value": "4532-1234-5678-9012",
                "xpath": "//input[@name='card_number']"
            },
            {
                "id": "cvv",
                "name": "security_code",
                "type": "text",
                "value": "123",
                "xpath": "//input[@name='security_code']"
            },
            {
                "id": "bank-account",
                "name": "account_number",
                "type": "text",
                "value": "123456789012",
                "xpath": "//input[@name='account_number']"
            }
        ]
    }
}

print("📋 Sample forms prepared for PII detection demonstration:")
print(f"   - Healthcare form with {len(sample_forms['healthcare_form']['fields'])} fields")
print(f"   - Financial form with {len(sample_forms['financial_form']['fields'])} fields")
print("🔍 Ready to analyze for PII patterns")

## 6. PII Detection Analysis

Analyze the sample forms to detect different types of PII and understand the detection patterns.

In [None]:
async def analyze_healthcare_form_pii():
    """Analyze healthcare form for PII detection."""
    print("🏥 Analyzing Healthcare Form for PII...")
    print("=" * 50)
    
    # Create page content structure for analysis
    healthcare_page_content = {
        "forms": [sample_forms["healthcare_form"]],
        "text_content": "Patient intake form for medical records and healthcare information"
    }
    
    # Analyze for PII
    analysis_results = await pii_masking.analyze_page_for_pii(healthcare_page_content)
    
    # Display results
    combined_results = analysis_results.get('combined_results', {})
    dom_analysis = analysis_results.get('dom_analysis', {})
    
    print("📊 Analysis Results:")
    print(f"   - Total PII items detected: {combined_results.get('total_pii_count', 0)}")
    print(f"   - Highest classification: {combined_results.get('highest_classification', 'N/A')}")
    print(f"   - Requires secure handling: {combined_results.get('requires_secure_handling', False)}")
    
    # Show detailed form analysis
    for form_analysis in dom_analysis.get('forms_analysis', []):
        print(f"\n📝 Form: {form_analysis.form_id}")
        print(f"   - PII elements found: {len(form_analysis.elements_with_pii)}")
        print(f"   - Classification level: {form_analysis.highest_classification}")
        
        for element in form_analysis.elements_with_pii:
            pii_types = [d.pii_type.value for d in element.pii_detections]
            print(f"   - Field '{element.element_name}': {', '.join(pii_types)}")
            # Raw value shown for demonstration only; never print unmasked PII from real data
            print(f"     Original: {element.element_value}")
            print(f"     Masked: {element.masked_value}")
        
        if form_analysis.compliance_violations:
            print(f"   ⚠️  Compliance violations: {len(form_analysis.compliance_violations)}")
            for violation in form_analysis.compliance_violations:
                print(f"      - {violation['framework']}: {violation['pii_type']} (confidence: {violation['confidence']:.2f})")
    
    return analysis_results

# Run the healthcare form analysis
healthcare_analysis = await analyze_healthcare_form_pii()

In [None]:
async def analyze_financial_form_pii():
    """Analyze financial form for PII detection."""
    print("\n💳 Analyzing Financial Form for PII...")
    print("=" * 50)
    
    # Create page content structure for analysis
    financial_page_content = {
        "forms": [sample_forms["financial_form"]],
        "text_content": "Payment processing form for credit card and banking information"
    }
    
    # Analyze for PII
    analysis_results = await pii_masking.analyze_page_for_pii(financial_page_content)
    
    # Display results
    combined_results = analysis_results.get('combined_results', {})
    dom_analysis = analysis_results.get('dom_analysis', {})
    
    print("📊 Analysis Results:")
    print(f"   - Total PII items detected: {combined_results.get('total_pii_count', 0)}")
    print(f"   - Highest classification: {combined_results.get('highest_classification', 'N/A')}")
    print(f"   - Requires secure handling: {combined_results.get('requires_secure_handling', False)}")
    
    # Show detailed form analysis
    for form_analysis in dom_analysis.get('forms_analysis', []):
        print(f"\n💰 Form: {form_analysis.form_id}")
        print(f"   - PII elements found: {len(form_analysis.elements_with_pii)}")
        print(f"   - Classification level: {form_analysis.highest_classification}")
        
        for element in form_analysis.elements_with_pii:
            pii_types = [d.pii_type.value for d in element.pii_detections]
            print(f"   - Field '{element.element_name}': {', '.join(pii_types)}")
            # Raw value shown for demonstration only; never print unmasked PII from real data
            print(f"     Original: {element.element_value}")
            print(f"     Masked: {element.masked_value}")
        
        if form_analysis.compliance_violations:
            print(f"   ⚠️  Compliance violations: {len(form_analysis.compliance_violations)}")
            for violation in form_analysis.compliance_violations:
                print(f"      - {violation['framework']}: {violation['pii_type']} (confidence: {violation['confidence']:.2f})")
        
        print("\n💡 Masking Recommendations:")
        for recommendation in form_analysis.masking_recommendations:
            print(f"   - {recommendation}")
    
    return analysis_results

# Run the financial form analysis
financial_analysis = await analyze_financial_form_pii()
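
The `Masked:` values printed above come from the tutorial's helper classes, whose masking rules are not shown. As a rough illustration of what format-preserving masking can look like, the sketch below keeps separators and the last few characters visible. Both function names are hypothetical.

```python
def mask_keep_last(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask every alphanumeric character except the last `visible` ones.

    Separators such as '-' or ' ' are kept so the field's format stays recognizable.
    """
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - visible, 0)
    masked = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            masked.append(mask_char)
            to_mask -= 1
        else:
            masked.append(ch)
    return "".join(masked)


def mask_email(value: str, mask_char: str = "*") -> str:
    """Keep the first character of the local part and the whole domain."""
    local, separator, domain = value.partition("@")
    if not separator or not local:
        return mask_char * len(value)  # not an email: mask everything
    return local[0] + mask_char * (len(local) - 1) + separator + domain
```

With the sample data from Section 5, `mask_keep_last("123-45-6789")` returns `***-**-6789`, `mask_keep_last("4532-1234-5678-9012")` returns `****-****-****-9012`, and `mask_email("john.doe@email.com")` returns `j*******@email.com`. For a CVV, passing `visible=0` masks the whole value.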

## 7. Credential Security Demonstration

Demonstrate secure credential handling within AgentCore's isolated browser sessions.

In [None]:
# The other credential classes were imported in Section 1; only CredentialPolicy is new here
from tools.browseruse_credential_handling import CredentialPolicy

# Configure credential policies for different types
credential_policies = {
    CredentialType.PASSWORD: CredentialPolicy(
        credential_type=CredentialType.PASSWORD,
        security_level=CredentialSecurityLevel.HIGH,
        scope=CredentialScope.SESSION,
        max_age_minutes=30,
        require_encryption=True,
        require_audit=True,
        allow_caching=False,
        isolation_required=True
    ),
    CredentialType.API_KEY: CredentialPolicy(
        credential_type=CredentialType.API_KEY,
        security_level=CredentialSecurityLevel.CRITICAL,
        scope=CredentialScope.SESSION,
        max_age_minutes=15,
        require_encryption=True,
        require_audit=True,
        allow_caching=False,
        require_mfa=True,
        isolation_required=True
    )
}

# Initialize credential handler
credential_handler = BrowserUseCredentialHandler(
    session_manager=session_manager,
    credential_policies=credential_policies
)

print("🔐 Credential handler initialized with secure policies:")
for cred_type, policy in credential_policies.items():
    print(f"   - {cred_type.value}: {policy.security_level.value} security, {policy.scope.value} scope")
print("🛡️  All credentials will be isolated within AgentCore micro-VM")

In [None]:
async def demonstrate_secure_credential_handling():
    """Demonstrate secure credential input and handling."""
    print("🔑 Demonstrating Secure Credential Handling...")
    print("=" * 50)
    
    # Simulated credentials for the demo (in a real scenario, these would be entered securely, never hard-coded)
    demo_credentials = {
        "login_password": {
            "type": CredentialType.PASSWORD,
            "value": "SecurePassword123!",
            "description": "User login password"
        },
        "api_key": {
            "type": CredentialType.API_KEY,
            "value": "sk-1234567890abcdef",
            "description": "Service API key"
        }
    }
    
    # Store credentials securely
    stored_credentials = {}
    
    for cred_name, cred_info in demo_credentials.items():
        print(f"\n🔒 Storing credential: {cred_name}")
        
        # Store credential with security measures
        credential_id = await credential_handler.store_credential(
            credential_type=cred_info["type"],
            credential_value=cred_info["value"],
            description=cred_info["description"],
            tags=["demo", "tutorial"]
        )
        
        stored_credentials[cred_name] = credential_id
        
        print(f"   ✅ Stored with ID: {credential_id}")
        print(f"   🔐 Security level: {credential_policies[cred_info['type']].security_level.value}")
        print(f"   🏠 Scope: {credential_policies[cred_info['type']].scope.value}")
        print(f"   ⏰ Max age: {credential_policies[cred_info['type']].max_age_minutes} minutes")
    
    return stored_credentials

# Run credential storage demonstration
stored_creds = await demonstrate_secure_credential_handling()
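
To make the `max_age_minutes` setting concrete, here is a minimal in-memory sketch of expiry enforcement. It is not how `BrowserUseCredentialHandler` works internally (that code is not shown), it performs no encryption, and `ExpiringCredentialStore` is a hypothetical name. The clock is injectable so that expiry can be exercised without waiting.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class _Entry:
    value: str
    expires_at: float


class ExpiringCredentialStore:
    """Hypothetical store that refuses to return credentials past their max age."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entries: dict[str, _Entry] = {}

    def store(self, value: str, max_age_minutes: float) -> str:
        """Save a credential and return an opaque, random identifier for it."""
        credential_id = secrets.token_hex(8)
        expires_at = self._clock() + max_age_minutes * 60
        self._entries[credential_id] = _Entry(value, expires_at)
        return credential_id

    def get(self, credential_id: str) -> Optional[str]:
        """Return the credential, or None if it is unknown or has expired."""
        entry = self._entries.get(credential_id)
        if entry is None:
            return None
        if self._clock() >= entry.expires_at:
            del self._entries[credential_id]  # drop expired secrets eagerly
            return None
        return entry.value

    def delete(self, credential_id: str) -> bool:
        """Remove a credential; return True if something was removed."""
        return self._entries.pop(credential_id, None) is not None
```

Returning an opaque identifier instead of the value mirrors the way the cell above only ever prints `credential_id`, so the secret itself never needs to appear in notebook output.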

## 8. Browser-Use Agent with PII Masking Integration

Create a browser-use Agent that integrates with AgentCore and automatically handles PII masking.

In [None]:
async def create_pii_aware_browser_agent():
    """Create a browser-use Agent with PII masking capabilities."""
    print("🤖 Creating PII-Aware Browser-Use Agent...")
    print("=" * 50)
    
    # Create AgentCore browser session
    print("🔧 Setting up AgentCore browser session...")
    agentcore_session = await session_manager.create_secure_session(
        session_id="pii-masking-demo",
        enable_pii_detection=True,
        enable_credential_isolation=True
    )
    
    # Get WebSocket connection details
    ws_url, headers = session_manager.get_connection_details(agentcore_session.session_id)
    
    print(f"✅ AgentCore session created: {agentcore_session.session_id}")
    print(f"🔗 WebSocket URL: {ws_url[:50]}...")
    print("🔒 Micro-VM isolation: ENABLED")
    print("🔍 PII detection: ENABLED")
    
    # Create browser-use session with AgentCore
    browser_session = BrowserSession(
        cdp_url=ws_url,
        cdp_headers=headers
    )
    
    # Create browser-use Agent with PII awareness
    pii_aware_agent = Agent(
        task="Handle sensitive information securely with automatic PII detection and masking",
        llm=llm_model,
        browser_session=browser_session,
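        # PII masking hooks; these keyword names follow this tutorial's setup,
        # so verify them against the browser-use version you have installed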
        pre_action_callback=pii_masking.execute_pre_action_callbacks,
        post_action_callback=pii_masking.execute_post_action_callbacks
    )
    
    print("🤖 Browser-Use Agent created with PII masking integration")
    print(f"🧠 LLM Model: {llm_model.model_id}")
    print("🔒 Session isolation: AgentCore Micro-VM")
    
    # Get live view URL for monitoring
    live_view_url = session_manager.get_live_view_url(agentcore_session.session_id)
    print(f"👁️  Live View URL: {live_view_url}")
    
    return pii_aware_agent, agentcore_session

# Create the PII-aware agent
agent, session = await create_pii_aware_browser_agent()
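
The `pre_action_callback` and `post_action_callback` hooks are where masking meets the agent loop. Their exact contract depends on the helper module and on the installed browser-use version, so the following is a generic sketch of the pattern, not that API: the real value still reaches the browser action, while the audit entry and the text handed back to the caller are redacted. `with_pii_hooks`, `redact`, and `fill_field` are hypothetical.

```python
import asyncio
import re
from typing import Any, Awaitable, Callable

_SSN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def redact(text: str) -> str:
    """Mask SSNs, keeping only the last four digits (illustrative rule)."""
    return _SSN.sub(r"***-**-\1", text)


def with_pii_hooks(
    action: Callable[..., Awaitable[str]],
    audit_log: list[str],
) -> Callable[..., Awaitable[str]]:
    """Wrap an async action with a redacting pre-step and post-step."""

    async def wrapped(**params: Any) -> str:
        # Pre-action hook: record what is about to run, with values redacted.
        audit_log.append(redact(f"{action.__name__}({params})"))
        # The action itself receives the real value; it is needed to fill the form.
        result = await action(**params)
        # Post-action hook: redact before the result leaves this wrapper,
        # for example before it is added to the LLM's context.
        return redact(result)

    return wrapped


async def fill_field(name: str, value: str) -> str:
    """Stand-in for a real browser action."""
    await asyncio.sleep(0)  # simulate I/O
    return f"filled {name} with {value}"
```

In a notebook cell, `await with_pii_hooks(fill_field, audit_log)(name="ssn", value="123-45-6789")` returns `filled ssn with ***-**-6789`, and the audit log holds the same call with the SSN masked.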

## 9. PII Validation and Compliance Testing

Validate that PII is properly detected, masked, and handled according to compliance requirements.

In [None]:
async def validate_pii_compliance():
    """Validate PII handling compliance across different scenarios."""
    print("✅ Running PII Compliance Validation...")
    print("=" * 50)
    
    validation_results = []
    
    # Test 1: Healthcare form validation
    print("\n🏥 Test 1: Healthcare Form Compliance")
    healthcare_page_content = {
        "forms": [sample_forms["healthcare_form"]],
        "text_content": "Patient intake form for medical records"
    }
    
    healthcare_validation = await pii_validator.validate_page_pii_handling(
        healthcare_page_content, expected_masking=True
    )
    
    validation_results.append({
        "test_name": "Healthcare Form",
        "passed": healthcare_validation['validation_passed'],
        "issues": len(healthcare_validation['issues_found']),
        "pii_count": healthcare_validation['pii_analysis']['combined_results']['total_pii_count']
    })
    
    print(f"   Result: {'✅ PASSED' if healthcare_validation['validation_passed'] else '❌ FAILED'}")
    print(f"   PII detected: {healthcare_validation['pii_analysis']['combined_results']['total_pii_count']}")
    print(f"   Issues found: {len(healthcare_validation['issues_found'])}")
    
    # Test 2: Financial form validation
    print("\n💳 Test 2: Financial Form Compliance")
    financial_page_content = {
        "forms": [sample_forms["financial_form"]],
        "text_content": "Payment processing form"
    }
    
    financial_validation = await pii_validator.validate_page_pii_handling(
        financial_page_content, expected_masking=True
    )
    
    validation_results.append({
        "test_name": "Financial Form",
        "passed": financial_validation['validation_passed'],
        "issues": len(financial_validation['issues_found']),
        "pii_count": financial_validation['pii_analysis']['combined_results']['total_pii_count']
    })
    
    print(f"   Result: {'✅ PASSED' if financial_validation['validation_passed'] else '❌ FAILED'}")
    print(f"   PII detected: {financial_validation['pii_analysis']['combined_results']['total_pii_count']}")
    print(f"   Issues found: {len(financial_validation['issues_found'])}")
    
    # Summary
    print("\n📊 Validation Summary:")
    passed_tests = sum(1 for result in validation_results if result['passed'])
    total_tests = len(validation_results)
    total_pii = sum(result['pii_count'] for result in validation_results)
    total_issues = sum(result['issues'] for result in validation_results)
    
    print(f"   Tests passed: {passed_tests}/{total_tests}")
    print(f"   Total PII detected: {total_pii}")
    print(f"   Total issues found: {total_issues}")
    
    if passed_tests == total_tests:
        print("\n🎉 All compliance tests PASSED!")
        print("✅ PII detection and masking working correctly")
        print("✅ AgentCore isolation providing secure handling")
    else:
        print("\n⚠️  Some compliance tests FAILED")
        print("❌ Review PII masking configuration")
        print("❌ Check compliance framework settings")
    
    return validation_results

# Run compliance validation
compliance_results = await validate_pii_compliance()
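
Passing `expected_masking=True` implies a check that masked values no longer expose the original data. What `validate_page_pii_handling` actually checks is not shown; one simple check, sketched here under that assumption, flags fields that were left unchanged or that still reveal a long run of the original digits. `find_masking_issues` and its `min_leak` threshold are hypothetical.

```python
import re


def find_masking_issues(fields: list[dict[str, str]], min_leak: int = 5) -> list[str]:
    """Return one human-readable issue per field whose masked value still leaks data.

    Each field is a dict with 'name', 'original', and 'masked' keys.
    """
    issues = []
    for field in fields:
        original, masked = field["original"], field["masked"]
        if masked == original:
            issues.append(f"{field['name']}: value was not masked")
            continue
        # Compare digits only, so separators do not hide a leak.
        original_digits = re.sub(r"\D", "", original)
        masked_digits = re.sub(r"\D", "", masked)
        if len(masked_digits) >= min_leak and masked_digits in original_digits:
            issues.append(f"{field['name']}: {len(masked_digits)} digits still visible")
    return issues
```

With the default threshold, a value masked to its last four digits passes, while an account number that keeps eight digits visible is reported. Tighten or loosen `min_leak` to match the masking policy being validated.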

## 10. Session Cleanup and Security

Demonstrate proper session cleanup and credential protection patterns.

In [None]:
async def demonstrate_secure_cleanup():
    """Demonstrate secure session cleanup and credential protection."""
    print("🧹 Demonstrating Secure Session Cleanup...")
    print("=" * 50)
    
    # Clean up stored credentials
    print("\n🔐 Cleaning up stored credentials...")
    for cred_name, cred_id in stored_creds.items():
        print(f"   Removing credential: {cred_name} ({cred_id})")
        await credential_handler.delete_credential(cred_id)
        print(f"   ✅ Credential {cred_name} securely deleted")
    
    # Clean up AgentCore session
    print("\n🖥️  Cleaning up AgentCore session...")
    session_id = session.session_id
    print(f"   Terminating session: {session_id}")
    
    # Get session summary before cleanup
    session_summary = await session_manager.get_session_summary(session_id)
    print(f"   Session duration: {session_summary.get('duration', 'N/A')}")
    print(f"   Operations performed: {session_summary.get('operations_count', 0)}")
    print(f"   PII detected: {session_summary.get('pii_detected', False)}")
    
    # Terminate session with secure cleanup
    await session_manager.cleanup_session(
        session_id,
        secure_wipe=True,
        preserve_audit_trail=True
    )
    
    print(f"   ✅ Session {session_id} securely terminated")
    print("   🔒 Micro-VM destroyed and wiped")
    print("   📋 Audit trail preserved for compliance")
    
    # Clear PII validation history
    print("\n🧹 Clearing PII validation history...")
    validation_history_count = len(pii_validator.get_validation_history())
    pii_validator.clear_validation_history()
    print(f"   ✅ Cleared {validation_history_count} validation records")
    
    print("\n🎉 Secure cleanup completed successfully!")
    print("✅ All sensitive data securely removed")
    print("✅ Session isolation boundaries maintained")
    print("✅ Compliance audit trail preserved")

# Run secure cleanup
await demonstrate_secure_cleanup()
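
The cleanup cell only runs if every earlier cell succeeded. In production code, an async context manager is one way to guarantee that a session is torn down even when a step raises. The sketch below uses a stand-in `FakeSessionManager` so that it runs without AWS access; its two methods mirror names used in this tutorial, but their real signatures are assumptions.

```python
from contextlib import asynccontextmanager


class FakeSessionManager:
    """Stand-in for the tutorial's session manager (hypothetical, in-memory only)."""

    def __init__(self) -> None:
        self.active: set[str] = set()

    async def create_secure_session(self, session_id: str) -> str:
        self.active.add(session_id)
        return session_id

    async def cleanup_session(self, session_id: str) -> None:
        self.active.discard(session_id)


@asynccontextmanager
async def secure_session(manager, session_id: str):
    """Create a session and always clean it up, even if the body raises."""
    session = await manager.create_secure_session(session_id)
    try:
        yield session
    finally:
        await manager.cleanup_session(session_id)
```

With the real session manager, the body of `async with secure_session(session_manager, "pii-masking-demo") as session:` would hold the agent run, and cleanup would no longer depend on reaching the final cell.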

## Summary and Next Steps

🎉 **Congratulations!** You've successfully completed the Browser-Use PII Masking with AgentCore tutorial.

### What You've Learned

1. **PII Detection**: How browser-use identifies different types of PII (SSN, credit cards, emails, etc.)
2. **PII Masking**: Techniques for masking sensitive data within AgentCore's secure environment
3. **Credential Security**: Secure handling of login credentials with isolation and encryption
4. **Compliance Validation**: Validating PII handling against HIPAA, PCI-DSS, and GDPR requirements
5. **Session Isolation**: Leveraging AgentCore's micro-VM isolation for maximum security

### Key Security Features Demonstrated

- ✅ **Micro-VM Isolation**: Sensitive operations run inside a dedicated, isolated browser session
- ✅ **Real-time PII Detection**: Automatic identification of sensitive data patterns
- ✅ **Intelligent Masking**: Context-aware masking based on data classification
- ✅ **Credential Protection**: Secure storage and handling of authentication data
- ✅ **Compliance Validation**: Automated compliance checking for multiple frameworks
- ✅ **Audit Trail**: Comprehensive logging for regulatory requirements
- ✅ **Secure Cleanup**: Proper session termination and data wiping

### Next Steps

1. **Tutorial 3**: Advanced compliance and audit trails with browser-use
2. **Tutorial 4**: Production deployment patterns with AgentCore
3. **Real-world Examples**: Healthcare, financial, and legal use cases

### Production Considerations

- Configure appropriate compliance frameworks for your use case
- Implement proper credential management policies
- Set up monitoring and alerting for PII detection events
- Establish audit trail retention policies
- Test session isolation boundaries regularly

🔒 **Remember**: AgentCore's enterprise-grade security features provide the foundation for secure sensitive data handling, while browser-use provides the intelligent automation capabilities.