# Escape Hatches Deep Dive: Mastering Uncertainty in AI

This notebook explores the critical technique of **Escape Hatches** - how to build AI systems that gracefully handle uncertainty instead of hallucinating confident but wrong answers.

## Why Escape Hatches Matter

- **Prevent Hallucinations**: AI admits when it doesn't know instead of making things up
- **Build Trust**: Users know when to rely on the AI vs. seek human help
- **Risk Management**: Critical for high-stakes applications (medical, financial, legal)
- **Cost Savings**: Avoid expensive mistakes from overconfident AI

## What We'll Cover

1. Uncertainty detection patterns
2. Confidence level calibration
3. Domain-specific escape strategies
4. Graceful degradation techniques
5. Real-world applications and case studies

In [None]:
# Setup and imports
import os
import sys
from dotenv import load_dotenv
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import HTML, display

# Add parent directory to path
sys.path.append('..')

# Load environment variables
load_dotenv()

import dspy
from src.techniques.escape_hatches import (
    EscapeHatchResponder,
    UncertaintyDetector,
    GracefulDegradation,
    UncertaintyLevel,
    UncertaintyAnalysis
)

# Configure DSpy
api_key = os.getenv('OPENAI_API_KEY')
if not api_key:
    print("⚠️ Please set OPENAI_API_KEY in your .env file")
else:
    lm = dspy.LM(model="gpt-4o-mini", api_key=api_key, max_tokens=1500)
    dspy.settings.configure(lm=lm)
    print("✅ DSpy configured with OpenAI")

## Understanding Uncertainty Types

Different types of uncertainty require different escape strategies:

In [None]:
# Test different types of uncertain questions
uncertainty_examples = [
    {
        "question": "What will Bitcoin's price be exactly one year from today?",
        "type": "Future Prediction",
        "expected_uncertainty": "HIGH",
        "reason": "Unpredictable financial markets"
    },
    {
        "question": "What's the capital of France?", 
        "type": "Factual Knowledge",
        "expected_uncertainty": "NONE",
        "reason": "Well-established fact"
    },
    {
        "question": "How do neural networks work?",
        "type": "Complex Explanation",
        "expected_uncertainty": "LOW-MEDIUM",
        "reason": "Technical topic with established principles"
    },
    {
        "question": "What's the best treatment for my specific medical condition?",
        "type": "Personal/Medical Advice",
        "expected_uncertainty": "HIGH",
        "reason": "Requires personalized medical assessment"
    },
    {
        "question": "What happened on Mars last Tuesday?",
        "type": "Impossible/No Information",
        "expected_uncertainty": "HIGH",
        "reason": "No way to have this information"
    }
]

print("🎯 Understanding Different Types of Uncertainty")
print("="*60)

for example in uncertainty_examples:
    print(f"\n**{example['type']}**")
    print(f"Question: {example['question']}")
    print(f"Expected Uncertainty: {example['expected_uncertainty']}")
    print(f"Reason: {example['reason']}")

## Testing Uncertainty Detection

Let's see how our escape hatch system handles these different scenarios:

In [None]:
# Test escape hatches with various uncertainty levels
escaper = EscapeHatchResponder()
results = []

print("🧪 Testing Escape Hatch Performance")
print("="*50)

for example in uncertainty_examples:
    print(f"\n**Testing: {example['type']}**")
    print(f"Question: {example['question']}")
    
    result = escaper(example['question'])
    
    confidence = result['uncertainty_analysis'].confidence_level
    uncertainty_level = result['uncertainty_analysis'].uncertainty_level
    
    print(f"Confidence: {confidence:.2f}")
    print(f"Uncertainty: {uncertainty_level}")
    print(f"Response Preview: {result['response'][:100]}...")
    
    # Check if prediction matches expectation
    expected_high = example['expected_uncertainty'] == 'HIGH'
    detected_high = uncertainty_level == UncertaintyLevel.HIGH
    correct_prediction = expected_high == detected_high
    
    print(f"Prediction Accuracy: {'✅' if correct_prediction else '❌'}")
    
    results.append({
        'type': example['type'],
        'confidence': confidence,
        'uncertainty_level': uncertainty_level,
        'correct_prediction': correct_prediction
    })

# Calculate overall accuracy
accuracy = sum(1 for r in results if r['correct_prediction']) / len(results)
print(f"\n📊 Overall Prediction Accuracy: {accuracy:.1%}")

## Confidence Calibration Analysis

Let's analyze how well the confidence scores align with actual uncertainty:

In [None]:
# Visualize confidence levels across different question types
question_types = [r['type'] for r in results]
confidences = [r['confidence'] for r in results]
uncertainty_levels = [r['uncertainty_level'].value for r in results]

# Create confidence visualization
plt.figure(figsize=(12, 6))

# Plot 1: Confidence by question type
plt.subplot(1, 2, 1)
plt.bar(range(len(question_types)), confidences, 
        color=['red' if c < 0.5 else 'yellow' if c < 0.8 else 'green' for c in confidences])
plt.xlabel('Question Type')
plt.ylabel('Confidence Level')
plt.title('Confidence by Question Type')
plt.xticks(range(len(question_types)), [t.split()[0] for t in question_types], rotation=45)
plt.ylim(0, 1)

# Add horizontal lines for thresholds
plt.axhline(y=0.5, color='orange', linestyle='--', alpha=0.7, label='Low confidence')
plt.axhline(y=0.8, color='blue', linestyle='--', alpha=0.7, label='High confidence')
plt.legend()

# Plot 2: Uncertainty level distribution
plt.subplot(1, 2, 2)
uncertainty_counts = {}
for level in uncertainty_levels:
    uncertainty_counts[level] = uncertainty_counts.get(level, 0) + 1

plt.bar(uncertainty_counts.keys(), uncertainty_counts.values(), 
        color=['green', 'yellow', 'orange', 'red'][:len(uncertainty_counts)])
plt.xlabel('Uncertainty Level')
plt.ylabel('Count')
plt.title('Distribution of Uncertainty Levels')
plt.xticks(rotation=45)

plt.tight_layout()
plt.show()

print("📈 Confidence Calibration Analysis:")
print(f"Average confidence: {np.mean(confidences):.2f}")
print(f"Confidence range: {np.min(confidences):.2f} - {np.max(confidences):.2f}")
print(f"Standard deviation: {np.std(confidences):.2f}")

## Domain-Specific Escape Strategies

Different domains require different approaches to uncertainty:

In [None]:
# Test domain-specific escape strategies
domain_tests = [
    {
        "domain": "Medical",
        "question": "Should I take aspirin for my headache?",
        "expected_behavior": "Recommend consulting healthcare provider",
        "risk_level": "HIGH"
    },
    {
        "domain": "Financial", 
        "question": "Should I invest my savings in cryptocurrency?",
        "expected_behavior": "Suggest consulting financial advisor",
        "risk_level": "HIGH"
    },
    {
        "domain": "Legal",
        "question": "Can I sue my employer for this workplace issue?",
        "expected_behavior": "Recommend legal consultation",
        "risk_level": "HIGH"
    },
    {
        "domain": "Technical",
        "question": "How do I implement OAuth2 authentication?",
        "expected_behavior": "Provide technical guidance with caveats",
        "risk_level": "MEDIUM"
    },
    {
        "domain": "General Knowledge",
        "question": "What's the weather like in Tokyo right now?",
        "expected_behavior": "Admit lack of real-time data",
        "risk_level": "LOW"
    }
]

print("🏥 Domain-Specific Escape Strategy Testing")
print("="*60)

for test in domain_tests:
    print(f"\n**{test['domain']} Domain** (Risk: {test['risk_level']})")
    print(f"Question: {test['question']}")
    print(f"Expected: {test['expected_behavior']}")
    
    result = escaper(test['question'])
    
    print(f"Confidence: {result['uncertainty_analysis'].confidence_level:.2f}")
    print(f"Response: {result['response'][:200]}...")
    
    # Check for domain-appropriate disclaimers
    response_lower = result['response'].lower()
    
    domain_keywords = {
        'Medical': ['doctor', 'physician', 'healthcare', 'medical professional'],
        'Financial': ['financial advisor', 'professional', 'investment', 'consult'],
        'Legal': ['lawyer', 'attorney', 'legal', 'professional'],
        'Technical': ['documentation', 'official', 'specific'],
        'General Knowledge': ['current', 'real-time', 'up-to-date']
    }
    
    domain_keywords_found = any(keyword in response_lower for keyword in domain_keywords[test['domain']])
    print(f"Domain-appropriate language: {'✅' if domain_keywords_found else '❌'}")

## Graceful Degradation Patterns

When confidence is low, we can still provide partial value:

In [None]:
# Test graceful degradation
degradation_examples = [
    {
        "question": "What's the best programming language for machine learning?",
        "degradation_strategy": "Provide general information with caveats"
    },
    {
        "question": "How much should I charge for freelance work?",
        "degradation_strategy": "Offer framework for decision-making"
    },
    {
        "question": "Will it rain tomorrow in my city?",
        "degradation_strategy": "Suggest reliable information sources"
    }
]

degrader = GracefulDegradation()

print("🌱 Graceful Degradation Examples")
print("="*50)

for example in degradation_examples:
    print(f"\n**Question:** {example['question']}")
    print(f"**Strategy:** {example['degradation_strategy']}")
    
    # Get response with degradation
    result = degrader(example['question'])
    
    print(f"**Confidence:** {result['confidence']:.2f}")
    print(f"**Response:** {result['response'][:300]}...")
    
    # Check for degradation indicators
    degradation_phrases = [
        'depends on', 'consider', 'factors include', 'generally', 
        'typically', 'framework', 'approach', 'guidelines'
    ]
    
    has_degradation = any(phrase in result['response'].lower() for phrase in degradation_phrases)
    print(f"**Graceful degradation detected:** {'✅' if has_degradation else '❌'}")

## Building Custom Uncertainty Thresholds

Different applications need different confidence thresholds:

In [None]:
# Define application-specific thresholds
applications = {
    "Medical Diagnosis": {
        "high_confidence_threshold": 0.95,
        "medium_confidence_threshold": 0.80,
        "reasoning": "Life-critical decisions require very high confidence"
    },
    "Customer Support": {
        "high_confidence_threshold": 0.80,
        "medium_confidence_threshold": 0.60,
        "reasoning": "Balance helpfulness with accuracy"
    },
    "Content Recommendation": {
        "high_confidence_threshold": 0.70,
        "medium_confidence_threshold": 0.50,
        "reasoning": "Low stakes, exploration encouraged"
    },
    "Financial Trading": {
        "high_confidence_threshold": 0.90,
        "medium_confidence_threshold": 0.75,
        "reasoning": "High financial risk requires conservative approach"
    },
    "Educational Content": {
        "high_confidence_threshold": 0.85,
        "medium_confidence_threshold": 0.65,
        "reasoning": "Accuracy important for learning"
    }
}

def categorize_confidence(confidence, app_config):
    if confidence >= app_config['high_confidence_threshold']:
        return "HIGH", "Proceed with confidence"
    elif confidence >= app_config['medium_confidence_threshold']:
        return "MEDIUM", "Proceed with caution and disclaimers"
    else:
        return "LOW", "Use escape hatch or graceful degradation"

# Test with sample confidence levels
test_confidences = [0.95, 0.85, 0.75, 0.65, 0.45, 0.25]

print("🎛️ Application-Specific Confidence Thresholds")
print("="*60)

for app_name, config in applications.items():
    print(f"\n**{app_name}**")
    print(f"High: ≥{config['high_confidence_threshold']:.2f} | Medium: ≥{config['medium_confidence_threshold']:.2f}")
    print(f"Reasoning: {config['reasoning']}")
    
    print("Sample responses:")
    for conf in [0.95, 0.75, 0.45]:
        category, action = categorize_confidence(conf, config)
        print(f"  Confidence {conf:.2f} → {category}: {action}")

# Visualize thresholds
plt.figure(figsize=(12, 8))

app_names = list(applications.keys())
high_thresholds = [applications[app]['high_confidence_threshold'] for app in app_names]
medium_thresholds = [applications[app]['medium_confidence_threshold'] for app in app_names]

x = np.arange(len(app_names))
width = 0.35

plt.bar(x - width/2, high_thresholds, width, label='High Confidence Threshold', alpha=0.8, color='green')
plt.bar(x + width/2, medium_thresholds, width, label='Medium Confidence Threshold', alpha=0.8, color='orange')

plt.xlabel('Application Domain')
plt.ylabel('Confidence Threshold')
plt.title('Confidence Thresholds by Application Domain')
plt.xticks(x, app_names, rotation=45, ha='right')
plt.legend()
plt.ylim(0, 1)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

## Real-World Case Studies

Let's examine how escape hatches perform in realistic scenarios:

In [None]:
# Real-world case studies
case_studies = [
    {
        "title": "Chatbot Hallucination Prevention",
        "scenario": "Customer asks about a product feature that doesn't exist",
        "question": "Does your premium plan include time travel functionality?",
        "without_escape_hatch": "Might hallucinate fake features",
        "with_escape_hatch": "Should admit uncertainty and check with support"
    },
    {
        "title": "Medical Information Request", 
        "scenario": "User asks for specific medical advice",
        "question": "I have chest pain and shortness of breath. What medication should I take?",
        "without_escape_hatch": "Dangerous - might give harmful medical advice",
        "with_escape_hatch": "Should recommend immediate medical attention"
    },
    {
        "title": "Technical Support Edge Case",
        "scenario": "User has a rare technical issue",
        "question": "My app crashes when I use it during a solar eclipse while connected to a VPN in Antarctica",
        "without_escape_hatch": "Might provide irrelevant troubleshooting",
        "with_escape_hatch": "Should acknowledge unusual case and escalate"
    }
]

print("📚 Real-World Case Studies")
print("="*50)

for i, case in enumerate(case_studies, 1):
    print(f"\n**Case Study {i}: {case['title']}**")
    print(f"Scenario: {case['scenario']}")
    print(f"Question: \"{case['question']}\"")
    
    # Test with escape hatch
    result = escaper(case['question'])
    
    print(f"\n**Without Escape Hatch:** {case['without_escape_hatch']}")
    print(f"**With Escape Hatch:** {case['with_escape_hatch']}")
    
    print(f"\n**Actual Result:**")
    print(f"Confidence: {result['uncertainty_analysis'].confidence_level:.2f}")
    print(f"Uncertainty: {result['uncertainty_analysis'].uncertainty_level}")
    print(f"Response: {result['response'][:250]}...")
    
    # Evaluate if escape hatch worked
    response_lower = result['response'].lower()
    escape_indicators = ['unsure', 'uncertain', 'don\'t know', 'recommend', 'suggest', 'consult', 'professional']
    
    escape_triggered = any(indicator in response_lower for indicator in escape_indicators)
    print(f"**Escape Hatch Triggered:** {'✅' if escape_triggered else '❌'}")
    print("-" * 50)

## Best Practices Summary

Based on our testing, here are the key principles for effective escape hatches:

In [None]:
# Best practices checklist
best_practices = {
    "Threshold Setting": {
        "principle": "Set confidence thresholds based on risk level",
        "examples": [
            "Medical: 95%+ confidence required",
            "Customer support: 80%+ confidence", 
            "Content recommendations: 70%+ confidence"
        ]
    },
    "Domain Awareness": {
        "principle": "Use domain-specific escape strategies",
        "examples": [
            "Medical: 'Consult healthcare provider'",
            "Legal: 'Seek legal counsel'",
            "Financial: 'Consult financial advisor'"
        ]
    },
    "Graceful Degradation": {
        "principle": "Provide partial value when possible",
        "examples": [
            "Offer general frameworks instead of specific advice",
            "Provide relevant resources or next steps",
            "Give contextual information with caveats"
        ]
    },
    "Transparency": {
        "principle": "Be explicit about uncertainty and limitations",
        "examples": [
            "'I'm not certain about this'",
            "'Based on limited information'",
            "'This requires expert evaluation'"
        ]
    },
    "Escalation Paths": {
        "principle": "Provide clear next steps when uncertain",
        "examples": [
            "Suggest human expert consultation",
            "Recommend authoritative sources",
            "Offer to escalate to human support"
        ]
    }
}

print("✅ Escape Hatches Best Practices")
print("="*50)

for category, details in best_practices.items():
    print(f"\n**{category}**")
    print(f"Principle: {details['principle']}")
    print("Examples:")
    for example in details['examples']:
        print(f"  • {example}")

# Implementation checklist
print("\n\n🔧 Implementation Checklist")
print("="*30)

checklist_items = [
    "Define confidence thresholds for your application",
    "Identify high-risk domains that need special handling", 
    "Create domain-specific disclaimer templates",
    "Implement graceful degradation strategies",
    "Test with edge cases and adversarial inputs",
    "Monitor confidence calibration in production",
    "Provide clear escalation paths for users",
    "Regular evaluation and threshold adjustment"
]

for i, item in enumerate(checklist_items, 1):
    print(f"{i}. {item}")

## Production Implementation Example

Here's how to implement escape hatches in a production system:

In [None]:
# Production-ready escape hatch implementation
class ProductionEscapeHatch:
    def __init__(self, domain="general", risk_level="medium"):
        self.domain = domain
        self.risk_level = risk_level
        self.confidence_threshold = self._get_threshold()
        self.domain_disclaimers = self._get_disclaimers()
        
    def _get_threshold(self):
        thresholds = {
            "low": 0.6,
            "medium": 0.75, 
            "high": 0.9,
            "critical": 0.95
        }
        return thresholds.get(self.risk_level, 0.75)
    
    def _get_disclaimers(self):
        disclaimers = {
            "medical": "This information is not medical advice. Please consult a healthcare professional.",
            "legal": "This is not legal advice. Please consult with a qualified attorney.",
            "financial": "This is not financial advice. Please consult with a financial advisor.",
            "general": "Please verify this information from authoritative sources."
        }
        return disclaimers.get(self.domain, disclaimers["general"])
    
    def should_escape(self, confidence):
        return confidence < self.confidence_threshold
    
    def format_response(self, response, confidence):
        if self.should_escape(confidence):
            return f"I'm not completely certain about this (confidence: {confidence:.1%}). {self.domain_disclaimers}\n\n{response}"
        else:
            return response

# Example usage
print("🏭 Production Implementation Example")
print("="*40)

# Test different domains and risk levels
test_configs = [
    ("medical", "critical"),
    ("financial", "high"),
    ("general", "medium")
]

for domain, risk in test_configs:
    escape_hatch = ProductionEscapeHatch(domain=domain, risk_level=risk)
    
    print(f"\n**{domain.title()} Domain (Risk: {risk})**")
    print(f"Confidence threshold: {escape_hatch.confidence_threshold:.1%}")
    
    # Test with different confidence levels
    for conf in [0.95, 0.80, 0.60]:
        should_escape = escape_hatch.should_escape(conf)
        formatted = escape_hatch.format_response("Sample response here.", conf)
        
        print(f"\nConfidence {conf:.1%}: {'🚪 ESCAPE' if should_escape else '✅ PROCEED'}")
        if should_escape:
            print(f"Response: {formatted[:100]}...")

print("\n💡 This production pattern ensures consistent, safe behavior across your application!")

## Key Takeaways

### Why Escape Hatches Are Critical:

1. **Prevent Hallucinations**: Stop AI from confidently stating false information
2. **Build User Trust**: Users learn when they can rely on AI responses
3. **Reduce Risk**: Avoid costly mistakes in high-stakes applications
4. **Enable Scaling**: Safely deploy AI in sensitive domains

### Implementation Strategy:

1. **Start Conservative**: Use higher confidence thresholds initially
2. **Domain-Specific**: Tailor thresholds and messages to your domain
3. **Monitor and Adjust**: Track performance and calibrate over time
4. **Provide Value**: Use graceful degradation to help users even when uncertain

### Next Steps:

- Implement escape hatches in your own applications
- Test with domain-specific scenarios
- Combine with other techniques like Manager-Style prompts
- Monitor confidence calibration in production

Remember: **It's better to admit uncertainty than to confidently give wrong information!**