# 📊 Uncertainty Quantification - Classification

<div style="background-color: #e3f2fd; padding: 15px; border-radius: 5px; border-left: 5px solid #2196F3;">
<b>üìì Information</b><br>
<b>Level:</b> Intermediate/Advanced<br>
<b>Time:</b> 20 minutes<br>
<b>Dataset:</b> Breast Cancer (sklearn)<br>
<b>Prerequisite:</b> 03_uncertainty.ipynb
</div>

## 🎯 Objectives
- ✅ Uncertainty quantification for **classification** problems
- ✅ Generate **interactive HTML report**
- ✅ Export results to **JSON format**
- ✅ Analyze **probability calibration**
- ✅ Make **uncertainty-based decisions** in critical scenarios

## 📚 Why Uncertainty in Classification?

### Critical Contexts

#### 🏥 Medicine - Cancer Diagnosis
- **Problem**: Binary diagnosis (benign vs malignant)
- **Traditional**: Prediction = 0.85 (85% chance malignant)
- **With Uncertainty**: Prediction = 0.85 ± 0.15 (interval: [0.70, 1.00])
- **Decision**: High uncertainty → Request additional exams

#### 💳 Finance - Credit Approval
- **Problem**: Approve/Reject credit
- **Traditional**: Prediction = 0.40 (40% chance of default)
- **With Uncertainty**: Prediction = 0.40 ± 0.05 (interval: [0.35, 0.45])
- **Decision**: Low uncertainty → Approve with adjusted rate

#### 🔒 Security - Fraud Detection
- **Problem**: Detect fraudulent transactions
- **Traditional**: Prediction = 0.60 (60% chance fraud)
- **With Uncertainty**: Prediction = 0.60 ± 0.30 (interval: [0.30, 0.90])
- **Decision**: High uncertainty → Require 2FA verification
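
The three scenarios share one pattern: turn a point probability and its margin into an interval, then let the width of that interval drive the action. A minimal sketch, assuming a symmetric margin; the `wide_margin=0.10` cutoff and both helper names are illustrative choices, not DeepBridge defaults:

```python
def probability_interval(prob, margin):
    """Clip a symmetric interval around a probability to [0, 1]."""
    lower = round(max(0.0, prob - margin), 2)
    upper = round(min(1.0, prob + margin), 2)
    return lower, upper


def uncertainty_level(margin, wide_margin=0.10):
    """Label the margin; wide_margin is a hypothetical cutoff."""
    return "high" if margin >= wide_margin else "low"


# The three scenarios from the text: (name, prediction, margin)
scenarios = [
    ("cancer diagnosis", 0.85, 0.15),
    ("credit approval", 0.40, 0.05),
    ("fraud detection", 0.60, 0.30),
]

for name, prob, margin in scenarios:
    lower, upper = probability_interval(prob, margin)
    print(f"{name}: [{lower:.2f}, {upper:.2f}] -> {uncertainty_level(margin)} uncertainty")
```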

## 1️⃣ Setup - Binary Classification Problem

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score, classification_report
from deepbridge import DBDataset, Experiment
import os

# Configure visualizations
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

# Load Breast Cancer dataset
cancer = load_breast_cancer()
df = pd.DataFrame(cancer.data, columns=cancer.feature_names)
df['target'] = cancer.target  # 0 = malignant, 1 = benign

print(f"📊 Dataset: {df.shape}")
print(f"🏥 Target: Cancer diagnosis (0=malignant, 1=benign)")
print(f"\n📈 Class distribution:")
print(df['target'].value_counts())
print(f"\n📊 Class balance:")
print(df['target'].value_counts(normalize=True))


## 2️⃣ Train Classification Model

In [2]:
X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train RandomForest Classifier
model = RandomForestClassifier(n_estimators=100, random_state=42, max_depth=10)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)
accuracy = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_proba[:, 1])

print(f"✅ Model trained!")
print(f"📊 Accuracy: {accuracy*100:.2f}%")
print(f"📊 ROC-AUC: {auc:.3f}")
print(f"\n📋 Classification Report:")
print(classification_report(y_test, y_pred, target_names=['Malignant', 'Benign']))

✅ Model trained!
📊 Accuracy: 95.61%
📊 ROC-AUC: 0.994

📋 Classification Report:
              precision    recall  f1-score   support

   Malignant       0.95      0.93      0.94        42
      Benign       0.96      0.97      0.97        72

    accuracy                           0.96       114
   macro avg       0.96      0.95      0.95       114
weighted avg       0.96      0.96      0.96       114
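
The headline numbers can be cross-checked by hand. A short sketch, assuming the confusion-matrix counts below; they are inferred from the rounded report (39 of 42 malignant and 70 of 72 benign cases classified correctly) and are not printed by the notebook itself:

```python
# Counts inferred from the rounded classification report (an assumption).
tp_malignant, fn_malignant = 39, 3   # true malignant: 42 samples
tp_benign, fn_benign = 70, 2         # true benign: 72 samples

total = tp_malignant + fn_malignant + tp_benign + fn_benign
accuracy = (tp_malignant + tp_benign) / total

# Recall: share of each true class that was recovered.
recall_malignant = tp_malignant / (tp_malignant + fn_malignant)
recall_benign = tp_benign / (tp_benign + fn_benign)

# Precision: a benign sample predicted malignant is a false positive for malignant.
precision_malignant = tp_malignant / (tp_malignant + fn_benign)
precision_benign = tp_benign / (tp_benign + fn_malignant)

print(f"accuracy  = {accuracy:.4f}")
print(f"malignant = precision {precision_malignant:.2f}, recall {recall_malignant:.2f}")
print(f"benign    = precision {precision_benign:.2f}, recall {recall_benign:.2f}")
```

Under these counts, three malignant tumors are predicted benign. In a diagnostic setting those false negatives are the costly errors, which is why per-prediction uncertainty matters beyond aggregate accuracy.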



## 3️⃣ Create Experiment

In [3]:
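# Caution: `model` was already fitted on X_train in step 2. If DBDataset splits `df`
# differently, its test rows may include training rows and metrics will look optimistic.
dataset = DBDataset(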
    data=df,
    target_column='target',
    model=model,
    test_size=0.2,
    random_state=42,
    dataset_name='Breast Cancer Classification'
)

exp = Experiment(
    dataset=dataset,
    experiment_type='binary_classification',
    tests=['uncertainty'],  # Specify uncertainty test
    random_state=42
)

print("✅ Experiment created!")

✅ Initial model evaluation complete: RandomForestClassifier
✅ Experiment created!


## 4️⃣ Run Uncertainty Test

<div style="background-color: #fff3e0; padding: 15px; border-radius: 5px; border-left: 5px solid #ff9800;">
<b>ℹ️ Configuration:</b> Using 'full' config for comprehensive analysis with multiple alpha levels.
</div>
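
Each alpha level is a tolerated miscoverage rate, so the target coverage is `1 - alpha`. A small illustration; the alpha values listed are examples only, since the levels used by the `full` configuration are not shown in this notebook:

```python
# Example alpha levels (hypothetical; check the config for the real ones).
alphas = [0.01, 0.05, 0.10, 0.20]

for alpha in alphas:
    target_coverage = 1 - alpha
    print(f"alpha={alpha:.2f} -> intervals should contain the truth "
          f"{target_coverage:.0%} of the time")
```

A smaller alpha demands higher coverage, which usually means wider intervals.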

In [4]:
print("🧪 Running uncertainty quantification test...\n")

# Use run_tests() to store results internally for save_html() and save_json()
experiment_result = exp.run_tests(config_name='full')

print("\n✅ Uncertainty test completed!")
print(f"\n📊 Result type: {type(experiment_result)}")

# Access the uncertainty result
uncertainty_result = experiment_result.get_result('uncertainty')
print(f"📊 Uncertainty result type: {type(uncertainty_result)}")

🧪 Running uncertainty quantification test...



[RUN_DEBUG] No CRQR results found to process
[RUN_DEBUG] Missing reliability_analysis in results
[RUN_DEBUG] Missing marginal_bandwidth in results
[RUN_DEBUG] Missing interval_widths in results


✅ Uncertainty Tests Finished!
🎉 Test completed successfully: uncertainty

✅ Uncertainty test completed!

📊 Result type: <class 'deepbridge.core.experiment.results.ExperimentResult'>
📊 Uncertainty result type: <class 'dict'>


## 5️⃣ Generate Interactive HTML Report

### Export comprehensive interactive report with:
- 📊 **Coverage Analysis**: Validation of prediction intervals
- 📈 **Feature Impact**: Features that most influence uncertainty
- 🎯 **Calibration Curves**: Probability calibration analysis
- 📉 **Trade-off Analysis**: Coverage vs Width balance
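
Coverage and width are simple to compute once intervals exist. The sketch below uses made-up intervals and targets to show the two quantities behind the coverage-versus-width trade-off; `coverage_and_width` is an illustrative helper, not part of DeepBridge:

```python
def coverage_and_width(lower, upper, y_true):
    """Return (fraction of targets inside their interval, mean interval width)."""
    inside = [lo <= y <= hi for lo, hi, y in zip(lower, upper, y_true)]
    widths = [hi - lo for lo, hi in zip(lower, upper)]
    return sum(inside) / len(inside), sum(widths) / len(widths)


# Toy data: five intervals and the values they are supposed to contain.
lower = [0.0, 0.6, 0.2, 0.7, 0.0]
upper = [0.3, 1.0, 0.8, 1.0, 0.4]
y_true = [0.1, 0.9, 0.9, 0.8, 0.5]

coverage, mean_width = coverage_and_width(lower, upper, y_true)
print(f"coverage = {coverage:.2f}, mean width = {mean_width:.2f}")
```

Widening every interval to [0, 1] would give 100% coverage while carrying no information, which is why the two numbers are read together.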

In [5]:
# Create output directory
output_dir = './outputs/uncertainty_classification'
os.makedirs(output_dir, exist_ok=True)

# Generate interactive HTML report
html_output_path = os.path.join(output_dir, 'uncertainty_classification_interactive.html')

print("📝 Generating interactive HTML report...\n")

report_path = exp.save_html(
    test_type='uncertainty',
    file_path=html_output_path,
    model_name='RandomForest Classifier',
    report_type='interactive'
)

print(f"\n✅ Interactive report generated!")
print(f"📂 Location: {report_path}")
print(f"\n💡 Open the HTML file in your browser to explore:")
print(f"   - Coverage Analysis tab")
print(f"   - Feature Impact tab")
print(f"   - Calibration tab")
print(f"   - Interactive Plotly charts")

📝 Generating interactive HTML report...


## 6️⃣ Export Results to JSON

### JSON export includes:
- 🔍 **Experiment Info**: Configuration and metadata
- 📊 **Test Results**: Complete uncertainty analysis data
- 🎯 **Model Evaluation**: Initial model metrics and feature importance
- 📈 **By Alpha Results**: Coverage and width for each confidence level
- 🌟 **By Feature Results**: Per-feature uncertainty analysis
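
The idea behind a compact export can be illustrated with the standard library: keep scalar summaries and drop large per-sample arrays. This is a sketch of the concept with a hypothetical `compact` helper and made-up data, not DeepBridge's implementation:

```python
import json


def compact(obj, max_items=10):
    """Recursively drop lists longer than max_items, keeping everything else."""
    if isinstance(obj, dict):
        return {k: compact(v, max_items) for k, v in obj.items()
                if not (isinstance(v, list) and len(v) > max_items)}
    if isinstance(obj, list):
        return [compact(v, max_items) for v in obj]
    return obj


# Made-up result: one scalar score plus a large per-sample array.
full = {
    "uncertainty_quality_score": 0.64,
    "by_alpha": {"0.1": {"coverage": 0.93, "sample_widths": [0.5] * 1000}},
}

small = compact(full)
print(len(json.dumps(full)), "->", len(json.dumps(small)), "characters")
print(small)
```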

In [6]:
# Export to JSON (COMPACT MODE for AI validation)
json_output_path = os.path.join(output_dir, 'uncertainty_classification_results.json')
json_compact_path = os.path.join(output_dir, 'uncertainty_classification_results_COMPACT.json')

print("📝 Exporting results to JSON...\n")

# Full JSON (with all data)
json_path_full = exp._experiment_result.save_json(
    test_type='uncertainty',
    file_path=json_output_path,
    include_summary=True,
    compact=False  # Full data
)

# Compact JSON (optimized for AI validation)
json_path_compact = exp._experiment_result.save_json(
    test_type='uncertainty',
    file_path=json_compact_path,
    include_summary=True,
    compact=True  # Remove large arrays, keep only essentials
)

# Compare file sizes (os is already imported in the setup cell)
size_full = os.path.getsize(json_path_full) / 1024  # KB
size_compact = os.path.getsize(json_path_compact) / 1024  # KB
reduction = ((size_full - size_compact) / size_full) * 100

print(f"\n✅ JSON files exported successfully!")
print(f"\n📊 FILE SIZES:")
print(f"   Full JSON:    {size_full:>8.2f} KB")
print(f"   Compact JSON: {size_compact:>8.2f} KB")
print(f"   Reduction:    {reduction:>8.1f}%")

print(f"\n📂 LOCATIONS:")
print(f"   Full:    {json_path_full}")
print(f"   Compact: {json_path_compact}")

print(f"\n📋 COMPACT JSON STRUCTURE (optimized for AI):")
print(f"   └─ experiment_info/")
print(f"      ├─ test_type, experiment_type, generation_time, config")
print(f"   └─ test_results/")
print(f"      └─ primary_model/")
print(f"         ├─ metrics (all model metrics)")
print(f"         ├─ uncertainty_quality_score")
print(f"         ├─ feature_importance_top10 (only top 10 features)")
print(f"         └─ crqr/")
print(f"            └─ by_alpha/ (only overall_result per alpha, no sample data)")
print(f"   └─ initial_model_evaluation/ (compact)")
print(f"      └─ models/")
print(f"         └─ primary_model/")
print(f"            ├─ metrics")
print(f"            ├─ model_type")
print(f"            └─ feature_importance_top5")
print(f"   └─ summary/  (AI-friendly analysis)")
print(f"      ├─ key_findings")
print(f"      ├─ model_performance")
print(f"      ├─ calibration_quality")
print(f"      ├─ recommendations")
print(f"      └─ per_alpha_analysis")

print(f"\n💡 USE CASES:")
print(f"   • Full JSON: Deep dive analysis, debugging, research")
print(f"   • Compact JSON: AI validation, automated testing, CI/CD pipelines")

📝 Exporting results to JSON...


✅ JSON files exported successfully!

📊 FILE SIZES:
   Full JSON:     5665.95 KB
   Compact JSON:   912.13 KB
   Reduction:        83.9%

📂 LOCATIONS:
   Full:    /home/guhaase/projetos/DeepBridge/examples/notebooks/03_validation_tests/outputs/uncertainty_classification/uncertainty_classification_results.json
   Compact: /home/guhaase/projetos/DeepBridge/examples/notebooks/03_validation_tests/outputs/uncertainty_classification/uncertainty_classification_results_COMPACT.json

📋 COMPACT JSON STRUCTURE (optimized for AI):
   └─ experiment_info/
      ├─ test_type, experiment_type, generation_time, config
   └─ test_results/
      └─ primary_model/
         ├─ metrics (all model metrics)
         ├─ uncertainty_quality_score
         ├─ feature_importance_top10 (only top 10 features)
         └─ crqr/
            └─ by_alpha/ (only overall_result per alpha, no sample data)
   └─ initial_model_evaluation/ 

## 7️⃣ Load and Analyze JSON Results

Demonstrate how to load and use the exported JSON
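
Because the compact file carries a `summary` block, it lends itself to automated checks. A minimal sketch of a CI gate, assuming the `summary.model_performance` keys that appear in this notebook's output (`average_coverage`, `average_calibration_error`); the thresholds and the `check_summary` helper are illustrative choices:

```python
def check_summary(results, min_coverage=0.90, max_cal_error=0.10):
    """Return a list of human-readable failures; an empty list means pass."""
    perf = results.get("summary", {}).get("model_performance", {})
    failures = []

    coverage = perf.get("average_coverage")
    if coverage is None or coverage < min_coverage:
        failures.append(f"coverage {coverage} below {min_coverage}")

    cal_error = perf.get("average_calibration_error")
    if cal_error is None or cal_error > max_cal_error:
        failures.append(f"calibration error {cal_error} above {max_cal_error}")

    return failures


# Values copied from the summary printed by this notebook.
example = {"summary": {"model_performance": {
    "average_coverage": 0.9591,
    "average_calibration_error": 0.0915,
}}}

print(check_summary(example))                       # passes with default thresholds
print(check_summary(example, max_cal_error=0.05))   # stricter gate fails
```

In a pipeline, load the exported file with `json.load` and fail the job when the returned list is non-empty.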

In [7]:
import json

print("="*80)
print("📊 ANALYZING COMPACT JSON (optimized for AI validation)")
print("="*80)

# Load COMPACT JSON results
with open(json_compact_path, 'r', encoding='utf-8') as f:
    results_json = json.load(f)

# Experiment Info
exp_info = results_json['experiment_info']
print(f"\n🔬 EXPERIMENT INFO:")
print(f"   Test Type: {exp_info['test_type']}")
print(f"   Experiment Type: {exp_info['experiment_type']}")
print(f"   Generation Time: {exp_info['generation_time']}")

# Summary (AI-friendly)
if 'summary' in results_json:
    summary = results_json['summary']
    
    print(f"\n🎯 KEY FINDINGS:")
    for finding in summary.get('key_findings', []):
        print(f"   • {finding}")
    
    print(f"\n📈 MODEL PERFORMANCE:")
    perf = summary.get('model_performance', {})
    for key, value in perf.items():
        if key != 'base_metrics':
            print(f"   {key}: {value}")
    
    if 'base_metrics' in perf:
        print(f"   Base Metrics:")
        for metric, val in perf['base_metrics'].items():
            print(f"      {metric}: {val}")
    
    print(f"\n🔍 CALIBRATION QUALITY:")
    calib = summary.get('calibration_quality', {})
    print(f"   Status: {calib.get('status', 'N/A')}")
    print(f"   Description: {calib.get('description', 'N/A')}")
    print(f"   Interval Width: {calib.get('interval_width', 'N/A')}")
    print(f"   Width Description: {calib.get('width_description', 'N/A')}")
    
    print(f"\n💡 RECOMMENDATIONS:")
    for rec in summary.get('recommendations', []):
        print(f"   • {rec}")
    
    print(f"\n📊 PER-ALPHA ANALYSIS:")
    print(f"   {'Alpha':<8} {'Target':<10} {'Actual':<10} {'Cal.Error':<12} {'Width'}")
    print(f"   {'-'*60}")
    for alpha_data in summary.get('per_alpha_analysis', []):
        print(f"   {alpha_data['alpha']:<8.2f} "
              f"{alpha_data['target_coverage']*100:<9.1f}% "
              f"{alpha_data['actual_coverage']*100:<9.1f}% "
              f"{alpha_data['calibration_error']:<12.4f} "
              f"{alpha_data['mean_width']:.4f}")

# Test Results (compact)
test_results = results_json['test_results']
primary = test_results.get('primary_model', {})

print(f"\n🔬 TEST RESULTS (Compact):")
print(f"   Uncertainty Score: {primary.get('uncertainty_quality_score', 'N/A')}")

# Top Features
if 'feature_importance_top10' in primary:
    print(f"\n🌟 TOP 10 MOST IMPORTANT FEATURES:")
    print(f"   {'Feature':<30} {'Importance'}")
    print(f"   {'-'*50}")
    for feat, imp in primary['feature_importance_top10'].items():
        print(f"   {feat:<30} {imp:.4f}")

# CRQR Summary
if 'crqr' in primary and 'by_alpha' in primary['crqr']:
    by_alpha = primary['crqr']['by_alpha']
    print(f"\n📊 CRQR BY ALPHA (Overall Results Only):")
    print(f"   {len(by_alpha)} alpha levels analyzed")
    print(f"   (No sample-level data in compact mode)")

print(f"\n{'='*80}")
print(f"💾 COMPACT JSON IS OPTIMIZED FOR:")
print(f"{'='*80}")
print(f"✅ AI/LLM validation (smaller token count)")
print(f"✅ Automated testing pipelines")
print(f"✅ CI/CD integration")
print(f"✅ Quick metrics extraction")
print(f"✅ Summary-based decision making")
print(f"\n❌ NOT SUITABLE FOR:")
print(f"   • Sample-level analysis")
print(f"   • Detailed debugging")
print(f"   • Visualization generation")
print(f"   • Research with raw data needs")

📊 ANALYZING COMPACT JSON (optimized for AI validation)

🔬 EXPERIMENT INFO:
   Test Type: uncertainty
   Experiment Type: binary_classification
   Generation Time: 2025-11-12 20:59:36

🎯 KEY FINDINGS:
   • Average coverage: 95.9% (calibration error: 0.0915)
   • High uncertainty detected (avg width: 0.7809)

📈 MODEL PERFORMANCE:
   average_coverage: 0.9591
   average_calibration_error: 0.0915
   average_interval_width: 0.7809
   uncertainty_score: 0.6376
   Base Metrics:
      accuracy: 0.9912
      roc_auc: 1.0
      f1: 0.9912
      precision: 0.9913
      recall: 0.9912

🔍 CALIBRATION QUALITY:
   Status: GOOD
   Description: Calibration error < 0.10
   Interval Width: MODERATE
   Width Description: Moderate uncertainty

💡 RECOMMENDATIONS:
   • Collect more training data to reduce prediction variance
   • Consider ensemble methods

📊 PER-ALPHA ANALYSIS:
   Alpha    Target     Actual     Cal.Error    Width
   ------------------------------------------------

## 8️⃣ Uncertainty Analysis Summary

Extract key insights from uncertainty quantification
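
Calibration error in this notebook is the absolute gap between target and actual coverage, so on its own it does not say whether intervals are too wide or too narrow. A sketch with made-up coverages that reports the signed gap alongside the absolute one:

```python
# Made-up (alpha, actual coverage) pairs for illustration.
observed = [(0.05, 0.97), (0.10, 0.96), (0.20, 0.93)]

signed_gaps = []
for alpha, actual in observed:
    target = 1 - alpha
    gap = actual - target          # > 0: over-coverage, < 0: under-coverage
    signed_gaps.append(gap)
    print(f"alpha={alpha:.2f} target={target:.2f} actual={actual:.2f} gap={gap:+.2f}")

mean_abs_error = sum(abs(g) for g in signed_gaps) / len(signed_gaps)
mean_signed = sum(signed_gaps) / len(signed_gaps)
print(f"mean |gap| = {mean_abs_error:.4f}, mean signed gap = {mean_signed:+.4f}")
```

A positive signed gap means the intervals are conservative: coverage is met, at the cost of extra width.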

In [8]:
print("\n📊 UNCERTAINTY QUANTIFICATION SUMMARY\n" + "="*70)

# Calculate overall statistics
if 'crqr' in primary and 'by_alpha' in primary['crqr']:
    by_alpha = primary['crqr']['by_alpha']
    
    # Collect all coverage and calibration errors
    coverages = []
    cal_errors = []
    widths = []
    
    for alpha, alpha_data in by_alpha.items():
        overall = alpha_data.get('overall_result', {})
        alpha_val = float(alpha)
        target_cov = 1 - alpha_val
        actual_cov = overall.get('coverage', 0)
        avg_width = overall.get('mean_width', 0)
        
        coverages.append(actual_cov)
        cal_errors.append(abs(target_cov - actual_cov))
        widths.append(avg_width)
    
    # Calculate statistics
    avg_coverage = np.mean(coverages)
    avg_cal_error = np.mean(cal_errors)
    avg_width = np.mean(widths)
    
    print(f"\n📈 OVERALL STATISTICS:")
    print(f"   Average Coverage: {avg_coverage*100:.2f}%")
    print(f"   Average Calibration Error: {avg_cal_error:.4f}")
    print(f"   Average Interval Width: {avg_width:.4f}")
    
    # Quality assessment
    print(f"\n🎯 QUALITY ASSESSMENT:")
    
    if avg_cal_error < 0.05:
        print(f"   ✅ Calibration: EXCELLENT (error < 0.05)")
    elif avg_cal_error < 0.10:
        print(f"   🟡 Calibration: GOOD (error < 0.10)")
    else:
        print(f"   🔴 Calibration: NEEDS IMPROVEMENT (error ≥ 0.10)")
    
    if avg_width < 0.5:
        print(f"   ✅ Interval Width: NARROW (confident predictions)")
    elif avg_width < 1.0:
        print(f"   🟡 Interval Width: MODERATE")
    else:
        print(f"   ⚠️  Interval Width: WIDE (high uncertainty)")
    
    # Recommendations
    print(f"\n💡 RECOMMENDATIONS:")
    
    if avg_cal_error >= 0.10:
        print(f"   • Consider calibration methods (Platt scaling, isotonic regression)")
    
    if avg_width > 0.5:
        print(f"   • High uncertainty detected - collect more training data")
        print(f"   • Use ensemble methods to reduce prediction variance")
    
    if avg_coverage < 0.90:
        print(f"   • Coverage below 90% - model may be overconfident")
        print(f"   • Consider increasing interval width or recalibration")
    
    print(f"\n✅ Always use uncertainty in critical decisions (medical, financial, safety)")


📊 UNCERTAINTY QUANTIFICATION SUMMARY

📈 OVERALL STATISTICS:
   Average Coverage: 95.91%
   Average Calibration Error: 0.0915
   Average Interval Width: 0.7809

🎯 QUALITY ASSESSMENT:
   🟡 Calibration: GOOD (error < 0.10)
   🟡 Interval Width: MODERATE

💡 RECOMMENDATIONS:
   • High uncertainty detected - collect more training data
   • Use ensemble methods to reduce prediction variance

✅ Always use uncertainty in critical decisions (medical, financial, safety)


## 9️⃣ Practical Decision Examples

How to use uncertainty in real-world scenarios
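
The cell below hard-codes its margins for illustration. One standard way to obtain data-driven uncertainty for a classifier is split conformal prediction, sketched here in plain Python with made-up calibration probabilities. This is a substitute technique shown for intuition, not DeepBridge's CRQR procedure; `conformal_threshold` and `prediction_set` are hypothetical helpers:

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Quantile of nonconformity scores 1 - p(true class) on a calibration set."""
    scores = sorted(1 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank; the small epsilon guards against float round-off.
    rank = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    return scores[min(rank, n) - 1]


def prediction_set(probs, threshold):
    """All classes whose nonconformity score is within the threshold."""
    return [cls for cls, p in enumerate(probs) if 1 - p <= threshold]


# Made-up calibration data: [p(malignant), p(benign)] and the true class index.
cal_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9], [0.3, 0.7],
             [0.8, 0.2], [0.45, 0.55], [0.05, 0.95], [0.7, 0.3]]
cal_labels = [0, 1, 0, 1, 1, 0, 0, 1, 0]

q = conformal_threshold(cal_probs, cal_labels, alpha=0.1)
print("threshold:", round(q, 2))
print("confident case:", prediction_set([0.95, 0.05], q))   # single class
print("ambiguous case:", prediction_set([0.5, 0.5], q))     # both classes
```

A two-class set signals that the model cannot separate the classes at the requested confidence, which maps naturally onto an "additional tests" branch in a decision protocol.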

In [9]:
print("\n💼 PRACTICAL USE CASES\n" + "="*70)

# Simulate predictions with uncertainty
test_sample = X_test.iloc[[0]]  # one-row DataFrame keeps the feature names
pred_proba = model.predict_proba(test_sample)[0]
actual_class = y_test.iloc[0]

# Simulate uncertainty interval (in real scenario, this comes from CRQR)
uncertainty_margin = 0.15  # Example: ±15%
prob_malignant = pred_proba[0]
lower_bound = max(0, prob_malignant - uncertainty_margin)
upper_bound = min(1, prob_malignant + uncertainty_margin)

print(f"\n🏥 MEDICAL DIAGNOSIS - Breast Cancer\n" + "-"*70)
print(f"   Patient ID: #001")
print(f"   Prediction: {prob_malignant*100:.1f}% probability of MALIGNANT")
print(f"   95% Confidence Interval: [{lower_bound*100:.1f}%, {upper_bound*100:.1f}%]")
print(f"   Uncertainty Width: ±{uncertainty_margin*100:.1f}%")
print(f"   Actual: {'MALIGNANT' if actual_class == 0 else 'BENIGN'}")

# Decision logic
print(f"\n   📋 DECISION PROTOCOL:")
if lower_bound > 0.7:  # High risk even at the optimistic end of the interval
    print(f"   ⚠️  HIGH RISK: Immediate biopsy recommended")
    print(f"   Reason: Even with uncertainty, risk remains high")
elif upper_bound < 0.3:  # Low risk even at the pessimistic end of the interval
    print(f"   ✅ LOW RISK: Regular monitoring")
    print(f"   Reason: Risk remains low even considering uncertainty")
else:  # Uncertain case
    print(f"   🟡 UNCERTAIN: Additional tests recommended")
    print(f"   Reason: High uncertainty - need more information")
    print(f"   Suggested: Ultrasound, MRI, or second opinion")

# Financial example
print(f"\n\n💰 CREDIT APPROVAL - Default Risk\n" + "-"*70)
default_prob = 0.35
uncertainty_margin = 0.08
lower_default = max(0, default_prob - uncertainty_margin)
upper_default = min(1, default_prob + uncertainty_margin)

print(f"   Customer ID: #12345")
print(f"   Prediction: {default_prob*100:.1f}% probability of DEFAULT")
print(f"   95% Confidence Interval: [{lower_default*100:.1f}%, {upper_default*100:.1f}%]")
print(f"   Uncertainty Width: ±{uncertainty_margin*100:.1f}%")

print(f"\n   📋 DECISION PROTOCOL:")
if uncertainty_margin < 0.10:  # Low uncertainty
    print(f"   ✅ APPROVE with standard rate")
    print(f"   Reason: Low uncertainty, risk well-quantified")
    print(f"   Recommended Rate: {5.0 + default_prob*10:.2f}%")
else:  # High uncertainty
    print(f"   ⚠️  APPROVE with adjusted rate OR require guarantor")
    print(f"   Reason: High uncertainty in risk assessment")
    print(f"   Recommended Rate: {5.0 + upper_default*10:.2f}% (worst-case based)")

# Security example
print(f"\n\n🔒 FRAUD DETECTION - Transaction Security\n" + "-"*70)
fraud_prob = 0.65
uncertainty_margin = 0.25
lower_fraud = max(0, fraud_prob - uncertainty_margin)
upper_fraud = min(1, fraud_prob + uncertainty_margin)

print(f"   Transaction ID: #TXN789")
print(f"   Prediction: {fraud_prob*100:.1f}% probability of FRAUD")
print(f"   95% Confidence Interval: [{lower_fraud*100:.1f}%, {upper_fraud*100:.1f}%]")
print(f"   Uncertainty Width: ±{uncertainty_margin*100:.1f}%")

print(f"\n   📋 DECISION PROTOCOL:")
if uncertainty_margin > 0.20:  # High uncertainty
    print(f"   🔐 BLOCK + Require 2FA verification")
    print(f"   Reason: High uncertainty - apply maximum security")
    print(f"   Action: SMS code + Email confirmation")
elif fraud_prob > 0.70:
    print(f"   ⛔ BLOCK transaction")
    print(f"   Reason: High fraud probability, low uncertainty")
else:
    print(f"   🟡 FLAG for review")
    print(f"   Reason: Moderate risk, further analysis needed")


💼 PRACTICAL USE CASES

🏥 MEDICAL DIAGNOSIS - Breast Cancer
----------------------------------------------------------------------
   Patient ID: #001
   Prediction: 100.0% probability of MALIGNANT
   95% Confidence Interval: [85.0%, 100.0%]
   Uncertainty Width: ±15.0%
   Actual: MALIGNANT

   📋 DECISION PROTOCOL:
   ⚠️  HIGH RISK: Immediate biopsy recommended
   Reason: Even with uncertainty, risk remains high


💰 CREDIT APPROVAL - Default Risk
----------------------------------------------------------------------
   Customer ID: #12345
   Prediction: 35.0% probability of DEFAULT
   95% Confidence Interval: [27.0%, 43.0%]
   Uncertainty Width: ±8.0%

   📋 DECISION PROTOCOL:
   ✅ APPROVE with standard rate
   Reason: Low uncertainty, risk well-quantified
   Recommended Rate: 8.50%


🔒 FRAUD DETECTION - Transaction Security
----------------------------------------------------------------------
   Transaction ID: #TXN789
   Prediction: 65.0% probability of FRAU



## 🔟 Visualize Uncertainty Distribution

In [10]:
# Visualize prediction probabilities with uncertainty
n_samples = 50
probas = model.predict_proba(X_test[:n_samples])[:, 1]  # Probability of benign
actual_labels = y_test.iloc[:n_samples].values

# Simulate uncertainty margins
uncertainty_margins = np.random.uniform(0.05, 0.20, n_samples)
lower_bounds = np.maximum(0, probas - uncertainty_margins)
upper_bounds = np.minimum(1, probas + uncertainty_margins)

# Create visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Plot 1: Predictions with uncertainty intervals
x = np.arange(n_samples)
colors = ['red' if label == 0 else 'green' for label in actual_labels]

ax1.fill_between(x, lower_bounds, upper_bounds, alpha=0.3, color='skyblue',
                 label='Simulated uncertainty interval')
ax1.scatter(x, probas, c=colors, s=80, alpha=0.7, edgecolors='black', linewidth=1.5,
           label='Prediction (Red=Malignant, Green=Benign)')
ax1.axhline(y=0.5, color='gray', linestyle='--', linewidth=2, label='Decision Threshold')
ax1.set_xlabel('Sample', fontsize=12, fontweight='bold')
ax1.set_ylabel('Probability (Benign)', fontsize=12, fontweight='bold')
ax1.set_title('Predictions with Uncertainty Intervals', fontsize=14, fontweight='bold')
ax1.legend(fontsize=10, loc='upper right')
ax1.grid(True, alpha=0.3)
ax1.set_ylim(-0.05, 1.05)

# Plot 2: Uncertainty distribution
ax2.hist(uncertainty_margins, bins=20, color='coral', alpha=0.7, edgecolor='black')
ax2.axvline(x=np.mean(uncertainty_margins), color='red', linestyle='--', 
           linewidth=2, label=f'Mean: {np.mean(uncertainty_margins):.3f}')
ax2.set_xlabel('Uncertainty Margin', fontsize=12, fontweight='bold')
ax2.set_ylabel('Frequency', fontsize=12, fontweight='bold')
ax2.set_title('Distribution of Uncertainty', fontsize=14, fontweight='bold')
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"\n📊 Uncertainty Statistics:")
print(f"   Mean uncertainty: ±{np.mean(uncertainty_margins)*100:.1f}%")
print(f"   Min uncertainty: ±{np.min(uncertainty_margins)*100:.1f}%")
print(f"   Max uncertainty: ±{np.max(uncertainty_margins)*100:.1f}%")
print(f"   Std uncertainty: {np.std(uncertainty_margins)*100:.1f}%")


📊 Uncertainty Statistics:
   Mean uncertainty: ±12.1%
   Min uncertainty: ±5.2%
   Max uncertainty: ±19.8%
   Std uncertainty: 4.0%




## 1️⃣1️⃣ Generate Static Report (Optional)

Generate a static HTML report with matplotlib charts (for PDF export)

In [11]:
# Generate static HTML report
static_html_path = os.path.join(output_dir, 'uncertainty_classification_static.html')

print("📝 Generating static HTML report...\n")

static_report_path = exp.save_html(
    test_type='uncertainty',
    file_path=static_html_path,
    model_name='RandomForest Classifier',
    report_type='static'  # Uses matplotlib instead of Plotly
)

print(f"\n✅ Static report generated!")
print(f"📂 Location: {static_report_path}")
print(f"\n💡 Static reports can be easily printed or converted to PDF")

📝 Generating static HTML report...



## 🎉 Summary - Files Generated

### 📂 Output Directory Structure:
```
outputs/uncertainty_classification/
├── uncertainty_classification_interactive.html      # Interactive report with Plotly
├── uncertainty_classification_static.html           # Static report with Matplotlib
├── uncertainty_classification_results.json          # Complete results in JSON
└── uncertainty_classification_results_COMPACT.json  # Compact results (large arrays removed)
```

### ✅ What You Learned:

1. **Uncertainty in Classification**
   - Quantify confidence in binary predictions
   - Generate probability intervals
   - Assess calibration quality

2. **Report Generation**
   - Interactive HTML reports with Plotly
   - Static HTML reports with Matplotlib
   - Complete control over report type

3. **JSON Export**
   - Full experiment metadata
   - By-alpha coverage analysis
   - By-feature uncertainty analysis
   - Feature importance data
   - Easy integration with other tools

4. **Practical Applications**
   - Medical diagnosis decision protocols
   - Credit approval with risk quantification
   - Fraud detection with security levels
   - Uncertainty-based decision rules

### 💡 Best Practices:

- ✅ Always generate reports for stakeholder communication
- ✅ Export JSON for automated analysis and monitoring
- ✅ Use interactive reports for exploration
- ✅ Use static reports for documentation and archiving
- ✅ Define clear decision rules based on uncertainty
- ✅ Monitor calibration in production
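
Monitoring calibration can start with a binned reliability check: group predictions by confidence and compare the mean predicted probability with the observed positive rate in each bin. A minimal sketch of expected calibration error (ECE) for the positive class on made-up data; the bin count and helper name are illustrative:

```python
def expected_calibration_error(probs, labels, n_bins=5):
    """Weighted mean |observed rate - mean predicted probability| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += len(members) / len(probs) * abs(observed - mean_prob)
    return ece


# Made-up predictions: probability of the positive class and the true label.
probs = [0.1, 0.1, 0.3, 0.5, 0.7, 0.9, 0.9, 0.9]
labels = [0, 0, 0, 1, 1, 1, 1, 0]

print(f"ECE = {expected_calibration_error(probs, labels):.3f}")
```

Tracking this number over time on fresh labeled data gives an early signal that predicted probabilities have drifted away from observed frequencies.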

### 🚀 Next Steps:

- 📘 `04_resilience_drift.ipynb` - Detect data distribution changes
- 📘 `02_complete_robustness.ipynb` - Model robustness testing
- 📘 `../04_fairness/` - Fairness and bias analysis

<div style="background-color: #e8f5e9; padding: 15px; border-radius: 5px; border-left: 5px solid #4caf50;">
<b>üéØ Key Takeaway:</b> In critical applications (medical, financial, safety), uncertainty quantification is not optional - it's essential for responsible AI deployment.
</div>