# Getting Started with Calibre

This notebook provides a quick introduction to probability calibration using the Calibre library.

**What you'll learn:**
1. Basic calibration workflow from start to finish
2. How to choose the right calibration method for your data
3. How to evaluate calibration quality
4. Common patterns and best practices

**When to use this notebook:** Start here if you're new to calibration or the Calibre library.

In [None]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Import calibre components
from calibre import IsotonicCalibrator, mean_calibration_error, brier_score
from calibre import calibration_curve

# Set random seed for reproducibility
np.random.seed(42)
plt.style.use('default')

## 1. Create Sample Data

Let's generate some sample data and train a model that produces poorly calibrated predictions:

In [None]:
# Generate synthetic dataset
n_samples = 1000
X = np.random.randn(n_samples, 5)
y = (X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + np.random.randn(n_samples) * 0.1 > 0).astype(int)

# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a model that tends to be poorly calibrated
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Get uncalibrated predictions
y_proba_uncal = model.predict_proba(X_test)[:, 1]

print(f"Dataset: {len(X_train)} training, {len(X_test)} test samples")
print(f"Class distribution: {np.mean(y_test):.1%} positive class")

## 2. Basic Calibration Workflow

The standard calibration workflow has three steps:
1. **Fit** the calibrator on training predictions
2. **Transform** test predictions 
3. **Evaluate** calibration quality

In [None]:
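# Get the model's predictions on the training set.
# Note: this keeps the example short; a held-out calibration set is preferable
# because a model's predictions on its own training data are usually overconfident.
y_proba_train = model.predict_proba(X_train)[:, 1]

# Step 1: Fit the calibrator
calibrator = IsotonicCalibrator(enable_diagnostics=True)
calibrator.fit(y_proba_train, y_train)

# Step 2: Transform the test predictions
y_proba_cal = calibrator.transform(y_proba_uncal)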

print("✅ Calibration complete!")
print(f"Uncalibrated range: [{y_proba_uncal.min():.3f}, {y_proba_uncal.max():.3f}]")
print(f"Calibrated range: [{y_proba_cal.min():.3f}, {y_proba_cal.max():.3f}]")
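
Fitting a calibrator on the rows the model was trained on can hide miscalibration, since a random forest's predictions on its own training data are usually much more confident than its predictions on unseen data. One common alternative is a three-way split. The sketch below uses only NumPy; `three_way_split` and its parameters are hypothetical names for illustration, not part of Calibre.

```python
import numpy as np

def three_way_split(n_samples, calib_frac=0.2, test_frac=0.3, seed=42):
    """Return disjoint index arrays for model training, calibration, and testing.

    Hypothetical helper for illustration; not part of Calibre.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)  # shuffled row indices
    n_test = int(round(n_samples * test_frac))
    n_calib = int(round(n_samples * calib_frac))
    test_idx = order[:n_test]
    calib_idx = order[n_test:n_test + n_calib]
    train_idx = order[n_test + n_calib:]  # everything left over
    return train_idx, calib_idx, test_idx

train_idx, calib_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(calib_idx), len(test_idx))  # prints: 500 200 300
```

With these indices, the model would be fit on `X[train_idx]`, the calibrator on the model's predictions for `X[calib_idx]`, and all metrics computed on `X[test_idx]`.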

## 3. Evaluate Calibration Quality

Let's measure how much calibration improved our predictions:

In [None]:
# Calculate calibration metrics
mce_before = mean_calibration_error(y_test, y_proba_uncal)
mce_after = mean_calibration_error(y_test, y_proba_cal)

brier_before = brier_score(y_test, y_proba_uncal)
brier_after = brier_score(y_test, y_proba_cal)

print("📊 Calibration Improvement:")
print(f"Mean Calibration Error: {mce_before:.3f} → {mce_after:.3f} ({(mce_after/mce_before-1)*100:+.1f}%)")
print(f"Brier Score: {brier_before:.3f} → {brier_after:.3f} ({(brier_after/brier_before-1)*100:+.1f}%)")

# Check diagnostics
if calibrator.has_diagnostics():
    print(f"\n🔍 Diagnostics: {calibrator.diagnostic_summary()}")
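
To make the two metrics less of a black box, here is a minimal NumPy sketch of what a Brier score and a binned calibration error typically compute. The `_sketch` functions are hypothetical and written only for illustration; Calibre's `brier_score` and `mean_calibration_error` may use different binning or weighting.

```python
import numpy as np

def brier_score_sketch(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

def binned_calibration_error_sketch(y_true, y_prob, n_bins=10):
    """Sample-weighted mean of |observed rate - mean prediction| over equal-width bins."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Assign each prediction to one of n_bins equal-width bins on [0, 1].
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    error = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            error += mask.mean() * gap  # weight by the share of samples in the bin
    return float(error)

# Toy example: four predictions of 0.9, one of which turns out wrong.
y_toy = np.array([1, 1, 1, 0])
p_toy = np.array([0.9, 0.9, 0.9, 0.9])
toy_brier = brier_score_sketch(y_toy, p_toy)              # about 0.21
toy_gap = binned_calibration_error_sketch(y_toy, p_toy)   # about 0.15
print(f"Brier: {toy_brier:.3f}, binned error: {toy_gap:.3f}")
```

In the toy example the model says 0.9 but is right only 75% of the time, so the binned error is the 0.15 gap between the two.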

## 4. Visualize the Results

The best way to understand calibration is to visualize the calibration curve:

In [None]:
# Create calibration curves
bin_means_uncal, bin_edges_uncal = calibration_curve(y_test, y_proba_uncal, n_bins=10)
bin_means_cal, bin_edges_cal = calibration_curve(y_test, y_proba_cal, n_bins=10)

# Plot comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Before calibration
ax1.plot([0, 1], [0, 1], 'k--', alpha=0.5, label='Perfect calibration')
ax1.plot(bin_edges_uncal, bin_means_uncal, 'o-', color='red', label='Uncalibrated')
ax1.set_xlabel('Mean Predicted Probability')
ax1.set_ylabel('Fraction of Positives')
ax1.set_title('Before Calibration')
ax1.legend()
ax1.grid(True, alpha=0.3)

# After calibration
ax2.plot([0, 1], [0, 1], 'k--', alpha=0.5, label='Perfect calibration')
ax2.plot(bin_edges_cal, bin_means_cal, 'o-', color='blue', label='Calibrated')
ax2.set_xlabel('Mean Predicted Probability')
ax2.set_ylabel('Fraction of Positives')
ax2.set_title('After Calibration')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("📈 A well-calibrated model should have points close to the diagonal line.")
print("📈 The closer to the diagonal, the better the calibration!")
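
For intuition about where the plotted points come from, the sketch below computes reliability-diagram points by hand: predictions are grouped into equal-width bins, and each non-empty bin contributes one point (mean predicted probability, observed fraction of positives). `reliability_points_sketch` is a hypothetical helper; it makes no claim about what Calibre's `calibration_curve` returns or how it bins.

```python
import numpy as np

def reliability_points_sketch(y_true, y_prob, n_bins=10):
    """Return (mean predicted probability, fraction of positives, count) per non-empty bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Equal-width bins on [0, 1]; a prediction of exactly 1.0 goes in the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    mean_pred, frac_pos, counts = [], [], []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():  # skip empty bins instead of plotting NaN
            mean_pred.append(y_prob[mask].mean())
            frac_pos.append(y_true[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(mean_pred), np.array(frac_pos), np.array(counts)

# Toy example with two bins: [0, 0.5) and [0.5, 1].
y_toy = np.array([0, 0, 1, 1, 1, 0])
p_toy = np.array([0.05, 0.15, 0.55, 0.95, 0.95, 0.95])
x_pts, y_pts, n_pts = reliability_points_sketch(y_toy, p_toy, n_bins=2)
print(x_pts, y_pts, n_pts)
```

Returning the per-bin counts alongside the points is a deliberate choice: bins with very few samples produce noisy points and are worth reading with caution.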

## 5. Try Different Calibration Methods

Calibre provides several calibration methods. Let's compare a few:

In [None]:
from calibre import NearlyIsotonicCalibrator, SplineCalibrator

# Test different calibrators
calibrators = {
    'Isotonic': IsotonicCalibrator(),
    'Nearly Isotonic': NearlyIsotonicCalibrator(),
    'Spline': SplineCalibrator(n_splines=5)
}

results = {'Uncalibrated': (y_proba_uncal, mce_before)}

# Fit and evaluate each calibrator
for name, cal in calibrators.items():
    cal.fit(y_proba_train, y_train)
    y_cal = cal.transform(y_proba_uncal)
    mce = mean_calibration_error(y_test, y_cal)
    results[name] = (y_cal, mce)

# Print comparison
print("🏆 Method Comparison (Mean Calibration Error):")
for name, (_, mce) in results.items():
    print(f"{name:15}: {mce:.4f}")

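# Find the method with the lowest error.
# Note: choosing a method on the test set makes its reported error optimistic;
# in practice, select on a separate validation split.
best_method = min(results.items(), key=lambda x: x[1][1])[0]
print(f"\n🥇 Best method: {best_method}")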
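
As background on the first method in the comparison, isotonic calibration fits a non-decreasing step function from scores to outcomes. A textbook way to do this is the pool-adjacent-violators algorithm, sketched below in NumPy. `pav_fit_sketch` is a hypothetical function for illustration: it is not Calibre's implementation, it gives tied scores no special treatment, and it returns only fitted values for the training points rather than a reusable transformer.

```python
import numpy as np

def pav_fit_sketch(scores, labels):
    """Fit non-decreasing values to labels ordered by score (pool-adjacent-violators)."""
    order = np.argsort(scores, kind="stable")
    y_sorted = np.asarray(labels, dtype=float)[order]
    # Each block stores [sum of labels, number of samples].
    blocks = []
    for value in y_sorted:
        blocks.append([value, 1])
        # Merge backwards while the previous block's mean exceeds the latest block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Expand each block's mean back to one fitted value per sorted sample.
    fitted_sorted = np.concatenate([np.full(c, s / c) for s, c in blocks])
    fitted = np.empty_like(fitted_sorted)
    fitted[order] = fitted_sorted  # restore the original sample order
    return fitted

# Toy example: the label sequence 0, 1, 0, 1, 1 is not monotone in the score.
toy_scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
toy_labels = np.array([0, 1, 0, 1, 1])
toy_fitted = pav_fit_sketch(toy_scores, toy_labels)
print(toy_fitted)
```

In the toy example the second and third samples violate monotonicity, so they are pooled and both receive their average label of 0.5.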

## Key Takeaways

🎯 **Quick Start Pattern:**
```python
from calibre import IsotonicCalibrator, mean_calibration_error

# Fit calibrator on training predictions
calibrator = IsotonicCalibrator()
calibrator.fit(train_probabilities, train_labels)

# Apply to test predictions
calibrated_probabilities = calibrator.transform(test_probabilities)

# Evaluate calibration error on the test set
calibrated_error = mean_calibration_error(test_labels, calibrated_probabilities)
```

📋 **Best Practices:**
- Fit the calibrator on data the model was not trained on, such as a held-out split or cross-validated predictions
- Enable diagnostics to understand calibration behavior
- Visualize calibration curves to verify improvement
- Try multiple methods and pick the best for your data

➡️ **Next Steps:**
- **Validation & Evaluation**: See detailed calibration analysis
- **Diagnostics & Troubleshooting**: Learn when calibration fails
- **Performance Comparison**: Systematic method comparison