# Support Vector Machine for Credit Classification

This notebook demonstrates Support Vector Machine (SVM) classification for credit default prediction using the UCI German Credit dataset.

## Model Overview

**Support Vector Machine (SVM)** finds the optimal hyperplane that maximises the margin between classes. With kernel functions, it can learn non-linear decision boundaries.
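
To make the effect of the kernel concrete, here is a minimal sketch using scikit-learn's `SVC` directly. It assumes synthetic ring-shaped data from `make_circles`, not the German Credit data, and it does not go through the `creditclass` helpers used in the rest of this notebook. It fits a linear and an RBF kernel on data that no straight line can separate.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two concentric rings.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.3, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0).fit(X_tr, y_tr)
    scores[kernel] = clf.score(X_te, y_te)
    # n_support_ holds the number of support vectors per class.
    print(
        f"{kernel:>6}: accuracy={scores[kernel]:.3f}, "
        f"support vectors={clf.n_support_.sum()}"
    )
```

On data like this, the linear kernel is limited to a straight-line boundary, while the RBF kernel can wrap the boundary around the inner ring.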

### Pros
- Effective in high-dimensional spaces
- Flexible kernel functions (linear, RBF, polynomial)
- Strong theoretical foundation (maximum margin)
- Works well with clear margin of separation
- Memory efficient at prediction time (the decision function uses only the support vectors)

### Cons
- Slow to train on large datasets (kernel SVM training scales between O(n²) and O(n³) in the number of samples)
- Requires feature scaling
- Kernel and hyperparameter choice significantly affects performance
- Less interpretable than linear models
- No native probability estimates; probabilities require a separate calibration step (for example Platt scaling)
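
Two of the points above, feature scaling and sensitivity to kernel and hyperparameters, are often handled together by placing the scaler and the SVM in one pipeline and searching over that pipeline. The sketch below assumes plain scikit-learn and synthetic data. The grid values are illustrative and are not the grid that `tune_hyperparameters` uses.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, mildly imbalanced stand-in data (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    weights=[0.7, 0.3], random_state=42,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline, so each CV fold fits its own scaler
# on the training part of that fold only.
pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# gamma only matters for the RBF kernel, so the two kernels get separate grids.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf"], "svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]},
]

search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_tr, y_tr)

print("best params:", search.best_params_)
print(f"best CV F1: {search.best_score_:.3f}")
# search.score uses the same scorer as the search, so this is test-set F1.
print(f"test F1: {search.score(X_te, y_te):.3f}")
```

Keeping the scaler inside the pipeline avoids leaking test-fold statistics into training during cross-validation.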

### When to Use
- When you have a clear margin of separation
- In high-dimensional spaces (text classification, genomics)
- When the sample size is moderate (roughly under 10,000 rows)

## Setup

In [None]:
import sys
sys.path.insert(0, '../src')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from creditclass.preprocessing import prepare_data
from creditclass.training import get_model, train_model, save_model, tune_hyperparameters
from creditclass.evaluation import (
    evaluate_model,
    compute_shap_values,
    get_learning_curve_data,
)
from creditclass.plots import (
    set_plot_style,
    plot_confusion_matrix,
    plot_roc_curve,
    plot_precision_recall,
    plot_learning_curve,
    plot_calibration,
    plot_shap_summary,
)

set_plot_style()
RANDOM_STATE = 42

## Load Data

In [None]:
data = prepare_data(
    target_type='default',
    encoding_method='onehot',
    test_size=0.2,
    random_state=RANDOM_STATE,
    scale=True,  # SVM requires scaling
)

X_train = data['X_train_scaled']
X_test = data['X_test_scaled']
y_train = data['y_train']
y_test = data['y_test']
feature_names = data['feature_names']

print(f"Training set: {X_train.shape[0]} samples, {X_train.shape[1]} features")
print(f"Test set: {X_test.shape[0]} samples")

## Training

In [None]:
model = get_model('svm')
model = train_model(model, X_train, y_train)

print("Model trained successfully!")
print(f"Kernel: {model.kernel}")
print(f"C: {model.C}")
print(f"Number of support vectors: {sum(model.n_support_)}")

## Evaluation

In [None]:
metrics = evaluate_model(model, X_test, y_test)

print("Performance Metrics:")
print("-" * 30)
for name, value in metrics.items():
    if value is not None:
        print(f"{name.capitalize():12} {value:.4f}")

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

plot_confusion_matrix(
    model, X_test, y_test,
    class_names=['Good Credit', 'Bad Credit'],
    ax=axes[0],
    title='SVM - Confusion Matrix'
)

plot_roc_curve(model, X_test, y_test, ax=axes[1], label='SVM')

plt.tight_layout()
plt.show()

In [None]:
fig, ax = plt.subplots(figsize=(7, 6))
plot_precision_recall(model, X_test, y_test, ax=ax, label='SVM')
plt.tight_layout()
plt.show()

## Interpretability

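Note: An SVM with an RBF kernel has no coefficients in the original feature space, so it offers no direct feature importance. We use SHAP's KernelExplainer instead, which is model-agnostic but slow, so the explanation is computed on at most 50 test samples.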

In [None]:
# SHAP values (using KernelExplainer - may be slow)
print("Computing SHAP values (this may take a moment)...")
shap_data = compute_shap_values(model, X_test, feature_names=feature_names, max_samples=50)

fig, ax = plt.subplots(figsize=(10, 8))
plot_shap_summary(shap_data, plot_type='bar', max_display=15)
plt.title('SVM - SHAP Feature Importance')
plt.tight_layout()
plt.show()

## Hyperparameter Tuning

In [None]:
tuning_results = tune_hyperparameters(
    'svm',
    X_train, y_train,
    method='grid',
    cv=5,
    scoring='f1'
)

print("Best Parameters:")
print(tuning_results['best_params'])
print(f"\nBest CV F1 Score: {tuning_results['best_score']:.4f}")

In [None]:
tuned_model = tuning_results['best_model']
tuned_metrics = evaluate_model(tuned_model, X_test, y_test)

print("\nTuned Model Performance:")
print("-" * 30)
for name, value in tuned_metrics.items():
    if value is not None:
        print(f"{name.capitalize():12} {value:.4f}")

## Learning Curve

In [None]:
lc_model = get_model('svm')
lc_data = get_learning_curve_data(lc_model, X_train, y_train, cv=5, scoring='f1')

fig, ax = plt.subplots(figsize=(8, 6))
plot_learning_curve(lc_data, ax=ax, title='SVM - Learning Curve')
plt.tight_layout()
plt.show()

## Calibration

In [None]:
fig, ax = plt.subplots(figsize=(7, 6))
plot_calibration(model, X_test, y_test, ax=ax, label='SVM')
plt.tight_layout()
plt.show()

## Save Model

In [None]:
save_path = save_model(model, 'svm')
print(f"Model saved to: {save_path}")

## Summary

### Key Takeaways

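1. **Performance**: The SVM's results on this dataset depend on proper feature scaling; the Evaluation section reports the test-set metrics
2. **Support Vectors**: Predictions depend only on the support vectors, a subset of the training points
3. **Kernel Choice**: The RBF kernel can capture non-linear patterns that a linear kernel cannot
4. **Calibration**: SVM scores may need probability calibration before they can be treated as reliable probability estimates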
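
The support-vector takeaway can be checked numerically. The sketch below assumes a binary scikit-learn `SVC` with an explicit `gamma` on synthetic data. It rebuilds the decision function from the fitted support vectors, dual coefficients and intercept alone, then compares it with `decision_function`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

gamma = 0.1  # fixed explicitly so the kernel can be recomputed by hand
clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)

# Kernel between every sample and every support vector: (n_samples, n_SV).
K = rbf_kernel(X, clf.support_vectors_, gamma=gamma)

# f(x) = sum_i dual_coef_i * K(x, sv_i) + intercept (binary case).
manual = K @ clf.dual_coef_.ravel() + clf.intercept_[0]

print(f"{len(clf.support_)} of {len(X)} training points are support vectors")
print("matches decision_function:", np.allclose(manual, clf.decision_function(X)))
```

No other training point enters the computation, which is what makes the fitted model compact.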

### Recommendations

- Always scale features before training an SVM
- Try different kernels (linear for interpretability, RBF for flexibility)
- Use `CalibratedClassifierCV` for better probability estimates
- Consider a linear SVM (for example `LinearSVC`) for large datasets; it trains much faster than a kernel SVM
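
The last two recommendations can be combined. As a minimal sketch, assuming plain scikit-learn and synthetic data rather than the `creditclass` helpers, a scaled `LinearSVC` can be wrapped in `CalibratedClassifierCV` to obtain probability estimates:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic, mildly imbalanced stand-in data (an assumption for illustration).
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.7, 0.3], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# LinearSVC has no predict_proba; the calibrator maps its decision scores
# to probabilities with a sigmoid (Platt scaling) fitted on held-out folds.
base = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))
calibrated = CalibratedClassifierCV(base, method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)

proba = calibrated.predict_proba(X_te)[:, 1]
brier = brier_score_loss(y_te, proba)
print(f"Brier score: {brier:.3f}")  # lower is better
```

`method="isotonic"` is the other built-in option; it is more flexible but generally needs more data to avoid overfitting.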