# Comprehensive Model Comparison for Heart Disease Diagnosis

In this notebook, we compare Support Vector Machines (SVM) with five other machine learning models for heart disease diagnosis, using a synthetic demonstration dataset. The comparison is intended to inform model selection for clinical decision support in cardiovascular medicine.

## Medical Context

Heart disease remains the leading cause of death globally. Accurate early diagnosis is essential for:
- **Early intervention**: Preventing progression to severe cardiovascular events
- **Treatment planning**: Selecting appropriate therapeutic interventions
- **Risk stratification**: Identifying high-risk patients requiring intensive monitoring
- **Healthcare resource allocation**: Optimizing clinical workflows and resources

## Model Selection Criteria

For medical diagnosis, we prioritize:
1. **Sensitivity (Recall)**: Ability to correctly identify patients with heart disease (minimize false negatives)
2. **Specificity**: Ability to correctly identify healthy patients (minimize false positives)
3. **Interpretability**: Understanding why the model makes specific predictions
4. **Robustness**: Consistent performance across different patient populations
5. **Clinical applicability**: Practical implementation in healthcare settings
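
The first two criteria have simple definitions in terms of confusion-matrix counts: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The following is a minimal sketch in plain Python. The helper name `sensitivity_specificity` and the example labels are illustrative assumptions, not part of the notebook's pipeline.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = disease."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # disease, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # disease, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative labels only (not real patient data)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"Sensitivity: {sens:.2f}, Specificity: {spec:.2f}")
```

Here one of four diseased patients is missed (sensitivity 0.75) and two of six healthy patients are flagged (specificity about 0.67).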

## Models Under Evaluation

1. **Support Vector Machine (SVM)**: Multiple kernel configurations
2. **Logistic Regression**: Linear probabilistic model with medical interpretability
3. **Random Forest**: Ensemble method with feature importance insights
4. **Naive Bayes**: Probabilistic model with strong independence assumptions
5. **K-Nearest Neighbors (KNN)**: Instance-based learning for pattern recognition
6. **Decision Tree**: Interpretable rule-based model for clinical decision support

In [None]:
# Import necessary libraries for comprehensive medical model comparison
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report, roc_auc_score, roc_curve
)
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
import warnings
warnings.filterwarnings('ignore')

# Set visualization style for medical presentations
plt.style.use('default')
sns.set_palette("husl")
fig_size = (12, 8)

print("📊 Libraries loaded successfully for comprehensive heart disease model comparison")
print("🏥 Focus: Medical diagnosis and clinical decision support")

In [None]:
# Build a synthetic heart disease dataset for the model comparison
print("🫀 Loading Heart Disease Dataset for Model Comparison")
print("=" * 60)

# For demonstration, we generate a synthetic dataset whose feature names
# follow the UCI heart disease data.
# In practice, this would load from your actual medical database.
np.random.seed(42)
n_samples = 1000

# Sample each feature independently from a plausible marginal distribution
data = {
    'age': np.random.normal(54, 12, n_samples),
    'sex': np.random.binomial(1, 0.68, n_samples),  # Male predominance
    'cp': np.random.choice([0, 1, 2, 3], n_samples, p=[0.47, 0.16, 0.29, 0.08]),
    'trestbps': np.random.normal(131, 17, n_samples),
    'chol': np.random.normal(246, 51, n_samples),
    'fbs': np.random.binomial(1, 0.15, n_samples),
    'restecg': np.random.choice([0, 1, 2], n_samples, p=[0.48, 0.48, 0.04]),
    'thalach': np.random.normal(150, 22, n_samples),
    'exang': np.random.binomial(1, 0.33, n_samples),
    'oldpeak': np.random.exponential(1.0, n_samples),
    'slope': np.random.choice([0, 1, 2], n_samples, p=[0.21, 0.14, 0.65]),
    'ca': np.random.choice([0, 1, 2, 3], n_samples, p=[0.59, 0.21, 0.13, 0.07]),
    'thal': np.random.choice([0, 1, 2, 3], n_samples, p=[0.02, 0.18, 0.17, 0.63])
}

# Create DataFrame
df_heart = pd.DataFrame(data)

# Clip values to realistic medical ranges
df_heart['age'] = np.clip(df_heart['age'], 29, 77)
df_heart['trestbps'] = np.clip(df_heart['trestbps'], 94, 200)
df_heart['chol'] = np.clip(df_heart['chol'], 126, 564)
df_heart['thalach'] = np.clip(df_heart['thalach'], 71, 202)
df_heart['oldpeak'] = np.clip(df_heart['oldpeak'], 0, 6.2)

# Generate target based on medical risk factors with realistic correlations
risk_score = (
    0.15 * (df_heart['age'] - 29) / 48 +  # Age factor
    0.20 * df_heart['sex'] +  # Male risk
    0.25 * (df_heart['cp'] == 0) +  # Asymptomatic chest pain
    0.15 * (df_heart['trestbps'] > 140) +  # Hypertension (indicator, weight 0.15)
    0.10 * (df_heart['chol'] > 240) +  # High cholesterol (indicator, weight 0.10)
    0.15 * df_heart['exang'] +  # Exercise angina
    0.20 * (df_heart['oldpeak'] > 2) +  # ST depression
    0.25 * (df_heart['ca'] > 0) +  # Vessel blockage
    0.30 * (df_heart['thal'] == 2)  # Fixed defect
)

# Convert to binary classification with medical threshold
target_prob = 1 / (1 + np.exp(-5 * (risk_score - 0.5)))
df_heart['target'] = np.random.binomial(1, target_prob, n_samples)

# Prepare features and target
X = df_heart.drop('target', axis=1)
y = df_heart['target']

print(f"📈 Dataset Shape: {X.shape}")
print(f"🎯 Target Distribution:")
print(f"   - No Heart Disease (0): {(y == 0).sum()} patients ({(y == 0).mean():.1%})")
print(f"   - Heart Disease (1): {(y == 1).sum()} patients ({(y == 1).mean():.1%})")
print(f"\n🏥 Medical Features: {list(X.columns)}")

# Split data with stratification for medical validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"\n📊 Data Split:")
print(f"   - Training: {X_train.shape[0]} patients")
print(f"   - Testing: {X_test.shape[0]} patients")
print(f"   - Feature scaling will be applied for SVM, Logistic Regression, and KNN models")
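
The target-generation step above maps the weighted risk score to a disease probability with a logistic function of the form 1 / (1 + exp(-k (r - m))), using k = 5 and m = 0.5. The sketch below reproduces that mapping for a few risk scores in plain Python. The function name `risk_to_probability` and the chosen sample scores are assumptions made for illustration.

```python
import math


def risk_to_probability(risk_score, steepness=5.0, midpoint=0.5):
    """Logistic transform: 1 / (1 + exp(-steepness * (risk_score - midpoint)))."""
    return 1.0 / (1.0 + math.exp(-steepness * (risk_score - midpoint)))


# A risk score equal to the midpoint maps to a probability of exactly 0.5;
# scores below it map to lower probabilities, scores above it to higher ones.
for r in (0.0, 0.5, 1.0, 1.5):
    print(f"risk_score={r:.1f} -> P(disease)={risk_to_probability(r):.3f}")
```

Because the binary target is then drawn from a Bernoulli distribution with this probability, even a perfect model cannot reach 100% accuracy on this dataset.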

In [None]:
# Initialize comprehensive set of models for medical diagnosis comparison
print("🔬 Initializing Medical Diagnosis Models")
print("=" * 50)

# Define models with medical-appropriate configurations
models = {
    # Support Vector Machines with different kernels
    'SVM (Linear)': SVC(kernel='linear', probability=True, random_state=42),
    'SVM (RBF)': SVC(kernel='rbf', probability=True, random_state=42),
    'SVM (Polynomial)': SVC(kernel='poly', degree=3, probability=True, random_state=42),
    
    # Traditional statistical models
    'Logistic Regression': LogisticRegression(random_state=42, max_iter=1000),
    
    # Ensemble methods
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    
    # Probabilistic models
    'Naive Bayes': GaussianNB(),
    
    # Instance-based learning
    'K-Nearest Neighbors': KNeighborsClassifier(n_neighbors=5),
    
    # Tree-based interpretable model
    'Decision Tree': DecisionTreeClassifier(random_state=42, max_depth=10)
}

print(f"📋 Models configured: {len(models)}")
for model_name in models.keys():
    print(f"   ✓ {model_name}")

# Initialize results storage for comprehensive medical evaluation
results = {
    'Model': [],
    'Accuracy': [],
    'Sensitivity (Recall)': [],  # Critical for medical diagnosis
    'Specificity': [],  # Important for avoiding false alarms
    'Precision (PPV)': [],  # Positive Predictive Value
    'F1-Score': [],
    'ROC-AUC': [],
    'CV_Mean': [],  # Cross-validation mean
    'CV_Std': []   # Cross-validation standard deviation
}

print("\n📊 Evaluation metrics configured for medical diagnosis:")
print("   • Sensitivity (Recall): Ability to detect heart disease")
print("   • Specificity: Ability to rule out heart disease")
print("   • Precision (PPV): Proportion of positive predictions that are correct")
print("   • ROC-AUC: Overall discriminative ability")
print("   • Cross-validation: Robustness assessment")
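
Unlike sensitivity and specificity, precision (PPV) depends on how common the disease is in the tested population. The following sketch applies Bayes' rule to show this. The helper name `ppv` and the sensitivity, specificity, and prevalence values are hypothetical and chosen only for illustration.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Same hypothetical test characteristics, different disease prevalence
for prevalence in (0.05, 0.20, 0.50):
    value = ppv(sensitivity=0.90, specificity=0.80, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={value:.3f}")
```

A model evaluated on a roughly balanced dataset can therefore show a much lower PPV when used for screening in a low-prevalence population.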

In [None]:
# Comprehensive model training and evaluation for heart disease diagnosis
print("🏥 Training and Evaluating Models for Heart Disease Diagnosis")
print("=" * 65)

# Prepare data scaling for models that require it
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Models that require scaling
scaling_required = ['SVM (Linear)', 'SVM (RBF)', 'SVM (Polynomial)', 
                   'Logistic Regression', 'K-Nearest Neighbors']

# Cross-validation setup for robust evaluation
cv_folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Train and evaluate each model
for model_name, model in models.items():
    print(f"\n🔬 Training {model_name}...")
    
    # Select appropriate data (scaled or original)
    if model_name in scaling_required:
        X_train_model, X_test_model = X_train_scaled, X_test_scaled
        print(f"   📊 Using scaled features for {model_name}")
    else:
        X_train_model, X_test_model = X_train, X_test
        print(f"   📊 Using original features for {model_name}")
    
    # Train the model
    model.fit(X_train_model, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test_model)
    y_pred_proba = model.predict_proba(X_test_model)[:, 1] if hasattr(model, 'predict_proba') else None
    
    # Calculate comprehensive medical metrics
    accuracy = accuracy_score(y_test, y_pred)
    sensitivity = recall_score(y_test, y_pred)  # True Positive Rate
    precision = precision_score(y_test, y_pred)  # Positive Predictive Value
    f1 = f1_score(y_test, y_pred)
    
    # Calculate specificity (True Negative Rate)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    specificity = tn / (tn + fp)
    
    # Calculate ROC-AUC if probability predictions available
    if y_pred_proba is not None:
        roc_auc = roc_auc_score(y_test, y_pred_proba)
    else:
        roc_auc = roc_auc_score(y_test, y_pred)
    
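    # Cross-validation for robustness assessment
    # Note: for scaled models the scaler was fit on the full training set,
    # so the fold scores are slightly optimistic
    cv_scores = cross_val_score(model, X_train_model, y_train, cv=cv_folds, scoring='accuracy')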
    cv_mean = cv_scores.mean()
    cv_std = cv_scores.std()
    
    # Store results
    results['Model'].append(model_name)
    results['Accuracy'].append(accuracy)
    results['Sensitivity (Recall)'].append(sensitivity)
    results['Specificity'].append(specificity)
    results['Precision (PPV)'].append(precision)
    results['F1-Score'].append(f1)
    results['ROC-AUC'].append(roc_auc)
    results['CV_Mean'].append(cv_mean)
    results['CV_Std'].append(cv_std)
    
    # Print medical performance summary
    print(f"   ✅ Accuracy: {accuracy:.3f}")
    print(f"   🎯 Sensitivity: {sensitivity:.3f} (ability to detect heart disease)")
    print(f"   🛡️  Specificity: {specificity:.3f} (ability to rule out heart disease)")
    print(f"   📊 ROC-AUC: {roc_auc:.3f}")
    print(f"   🔄 CV Score: {cv_mean:.3f} ± {cv_std:.3f}")

print("\n✅ All models trained and evaluated successfully!")
print("📈 Comprehensive medical performance metrics calculated")
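
The loop above scales the training data once and then cross-validates on the scaled array, so each validation fold has already influenced the scaler. One common alternative is to wrap the scaler and the classifier in a scikit-learn `Pipeline`, which refits the scaler inside every fold. The sketch below assumes synthetic stand-in data from `make_classification` rather than `df_heart`, and scores recall instead of accuracy to match the emphasis on sensitivity.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption for this sketch, not df_heart)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=42
)

# The scaler is refit on the training portion of each fold
svm_pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=42))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
recall_scores = cross_val_score(svm_pipeline, X_demo, y_demo, cv=cv, scoring="recall")

print(f"Per-fold sensitivity: {recall_scores.round(3)}")
print(f"Mean ± std: {recall_scores.mean():.3f} ± {recall_scores.std():.3f}")
```

The same pipeline object can be passed to `fit` and `predict` directly, which removes the need to track scaled and unscaled copies of the data.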

In [None]:
# Display comprehensive model comparison results
print("📊 Comprehensive Model Comparison Results for Heart Disease Diagnosis")
print("=" * 75)

# Convert results to DataFrame
results_df = pd.DataFrame(results)

# Sort by ROC-AUC (a threshold-independent summary of discriminative ability)
results_df = results_df.sort_values('ROC-AUC', ascending=False)

# Display results with medical interpretation
print("\n🏆 Model Rankings (sorted by ROC-AUC):")
print(results_df.round(3))

# Highlight top performers
print(f"\n🥇 Top Model by ROC-AUC: {results_df.iloc[0]['Model']}")
print(f"   ROC-AUC: {results_df.iloc[0]['ROC-AUC']:.3f}")
print(f"   Sensitivity: {results_df.iloc[0]['Sensitivity (Recall)']:.3f}")
print(f"   Specificity: {results_df.iloc[0]['Specificity']:.3f}")

# Find model with highest sensitivity (most important for medical screening)
best_sensitivity_idx = results_df['Sensitivity (Recall)'].idxmax()
print(f"\n🎯 Best Sensitivity Model: {results_df.loc[best_sensitivity_idx, 'Model']}")
print(f"   Sensitivity: {results_df.loc[best_sensitivity_idx, 'Sensitivity (Recall)']:.3f}")
print(f"   (Best at detecting heart disease - minimizes missed diagnoses)")

# Find model with highest specificity (important for avoiding false alarms)
best_specificity_idx = results_df['Specificity'].idxmax()
print(f"\n🛡️  Best Specificity Model: {results_df.loc[best_specificity_idx, 'Model']}")
print(f"   Specificity: {results_df.loc[best_specificity_idx, 'Specificity']:.3f}")
print(f"   (Best at ruling out heart disease - minimizes false alarms)")

# Interpretation guidelines (rules of thumb, not formal clinical standards)
print("\n🏥 Clinical Interpretation Guidelines (rules of thumb):")
print("   • Sensitivity > 0.90: Excellent for screening (few missed cases)")
print("   • Specificity > 0.80: Good for avoiding unnecessary procedures")
print("   • ROC-AUC > 0.85: Clinically acceptable discriminative ability")
print("   • CV Std < 0.05: Consistent performance across cross-validation folds")
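
A model's sensitivity is not fixed: it depends on the probability cutoff used to call a patient positive, and the default of 0.5 is only one choice. The sketch below picks the highest cutoff that still meets a target sensitivity. The helper `threshold_for_sensitivity` and the scores are hypothetical, written in plain Python for illustration.

```python
def threshold_for_sensitivity(y_true, y_scores, target_sensitivity=0.90):
    """Return the highest threshold whose sensitivity meets the target.

    A patient is predicted positive when score >= threshold.
    """
    positives = sum(1 for t in y_true if t == 1)
    if positives == 0:
        raise ValueError("y_true contains no positive cases")
    # Walk candidate thresholds from the highest score downward
    for candidate in sorted(set(y_scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_scores) if t == 1 and s >= candidate)
        if tp / positives >= target_sensitivity:
            return candidate
    raise ValueError("target_sensitivity must be at most 1.0")


# Illustrative predicted probabilities (not output from the models above)
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_scores = [0.95, 0.80, 0.70, 0.45, 0.30, 0.60, 0.40, 0.35, 0.20, 0.10]

threshold = threshold_for_sensitivity(y_true, y_scores, target_sensitivity=0.90)
negatives_cleared = sum(1 for t, s in zip(y_true, y_scores) if t == 0 and s < threshold)
print(f"Threshold: {threshold:.2f}")
print(f"Specificity at that threshold: {negatives_cleared / 5:.2f}")
```

In this example, reaching the sensitivity target requires lowering the cutoff to 0.30, which clears only two of the five healthy patients. In practice the cutoff should be chosen on validation data, not on the test set.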

In [None]:
# Comprehensive visualization of model performance for medical diagnosis
print("📊 Creating Comprehensive Medical Performance Visualizations")
print("=" * 60)

# Create a comprehensive comparison plot
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
fig.suptitle('Comprehensive Model Comparison for Heart Disease Diagnosis', 
             fontsize=16, fontweight='bold')

# 1. ROC-AUC Comparison
ax1 = axes[0, 0]
sns.barplot(data=results_df, x='ROC-AUC', y='Model', palette='viridis', ax=ax1)
ax1.set_title('ROC-AUC: Overall Discriminative Ability', fontweight='bold')
ax1.set_xlabel('ROC-AUC Score')
ax1.axvline(x=0.85, color='red', linestyle='--', alpha=0.7, label='ROC-AUC > 0.85')
ax1.legend()

# Add value labels
for i, v in enumerate(results_df['ROC-AUC']):
    ax1.text(v + 0.01, i, f'{v:.3f}', va='center', fontweight='bold')

# 2. Sensitivity vs Specificity Trade-off
ax2 = axes[0, 1]
scatter = ax2.scatter(results_df['Sensitivity (Recall)'], results_df['Specificity'], 
                    s=100, alpha=0.7, c=results_df['ROC-AUC'], 
                    cmap='viridis', edgecolors='black')
ax2.set_xlabel('Sensitivity (Recall)')
ax2.set_ylabel('Specificity')
ax2.set_title('Sensitivity vs Specificity Trade-off', fontweight='bold')
ax2.grid(True, alpha=0.3)

# Add model labels
for i, model in enumerate(results_df['Model']):
    ax2.annotate(model, (results_df['Sensitivity (Recall)'].iloc[i], 
                        results_df['Specificity'].iloc[i]),
                xytext=(5, 5), textcoords='offset points', fontsize=8)

# Add ideal region
ax2.axhline(y=0.8, color='red', linestyle='--', alpha=0.5, label='Specificity > 0.8')
ax2.axvline(x=0.9, color='red', linestyle='--', alpha=0.5, label='Sensitivity > 0.9')
ax2.legend()

# 3. Cross-Validation Stability
ax3 = axes[1, 0]
err_bars = ax3.barh(range(len(results_df)), results_df['CV_Mean'], 
                   xerr=results_df['CV_Std'], capsize=5, alpha=0.7)
ax3.set_yticks(range(len(results_df)))
ax3.set_yticklabels(results_df['Model'])
ax3.set_xlabel('Cross-Validation Accuracy')
ax3.set_title('Model Stability (Cross-Validation)', fontweight='bold')
ax3.grid(True, alpha=0.3)

# 4. Top 3 Models: Grouped Bar Comparison
ax4 = axes[1, 1]
# Select the top 3 models by ROC-AUC
top_models = results_df.head(3)
metrics = ['Accuracy', 'Sensitivity (Recall)', 'Specificity', 'Precision (PPV)', 'F1-Score']

# All metrics already lie on a 0-1 scale, so no normalization is needed
metric_values = top_models[metrics].values

# Grouped bar chart: one group per metric, one bar per model
model_names = top_models['Model'].values
x_pos = np.arange(len(metrics))
width = 0.25

for i, model in enumerate(model_names):
    ax4.bar(x_pos + i*width, metric_values[i], width, 
           label=model, alpha=0.8)

ax4.set_xlabel('Metrics')
ax4.set_ylabel('Score')
ax4.set_title('Top 3 Models: Detailed Comparison', fontweight='bold')
ax4.set_xticks(x_pos + width)
ax4.set_xticklabels(metrics, rotation=45, ha='right')
ax4.legend()
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Additional summary statistics
print("\n📈 Performance Summary Statistics:")
print(f"   Mean ROC-AUC across all models: {results_df['ROC-AUC'].mean():.3f}")
print(f"   Best ROC-AUC: {results_df['ROC-AUC'].max():.3f}")
print(f"   Mean Sensitivity: {results_df['Sensitivity (Recall)'].mean():.3f}")
print(f"   Mean Specificity: {results_df['Specificity'].mean():.3f}")
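
With a test set of 200 patients, point estimates such as sensitivity carry noticeable sampling uncertainty. One way to quantify it is a percentile bootstrap interval. The following sketch uses only the standard library. The helper `bootstrap_sensitivity_ci` and the example predictions are assumptions made for illustration, not results from the models above.

```python
import random


def bootstrap_sensitivity_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for sensitivity (recall of class 1)."""
    rng = random.Random(seed)
    pairs = list(zip(y_true, y_pred))
    estimates = []
    for _ in range(n_resamples):
        # Resample patients with replacement
        sample = [rng.choice(pairs) for _ in pairs]
        preds_for_diseased = [p for t, p in sample if t == 1]
        if preds_for_diseased:  # skip resamples without any positive case
            estimates.append(sum(preds_for_diseased) / len(preds_for_diseased))
    if not estimates:
        raise ValueError("no resample contained a positive case")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Illustrative predictions: 40 diseased patients (32 detected), 60 healthy
y_true = [1] * 40 + [0] * 60
y_pred = [1] * 32 + [0] * 8 + [0] * 50 + [1] * 10

lower, upper = bootstrap_sensitivity_ci(y_true, y_pred)
print(f"Sensitivity: 0.80, 95% bootstrap interval: [{lower:.2f}, {upper:.2f}]")
```

Wide or overlapping intervals between two models suggest that their ranking may not be stable on a test set of this size.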

## Conclusion

In this notebook, we compared SVM (linear, RBF, and polynomial kernels) with Logistic Regression, Random Forest, Naive Bayes, K-Nearest Neighbors, and Decision Tree classifiers on a synthetic heart disease diagnosis dataset. The results indicate that SVM can be competitive with other models, depending on the dataset and the chosen hyperparameters.