# AI Disease Prediction - Model Comparison Analysis

This notebook compares several classification algorithms for disease prediction from blood test data.

## Project Overview
- **Objective**: Compare classification models for disease prediction
- **Focus**: Maximize recall (sensitivity) to minimize false negatives, i.e. missed disease cases
- **Data**: Blood test parameters
- **Models**: Random Forest, XGBoost, SVM, Logistic Regression, etc.
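
To make the recall focus concrete: recall is TP / (TP + FN), so every missed disease case (a false negative) lowers it directly. The sketch below uses made-up labels, and `recall_from_labels` is a hypothetical helper written for illustration, not part of the project modules.

```python
def recall_from_labels(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (disease) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        return 0.0  # no positive cases: recall is undefined, report 0.0 here
    return tp / (tp + fn)


# Made-up labels for illustration: 1 = disease, 0 = healthy
demo_true = [1, 1, 1, 1, 0, 0, 0, 0]
demo_pred = [1, 1, 1, 0, 0, 0, 1, 0]

demo_recall = recall_from_labels(demo_true, demo_pred)
print(f"Recall: {demo_recall:.2f}")
```

In this toy case one of the four disease cases is missed, so recall is 3/4. The false alarm on a healthy case does not affect recall at all, which is why precision is tracked separately later in the notebook.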

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

# Import project modules
import sys
sys.path.append('../src')

from data_preprocessing import load_and_preprocess_data
from model_comparison import compare_classification_models
from evaluation import evaluate_multiple_models
from visualization import create_comprehensive_visualization_report

warnings.filterwarnings('ignore')

# Set up plotting
plt.style.use('default')
sns.set_palette("husl")
%matplotlib inline

## 1. Data Loading and Preprocessing

In [None]:
# Generate sample data if it doesn't exist
import os
if not os.path.exists('../data/sample_blood_data.csv'):
    print("Generating sample dataset...")
    from generate_sample_data import save_sample_datasets
    save_sample_datasets(output_dir='../data')

# Load and preprocess data
print("Loading and preprocessing data...")
X_train, X_test, y_train, y_test, feature_names, scaler = load_and_preprocess_data(
    '../data/sample_blood_data.csv',
    target_column='Disease',
    test_size=0.2,
    apply_balancing=True,
    random_state=42
)

print(f"Training set shape: {X_train.shape}")
print(f"Test set shape: {X_test.shape}")
print(f"Number of features: {len(feature_names)}")
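
The internals of `load_and_preprocess_data` are not shown in this notebook. As a rough sketch of what a leakage-safe version of these steps could look like, the hypothetical helper below does a stratified split, fits the scaler on the training split only, and balances only the training data by random oversampling. The function name, the synthetic data, and the choice of oversampling are all assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def split_scale_balance(X, y, test_size=0.2, random_state=42):
    """Hypothetical helper: stratified split, train-only scaling and oversampling."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )

    # Fit the scaler on the training split only, then reuse it on the test split
    fitted_scaler = StandardScaler().fit(X_tr)
    X_tr = fitted_scaler.transform(X_tr)
    X_te = fitted_scaler.transform(X_te)

    # Random oversampling of the smaller classes, applied to training data only
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y_tr, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y_tr == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X_tr[keep], X_te, y_tr[keep], y_te, fitted_scaler


# Synthetic, imbalanced data for illustration (90 healthy, 10 disease)
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.normal(size=(100, 4))
y_demo = np.array([0] * 90 + [1] * 10)

Xtr, Xte, ytr, yte, demo_scaler = split_scale_balance(X_demo, y_demo)
print("Training class counts:", np.bincount(ytr))
print("Test class counts:", np.bincount(yte))
```

Keeping the test split at its original class ratio means the reported recall and precision reflect the unbalanced data rather than the resampled training set.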

## 2. Model Comparison

In [None]:
# Compare classification models
print("Comparing classification models...")
comparison_results = compare_classification_models(
    X_train, X_test, y_train, y_test,
    cv_folds=5,
    random_state=42
)
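
For readers without access to `compare_classification_models`, the sketch below shows one plausible way to run a recall-scored 5-fold comparison with scikit-learn. It is an assumption about the general approach, not the project's implementation; the data is synthetic and only two models are included.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the blood test data (roughly 30% positive class)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, weights=[0.7, 0.3], random_state=42
)

candidate_models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
}

# Stratified folds keep the class ratio similar in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_recall = {}
for name, model in candidate_models.items():
    scores = cross_val_score(model, X_demo, y_demo, cv=cv, scoring="recall")
    cv_recall[name] = (scores.mean(), scores.std())
    print(f"{name}: recall = {scores.mean():.3f} ± {scores.std():.3f}")
```

Reporting the standard deviation alongside the mean helps judge whether a difference in recall between two models is larger than the fold-to-fold noise.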

In [None]:
# Display cross-validation results
print("Cross-Validation Results:")
print("=" * 50)
cv_results = comparison_results['cv_results']
display(cv_results)

In [None]:
# Display test results
print("Test Set Results:")
print("=" * 30)
test_results = comparison_results['test_results']
display(test_results.sort_values('Recall', ascending=False))

## 3. Best Model Analysis

In [None]:
# Display best model information
best_model_name = comparison_results['best_model_name']
best_model = comparison_results['best_model']
best_recall = comparison_results['best_recall']

print("Best Model by Recall Score:")
print(f"Model: {best_model_name}")
print(f"Recall Score: {best_recall:.4f}")

# Detailed evaluation
from evaluation import create_evaluation_report

evaluation_report = create_evaluation_report(
    best_model, X_test, y_test,
    model_name=best_model_name,
    feature_names=feature_names
)

## 4. Feature Importance Analysis

In [None]:
# Feature importance for the best model
feature_importance = evaluation_report['feature_importance']

if feature_importance is not None:
    # Limit the table and the plot to the ten rows announced in the header
    top_features = feature_importance.head(10)

    print("Top 10 Most Important Features:")
    print("=" * 35)
    display(top_features)

    # Plot feature importance
    plt.figure(figsize=(10, 6))
    plt.barh(range(len(top_features)), top_features['importance'])
    plt.yticks(range(len(top_features)), top_features['feature'])
    plt.gca().invert_yaxis()  # draw the first row at the top of the chart
    plt.xlabel('Importance Score')
    plt.title(f'Feature Importance - {best_model_name}')
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
else:
    print(f"Feature importance not available for {best_model_name}")
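
When the best model exposes neither `feature_importances_` nor `coef_` (an RBF-kernel SVM, for example), permutation importance is one model-agnostic alternative: shuffle one feature at a time and measure how much the score drops. The sketch below uses scikit-learn's `permutation_importance` on synthetic data with recall as the score. The feature names are placeholders, and this is not what `create_evaluation_report` is known to do.

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data with placeholder feature names
X_demo, y_demo = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=42
)
demo_features = [f"feature_{i}" for i in range(X_demo.shape[1])]
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=42
)

# An RBF-kernel SVM has no built-in importance attribute
svm = SVC(kernel="rbf").fit(Xtr, ytr)

# Shuffle each feature 10 times and record the mean drop in recall
result = permutation_importance(
    svm, Xte, yte, scoring="recall", n_repeats=10, random_state=42
)

ranking = sorted(
    zip(demo_features, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, drop in ranking:
    print(f"{name}: mean recall drop = {drop:.3f}")
```

Because the score is recall, the ranking reflects which features matter most for catching disease cases, which matches the notebook's stated focus.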

## 5. Comprehensive Visualization

In [None]:
# Create comprehensive visualization report
models_dict = comparison_results['trained_models']

# Collect feature importance for models that expose it
from evaluation import ModelEvaluator

feature_importance_dict = {}
for model_name, model in models_dict.items():
    if not (hasattr(model, 'feature_importances_') or hasattr(model, 'coef_')):
        continue
    try:
        evaluator = ModelEvaluator(model, model_name)
        importance = evaluator.get_feature_importance(feature_names, top_n=10)
        feature_importance_dict[model_name] = importance
    except Exception as exc:
        # Report the failure instead of silently skipping the model
        print(f"Skipping feature importance for {model_name}: {exc}")

# Create visualization report
figures = create_comprehensive_visualization_report(
    models_dict, test_results, X_test, y_test,
    feature_importance_dict, save_directory=None
)

## 6. Model Performance Summary

In [None]:
# Summary table with key metrics
summary_metrics = ['Accuracy', 'Precision', 'Recall', 'F1-Score', 'ROC-AUC']
summary_df = test_results[summary_metrics].round(4)

print("Model Performance Summary (Test Set):")
print("=" * 45)
display(summary_df)

# Highlight best performers
print("\nTop 3 Models by Recall Score:")
print("-" * 35)
top_3_recall = summary_df.sort_values('Recall', ascending=False).head(3)
for idx, (model, row) in enumerate(top_3_recall.iterrows(), 1):
    print(f"{idx}. {model}: Recall = {row['Recall']:.4f}")

## 7. Medical Interpretation

In [None]:
# Medical interpretation of results
print("MEDICAL INTERPRETATION:")
print("=" * 25)

best_metrics = evaluation_report['evaluation_metrics']

print(f"Best Model: {best_model_name}")
print(f"Recall (Sensitivity): {best_metrics['recall']:.4f}")
print(f"Precision (PPV): {best_metrics['precision']:.4f}")
print(f"Specificity: {best_metrics['specificity']:.4f}")
print(f"NPV: {best_metrics['npv']:.4f}")
print()

# Clinical significance
recall = best_metrics['recall']
precision = best_metrics['precision']

print("Clinical Significance:")
print("-" * 20)

if recall >= 0.9:
    print("✓ EXCELLENT RECALL: Model catches 90%+ of disease cases")
    print("  → Very good for screening and early detection")
elif recall >= 0.8:
    print("✓ GOOD RECALL: Model catches 80%+ of disease cases")
    print("  → Acceptable for most diagnostic applications")
else:
    print("⚠ LOW RECALL: Model misses more than 20% of disease cases")
    print("  → Needs improvement before clinical use")

if precision >= 0.8:
    print("✓ HIGH PRECISION: Few false alarms")
    print("  → Reduces unnecessary worry and follow-up tests")
elif precision >= 0.6:
    print("○ MODERATE PRECISION: Some false alarms")
    print("  → Acceptable but may cause some unnecessary concern")
else:
    print("⚠ LOW PRECISION: Many false alarms")
    print("  → May cause excessive worry and unnecessary tests")

# Cost-benefit analysis
cost_benefit = evaluation_report['cost_benefit_analysis']
if cost_benefit:
    print("\nCost-Benefit Analysis:")
    print(f"Net Benefit per Case: ${cost_benefit['net_benefit_per_case']:.2f}")
    if cost_benefit['net_benefit'] > 0:
        print("✓ Model provides positive economic value")
    else:
        print("⚠ Model may not be cost-effective")
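
One caveat when reading PPV and NPV: unlike sensitivity and specificity, both depend on how common the disease is in the population being tested. If the evaluation data has a different class ratio than the clinical setting, the precision printed above may not carry over. The sketch below applies Bayes' rule at a hypothetical operating point; the numbers are illustrative and are not results from this notebook.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given disease prevalence via Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical operating point, not a result from this notebook
demo_sensitivity, demo_specificity = 0.90, 0.90

for prevalence in (0.50, 0.10, 0.01):
    ppv, npv = predictive_values(demo_sensitivity, demo_specificity, prevalence)
    print(f"Prevalence {prevalence:.0%}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With sensitivity and specificity both fixed at 0.90, PPV is 0.90 at 50% prevalence but only 0.50 at 10% prevalence, so a model that looks precise on balanced data can still produce many false alarms in a low-prevalence screening population.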

## 8. Recommendations for Team Project

### Key Findings

1. **Best Performing Model**: Selected by recall score (see Section 3)
2. **Recall Target**: >90% recall for medical screening
3. **Feature Importance**: The most predictive blood parameters are identified in Section 4
4. **Model Comparison**: All models are evaluated with 5-fold cross-validation and on the test set

### Next Steps for Team

**Person 1 (Data Preprocessing)**:
- Explore additional feature engineering techniques
- Investigate different imputation strategies
- Analyze feature correlations and interactions

**Person 2 (Model Implementation)**:
- Implement ensemble methods
- Try additional algorithms (Neural Networks, etc.)
- Explore model stacking approaches

**Person 3 (Hyperparameter Tuning)**:
- Perform extensive hyperparameter optimization
- Use advanced optimization techniques (e.g., Optuna, Bayesian optimization)
- Focus on recall-optimized tuning

**Person 4 (Evaluation & Documentation)**:
- Create detailed performance analysis
- Develop clinical interpretation guidelines
- Prepare presentation materials

### Project Deliverables

- [ ] Complete model comparison report
- [ ] Best model with >90% recall
- [ ] Feature importance analysis
- [ ] Clinical interpretation guide
- [ ] Interactive visualization dashboard
- [ ] Team presentation

### Success Metrics

- **Primary**: Recall Score > 0.90
- **Secondary**: Balanced precision and specificity
- **Tertiary**: Model interpretability and clinical relevance
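
One common way to pursue the primary recall target while keeping an eye on precision is to tune the decision threshold rather than rely on the default of 0.5. The sketch below picks the highest threshold that still reaches a target recall. `pick_threshold` is a hypothetical helper and the probabilities are made up; in practice the threshold would be chosen on validation data, not on the test set.

```python
def pick_threshold(y_true, y_prob, target_recall=0.90):
    """Return (threshold, recall, precision) for the highest threshold that
    reaches the target recall, or None if no threshold does."""
    positives = sum(y_true)
    if positives == 0:
        return None
    # Try candidate thresholds from strictest to most lenient
    for threshold in sorted(set(y_prob), reverse=True):
        y_pred = [1 if p >= threshold else 0 for p in y_prob]
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        recall = tp / positives
        if recall >= target_recall:
            precision = tp / (tp + fp)
            return threshold, recall, precision
    return None


# Made-up labels and predicted probabilities for illustration
demo_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
demo_prob = [0.95, 0.85, 0.70, 0.45, 0.30, 0.60, 0.40, 0.20, 0.10, 0.05]

threshold, recall, precision = pick_threshold(demo_true, demo_prob, target_recall=0.80)
print(f"Threshold {threshold:.2f}: recall = {recall:.2f}, precision = {precision:.2f}")
```

Choosing the highest qualifying threshold keeps false alarms as low as the recall target allows, which speaks to the secondary metric of balanced precision and specificity.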