# 🏥 CDC Diabetes Health Indicators: 3-Class Classification & Clustering Analysis

**Course:** WM9QG-15 - Fundamentals of Artificial Intelligence and Data Mining  
**Institution:** University of Warwick  
**Assessment:** Individual Project (70%)  

---

## 📋 Project Overview

**Context:** Contracted by a public health research institute to analyze the CDC BRFSS 2015 dataset (253,680 individuals).

**Research Questions:**
1. **Classification:** Can we predict diabetes diagnosis (No Diabetes, Prediabetes, Diabetes) based on health indicators?
2. **Clustering:** Can we identify meaningful at-risk population segments for targeted interventions?

**Dataset:** CDC Behavioral Risk Factor Surveillance System (BRFSS) 2015
- **Samples:** 253,680 survey responses
- **Features:** 21 health and demographic indicators
- **Target:** 3-class (0=No Diabetes, 1=Prediabetes, 2=Diabetes)
- **Challenge:** Severe class imbalance (84.3% No Diabetes, 1.8% Prediabetes, 13.9% Diabetes)
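
To make the severity of the imbalance concrete, the sketch below turns the class shares quoted above into "balanced" class weights, using the common heuristic `n_samples / (n_classes * n_c)`. The variable names are illustrative and the shares are the rounded figures from this overview, so the weights are approximate and are not the optimized weights this project searches for later.

```python
# Class shares quoted in the dataset summary (rounded fractions of all rows).
class_shares = {
    "No Diabetes": 0.843,
    "Prediabetes": 0.018,
    "Diabetes": 0.139,
}

n_classes = len(class_shares)

# "Balanced" heuristic: weight_c = n_samples / (n_classes * n_c)
#                                = 1 / (n_classes * share_c)
balanced_weights = {
    name: 1.0 / (n_classes * share) for name, share in class_shares.items()
}

for name, weight in balanced_weights.items():
    print(f"{name:12s} share={class_shares[name]:.3f}  weight={weight:.2f}")
```

Under this heuristic the Prediabetes class would be weighted roughly 47 times more heavily than the majority class, which is one reason a tuned weighting scheme may behave quite differently from the default.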

**Approach:**
- CRISP-DM-inspired methodology (naturally structured)
- Comprehensive comparison of 6 class imbalance strategies
- Custom class weight optimization via Optuna
- Feature selection using RFECV
- Both unsupervised (clustering) and supervised (classification) learning

---

## 🎯 Learning Outcomes Addressed

- **LO2:** Select and apply appropriate AI/ML algorithms
- **LO3:** Critically evaluate different tools and techniques
- **LO5:** Synthesize methodologies to articulate solution rationale

---

## 📊 Expected Deliverables

1. Population clustering for risk segmentation
2. Probabilistic classification model (predict_proba)
3. Performance comparison before/after imbalance handling
4. Optimal class weights via Bayesian optimization
5. Critical evaluation of methodology
6. Ethical and practical implications discussion

# 📑 Table of Contents

## 0️⃣ Setup & Configuration
- 0.1 Import Libraries
- 0.2 Load Configuration
- 0.3 Helper Functions

## 1️⃣ Data Understanding & Exploratory Analysis
- 1.1 Load and Inspect Data
- 1.2 Class Distribution Analysis
- 1.3 Feature Distributions
- 1.4 Correlation Analysis
- 1.5 Target Leakage Investigation
- 1.6 Critical Analysis: Data Quality

## 2️⃣ Data Preparation
- 2.1 Preprocessing Pipeline
- 2.2 Feature Engineering
- 2.3 Train/Validation/Test Split
- 2.4 Critical Analysis: Preparation Decisions

## 3️⃣ Clustering Analysis (Unsupervised Learning)
- 3.1 K-Means Implementation
- 3.2 DBSCAN Implementation
- 3.3 Cluster Evaluation
- 3.4 Cluster Interpretation for Public Health
- 3.5 Critical Analysis: Clustering Results

## 4️⃣ Classification - Baseline & Strategy Comparison
- 4.1 Baseline Models (No Weighting)
- 4.2 Strategy Comparison (6 approaches)
- 4.3 Performance Evaluation
- 4.4 Best Strategy Selection
- 4.5 Critical Analysis: Imbalance Handling

## 5️⃣ Feature Selection & Hyperparameter Optimization
- 5.1 RFECV Feature Selection
- 5.2 Optuna Weight + Hyperparameter Optimization
- 5.3 Final Model Training
- 5.4 Critical Analysis: Optimization Results

## 6️⃣ Comprehensive Evaluation & Implications
- 6.1 Model Performance Analysis
- 6.2 Confusion Matrix Deep Dive
- 6.3 Error Analysis
- 6.4 Fairness & Bias Assessment
- 6.5 Ethical Considerations
- 6.6 Deployment Recommendations
- 6.7 Critical Reflection: Limitations & Future Work

In [4]:
import sklearn

In [5]:
# ============================================================================
# IMPORTS: STANDARD LIBRARIES
# ============================================================================
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
from pathlib import Path
import sys
import os
from datetime import datetime
import json

# Ignore warnings for cleaner output
warnings.filterwarnings('ignore')

# ============================================================================
# IMPORTS: SCIKIT-LEARN - PREPROCESSING
# ============================================================================
from sklearn.model_selection import (
    train_test_split, 
    cross_val_score, 
    StratifiedKFold,
    GridSearchCV
)
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer

# ============================================================================
# IMPORTS: SCIKIT-LEARN - MODELS
# ============================================================================
# Classification models
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (
    RandomForestClassifier,
    GradientBoostingClassifier,
    VotingClassifier
)
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# Clustering models
from sklearn.cluster import KMeans, DBSCAN
from sklearn.decomposition import PCA

# ============================================================================
# IMPORTS: SCIKIT-LEARN - FEATURE SELECTION
# ============================================================================
from sklearn.feature_selection import RFECV, SelectKBest, chi2

# ============================================================================
# IMPORTS: SCIKIT-LEARN - METRICS
# ============================================================================
from sklearn.metrics import (
    # Classification metrics
    accuracy_score,
    balanced_accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    classification_report,
    confusion_matrix,
    ConfusionMatrixDisplay,
    roc_auc_score,
    roc_curve,
    precision_recall_curve,
    average_precision_score,
    
    # Clustering metrics
    silhouette_score,
    davies_bouldin_score,
    calinski_harabasz_score,
    adjusted_rand_score,
    normalized_mutual_info_score
)

# ============================================================================
# IMPORTS: IMBALANCED-LEARN
# ============================================================================
from imblearn.over_sampling import SMOTE, RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler
from imblearn.combine import SMOTETomek
from imblearn.pipeline import Pipeline as ImbPipeline

# ============================================================================
# IMPORTS: OPTUNA (HYPERPARAMETER OPTIMIZATION)
# ============================================================================
import optuna
from optuna.samplers import TPESampler
from optuna.visualization import (
    plot_optimization_history,
    plot_param_importances
)

# Suppress Optuna logging (optional)
optuna.logging.set_verbosity(optuna.logging.WARNING)

# ============================================================================
# IMPORTS: SCIPY
# ============================================================================
from scipy import stats
from scipy.cluster.hierarchy import dendrogram, linkage

# ============================================================================
# IMPORTS: CONFIGURATION
# ============================================================================
import config

# ============================================================================
# VERIFY IMPORTS
# ============================================================================
print("✅ All libraries imported successfully!")
print(f"Python version: {sys.version}")
print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Scikit-learn version: {sklearn.__version__}")
print(f"Optuna version: {optuna.__version__}")
print(f"\n{'='*60}")
print("🎯 Configuration loaded from: config.py")
print(f"📊 Random seed: {config.RANDOM_STATE}")
print(f"📁 Data path: {config.DATA_PATH}")
print(f"{'='*60}")

✅ All libraries imported successfully!
Python version: 3.12.5 (tags/v3.12.5:ff3bc82, Aug  6 2024, 20:45:27) [MSC v.1940 64 bit (AMD64)]
Pandas version: 3.0.0
NumPy version: 2.4.2
Scikit-learn version: 1.8.0
Optuna version: 4.7.0

🎯 Configuration loaded from: config.py
📊 Random seed: 42
📁 Data path: data/CDC_Diabetes_Dataset.csv
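
The notebook reads all of its settings from a `config` module that is not shown here. As a rough guide, a minimal `config.py` consistent with the attributes referenced in this notebook might look like the sketch below. Only `RANDOM_STATE` and `DATA_PATH` are confirmed by the output above, and the target names follow the encoding in the project overview. The colors, figure settings, and directory layout are hypothetical placeholders.

```python
# config.py (hypothetical sketch; only RANDOM_STATE and DATA_PATH are
# confirmed by the notebook output, everything else is a placeholder)
from pathlib import Path

RANDOM_STATE = 42
DATA_PATH = "data/CDC_Diabetes_Dataset.csv"

# Target encoding from the project overview: 0, 1, 2
TARGET_NAMES = ["No Diabetes", "Prediabetes", "Diabetes"]
N_CLASSES = len(TARGET_NAMES)

# Plot styling (assumed values)
CLASS_COLORS = {0: "#2ca02c", 1: "#ff7f0e", 2: "#d62728"}
COLOR_PALETTE = list(CLASS_COLORS.values())
FIGURE_SIZE = (10, 6)
FIGURE_DPI = 100

# Output locations (assumed layout)
OUTPUT_DIR = Path("outputs")
MODELS_DIR = OUTPUT_DIR / "models"
FIGURES_DIR = OUTPUT_DIR / "figures"
RESULTS_DIR = OUTPUT_DIR / "results"
```

Keeping these values in one module means the seed, paths, and class labels are changed in a single place rather than in every cell.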


In [6]:
# ============================================================================
# DISPLAY SETTINGS
# ============================================================================
# Pandas display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', None)
pd.set_option('display.float_format', '{:.4f}'.format)

# Matplotlib style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette(config.COLOR_PALETTE)

# Figure defaults
plt.rcParams['figure.figsize'] = config.FIGURE_SIZE
plt.rcParams['figure.dpi'] = config.FIGURE_DPI
plt.rcParams['savefig.dpi'] = config.FIGURE_DPI
plt.rcParams['font.size'] = 10
plt.rcParams['axes.labelsize'] = 12
plt.rcParams['axes.titlesize'] = 14
plt.rcParams['xtick.labelsize'] = 10
plt.rcParams['ytick.labelsize'] = 10
plt.rcParams['legend.fontsize'] = 10

# ============================================================================
# CREATE OUTPUT DIRECTORIES
# ============================================================================
for directory in [config.OUTPUT_DIR, config.MODELS_DIR, 
                  config.FIGURES_DIR, config.RESULTS_DIR]:
    Path(directory).mkdir(parents=True, exist_ok=True)

print("✅ Display settings configured")
print("✅ Output directories created")

✅ Display settings configured
✅ Output directories created


---

## 🔧 Helper Functions

The following helper functions are used throughout the analysis for:
1. **Visualization:** Plot distributions, confusion matrices, ROC curves
2. **Evaluation:** Calculate and display comprehensive metrics
3. **Comparison:** Compare multiple models systematically
4. **Saving:** Save models, figures, and results consistently

These functions follow the code style from course lab workshops.

In [11]:
# ============================================================================
# HELPER FUNCTIONS: VISUALIZATION
# ============================================================================

def plot_class_distribution(y, title="Class Distribution", save_path=None):
    """
    Plot class distribution as bar chart and pie chart side by side.
    
    Parameters:
    -----------
    y : array-like
        Target variable
    title : str
        Plot title
    save_path : str, optional
        Path to save figure
    """
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))
    
    # Count and percentage
    counts = pd.Series(y).value_counts().sort_index()
    percentages = counts / len(y) * 100
    
    # Bar chart
    axes[0].bar(counts.index, counts.values, 
                color=[config.CLASS_COLORS[i] for i in counts.index],
                edgecolor='black', linewidth=1.5)
    axes[0].set_xlabel('Class', fontsize=12)
    axes[0].set_ylabel('Count', fontsize=12)
    axes[0].set_title(f'{title} - Counts', fontsize=14, fontweight='bold')
    axes[0].set_xticks(range(config.N_CLASSES))
    axes[0].set_xticklabels(config.TARGET_NAMES)
    
    # Add count labels on bars
    for i, (idx, val) in enumerate(counts.items()):
        axes[0].text(i, val + max(counts) * 0.01, f'{val:,}\n({percentages[idx]:.1f}%)',
                    ha='center', va='bottom', fontsize=10, fontweight='bold')
    
    # Pie chart
    axes[1].pie(counts.values, labels=config.TARGET_NAMES,
                colors=[config.CLASS_COLORS[i] for i in counts.index],
                autopct='%1.1f%%', startangle=90,
                textprops={'fontsize': 11, 'fontweight': 'bold'})
    axes[1].set_title(f'{title} - Proportions', fontsize=14, fontweight='bold')
    
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=config.FIGURE_DPI, bbox_inches='tight')
        print(f"💾 Saved: {save_path}")
    
    plt.show()
    
    # Print statistics
    print(f"\n{'='*60}")
    print(f"Class Distribution Statistics")
    print(f"{'='*60}")
    for idx in counts.index:
        print(f"{config.TARGET_NAMES[idx]:20s}: {counts[idx]:8,} ({percentages[idx]:5.2f}%)")
    print(f"{'='*60}")
    print(f"Total samples: {len(y):,}")
    print(f"Imbalance ratio (Class 1/Class 0): 1:{counts[0]/counts[1]:.1f}")
    print(f"Imbalance ratio (Class 2/Class 0): 1:{counts[0]/counts[2]:.1f}")
    print(f"{'='*60}\n")


def plot_feature_distributions(df, features, target_col, ncols=3, save_path=None):
    """
    Plot distributions of multiple features, colored by target class.
    
    Parameters:
    -----------
    df : pd.DataFrame
        Dataset
    features : list
        List of feature names to plot
    target_col : str
        Target column name
    ncols : int
        Number of columns in subplot grid
    save_path : str, optional
        Path to save figure
    """
    nrows = (len(features) + ncols - 1) // ncols
    fig, axes = plt.subplots(nrows, ncols, figsize=(6*ncols, 4*nrows), squeeze=False)
    axes = axes.flatten()  # squeeze=False guarantees a 2D array, even for a single feature
    
    for idx, feature in enumerate(features):
        ax = axes[idx]
        
        # Plot for each class
        for class_val in sorted(df[target_col].unique()):
            data = df[df[target_col] == class_val][feature]
            ax.hist(data, bins=30, alpha=0.5, 
                   color=config.CLASS_COLORS[class_val],
                   label=config.TARGET_NAMES[class_val],
                   edgecolor='black', linewidth=0.5)
        
        ax.set_xlabel(feature, fontsize=11)
        ax.set_ylabel('Frequency', fontsize=11)
        ax.set_title(f'Distribution of {feature}', fontsize=12, fontweight='bold')
        ax.legend()
        ax.grid(True, alpha=0.3)
    
    # Hide empty subplots
    for idx in range(len(features), len(axes)):
        axes[idx].axis('off')
    
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=config.FIGURE_DPI, bbox_inches='tight')
        print(f"💾 Saved: {save_path}")
    
    plt.show()


def plot_correlation_matrix(df, features, title="Correlation Matrix", save_path=None):
    """
    Plot correlation matrix heatmap.
    
    Parameters:
    -----------
    df : pd.DataFrame
        Dataset
    features : list
        List of features to include
    title : str
        Plot title
    save_path : str, optional
        Path to save figure
    """
    # Calculate correlation
    corr = df[features].corr()
    
    # Create mask for upper triangle
    mask = np.triu(np.ones_like(corr, dtype=bool))
    
    # Plot
    fig, ax = plt.subplots(figsize=(14, 12))
    sns.heatmap(corr, mask=mask, annot=True, fmt='.2f', 
                cmap='coolwarm', center=0, vmin=-1, vmax=1,
                square=True, linewidths=0.5, cbar_kws={"shrink": 0.8},
                ax=ax)
    
    ax.set_title(title, fontsize=16, fontweight='bold', pad=20)
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=config.FIGURE_DPI, bbox_inches='tight')
        print(f"💾 Saved: {save_path}")
    
    plt.show()
    
    # Print highly correlated features
    print(f"\n{'='*60}")
    print("Highly Correlated Feature Pairs (|r| > 0.5)")
    print(f"{'='*60}")
    
    high_corr = []
    for i in range(len(corr.columns)):
        for j in range(i+1, len(corr.columns)):
            if abs(corr.iloc[i, j]) > 0.5:
                high_corr.append((corr.columns[i], corr.columns[j], corr.iloc[i, j]))
    
    if high_corr:
        for feat1, feat2, corr_val in sorted(high_corr, key=lambda x: abs(x[2]), reverse=True):
            print(f"{feat1:20s} <-> {feat2:20s}: {corr_val:6.3f}")
    else:
        print("No feature pairs with |correlation| > 0.5")
    
    print(f"{'='*60}\n")

print("✅ Visualization functions defined")

✅ Visualization functions defined


In [12]:
# ============================================================================
# HELPER FUNCTIONS: EVALUATION
# ============================================================================

def evaluate_classifier(y_true, y_pred, y_pred_proba=None, 
                       model_name="Model", display_cm=True):
    """
    Comprehensive evaluation of classification model.
    
    Parameters:
    -----------
    y_true : array-like
        True labels
    y_pred : array-like
        Predicted labels
    y_pred_proba : array-like, optional
        Predicted probabilities (for ROC-AUC)
    model_name : str
        Name of the model for display
    display_cm : bool
        Whether to display confusion matrix
        
    Returns:
    --------
    dict : Dictionary of metric scores
    """
    print(f"\n{'='*70}")
    print(f"📊 EVALUATION: {model_name}")
    print(f"{'='*70}\n")
    
    # Calculate metrics
    metrics = {
        'accuracy': accuracy_score(y_true, y_pred),
        'balanced_accuracy': balanced_accuracy_score(y_true, y_pred),
        'precision_macro': precision_score(y_true, y_pred, average='macro', zero_division=0),
        'precision_weighted': precision_score(y_true, y_pred, average='weighted', zero_division=0),
        'recall_macro': recall_score(y_true, y_pred, average='macro', zero_division=0),
        'recall_weighted': recall_score(y_true, y_pred, average='weighted', zero_division=0),
        'f1_macro': f1_score(y_true, y_pred, average='macro', zero_division=0),
        'f1_weighted': f1_score(y_true, y_pred, average='weighted', zero_division=0)
    }
    
    # ROC-AUC (if probabilities provided)
    if y_pred_proba is not None:
        try:
            metrics['roc_auc_ovr'] = roc_auc_score(y_true, y_pred_proba, 
                                                   multi_class='ovr', average='weighted')
        except ValueError:
            # Raised, for example, when a class is absent from y_true
            metrics['roc_auc_ovr'] = None
    
    # Display metrics
    print("Overall Metrics:")
    print(f"  Accuracy:           {metrics['accuracy']:.4f}")
    print(f"  Balanced Accuracy:  {metrics['balanced_accuracy']:.4f}")
    print(f"  F1-Score (Macro):   {metrics['f1_macro']:.4f} ⭐")
    print(f"  F1-Score (Weighted):{metrics['f1_weighted']:.4f}")
    if metrics.get('roc_auc_ovr') is not None:
        print(f"  ROC-AUC (OVR):      {metrics['roc_auc_ovr']:.4f}")
    
    # Classification report
    print(f"\n{'-'*70}")
    print("Per-Class Performance:")
    print(f"{'-'*70}")
    print(classification_report(y_true, y_pred, 
                                target_names=config.TARGET_NAMES,
                                digits=4, zero_division=0))
    
    # Confusion matrix
    if display_cm:
        cm = confusion_matrix(y_true, y_pred)
        disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                                      display_labels=config.TARGET_NAMES)
        fig, ax = plt.subplots(figsize=(8, 6))
        disp.plot(cmap='Blues', ax=ax, values_format='d')
        ax.set_title(f'Confusion Matrix - {model_name}', 
                    fontsize=14, fontweight='bold', pad=15)
        plt.tight_layout()
        plt.show()
    
    print(f"{'='*70}\n")
    
    return metrics


def compare_models(results_dict, metric='f1_macro', save_path=None):
    """
    Compare multiple models side by side.
    
    Parameters:
    -----------
    results_dict : dict
        Dictionary of {model_name: metrics_dict}
    metric : str
        Primary metric for sorting
    save_path : str, optional
        Path to save comparison table
    """
    # Create DataFrame
    df_results = pd.DataFrame(results_dict).T
    df_results = df_results.sort_values(metric, ascending=False)
    
    print(f"\n{'='*80}")
    print(f"📊 MODEL COMPARISON (Sorted by {metric})")
    print(f"{'='*80}\n")
    print(df_results.to_string())
    print(f"\n{'='*80}\n")
    
    # Highlight best model
    best_model = df_results.index[0]
    best_score = df_results.iloc[0][metric]
    print(f"🏆 Best Model: {best_model}")
    print(f"   {metric}: {best_score:.4f}\n")
    
    # Bar plot comparison
    fig, ax = plt.subplots(figsize=(12, 6))
    
    x = np.arange(len(df_results))
    width = 0.2
    
    metrics_to_plot = ['f1_macro', 'f1_weighted', 'balanced_accuracy']
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c']
    
    for i, metric_name in enumerate(metrics_to_plot):
        if metric_name in df_results.columns:
            values = df_results[metric_name].values
            ax.bar(x + i*width, values, width, label=metric_name, color=colors[i])
    
    ax.set_xlabel('Model', fontsize=12)
    ax.set_ylabel('Score', fontsize=12)
    ax.set_title('Model Performance Comparison', fontsize=14, fontweight='bold')
    ax.set_xticks(x + width)
    ax.set_xticklabels(df_results.index, rotation=45, ha='right')
    ax.legend()
    ax.grid(True, alpha=0.3, axis='y')
    plt.tight_layout()
    
    if save_path:
        plt.savefig(save_path, dpi=config.FIGURE_DPI, bbox_inches='tight')
        print(f"💾 Saved: {save_path}")
    
    plt.show()
    
    # Save table next to the figure (same name, .csv extension)
    if save_path:
        csv_path = Path(save_path).with_suffix('.csv')
        df_results.to_csv(csv_path)
        print(f"💾 Saved: {csv_path}")
    
    return df_results

print("✅ Evaluation functions defined")

✅ Evaluation functions defined


---

# 1️⃣ DATA UNDERSTANDING & EXPLORATORY ANALYSIS

## Objectives

1. Load and inspect the CDC diabetes dataset
2. Understand data structure, types, and quality
3. Analyze class distribution and imbalance severity
4. Explore feature distributions and relationships
5. Identify correlations and potential multicollinearity
6. Investigate potential target leakage features
7. Document data quality issues and prepare for preprocessing

---

## 🔍 Critical Analysis Preview

Throughout this section, we will critically evaluate:
- **Data quality:** Are there missing values? Duplicates? Outliers?
- **Class imbalance:** How severe? What strategies might work?
- **Feature relationships:** Which features correlate with target? With each other?
- **Leakage risk:** Could any features be consequences rather than predictors of diabetes?
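
As a quick illustration of why the class imbalance question matters for metric choice, the sketch below works through the scores of a trivial model that always predicts "No Diabetes". It uses the rounded class shares from the project overview and the same `zero_division=0` convention as the evaluation helper. The numbers are a back-of-the-envelope baseline, not results from this dataset.

```python
# Class shares from the project overview (No Diabetes, Prediabetes, Diabetes).
shares = [0.843, 0.018, 0.139]

# A trivial model that always predicts class 0 ("No Diabetes").
accuracy = shares[0]  # correct only on class-0 rows

# Per-class recall: 1.0 for class 0, 0.0 for the classes never predicted.
recalls = [1.0, 0.0, 0.0]
balanced_accuracy = sum(recalls) / len(recalls)

# Per-class precision: class-0 precision equals its share; the others are
# undefined and counted as 0 (the zero_division=0 convention).
precisions = [shares[0], 0.0, 0.0]


def f1(p, r):
    """Harmonic mean of precision and recall, 0 when both are 0."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


f1_macro = sum(f1(p, r) for p, r in zip(precisions, recalls)) / len(shares)

print(f"accuracy={accuracy:.3f}  balanced_accuracy={balanced_accuracy:.3f}  "
      f"f1_macro={f1_macro:.3f}")
```

Accuracy looks strong for this useless model while balanced accuracy and macro F1 stay near one third, which suggests judging models mainly on the macro-averaged metrics.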

---

In [14]:
# ============================================================================
# 1.1 LOAD AND INSPECT DATASET
# ============================================================================

print("="*70)
print("LOADING DATASET")
print("="*70)

# Load dataset
df = pd.read_csv(config.DATA_PATH)

print("✅ Dataset loaded successfully!")
print(f"📊 Shape: {df.shape}")
print(f"   Rows (samples): {df.shape[0]:,}")
print(f"   Columns (features): {df.shape[1]}")
print(f"\n{'='*70}\n")

# Display first few rows
print("First 5 rows:")
display(df.head())

print(f"\n{'='*70}\n")

# Display last few rows
print("Last 5 rows:")
display(df.tail())

print(f"\n{'='*70}\n")

# Basic info
print("Dataset Info:")
df.info()

print(f"\n{'='*70}\n")

# Column names
print("Column names:")
for i, col in enumerate(df.columns, 1):
    print(f"  {i:2d}. {col}")

LOADING DATASET


FileNotFoundError: [Errno 2] No such file or directory: 'data/CDC_Diabetes_Dataset.csv'
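
The error above indicates that the CSV was not found relative to the notebook's working directory. One possible way to make this failure easier to diagnose is a small guard that checks the path before loading. The helper name `resolve_data_path` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def resolve_data_path(data_path):
    """Return the dataset path as a Path, or raise with a diagnostic message."""
    path = Path(data_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Dataset not found at '{path}' "
            f"(working directory: {Path.cwd()}). "
            "Place the CSV there or update the configured data path."
        )
    return path
```

In the loading cell this could be used as `df = pd.read_csv(resolve_data_path(config.DATA_PATH))`, so a missing file reports the working directory alongside the configured path.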

In [None]:
# ============================================================================
# 1.2 DATA QUALITY ASSESSMENT
# ============================================================================

print("="*70)
print("DATA QUALITY ASSESSMENT")
print("="*70)

# Check for missing values
print("\n1. Missing Values:")
print("-" * 70)
missing = df.isnull().sum()
missing_pct = (missing / len(df)) * 100

missing_df = pd.DataFrame({
    'Missing Count': missing,
    'Percentage': missing_pct
})
missing_df = missing_df[missing_df['Missing Count'] > 0].sort_values('Missing Count', ascending=False)

if len(missing_df) > 0:
    print(missing_df)
    print(f"\n⚠️  Found {len(missing_df)} columns with missing values")
else:
    print("✅ No missing values found!")

# Check for duplicates
print("\n2. Duplicate Rows:")
print("-" * 70)
duplicates = df.duplicated().sum()
print(f"Duplicate rows: {duplicates:,} ({(duplicates/len(df)*100):.2f}%)")

if duplicates > 0:
    print("⚠️  Warning: Found duplicate rows")
else:
    print("✅ No duplicate rows")

# Check data types
print("\n3. Data Types:")
print("-" * 70)
print(df.dtypes)

# Summary statistics
print("\n4. Summary Statistics:")
print("-" * 70)
display(df.describe())

# Check for constant features (no variance)
print("\n5. Feature Variance Check:")
print("-" * 70)
constant_features = []
for col in df.columns:
    if df[col].nunique() <= 1:
        constant_features.append(col)

if constant_features:
    print(f"⚠️  Constant features (no variance): {constant_features}")
else:
    print("✅ All features have variance")

# Value range check
print("\n6. Feature Value Ranges:")
print("-" * 70)
for col in df.columns:
    print(f"{col:25s}: [{df[col].min():8.2f}, {df[col].max():8.2f}] "
          f"(unique: {df[col].nunique()})")

print(f"\n{'='*70}\n")