# Advanced Diabetic Foot Classification using CNNs

This notebook implements a revised strategy intended to improve upon the initial machine learning pipeline. The core limitation of the previous approach was that it reduced each 2D thermogram to 1D statistical features, which discarded the **spatial information** in the image.

This new pipeline addresses that by:
1.  **Treating the data as images**, not just tables of numbers.
2.  **Using a Convolutional Neural Network (CNN)**, a standard model family for image classification.
3.  **Employing Transfer Learning** with a pre-trained `EfficientNetB0` model so that ImageNet features can be reused despite the relatively small dataset.
4.  **Implementing a Patient-Aware Data Split** to prevent data leakage and ensure the model's performance is realistic.
5.  **Using Data Augmentation and Class Weights** to combat overfitting and class imbalance.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
import glob
import cv2
import tensorflow as tf

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score, roc_curve, accuracy_score
from sklearn.utils.class_weight import compute_class_weight

from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

import warnings
warnings.filterwarnings('ignore')

In [None]:
# --- Configuration ---
TARGET_SIZE = (224, 224) # Default input resolution for EfficientNetB0
BATCH_SIZE = 16
BASE_DATA_PATH = "/content/drive/MyDrive/ThermoDataBase"
np.random.seed(42)
tf.random.set_seed(42)

## 1. Data Loading and Image Preprocessing

We modify the data loading function to process each CSV file as an image. This involves:
- **Resizing:** All images are resized to a standard `TARGET_SIZE`.
- **Normalization:** Pixel (temperature) values are normalized to a `[0, 1]` range for each image.
- **Channel Conversion:** Single-channel grayscale images are converted to 3-channel (RGB) format, as expected by pre-trained models.
- **Patient ID Extraction:** We extract patient IDs to ensure a robust, patient-aware data split later.
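
The normalization and channel-conversion steps above can be illustrated on a tiny synthetic temperature map. This is only a sketch: `to_three_channel` and the demo values are hypothetical, and resizing is left out so the example needs nothing beyond NumPy.

```python
import numpy as np

def to_three_channel(temp_map: np.ndarray) -> np.ndarray:
    """Min-max scale a 2D temperature map to [0, 1] and repeat it across 3 channels."""
    temp_map = temp_map.astype(np.float32)
    lo, hi = np.nanmin(temp_map), np.nanmax(temp_map)
    if hi - lo == 0:
        # Constant image: avoid dividing by zero
        scaled = np.zeros_like(temp_map)
    else:
        scaled = (temp_map - lo) / (hi - lo)
    # Missing readings (NaN) are mapped to 0 so they cannot propagate further
    scaled = np.nan_to_num(scaled, nan=0.0)
    # Grayscale -> 3 identical channels, the layout ImageNet models expect
    return np.stack([scaled] * 3, axis=-1)

# Hypothetical 2x2 "thermogram" in degrees Celsius with one missing reading
demo = np.array([[20.0, 25.0], [30.0, np.nan]])
img = to_three_channel(demo)
print(img.shape)
print(img[..., 0])
```

Note that the Keras `EfficientNet` models include their own rescaling layer and expect inputs in the `[0, 255]` range, so a `[0, 1]` image like this one would need to be multiplied by 255 before being passed to the network.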

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
def load_images_for_cnn(base_path, target_size=(224, 224)):
    """
    Loads thermogram CSV data as images, resizing, normalizing,
    and formatting them for a CNN.
    """
    print(f"Loading image data from {base_path}...")
    images = []
    labels = []
    patient_ids = []

    # Find all CSV files
    all_files = glob.glob(os.path.join(base_path, "* Group", "*", "*.csv"))
    print(f"Found {len(all_files)} total files.")

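    for file_path in all_files:
        try:
            # Extract patient ID from the folder name (e.g., 'CG007' or 'DM001')
            patient_id = os.path.basename(os.path.dirname(file_path))

            # Load temperature data
            df = pd.read_csv(file_path, header=None)
            temp_data = df.values.astype(np.float32)

            # Min-max normalize to [0, 1]; constant or all-NaN images become all zeros
            min_val, max_val = np.nanmin(temp_data), np.nanmax(temp_data)
            if np.isnan(min_val) or max_val - min_val == 0:
                normalized_data = np.zeros(temp_data.shape, dtype=np.float32)
            else:
                normalized_data = (temp_data - min_val) / (max_val - min_val)
                # Replace remaining NaN pixels so they do not propagate through the network
                normalized_data = np.nan_to_num(normalized_data, nan=0.0)

            # Resize image (cubic interpolation can overshoot, so clip back to [0, 1])
            resized_data = cv2.resize(normalized_data, target_size, interpolation=cv2.INTER_CUBIC)
            resized_data = np.clip(resized_data, 0.0, 1.0)

            # Keras EfficientNet rescales inputs internally and expects values in [0, 255]
            resized_data = resized_data * 255.0

            # Convert to 3-channel image for transfer learning
            rgb_image = np.stack([resized_data]*3, axis=-1)

            # Append only after the file loaded successfully so the three lists stay aligned
            images.append(rgb_image)
            labels.append(1 if "DM Group" in file_path else 0)
            patient_ids.append(patient_id)

        except Exception as e:
            print(f"Error loading {file_path}: {e}")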

    return np.array(images), np.array(labels), np.array(patient_ids)

# Load the data
X_images, y_labels, patient_ids = load_images_for_cnn(BASE_DATA_PATH, target_size=TARGET_SIZE)
print(f"\nData loaded successfully!")
print(f"Images shape: {X_images.shape}")
print(f"Labels shape: {y_labels.shape}")
print(f"Class distribution: Control={np.sum(y_labels==0)}, Diabetic={np.sum(y_labels==1)}")

## 2. Patient-Aware Data Splitting

This is the most critical step for building a reliable model. A simple random split could place the left foot of a patient in the training set and their right foot in the test set. This is a form of **data leakage** that would lead to an artificially inflated and unrealistic accuracy score.

To prevent this, we split the data based on **unique patient IDs**, ensuring that all images from a single patient belong exclusively to either the training or the testing set.
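
A minimal sketch of a patient-level split with an explicit leakage check is shown below. It differs from the notebook's own split in two labeled ways: it is not stratified by class, and it carves out a validation subset so that callbacks such as early stopping would not need to monitor the test set. The `split_by_patient` helper and the patient IDs are hypothetical.

```python
import numpy as np

def split_by_patient(patient_ids, val_frac=0.2, test_frac=0.2, seed=42):
    """Return (train, val, test) index arrays with no patient shared between them."""
    rng = np.random.default_rng(seed)
    patients = rng.permutation(np.unique(patient_ids))
    n_test = max(1, int(round(test_frac * len(patients))))
    n_val = max(1, int(round(val_frac * len(patients))))
    test_p = patients[:n_test]
    val_p = patients[n_test:n_test + n_val]
    train_p = patients[n_test + n_val:]
    # Map patient groups back to image indices
    return tuple(np.where(np.isin(patient_ids, p))[0] for p in (train_p, val_p, test_p))

# Hypothetical IDs: ten patients, two images (left and right foot) each
ids = np.repeat([f"DM{i:03d}" for i in range(1, 11)], 2)
train_idx, val_idx, test_idx = split_by_patient(ids)

# Leakage check: no patient may appear in more than one subset
groups = [set(ids[idx]) for idx in (train_idx, val_idx, test_idx)]
assert not (groups[0] & groups[1])
assert not (groups[0] & groups[2])
assert not (groups[1] & groups[2])
print(len(train_idx), len(val_idx), len(test_idx))
```

The same set-intersection check can be run on the real `patient_ids` array after any split to confirm that both feet of a patient ended up on the same side.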

In [None]:
# Get unique patients and their corresponding labels for stratified splitting
unique_patients, patient_indices = np.unique(patient_ids, return_index=True)
unique_labels = y_labels[patient_indices]

# Split unique patients into training and testing sets
train_patients, test_patients, _, _ = train_test_split(
    unique_patients,
    unique_labels,
    test_size=0.20, # 20% of patients for testing
    stratify=unique_labels,
    random_state=42
)

# Create the final training and testing sets based on patient IDs
train_indices = np.where(np.isin(patient_ids, train_patients))[0]
test_indices = np.where(np.isin(patient_ids, test_patients))[0]

X_train, y_train = X_images[train_indices], y_labels[train_indices]
X_test, y_test = X_images[test_indices], y_labels[test_indices]

print("--- Data Splitting Results ---")
print(f"Total unique patients: {len(unique_patients)}")
print(f"Training patients: {len(train_patients)}, Testing patients: {len(test_patients)}")
print(f"\nTraining set shape: {X_train.shape}")
print(f"Testing set shape: {X_test.shape}")
print(f"Training class distribution: {np.bincount(y_train)}")
print(f"Testing class distribution: {np.bincount(y_test)}")

## 3. Model Building with Transfer Learning

We use `EfficientNetB0`, a powerful and lightweight pre-trained model.

- **Base Model:** We load `EfficientNetB0` with weights trained on ImageNet but without its final classification layer (`include_top=False`).
- **Freezing:** We freeze the weights of this base model. This means we will use its learned features without altering them initially.
- **Custom Head:** We add our own classification layers on top: a `GlobalAveragePooling2D` layer to reduce dimensionality, followed by `Dropout` for regularization, a 256-unit `Dense` layer with ReLU activation, and a final `Dense` layer with a `sigmoid` activation for our binary (Control vs. Diabetic) classification task.

In [None]:
def build_model(input_shape):
    """Builds a CNN model using EfficientNetB0 for transfer learning."""
    base_model = EfficientNetB0(weights='imagenet', include_top=False, input_shape=input_shape)

    # Initially, we freeze the pre-trained layers
    base_model.trainable = False

    # Add our custom classification head
    x = base_model.output
    x = GlobalAveragePooling2D(name="avg_pool")(x)
    x = Dropout(0.5, name="top_dropout")(x)
    x = Dense(256, activation='relu', name="top_dense")(x)
    predictions = Dense(1, activation='sigmoid', name="predictions")(x)

    model = Model(inputs=base_model.input, outputs=predictions)

    return model

model = build_model(input_shape=TARGET_SIZE + (3,))

# Compile the model for the first phase of training
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()

## 4. Model Training

Training is performed in two phases, so that the randomly initialized head does not disrupt the pre-trained weights early on:

### Phase 1: Feature Extraction
We train *only* the custom head we added. This allows our new layers to adapt to the features extracted by the frozen `EfficientNetB0` base.

### Phase 2: Fine-Tuning
We unfreeze the top layers of the base model and continue training them together with the custom head at a very low learning rate. This lets the model adjust the higher-level pre-trained features to our specific thermogram data, which can improve performance, while the low learning rate limits the risk of overwriting what the base model has already learned.

We also use:
- **Data Augmentation:** To artificially expand our training set and make the model more robust.
- **Class Weights:** To handle the class imbalance by penalizing misclassifications of the minority (Control) class more heavily.
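
The "balanced" weighting heuristic can be worked through by hand: each class gets the weight `n_samples / (n_classes * class_count)`, so rarer classes receive larger weights. The sketch below reproduces that formula in NumPy on a made-up label vector; `balanced_class_weights` is a hypothetical helper and assumes every class index from 0 upward occurs at least once.

```python
import numpy as np

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * count of that class)."""
    counts = np.bincount(y)
    n_classes = len(counts)
    weights = len(y) / (n_classes * counts)
    return {cls: float(w) for cls, w in enumerate(weights)}

# Made-up labels: 2 Control (0) and 6 Diabetic (1) images
y_demo = np.array([0] * 2 + [1] * 6)
weights = balanced_class_weights(y_demo)
print(weights)
```

Here the Control class gets weight 8 / (2 * 2) = 2.0 and the Diabetic class 8 / (2 * 6) ≈ 0.67, so an error on a Control image contributes three times as much to the loss.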

In [None]:
# 1. Data Augmentation Generator
train_datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode='nearest'
)

# 2. Calculate Class Weights
class_weights = compute_class_weight('balanced', classes=np.unique(y_train), y=y_train)
class_weight_dict = dict(enumerate(class_weights))
print(f"Class Weights used to balance training: {class_weight_dict}")

# Callbacks for robust training
early_stopping = EarlyStopping(monitor='val_accuracy', patience=10, restore_best_weights=True, verbose=1)
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.2, patience=5, min_lr=1e-6, verbose=1)

# --- Phase 1: Train only the top layers ---
print("\n--- Starting Phase 1: Feature Extraction ---")
history = model.fit(
    train_datagen.flow(X_train, y_train, batch_size=BATCH_SIZE),
    epochs=25,
    validation_data=(X_test, y_test),
    class_weight=class_weight_dict,
    callbacks=[early_stopping, reduce_lr]
)

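# --- Phase 2: Fine-tuning ---
print("\n--- Starting Phase 2: Fine-Tuning ---")
# build_model() attaches the head directly to base_model.output, so the EfficientNet
# layers appear flattened in model.layers (model.layers[0] is only the input layer).

# Fine-tune from this layer onwards. The earlier the layer, the more generic its features.
# We keep the first 100 layers frozen and unfreeze the rest.
fine_tune_at = 100
for layer in model.layers[fine_tune_at:]:
    # Keep BatchNormalization layers frozen so their ImageNet statistics are preserved
    if not isinstance(layer, tf.keras.layers.BatchNormalization):
        layer.trainable = True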

# Re-compile with a very low learning rate for fine-tuning
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='binary_crossentropy',
              metrics=['accuracy'])

history_fine = model.fit(
    train_datagen.flow(X_train, y_train, batch_size=BATCH_SIZE),
    epochs=25,
    validation_data=(X_test, y_test),
    class_weight=class_weight_dict,
    callbacks=[early_stopping, reduce_lr]
)

## 5. Final Evaluation

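We now evaluate the fine-tuned model on the test set. Because this set was also passed as `validation_data` during training, early stopping and learning-rate scheduling were tuned against it, so the reported metrics are likely optimistic compared with a truly untouched hold-out set. We will visualize the training history and the final confusion matrix.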
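
For a screening-style task, the confusion matrix is often read in terms of sensitivity (the fraction of Diabetic cases detected) and specificity (the fraction of Control cases correctly cleared). The sketch below derives both from a 2x2 matrix in the layout that scikit-learn's `confusion_matrix` returns for labels 0 and 1. The helper name and the counts are made up for illustration and are not results from this notebook.

```python
import numpy as np

def sensitivity_specificity(cm):
    """cm is laid out as [[TN, FP], [FN, TP]] (rows = actual, columns = predicted)."""
    tn, fp, fn, tp = np.asarray(cm).ravel()
    sensitivity = tp / (tp + fn)  # recall for the positive (Diabetic) class
    specificity = tn / (tn + fp)  # recall for the negative (Control) class
    return float(sensitivity), float(specificity)

# Made-up counts: 10 Control and 24 Diabetic test images
cm_demo = [[8, 2], [3, 21]]
sens, spec = sensitivity_specificity(cm_demo)
print(f"Sensitivity: {sens:.3f}, Specificity: {spec:.3f}")
```

With an imbalanced test set, these two numbers are usually more informative than overall accuracy, since a model that predicts the majority class for every image can still score a high accuracy.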

In [None]:
# Combine training histories for plotting
acc = history.history['accuracy'] + history_fine.history['accuracy']
val_acc = history.history['val_accuracy'] + history_fine.history['val_accuracy']
loss = history.history['loss'] + history_fine.history['loss']
val_loss = history.history['val_loss'] + history_fine.history['val_loss']

# Plot training history
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.axvline(len(history.history['accuracy']), color='r', linestyle='--', label='Start Fine-Tuning')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.axvline(len(history.history['loss']), color='r', linestyle='--', label='Start Fine-Tuning')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')

plt.tight_layout()
plt.show()

# Final evaluation on the test set
final_loss, final_accuracy = model.evaluate(X_test, y_test)
print(f"\nFinal Test Loss: {final_loss:.4f}")
print(f"Final Test Accuracy: {final_accuracy:.4f}")

# Get predictions for classification report
y_pred_proba = model.predict(X_test).ravel()
y_pred = (y_pred_proba > 0.5).astype(int)

print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=['Control', 'Diabetic']))

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Control', 'Diabetic'],
            yticklabels=['Control', 'Diabetic'])
plt.title('Confusion Matrix')
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()