# Chest X-Ray Pneumonia Detection using Convolutional Neural Networks

## 6COSC020W Coursework - Part C Implementation

This notebook implements a CNN-based binary classifier for detecting pneumonia from chest X-ray images. The model is trained on the Kaggle Chest X-Ray Images (Pneumonia) dataset containing 5,863 images from pediatric patients.

**Author:** Student Name  
**Date:** January 2026  
**Domain:** Healthcare Diagnostics - Medical Imaging

---
## 1. Environment Setup and Library Imports

Import all necessary libraries for data processing, model building, training, and evaluation.

In [None]:
# Core libraries
import os
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn-v0_8-whitegrid')

# Image processing
from PIL import Image

# TensorFlow and Keras
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

# Evaluation metrics
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.metrics import precision_score, recall_score, f1_score, roc_curve, auc

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

print(f"TensorFlow Version: {tf.__version__}")
print(f"Keras Version: {keras.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")

---
## 2. Dataset Configuration

Define paths to the dataset directories and configure image parameters. The dataset should be downloaded from Kaggle and extracted to the specified location.

**Dataset Source:** [Chest X-Ray Images (Pneumonia) - Kaggle](https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia)
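
Before running the later cells, it can help to confirm that the extracted archive has the folder layout the configuration cell assumes. The following is a minimal sketch rather than part of the original notebook: the helper name `find_missing_class_dirs` is hypothetical, and it assumes each of the `train`, `val`, and `test` splits contains `NORMAL` and `PNEUMONIA` subfolders.

```python
import os

def find_missing_class_dirs(base_dir, splits=("train", "val", "test"),
                            classes=("NORMAL", "PNEUMONIA")):
    """Return the expected split/class folders that do not exist under base_dir."""
    missing = []
    for split in splits:
        for class_name in classes:
            path = os.path.join(base_dir, split, class_name)
            if not os.path.isdir(path):
                missing.append(path)
    return missing

# An empty list means every expected folder was found.
missing = find_missing_class_dirs("./chest_xray")
print("Missing folders:", missing or "none")
```

If any folder is reported missing, the image counts and generators in later cells would silently see zero images for that class, so it is worth fixing the path first.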

In [None]:
# Configuration constants
IMAGE_SIZE = (150, 150)  # Target image dimensions
BATCH_SIZE = 32          # Training batch size
EPOCHS = 15              # Maximum training epochs
LEARNING_RATE = 0.0001   # Initial learning rate

# Dataset paths - UPDATE THESE PATHS TO YOUR LOCAL DATASET LOCATION
# Download from: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
BASE_DIR = './chest_xray'  # Root directory of the dataset
TRAIN_DIR = os.path.join(BASE_DIR, 'train')
VAL_DIR = os.path.join(BASE_DIR, 'val')
TEST_DIR = os.path.join(BASE_DIR, 'test')

# Class labels
CLASSES = ['NORMAL', 'PNEUMONIA']

print(f"Configuration:")
print(f"  Image Size: {IMAGE_SIZE}")
print(f"  Batch Size: {BATCH_SIZE}")
print(f"  Max Epochs: {EPOCHS}")
print(f"  Learning Rate: {LEARNING_RATE}")

---
## 3. Data Exploration and Analysis

Explore the dataset structure, examine class distribution, and visualize sample images to understand the data characteristics.

In [None]:
def count_images_in_directory(directory):
    """
    Count the number of images in each class subdirectory.
    
    Args:
        directory: Path to the parent directory containing class folders
    
    Returns:
        Dictionary with class names as keys and image counts as values
    """
    counts = {}
    for class_name in CLASSES:
        class_path = os.path.join(directory, class_name)
        if os.path.exists(class_path):
            counts[class_name] = len([f for f in os.listdir(class_path) 
                                      if f.lower().endswith(('.png', '.jpg', '.jpeg'))])
        else:
            counts[class_name] = 0
    return counts

# Count images in each split
print("Dataset Distribution:")
print("=" * 50)

for split_name, split_dir in [('Training', TRAIN_DIR), ('Validation', VAL_DIR), ('Test', TEST_DIR)]:
    counts = count_images_in_directory(split_dir)
    total = sum(counts.values())
    print(f"\n{split_name}:")
    for class_name, count in counts.items():
        percentage = (count / total * 100) if total > 0 else 0
        print(f"  {class_name}: {count:,} images ({percentage:.1f}%)")
    print(f"  Total: {total:,} images")

In [None]:
def visualize_sample_images(directory, n_samples=4):
    """
    Display sample images from each class for visual inspection.
    
    Args:
        directory: Path to the parent directory containing class folders
        n_samples: Number of samples to display per class
    """
    fig, axes = plt.subplots(2, n_samples, figsize=(16, 8))
    fig.suptitle('Sample Chest X-Ray Images', fontsize=16, fontweight='bold')
    
    for row_idx, class_name in enumerate(CLASSES):
        class_path = os.path.join(directory, class_name)
        if not os.path.exists(class_path):
            continue
            
        image_files = [f for f in os.listdir(class_path) 
                       if f.lower().endswith(('.png', '.jpg', '.jpeg'))]
        sample_files = np.random.choice(image_files, min(n_samples, len(image_files)), replace=False)
        
        for col_idx, img_file in enumerate(sample_files):
            img_path = os.path.join(class_path, img_file)
            img = Image.open(img_path)
            
            ax = axes[row_idx, col_idx]
            ax.imshow(img, cmap='gray')
            ax.set_title(f'{class_name}', fontsize=12, fontweight='bold',
                        color='green' if class_name == 'NORMAL' else 'red')
            ax.axis('off')
    
    plt.tight_layout()
    plt.show()

# Visualize sample training images
visualize_sample_images(TRAIN_DIR, n_samples=4)

In [None]:
# Visualize class distribution as bar chart
train_counts = count_images_in_directory(TRAIN_DIR)
test_counts = count_images_in_directory(TEST_DIR)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Training set distribution
colors = ['#2ecc71', '#e74c3c']
axes[0].bar(train_counts.keys(), train_counts.values(), color=colors, edgecolor='black', linewidth=1.5)
axes[0].set_title('Training Set Distribution', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Class', fontsize=12)
axes[0].set_ylabel('Number of Images', fontsize=12)
for i, (k, v) in enumerate(train_counts.items()):
    axes[0].text(i, v + 50, str(v), ha='center', fontsize=12, fontweight='bold')

# Test set distribution
axes[1].bar(test_counts.keys(), test_counts.values(), color=colors, edgecolor='black', linewidth=1.5)
axes[1].set_title('Test Set Distribution', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Class', fontsize=12)
axes[1].set_ylabel('Number of Images', fontsize=12)
for i, (k, v) in enumerate(test_counts.items()):
    axes[1].text(i, v + 10, str(v), ha='center', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

# Note on class imbalance
imbalance_ratio = train_counts['PNEUMONIA'] / train_counts['NORMAL']
print(f"\nClass Imbalance Ratio (PNEUMONIA:NORMAL): {imbalance_ratio:.2f}:1")
print("Note: The dataset is imbalanced with more PNEUMONIA cases. Consider class weights during training.")

---
## 4. Data Pre-processing and Augmentation

Configure data generators for loading, preprocessing, and augmenting images during training. Data augmentation helps improve model generalization by artificially expanding the training set.

In [None]:
# Data augmentation for training set
# Augmentation helps reduce overfitting and improves generalization
train_datagen = ImageDataGenerator(
    rescale=1./255,           # Normalize pixel values to [0, 1]
    rotation_range=20,         # Random rotation up to 20 degrees
    width_shift_range=0.1,     # Horizontal shift up to 10%
    height_shift_range=0.1,    # Vertical shift up to 10%
    shear_range=0.1,           # Shear transformation
    zoom_range=0.1,            # Random zoom up to 10%
    horizontal_flip=True,      # Random horizontal flip
    fill_mode='nearest',       # Fill strategy for new pixels
    validation_split=0.2       # Use 20% of training data for validation
)

# No augmentation for validation and test sets - only normalization
test_datagen = ImageDataGenerator(rescale=1./255)

print("Data generators configured with the following augmentations:")
print("  - Rotation: ±20 degrees")
print("  - Width/Height Shift: ±10%")
print("  - Shear: 10%")
print("  - Zoom: ±10%")
print("  - Horizontal Flip: Enabled")
print("  - Validation Split: 20% from training data")

In [None]:
# Create data generators for training, validation, and testing
# Using flow_from_directory to load images directly from folders

# Training generator (with augmentation)
train_generator = train_datagen.flow_from_directory(
    TRAIN_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',       # Binary classification
    subset='training',         # Use training subset
    shuffle=True,
    seed=42
)

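# Validation generator (from training data). A separate generator with the same
# validation_split but no augmentation is used, so validation images are only rescaled.
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
validation_generator = val_datagen.flow_from_directory(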
    TRAIN_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    subset='validation',       # Use validation subset
    shuffle=False,
    seed=42
)

# Test generator (no augmentation)
test_generator = test_datagen.flow_from_directory(
    TEST_DIR,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    shuffle=False              # Keep order for evaluation
)

# Print class indices mapping
print(f"\nClass Indices: {train_generator.class_indices}")
print(f"Training Samples: {train_generator.samples}")
print(f"Validation Samples: {validation_generator.samples}")
print(f"Test Samples: {test_generator.samples}")

In [None]:
# Visualize augmented images to verify augmentation is working
def visualize_augmentation(generator, n_samples=6):
    """
    Display augmented versions of images from the generator.
    
    Args:
        generator: Image data generator
        n_samples: Number of augmented samples to display
    """
    fig, axes = plt.subplots(2, n_samples, figsize=(18, 6))
    fig.suptitle('Data Augmentation Examples', fontsize=16, fontweight='bold')
    
    # Get a batch of augmented images
    batch_x, batch_y = next(generator)
    
    for i in range(min(n_samples * 2, len(batch_x))):
        row = i // n_samples
        col = i % n_samples
        ax = axes[row, col]
        ax.imshow(batch_x[i])
        label = 'PNEUMONIA' if batch_y[i] == 1 else 'NORMAL'
        ax.set_title(label, fontsize=10, color='red' if batch_y[i] == 1 else 'green')
        ax.axis('off')
    
    plt.tight_layout()
    plt.show()

# Visualize augmented training images
visualize_augmentation(train_generator, n_samples=6)

---
## 5. CNN Model Architecture

Define the Convolutional Neural Network architecture for pneumonia detection. The model uses three convolutional blocks with batch normalization, followed by dense layers for classification.
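
To see how the spatial dimensions shrink through the three blocks, the arithmetic can be traced by hand. This is a rough sketch that assumes the Keras defaults for the layers described here: 3x3 convolutions with 'valid' padding and stride 1, and 2x2 max pooling with stride 2. The function name `trace_feature_map_sizes` is illustrative only.

```python
def trace_feature_map_sizes(size, n_blocks=3, kernel=3, pool=2):
    """Trace the side length of a square feature map through Conv2D + MaxPooling2D blocks.

    Assumes 'valid' padding and stride 1 for the convolution, and
    non-overlapping pooling that floors odd sizes.
    """
    sizes = []
    for _ in range(n_blocks):
        size = size - kernel + 1   # 'valid' convolution trims kernel - 1 pixels
        size = size // pool        # max pooling divides the side length
        sizes.append(size)
    return sizes

sizes = trace_feature_map_sizes(150)
flatten_units = sizes[-1] * sizes[-1] * 128   # 128 filters in the last block
print(sizes)          # [74, 36, 17]
print(flatten_units)  # 36992
```

Under these assumptions the Flatten layer feeds 36,992 values into the 256-unit dense layer, which therefore holds most of the model's parameters.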

In [None]:
def build_cnn_model(input_shape=(150, 150, 3)):
    """
    Build a CNN model for binary image classification.
    
    Architecture:
    - 3 Convolutional blocks (Conv2D + BatchNorm + MaxPool)
    - Flatten layer
    - Dense layer with dropout for regularization
    - Output layer with sigmoid activation
    
    Args:
        input_shape: Shape of input images (height, width, channels)
    
    Returns:
        Uncompiled Keras model (compiled separately after it is built)
    """
    model = models.Sequential(name='pneumonia_cnn')
    
    # Input layer
    model.add(layers.Input(shape=input_shape))
    
    # Convolutional Block 1
    model.add(layers.Conv2D(32, (3, 3), activation='relu', name='conv2d_1'))
    model.add(layers.BatchNormalization(name='batch_norm_1'))
    model.add(layers.MaxPooling2D((2, 2), name='max_pool_1'))
    
    # Convolutional Block 2
    model.add(layers.Conv2D(64, (3, 3), activation='relu', name='conv2d_2'))
    model.add(layers.BatchNormalization(name='batch_norm_2'))
    model.add(layers.MaxPooling2D((2, 2), name='max_pool_2'))
    
    # Convolutional Block 3
    model.add(layers.Conv2D(128, (3, 3), activation='relu', name='conv2d_3'))
    model.add(layers.BatchNormalization(name='batch_norm_3'))
    model.add(layers.MaxPooling2D((2, 2), name='max_pool_3'))
    
    # Flatten and Dense Layers
    model.add(layers.Flatten(name='flatten'))
    model.add(layers.Dense(256, activation='relu', name='dense_1'))
    model.add(layers.Dropout(0.5, name='dropout'))  # Regularization
    
    # Output Layer
    model.add(layers.Dense(1, activation='sigmoid', name='output'))
    
    return model

# Build the model
model = build_cnn_model(input_shape=(*IMAGE_SIZE, 3))

# Display model summary
print("Model Architecture:")
print("=" * 70)
model.summary()

In [None]:
# Compile the model
model.compile(
    optimizer=optimizers.Adam(learning_rate=LEARNING_RATE),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

print("Model compiled with:")
print(f"  Optimizer: Adam (lr={LEARNING_RATE})")
print(f"  Loss: Binary Cross-Entropy")
print(f"  Metrics: Accuracy")

In [None]:
# Visualize model architecture
try:
    from tensorflow.keras.utils import plot_model
    plot_model(model, to_file='model_architecture.png', show_shapes=True, 
               show_layer_names=True, dpi=100)
    print("Model architecture diagram saved to 'model_architecture.png'")
except Exception as e:
    print(f"Could not generate model diagram: {e}")

---
## 6. Model Training

Train the CNN model with callbacks for early stopping and learning rate reduction. Class weights are applied to handle the imbalanced dataset.
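
The class weights in this section follow the common "balanced" heuristic, `weight_c = N / (K * n_c)`, where `N` is the total number of samples, `K` the number of classes, and `n_c` the count for class `c`. A small sketch with made-up counts (not the real dataset figures) shows the effect; `balanced_class_weights` is a hypothetical helper.

```python
def balanced_class_weights(counts):
    """Compute weight_c = N / (K * n_c) for each class label c."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

# Made-up counts for illustration: class 0 is the minority.
weights = balanced_class_weights({0: 1000, 1: 3000})
print(weights)  # {0: 2.0, 1: 0.6666666666666666}
```

The minority class receives the larger weight, so each of its samples contributes more to the loss.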

In [None]:
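# Calculate class weights to handle imbalanced data.
# Keys follow train_generator.class_indices (NORMAL=0, PNEUMONIA=1). Counts come from
# the full training folder, so the weights approximate those of the 80% training subset.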
train_counts = count_images_in_directory(TRAIN_DIR)
total_samples = sum(train_counts.values())
n_classes = len(CLASSES)

class_weights = {
    0: total_samples / (n_classes * train_counts['NORMAL']),   # Weight for NORMAL
    1: total_samples / (n_classes * train_counts['PNEUMONIA']) # Weight for PNEUMONIA
}

print("Class Weights for Imbalanced Data:")
print(f"  NORMAL (0): {class_weights[0]:.4f}")
print(f"  PNEUMONIA (1): {class_weights[1]:.4f}")

In [None]:
# Define callbacks
callbacks = [
    # Early stopping to prevent overfitting
    EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True,
        verbose=1
    ),
    # Reduce learning rate when validation loss plateaus
    ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=3,
        min_lr=1e-7,
        verbose=1
    ),
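    # Save the model with the highest validation accuracy
    # (EarlyStopping above tracks val_loss, so the two "best" epochs may differ)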
    ModelCheckpoint(
        'best_pneumonia_model.keras',
        monitor='val_accuracy',
        save_best_only=True,
        verbose=1
    )
]

print("Training callbacks configured:")
print("  - EarlyStopping (patience=5)")
print("  - ReduceLROnPlateau (factor=0.5, patience=3)")
print("  - ModelCheckpoint (save best model)")
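
As a rough illustration of how far `ReduceLROnPlateau` can lower the learning rate with these settings, the sketch below applies the reduction rule repeatedly and clips at `min_lr`. It assumes every plateau triggers a reduction and ignores the timing governed by `patience`; the function name `lr_after_reductions` is hypothetical.

```python
def lr_after_reductions(initial_lr, factor, n_reductions, min_lr=0.0):
    """Learning rate after n reductions of the form lr = max(lr * factor, min_lr)."""
    lr = initial_lr
    for _ in range(n_reductions):
        lr = max(lr * factor, min_lr)
    return lr

# Learning rate after 0 to 3 halvings, starting from the notebook's 1e-4.
for k in range(4):
    print(k, lr_after_reductions(1e-4, 0.5, k, min_lr=1e-7))
```

With a maximum of 15 epochs and a patience of 3, only a handful of reductions can occur in practice, so the `min_lr` floor is unlikely to be reached.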

In [None]:
# Train the model
print("\nStarting model training...")
print("=" * 70)

history = model.fit(
    train_generator,
    epochs=EPOCHS,
    validation_data=validation_generator,
    class_weight=class_weights,
    callbacks=callbacks,
    verbose=1
)

print("\nTraining completed!")

---
## 7. Training Visualization

Plot training and validation accuracy/loss curves to analyze model performance during training.

In [None]:
def plot_training_history(history):
    """
    Plot training and validation accuracy/loss curves.
    
    Args:
        history: Keras training history object
    """
    fig, axes = plt.subplots(1, 2, figsize=(16, 6))
    
    # Accuracy plot
    axes[0].plot(history.history['accuracy'], label='Training Accuracy', 
                 linewidth=2, marker='o', markersize=5)
    axes[0].plot(history.history['val_accuracy'], label='Validation Accuracy', 
                 linewidth=2, marker='s', markersize=5)
    axes[0].set_title('Model Accuracy', fontsize=14, fontweight='bold')
    axes[0].set_xlabel('Epoch', fontsize=12)
    axes[0].set_ylabel('Accuracy', fontsize=12)
    axes[0].legend(loc='lower right', fontsize=11)
    axes[0].grid(True, alpha=0.3)
    axes[0].set_ylim([0.5, 1.0])
    
    # Loss plot
    axes[1].plot(history.history['loss'], label='Training Loss', 
                 linewidth=2, marker='o', markersize=5, color='#e74c3c')
    axes[1].plot(history.history['val_loss'], label='Validation Loss', 
                 linewidth=2, marker='s', markersize=5, color='#3498db')
    axes[1].set_title('Model Loss', fontsize=14, fontweight='bold')
    axes[1].set_xlabel('Epoch', fontsize=12)
    axes[1].set_ylabel('Loss', fontsize=12)
    axes[1].legend(loc='upper right', fontsize=11)
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('training_curves.png', dpi=150, bbox_inches='tight')
    plt.show()
    print("Training curves saved to 'training_curves.png'")

# Plot training history
plot_training_history(history)

---
## 8. Model Evaluation

Evaluate the trained model on the test set and generate comprehensive performance metrics.
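
For reference, precision, recall, and F1 all derive from the four confusion-matrix counts, with PNEUMONIA treated as the positive class. The sketch below uses made-up counts purely for illustration; `metrics_from_counts` is a hypothetical helper, not a scikit-learn function.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts: 80 true negatives, 20 false positives,
# 10 false negatives, 90 true positives.
example = metrics_from_counts(tn=80, fp=20, fn=10, tp=90)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

In this toy case recall (0.90) is higher than precision (about 0.82), which is the direction usually preferred when a missed positive is the costlier error.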

In [None]:
# Evaluate on test set
print("Evaluating model on test set...")
print("=" * 70)

test_loss, test_accuracy = model.evaluate(test_generator, verbose=1)

print(f"\nTest Results:")
print(f"  Loss: {test_loss:.4f}")
print(f"  Accuracy: {test_accuracy:.4f} ({test_accuracy*100:.2f}%)")

In [None]:
# Generate predictions for detailed analysis
print("Generating predictions on test set...")

# Reset generator to ensure correct ordering
test_generator.reset()

# Get predictions
predictions = model.predict(test_generator, verbose=1)
predicted_classes = (predictions > 0.5).astype(int).flatten()
true_classes = test_generator.classes

print(f"\nTotal test samples: {len(true_classes)}")
print(f"Predictions shape: {predictions.shape}")

In [None]:
# Calculate detailed metrics
accuracy = accuracy_score(true_classes, predicted_classes)
precision = precision_score(true_classes, predicted_classes)
recall = recall_score(true_classes, predicted_classes)
f1 = f1_score(true_classes, predicted_classes)

print("Detailed Classification Metrics:")
print("=" * 50)
print(f"Accuracy:  {accuracy:.4f} ({accuracy*100:.2f}%)")
print(f"Precision: {precision:.4f} ({precision*100:.2f}%)")
print(f"Recall:    {recall:.4f} ({recall*100:.2f}%)")
print(f"F1-Score:  {f1:.4f} ({f1*100:.2f}%)")

In [None]:
# Print full classification report
print("\nClassification Report:")
print("=" * 60)
print(classification_report(true_classes, predicted_classes, 
                           target_names=['NORMAL', 'PNEUMONIA']))

In [None]:
# Plot confusion matrix
def plot_confusion_matrix(y_true, y_pred, class_names):
    """
    Plot a detailed confusion matrix with annotations.
    
    Args:
        y_true: True labels
        y_pred: Predicted labels
        class_names: List of class names
    """
    cm = confusion_matrix(y_true, y_pred)
    
    fig, ax = plt.subplots(figsize=(10, 8))
    
    # Create heatmap
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
                xticklabels=class_names, yticklabels=class_names,
                annot_kws={'size': 20}, ax=ax)
    
    ax.set_title('Confusion Matrix', fontsize=16, fontweight='bold')
    ax.set_xlabel('Predicted Label', fontsize=14)
    ax.set_ylabel('True Label', fontsize=14)
    
    # Add percentage annotations
    total = cm.sum()
    for i in range(2):
        for j in range(2):
            percentage = cm[i, j] / total * 100
            ax.text(j + 0.5, i + 0.7, f'({percentage:.1f}%)', 
                   ha='center', va='center', fontsize=12, color='gray')
    
    plt.tight_layout()
    plt.savefig('confusion_matrix.png', dpi=150, bbox_inches='tight')
    plt.show()
    print("Confusion matrix saved to 'confusion_matrix.png'")
    
    # Print confusion matrix interpretation
    tn, fp, fn, tp = cm.ravel()
    print(f"\nConfusion Matrix Breakdown:")
    print(f"  True Negatives (NORMAL correctly identified): {tn}")
    print(f"  False Positives (NORMAL incorrectly labeled PNEUMONIA): {fp}")
    print(f"  False Negatives (PNEUMONIA incorrectly labeled NORMAL): {fn}")
    print(f"  True Positives (PNEUMONIA correctly identified): {tp}")

# Plot confusion matrix
plot_confusion_matrix(true_classes, predicted_classes, CLASSES)

In [None]:
# Plot ROC curve
def plot_roc_curve(y_true, y_pred_proba):
    """
    Plot the ROC curve and calculate AUC.
    
    Args:
        y_true: True labels
        y_pred_proba: Predicted probabilities
    """
    fpr, tpr, thresholds = roc_curve(y_true, y_pred_proba)
    roc_auc = auc(fpr, tpr)
    
    fig, ax = plt.subplots(figsize=(10, 8))
    
    ax.plot(fpr, tpr, color='#3498db', lw=3, 
            label=f'ROC Curve (AUC = {roc_auc:.4f})')
    ax.plot([0, 1], [0, 1], color='#95a5a6', lw=2, linestyle='--', 
            label='Random Classifier')
    
    ax.fill_between(fpr, tpr, alpha=0.3, color='#3498db')
    
    ax.set_xlim([0.0, 1.0])
    ax.set_ylim([0.0, 1.05])
    ax.set_xlabel('False Positive Rate', fontsize=14)
    ax.set_ylabel('True Positive Rate', fontsize=14)
    ax.set_title('Receiver Operating Characteristic (ROC) Curve', 
                 fontsize=16, fontweight='bold')
    ax.legend(loc='lower right', fontsize=12)
    ax.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.savefig('roc_curve.png', dpi=150, bbox_inches='tight')
    plt.show()
    
    print(f"\nArea Under ROC Curve (AUC): {roc_auc:.4f}")
    print("ROC curve saved to 'roc_curve.png'")
    
    return roc_auc

# Plot ROC curve
auc_score = plot_roc_curve(true_classes, predictions.flatten())
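
The AUC value can also be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way on a tiny made-up example, as a cross-check on the idea rather than a replacement for `roc_curve` and `auc`; `pairwise_auc` is a hypothetical helper.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The cost is O(n_pos * n_neg),
    so this is only suitable for small inputs.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("Need at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Made-up labels and scores for illustration.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores))  # 0.75
```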

---
## 9. Sample Predictions Visualization

Visualize sample predictions to qualitatively assess model performance on individual images.

In [None]:
def visualize_predictions(generator, model, n_samples=8):
    """
    Visualize model predictions on sample images.
    
    Args:
        generator: Test data generator
        model: Trained model
        n_samples: Number of samples to visualize
    """
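    # Get the first batch of test images. The test generator is not shuffled, so this
    # batch comes from the start of the first class folder (likely all NORMAL).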
    generator.reset()
    batch_x, batch_y = next(generator)
    
    # Get predictions
    batch_predictions = model.predict(batch_x, verbose=0)
    
    # Plot
    n_cols = 4
    n_rows = (n_samples + n_cols - 1) // n_cols
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(16, 4*n_rows))
    fig.suptitle('Sample Predictions', fontsize=16, fontweight='bold')
    
    for i, ax in enumerate(axes.flat):
        if i >= min(n_samples, len(batch_x)):
            ax.axis('off')
            continue
            
        ax.imshow(batch_x[i])
        
        true_label = 'PNEUMONIA' if batch_y[i] == 1 else 'NORMAL'
        pred_prob = batch_predictions[i][0]
        pred_label = 'PNEUMONIA' if pred_prob >= 0.5 else 'NORMAL'
        
        correct = true_label == pred_label
        color = 'green' if correct else 'red'
        
        title = f"True: {true_label}\nPred: {pred_label} ({pred_prob:.2%})"
        ax.set_title(title, fontsize=10, color=color, fontweight='bold')
        ax.axis('off')
    
    plt.tight_layout()
    plt.savefig('sample_predictions.png', dpi=150, bbox_inches='tight')
    plt.show()
    print("Sample predictions saved to 'sample_predictions.png'")

# Visualize predictions
visualize_predictions(test_generator, model, n_samples=8)

---
## 10. Model Summary and Conclusions

Summarize the model performance and key findings from this implementation.

In [None]:
# Final summary
print("=" * 70)
print("MODEL PERFORMANCE SUMMARY")
print("=" * 70)
print(f"\nDataset: Chest X-Ray Images (Pneumonia)")
print(f"Total Training Samples: {train_generator.samples}")
print(f"Total Validation Samples: {validation_generator.samples}")
print(f"Total Test Samples: {test_generator.samples}")
print(f"\nModel Architecture: Custom CNN (3 Conv blocks + 2 Dense layers)")
print(f"Total Parameters: {model.count_params():,}")
print(f"\nTraining Configuration:")
print(f"  - Image Size: {IMAGE_SIZE}")
print(f"  - Batch Size: {BATCH_SIZE}")
print(f"  - Learning Rate: {LEARNING_RATE}")
print(f"  - Epochs Trained: {len(history.history['accuracy'])}")
print(f"\nTest Set Performance:")
print(f"  - Accuracy: {accuracy*100:.2f}%")
print(f"  - Precision: {precision*100:.2f}%")
print(f"  - Recall: {recall*100:.2f}%")
print(f"  - F1-Score: {f1*100:.2f}%")
print(f"  - AUC: {auc_score:.4f}")
print("\n" + "=" * 70)

In [None]:
# Save final model
model.save('pneumonia_detection_model.keras')
print("Model saved to 'pneumonia_detection_model.keras'")

# Save training history
history_df = pd.DataFrame(history.history)
history_df.to_csv('training_history.csv', index=False)
print("Training history saved to 'training_history.csv'")

---
## Key Observations

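1. **Model Performance**: The CNN is evaluated on the held-out test set using accuracy, precision, recall, F1-score, and AUC (Sections 8 and 10). Because the classes are imbalanced, these metrics should be read together rather than judged on accuracy alone; taken together they indicate how feasible deep learning is for this medical image classification task.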

2. **Class Imbalance**: The dataset contains significantly more PNEUMONIA samples than NORMAL, which was addressed using class weights during training.

3. **Data Augmentation**: Rotation, shifts, shear, zoom, and horizontal flips were applied to increase the diversity of the training set, with the aim of improving generalization. Their effect was not measured separately in this notebook.

4. **Clinical Implications**: High recall for PNEUMONIA cases is particularly important in medical applications to minimize false negatives (missed diagnoses).

5. **Limitations**: The model is trained on pediatric chest X-rays only and may not generalize well to adult populations without additional training.
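
Since missed pneumonia cases are the costlier error, one option is to lower the decision threshold below the 0.5 used in this notebook. The following sketch, with made-up probabilities, shows how recall and precision shift as the threshold moves; `recall_precision_at` is a hypothetical helper and the numbers are not model outputs.

```python
def recall_precision_at(y_true, y_prob, threshold):
    """Recall and precision for the positive class at a given threshold."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision

# Made-up labels (1 = PNEUMONIA) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.45, 0.3, 0.6, 0.4, 0.2, 0.1]

for threshold in (0.5, 0.35, 0.25):
    recall, precision = recall_precision_at(y_true, y_prob, threshold)
    print(f"threshold={threshold}: recall={recall:.2f}, precision={precision:.2f}")
```

In this toy data, lowering the threshold raises recall from 0.50 to 1.00 while introducing one more false positive, which illustrates the trade-off that would need to be tuned on validation data rather than on the test set.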