# Pneumonia Detection Using Deep Learning

## Statement of Purpose

* Pneumonia is an infection that inflames the air sacs in the lungs. It is the leading cause of death for children under 5. In 2017, 2.56 million people died from pneumonia worldwide, of whom almost a third were children younger than 5 years old.

* This project implements a deep learning approach using the VGG19 architecture to detect pneumonia from chest X-ray images.

* The goal is to develop an accurate and reliable model that can assist medical professionals in diagnosing pneumonia cases.

## Background

* Pneumonia is one of the most common respiratory infections globally, with high mortality rates especially in children and the elderly
* Deep learning approaches have shown promise in medical image analysis, including chest X-ray interpretation
* Transfer learning using pre-trained convolutional neural networks (CNNs) can leverage knowledge from general image recognition tasks for medical imaging

## My Approach Overview

* Implement a transfer learning approach using VGG19 pre-trained on ImageNet
* Process and prepare X-ray images from the chest_xray dataset for binary classification
* Apply a progressive fine-tuning strategy to improve model performance
* Evaluate the model's effectiveness on the pneumonia detection task

## Pre-Processing Raw Data

* Loaded chest X-ray images from the dataset and resized them to 128x128 pixels
* Applied data augmentation to increase the diversity of the training dataset, using techniques such as:
  * Horizontal and vertical flipping
  * Rotation (up to 40 degrees)
  * Width and height shifts
  * Shear transformations
* Applied normalization by rescaling pixel values to [0,1]
* Visualized sample images to understand the differences between normal and pneumonia X-rays
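
The rescaling and flipping steps above can be illustrated without any imaging library. The following is a minimal sketch on a tiny grayscale image stored as nested lists. The helper names (`rescale`, `flip_horizontal`, `flip_vertical`) are made up for illustration, and the project itself presumably relies on its deep learning framework's augmentation utilities rather than hand-written code.

```python
def rescale(image, max_value=255.0):
    """Map pixel intensities in [0, max_value] to floats in [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]


def flip_horizontal(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom by reversing the row order."""
    return [row[:] for row in image[::-1]]


# A hypothetical 2x3 grayscale "image" with 8-bit intensities.
image = [
    [0, 51, 255],
    [102, 204, 153],
]

normalized = rescale(image)          # values now lie in [0, 1]
flipped_h = flip_horizontal(image)   # columns reversed
flipped_v = flip_vertical(image)     # rows reversed
```

Flips and other augmentations are applied only to training images, while rescaling has to be applied identically at training and prediction time.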

## Transfer Learning with VGG19

* Used VGG19 pre-trained on ImageNet as the base model
* Kept the convolutional base layers for feature extraction
* Added custom classification layers for pneumonia detection
* Fine-tuned the model with medical image data to adapt it to the specific task

## VGG19 Architecture

* VGG19 is a deep CNN with 19 weight layers (16 convolutional layers and 3 fully connected layers)
* Pre-trained on the ImageNet ILSVRC subset (about 1.28 million training images across 1,000 categories)
* Known for its uniform architecture of stacked 3x3 convolutional filters
* Its convolutional features transfer well to other image classification tasks
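
The layer counts above can be checked with a little arithmetic. This sketch lists the five convolutional blocks of VGG19, counts the weights of each 3x3 convolution, and computes the feature-map size for a 128x128 RGB input, assuming each block ends with a 2x2 max-pooling stage that halves the spatial size. The function name `conv_base_summary` is illustrative.

```python
# (number of conv layers, output channels) for the five VGG19 blocks.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def conv_base_summary(input_size=128, input_channels=3, kernel=3):
    """Return (conv layer count, parameter count, output shape) of the conv base."""
    n_layers = 0
    n_params = 0
    channels = input_channels
    size = input_size
    for n_convs, out_channels in VGG19_BLOCKS:
        for _ in range(n_convs):
            # kernel*kernel*in*out weights plus one bias per output channel.
            n_params += kernel * kernel * channels * out_channels + out_channels
            channels = out_channels
            n_layers += 1
        size //= 2  # each block ends with 2x2 max pooling
    return n_layers, n_params, (size, size, channels)


n_layers, n_params, out_shape = conv_base_summary()
print(n_layers, n_params, out_shape)  # 16 20024384 (4, 4, 512)
```

With a 128x128 input, the convolutional base therefore yields a 4x4x512 feature map, or 8,192 values if it is flattened directly before the custom classification layers.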

## Progressive Fine-Tuning Approach

This project implements a progressive fine-tuning strategy that gradually unfreezes deeper layers of the VGG19 model:

1. **Initial Training (model_01)**: Only train custom classification layers while keeping the entire VGG19 base frozen
2. **Partial Fine-Tuning (model_02)**: Unfreeze the deepest convolutional layers (block5_conv3, block5_conv4) while keeping earlier layers frozen
3. **Full Fine-Tuning (model_03)**: Make all layers trainable and train the entire network

This staged approach reduces the risk of catastrophic forgetting of the pre-trained features and tends to make training more stable.
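
The three stages differ only in which layers are trainable. A minimal sketch of that selection follows, assuming a Keras-style model object whose `layers` each expose `name` and `trainable` attributes. The helper `set_trainable_layers` and the stand-in objects are illustrative, not the project's code, and only the last few VGG19 layer names are listed.

```python
from types import SimpleNamespace


def set_trainable_layers(model, unfrozen_names=None):
    """Freeze or unfreeze layers in place and return the trainable layer names.

    unfrozen_names=None makes every layer trainable (stage 3); an empty
    collection freezes every layer (stage 1).
    """
    for layer in model.layers:
        layer.trainable = unfrozen_names is None or layer.name in unfrozen_names
    return [layer.name for layer in model.layers if layer.trainable]


# Stand-in for the tail of the VGG19 base; a real Keras model would be passed instead.
names = ["block4_pool", "block5_conv1", "block5_conv2",
         "block5_conv3", "block5_conv4", "block5_pool"]
base = SimpleNamespace(layers=[SimpleNamespace(name=n, trainable=True) for n in names])

stage_1 = set_trainable_layers(base, unfrozen_names=[])                 # frozen base
stage_2 = set_trainable_layers(base, ["block5_conv3", "block5_conv4"])  # partial
stage_3 = set_trainable_layers(base)                                    # all layers
```

In Keras, a change to `trainable` takes effect only once the model is compiled again, so each stage would recompile before training.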

## Model Training Details

The VGG19 model is trained using the following approach:

1. **Model Configuration**:
   * Custom classification head with 4608 and 1152 neurons in dense layers
   * Dropout (0.2) for regularization
   * Two-class output layer (normal vs. pneumonia) with softmax activation

2. **Training Parameters**:
   * Stochastic Gradient Descent (SGD) optimizer with a low learning rate (0.0001)
   * Categorical cross-entropy loss function
   * Callbacks for early stopping and learning rate reduction

3. **Model Checkpointing**:
   * Best weights saved based on validation loss
   * Weights saved to model_weights directory for future use
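
A softmax output paired with categorical cross-entropy treats the task as a two-class problem with one-hot labels. The sketch below works through that computation in plain Python for a single image. The logits and the class order (index 0 for normal, index 1 for pneumonia) are assumed for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-7):
    """Loss for one example: -sum(y_i * log(p_i)), clipped to avoid log(0)."""
    return -sum(y * math.log(min(max(p, eps), 1.0)) for y, p in zip(one_hot, probs))


logits = [0.5, 2.0]  # hypothetical scores for [normal, pneumonia]
probs = softmax(logits)
loss_if_pneumonia = categorical_cross_entropy([0, 1], probs)
loss_if_normal = categorical_cross_entropy([1, 0], probs)
```

The loss is small when the true class receives most of the probability mass and large otherwise, which is the signal SGD uses to update the trainable layers.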

## Fine-tuning Strategy Implementation

The implementation follows these key steps:

1. **Initial Model (model_01)**:
   * Freeze all VGG19 convolutional layers
   * Train only the custom classification head
   * Save the model weights to "model_weights/vgg19_model_01.h5"

2. **Partial Fine-tuning (model_02)**:
   * Load weights from model_01
   * Selectively unfreeze deeper layers (block5_conv3, block5_conv4)
   * Continue training with a lower learning rate
   * Save the improved weights to "model_weights/vgg19_model_02.h5"

3. **Full Fine-tuning (model_03)**:
   * Load weights from model_01
   * Make all layers trainable
   * Train with more steps per epoch (100)
   * Save the final model as "model_weights/vgg_unfrozen.h5"
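
The three stages can also be written down as data, which makes the order of weight loading and saving explicit. This is a sketch only: the `Stage` dataclass and its field names are hypothetical, while the layer names, file paths, and the steps-per-epoch value are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    load_from: Optional[str]               # starting weights; None = ImageNet base only
    unfrozen: Optional[Tuple[str, ...]]    # None means "all layers trainable"
    save_to: str
    steps_per_epoch: Optional[int] = None  # None: value not stated in this README


STAGES = [
    Stage("model_01", None, (), "model_weights/vgg19_model_01.h5"),
    Stage("model_02", "model_weights/vgg19_model_01.h5",
          ("block5_conv3", "block5_conv4"), "model_weights/vgg19_model_02.h5"),
    Stage("model_03", "model_weights/vgg19_model_01.h5",
          None, "model_weights/vgg_unfrozen.h5", steps_per_epoch=100),
]

for stage in STAGES:
    if stage.unfrozen is None:
        scope = "all layers"
    else:
        scope = ", ".join(stage.unfrozen) or "classification head only"
    print(f"{stage.name}: train {scope}, save to {stage.save_to}")
```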

## Model Performance Results

The three stages of progressive fine-tuning compare as follows:

* **Initial Model (model_01)**: Training only custom classification layers
  * Demonstrates reasonable baseline performance
  * Captures basic features for pneumonia detection

* **Partially Fine-tuned Model (model_02)**: With block5_conv3 and block5_conv4 unfrozen
  * Shows improved feature extraction for medical imaging
  * Higher accuracy from domain-specific adaptations

* **Fully Fine-tuned Model (model_03)**: With all layers trainable
  * Highest capacity model with full parameter tuning
  * Higher risk of overfitting, mitigated by early stopping, learning rate reduction, and dropout

## Model Evaluation and Metrics

The performance of the best model can be evaluated using the following metrics:

* **Accuracy**: Measures the overall correctness of predictions
* **Precision**: Indicates how many of the predicted pneumonia cases are actually pneumonia
* **Recall**: Shows how many actual pneumonia cases are correctly identified
* **F1 Score**: Harmonic mean of precision and recall, balancing both metrics
* **Loss Values**: Lower validation and test loss indicate better model generalization
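
The first four metrics all derive from the confusion-matrix counts. A small sketch follows, treating pneumonia as the positive class. The function name and the counts are hypothetical and are not results from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 90 true positives, 30 false positives,
# 10 false negatives, 70 true negatives.
metrics = classification_metrics(tp=90, fp=30, fn=10, tn=70)
```

For these counts the accuracy is 0.80, precision 0.75, and recall 0.90. In a screening setting, recall usually deserves the closest attention, because a false negative is a missed pneumonia case.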

## Comparison of Model Versions

Comparing the three stages of the VGG19 model fine-tuning:

* **Base Model (model_01)**: Good starting performance with frozen VGG19 layers
* **Partially Fine-tuned (model_02)**: Improved performance by unfreezing specific deep layers
* **Fully Fine-tuned (model_03)**: Most comprehensive training with all parameters adjustable

The progressive fine-tuning approach demonstrates how gradually adapting pre-trained weights to the specific task of pneumonia detection leads to improved performance.

## Comparison with Other Approaches

* Transfer learning with VGG19 provides competitive results compared to other approaches for pneumonia detection
* The progressive fine-tuning strategy allows for better adaptation to the medical imaging domain
* This approach achieves good performance with relatively modest computational requirements

## Progressive Learning Benefits

* The progressive fine-tuning approach offers several advantages:
  * Reduces the risk of catastrophic forgetting of pre-trained weights
  * Allows the model to gradually adapt to the medical imaging domain
  * Enables experimentation with different layer freezing strategies
* Future iterations could explore unfreezing different combinations of layers to find optimal performance

## Conclusions

* The transfer learning approach with VGG19 demonstrates effective pneumonia detection from chest X-rays

* The progressive fine-tuning strategy shows clear benefits over both completely frozen and fully trainable approaches

* The model architecture with custom dense layers (4608 and 1152 neurons) provides good classification performance

## Further Research

* Experiment with different VGG19 layer freezing/unfreezing combinations to optimize performance
* Explore other pre-trained architectures like ResNet, DenseNet, or EfficientNet for comparison
* Implement data preprocessing techniques specific to medical imaging
* Integrate with the existing Flask application for real-time pneumonia detection

## Integration with Flask Web Application

This project includes a Flask web application that allows users to:

* Upload chest X-ray images through a web interface
* Get real-time predictions on whether the X-ray shows pneumonia or normal lung tissue

The application uses the trained VGG19 model to return predictions quickly, which makes the deep learning model accessible to healthcare professionals without requiring technical expertise.
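
Whatever the route looks like, the handler for an upload eventually has to turn the model's softmax output into a label for the user. A minimal sketch of that last step follows, assuming a two-element probability vector. Both the helper `decode_prediction` and the class order `("NORMAL", "PNEUMONIA")` are assumptions; the real order depends on how the training labels were encoded.

```python
def decode_prediction(probs, class_names=("NORMAL", "PNEUMONIA")):
    """Return the most likely label and its probability as a JSON-friendly dict."""
    if len(probs) != len(class_names):
        raise ValueError("expected one probability per class")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return {"label": class_names[best], "confidence": round(float(probs[best]), 4)}


# Hypothetical model output for one uploaded image.
result = decode_prediction([0.12, 0.88])
```

Returning the probability alongside the label lets the interface show how confident the model is, rather than presenting a bare yes or no.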