# COVID Chest X-Ray Image Classification Using VGG16
### Objective
Train a VGG16 model with Keras on chest X-ray images stored in DICOM format, as they would be in a real clinical setting. Use a custom data generator to feed the DICOM images to the Keras model. Build the VGG16 model from basic building blocks and train it on the X-ray data to reach a validation accuracy above 70%.

## Step 1. Use a custom image data generator which takes in DICOM images.

* Load the CSV file you saved in Part 1 into a pandas DataFrame.
* Split the DataFrame into train and test DataFrames using train_test_split(test_size=0.2) from sklearn.model_selection.
* Add train/test/validation data augmentation parameters in dictionary form, or use the Keras preprocessing function.
* Set train/test/validation parameters such as BATCH_SIZE, CLASS_MODE, COLOR_MODE, TARGET_SIZE, and EPOCHS.
* Create a data generator class for reading DICOM images, or use the class provided. With this custom data generator class, create a train generator and a validation generator.
* Build a VGG16 model from scratch and train it on the X-ray images.

## Step 2. Build the VGG16 model from scratch using correct layers and activations.
* Compile the model and check the model summary.
* Using the Keras model.fit_generator function (deprecated in TensorFlow 2.x in favor of model.fit, which accepts generators directly), train the model with the train_generator and validation_generator you built.
* Plot the training loss and accuracy and the validation loss and accuracy against the epochs.
* Load a set of 9 random images from the test_generator, run model.predict on them, and visualize the prediction scores along with the test images.
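
The plotting step in the list above could be sketched as follows. This is a minimal example, assuming the model is compiled with `metrics=["accuracy"]` so that the `history.history` dictionary returned by training has the keys `loss`, `accuracy`, `val_loss`, and `val_accuracy`. The function name `plot_history` and the numbers in `demo_history` are made up for illustration and are not results from this notebook.

```python
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot loss and accuracy curves from a Keras-style history dictionary."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    # Left panel: training and validation loss per epoch.
    ax_loss.plot(epochs, history["loss"], label="train loss")
    ax_loss.plot(epochs, history["val_loss"], label="validation loss")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    # Right panel: training and validation accuracy per epoch.
    ax_acc.plot(epochs, history["accuracy"], label="train accuracy")
    ax_acc.plot(epochs, history["val_accuracy"], label="validation accuracy")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()
    return fig


# Made-up values for illustration only.
demo_history = {
    "loss": [0.70, 0.55, 0.48],
    "val_loss": [0.72, 0.60, 0.58],
    "accuracy": [0.52, 0.71, 0.78],
    "val_accuracy": [0.50, 0.66, 0.70],
}
fig = plot_history(demo_history)
```

After training, the same function would be called as `plot_history(history.history)`, where `history` is the object returned by the fit call.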

In [1]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [2]:
import tensorflow as tf
tf.__version__

'2.5.0'

In [15]:
import numpy as np
import pandas as pd
from pathlib import Path
from sklearn.model_selection import train_test_split

In [9]:
normal_data_dir = Path('./gdrive/MyDrive/Colab Notebooks/content/normal').absolute()
covid_data_dir = Path('./gdrive/MyDrive/Colab Notebooks/content/covid').absolute()
dataset_dir = Path('./gdrive/MyDrive/Colab Notebooks/content/')

In [8]:
normal_data = [x for x in normal_data_dir.rglob("*.dcm")]
covid_data = [x for x in covid_data_dir.rglob("*.dcm")]

### Load the CSV file you saved in Part 1 in a pandas DataFrame

In [14]:
dataset = pd.read_csv(dataset_dir / "dataset.csv", header=None, names=["Path", "Label"])
dataset.head()

| Unnamed: 0 | Path | Label |
|---|---|---|
| 0 | /content/gdrive/MyDrive/Colab Notebooks/conten... | COVID |
| 1 | /content/gdrive/MyDrive/Colab Notebooks/conten... | COVID |
| 2 | /content/gdrive/MyDrive/Colab Notebooks/conten... | COVID |
| 3 | /content/gdrive/MyDrive/Colab Notebooks/conten... | COVID |
| 4 | /content/gdrive/MyDrive/Colab Notebooks/conten... | COVID |


### Split the DataFrame into train and test DataFrames using train_test_split(test_size=0.2) from sklearn.model_selection

In [22]:
train, test = train_test_split(dataset.to_numpy(), test_size=0.2, random_state=42)

In [25]:
print(f"Train: {train.shape}, Test: {test.shape}")

Train: (11046, 2), Test: (2762, 2)
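
The split above runs on a NumPy array, which drops the column names. As a possible alternative, the sketch below keeps pandas DataFrames and stratifies on the label so that both splits have the same class ratio. It assumes the `Path` and `Label` columns shown earlier. The `demo` frame, its file names, and the `NORMAL` label string are hypothetical stand-ins for the real CSV contents.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real dataset.csv contents.
demo = pd.DataFrame({
    "Path": [f"img_{i}.dcm" for i in range(20)],
    "Label": ["COVID"] * 10 + ["NORMAL"] * 10,
})

# stratify keeps the class ratio the same in the train and test splits.
train_df, test_df = train_test_split(
    demo, test_size=0.2, random_state=42, stratify=demo["Label"]
)
print(train_df.shape, test_df.shape)  # (16, 2) (4, 2)
```

Because the results stay DataFrames, columns can still be addressed by name (for example `train_df["Path"]`) when building the generators.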


### Add train/test/validation data augmentation parameters in dictionary form, or use the Keras preprocessing function.
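
One way to express the augmentation parameters in dictionary form is sketched below. The values are assumptions chosen for illustration, not settings taken from this notebook. The keys are keyword arguments of the Keras `ImageDataGenerator` class, and the variable names are hypothetical.

```python
# Assumed augmentation settings; each key is an ImageDataGenerator keyword argument.
train_augmentation = {
    "rotation_range": 10,        # random rotations of up to 10 degrees
    "width_shift_range": 0.05,   # horizontal shift as a fraction of the width
    "height_shift_range": 0.05,  # vertical shift as a fraction of the height
    "zoom_range": 0.1,           # random zoom in the range [0.9, 1.1]
    "fill_mode": "constant",     # fill pixels exposed by a transform...
    "cval": 0.0,                 # ...with black
}

# Validation and test images are left unaugmented so the metrics reflect real data.
val_augmentation = {}
test_augmentation = {}

# With TensorFlow installed, a dictionary can be unpacked into the Keras class:
#   from tensorflow.keras.preprocessing.image import ImageDataGenerator
#   augmenter = ImageDataGenerator(**train_augmentation)
```

Keeping the three dictionaries separate makes it explicit that only the training data is augmented.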

### Set train/test/validation parameters such as BATCH_SIZE, CLASS_MODE, COLOR_MODE, TARGET_SIZE, and EPOCHS.

In [None]:
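BATCH_SIZE = 32
# The next three values were left blank in this cell. The ones below are
# assumptions: two labels (COVID / normal), single-channel X-ray images,
# and the 224x224 input size that VGG16 expects.
CLASS_MODE = "binary"
COLOR_MODE = "grayscale"
TARGET_SIZE = (224, 224)
EPOCHS = 100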

### Create a data generator class for reading DICOM images, or use the class provided. With this custom data generator class, create a train generator and a validation generator.
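
A minimal sketch of such a loader follows. It is not the class provided with the course: `DicomBatchLoader`, `read_dicom`, and `resize_nearest` are hypothetical names. It assumes single-frame grayscale DICOM files, the `pydicom` package for reading them, and labels already encoded as numbers. To keep the sketch dependent on NumPy only, it is a plain class. For use with Keras training it would likely need to inherit from `tf.keras.utils.Sequence`.

```python
import numpy as np


def read_dicom(path):
    """Read one DICOM file and return its pixels as a float32 2-D array."""
    import pydicom  # imported lazily so the rest of the sketch needs only NumPy
    return pydicom.dcmread(str(path)).pixel_array.astype(np.float32)


def resize_nearest(image, target_size):
    """Nearest-neighbour resize of a 2-D array to (height, width)."""
    rows = np.arange(target_size[0]) * image.shape[0] // target_size[0]
    cols = np.arange(target_size[1]) * image.shape[1] // target_size[1]
    return image[rows[:, None], cols]


class DicomBatchLoader:
    """Serves (images, labels) batches by index; a hypothetical helper."""

    def __init__(self, paths, labels, batch_size=32, target_size=(224, 224),
                 reader=read_dicom):
        self.paths = list(paths)
        self.labels = np.asarray(labels, dtype=np.float32)
        self.batch_size = batch_size
        self.target_size = target_size
        self.reader = reader

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return int(np.ceil(len(self.paths) / self.batch_size))

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError("batch index out of range")
        start = index * self.batch_size
        stop = start + self.batch_size
        images = []
        for path in self.paths[start:stop]:
            pixels = resize_nearest(self.reader(path), self.target_size)
            # Scale to [0, 1]; the epsilon guards against constant images.
            pixels = (pixels - pixels.min()) / (pixels.max() - pixels.min() + 1e-7)
            images.append(pixels[..., np.newaxis])  # add the channel axis
        return np.stack(images), self.labels[start:stop]
```

A train loader and a validation loader would then be two instances built from the paths and encoded labels of the respective splits, for example `DicomBatchLoader(train_paths, train_labels, batch_size=32)`, where `train_paths` and `train_labels` are placeholders for your own data.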

### Build a VGG16 model from scratch and train it on the X-ray images
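
Before writing the layers, it can help to check the spatial arithmetic of the architecture. The sketch below traces a 224x224 input through five blocks of padded 3x3 convolutions, each followed by 2x2 max pooling, and computes the length of the flattened vector that feeds the first dense layer. The function name is hypothetical. The block layout (2, 2, 3, 3, 3 convolutions) and the 512 channels of the last block follow the standard VGG16 configuration.

```python
def vgg16_feature_map_sizes(input_size=224):
    """Trace the spatial size through VGG16's five convolutional blocks."""
    convs_per_block = [2, 2, 3, 3, 3]
    size = input_size
    sizes = []
    for n_convs in convs_per_block:
        for _ in range(n_convs):
            padded = size + 2      # ZeroPadding2D((1, 1)) adds one pixel per side
            size = padded - 3 + 1  # 3x3 convolution, stride 1, no further padding
        size = size // 2           # 2x2 max pooling with stride 2
        sizes.append(size)
    return sizes


sizes = vgg16_feature_map_sizes(224)
flattened = sizes[-1] * sizes[-1] * 512  # the last block has 512 channels
print(sizes, flattened)  # [112, 56, 28, 14, 7] 25088
```

Each padded 3x3 convolution preserves the spatial size, so only the five pooling layers shrink it, halving 224 down to 7.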

In [28]:
from tensorflow.keras.layers import (
    ZeroPadding2D, 
    Convolution2D,
    MaxPooling2D,
    Flatten,
    Dense,
    Dropout
)
from tensorflow.keras.models import Sequential

In [35]:
def VGG_16():
    model = Sequential()
    # tf.keras defaults to channels_last, so a grayscale image has the shape
    # (height, width, channels) = (224, 224, 1), not (1, 224, 224).
    model.add(ZeroPadding2D((1,1),input_shape=(224,224,1)))
    # The kernel size must be a tuple: tf.keras reads Convolution2D(64, 3, 3)
    # as kernel_size=3, strides=3, which shrinks the feature maps until
    # pooling fails with a ValueError.
    model.add(Convolution2D(64, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(512, (3,3), activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    # 1000 is the ImageNet class count of the original VGG16. The two-label
    # COVID/normal task needs a smaller head, e.g. Dense(1, activation='sigmoid').
    model.add(Dense(1000, activation='softmax'))

    return model

In [36]:
model = VGG_16()