# Image Classification (Chest X-ray-Pneumonia Detection) using CNN from Scratch, Transfer Learning and Fine-Tuning Techniques.

* **Model-1: Designing a CNN model from scratch:** In this case everything is trained from scratch: the model is designed for, and trained only on, our dataset. This is a practical approach in medical imaging because the architecture can be tailored to the data.
* **Model-2: Designing a CNN with the TL technique:** In this approach a pretrained model is reused as a frozen feature extractor, and a new classification head is added and trained to predict the class.
* **Model-3: Designing a CNN with the FT technique:** This is the most effective of the three approaches. It is almost the same as Model-2, but one small change, unfreezing the last few layers of the pretrained model, makes a major difference to the model's performance.

These are the three approaches to the problem; each is explained in the sections below.

# 1. Basic Imports 

In [None]:
import numpy as np             
import pandas as pd
import tensorflow as tf
import seaborn as sns
import matplotlib.pyplot as plt

# 2. Data Processing

The dataset folder contains three subfolders: train, test, and val. The variables train_path, test_path, and valid_path hold the paths to the training, testing, and validation images respectively.

In [None]:
train_path = '../input/pneumonia-xray-images/train'
test_path = '../input/pneumonia-xray-images/test'
valid_path = '../input/pneumonia-xray-images/val'

batch_size = 32

img_height = 224
img_width = 224

## 2.1. Image Augmentation


Data augmentation artificially increases the size of a dataset by applying image transformations to the existing training data. It is a standard way to cope with limited data in current deep learning practice: by showing the model slightly altered versions of each image during training, it helps the model generalize and can increase accuracy. In medical image recognition it plays a vital role, because small transformations of the existing data help compensate for small datasets. Such datasets are often small because sharing medical data is restricted by privacy regulations.
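To make the transformations concrete, here is a minimal NumPy sketch of two of the operations configured in the next cell, rescaling and horizontal flipping, applied to a tiny made-up image. The pixel values and the helper names `rescale` and `horizontal_flip` are illustrative assumptions, not part of the Keras API. `ImageDataGenerator` applies the flip at random and also handles shear and zoom, which are omitted here.

```python
import numpy as np

def rescale(image, factor=1.0 / 255):
    """Map 8-bit pixel values into the [0, 1] range."""
    return image.astype(np.float32) * factor

def horizontal_flip(image):
    """Mirror a (height, width, channels) image left to right."""
    return image[:, ::-1, :]

# Hypothetical 2x3 single-channel image, for illustration only.
image = np.array([[[0], [128], [255]],
                  [[10], [20], [30]]], dtype=np.uint8)

scaled = rescale(image)
flipped = horizontal_flip(scaled)

print(scaled[..., 0])   # pixel values now lie between 0 and 1
print(flipped[..., 0])  # same values, columns in reverse order
```

Each flipped image is a new training sample with the same label, which is how augmentation enlarges the effective training set without collecting more data.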

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
image_gen = ImageDataGenerator(
                                  rescale = 1./255,
                                  shear_range = 0.2,
                                  zoom_range = 0.2,
                                  horizontal_flip = True
                               )

test_data_gen = ImageDataGenerator(rescale = 1./255)

In [None]:
train = image_gen.flow_from_directory(
      train_path,
      target_size=(img_height, img_width),
      color_mode='rgb',
      class_mode='binary',
      batch_size=batch_size
      )

test = test_data_gen.flow_from_directory(
      test_path,
      target_size=(img_height, img_width),
      color_mode='rgb',
      shuffle=False, 
      class_mode='binary',
      batch_size=batch_size
      )

valid = test_data_gen.flow_from_directory(
      valid_path,
      target_size=(img_height, img_width),
      color_mode='rgb',
      class_mode='binary', 
      batch_size=batch_size
      )

In [None]:
plt.figure(figsize=(12, 12))

dic = {0:'NORMAL', 1:'PNEUMONIA'}

for i in range(10):
    plt.subplot(2, 5, i+1)
    # Draw one augmented batch and show its first image with its label
    X_batch, Y_batch = next(train)
    image = X_batch[0]
    plt.title(dic.get(Y_batch[0]))
    plt.axis('off')
    plt.imshow(np.squeeze(image),cmap='gray',interpolation='nearest')
        
plt.tight_layout()
plt.show()

# 3. Model-1: Convolutional Neural Network Model from Scratch (CNN_model)

In [None]:
# Import everything from tensorflow.keras; mixing it with the standalone keras package can cause conflicts
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

CNN_Model=Sequential()

# Image feature extraction (Block-1 to Block-3)
# Block-1: input_shape is (height, width, channels) and is only needed on the first layer
CNN_Model.add(layers.Conv2D(16, (3, 3), activation="relu", input_shape=(img_height, img_width, 3)))
CNN_Model.add(layers.MaxPooling2D(pool_size = (2, 2)))

# Block-2
CNN_Model.add(layers.Conv2D(32, (3, 3), activation="relu"))
CNN_Model.add(layers.MaxPooling2D(pool_size = (2, 2)))
CNN_Model.add(layers.Conv2D(32, (3, 3), activation="relu"))
CNN_Model.add(layers.MaxPooling2D(pool_size = (2, 2)))

# Block-3
CNN_Model.add(layers.Conv2D(64, (3, 3), activation="relu"))
CNN_Model.add(layers.MaxPooling2D(pool_size = (2, 2)))
CNN_Model.add(layers.Conv2D(64, (3, 3), activation="relu"))
CNN_Model.add(layers.MaxPooling2D(pool_size = (2, 2)))

#Final Layer(Classification/prediction)
CNN_Model.add(layers.Flatten())
CNN_Model.add(layers.Dense(activation = 'relu', units = 128))
CNN_Model.add(layers.Dense(activation = 'relu', units = 64))
CNN_Model.add(layers.Dense(activation = 'sigmoid', units = 1))
CNN_Model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

In [None]:
CNN_Model.summary()
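The shapes and parameter counts reported by the summary can be reproduced by hand. The sketch below traces the feature-map size through the five Conv2D/MaxPooling2D pairs and adds up the parameters. It assumes the Keras defaults that the model relies on (3x3 convolutions with 'valid' padding and stride 1, 2x2 pooling with stride 2); the helper names are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling whose stride equals the pool size."""
    return size // pool

def conv_params(in_ch, out_ch, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units

size, channels = 224, 3
total = 0
for filters in (16, 32, 32, 64, 64):   # filters of the five Conv2D layers
    total += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat_features = size * size * channels  # output of Flatten
units_in = flat_features
for units in (128, 64, 1):              # the three Dense layers
    total += dense_params(units_in, units)
    units_in = units

print("flattened features:", flat_features)
print("total parameters:", total)
```

Under these assumptions the feature map shrinks from 224x224 to 5x5 with 64 channels, so Flatten produces 1,600 features and most of the parameters sit in the first Dense layer.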

##  3.2. Fitting the Model(CNN)

In [None]:
early = EarlyStopping(monitor='val_loss', mode='min', patience=3)
learning_rate_reduction = ReduceLROnPlateau(monitor='val_loss', patience = 2, verbose=1,factor=0.3, min_lr=0.000001)
callbacks_list = [ early, learning_rate_reduction]
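As a rough illustration of what `ReduceLROnPlateau` does with `factor=0.3` and `min_lr=0.000001`, the sketch below lists the learning rates that successive reductions would produce. The starting rate of 0.001 is an assumption (it is the Keras default for Adam, which the notebook does not override), and `plateau_schedule` is a hypothetical helper, not a Keras function.

```python
def plateau_schedule(initial_lr, factor, min_lr, reductions):
    """Learning rates after each successive plateau-triggered reduction."""
    rates = [initial_lr]
    for _ in range(reductions):
        # Each reduction multiplies the rate by `factor`, never going below `min_lr`
        rates.append(max(rates[-1] * factor, min_lr))
    return rates

# 0.001 is assumed as the starting rate (the Keras default for Adam).
rates = plateau_schedule(initial_lr=1e-3, factor=0.3, min_lr=1e-6, reductions=7)
for step, lr in enumerate(rates):
    print(f"after {step} reductions: lr = {lr:.2e}")
```

With these settings the rate reaches the `min_lr` floor at the sixth reduction. Because `EarlyStopping` uses a patience of 3 and the reduction uses a patience of 2, an uninterrupted run of non-improving epochs leaves room for roughly one reduction before training stops.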

In [None]:
from sklearn.utils.class_weight import compute_class_weight

weights = compute_class_weight(
                               'balanced', 
                               classes=np.unique(train.classes), 
                               y=train.classes
                               )
cw = dict(zip(np.unique(train.classes), weights))
print(cw)
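The 'balanced' mode weights each class by `n_samples / (n_classes * count)`, so the rarer class receives the larger weight. The sketch below reimplements that formula in plain Python on a made-up label list; the 100/300 split is hypothetical and is not the class distribution of this dataset.

```python
from collections import Counter

def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c])"""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}

# Hypothetical labels: one NORMAL (0) image for every three PNEUMONIA (1) images.
labels = [0] * 100 + [1] * 300
weights = balanced_class_weights(labels)
print(weights)
```

Passing such a dictionary as `class_weight` to `fit` scales each sample's loss by the weight of its class, so errors on the minority class cost more.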

In [None]:
CNN_Model.fit(train, epochs=50, validation_data=valid, class_weight=cw, callbacks=callbacks_list)

## 3.3. Evaluation(CNN)

In [None]:
pd.DataFrame(CNN_Model.history.history).plot()

In [None]:
test_accu_CNN = CNN_Model.evaluate(test)
print('The testing accuracy is :',test_accu_CNN[1]*100, '%')

In [None]:
preds = CNN_Model.predict(test,verbose=1)

In [None]:
predictions = preds.copy()
predictions[predictions <= 0.5] = 0
predictions[predictions > 0.5] = 1

In [None]:
from sklearn.metrics import classification_report,confusion_matrix

cm = pd.DataFrame(data=confusion_matrix(test.classes, predictions, labels=[0, 1]),index=["Actual Normal", "Actual Pneumonia"],
columns=["Predicted Normal", "Predicted Pneumonia"])
sns.heatmap(cm,annot=True,fmt="d")

In [None]:
print(classification_report(y_true=test.classes,y_pred=predictions,target_names =['NORMAL','PNEUMONIA']))
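The figures in the classification report follow directly from the four cells of the confusion matrix. The sketch below computes them for the PNEUMONIA class from made-up counts; the numbers are hypothetical and are not results from this notebook. The argument order follows the row-major layout of `confusion_matrix` with `labels=[0, 1]`: true negatives, false positives, false negatives, true positives.

```python
def binary_report(tn, fp, fn, tp):
    """Precision, recall, F1 for the positive class, plus overall accuracy."""
    precision = tp / (tp + fp)   # share of predicted pneumonia that is correct
    recall = tp / (tp + fn)      # share of actual pneumonia that is detected
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts, for illustration only.
report = binary_report(tn=80, fp=20, fn=10, tp=190)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

For a screening task such as pneumonia detection, recall on the PNEUMONIA class is usually the number to watch, since a false negative is a missed case.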

In [None]:
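test.reset()
# Read every test batch once so that images and labels stay aligned.
# next(test) is used instead of test.next(), which newer Keras versions no longer provide.
batches = [next(test) for _ in range(len(test))]
x = np.concatenate([images for images, _ in batches])
y = np.concatenate([labels for _, labels in batches])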
print(x.shape)
print(y.shape)

dic = {0:'NORMAL', 1:'PNEUMONIA'}
plt.figure(figsize=(14, 14))
for i in range(0+228, 9+228):
  plt.subplot(3, 3, (i-228)+1)
  if preds[i, 0] >= 0.5: 
      out = ('{:.2%} probability of being Pneumonia case'.format(preds[i][0]))
      
      
  else: 
      out = ('{:.2%} probability of being Normal case'.format(1-preds[i][0]))
  plt.title(out+"\n Actual case : "+ dic.get(y[i]))    
  plt.imshow(np.squeeze(x[i]))
  plt.axis('off')
plt.show()

#  4. Model-2: Transfer Learning (TL_Model)

In [None]:
base_model = tf.keras.applications.ResNet152V2(
    weights='imagenet',
    input_shape=(img_height, img_width, 3),
    include_top=False)
base_model.trainable = False

def get_pretrained():
    inputs = layers.Input(shape=(img_height, img_width, 3))
    x = base_model(inputs)
    
    # Head
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation='relu')(x)
    x = layers.Dense(64, activation='relu')(x)
   
    
    #Final Layer (Output)
    output = layers.Dense(1, activation='sigmoid')(x)
    model = keras.Model(inputs=[inputs], outputs=output)
    
    return model                                        

In [None]:
keras.backend.clear_session()

TL_Model = get_pretrained()
TL_Model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

TL_Model.summary()

##  4.1. Fitting the Model(TL)

In [None]:
TL_Model.fit(train,epochs=50, validation_data=valid, class_weight=cw, callbacks=callbacks_list)

## 4.2. Evaluation(TL)

In [None]:
pd.DataFrame(TL_Model.history.history).plot()

In [None]:
test_accu_TL = TL_Model.evaluate(test)
print('The testing accuracy is :',test_accu_TL[1]*100, '%')

In [None]:
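preds = TL_Model.predict(test,verbose=1)

# Threshold the TL model's probabilities; otherwise the report below would reuse Model-1's predictions
predictions = (preds > 0.5).astype(int)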

In [None]:
cm = pd.DataFrame(data=confusion_matrix(test.classes, predictions, labels=[0, 1]),index=["Actual Normal", "Actual Pneumonia"],
columns=["Predicted Normal", "Predicted Pneumonia"])
sns.heatmap(cm,annot=True,fmt="d")

In [None]:
print(classification_report(y_true=test.classes,y_pred=predictions,target_names =['NORMAL','PNEUMONIA']))

In [None]:
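test.reset()
# Read every test batch once so that images and labels stay aligned.
# next(test) is used instead of test.next(), which newer Keras versions no longer provide.
batches = [next(test) for _ in range(len(test))]
x = np.concatenate([images for images, _ in batches])
y = np.concatenate([labels for _, labels in batches])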
print(x.shape)
print(y.shape)

dic = {0:'NORMAL', 1:'PNEUMONIA'}
plt.figure(figsize=(14, 14))
for i in range(0+228, 9+228):
  plt.subplot(3, 3, (i-228)+1)
  if preds[i, 0] >= 0.5: 
      out = ('{:.2%} probability of being Pneumonia case'.format(preds[i][0]))    
  else: 
      out = ('{:.2%} probability of being Normal case'.format(1-preds[i][0]))
  plt.title(out+"\n Actual case : "+ dic.get(y[i]))    
  plt.imshow(np.squeeze(x[i]))
  plt.axis('off')
plt.show()

#  5. Model-3: Fine Tuning (FT)

Fine-tuning is the third approach in this project. FT is the most efficient and accurate of the three techniques because of its flexibility, as mentioned in Section 3.4 above. Everything in this model is the same as in the TL model (Model-2); the only change is that the last few layers of the feature-extraction part are unfrozen. This small change substantially improves the model's predictions, because the last few feature-learning layers are retrained, as shown in Figure-28. The pretrained model is the same one used in Model-2, ResNet152V2.

In [None]:
base_model.trainable = True

# Freeze all layers except for the last 15
for layer in base_model.layers[:-15]:
    layer.trainable = False
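The slice `base_model.layers[:-15]` selects every layer except the last 15, so only those last 15 stay trainable. The sketch below shows the same slicing on a plain list of made-up layer names; the 20-layer list is a hypothetical stand-in, far shorter than the real ResNet152V2 layer list.

```python
# Hypothetical stand-in for base_model.layers: 20 named layers.
layer_names = [f"layer_{i}" for i in range(20)]
n_unfrozen = 15

frozen = layer_names[:-n_unfrozen]      # everything except the last 15
trainable = layer_names[-n_unfrozen:]   # the last 15

print(len(frozen), "frozen:", frozen)
print(len(trainable), "trainable, from", trainable[0], "to", trainable[-1])
```

The two slices never overlap and together cover the whole list, which is why setting `trainable = True` on the base model first and then freezing the `[:-15]` slice leaves exactly the last 15 layers unfrozen.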

In [None]:
# Recompile so that the change in trainable layers takes effect
TL_Model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

TL_Model.summary()

**Calculation of parameters for FT Technique:**

The layer summary of Model-3 is the same as that of Model-2 because the same layers are used, but in this technique the last 15 layers of the base model are unfrozen. This changes the number of trainable and non-trainable parameters in the model. The parameters of the Dense layers do not change, as can be seen in Figure-29. The total number of trainable parameters with this technique is 5,789,953, compared with 270,593 in Model-2. In detail:

* Trainable parameters = (Dense layer’s parameters) + (retrained parameters from pretrained model)
  = 270,593 + 5,519,360
  = 5,789,953
* Non-Trainable parameters = (ResNet152V2 parameters) - (retrained parameters from pretrained model) 
  = 58,331,648 - 5,519,360
  = 52,812,288
* Total parameters = (Trainable parameters) + (Non-Trainable Parameters) 
  = 5,789,953 + 52,812,288
  = 58,602,241
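The arithmetic above can be checked in a few lines. The sketch below derives the Dense-head count from the layer sizes and then combines it with the two ResNet152V2 figures quoted above, which are taken as given rather than recomputed. It assumes the head receives a 2048-dimensional vector, which is what GlobalAveragePooling2D produces from the ResNet152V2 feature maps; `dense_params` is a helper made up for this illustration.

```python
def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units

# Head: GlobalAveragePooling2D (2048 features) -> Dense(128) -> Dense(64) -> Dense(1)
head = dense_params(2048, 128) + dense_params(128, 64) + dense_params(64, 1)

base_total = 58_331_648        # ResNet152V2 parameters, as quoted above
unfrozen_in_base = 5_519_360   # retrained base parameters, as quoted above

trainable = head + unfrozen_in_base
non_trainable = base_total - unfrozen_in_base
total = trainable + non_trainable

print("Dense head parameters:", head)
print("Trainable parameters:", trainable)
print("Non-trainable parameters:", non_trainable)
print("Total parameters:", total)
```

The total is unchanged between Model-2 and Model-3; unfreezing only moves parameters from the non-trainable to the trainable column.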


## 5.1. Fitting the Model(FT)

In [None]:
TL_Model.fit(train,epochs=50, validation_data=valid, class_weight=cw, callbacks=callbacks_list)

## 5.2. Evaluation(FT)

In [None]:
pd.DataFrame(TL_Model.history.history).plot()

In [None]:
test_accu_FT = TL_Model.evaluate(test)
print('The testing accuracy is :',test_accu_FT[1]*100, '%')

In [None]:
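preds = TL_Model.predict(test,verbose=1)

# Threshold the fine-tuned model's probabilities; otherwise the report below would reuse Model-1's predictions
predictions = (preds > 0.5).astype(int)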

In [None]:
cm = pd.DataFrame(data=confusion_matrix(test.classes, predictions, labels=[0, 1]),index=["Actual Normal", "Actual Pneumonia"],
columns=["Predicted Normal", "Predicted Pneumonia"])
sns.heatmap(cm,annot=True,fmt="d")

In [None]:
print(classification_report(y_true=test.classes,y_pred=predictions,target_names =['NORMAL','PNEUMONIA']))

In [None]:
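test.reset()
# Read every test batch once so that images and labels stay aligned.
# next(test) is used instead of test.next(), which newer Keras versions no longer provide.
batches = [next(test) for _ in range(len(test))]
x = np.concatenate([images for images, _ in batches])
y = np.concatenate([labels for _, labels in batches])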
print(x.shape)
print(y.shape)

dic = {0:'NORMAL', 1:'PNEUMONIA'}
plt.figure(figsize=(14, 14))
for i in range(0+228, 9+228):
  plt.subplot(3, 3, (i-228)+1)
  if preds[i, 0] >= 0.5: 
      out = ('{:.2%} probability of being Pneumonia case'.format(preds[i][0]))    
  else: 
      out = ('{:.2%} probability of being Normal case'.format(1-preds[i][0]))
  plt.title(out+"\n Actual case : "+ dic.get(y[i]))    
  plt.imshow(np.squeeze(x[i]))
  plt.axis('off')
plt.show()

#  6. Final Accuracy of Model-1 (CNN model from Scratch), Model-2 (TL) and Model-3 (FT)

In [None]:
print('1. The testing accuracy of Model-1 (CNN model from Scratch) is :',test_accu_CNN[1]*100, '%')
print('2. The testing accuracy of Model-2 (Transfer Learning) is :',test_accu_TL[1]*100, '%')
print('3. The testing accuracy of Model-3 (Fine Tuning) is :',test_accu_FT[1]*100, '%')