<div style="background-color:orange;">
    <h1><center>What is Pneumothorax?</center></h1>
</div>

* A pneumothorax is the presence of air in the pleural cavity. It occurs when a breach of the lung surface or chest wall lets air enter the pleural cavity, which causes the lung to collapse.

* Pneumothorax can be caused by a blunt chest injury or by damage from underlying lung disease. Most alarmingly, it may also occur for no obvious reason at all. In some cases, a collapsed lung is a life-threatening event.

* Pneumothorax is usually diagnosed by a radiologist on a chest X-ray, and it can sometimes be very difficult to confirm. An accurate AI algorithm for detecting pneumothorax would be useful in many clinical scenarios: it could triage chest radiographs for priority interpretation, or give non-radiologists a more confident diagnosis.


<div style="background-color:orange;">
    <h1><center>Importing Libraries</center></h1>
</div>

In [None]:
# Standard library
import os
from pathlib import Path

# Data handling and plotting
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Keras / TensorFlow
from tensorflow import keras
from keras.applications import Xception
from keras.applications.xception import preprocess_input
from keras.callbacks import EarlyStopping
from keras.layers import BatchNormalization, Dense, Dropout, GlobalAveragePooling2D, Input, InputLayer, Lambda
from keras.models import Model, Sequential
from keras.preprocessing.image import load_img
from keras.utils import to_categorical

<div style="background-color:orange;">
    <h1><center>Importing The Dataset</center></h1>
</div>

In [None]:
train_img_path = '../input/pneumothorax-binary-classification-task/small_train_data_set/small_train_data_set'
labels = pd.read_csv(r'../input/pneumothorax-binary-classification-task/train_data.csv')

The small Pneumothorax dataset contains 2027 medical images of patients' lungs, taken by radiologists during chest X-ray examinations.

In [None]:
labels.head()

In [None]:
# Drop unnecessary columns (most likely leftover index columns from earlier CSV exports)
labels = labels.drop(columns=['Unnamed: 0', 'Unnamed: 0.1'])

In [None]:
print(f'Number of pictures in the training dataset: {labels.shape[0]}\n')
print(f'Number of different labels: {len(labels.target.unique())}\n')
print(f'Labels: {labels.target.unique()}')
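
Before training, it can help to check how balanced the classes are, since an imbalanced target changes how a metric such as AUC should be read. The sketch below uses a made-up `target` array (an assumption for illustration, not the real labels) to show the calculation; on the real data the same counts would come from `labels.target.value_counts()`.

```python
import numpy as np

# Hypothetical labels for illustration only; the real values live in labels['target']
target = np.array([0, 0, 0, 1, 0, 1, 0, 0])

# Count how often each class occurs and convert the counts to fractions
values, counts = np.unique(target, return_counts=True)
fractions = counts / counts.sum()

for value, count, fraction in zip(values, counts, fractions):
    print(f'class {value}: {count} images ({fraction:.1%})')
```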

<div style="background-color:orange;">
    <h1><center>Data Visualization</center></h1>
</div>

In [None]:
plt.figure(figsize=(20, 40))
# Show the first six images, each titled with its label
for i, (idx, row) in enumerate(labels.head(6).iterrows(), start=1):
    img_path = os.path.join(train_img_path, row['file_name'])
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; matplotlib expects RGB
    ax = plt.subplot(6, 2, i)
    ax.imshow(img)
    ax.set_title(row['target'])

In [None]:
#Extracting different classes
classes = sorted(labels['target'].unique())
n_classes = len(classes)
print(f'Number of classes: {n_classes}')

In [None]:
classes_to_num = dict(zip(classes,range(n_classes)))
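
The mapping above turns each class label into an integer index. A minimal sketch of the same idea, together with the inverse mapping that is handy for decoding predictions later, is shown below. The class names and the `num_to_classes` dictionary are hypothetical and used only for illustration; the notebook derives its classes from `labels['target']`.

```python
# Hypothetical class names for illustration only
classes = sorted(['Pneumothorax', 'Normal'])
n_classes = len(classes)

# Forward mapping: class label -> integer index (same construction as in the notebook)
classes_to_num = dict(zip(classes, range(n_classes)))

# Inverse mapping: integer index -> class label, useful for reading model outputs
num_to_classes = {num: name for name, num in classes_to_num.items()}

print(classes_to_num)
print(num_to_classes)
```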

<div style="background-color:orange;">
    <h1><center>Converting Images to Array</center></h1>
</div>

In [None]:
# Function to load images and convert them to arrays

def images_to_array(data_dir, df, image_size):
    # Use positional arrays so the loop does not depend on the DataFrame index
    image_names = df['file_name'].to_numpy()
    image_labels = df['target'].to_numpy()
    data_size = len(image_names)

    X = np.zeros([data_size, image_size[0], image_size[1], image_size[2]], dtype=np.uint8)
    y = np.zeros([data_size, 1], dtype=np.uint8)

    for i in range(data_size):
        img_path = os.path.join(data_dir, image_names[i])
        # load_img expects target_size as (height, width), without the channel dimension
        img_pixels = load_img(img_path, target_size=image_size[:2])
        X[i] = img_pixels
        y[i] = classes_to_num[image_labels[i]]

    # One-hot encode the labels, then shuffle images and labels with the same permutation
    y = to_categorical(y)
    ind = np.random.permutation(data_size)
    X = X[ind]
    y = y[ind]
    print('Output Data Size: ', X.shape)
    print('Output Label Size: ', y.shape)
    return X, y
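
The last steps of `images_to_array` one-hot encode the labels and shuffle images and labels together. The sketch below reproduces both steps on toy data with plain NumPy; the toy arrays and the fixed seed are assumptions for illustration, and `np.eye(...)[y_int]` is used here as a stand-in for `to_categorical`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Toy stand-ins: 5 "images" of shape 2x2x3 whose pixel values equal their class id
y_int = np.array([0, 1, 1, 0, 1])
X = np.stack([np.full((2, 2, 3), label, dtype=np.uint8) for label in y_int])

# One-hot encoding: row i of the identity matrix encodes class i
y = np.eye(2, dtype=np.float32)[y_int]

# Apply the same permutation to both arrays so each image keeps its label
ind = rng.permutation(len(X))
X_shuffled, y_shuffled = X[ind], y[ind]

# Check that every shuffled image still matches its one-hot label
aligned = all(X_shuffled[i, 0, 0, 0] == y_shuffled[i].argmax() for i in range(len(X)))
print(aligned)
```

The shuffle matters for the training step: Keras takes the `validation_split` fraction from the end of the arrays before any per-epoch shuffling, so an unshuffled dataset sorted by class would give a skewed validation set.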


In [None]:
# Select the image size expected by the pretrained model (Xception defaults to 299x299 RGB)
img_size = (299, 299, 3)
X, y = images_to_array(train_img_path, labels, img_size)
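
Loading every image into one array has a memory cost that is worth estimating up front. The arithmetic below is a rough sketch based on the 2027 images stated earlier and the 299x299x3 input size; it compares the `uint8` storage used here with what the same array would occupy as `float32`.

```python
import numpy as np

n_images = 2027            # dataset size stated in the text
img_size = (299, 299, 3)   # height, width, channels

values_per_image = int(np.prod(img_size))
bytes_uint8 = n_images * values_per_image * np.dtype(np.uint8).itemsize
bytes_float32 = n_images * values_per_image * np.dtype(np.float32).itemsize

print(f'uint8:   {bytes_uint8 / 1024**2:.1f} MiB')
print(f'float32: {bytes_float32 / 1024**2:.1f} MiB')
```

Keeping the pixels as `uint8` (about half a GiB) and converting to floats only inside the model keeps the in-memory array four times smaller than a `float32` copy.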

<div style="background-color:orange;">
    <h1><center>Extracting features using Xception</center></h1>
</div>

In [None]:

def get_features(model_name, data_preprocessor,weight, input_size, data):
    #Prepare pipeline.
    input_layer = Input(input_size)
    preprocessor = Lambda(data_preprocessor)(input_layer)
    
    base_model = model_name(weights=weight,
                            include_top=False,
                            input_shape=input_size)(preprocessor)
    
    avg = GlobalAveragePooling2D()(base_model)
    feature_extractor = Model(inputs = input_layer, outputs = avg)
    
    #Extract feature.
    feature_maps = feature_extractor.predict(data, batch_size=128, verbose=1)
    print('Feature maps shape: ', feature_maps.shape)
    return feature_maps
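
`GlobalAveragePooling2D` collapses each channel of the backbone output into a single number, so every image ends up as one feature vector. A minimal NumPy sketch of that operation follows; the random array is a stand-in for real feature maps, and the 10x10x2048 shape is what Xception produces for 299x299 inputs without its top layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the convolutional output of the backbone: (batch, height, width, channels)
feature_maps = rng.random((4, 10, 10, 2048), dtype=np.float32)

# Global average pooling: average over the two spatial axes, keep batch and channels
pooled = feature_maps.mean(axis=(1, 2))

print(pooled.shape)  # (4, 2048)
```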

In [None]:
# Extracting features using Xception
Xception_preprocessor = preprocess_input
Xception_features = get_features(Xception,
                                 Xception_preprocessor,
                                 '../input/keras-pretrained-models/Xception_NoTop_ImageNet.h5',
                                 img_size, X)
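
Xception's `preprocess_input` rescales pixel values from the range [0, 255] to [-1, 1], which is why the images can stay as raw `uint8` until they reach the model. The helper below, `scale_to_unit_range`, is a hypothetical name that mirrors this arithmetic in NumPy; it is an illustration, not the library's implementation.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# Darkest, middle, and brightest pixel values
pixels = np.array([0, 127, 255], dtype=np.uint8)
scaled = scale_to_unit_range(pixels)
print(scaled)
```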

<div style="background-color:orange;">
    <h1><center>Model Building</center></h1>
</div>

In [None]:
#Callbacks
EarlyStop_callback = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
my_callback=[EarlyStop_callback]
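
The callback above stops training once `val_loss` has failed to improve for `patience` consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. The function below is a simplified sketch of that logic in plain Python; `simulate_early_stopping` is a hypothetical helper, the loss values are made up, and details such as `min_delta` are left out.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch) for a list of validation losses."""
    best_loss = float('inf')
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Training ran to the end without triggering the stop
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: the best value is reached at epoch 1
stopped, best = simulate_early_stopping([1.0, 0.8, 0.9, 0.85, 0.95], patience=2)
print(stopped, best)  # 3 1
```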

In [None]:
# Add a small classification head on top of the extracted Xception features;
# the actual classification is done in the dense layer
# Building Model
model = Sequential()
model.add(InputLayer(Xception_features.shape[1:]))
model.add(Dropout(0.3))
# Softmax makes the two outputs sum to 1, as categorical cross-entropy expects
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['AUC'])
model.summary()

# Train the classification head on the extracted features; 20% of the data is held out for validation
history = model.fit(Xception_features, y, validation_split=0.20, callbacks=my_callback, epochs=50, batch_size=128)
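
Categorical cross-entropy treats each row of predictions as a probability distribution over the classes. The NumPy sketch below, using made-up logits and labels, shows that softmax outputs always sum to 1 per row while independent sigmoid outputs generally do not, and then computes the loss by hand. The helper functions are illustrative and not the Keras implementations.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sigmoid(logits):
    return 1.0 / (1.0 + np.exp(-logits))

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(true * log(pred)); clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0)
    return -(y_true * np.log(y_pred)).sum(axis=1).mean()

# Made-up logits and one-hot labels for two samples
logits = np.array([[2.0, -1.0], [0.5, 0.5]])
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])

softmax_sums = softmax(logits).sum(axis=1)
sigmoid_sums = sigmoid(logits).sum(axis=1)
loss = categorical_crossentropy(y_true, softmax(logits))

print(softmax_sums)
print(sigmoid_sums)
print(loss)
```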

In [None]:
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

In [None]:
# summarize history for AUC
plt.plot(history.history['auc'])
plt.plot(history.history['val_auc'])
plt.title('model AUC')
plt.ylabel('AUC')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
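
The AUC plotted above can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes that quantity exactly by comparing all positive/negative pairs; `roc_auc` is a hypothetical helper and the scores are made up. Keras' `AUC` metric approximates the same area using a fixed set of thresholds, so its values can differ slightly.

```python
import numpy as np

def roc_auc(y_true, scores):
    """Exact ROC AUC via pairwise comparison of positive and negative scores."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # diff[i, j] > 0 means positive i is ranked above negative j; ties count as half
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc(y_true, scores)
print(auc)  # 0.75
```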
