<div style="color:white;
           display:fill;
           border-radius:25px;
           background-color:Purple;
           font-size:210%;
           font-family:Verdana;
           letter-spacing:0.5px">
<p style="padding: 10px;
          color:white;
          text-align:center;"
          >
       WELCOME TO MY NOTEBOOK
</p>
</div>

# About Dataset: Brain Tumors 256

In this notebook, we classify brain images into four classes: three types of brain tumor and a normal (tumor-free) class. The input data consists of images, and our goal is to assign each image its corresponding target label.

![](https://www.pacificneuroscienceinstitute.org/wp-content/uploads/Brain-Tumor-IORT-banner.jpg)

Below is a description of the four classes.

1. **Glioma Tumor**: Glioma is a type of brain tumor that originates from the glial cells in the brain. It can be cancerous and typically requires medical intervention, as it can cause a range of neurological symptoms and pose serious health risks.

2. **Meningioma Tumor**: Meningioma is another type of brain tumor, but it develops from the meninges, the protective membranes around the brain. Meningiomas are generally non-cancerous, and treatment options can vary depending on their size and location.

3. **Normal**: Normal refers to a healthy state without the presence of tumors or significant medical conditions. A normal brain has no abnormal growths or diseases and functions as expected.

4. **Pituitary Tumor**: A pituitary tumor is a growth that occurs in the pituitary gland, a small but critical gland in the brain. These tumors can impact hormone regulation, leading to various symptoms. 




# Import all the libraries

In [None]:
# Import all the libraries
import os 
import numpy as np
import pandas as pd
import tensorflow as tf
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score,ConfusionMatrixDisplay,classification_report


# Exploratory Data Analysis

In [None]:
# List the image files in each class directory
glioma_tumor="/kaggle/input/brain-tumors-256x256/Data/glioma_tumor"
glioma_files = os.listdir(glioma_tumor)

meningioma_tumor="/kaggle/input/brain-tumors-256x256/Data/meningioma_tumor"
meningioma_files = os.listdir(meningioma_tumor)

normal="/kaggle/input/brain-tumors-256x256/Data/normal"
normal_files = os.listdir(normal)


pituitary_tumor="/kaggle/input/brain-tumors-256x256/Data/pituitary_tumor"
pituitary_files= os.listdir(pituitary_tumor)
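
`os.listdir` returns every entry in a folder, including any non-image files that might be present. A minimal sketch of a stricter count, assuming the images use common extensions; the helper name `count_images` and the extension list are illustrative and not part of the original notebook:

```python
import os

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def count_images(data_dir):
    """Return {class_name: number_of_image_files} for each subfolder of data_dir."""
    counts = {}
    for class_name in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        counts[class_name] = sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images("/kaggle/input/brain-tumors-256x256/Data")` would then give one count per class folder.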

# Visualize the number of images in each directory

In [None]:
# declaring data 
data = [len(glioma_files), len(meningioma_files), len(normal_files),len(pituitary_files)] 
keys = ['glioma_tumor', 'meningioma_tumor', 'normal', 'pituitary_tumor'] 

# define Seaborn color palette to use 
palette_color = sns.color_palette('bright') 

# plotting data on chart 
plt.pie(data, labels=keys, colors=palette_color,autopct='%.0f%%', shadow=True) 
plt.legend(loc='center right',  bbox_to_anchor=(1.6, 0.5))

# displaying chart 
plt.show() 


# Let's create the Training data

In [None]:
# Let's create the Training data (80% of the images, selected with a fixed seed)
image_directory="/kaggle/input/brain-tumors-256x256/Data"
batch_size = 32
img_height = 224
img_width = 224
train_data = tf.keras.utils.image_dataset_from_directory(image_directory,
                                                         validation_split=0.2,
                                                         seed=123,
                                                         subset="training",
                                                         image_size=(img_height, img_width),
                                                         batch_size=batch_size)

# Print the names of the four classes

In [None]:
# Print the names of the four classes
class_names = train_data.class_names
print(class_names)

In [None]:
# Let's estimate how many batches the training data is split into
total_images_train_data = 2477
num_batches = total_images_train_data / batch_size  # about 77.4, so the last batch is only partly filled
num_batches

In [None]:
len(train_data)

> Here we can see that 78 batches are created in total: 77 full batches of 32 images each, plus one final, smaller batch with the remaining images.
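
The batch count follows from rounding the division up. A short worked check of that arithmetic, using the 2477 training images and batch size of 32 from the cells above:

```python
import math

total_images_train_data = 2477  # training images, as used in the cell above
batch_size = 32

# The dataset keeps the final partial batch, so the count is rounded up.
num_batches = math.ceil(total_images_train_data / batch_size)
last_batch_size = total_images_train_data - (num_batches - 1) * batch_size

print(num_batches)      # 78
print(last_batch_size)  # 13
```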

# Let's check the shape and label of an image in the first batch of training data

In [None]:
# Let's check the shape and label of a single image in the first batch of training data
for image_batch, labels_batch in train_data.take(1):
    print(f"Shape of one batch of images: {image_batch.shape}")
    print(f"Shape of the labels in one batch: {labels_batch.shape}")
    print(f"First image as a tensor: {image_batch[0]}")
    print(f"Label of the first image: {class_names[int(labels_batch[0])]}")
    

So, here we can see that the batch size is 32, each image has shape (224, 224, 3), and there are 32 corresponding labels.

# Create the Validation data 

In [None]:
# Let's create the Validation data (the remaining 20%, using the same seed as the training split)
val_data = tf.keras.utils.image_dataset_from_directory(image_directory,
                                                       validation_split=0.2,
                                                       subset="validation",
                                                       seed=123,
                                                       image_size=(img_height, img_width),
                                                       batch_size=batch_size)
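
Because both calls use the same `seed` and `validation_split`, the training and validation subsets are drawn from one shuffled file list and should not overlap. A rough sketch of the resulting sizes, assuming Keras reserves `int(validation_split * total)` files for validation, and assuming a total of 3096 images (a figure picked here because it is consistent with the 2477 training images used earlier, not one stated in the notebook):

```python
def split_counts(total_images, validation_split):
    """Return (train_count, val_count) under the assumed truncating split."""
    val_count = int(validation_split * total_images)
    return total_images - val_count, val_count

train_count, val_count = split_counts(3096, 0.2)  # 3096 is an assumed total
print(train_count, val_count)  # 2477 619
```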

# Let's visualize the images

In [None]:
# See the images of tumor
plt.figure(figsize=(14, 14))
for images, labels in train_data.take(1):
    for i in range(20):
        ax = plt.subplot(5, 4, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

# Transfer Learning: we use the pretrained ResNet50 architecture as the base of our model

In [None]:
# Create base model with tf.keras.applications
base_model =tf.keras.applications.resnet.ResNet50(include_top=False)


# Freeze the base model (so the pre-learned patterns remain same)
base_model.trainable = False


# Create inputs into the base model
inputs = tf.keras.layers.Input(shape=(224, 224,3), name="input_layer")
x = base_model(inputs)


# Check data shape after passing it to base_model
print(f"Shape after base_model: {x.shape}")


# Average pool the outputs of the base model (aggregate all the most important information, reduce number of computations)
x = tf.keras.layers.GlobalAveragePooling2D(name="global_average_pooling_layer")(x)



# Classification head (no Flatten needed: the pooled output is already one vector per image)
x=tf.keras.layers.Dense(512, activation="relu")(x)
x=tf.keras.layers.Dropout(0.5)(x)

# Create the output activation layer
outputs=tf.keras.layers.Dense(4, activation="softmax",name="output_layer")(x)

# Combine the inputs with the outputs into a model
model= tf.keras.Model(inputs, outputs, name="model")



# Compile the model
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(), # loss for multiclass classification with integer labels
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=["accuracy"]
             )


# Callbacks--> Halve the learning rate when val_loss plateaus, and stop training once it stops improving
learning_rate_reduction=tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",patience=2,factor=0.5, min_lr=0.00001,verbose =1)
Early_Stopping= tf.keras.callbacks.EarlyStopping(monitor="val_loss",patience=5, restore_best_weights=True)

# Fit the model
history1= model.fit(train_data, epochs=20, validation_data= val_data,callbacks=[Early_Stopping,learning_rate_reduction], verbose=1)
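
One caveat with this setup: the ImageNet weights of ResNet50 were trained on inputs passed through `tf.keras.applications.resnet50.preprocess_input`, while the pipeline above feeds raw 0-255 pixels to the base model. The NumPy sketch below writes out by hand what that preprocessing does (RGB to BGR, then per-channel mean subtraction). The function name `caffe_style_preprocess` is hypothetical; in the notebook itself, one would more likely apply `preprocess_input` to `inputs` before calling `base_model`.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by caffe-style preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(images_rgb):
    """Convert a batch of RGB images (values 0-255) to zero-centred BGR."""
    images = np.asarray(images_rgb, dtype=np.float32)
    images_bgr = images[..., ::-1]          # swap RGB -> BGR
    return images_bgr - IMAGENET_BGR_MEAN   # subtract per-channel means

batch = np.full((2, 224, 224, 3), 128.0)    # dummy mid-grey batch
print(caffe_style_preprocess(batch)[0, 0, 0])
```

Whether this changes the validation scores here would have to be checked by retraining; it is listed as something to try, not as a measured improvement.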


# Model Summary

In [None]:
model.summary()

# Let's evaluate the Model

In [None]:
loss, accuracy= model.evaluate(val_data)
print(f"loss is:{loss}")
print(f"accuracy is:{accuracy}")
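
The loss reported here is the mean sparse categorical cross-entropy over the validation images, i.e. the average of `-log(p)` where `p` is the probability the model assigned to the true class. A minimal NumPy sketch of that quantity, using made-up probabilities for illustration (the helper name is hypothetical and this is not the Keras implementation):

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(p[true class]) over samples; probs has shape (n, classes)."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    true_class_probs = probs[np.arange(len(y_true)), y_true]
    return float(-np.log(true_class_probs).mean())

# Made-up softmax outputs for three images and four classes.
probs = [[0.7, 0.1, 0.1, 0.1],
         [0.2, 0.5, 0.2, 0.1],
         [0.25, 0.25, 0.25, 0.25]]
labels = [0, 1, 3]
print(sparse_categorical_crossentropy(labels, probs))
```

A confident correct prediction contributes a loss near 0, while a uniform guess over four classes contributes `log(4)`, roughly 1.39.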

# Let's see the training history of the ResNet50 Model

In [None]:
history_resnet50= pd.DataFrame(history1.history)
history_resnet50.head()

# Let's plot the Loss and Accuracy Curves

In [None]:
plt.figure(figsize=(15,5))
plt.subplot(1,2,1)
plt.plot(history_resnet50['loss'],label='Train_Loss')
plt.plot(history_resnet50['val_loss'],label='Validation_Loss')
plt.title('Train_Loss and Validation_Loss',fontsize=15)
plt.legend()


plt.subplot(1,2,2)
plt.plot(history_resnet50['accuracy'],label='Train_Accuracy')
plt.plot(history_resnet50['val_accuracy'],label='Validation_Accuracy')
plt.title('Train_Accuracy and Validation_Accuracy',fontsize=15)
plt.legend()
plt.show()

# Let's make the Predictions

In [None]:
x_val=[]
y_val=[]
y_pred=[]

for images, labels in val_data:
    x_val.extend(images.numpy())
    y_val.extend(labels.numpy())
    

predictions=model.predict(np.array(x_val))
# Take the index of the highest probability in each row as the predicted class
y_pred=np.argmax(predictions, axis=1).tolist()


# Let's visualize the images with their Actual and Predicted labels

In [None]:
plt.figure(figsize=(32, 32))

for i in range(32):
    ax = plt.subplot(8, 4, i + 1)
    plt.imshow(x_val[i].astype("uint8"))
    actual_label = class_names[y_val[i]]
    predicted_label = class_names[y_pred[i]]
    
    # Check if the actual and predicted labels are the same
    if actual_label == predicted_label:
        label_color = 'green'
    else:
        label_color = 'red'
    
    plt.title(f'Actual: {actual_label} \n Predicted: {predicted_label}', color=label_color)
    plt.axis("off")

plt.show()


# Let's compute the Precision, Recall and F1-Score

In [None]:
f1_scores=[]
precision_scores=[]
recall_scores=[]

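# scikit-learn metrics expect the arguments in the order (y_true, y_pred);
# swapping them would exchange the precision and recall values
f1_scores.append(f1_score(y_val, y_pred, average="macro"))
precision_scores.append(precision_score(y_val, y_pred, average="macro"))
recall_scores.append(recall_score(y_val, y_pred, average="macro"))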


# Print the Results
print(f"F1-Score:{f1_scores}")
print(f"Precision:{precision_scores}")
print(f"Recall:{recall_scores}")



print("Classification_Report")
print("-----------------------")
print(classification_report(y_val,y_pred))
print("Confusion_Matrix")
print("----------------------")
ConfusionMatrixDisplay.from_predictions(y_val, y_pred, display_labels=class_names, xticks_rotation="vertical")
plt.show()

In [None]:
from sklearn.metrics import confusion_matrix

# Get the confusion matrix as a DataFrame
dataframe= pd.DataFrame(confusion_matrix(y_val, y_pred), columns=class_names, index=class_names)

# Print the DataFrame
print(dataframe)

In [None]:
dataframe.to_csv("tumor.csv")

# Let's see the Incorrect Ratio for each class

In [None]:
stat_dataframe=pd.read_excel("/kaggle/input/tumor-result/tumors_stat.xlsx")
columns=stat_dataframe.columns.to_list()
columns[0]="Tumor_Name"
stat_dataframe.columns=columns
stat_dataframe
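
The contents of `tumors_stat.xlsx` are not shown in the notebook. Assuming "Incorrect Ratio" means the share of each class's validation images that were misclassified, similar numbers could be derived directly from a confusion matrix instead of an external spreadsheet. The sketch below uses a made-up matrix (rows are actual classes, columns are predicted classes) and a hypothetical helper name:

```python
import numpy as np

def incorrect_ratios(cm, class_names):
    """Per-class misclassification ratio from a confusion matrix (rows = actual)."""
    cm = np.asarray(cm, dtype=np.float64)
    totals = cm.sum(axis=1)        # images per actual class
    incorrect = totals - np.diag(cm)  # images of that class predicted as another class
    return dict(zip(class_names, incorrect / totals))

# Made-up confusion matrix for illustration only.
example_cm = [[150, 25, 2, 3],
              [10, 165, 4, 6],
              [1, 3, 80, 1],
              [2, 5, 1, 160]]
names = ["glioma_tumor", "meningioma_tumor", "normal", "pituitary_tumor"]

for name, ratio in incorrect_ratios(example_cm, names).items():
    print(f"{name}: {ratio:.3f}")
```

In the notebook, the matrix from `confusion_matrix(y_val, y_pred)` could be passed in place of `example_cm`.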

# Let's visualize the Incorrect Ratio

In [None]:
plt.figure(figsize=(5,5))
sns.barplot(stat_dataframe, x="Tumor_Name", y="Incorrect Ratio")
plt.xticks(rotation=45)
plt.show()

1. Here we can see that glioma_tumor has the highest incorrect ratio compared to the other classes.
2. According to an article indexed on PubMed (National Institutes of Health), glioblastoma, which is a type of glioma, can mimic meningioma on MRI with dural tail sign, CSF cleft sign, and broad dural contact. Moreover, cerebral angiography can reveal tumor feeders commonly associated with meningioma. These features can contribute to diagnostic confusion.
https://pubmed.ncbi.nlm.nih.gov/27565475/#:~:text=Conclusions%3A%20Glioblastoma%20can%20mimic%20meningioma,can%20contribute%20to%20diagnostic%20confusion.

>  This overlap is a plausible reason for the higher incorrect ratio of glioma_tumor, where the model predicts glioma_tumor as meningioma_tumor. These are still misclassifications, but they reflect a confusion that is also reported in clinical diagnosis.

In [None]:
model_name=["ResNet50"]
result_df= pd.DataFrame({"F1_Score":f1_scores, "Precision_Score":precision_scores, "Recall_Score":recall_scores}, index=model_name)
result_df= result_df.T.sort_values(by="ResNet50", ascending=False)
result_df

In [None]:
result_df.plot(kind="bar",figsize=(5,5), color="red").legend(bbox_to_anchor=(1.5,1))

> 1. In this case, **Recall is more important than Precision**. A False Negative means the model predicts that a person has no brain tumor when they actually have one, which is more dangerous to the person's health than a false alarm. Hence, by minimizing False Negatives we get a higher Recall.
2. Recall is defined as: out of all the images that actually belong to a class, how many did we predict correctly.
3. **Recall = True Positive/(True Positive + False Negative)**
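
To make the difference between Recall and Precision concrete, here is a small sketch that computes both per class from a confusion matrix whose rows are actual classes and whose columns are predicted classes. The matrix values are made up for illustration, and the function name is hypothetical:

```python
import numpy as np

def per_class_recall_precision(cm):
    """Return (recall, precision) arrays from a confusion matrix (rows = actual)."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp   # actual class i predicted as something else
    fp = cm.sum(axis=0) - tp   # other classes predicted as class i
    return tp / (tp + fn), tp / (tp + fp)

# Made-up two-class example: 10 actual images per class.
example_cm = [[8, 2],
              [1, 9]]
recall, precision = per_class_recall_precision(example_cm)
print("Recall:   ", recall)
print("Precision:", precision)
```

For the first class, recall is 8/10 because two of its ten images were missed, while precision is 8/9 because one of the nine images predicted as that class actually belonged to the other class.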