Welcome,
This is my 2nd Neural Network Project.
Here we train a network to detect whether a given MRI scan of the brain contains a tumor or not.

For this project, I've used three datasets from Kaggle and combined them into their appropriate classes, ending up with around 5000 images to work with (effectively more once the images are augmented).

The model reaches a best validation accuracy of 98.57%, which suggests that it works well on both the training and validation data, even though the images come from a diverse set of datasets.



Author: Skanda Vyas

In [None]:
import os
import random
import shutil
import zipfile
from shutil import copy2

import tensorflow as tf
from tensorflow.keras import regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

I uploaded the files to my Dropbox, so we download them from there.

In [None]:
!wget -O Tumors.zip https://www.dropbox.com/s/ymd7eg2mucnm4hp/kjjj.zip?dl=0
!wget -O Tumors2.zip https://www.dropbox.com/s/vchsqodcr4oid49/archive%20%285%29.zip?dl=0
!wget -O Tumors3.zip https://www.dropbox.com/s/d327vnslutc6c6z/archive%20%286%29.zip?dl=0

The files are downloaded as .zip archives, so we use the zipfile package to extract them all into a new folder.

In [None]:
# Extract each downloaded archive into the same "Project" folder
for zip_path in ["/content/Tumors.zip", "/content/Tumors2.zip", "/content/Tumors3.zip"]:
    with zipfile.ZipFile(zip_path, 'r') as ext:
        ext.extractall("Project")

# Remove the extracted brain_tumor_dataset folder, which is not used in the rest of the notebook
shutil.rmtree("/content/Project/brain_tumor_dataset")
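
Before extracting an unfamiliar archive, it can help to look at which top-level folders it contains, since the later cells depend on folder names such as yes and no. The following is a minimal sketch using only the standard library; the helper name `top_level_entries` and the stand-in archive contents are made up for illustration and are not part of the original datasets.

```python
import os
import tempfile
import zipfile


def top_level_entries(zip_path):
    """Return the sorted top-level names found inside a zip archive."""
    with zipfile.ZipFile(zip_path, "r") as archive:
        return sorted({name.split("/")[0] for name in archive.namelist()})


# Build a tiny stand-in archive so the sketch runs anywhere
# (the file names below are hypothetical).
demo_dir = tempfile.mkdtemp()
demo_zip = os.path.join(demo_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("yes/Y1.jpg", b"fake image bytes")
    archive.writestr("no/N1.jpg", b"fake image bytes")

print(top_level_entries(demo_zip))
```

In the notebook, the same helper could be pointed at "/content/Tumors.zip" and the other two archives.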


A function that creates the directories used to store the images from the datasets.

In [None]:
def create_directories(base_dir):
    # Create training/ and validation/ folders, each with yes/ and no/ subfolders
    tr = os.path.join(base_dir, "training")
    os.makedirs(tr, exist_ok=True)
    va = os.path.join(base_dir, "validation")
    os.makedirs(va, exist_ok=True)
    for label in ["yes", "no"]:
        os.makedirs(os.path.join(tr, label), exist_ok=True)
        os.makedirs(os.path.join(va, label), exist_ok=True)


Another function, which splits the images in a given directory between two other directories in a ratio of 4:1.

The images are shuffled, then 80% of them are copied to the training directory and the rest are copied to the validation directory.

In [None]:
def split_data(SOURCE_DIR, TRAINING_DIR, VALIDATION_DIR):
    files = os.listdir(SOURCE_DIR)
    # random.sample with the full length returns a shuffled copy of the list
    shuffled = random.sample(files, len(files))
    # the 0.8 means that 80% of the data goes to the training directory
    limit = int(len(shuffled) * 0.8)
    for count, image in enumerate(shuffled):
        if count < limit:
            copy2(os.path.join(SOURCE_DIR, image), TRAINING_DIR)
        else:
            copy2(os.path.join(SOURCE_DIR, image), VALIDATION_DIR)
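
The 4:1 split can be checked on plain file names without copying anything. This is a small sketch, not the notebook's own function: `split_names`, its `train_fraction` parameter and the `seed` argument are assumptions added here so that the result is reproducible.

```python
import random


def split_names(names, train_fraction=0.8, seed=0):
    """Shuffle names and split them into (training, validation) lists."""
    rng = random.Random(seed)            # local RNG so the shuffle is repeatable
    shuffled = rng.sample(names, len(names))
    limit = int(len(shuffled) * train_fraction)
    return shuffled[:limit], shuffled[limit:]


# Hypothetical file names standing in for one class folder
names = [f"img_{i}.jpg" for i in range(10)]
train, val = split_names(names)
print(len(train), len(val))
```

With 10 names, 8 go to training and 2 to validation, and every name lands in exactly one of the two lists.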


In [None]:
create_directories("/content/")
split_data("/content/Project/yes","/content/training/yes","/content/validation/yes")
split_data("/content/Project/no","/content/training/no","/content/validation/no")
split_data("/content/Project/Brain_Tumor_Detection/yes","/content/training/yes","/content/validation/yes")
split_data("/content/Project/Brain_Tumor_Detection/no","/content/training/no","/content/validation/no")
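
After splitting, it may be worth counting how many files ended up in each folder, to confirm that the split looks as expected. The sketch below is one way to do that; `count_files` is a hypothetical helper and the demo folder is a temporary stand-in for /content.

```python
import os
import tempfile


def count_files(root, splits=("training", "validation"), labels=("yes", "no")):
    """Count the entries in each <root>/<split>/<label> folder (0 if missing)."""
    counts = {}
    for split in splits:
        for label in labels:
            folder = os.path.join(root, split, label)
            counts[(split, label)] = len(os.listdir(folder)) if os.path.isdir(folder) else 0
    return counts


# Temporary stand-in for the notebook's /content layout (file names are made up)
demo_root = tempfile.mkdtemp()
for split, label, n in [("training", "yes", 4), ("validation", "yes", 1)]:
    folder = os.path.join(demo_root, split, label)
    os.makedirs(folder)
    for i in range(n):
        open(os.path.join(folder, f"img_{i}.jpg"), "w").close()

counts = count_files(demo_root)
print(counts)
```

In the notebook itself, the call would be `count_files("/content")`.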

Here, I use ImageDataGenerator to augment the training images so that the model trains more effectively.

We also use flow_from_directory to resize the images to 200x200 and feed them to the model in batches of 16.

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   rotation_range = 10,
                                   height_shift_range = 0.1,
                                   width_shift_range = 0.1,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   brightness_range = (0.8,1.2),
                                   fill_mode = 'nearest'
                                   )

train_gen = train_datagen.flow_from_directory("/content/training", target_size=(200, 200), class_mode='binary', batch_size=16)
# Validation images are only rescaled, not augmented
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_gen = validation_datagen.flow_from_directory("/content/validation", target_size=(200, 200), class_mode='binary', batch_size=16)

Found 3091 images belonging to 2 classes.
Found 1139 images belonging to 2 classes.
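
When no class list is passed, flow_from_directory infers the classes from the subfolder names and numbers them in alphanumeric order. The sketch below only mimics that ordering with plain Python; `infer_class_indices` is a hypothetical helper, and the actual mapping used in a run can be read from `train_gen.class_indices`.

```python
def infer_class_indices(class_dirs):
    """Mimic alphanumeric label assignment: sorted folder names -> 0, 1, ..."""
    return {name: index for index, name in enumerate(sorted(class_dirs))}


mapping = infer_class_indices(["yes", "no"])
print(mapping)
```

Under this ordering "no" is label 0 and "yes" is label 1, so a sigmoid output close to 1 corresponds to the "yes" (tumor) class.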


This is the model used.

There are 4 convolutional layers (with L2 regularizers), each followed by a max-pooling layer, with batch normalization after the last three convolutions. These are followed by a flatten layer and 3 dense layers.

As there are only two classes, i.e. yes or no, we use 1 neuron in the last layer with a sigmoid activation.

In [None]:
import tensorflow as tf
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32,(3,3),activation='relu',input_shape =(200,200,3),kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64,(3,3),activation='relu',kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.BatchNormalization(axis=-1),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64,(3,3),activation='relu',kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.BatchNormalization(axis=-1),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64,(3,3),activation='relu',kernel_regularizer=regularizers.l2(0.01)),
    tf.keras.layers.BatchNormalization(axis=-1),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024,activation='relu'),
    tf.keras.layers.Dense(512,activation='relu'),
    tf.keras.layers.Dense(1,activation = 'sigmoid') ])
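
To see how large the flattened feature vector is, the spatial size can be traced through the four convolution and pooling stages by hand. This sketch assumes the Keras defaults used above (3x3 convolutions with 'valid' padding and stride 1, 2x2 max pooling with stride 2); the helper names are made up for illustration.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def max_pool(size, window=2):
    """Output size of non-overlapping pooling with 'valid' padding."""
    return size // window


size = 200
sizes = []
for _ in range(4):                 # four Conv2D + MaxPooling2D stages
    size = max_pool(conv_valid(size))
    sizes.append(size)

flattened = size * size * 64       # the last Conv2D layer has 64 filters
print(sizes, flattened)
```

The sizes go 99, 48, 23, 10, so the Flatten layer outputs 10 * 10 * 64 = 6400 values, and the first Dense layer holds 6400 * 1024 + 1024 weights and biases.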



Here we compile it with the Adam optimizer and use the binary cross-entropy loss function.

We train the model for 100 epochs, in batches of 16 (the batch size is set in the generators above).

In [None]:
model.compile(loss = 'binary_crossentropy',optimizer = 'adam',metrics = ["accuracy"])

model.fit(train_gen, epochs = 100,validation_data=validation_gen)
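
For reference, binary cross-entropy averages -(y*log(p) + (1-y)*log(1-p)) over the samples, where y is the true label and p is the sigmoid output. The following is a plain-Python sketch of that formula, not the Keras implementation; the clipping constant `eps` is an assumption, and the L2 penalties that Keras adds to the training loss for the regularized layers are left out.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)      # keep log() away from 0
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# One "yes" image predicted at 0.9 and one "no" image predicted at 0.2
loss = binary_crossentropy([1, 0], [0.9, 0.2])
print(round(loss, 4))
```

Confident correct predictions give a loss near 0, while a confident wrong prediction (for example p = 0.01 for a "yes" image) is penalised heavily.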

Here, you can upload MRI scans of the brain, and the model predicts whether each image shows a tumor or not.

In [None]:
import numpy as np
from google.colab import files
from tensorflow.keras.utils import load_img, img_to_array

uploaded = files.upload()

for fn in uploaded.keys():

  # predicting images
  path = '/content/' + fn
  # target_size must match the 200x200 input shape the model was built with
  img = load_img(path, target_size=(200, 200))
  x = img_to_array(img)
  x /= 255
  x = np.expand_dims(x, axis=0)

  classes = model.predict(x)
  print(classes[0])

  # flow_from_directory assigns labels alphanumerically: "no" -> 0, "yes" -> 1,
  # so an output above 0.5 means a tumor is predicted
  if classes[0][0] > 0.5:
    print(fn + " has a tumor present")
  else:
    print(fn + " does not have a tumor present")