# Cerebri Tumor Deprehensio

A Radically Enhanced Artificial-Intelligence-Based Solution for Detecting Brain Tumors from X-Ray Images of the Brain

<img src="https://www.drugs.com/health-guide/images/3c7b18a8-c266-455c-b552-dd660d9fac50.jpg" width=400>

## Overview of the Contents
### 1. Aim of the Project
### 2. What is a Brain Tumor?
### 3. Exploration of the Dataset
### 4. Global Configurations and Functions
### 5. Importing the Necessary Libraries
### 6. Loading the Dataset
### 7. The Theory of Convolutional Neural Networks
### 8. Building the Model
### 9. Training the Model
### 10. Saving the Model
### 11. Plotting the Model
### 12. Model Evaluation
#### 12.1 Plotting Accuracy
#### 12.2 Plotting Loss
### 13. Conclusion



# Step 1. Aim of the Project

__The aim of this project is to train a neural network on thousands of brain X-ray images, including images of people with brain tumors, so that the computer learns what a brain tumor looks like and how to detect it. The accuracy, speed and efficiency of such an artificial intelligence program can be utilized in the diagnosis of brain tumors.__
<br><br>
__While it may take many years to train a doctor to specialist level, it may take only a few days or even hours to train an artificial intelligence model. Therefore, training and using a deep learning model in the diagnosis of the disease can save a great deal of time.__
<br><br>
__Also, some recent studies have reported that a trained artificial intelligence model can diagnose certain diseases in a shorter time and with higher accuracy than doctors. This suggests that such models can be a reliable aid in diagnosis.__

<img src="https://miro.medium.com/max/1180/1*WCYlOskUZ3dXFbgbXC0ZHg@2x.png" width="450">

# Step 2. What is a Brain Tumor?

**A brain tumor is a mass or growth of abnormal cells in your brain.**

Many different types of brain tumors exist. Some brain tumors are noncancerous (benign), and some brain tumors are cancerous (malignant). Brain tumors can begin in your brain (primary brain tumors), or cancer can begin in other parts of your body and spread to your brain as secondary (metastatic) brain tumors.

How quickly a brain tumor grows can vary greatly. The growth rate as well as the location of a brain tumor determines how it will affect the function of your nervous system.

Brain tumor treatment options depend on the type of brain tumor you have, as well as its size and location.

<img src="https://static.dw.com/image/36871896_101.jpg" width=600>

# Step 3. Exploration of the Dataset

<img src = "https://thumbs.dreamstime.com/b/film-ray-brain-tumor-my-mom-bangkok-thailand-film-ray-brain-tumor-my-mather-bangkok-thailand-146994631.jpg" width=700>


This dataset consists of brain X-ray images of healthy and tumorous brains. There are 5000 X-ray images, categorized into two folders, "Healthy" and "Brain Tumor". Some images are RGB and some are single-channel (grayscale); the image dimensions also vary.
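
A quick way to check how the images are divided between the class folders is to count the files in each one. The sketch below is only an illustration: `count_images_per_class` and the list of extensions are assumptions, not part of the original notebook, and it expects one sub-folder per class as described above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match the actual dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}

def count_images_per_class(dataset_root):
    """Return {class_folder_name: number_of_image_files} for the sub-folders of dataset_root."""
    counts = {}
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling it with the dataset path from the configurations panel would show whether the two classes are roughly balanced.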

# Step 4. Global Configurations and Functions

## Configurations

A configuration panel is not strictly necessary; however, it is best practice to control the neural network's settings from a single source.
That is why Cerebri Tumor Deprehensio has a global configuration system.
Many internal settings of the neural network can be changed from the configurations panel alone.

<img src="https://www.serverwatch.com/wp-content/uploads/2021/09/SW.ConfigurationManagement-scaled.jpg" alt="configuration-image" width="700"/>

In [None]:
"""
    THE CONFIGURATIONS PANEL
"""

# These are the configurations that affect the whole kernel. 
# These variables and configurations are used by a lot of functions. 

imageSize = 150 # this neural network will work with this image size
imageSizeAsTuple = (imageSize, imageSize)
absoulutePathDataset = "/kaggle/input/brian-tumor-dataset/Brain Tumor Data Set/Brain Tumor Data Set/"
# If the training were done in a local environment, this path would be different.

# Step 5. Importing the Necessary Libraries


## The Libraries and Tools Used

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/2d/Tensorflow_logo.svg/1200px-Tensorflow_logo.svg.png" width="70">
<img src="https://keras.io/img/logo.png" width="140">
<img src="https://matplotlib.org/3.4.3/_static/logo2_compressed.svg" width="150">




In [None]:
# TensorFlow
import tensorflow as tf

# Keras (imported consistently through tensorflow.keras to avoid mixing two Keras installations)
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPool2D, LeakyReLU, BatchNormalization, Dropout, Dense, InputLayer, Flatten
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model
from tensorflow.keras import utils, callbacks
from tensorflow.keras.models import Sequential

# Matplotlib
import matplotlib.pyplot as plt

# Step 6. Loading the Dataset

![loading-dataset](https://us.123rf.com/450wm/transylvania/transylvania1811/transylvania181105279/112397982-network-database-flat-white-icons-on-round-color-backgrounds-17-background-color-variations-are-incl.jpg?ver=6)

The images are stored in class folders inside a containing folder. They must be imported, in other words, loaded into memory. Keras' ImageDataGenerator class provides a method called `flow_from_directory` that allows researchers to load images from a specific directory __together with their labels__, which are inferred from the folder names.

In [None]:

# Determining the Train-Validation Split Ratio
## Generally, holding out at least 15 percent for validation is recommended so the network can be evaluated reliably.
trainValidationSplitRatio = 22 / 100
normalizationFactor = 1.0 / 255
zoomRange = ( 1-0.01, 1-0.01 )



# ImageDataGenerator produces batches of rescaled (and optionally augmented) images; flow_from_directory() below returns the iterator that reads them from disk.
generator = ImageDataGenerator(
    rescale= normalizationFactor,
    validation_split = trainValidationSplitRatio,
    dtype = tf.float32, 
    zoom_range = zoomRange )


# Rescaling value is provided as 1 / 255; in other words, each pixel value is normalized by being divided by 255.

# Data type is chosen as 32-bit floating point.
# Zoom range is given as (lower, upper) bounds for random zooms; both bounds are 0.99 here, so the zoom factor is effectively fixed.

**Bear in mind that by providing a split ratio to the generator, the train-validation split is handled.**

**Also keep in mind that NORMALIZATION IS DONE IN THIS STEP AS WELL.**

Another point: some **Data Augmentation** is done in this step as well, for example by providing a **zoom range**.

The ImageDataGenerator class has three methods, flow(), flow_from_directory() and flow_from_dataframe(), which read images from a NumPy array, from folders containing images, and from file paths listed in a pandas DataFrame, respectively.

**directory**: must be set to the path where your ‘n’ class folders are present. <br/>
**target_size**: the size of your input images; every image will be resized to this size.<br/>
**color_mode**: set “grayscale” if the images are black and white or grayscale, or “rgb” if they have three color channels.<br/>
**batch_size**: number of images to be yielded from the generator per batch.<br/>
**class_mode**: set “binary” if you have only two classes to predict, otherwise set “categorical”. If you are developing an autoencoder system, where the input and the output would probably be the same image, set “input”.<br/>
**shuffle**: set True if you want to shuffle the order of the images being yielded, else set False.<br/>
**seed**: random seed for applying random image augmentation and shuffling the order of the images.<br/>

In [None]:
batchCoefficientTraining = 128
batchCoefficientValidation = 4
# batch_size states number of images to be yielded from the generator per batch.

In [None]:
randomSeed = 1997

In [None]:
RGB_COLORS = "rgb"
# The RGB color format was found to be the better choice in this project, simply because it carries more data per image.

In [None]:
# Load the training data from the given location.

train = generator.flow_from_directory(absoulutePathDataset,
                               target_size = imageSizeAsTuple,
                               subset = "training", # generate the training data from these folders
                               batch_size = batchCoefficientTraining,
                               class_mode = "binary",
                               color_mode = RGB_COLORS,
                               shuffle = True,
                               seed = randomSeed)


# Here is the training data.

In [None]:
validation = generator.flow_from_directory(absoulutePathDataset,
                               target_size = imageSizeAsTuple,
                               subset = "validation",  # generate the validation data from these folders
                               batch_size = batchCoefficientValidation,
                               class_mode = "binary", # binary classification is done
                               color_mode = RGB_COLORS,
                               shuffle = True,
                               seed = randomSeed)
# Here is the validation data.

# Step 7. The Theory of Convolutional Neural Networks

Convolutional neural networks (CNNs) are used mainly for image classification.
Structurally, they are very similar to ordinary artificial neural networks (ANNs).
The main difference is that CNNs work with images, which is why they first process the images in convolution layers
before sending the extracted features to the ANN.

<br>

## Architecture of Convolutional Neural Networks (CNN)

<br>

<img src="https://miro.medium.com/max/2510/1*vkQ0hXDaQv57sALXAJquxA.jpeg" width="950">

<br>

The figure above gives an overview of a CNN. It is only a general representation and does not reflect the neural network developed here.

### What is an Image?

Inputs to CNNs are IMAGES. Images are numbers, in other words, arrays of pixels. <br>
In simple terms, IMAGES ARE MATRICES. <br>

Each pixel of an 8-bit RGB image has three channel values (red, green, blue), each between 0 and 255. <br>
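
To make the "images are matrices" idea concrete, here is a minimal sketch using NumPy (an assumption; the notebook itself lets Keras load the images). The 2 x 2 image is synthetic and only illustrates the layout height x width x channels.

```python
import numpy as np

# A tiny synthetic 2 x 2 RGB image: height x width x 3 color channels, 8-bit values.
image = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

print(image.shape)               # (2, 2, 3)
print(image[0, 0])               # top-left pixel: red=255, green=0, blue=0
print(image.min(), image.max())  # 0 255
```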
 
<img src="https://previews.123rf.com/images/papulov74/papulov741802/papulov74180200002/96082800-extremely-closed-shot-of-lcd-tv-rgb-pixel.jpg" width="700">

### Test and Train Split

<img src = "https://miro.medium.com/max/1400/1*-8_kogvwmL1H6ooN1A1tsQ.png" width="650">

__Not just in Convolutional Neural Networks but in Artificial Neural Networks in general, the dataset is split before training. This is generally known as the TEST-TRAIN SPLIT.__

__It refers to splitting the overall dataset into two unequal parts, with the smaller part (commonly around 15-25% of the dataset; 22% in this notebook) held out in order to TEST THE ACCURACY, EFFICIENCY AND LOSS OF THE NEURAL NETWORK on images it has not seen.__

__Consider the scenario where a split is not performed and the neural network is fed the whole dataset: in this case, it would be IMPOSSIBLE TO TEST THE NEURAL NETWORK on unseen data.__

_To sum up, the Test-Train split divides the data into two parts so that performance on unseen data can be measured, which is how overfitting is detected._
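
The idea of the split can be sketched in a few lines of plain Python. This is a generic illustration, not how `ImageDataGenerator` selects its validation subset internally; the function name `train_validation_split` is hypothetical, while the 5000 items, the 0.22 ratio and the seed 1997 mirror values used in this notebook.

```python
import random

def train_validation_split(items, validation_ratio, seed):
    """Shuffle a copy of `items` and split it into (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_ratio)
    return shuffled[n_validation:], shuffled[:n_validation]

train_ids, validation_ids = train_validation_split(range(5000), 0.22, seed=1997)
print(len(train_ids), len(validation_ids))  # 3900 1100
```

The two lists never share an element, which is what makes the held-out part a fair test.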


## Normalization

<img src="https://miro.medium.com/max/1083/1*onZIiGguLfbUYs3aTtmijg.jpeg" width="500">

In a typical neural network, an enormous number of mathematical operations are performed,

from multiplications to derivatives.

If we feed raw values between 0 and 255, the large and unevenly scaled inputs make gradient-based training slower and less stable. So it is best to rescale the values to the range 0-1.

This is called NORMALIZATION.
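
As a minimal sketch of what the rescaling does (using NumPy directly, which is an assumption; in this notebook the `rescale` argument of the generator performs the same multiplication):

```python
import numpy as np

pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Multiply by 1/255 so every value lands in the range [0, 1].
normalized = pixels.astype(np.float32) * (1.0 / 255)

print(normalized.min(), normalized.max())
```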


## Reshape

<img src="https://deeplearningtricks.files.wordpress.com/2021/11/proper-reshape-1.jpg?w=768" width="600">

The same logic as normalization applies. An image might have more than one million pixels.

Consider a 1920 x 1080 image, for example: that is about 2 million pixels.

Training the NN on this many pixel values per image is impractical.

Therefore, every image is resized to a smaller, fixed size (150 x 150 in this notebook). By doing so we lose some pixel information, that is, some details of the image.
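
The effect of shrinking an image can be sketched with a simple nearest-neighbor resize in NumPy. The helper `resize_nearest` is hypothetical and only illustrates the idea; Keras performs the actual resizing through the `target_size` argument, and its pixel selection may differ in detail.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Resize by picking, for every target pixel, one source pixel at the matching position."""
    height, width = image.shape[:2]
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    return image[rows[:, None], cols]

full_hd = np.zeros((1080, 1920, 3), dtype=np.uint8)
small = resize_nearest(full_hd, 150, 150)

print(full_hd.shape[0] * full_hd.shape[1])  # 2073600 pixels
print(small.shape)                          # (150, 150, 3)
```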

## Label Encoding

<img src="https://www.mertmekatronik.com/uploads/images/2020/10/image_750x_5f8c7d06319f9.jpg" width="400">

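Labels are strings, but computers work with numbers. Therefore we convert STRING LABELS TO NUMBERS.

For multi-class problems, Keras does this with ONE-HOT ENCODING: a vector with a single 1 at the index of the class.

For label 2 out of 8 classes, it uses [0,0,1,0,0,0,0,0]

With the "binary" class mode used in this notebook, each label is simply 0 or 1.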
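
A minimal sketch of this encoding with NumPy follows; the helper name `one_hot` is an assumption made for illustration.

```python
import numpy as np

def one_hot(label, num_classes):
    """Return a vector of zeros with a single 1 at position `label`."""
    vector = np.zeros(num_classes, dtype=int)
    vector[label] = 1
    return vector

print(one_hot(2, 8))  # [0 0 1 0 0 0 0 0]
```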

## Patterns and Pattern Recognition

Objects in real life consist of small parts, such as circles, points, lines and edges.

Consider a square, for example: it consists of 4 lines.

Consider an eye, for example: it consists of many circles and very small lines.

If we can figure out a way TO DISTINGUISH THOSE SMALL PARTS, WE ARE ABLE TO DISTINGUISH BIGGER OBJECTS as well.


## Convolutions and Convolution Operation

Convolution kernels are basically FEATURE DETECTORS.

We apply convolution kernels (small matrices) to the image. A kernel can be a 3x3 matrix, for example.

By sliding these kinds of matrices over the image, we are able to detect basic primitive structures.

In other words, THIS IS FEATURE DETECTION.

![CONVOLUTION-OPERATION](https://miro.medium.com/max/1400/1*ROh_38pysewuh6fVPQpxFQ.png)

### Feature detector detects features, like edges or convex shapes. 
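
The operation can be sketched by hand in NumPy. This is a simplified illustration (no padding, stride 1, a single channel), with a hand-picked vertical-edge kernel; in the real network the kernel values are learned during training, and `convolve2d_valid` is a hypothetical helper name.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products at each position."""
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + k_h, j:j + k_w] * kernel)
    return out

# 5 x 5 image: dark (0) on the left two columns, bright (1) on the right three.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Kernel that responds where brightness increases from left to right.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

print(convolve2d_valid(image, vertical_edge))  # every row is [3. 3. 0.]
```

The large values mark the positions where the window covers the dark-to-bright edge; the flat bright region gives 0.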


## Applying Activation Function ReLU During Convolution Operation

Apply an activation function, preferably ReLU (Rectified Linear Unit), to break up linearity.

![relu](https://miro.medium.com/max/1400/1*XxxiA0jJvPrHEJHD4z893g.png)

As seen, negative values are set to zero by ReLU. Linearity is broken.

Non-linearity is DESIRED in AI, because a stack of purely linear operations collapses into a single linear operation, so there would be no point in stacking many layers.

By applying ReLU, linearity is broken.
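
ReLU itself is a one-line function. A minimal NumPy sketch (the feature-map values are made up for illustration):

```python
import numpy as np

def relu(x):
    """Keep positive values, replace negative values with zero."""
    return np.maximum(0, x)

feature_map = np.array([[-3.0, 2.0],
                        [0.5, -1.5]])
print(relu(feature_map))
```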


## Padding

After applying a convolution operation, the size of the image IS REDUCED.

This is usually not desired, because the pixels at the borders contribute less and less.

Therefore, we apply PADDING to the image,

i.e. the size of the image is first INCREASED by adding a border (typically of zeros), and then the convolution operation reduces it TO THE SAME SIZE AS BEFORE.

That is why it is called SAME PADDING.

By applying same padding, LOSS OF INFORMATION at the borders IS AVOIDED.

![padding](http://xrds.acm.org/blog/wp-content/uploads/2016/06/Figure_3.png)
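
The size arithmetic behind padding can be checked with a small helper. This is a sketch for stride-1 convolutions only, and `conv_output_size` is a hypothetical name; the 150-pixel input and 3 x 3 kernel match this notebook's configuration.

```python
def conv_output_size(input_size, kernel_size, padding):
    """Output width (or height) of a stride-1 convolution with `padding` pixels added on each side."""
    return input_size + 2 * padding - kernel_size + 1

print(conv_output_size(150, 3, padding=0))  # 148: without padding the image shrinks
print(conv_output_size(150, 3, padding=1))  # 150: one pixel of padding keeps the size ("same")
```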


## Pooling

We can’t keep all the information in the image until the artificial neural network. That would make a vector of hundreds of thousands of values, which would make training the ANN impractical.

That is why we need to choose representatives. This means we divide the image into pools and choose one value to represent each pool.
In the end, even though some pixel information is lost, this makes it feasible to train the ANN.

This is called DOWN-SAMPLING.

Techniques such as Max Pooling, Average Pooling etc. can be applied.
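
Max pooling with 2 x 2 pools can be sketched in NumPy as follows. The helper `max_pool_2x2` is hypothetical and handles a single 2D feature map; odd trailing rows or columns are dropped, which corresponds to the "valid" padding used for the pooling layers in this notebook.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Split the map into non-overlapping 2 x 2 pools and keep the maximum of each."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 1],
                        [4, 6, 5, 0],
                        [7, 2, 9, 8],
                        [1, 0, 3, 4]])
print(max_pool_2x2(feature_map))  # [[6 5], [7 9]] as a 2 x 2 array
```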

<img src="https://media.geeksforgeeks.org/wp-content/uploads/20190721025744/Screenshot-2019-07-21-at-2.57.13-AM.png"  width="700">


## Flattening

Artificial Neural Networks work with vectors; that is why we need to FLATTEN,

in other words, convert the 2D or 3D matrix into a 1D vector.

And then we can feed it to the ANN.

![flattening](https://sds-platform-private.s3-us-east-2.amazonaws.com/uploads/73_blog_image_1.png)
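
As a rough sketch of what flattening means for this notebook's network: assuming a 150 x 150 input, three 2 x 2 poolings and 32 feature maps coming out of the last convolution (the values used in the model built in Step 8), the flattened vector length works out as follows.

```python
import numpy as np

size = 150
for _ in range(3):
    size //= 2  # each 2 x 2 pooling halves the size: 150 -> 75 -> 37 -> 18

feature_maps = np.zeros((size, size, 32))  # placeholder for the last pooling output
flat = feature_maps.reshape(-1)            # 3D block -> 1D vector

print(flat.shape)  # (10368,)
```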


## Full Connection Layer (Artificial Neural Network)

After processing the image with convolution and pooling operations, we have a 1D vector that represents our image.

This vector goes into an Artificial Neural Network, which makes the predictions.


**Input Layer**: The input is the 1D vector that represents our image.

**Hidden Layers**: Loosely inspired by the human brain, which is extremely complex and made up of many layers of connected neurons.
More hidden layers give the network more capacity, but they also make it slower to train and more prone to overfitting.

**Output Layer**: The prediction made by the Neural Network.

![full-connection](https://sds-platform-private.s3-us-east-2.amazonaws.com/uploads/74_blog_image_1.png)


In this layer, every neuron is connected to all neurons of the previous layer. With this, the Convolutional Neural Network is complete.
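
A fully connected layer is a matrix multiplication followed by an activation. The sketch below uses made-up weights and a two-value "image" purely for illustration; the helper names are assumptions, and in the real network the weights are learned.

```python
import numpy as np

def dense_relu(x, weights, bias):
    """One fully connected layer: every output neuron sees every input value."""
    return np.maximum(0.0, x @ weights + bias)

def sigmoid(z):
    """Squash a value into the range (0, 1) so it can be read as a probability."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0])                   # flattened input with 2 values
hidden_w = np.array([[1.0, -1.0],
                     [0.5, 1.0]])          # made-up weights: 2 inputs -> 2 neurons
hidden_b = np.array([-3.0, 0.0])

hidden = dense_relu(x, hidden_w, hidden_b)
print(hidden)                              # [0. 1.]

output = sigmoid(hidden @ np.array([2.0, 0.0]) + 0.0)
print(output)                              # 0.5
```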

## Applying Dropout

Dropout is a technique that temporarily breaks the connections of some randomly selected neurons during training.

The aim is to make the Artificial Neural Network (ANN) more robust and to keep it from memorizing the data.

Mathematically, it corresponds to ignoring some neurons (setting their outputs to zero) during forward propagation.

![dropout](https://3.bp.blogspot.com/-W4llqbhI44U/VjUuHTUVFmI/AAAAAAAABxA/gyyuO9CA-tsGdrqpLXNMoqMoIAT45l2MACPcBGAYYCw/s1600/Droput.jpg)
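
A minimal NumPy sketch of dropout at training time is given below. It uses the common "inverted dropout" form, in which the surviving values are scaled up so the expected sum stays the same; the function name `dropout` and the example values are assumptions for illustration.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(1997)
activations = np.ones(10)
print(dropout(activations, rate=0.25, rng=rng))  # a mix of 0.0 and about 1.33
```

At prediction time dropout is switched off and all neurons are used.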

# Step 8. Building the Model

In [None]:
# This is the actual brain of this project, hence the most important part.

def buildTheModel(inputShape):
    # this function builds and compiles the convolutional neural network so that it can be trained later
    
    ## -- CONFIGURATIONS PANEL OF THE NEURAL NETWORK --
    # Configurations for this convolutional neural network, if need be, configurations can be changed directly from this panel
    ACTIVATION =  "relu"
    PADDING = "same"
    SIGMOID = "sigmoid"
    LOSS = "binary_crossentropy"
    ACCURACY = "accuracy"
    
    model = Sequential() # Create a CNN model
    
    # -- CONVOLUTION LAYERS --  (Feature Detectors)
        
    ## Convolution Layer 1
    model.add(InputLayer( input_shape = inputShape ))
    model.add(Conv2D(filters = 32, # Number of filters
                     kernel_size = 3, # Dimension of Filter Matrix, specifying the height and width of the 2D convolution window
                     padding = PADDING, # Padding operation
                     activation = ACTIVATION)) # Break up linearity after filtering
    
    model.add(Dropout(0.10))
    
    # Batch normalization applies a transformation that maintains the mean output close to 0 and the output standard deviation close to 1.
    model.add(BatchNormalization( axis = -1,
                                 momentum = 0.99,
                                 epsilon = 0.001,
                                 center = True,
                                 scale = True,
                                 beta_initializer = "zeros",
                                 gamma_initializer = "ones",
                                 moving_mean_initializer = "zeros",
                                 moving_variance_initializer = "ones",
                                 beta_regularizer = None,
                                 gamma_regularizer = None,
                                 beta_constraint = None,
                                 gamma_constraint = None))
    
    
    # Max pooling is the pooling technique applied here: it keeps the maximum of each 2x2 window.
    model.add( MaxPool2D( pool_size = (2, 2),
                         strides = None,
                         padding = 'valid'))
    
    
    ## Convolution Layer 2
    model.add(Conv2D(filters = 64, # Number of filters
                     kernel_size = 3, # Dimension of Filter Matrix
                     padding = PADDING, # Padding operation
                     activation = ACTIVATION)) # Break up linearity after filtering
    
    model.add(Dropout(0.15))
    
    model.add(BatchNormalization( axis = -1,
                                 momentum = 0.99,
                                 epsilon = 0.001,
                                 center = True,
                                 scale = True,
                                 beta_initializer = "zeros",
                                 gamma_initializer = "ones",
                                 moving_mean_initializer = "zeros",
                                 moving_variance_initializer = "ones",
                                 beta_regularizer = None,
                                 gamma_regularizer = None,
                                 beta_constraint = None,
                                 gamma_constraint = None))

    model.add( MaxPool2D( pool_size = (2, 2),
                         strides = None,
                         padding='valid'))
    
    
    ## Convolution Layer 3
    model.add(Conv2D(filters = 32, # Number of filters
                     kernel_size = 3, # Dimension of Filter Matrix
                     padding = PADDING, # Padding operation
                     activation = ACTIVATION)) # Break up linearity after filtering
    
    model.add( MaxPool2D( pool_size = (2, 2),
                         strides = None,
                         padding='valid'))
    
    
    model.add(Flatten())

    # THE HIDDEN LAYERS
    model.add(Dense(units = 156, # 156 neurons in this hidden layer
                    activation = ACTIVATION)) # ReLU is for breaking linearity.
    model.add(BatchNormalization())
    model.add(Dropout(0.25)) # dropout after the first hidden layer
    
    model.add(Dense(64, activation = ACTIVATION))
    model.add(BatchNormalization())
    model.add(Dropout(0.35)) 
    
    model.add(Dense(units = 1 , activation = SIGMOID))
    # At the end of the neural network, sigmoid is chosen as the activation function
    # because only two kinds of output exist: Brain Tumor or Healthy.
    
    # Compilation of the Model
    model.compile(optimizer = Adam( learning_rate = 0.002, beta_1 = 0.9, beta_2 = 0.999, epsilon = 1e-07, amsgrad = False), 
                  loss = BinaryCrossentropy(), 
                  metrics = [ACCURACY])
    
    print("Model is successfully created.")
    
    return model

**Learning Rate and Optimizer**

**During back propagation (i.e. when taking the derivative of the cost function and updating the weights), the learning rate is one of the most critical factors: if it is too high, the updates overshoot and the neural network might not be able to learn; if it is too low, learning might be too slow to make progress.**

In this project, the Adam optimizer is used.
Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.
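
The update rule can be sketched for a single parameter in plain Python. This follows the textbook form of Adam and is only an illustration; the Keras implementation may differ in small details. The default hyperparameters are copied from the `Adam(...)` call in this notebook, while the toy function being minimized is an assumption.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=1e-07):
    """One Adam update for a single parameter; t is the 1-based step counter."""
    m = beta_1 * m + (1 - beta_1) * grad          # first moment: running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad * grad   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)                 # bias correction for the zero initialization
    v_hat = v / (1 - beta_2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + epsilon)
    return theta, m, v

# Toy problem: minimize f(theta) = theta ** 2, whose gradient is 2 * theta.
theta, m, v = 2.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
    print(t, round(theta, 4))
```

On the first step the change in `theta` is almost exactly the learning rate, regardless of how large the gradient is; this is the "adaptive" part of the method.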

<img src="https://machinelearningmastery.com/wp-content/uploads/2017/05/Comparison-of-Adam-to-Other-Optimization-Algorithms-Training-a-Multilayer-Perceptron.png">

In [None]:
# Compute the Input Shape
numberofColorChannels = 3
# number of color channels is 3 because this neural network works with RGB colors
inputShape = (imageSize, imageSize, numberofColorChannels)


model = buildTheModel(inputShape) # Build and compile the model accordingly

## Summary of the Model

In [None]:
model.summary()


# Step 9. Training the Model

All the steps until now were about configuring the neural network; now it is time to train it.

__After this step, it is expected that the neural network has successfully learned weights and biases that are highly accurate.__

<img src="https://ml4a.github.io/images/figures/connection_tweak.png" width="500">

##### Some key concepts about the training process:

### Epochs
__the number of complete passes through the training dataset__

### Batch Size
__a number of samples processed before the model is updated__
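
How epochs and batch size interact can be shown with a little arithmetic. The sketch assumes 3900 training images (78% of the 5000 images; the real count may differ slightly because the split is applied per class folder) and uses the training batch size of 128 and the 12 epochs configured in this notebook. The helper name `updates_per_epoch` is hypothetical.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of weight updates needed to pass once through the training set."""
    return math.ceil(num_samples / batch_size)

training_samples = 3900  # assumed: 78% of 5000 images
steps = updates_per_epoch(training_samples, 128)

print(steps)       # 31 updates per epoch
print(steps * 12)  # 372 updates over 12 epochs
```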

<img src="https://miro.medium.com/max/1010/1*AOiD8LEDWrWy5l_f9qgweQ@2x.jpeg" width="450">

In [None]:
# The fit() method of a Keras model actually trains the neural network.

# Configurations About Training

BATCH_SIZE = 16 # note: not passed to fit() below; the generators' batch_size values determine the actual batch sizes
EPOCHS = 12 # the optimal value is 20 - 24.

## Early Stopping

An **Early Stopping Mechanism** should be set up.
What Early Stopping basically does is **stop training when a monitored metric has stopped improving.**
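
The patience logic can be sketched in plain Python. This is a simplified model of the idea, not the Keras implementation; the function name `early_stopping_epoch` and the validation-loss values are made up for illustration, and the patience of 4 matches the value configured in this notebook.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best loss), both 0-based."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.63, 0.5]
print(early_stopping_epoch(losses, patience=4))  # (6, 2)
```

In this made-up run the loss stops improving after epoch 2, so training ends at epoch 6 and never reaches the better value at epoch 7. With `restore_best_weights = True`, the weights from the best epoch are the ones kept.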

<img src="https://miro.medium.com/max/973/1*nhmPdWSGh3ziatQKOmVq0Q.png" width="700">

In [None]:
MONITOR = "val_loss" # monitor the validation loss; when it stops improving, terminate training
# The goal of a training is to minimize the loss.

MODE = "min" # minimize the monitoring value

PATIENCE_THRESHOULD = 4 
# patience: Number of epochs with no improvement after which training will be stopped.

earlyStopping = callbacks.EarlyStopping( monitor = MONITOR, 
                                        mode = MODE, 
                                        patience = PATIENCE_THRESHOULD, 
                                        restore_best_weights = True,
                                        baseline = None)

In [None]:
history = model.fit( train, 
                    verbose = 1, 
                    callbacks = [earlyStopping], 
                    epochs = EPOCHS, 
                    validation_data = (validation) )


# Step 10. Saving the Model

Neural networks are generally designed to be served as Software as a Service or Artificial Intelligence as a Service.
For this purpose, the __trained neural network, in other words its weights and biases,__ MUST BE SAVED SOMEWHERE.

That's where model-saving practices come in.

__Here the model is saved as an .h5 (HDF5) file.__

<br>

<img src = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQaVyBUJ1_lBDzaCDK1oI-Zqr8CzeX6hNzyHQ&usqp=CAU" width="100">

In [None]:
def savetheModel(model):
    # this function will save the model locally
    model.save("brain-tumor-detection-model.h5")

    
savetheModel(model)

# Step 11. Plotting the Model

In this step, the built Convolutional Neural Network Model will be plotted by Keras' Plot Model utility. 

In [None]:
plot_model( model, 
           to_file = 'brain-tumor-detection-neural-net.png', 
           show_shapes = True, 
           show_layer_names = True,
           rankdir = "TB",
           expand_nested = False,
           dpi = 96,
           layer_range = None)

# Step 12. Model Evaluation

Model evaluation is the process of using different evaluation metrics to understand a machine learning model's performance, as well as its strengths and weaknesses. Model evaluation is important to assess the efficacy of a model during initial research phases, and it also plays a role in model monitoring.

In this project, the **validation data** is used as test data. Therefore, the validation metrics reported during training indicate the model's accuracy on unseen images.
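
Accuracy alone can hide which kind of mistake the model makes, which matters in a medical setting. The sketch below computes accuracy, sensitivity and specificity from sigmoid outputs in plain Python. The function name, the 0.5 threshold and the example values are assumptions for illustration; label 1 is treated as the positive class here, and which folder is mapped to 1 can be checked with `train.class_indices`.

```python
def binary_metrics(y_true, y_prob, threshold=0.5):
    """Turn probabilities into 0/1 predictions and compare them with the true labels."""
    y_pred = [1 if p > threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
print(binary_metrics(y_true, y_prob))
```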

## Step 12.1 Plotting Accuracy


In [None]:
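# Draw the model's accuracy graph with respect to epoch

plt.plot( history.history['accuracy'], label = 'accuracy')
plt.plot( history.history['val_accuracy'], label = 'val_accuracy') # validation accuracy recorded by fit()
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim( [0, 1] )
plt.legend(loc='lower right')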

## Step 12.2 Plotting Loss

In [None]:
# Draw the model's loss graph with respect to epoch

plt.plot( history.history['loss'], label='loss')
plt.plot( history.history['val_loss'], label='val_loss') # validation loss recorded by fit()
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.ylim( [0, 1] )
plt.legend( loc = 'upper right' )

As seen in the graph, the loss decreases as training progresses.

# Step 13. Conclusion

__In conclusion, building intelligent systems is essential for a variety of purposes. In this case, a neural network for a healthcare problem has been developed.__

<br>

__Most importantly, let's focus on why it is crucial to build those intelligent systems. Doctors are humans, and they are prone to errors. In other industries errors are at worst a loss of money, but in healthcare, a mistake might result in death. That is the main reason we, humans, must develop these intelligent healthcare systems.__

<br>
This project is a PROOF-OF-CONCEPT. It means that it demonstrates what such systems could do in the future.

__Finally, let's not forget, deep learning is the solution.__