# Model building

There are several top-notch deep learning frameworks that would suffice for this task, but in this case we'll be using __TensorFlow__ from __Google__.  

In [1]:
# Import necessary packages.
import tensorflow as tf
from tensorflow.keras import layers, models
tf.get_logger().setLevel("ERROR")

# Important constants.
IMAGE_SIZE = 256
BATCH_SIZE = 8
CHANNELS = 3
EPOCHS = 100

In [2]:
# Load the dataset with the TensorFlow data pipeline.
dataset = tf.keras.preprocessing.image_dataset_from_directory(
    r"C:\Users\ifunanyaScript\Everything\BrainTumour_DiagnosisApp\data\clean_dataset",
    shuffle = True,
    image_size = (IMAGE_SIZE, IMAGE_SIZE),
    batch_size = BATCH_SIZE,
)

Found 3000 files belonging to 2 classes.


The dataset is loaded in batches whose size is set by the "batch_size" parameter. In this case, 3000 images are grouped into batches of 8, giving 375 batches, which is the length of the dataset.

In [3]:
len(dataset)

375
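
The batch count can be checked by hand. A minimal sketch, assuming the loader keeps a final partial batch rather than dropping it (the helper name `count_batches` is made up for illustration):

```python
import math

def count_batches(num_images, batch_size):
    """Number of batches when the last, possibly smaller, batch is kept."""
    return math.ceil(num_images / batch_size)

print(count_batches(3000, 8))   # 3000 / 8 divides evenly
print(count_batches(3001, 8))   # one leftover image adds one more batch
```

With 3000 images and a batch size of 8 there is no leftover, so every batch is full.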

When training a model on a particular dataset, the normal practice is to split the dataset into separate chunks, typically a training chunk and a testing chunk. The split lets us evaluate the trained model on data it did not see during training.  
Most times, ML folks use __scikit-learn's__ ___train_test_split___. In this case, we'll use the ___take___ and ___skip___ methods of the dataset object, which allow us to grab a portion of the dataset by batches.  

The training chunk is always a lot larger than the validation and testing chunks, so that the model is trained on as much data as possible. We'll set aside 80%, 10% and 10% of the data for training, validation and testing respectively.
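
To get a feel for what ___take___ and ___skip___ do, here is a plain-Python analogy built on `itertools.islice`. It only illustrates the semantics on a list of batch indices. It is not how `tf.data` implements them, and the helper names are made up.

```python
from itertools import islice

def take(items, n):
    """First n items, analogous to dataset.take(n)."""
    return list(islice(items, n))

def skip(items, n):
    """Everything after the first n items, analogous to dataset.skip(n)."""
    return list(islice(items, n, None))

batches = list(range(10))        # stand-in for a dataset of 10 batches
train = take(batches, 8)         # first 80% of the batches
remnant = skip(batches, 8)       # the 2 batches that are left
val, test = take(remnant, 1), skip(remnant, 1)
print(train, val, test)
```

Every batch lands in exactly one chunk, which is the property we want from the split.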

In [4]:
train_size = 0.8
print(f"The training dataset should be {int(len(dataset)*train_size)} batches")

The training dataset should be 300 batches


In [5]:
# Takes first 300 batches of the dataset
train_ds = dataset.take(300)
len(train_ds)

300

In [6]:
# Remaining data after the training chunk.
remnant = dataset.skip(300)
len(remnant)

75

In [7]:
val_size = 0.1
print(f"The validation dataset should be {int(len(dataset)*val_size)} batches")

The validation dataset should be 37 batches


In [8]:
# Takes the first 37 batches of the remaining data.
val_ds = remnant.take(37)
len(val_ds)

37

In [9]:
# Takes all the data after the first 37 batches.
test_ds = remnant.skip(37)
len(test_ds)

38
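
The sizes above come from truncating each fraction to a whole number of batches and giving the remainder to the testing chunk, which is why testing ends up with 38 batches rather than 37. A small sketch of that arithmetic in plain Python (the function name `split_sizes` is hypothetical):

```python
def split_sizes(number_of_batches, train_split=0.8, validation_split=0.1):
    """Return (train, validation, test) batch counts for a batched dataset."""
    train_size = int(number_of_batches * train_split)             # truncates towards zero
    validation_size = int(number_of_batches * validation_split)   # 37.5 becomes 37
    test_size = number_of_batches - train_size - validation_size  # whatever is left
    return train_size, validation_size, test_size

print(split_sizes(375))
```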

All of these snippets can be wrapped in a simple function.

In [10]:
def chunking(dataset, train_split=0.8, validation_split=0.1, test_split=0.1, shuffle=True, buffer=1000):
    """
    Split a batched dataset into training, validation and testing chunks.

    The dataset is partitioned by batches using the given split fractions, which
    can be changed when calling the function. The testing chunk receives whatever
    remains after the training and validation chunks, so the fractions must sum to 1.
    """
    # test_split is implied by the other two; check that the three are consistent.
    assert abs(train_split + validation_split + test_split - 1.0) < 1e-9

    number_of_batches = len(dataset)

    if shuffle:
        # Ask for one fixed order so that this shuffle does not itself move
        # batches between the chunks from one pass over the data to the next.
        dataset = dataset.shuffle(buffer, seed=321, reshuffle_each_iteration=False)

    train_size = int(train_split * number_of_batches)
    validation_size = int(validation_split * number_of_batches)

    train_ds = dataset.take(train_size)
    val_ds = dataset.skip(train_size).take(validation_size)
    test_ds = dataset.skip(train_size + validation_size)

    return train_ds, val_ds, test_ds

In [11]:
train_ds, val_ds, test_ds = chunking(dataset)
print(f"""The dataset of {len(dataset)} batches was chunked as follows: 
{len(train_ds)} batches for the Training dataset, 
{len(val_ds)} batches for the Validation dataset and, 
{len(test_ds)} batches for the Testing dataset.""")

The dataset of 375 batches was chunked as follows: 
300 batches for the Training dataset, 
37 batches for the Validation dataset and, 
38 batches for the Testing dataset.


In [12]:
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)
test_ds = test_ds.cache().shuffle(1000).prefetch(buffer_size=tf.data.AUTOTUNE)

Caching improves data retrieval performance by reducing the need to access the underlying storage.  
In short, the dataset is cached in memory. This reduces training time because there is no need to open files and read images during each epoch. After the first epoch, later epochs reuse the data stored by the cache transformation.
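
The idea can be illustrated with a toy loader in plain Python. This is only an analogy for what the cache transformation achieves, not how `tf.data` implements it. The class `CachedLoader` and the function `fake_read` are made up for the example.

```python
class CachedLoader:
    """Reads every item from a slow source once, then serves it from memory."""

    def __init__(self, paths, read_fn):
        self.paths = paths
        self.read_fn = read_fn   # stand-in for opening and decoding an image file
        self.cache = None        # filled during the first full pass

    def __iter__(self):
        if self.cache is None:
            self.cache = [self.read_fn(p) for p in self.paths]
        return iter(self.cache)

reads = []                       # records every simulated disk read

def fake_read(path):
    reads.append(path)
    return f"pixels of {path}"

loader = CachedLoader(["a.jpg", "b.jpg", "c.jpg"], fake_read)
for epoch in range(3):
    for image in loader:
        pass
print(len(reads))   # 3 reads in total, not 9
```

Only the first epoch touches the "disk"; the other two are served from memory.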

In [13]:
# Layers for data_augmentation, resizing and rescaling.
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomContrast(0.5),
    layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    layers.experimental.preprocessing.RandomRotation(0.3),
    layers.experimental.preprocessing.RandomZoom(0.4)
])

resize_and_rescale = tf.keras.Sequential([
    layers.experimental.preprocessing.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    layers.experimental.preprocessing.Rescaling(1.0/255)
])

As always, feature engineering/extraction is an essential step in training a reliably accurate model. However, one might ask how features are extracted when training a CNN. Well, the convolution filters in a CNN slide across the entire image and extract features from it. Mathematically, this is a rather simple process: each output value is a weighted sum of a small patch of the image.  
We'll stack several convolutional layers together to grab as many features as possible. Reducing dimensionality is also very important, so as to keep the number of trainable parameters (weights) small and reduce training time; hence, we use max pooling.  
After the convolutional layers, we'll add a fully connected dense layer containing an arbitrary number of neurons and a final output layer activated by a sigmoid function.
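
To make the convolution and pooling steps concrete, here is a small sketch in plain Python on a single-channel 6x6 "image". The vertical-edge kernel is hand-written for illustration; in a real CNN the kernel weights are learned, and the function names are hypothetical rather than anything from Keras.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]

# Dark on the left, bright on the right: a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1, 1] for _ in range(6)]
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

features = conv2d_valid(image, kernel)   # 4x4 map, large where the edge is
pooled = max_pool_2x2(features)          # 2x2 map after pooling
print(features)
print(pooled)
```

The feature map responds strongly near the edge and not at all in the flat region, and pooling keeps that signal while quartering the number of values.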

In [14]:
input_shape = (BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, CHANNELS)

model = models.Sequential([
    resize_and_rescale,
    data_augmentation,
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid")
])

# Initialises the model.
model.build(input_shape=input_shape)

In [15]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 sequential_1 (Sequential)   (8, 256, 256, 3)          0         
                                                                 
 sequential (Sequential)     (None, 256, 256, 3)       0         
                                                                 
 conv2d (Conv2D)             (None, 256, 256, 32)      896       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 128, 128, 32)     0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 128, 128, 32)      9248      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 64, 64, 32)       0         
 2D)                                                  

Yowza!!!<br> 
We have to train over 155,900 weights.<br>
This will take a pretty long time using a CPU.<br>
Patience is a virtue.
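
The figure can be reproduced by hand from the layer definitions. A rough sketch, assuming Keras's defaults (no padding unless `padding="same"` is given, and pooling that rounds down); the helper names are made up for illustration:

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return inputs * units + units

# (in_channels, filters) of the six 3x3 convolutional layers in the model.
conv_layers = [(3, 32), (32, 32), (32, 64), (64, 64), (64, 64), (64, 64)]
conv_total = sum(conv_params(3, 3, c_in, f) for c_in, f in conv_layers)

# Spatial size: each pooling layer halves it; the last three convolutions
# use no padding, so each trims 2 pixels before its pooling layer.
size = 256
for padding in ["same", "same", "same", "valid", "valid", "valid"]:
    if padding == "valid":
        size -= 2
    size //= 2
flattened = size * size * 64   # values fed into the first dense layer

total = conv_total + dense_params(flattened, 64) + dense_params(64, 1)
print(total)
```

This prints 155937, in line with the "over 155,900" quoted above.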

As we know, backward propagation is one of two fundamental steps in training an __NN__. We'll use the Adam optimizer to update the weights from the gradients that backward propagation computes.<br>
Since this is binary classification, __i.e.__ no tumour[0] or tumour[1], we'll use BinaryCrossentropy to calculate the loss of the model's predictions compared to the actual labels.<br>
Then we can track our model's performance with an accuracy metric.
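
For intuition, binary cross-entropy can be written out in a few lines of plain Python. This is a sketch of the formula, not the Keras implementation. The clipping constant `eps` and the sample numbers are assumptions made for the example.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [0, 1, 1, 0]
confident = [0.1, 0.9, 0.8, 0.2]   # mostly right and fairly sure
unsure = [0.5, 0.5, 0.5, 0.5]      # always sitting on the fence

print(binary_crossentropy(labels, confident))   # the smaller loss of the two
print(binary_crossentropy(labels, unsure))      # log(2), about 0.693
```

The loss shrinks as the predicted probabilities move towards the true labels, which is exactly what training tries to achieve.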


__NB:__ Backward propagation is an application of differential calculus in which we calculate the gradient of the loss function with respect to the parameters (weights and biases). We then update said weights and biases accordingly to minimise the loss.
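
Here is a tiny worked example of that loop: a single sigmoid neuron trained with plain gradient descent on four made-up points. Note that this uses the simplest possible update rule rather than Adam, and every name and number in it is an assumption chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grads(w, b, xs, ys):
    """Mean binary cross-entropy of sigmoid(w*x + b) and its gradients."""
    n = len(xs)
    loss = grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p)) / n
        grad_w += (p - y) * x / n   # d(loss)/dw
        grad_b += (p - y) / n       # d(loss)/db
    return loss, grad_w, grad_b

xs, ys = [-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1]
w, b, learning_rate = 0.0, 0.0, 0.5
losses = []
for step in range(20):
    loss, grad_w, grad_b = loss_and_grads(w, b, xs, ys)
    losses.append(loss)
    w -= learning_rate * grad_w     # step against the gradient
    b -= learning_rate * grad_b
print(losses[0], losses[-1])
```

The loss starts at log(2) and falls as the weight is nudged in the direction that reduces it.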

In [16]:
model.compile(
    optimizer="adam",
    loss = tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=["accuracy"]
)

We train the model by fitting it to the training dataset and we validate its performance with the validation dataset.

In [17]:
history = model.fit(
    train_ds,  # Already batched by the data pipeline, so no batch_size is passed here.
    epochs=EPOCHS,
    verbose=1,
    validation_data=val_ds
)

Epoch 1/100
...
Epoch 100/100


We'll carry out a final evaluation of the model to see its general performance.<br>
We'll do this using the test dataset, a dataset it has not seen before.

In [24]:
scores = model.evaluate(test_ds)
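
With a loss and one metric set in the compile step, `model.evaluate` returns the loss followed by the accuracy. As a rough illustration of what the accuracy figure means for a sigmoid output, here is a plain-Python sketch that thresholds probabilities at 0.5. The helper name and the sample numbers are made up.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that match the labels after thresholding."""
    predicted = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(p == y) for p, y in zip(predicted, y_true))
    return correct / len(y_true)

labels = [0, 1, 1, 0, 1]
probabilities = [0.2, 0.7, 0.4, 0.1, 0.9]   # made-up sigmoid outputs
print(binary_accuracy(labels, probabilities))   # 4 of 5 correct
```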



In [26]:
model.save(r"C:\Users\ifunanyaScript\Everything\BrainTumour_DiagnosisApp\models\1")

