This notebook demonstrates how to use our 3D generator class on grayscale CT scans.

**Note:** 

* This notebook uses an oversimplified CNN architecture and a low number of epochs. Its main purpose is to serve as a guideline on how to use the generator for data augmentation during training.

## Import Libraries

In [1]:
import numpy as np
import pandas as pd   

#libraries for deep learning
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv3D, MaxPool3D, Flatten, Dropout

## Load Data Info

In [2]:
info = pd.read_csv('ct_info.csv')
info.head()

Unnamed: 0,patient_id,scan_id,label
0,0,3131,1
1,1,3143,1
2,10,3152,1
3,1065,3104,1
4,1066,3105,1


In [3]:
info.label.value_counts()

3    19
2    19
1    19
Name: label, dtype: int64

**Labels should start at 0** instead of 1, so we need to remap them.

In [4]:
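# assign the result back instead of using inplace=True on a column selection,
# which may not update the DataFrame in recent pandas versions
info['label'] = info['label'].replace({1: 0, 2: 1, 3: 2})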
info.label.value_counts()

2    19
1    19
0    19
Name: label, dtype: int64
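
Zero-based labels matter because one-hot encoding indexes classes from 0, and the model below is compiled with `categorical_crossentropy`, which expects one-hot targets. The following is a minimal sketch of that idea using NumPy only; the helper names `zero_index` and `one_hot` are hypothetical, and whether `DataGenerator` encodes labels this way internally is an assumption, not something shown in this notebook.

```python
import numpy as np

def zero_index(labels):
    """Map arbitrary integer class labels onto 0..n_classes-1, preserving order."""
    classes, zero_based = np.unique(labels, return_inverse=True)
    return classes, zero_based

def one_hot(zero_based, n_classes):
    """One-hot encode zero-based labels by picking rows of an identity matrix."""
    return np.eye(n_classes, dtype=np.float32)[zero_based]

# Toy labels in the same 1..3 range as the original CSV
raw = np.array([1, 3, 2, 1, 3])
classes, zero_based = zero_index(raw)
encoded = one_hot(zero_based, n_classes=len(classes))
```

With labels left in the range 1..3, indexing a 3-row identity matrix with the value 3 would raise an `IndexError`, which is why the remapping is needed.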

## Split IDs into train and validation sets (80/20)

In [5]:
# randomly sample 80% of the patient IDs for training; the rest form the validation set
train = info['patient_id'].sample(frac = 0.8).to_list()
validation = list(set(info['patient_id']) - set(train))

# convert the train/validation IDs to strings
train = [str(i) for i in train]
validation = [str(i) for i in validation]

# create a dictionary containing the train/validation ID lists
dictionary = {
    'train': train,
    'validation': validation}
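
With only 19 scans per class, a purely random split can leave the validation set unbalanced, and without a seed it changes on every run. As an alternative, here is a hedged sketch of a seeded, stratified split in plain Python. The function `stratified_split`, the seed value, and the toy IDs are all hypothetical and not part of the original notebook.

```python
import random
from collections import defaultdict

def stratified_split(ids, labels, train_frac=0.8, seed=42):
    """Split IDs into train/validation lists, keeping class proportions similar."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    by_label = defaultdict(list)
    for pid in ids:
        by_label[labels[pid]].append(pid)

    train, validation = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        cut = round(train_frac * len(group))
        train.extend(group[:cut])
        validation.extend(group[cut:])
    return train, validation

# Toy data: 57 hypothetical string IDs, 19 per class
toy_labels = {str(i): i % 3 for i in range(57)}
train_ids, val_ids = stratified_split(list(toy_labels), toy_labels)
```

Each class contributes 15 training and 4 validation IDs here, so the two lists are disjoint and together cover every ID.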

In [6]:
#create a dictionary mapping each patient ID (as a string) to its integer label
labels = pd.Series(info.label.values, index = info.patient_id.astype(str)).to_dict()

## Import our Generator

In [7]:
from Image3DGenerator import DataGenerator

In [8]:
# Parameters
params = {'dim': (100,100,100),
          'batch_size': 5,
          'n_classes': 3,
          'n_channels': 1,
          'rotation': True,
          'normalisation': True,
          'min_bound': 0,
          'max_bound': 1,
          'gaussian_noise': True,
          'noise_mean': 0,
          'noise_std': 0.01,
          'shuffle': True,
          'rotate_std':15,
          'path':'./new_data',
          'display_ID':False}
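
The `normalisation`, `min_bound`, `max_bound`, `gaussian_noise`, `noise_mean` and `noise_std` options suggest a per-volume intensity rescaling followed by additive noise. The implementation inside `Image3DGenerator` is not shown in this notebook, so the following is only a sketch of what such steps could look like. `normalise` and `add_gaussian_noise` are hypothetical helper names, and the random volume stands in for a real scan.

```python
import numpy as np

def normalise(volume, min_bound=0.0, max_bound=1.0):
    """Linearly rescale voxel intensities into [min_bound, max_bound]."""
    lo, hi = volume.min(), volume.max()
    if hi == lo:
        # constant volume: avoid division by zero
        return np.full(volume.shape, min_bound, dtype=np.float32)
    scaled = (volume - lo) / (hi - lo)
    return (scaled * (max_bound - min_bound) + min_bound).astype(np.float32)

def add_gaussian_noise(volume, mean=0.0, std=0.01, rng=None):
    """Add independent Gaussian noise to every voxel."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(mean, std, size=volume.shape)
    return volume + noise.astype(volume.dtype)

rng = np.random.default_rng(0)
# Hypothetical small volume with CT-like intensity values
scan = rng.integers(-1000, 400, size=(10, 10, 10)).astype(np.float32)
normalised = normalise(scan)
augmented = add_gaussian_noise(normalised, rng=rng)
# Add batch and channel axes: (batch, x, y, z, channels)
batch = augmented[np.newaxis, ..., np.newaxis]
```

Note that noise added after normalisation can push values slightly outside `[min_bound, max_bound]`; whether the real generator clips them is not something this notebook states.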

In [9]:
# Generators
training_generator = DataGenerator(dictionary['train'], labels, **params)
validation_generator = DataGenerator(dictionary['validation'], labels, **params)
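
For readers unfamiliar with batch generators, the sketch below shows the protocol that `model.fit` relies on: `__len__` returns the number of batches per epoch and `__getitem__` returns one `(X, y)` batch. This is not the code of `Image3DGenerator`. `ToyVolumeGenerator` is a hypothetical stand-in that synthesises random volumes instead of reading files, and it is kept NumPy-only here; a real generator would typically subclass `tf.keras.utils.Sequence`.

```python
import numpy as np

class ToyVolumeGenerator:
    """Hypothetical batch generator following the Keras Sequence protocol."""

    def __init__(self, list_ids, labels, dim=(8, 8, 8), batch_size=5,
                 n_classes=3, n_channels=1, shuffle=True, seed=0):
        self.list_ids = list(list_ids)
        self.labels = labels
        self.dim = dim
        self.batch_size = batch_size
        self.n_classes = n_classes
        self.n_channels = n_channels
        self.shuffle = shuffle
        self.rng = np.random.default_rng(seed)
        self.on_epoch_end()

    def __len__(self):
        # Number of full batches per epoch; a trailing partial batch is dropped.
        return len(self.list_ids) // self.batch_size

    def on_epoch_end(self):
        # Reshuffle the sample order between epochs if requested.
        self.indexes = np.arange(len(self.list_ids))
        if self.shuffle:
            self.rng.shuffle(self.indexes)

    def __getitem__(self, index):
        start = index * self.batch_size
        batch_idx = self.indexes[start:start + self.batch_size]
        X = np.empty((self.batch_size, *self.dim, self.n_channels), dtype=np.float32)
        y = np.zeros((self.batch_size, self.n_classes), dtype=np.float32)
        for row, i in enumerate(batch_idx):
            pid = self.list_ids[i]
            # Placeholder: a real generator would load and augment the scan for pid.
            X[row] = self.rng.random((*self.dim, self.n_channels))
            y[row, self.labels[pid]] = 1.0  # one-hot target
        return X, y

toy_ids = [str(i) for i in range(12)]
toy_targets = {pid: int(pid) % 3 for pid in toy_ids}
gen = ToyVolumeGenerator(toy_ids, toy_targets)
X, y = gen[0]
```

In this sketch, 12 IDs with a batch size of 5 give two batches per epoch, and the last two IDs of each shuffled order are skipped.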

## Create a very simple CNN for demonstration

**Note:** The architecture of the following CNN is not recommended. This is just for demonstration purposes.

In [10]:
no_epochs = 2
size3d = ( 100, 100, 100, 1)

In [11]:
model = Sequential()
# The first layer of a CNN is usually a convolutional layer.
model.add(Conv3D(filters = 8, kernel_size = (4,4,4), strides = (2,2,2), padding = 'valid',
                # input shape the model should expect,
                # i.e. the shape of a single input volume
                input_shape = size3d,
                kernel_initializer='he_uniform',
                # activation function
                activation = 'relu'))

# A convolutional layer is typically followed by a pooling layer.
# Here the pool size is half the kernel size.
model.add(MaxPool3D(pool_size = (2,2,2)))

model.add(Conv3D(filters = 8, kernel_size = (4,4,4), strides = (2,2,2), padding = 'valid',
                kernel_initializer = 'he_uniform',
                activation = 'relu'))
model.add(MaxPool3D(pool_size = (2,2,2)))

model.add(Flatten())


model.add(Dense(50, kernel_initializer='he_uniform',activation = 'relu'))

# Dropout helps reduce overfitting by randomly zeroing units during training.
# Here each unit is dropped with probability 0.5.
model.add(Dropout(0.5))


#final output layer
model.add(Dense(3, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer = 'adam',
              metrics=['accuracy'])

In [12]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv3d (Conv3D)              (None, 49, 49, 49, 8)     520       
_________________________________________________________________
max_pooling3d (MaxPooling3D) (None, 24, 24, 24, 8)     0         
_________________________________________________________________
conv3d_1 (Conv3D)            (None, 11, 11, 11, 8)     4104      
_________________________________________________________________
max_pooling3d_1 (MaxPooling3 (None, 5, 5, 5, 8)        0         
_________________________________________________________________
flatten (Flatten)            (None, 1000)              0         
_________________________________________________________________
dense (Dense)                (None, 50)                50050     
_________________________________________________________________
dropout (Dropout)            (None, 50)                0
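
The output shapes and parameter counts in the summary can be checked by hand. With `padding='valid'`, each spatial axis of size n becomes floor((n - k) / s) + 1 for kernel size k and stride s, and `MaxPool3D` uses a stride equal to its pool size by default. The short sketch below reproduces the numbers; the helper names are hypothetical.

```python
def conv_out(size, kernel, stride):
    """Output length along one axis for a 'valid' convolution or pooling."""
    return (size - kernel) // stride + 1

def conv3d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a cubic 3D kernel."""
    return kernel ** 3 * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

side = 100
after_conv1 = conv_out(side, 4, 2)         # Conv3D, kernel 4, stride 2
after_pool1 = conv_out(after_conv1, 2, 2)  # MaxPool3D, pool 2
after_conv2 = conv_out(after_pool1, 4, 2)  # Conv3D, kernel 4, stride 2
after_pool2 = conv_out(after_conv2, 2, 2)  # MaxPool3D, pool 2
flat = after_pool2 ** 3 * 8                # Flatten over 8 filters

print(after_conv1, after_pool1, after_conv2, after_pool2, flat)
print(conv3d_params(4, 1, 8), conv3d_params(4, 8, 8), dense_params(flat, 50))
```

The first line prints `49 24 11 5 1000` and the second prints `520 4104 50050`, matching the summary. By the same formula, the final `Dense(3)` layer has 50 * 3 + 3 = 153 parameters.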

In [13]:
model.fit(x = training_generator,
          epochs= no_epochs, 
          validation_data= validation_generator)

Epoch 1/2
Epoch 2/2


<tensorflow.python.keras.callbacks.History at 0x7fe011998a58>