# Binary Classification using Convolutional Neural Networks

An investigation into the effect of image augmentation on the accuracy and loss of Convolutional Neural Networks. *This work was completed as part of a dissertation project for a Bachelor of Science (Honours) in Computer Science with a specialism in Artificial Intelligence.*

---

## Import Libraries
Import the libraries the project needs to run. The architecture is constructed primarily with Keras on top of TensorFlow.
Supporting libraries such as NumPy are imported for manipulating the arrays.

The `classifier_helpers` library is a collection of helper functions extracted to make the code easier to read.

In [None]:
from keras.callbacks import Callback, LearningRateScheduler
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras import backend as K
import numpy as np
import random
import classifier_helpers as tools

## Configuration Variables for Experiments
Each variable controls a different area of the network structure. They are collated here so that changes between experiments are easier to control.

`results_file_name` defines the name of the output files for results etc.

`dataset_path` points to the location of the dataset files (either images or arrays representing images).

`rotation_range` is the maximum rotation of an image, in degrees, in either the positive or negative direction (max 180°).

`epochs` is the number of epochs to train the model for.

`initial_learning_rate` is the learning rate the network starts with. Changing it can affect how quickly the model converges on a solution.

`batch_size` is the number of samples to show the network before updating the network weights.

`decay_rate` is the rate at which the learning rate should decay over time.

`validation_dataset_size` is the fraction of the dataset to be used for testing. The default of 0.25 sends 75% to training and 25% to testing.

`random_seed` is the number used to seed the random generator, for repeatability of dataset shuffles.

`image_depth` is the number of colour channels. Coloured images have three layers of depth.

`results_path` is the path to the results directory.

`model_name` is the name of the model to be saved in the results file/model structure files.

`plot_name` holds the titles/names for the result plots.

In [None]:
results_file_name = 'Batch-Size-2'
dataset_path = '../dataset/'
rotation_range = 0
epochs = 100
initial_learning_rate = 1e-5
batch_size = 2
decay_rate = initial_learning_rate / epochs 
validation_dataset_size = 0.25
random_seed = 42
image_depth = 3

results_path = 'results/'
model_name = results_file_name + "-" + str(rotation_range)
plot_name = model_name
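
`decay_rate` is computed above, but the cells in this notebook do not pass it to the optimiser. As a rough illustration only, the sketch below assumes it were used the way legacy Keras optimisers apply a `decay` argument, shrinking the rate once per weight update as `lr0 / (1 + decay * iterations)`. The function name `time_based_lr` is hypothetical and not part of the project.

```python
def time_based_lr(initial_lr, decay, iterations):
    """Learning rate after `iterations` weight updates under time-based decay."""
    return initial_lr / (1.0 + decay * iterations)


initial_learning_rate = 1e-5
epochs = 100
decay_rate = initial_learning_rate / epochs  # same formula as the configuration cell

# Show how slowly the rate shrinks for this particular decay constant
for iterations in (0, 1_000, 100_000, 10_000_000):
    print(iterations, time_based_lr(initial_learning_rate, decay_rate, iterations))
```

With these values the decay constant is 1e-7, so under this assumption the rate would only halve after roughly ten million weight updates.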


## Define Helper Functions
Additional functions used by the network, mainly for controlling the learning-rate decay in experiments that vary how the learning rate decays within the model.

In [None]:
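def get_lr_metric(optimizer):
	# Wrap the optimiser's learning rate as a metric so it is recorded in the training history
	def lr(y_true, y_pred):
		return optimizer.lr
	
	return lr

def stepDecay(epoch):
	dropEvery = 10    # number of epochs between drops
	initAlpha = 0.01  # starting rate for this schedule (hard-coded, independent of initial_learning_rate)
	factor = 0.25     # multiplier applied at each drop
	# Compute learning rate for current epoch
	exp = np.floor((1 + epoch) / dropEvery)
	alpha = initAlpha * (factor ** exp)
	
	return float(alpha)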
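
To make the step-decay schedule concrete, the sketch below re-implements the same formula with only the standard library and prints the rate at a few epochs. The name `step_decay` and its keyword arguments are illustrative; the default values are copied from `stepDecay`.

```python
import math


def step_decay(epoch, init_alpha=0.01, factor=0.25, drop_every=10):
    """Same formula as stepDecay: multiply by `factor` every `drop_every` epochs."""
    exponent = math.floor((1 + epoch) / drop_every)
    return init_alpha * (factor ** exponent)


# Epoch indices are zero-based, as passed by a Keras LearningRateScheduler
for epoch in (0, 8, 9, 18, 19, 29):
    print(epoch, step_decay(epoch))
```

Because the exponent uses `1 + epoch`, the first drop lands on epoch index 9 (the tenth epoch) rather than index 10.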


## Build the Network Architecture
Build the network architecture from individual layers. The function encapsulates the entire structure of the network, which can then be initialised as the model. It requires the `width`, `height`, and `depth` of the images it will process, as well as the number of `classes` to classify. Binary classification requires two classes (in this case, benign and malignant samples).

In [None]:
def buildNetworkModel(width, height, depth, classes):
	model = Sequential()
	input_shape = (height, width, depth)
	
	# If 'channels_first' is being used, update the input shape
	if K.image_data_format() == 'channels_first':
		input_shape = (depth, height, width)
	
	# First layer
	model.add(
		Conv2D(20, (5, 5), padding = "same", input_shape = input_shape))  # Learning 20 (5 x 5) convolution filters
	model.add(Activation("relu"))
	model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))
	
	# Second layer
	model.add(Conv2D(50, (5, 5), padding = "same"))
	model.add(Activation("relu"))
	model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2)))
	
	# Third layer - fully-connected layer
	model.add(Flatten())
	model.add(Dense(50))  # 50 nodes
	model.add(Activation("relu"))
	
	# Softmax classifier
	model.add(Dense(classes))  # number of nodes = number of classes
	model.add(Activation("softmax"))  # yields probability for each class
	
	# Return the model
	return model
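
As a sanity check on the architecture, the sketch below works out the flattened feature size and the trainable parameter count per layer by hand, assuming the 64 x 64 x 3 input and two classes used later in the notebook. The helper names (`conv_same`, `max_pool`, `dense`, `summarise`) are hypothetical and only mirror the layer arithmetic; they do not call Keras.

```python
def conv_same(shape, filters, kernel=5):
    """'same' padding keeps height and width; returns (output shape, parameter count)."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, size=2):
    """2 x 2 pooling with stride 2 halves height and width."""
    height, width, channels = shape
    return (height // size, width // size, channels)


def dense(units_in, units_out):
    """Parameter count of a fully-connected layer (weights + biases)."""
    return units_in * units_out + units_out


def summarise(height=64, width=64, depth=3, classes=2):
    shape = (height, width, depth)
    shape, conv1 = conv_same(shape, 20)
    shape = max_pool(shape)
    shape, conv2 = conv_same(shape, 50)
    shape = max_pool(shape)
    flattened = shape[0] * shape[1] * shape[2]
    return flattened, [conv1, conv2, dense(flattened, 50), dense(50, classes)]


flattened, params = summarise()
print("flattened features:", flattened)
print("parameters per layer:", params, "total:", sum(params))
```

Under these assumptions almost all of the parameters sit in the first fully-connected layer, which is where the flattened feature map is connected to the 50 hidden nodes.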

## Load and Initialise the Dataset
Load the dataset. The program can normally load images directly into memory, but that is time consuming, so the images have instead been loaded once and exported as arrays for quicker load times.
By using arrays, the dataset can be exported and loaded more efficiently within Jupyter without the need to share the entire image library (~15GB).

The arrays containing the images and their respective labels are loaded. They are then combined (so that each label corresponds to its image) and shuffled, so that each label and image remain paired.

The randomised arrays are then split according to the training-testing dataset split. By default, it is set to 75% training and 25% testing.

In [None]:
sorted_data = np.load('sorted_data_array.npy')
sorted_labels = np.load('sorted_labels_array.npy')
data = []
labels = []

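# Pair each image with its label, then shuffle so the pairs stay together
combined = list(zip(sorted_data, sorted_labels))
random.seed(random_seed)  # seed the shuffle so it is repeatable, as random_seed intends
random.shuffle(combined)
data[:], labels[:] = zip(*combined)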

# Scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype = "float") / 255.0
labels = np.array(labels)

test_set = int(validation_dataset_size * len(labels))
validation_dataset_labels = labels[-test_set:]

# Partition the data into training and testing splits
(train_x, test_x, train_y, test_y) = train_test_split(data, labels, test_size = test_set, random_state = random_seed)

# Convert the labels from integers to vectors
train_y = to_categorical(train_y, num_classes = 2)
test_y = to_categorical(test_y, num_classes = 2)
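
The shuffle, split, and label-encoding steps can be illustrated on a toy dataset with only the standard library. This is a simplified stand-in, not the notebook's method: the cell above uses `train_test_split` and `to_categorical`, whereas `shuffle_and_split` and `one_hot` here are hypothetical helpers and the sample names are made up.

```python
import random


def shuffle_and_split(samples, labels, test_fraction, seed):
    """Shuffle (sample, label) pairs together, then cut off the test portion."""
    pairs = list(zip(samples, labels))
    random.Random(seed).shuffle(pairs)  # seeded, so the shuffle is repeatable
    test_count = int(test_fraction * len(pairs))
    split_at = len(pairs) - test_count
    return pairs[:split_at], pairs[split_at:]


def one_hot(label, num_classes=2):
    """Turn an integer label into a vector with a single 1.0 entry."""
    return [1.0 if index == label else 0.0 for index in range(num_classes)]


# Toy data: eight placeholder "images", labelled 0 or 1 by parity
samples = ["img%d" % i for i in range(8)]
labels = [i % 2 for i in range(8)]

train_pairs, test_pairs = shuffle_and_split(samples, labels, 0.25, seed=42)
print(len(train_pairs), "training pairs,", len(test_pairs), "testing pairs")
print([one_hot(label) for _, label in test_pairs])
```

Zipping before shuffling is what keeps each label attached to its image; shuffling the two arrays separately would break that pairing.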

## Define Image Augmentation Generators
Image augmentation generators are defined here. Each takes an input image and applies the predefined augmentation method; in this case, `rotation_range` is applied to every image, rotating it by a random angle within that range.

In [None]:
training_augmented_image_generator = ImageDataGenerator(rotation_range = rotation_range, fill_mode = "nearest")
testing_augmented_image_generator = ImageDataGenerator(rotation_range = rotation_range, fill_mode = "nearest")
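
The sketch below illustrates what a rotation range means geometrically: an angle is drawn uniformly from `[-rotation_range, +rotation_range]` degrees and pixel coordinates are rotated about the image centre. It is not how `ImageDataGenerator` is implemented internally; `sample_rotation_angle` and `rotate_point` are hypothetical helpers, and the centre assumes a 64 x 64 image.

```python
import math
import random


def sample_rotation_angle(rotation_range, rng):
    """Draw a rotation angle in degrees, uniformly within +/- rotation_range."""
    return rng.uniform(-rotation_range, rotation_range)


def rotate_point(x, y, angle_degrees, centre=(31.5, 31.5)):
    """Rotate a pixel coordinate about the image centre."""
    theta = math.radians(angle_degrees)
    centre_x, centre_y = centre
    dx, dy = x - centre_x, y - centre_y
    rotated_x = centre_x + dx * math.cos(theta) - dy * math.sin(theta)
    rotated_y = centre_y + dx * math.sin(theta) + dy * math.cos(theta)
    return rotated_x, rotated_y


rng = random.Random(42)
for rotation_range in (0, 30, 180):
    angle = sample_rotation_angle(rotation_range, rng)
    print(rotation_range, angle, rotate_point(41.5, 31.5, angle))
```

With `rotation_range = 0`, as in the configuration cell above, the sampled angle is always zero, so the generators pass images through unrotated.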

## Compile the Network Model
Compile the network model using the predefined structure from `buildNetworkModel`, then apply the optimiser and the learning rate metric.
This is where the loss and accuracy metrics are defined; they are saved in the history dictionary during training.

In [None]:
print(tools.stamp() + "Compiling Network Model")

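# Step-decay schedule: multiply the learning rate by 0.25 every 10 epochs (see stepDecay)
# Note: it only takes effect if passed to training through the `callbacks` argument
learning_rate_schedule = [LearningRateScheduler(stepDecay)]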

# Build the model based on control variable parameters
model = buildNetworkModel(width = 64, height = 64, depth = image_depth, classes = 2)

# Set optimiser
optimiser = Adam(lr = initial_learning_rate)
lr_metric = get_lr_metric(optimiser)

# Compile the model using binary crossentropy, preset optimiser and selected metrics
model.compile(loss = "binary_crossentropy", optimizer = optimiser, metrics = ["accuracy", "mean_squared_error", lr_metric])
# Train the network
print(tools.stamp() + "Training Network Model")
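
For intuition about the loss being optimised, the sketch below computes binary cross-entropy by hand for a single two-class prediction, averaging over the two output nodes. It is a plain-Python approximation that ignores the small clipping constant Keras applies for numerical stability; the function name mirrors the loss string but is defined here for illustration only.

```python
import math


def binary_crossentropy(y_true, y_pred):
    """Mean binary cross-entropy over the output nodes of one sample."""
    terms = [
        -(target * math.log(prob) + (1 - target) * math.log(1 - prob))
        for target, prob in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


target = [0.0, 1.0]  # one-hot label for the second class
for prediction in ([0.2, 0.8], [0.5, 0.5], [0.9, 0.1]):
    print(prediction, binary_crossentropy(target, prediction))
```

The loss shrinks as the probability assigned to the correct class approaches 1 and grows sharply for confident wrong predictions.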

## Train and Save the Model
Train the model, then save the completed model to disk along with all statistics and graphs.

In [None]:
# Save results of training in history dictionary for statistical analysis
history = model.fit_generator(
	training_augmented_image_generator.flow(train_x, train_y, batch_size = batch_size),
	validation_data = (test_x, test_y),
	steps_per_epoch = len(train_x) // batch_size,
	epochs = epochs,
	verbose = 1)

# Save all runtime statistics and plot graphs
tools.saveNetworkStats(history, epochs, initial_learning_rate, model_name, results_path)
tools.saveAccuracyGraph(history, plot_name, results_path)
tools.saveLossGraph(history, plot_name, results_path)
tools.saveLearningRateGraph(history, plot_name, results_path)
tools.saveModelToDisk(model, model_name, results_path)
tools.saveWeightsToDisk(model, model_name, results_path)
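
The `steps_per_epoch` expression determines how many batches, and therefore weight updates, make up one epoch. The sketch below works through that arithmetic; the training-set size of 1500 is a made-up figure for illustration, and `training_schedule` is a hypothetical helper.

```python
def training_schedule(num_training_samples, batch_size, epochs):
    """Return (steps per epoch, total weight updates) for a training run."""
    steps_per_epoch = num_training_samples // batch_size  # same expression as the training cell
    total_updates = steps_per_epoch * epochs
    return steps_per_epoch, total_updates


# Hypothetical training-set size; batch sizes chosen to contrast small and large batches
for batch_size in (2, 32):
    steps, updates = training_schedule(1500, batch_size, epochs=100)
    print("batch_size =", batch_size, "steps per epoch =", steps, "total updates =", updates)
```

A small batch size such as 2 means many more weight updates per epoch than a larger batch, which is one reason batch size is worth varying between experiments.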