# Modifiable Convolutional Neural Network

This notebook can be used to test and benchmark different CNN architectures on different image datasets. The input images may vary in size, because every image is rescaled to a common width and height, and the parameters are set in one place. Note that, by default, the images are converted to grayscale.

## Set up

The structure of your dataset directory should look like this (each entry is a directory):

<br>
<p>parent directory</p>
            <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|---> class 1</p>
            <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|---> class 2</p>
            <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|---> class ..</p>
            <p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|---> class n</p>

Each child directory must be named after the class it contains, using the same names as in the `labels` list.
<br>
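
As a quick sanity check of the layout above, the sketch below counts the files in each class folder. The helper name `count_images_per_class` is hypothetical and not part of the notebook; it only assumes that every label has a folder of the same name under the parent directory.

```python
import os

def count_images_per_class(directory, labels):
    """Return {label: number of files} for each class folder under directory."""
    counts = {}
    for label in labels:
        class_dir = os.path.join(directory, label)
        if not os.path.isdir(class_dir):
            raise FileNotFoundError("Missing class folder: {}".format(class_dir))
        # Count regular files only; nested folders are ignored.
        counts[label] = sum(
            1 for name in os.listdir(class_dir)
            if os.path.isfile(os.path.join(class_dir, name))
        )
    return counts

# Usage (with the variables from the first code cell):
# print(count_images_per_class(directory, labels))
```

A strongly unbalanced count is worth knowing about before training, since it affects how to read the accuracy figures.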

Variables to assign:
- directory - String representation of path to the parent directory of the dataset.
- labels - A list of all the different classes you want to predict (same name as the child directories in the data set folder).
- image_width - The width, in pixels, all images will be scaled to.
- image_height - The height, in pixels, all images will be scaled to.
- perc_test - The percentage of the dataset that will be used for evaluating your model.
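
To make the effect of `perc_test` concrete, here is a minimal sketch of the split arithmetic the first code cell uses. The function name `split_index` is an illustrative assumption; the formula is the one from the notebook.

```python
def split_index(n_samples, perc_test):
    """Index of the first test sample when the last perc_test percent is held out."""
    return int(n_samples - perc_test * n_samples / 100.0)

samples = list(range(10))  # stand-in for 10 shuffled images
cut = split_index(len(samples), 20)
training, test = samples[:cut], samples[cut:]
print(len(training), len(test))  # 8 2
```

Because the index is truncated with `int`, the test set is rounded up when the percentage does not divide the dataset evenly.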

Your Python interpreter should have the following modules installed:
- NumPy
- tensorflow or tensorflow-gpu
- OpenCV
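
One way to check the list above without importing the (fairly heavy) libraries is to ask the import system whether it can find them. This is only a sketch; the import names used here are the usual ones (`cv2` for OpenCV), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names (not pip package names): OpenCV is imported as cv2.
REQUIRED_MODULES = ["numpy", "tensorflow", "cv2"]

def missing_modules(names):
    """Return the names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print(missing_modules(REQUIRED_MODULES))  # an empty list means everything was found
```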


## How to use

Set the parameters discussed above in the first code cell, then run it. This creates a uniform version of your dataset (grayscale, rescaled, shuffled and split into training and test sets) and saves it as four pickle files. If you change your dataset, the first code cell must be run again.
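
To confirm what ended up in a pickle file, it can be loaded back and inspected. The sketch below uses stand-in data (blank images written to a temporary folder) rather than a real dataset, and `load_pickle` is a hypothetical helper.

```python
import os
import pickle
import tempfile

import numpy

def load_pickle(path):
    """Load one pickled object and close the file afterwards."""
    with open(path, "rb") as pickle_in:
        return pickle.load(pickle_in)

# Stand-in data: 6 blank 100x100 grayscale images (not a real dataset).
dummy_x = numpy.zeros((6, 100, 100, 1), dtype=numpy.uint8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "training_x.pickle")
    with open(path, "wb") as pickle_out:
        pickle.dump(dummy_x, pickle_out)
    restored = load_pickle(path)

print(restored.shape, restored.dtype)  # (6, 100, 100, 1) uint8
```

With the notebook's own files, `load_pickle("training_x.pickle").shape` should report the number of training images followed by the image dimensions and a single channel.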

In the second code cell, you assign a name to the model. A timestamp is added to the name so that each run is saved to its own file and log directory instead of overwriting an earlier one. You can then set the training parameters and arrange the layers of your neural network as you like.
When the cell is run, accuracy and loss are printed for each epoch and for the final test. You can investigate your model further using TensorBoard (or other callbacks, with slight refactoring of the script).
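
The notebook builds the name from `time.time()`, which gives a number of seconds. If a human-readable timestamp is preferred, something like the following sketch could be used instead; `timestamped_name` is a hypothetical helper, not part of the notebook.

```python
import time

def timestamped_name(prefix, now=None):
    """Build a name such as 'test-20240131-142500' from a prefix and a time."""
    moment = time.localtime(now)  # now=None means the current time
    return "{}-{}".format(prefix, time.strftime("%Y%m%d-%H%M%S", moment))

print(timestamped_name("test"))
```

Names built this way sort chronologically in the `logs/` folder, which can make runs easier to tell apart in TensorBoard.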

The final code cell tests your model on a single image. Set `path` in this cell to the location of that image.
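
The model's output for one image is a row of per-class probabilities, in the same order as `labels`. The sketch below shows how such a row maps to a class name; the probability values are made up for illustration and do not come from a trained model.

```python
import numpy

labels = ["Cat", "Dog"]  # same order as used for training

# Hypothetical model output for one image: one probability per class.
predictions = numpy.array([[0.1, 0.9]])

best = int(numpy.argmax(predictions[0]))  # index of the most likely class
print(labels[best], float(predictions[0][best]))  # Dog 0.9
```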

### How to open your model in TensorBoard:
1. Open a terminal with your Python environment activated.
2. `cd` to the folder this notebook is saved in.
3. Type `tensorboard --logdir=logs/`
4. Open the link printed in the terminal in a browser.



# Prepare dataset

In [None]:
import cv2
import os
#import matplotlib.pyplot as plt
import numpy
import random
import pickle
import math

# Set up
directory = r'C:/Users/Mattias/Documents/parent directory' # Path to dataset directory here (parent of the image folders)
labels = ['Cat', 'Dog'] # List of the classes you want to predict
image_width = 100 # The width all images will be scaled to
image_height = 100 # The height all images will be scaled to
perc_test = 20 # The percentage of the dataset that will be used for evaluating your model


training_data = []

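def create_training_data():
    """Load every image as grayscale, resize it and pair it with its class index."""
    for class_number, label in enumerate(labels):
        path = os.path.join(directory, label) # folder that holds this class
        for img in os.listdir(path):
            img_path = os.path.join(path, img)
            img_array = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
            if img_array is None: # imread returns None for files it cannot decode
                print('Skipping unreadable file: {}'.format(img_path))
                continue
            # cv2.resize takes (width, height) and returns an array of shape (height, width)
            img_array = cv2.resize(img_array, (image_width, image_height))
            training_data.append([img_array, class_number])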

create_training_data()
random.shuffle(training_data)

X = []
y = []

for feature, label in training_data:
    X.append(feature)
    y.append(label)

# cv2.resize returns arrays of shape (height, width), so keep that order here
X = numpy.array(X).reshape(-1, image_height, image_width, 1)
y = numpy.array(y) # Keras expects the labels as an array, not a Python list

# The last perc_test percent of the shuffled data is held out for testing
split = int(len(X) - perc_test * len(X) / 100.0)
training_x, test_x = X[:split], X[split:]
training_y, test_y = y[:split], y[split:]

# Save each part of the split to its own pickle file
for filename, data in [("training_x.pickle", training_x), ("test_x.pickle", test_x),
                       ("training_y.pickle", training_y), ("test_y.pickle", test_y)]:
    with open(filename, "wb") as pickle_out:
        pickle.dump(data, pickle_out)

# Build, evaluate and save model

In [None]:
import tensorflow
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras.callbacks import TensorBoard
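import time
import pickle # already imported in the first cell; repeated because this cell loads the pickle files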

# Set up
model_name='test-{}'.format(time.time())
validation_split = 0.2
loss_function = 'sparse_categorical_crossentropy'
optimizer = 'adam'
epochs = 5
batch_size = 64

# gpu_options = tensorflow.GPUOptions(per_process_gpu_memory_fraction=0.333)
# sess = tensorflow.Session(config=tensorflow.ConfigProto(gpu_options=gpu_options))

model = models.Sequential()
# Input shape is (height, width, channels), the order of the arrays returned by cv2.resize
model.add(layers.Conv2D(256, (3, 3), activation='relu', input_shape=(image_height, image_width, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(len(labels), activation='softmax'))

#model.summary()

tensorboard = TensorBoard(log_dir="logs/{}".format(model_name))

model.compile(loss=loss_function,
             optimizer=optimizer,
             metrics=['accuracy'])

# Load the pickle files:
with open("training_x.pickle", "rb") as pickle_in:
    training_x = pickle.load(pickle_in)  # training images
training_x = tensorflow.keras.utils.normalize(training_x, axis=1)
#training_x = training_x / 255.0

with open("training_y.pickle", "rb") as pickle_in:
    training_y = pickle.load(pickle_in)  # training labels

with open("test_x.pickle", "rb") as pickle_in:
    test_x = pickle.load(pickle_in)  # test images
test_x = tensorflow.keras.utils.normalize(test_x, axis=1)
#test_x = test_x / 255.0

with open("test_y.pickle", "rb") as pickle_in:
    test_y = pickle.load(pickle_in)  # test labels

model.fit(training_x, training_y, batch_size=batch_size, epochs=epochs, validation_split=validation_split, callbacks=[tensorboard])

test_loss, test_acc = model.evaluate(test_x, test_y)
print('Loss: {}, Accuracy: {}'.format(test_loss, test_acc))

model.save('{}.model'.format(model_name))

# Test model

In [None]:
# Set up
path = r'C:/Users/Mattias/Documents/image.jpg' # path to test image here

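def prepare(path):
    """Load one image and preprocess it the same way as the training data."""
    img_array = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img_array is None: # imread returns None instead of raising an error
        raise FileNotFoundError('Could not read image: {}'.format(path))
    img_array = cv2.resize(img_array, (image_width, image_height))
    img_array = img_array.reshape(-1, image_height, image_width, 1)
    # The model was trained on normalized images, so normalize here as well
    return tensorflow.keras.utils.normalize(img_array, axis=1)


model = tensorflow.keras.models.load_model(model_name + '.model')

predictions = model.predict(prepare(path))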
print(labels[numpy.argmax(predictions[0])])