# Multi-Class Keras Image Classifier
Welcome to an example model training notebook that classifies different dog breeds. Hopefully it is sufficiently documented, but be sure to research the terms further to gain a better understanding and to better optimise it for your use.

# Keywords
<b>Before we get started</b> it's really important to understand some essential key words. They come up time and time again in machine learning and are essential for understanding what's happening and how to improve your designs.
<ul>
    <li><b style="font-size:20px;">Neural net (model):</b> Arguably the most important term to know. This is the artificial element that processes data and makes decisions. It can be thought of in a similar way to a biological neural network, which refers to how the brain processes inputs through neurons to create responses. In machine learning we design models that outline the structure of a neural net; training then adjusts its connections (weights) to make it good at whatever it has been designed to do.</li>
    <li><b style="font-size:20px;">Overfitting:</b> Refers to models that are fitted too tightly to their training data and are poor at adapting to unseen data. This is an important phenomenon to avoid: although you'll get a high accuracy score in training, your model will perform poorly when used in reality. To avoid this:
        <ul>
            <li><b>Diverse data:</b> Make sure your dataset is large and diverse. If your images share unrelated features across a class (for example, all huskies are photographed in the snow), your model may link those features to the class and then respond incorrectly in other cases. Large datasets are essential.</li>
            <li><b>Dropout:</b> One of the key structures in CNNs. It helps reduce overfitting by randomly switching off connections during training, which stops the model relying too heavily on any single one and makes it more flexible.</li>
        </ul>
    </li>
</ul>
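
To make the dropout idea concrete, here is a minimal sketch in plain Python (so it runs without any of the notebook's libraries). It assumes the common "inverted" form of dropout, where the surviving values are scaled up to compensate for the ones switched off. The function name `dropout` and the example values are illustrative only and are not taken from the neuralNetStructure file.

```python
import random

def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep_prob = 1.0 - rate
    dropped = []
    for value in activations:
        if rng.random() < rate:
            dropped.append(0.0)                # this unit is switched off for this pass
        else:
            dropped.append(value / keep_prob)  # rescale so the expected total is unchanged
    return dropped

rng = random.Random(0)  # fixed seed so the run is repeatable
example_activations = [0.5, 1.2, 0.8, 0.3, 0.9, 1.1]  # hypothetical layer outputs
print(dropout(example_activations, rate=0.25, rng=rng))
```

Because a different random subset is switched off on every pass, the network cannot depend on any one connection always being present.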

<h2>Imports</h2>
<ul>
    <li><b>Keras:</b> The main library used in this project. It's a machine learning library for designing neural nets and, in this implementation, runs on top of TensorFlow (Google's machine learning library)</li>
    <li><b>Scikit-learn (sklearn):</b> Supports some of the mathematical operations needed and provides helpful tools, used here for label encoding and for splitting the data</li>
    <li><b>NumPy:</b> Provides array and mathematical structures to Python</li>
    <li><b>pickle:</b> Used for serialising (encoding) objects to save to disk</li>
    <li><b>opencv (cv2):</b> Computer vision library used for image manipulation</li>
    <li><b>os:</b> Standard Python library, used here for listing the contents of directories</li>
    <li><b>tqdm:</b> Library that prints loop progress (optional, provided you also remove it from the cells that use it)</li>
</ul>  
You'll also note <b>neuralNetStructure</b>. This is a separate Python file containing the neural net design used in our model. It can be modified to help improve your system, but make sure you understand the structures involved. Read through it once you've worked through and understood this notebook.

In [1]:
# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.preprocessing.image import img_to_array
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
import numpy as np
import pickle
import cv2
import os
import tqdm

from neuralNetStructure import NeuralNetStructure

ModuleNotFoundError: No module named 'keras'

<h2>Important variables</h2>
<ul>
    <li><b>Epochs:</b> The number of complete passes the model makes over the training data</li>
    <li><b>Initial learning rate (init_lr):</b> Controls how far the weights are stepped (updated) at each training iteration. Larger values learn faster but can overshoot; smaller values are steadier but slower</li>
    <li><b>Batch size (bs):</b> Number of samples used per iteration</li>
    <li><b>Image dimensions (image_dims):</b> The shape each image will be converted to, with the values mapping to height, width and depth (the order in which the code cells index them). Depth is the number of colour channels: 3 for RGB colour (red, green, blue), or sometimes 1 for grayscale when colour is not important to the model</li>
</ul>

In [2]:
EPOCHS = 15
INIT_LR = 1e-3
BS = 32
IMAGE_DIMS = (96, 96, 3)
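
As a rough illustration of how these values interact, the sketch below works out how many weight updates a training run performs. The training set size `n_train` is a hypothetical figure chosen for the example, not the size of this notebook's dataset.

```python
EPOCHS = 15
BS = 32
n_train = 1000  # hypothetical number of training images

steps_per_epoch = n_train // BS           # full batches in one pass over the data
total_updates = steps_per_epoch * EPOCHS  # weight updates across the whole run

print(steps_per_epoch, total_updates)
```

A smaller batch size means more (noisier) updates per epoch, while a larger one means fewer, smoother updates at a higher memory cost.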

Define the folder containing your image folders in the <b>dataFolder</b> variable.<br>
<b>data</b> is the array which will hold the images converted to array format. <br>
<b>labels</b> is the array holding the corresponding class name (breed) for each of these images

In [3]:
dataFolder = "dataset"
data = []
labels = []

<h2>Converting images</h2>
<p>Computers don't understand images but they do understand numbers. Before a machine learning system can work with images, it must convert them into a numeric format (the value of each pixel). They also need to be standardised into the same format before being fed into the model</p>
<ol>
    <li>The <b>outer for loop</b> iterates through each folder with the <b>inner loop</b> iterating over each image in the current folder</li>
    <li>Each image is first loaded into the image variable by <b>opencv (cv2)</b>, which reads the file at the current image path (folder and image name)</li>
    <li>Then the image is <b>resized</b> to the width and height stated in <b>IMAGE_DIMS</b></li>
    <li>Then the image is converted into an array (of pixel values) and added to the <b>data</b> list</li>
    <li>The class name is also appended to the <b>labels</b> list (name of the folder)</li>
</ol>
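
To picture what "an array of pixel values" looks like, here is a tiny hand-written example using nested Python lists in place of a real image. The 2x3 image and its values are made up for illustration; the one library detail relied on is that OpenCV stores colour channels in blue, green, red order.

```python
# A hypothetical 2x3 image: 2 rows (height), 3 columns (width), 3 channels (depth).
# Each value is a 0-255 intensity, in blue, green, red order as OpenCV loads them.
tiny_image = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
]

height = len(tiny_image)
width = len(tiny_image[0])
depth = len(tiny_image[0][0])
print((height, width, depth))

blue, green, red = tiny_image[0][0]  # top-left pixel: pure blue in BGR order
print(blue, green, red)
```

Every image in the dataset ends up as a structure like this, only with the height, width and depth given by IMAGE_DIMS.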

In [4]:
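for folder in tqdm.tqdm(os.listdir(dataFolder)):
    folderPath = os.path.join(dataFolder, folder)
    if not os.path.isdir(folderPath):
        continue  # skip stray files that sit next to the class folders
    for img in os.listdir(folderPath):
        # load the image, pre-process it, and store it in the data list
        imagePath = os.path.join(folderPath, img)
        image = cv2.imread(imagePath)
        if image is None:
            continue  # cv2.imread returns None for files it cannot read
        image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
        image = img_to_array(image)
        data.append(image)

        # save folder name as label
        labels.append([folder])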

100%|██████████| 3/3 [00:01<00:00,  1.87it/s]


<p>The data and labels lists are converted to NumPy arrays, as this is the preferred format for our libraries and is better optimised for the purpose. The data array is also converted to <b>float</b> and divided by <b>255</b> to scale the pixel values to a range of 0 to 1</p>

In [5]:
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
print("[INFO] data matrix: {} images ({:.2f}MB)".format(
    len(labels), data.nbytes / (1024 * 1000.0)))
print(labels[0:10])

[INFO] data matrix: 269 images (58.10MB)
[['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']
 ['Chihuahuas']]
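
The scaling step and the reported size can both be checked by hand. The sketch below uses plain Python rather than NumPy; the pixel values are made-up examples, while the image count (269) and dimensions come from the cells above. It assumes each value is stored as a 64-bit (8-byte) float, which is what `dtype="float"` gives in NumPy.

```python
example_pixels = [0, 64, 128, 255]              # example 8-bit intensities
scaled = [p / 255.0 for p in example_pixels]    # now in the range 0 to 1
print(scaled)

# Size of the data matrix: 269 images of 96x96x3 values, 8 bytes per value.
n_values = 269 * 96 * 96 * 3
size_mb = n_values * 8 / (1024 * 1000.0)        # same divisor as the notebook's print statement
print("{:.2f}MB".format(size_mb))
```

The second figure matches the size printed by the cell above, which shows where the memory goes: float storage uses eight times the space of the original 8-bit pixels.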

<p>At this stage our labels are correct, but the machine learning model needs them in numeric form. This process, called <b>one-hot encoding</b>, converts category names into a numeric (binary) representation that works with prediction. Importantly, each class name still maps to its numeric representation, so we can still link predictions back to the original classes</p>

In [6]:
mlb = MultiLabelBinarizer()
labels = mlb.fit_transform(labels)
for (i, label) in enumerate(mlb.classes_):
    print("{}. {}".format(i + 1, label))

1. Beagles
2. Bulldogs
3. Chihuahuas
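
For intuition, one-hot encoding can be written by hand in a few lines. This is a plain-Python sketch of the idea, not how MultiLabelBinarizer is implemented; the helper `one_hot` and the example labels are illustrative.

```python
example_labels = ["Chihuahuas", "Beagles", "Bulldogs", "Chihuahuas"]

classes = sorted(set(example_labels))                 # fixed, alphabetical class order
index = {name: i for i, name in enumerate(classes)}   # class name -> column position

def one_hot(name):
    """Return a vector of zeros with a single 1 in the class's column."""
    vector = [0] * len(classes)
    vector[index[name]] = 1
    return vector

encoded = [one_hot(name) for name in example_labels]
print(classes)
print(encoded)

# Map a vector back to its class name, e.g. after a prediction
decoded = classes[encoded[0].index(1)]
print(decoded)
```

The position of the 1 is what ties each vector back to a class name, which is why the fitted encoder is saved alongside the model at the end of the notebook.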


<h2>Data manipulation</h2>
<p>Although our labels and images have now been converted into a numeric format suited to our model, we still need to do more. First we need to split them into training and testing data (explained below), and then we can make an image augmentor to get more from our image data</p>

<h3>Train & Test Data</h3>
<p>The idea of training and testing data is essential to machine learning. <b>Training data</b> is what the model analyses and fits its weights to. Although this may seem to be the only thing we need for training, we also need <b>testing data</b> to validate its performance. Testing data is explicitly left out of what the model fits to, so that we can assess the model's performance on unseen examples. This shows how well the model generalises and links to the concept of avoiding <b>overfitting</b> (see definitions)</p>
<p>In the cell below we use the <b>train_test_split</b> function to split our data (images) and labels into <b>4</b> new arrays. The 'X' arrays contain the image data, whilst the 'Y' arrays contain the labels (order maintained). </p>
<p>The reason these are referred to as X and Y is to do with how machine learning prediction is conceptualised, which is typically with a graph. These refer to the two axes a graph has, with the x value determining y's. It works the same here, with the x data (image) driving the value of y (class label).</p>
<p>The other important argument to note is <b>test_size</b>, which defines what fraction of the data should be held back for testing. This value is typically in the range of 20% (80:20 split) to 33% (67:33 split). The tradeoff is that the more you put into testing, the less data the model has to train on, while a small test set gives a less reliable measure of performance and makes overfitting harder to spot. <b>random_state</b> simply sets a seed for the random shuffle so the split is repeatable, and is optional</p>
<p>After this cell runs, your data has been divided between training and testing and is also still split into X (images) and Y (classes), hence 4 arrays</p>
<ul>
    <li>trainX: training images</li>
    <li>testX: testing images</li>
    <li>trainY: training classes</li>
    <li>testY: testing labels</li>
</ul>

In [7]:
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.2, random_state=42)
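
To show what a shuffled 80:20 split does, here is a simplified plain-Python version. It is a sketch of the idea only: `simple_split` is a hypothetical helper, it will not reproduce the exact ordering scikit-learn produces, and rounding the test count up is an assumption made for this example.

```python
import math
import random

def simple_split(items, test_size, seed):
    """Shuffle indices with a fixed seed, then cut off a test portion."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_size * len(items))  # assumption: round the test count up
    test = [items[i] for i in indices[:n_test]]
    train = [items[i] for i in indices[n_test:]]
    return train, test

samples = list(range(269))  # stand-ins for the 269 images
train, test = simple_split(samples, test_size=0.2, seed=42)
print(len(train), len(test))
```

In a real split the same shuffled indices must be applied to both the images and the labels so that each image keeps its own label, which is what passing both arrays to one call takes care of.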

<h3>Image augmentation</h3>
<p>As we've discussed, diverse data is essential to a project, but that doesn't have to end at the dataset building stage. With an augmentor we can get more from our data by manipulating it to account for different angles, translations and rotations. This is another important tool for avoiding <b>overfitting</b> and generalising a model. For example, if our dog images mostly showed the dog facing to the right, the model may do poorly with an image of a dog facing left. Using the augmentor we can simulate that same image flipped to overcome this issue.</p>

In [8]:
# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=25, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")
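
The flip example from the text is easy to picture on a tiny image. This plain-Python sketch mirrors a made-up 2x3 grayscale image by reversing each row; it only illustrates the idea behind `horizontal_flip=True` and is not the generator's own code.

```python
def horizontal_flip(image):
    """Mirror an image left to right by reversing every row of pixels."""
    return [row[::-1] for row in image]

# hypothetical 2x3 grayscale image
tiny_gray = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(tiny_gray)
print(flipped)
```

The augmentor applies transformations like this at random while batches are generated, so the model rarely sees exactly the same image twice.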

<h2>Training</h2>
<p>Here we are: the stage where we finally put our processed data through the system. A lot of keywords are used here, so look at the keywords at the top for help and research more to improve your design.</p>

<p>First we need to load our model. Here we call the <b>build</b> function in our <b>NeuralNetStructure</b> class (see that file for how it's designed). We pass it the width, height and depth of the images, as well as how many classes we're training for. We also declare the <b>activation function</b> to be <b>sigmoid</b>. This is a widely used function that maps any value to between 0 and 1 (hence its use for outputting probabilities). Each class is scored independently, so the scores need not sum to 1. The <b>build</b> function returns the model structure we will train with</p>

In [9]:
model = NeuralNetStructure.build(
    width=IMAGE_DIMS[1], height=IMAGE_DIMS[0],
    depth=IMAGE_DIMS[2], classes=len(mlb.classes_),
    finalAct="sigmoid")
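
The sigmoid function itself is short enough to write out. The sketch below applies it to some made-up raw scores (one per class) to show that every output lands between 0 and 1 and that the outputs are independent of each other.

```python
import math

def sigmoid(x):
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

raw_outputs = [2.0, -1.0, 0.0]  # hypothetical raw scores, one per class
scores = [sigmoid(x) for x in raw_outputs]
print(scores)
print(sum(scores))  # not constrained to equal 1
```

Large positive scores approach 1, large negative scores approach 0, and a score of exactly 0 maps to 0.5.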

<p>We also need to declare an optimiser. This is the object the model uses to evaluate how it's doing and tweak values to improve. It uses a loss function to establish how the weights should be changed over time for the best results. I've gone with Adam, which is a widely used optimiser, but please research for yourself to see what other optimisers may offer. Here we pass the <b>learning rate</b> (explained above code cell 2) and <b>decay</b>, which gradually reduces the learning rate as training progresses (in older Keras versions it is applied at every weight update rather than once per epoch; once again, please research more about these if interested) </p>

In [10]:
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
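
To get a feel for what the decay value does, the sketch below evaluates a time-based decay schedule by hand. It assumes the form used by older Keras optimisers, where the rate is divided by `1 + decay * step` and `step` counts weight updates; check the documentation for your Keras version, as newer releases handle schedules differently. The helper `decayed_lr` is illustrative.

```python
INIT_LR = 1e-3
EPOCHS = 15
decay = INIT_LR / EPOCHS

def decayed_lr(step):
    """Time-based decay: the rate shrinks as the number of updates grows."""
    return INIT_LR / (1.0 + decay * step)

for step in (0, 100, 1000, 10000):
    print(step, decayed_lr(step))
```

With a decay this small the learning rate falls only slightly over a short run, so most of the step-size adaptation comes from Adam itself.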

<p>The last step before training is to bring the components together. We compile the model with the optimiser. We also ... </p>

In [11]:
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

In [12]:
H = model.fit_generator(
    aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY),
    steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS, verbose=1)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


In [13]:
model.save("model.model")

In [14]:
# the with block closes the file even if writing fails
with open("label.pickle", "wb") as f:
    f.write(pickle.dumps(mlb))
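
Saving is only half the story: a prediction script later needs to read the labels back. The sketch below shows a pickle round trip using a plain list as a stand-in for the fitted binarizer and a temporary directory instead of the notebook's working folder, so both of those are assumptions made for the example.

```python
import os
import pickle
import tempfile

class_names = ["Beagles", "Bulldogs", "Chihuahuas"]  # stand-in for the fitted binarizer

path = os.path.join(tempfile.mkdtemp(), "label.pickle")
with open(path, "wb") as f:
    pickle.dump(class_names, f)    # serialise to disk

with open(path, "rb") as f:
    restored = pickle.load(f)      # read it back, e.g. in a prediction script

print(restored)
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can run arbitrary code.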