# Character classifier with Keras

## Description

We build a classifier to recognize English characters. We use the EnglishImg dataset from http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/, which contains digits and English characters (lowercase and uppercase). We chose *Keras* to implement the classifier: as a neural network gets deeper, keeping track of all its parameters becomes difficult, and, unlike lower-level libraries such as TensorFlow or Theano, Keras provides an intuitive way to build and tune complex networks. We consider it the most appropriate library for building fast and efficient classifiers.

## Step 0 - Import libraries

First, we import the libraries used in the notebook.

In [None]:
import os
import keras
import string
import numpy as np
import matplotlib.pyplot as plt
from keras import models
from keras import layers
from keras import optimizers
from keras.applications import VGG16
from keras.preprocessing.image import ImageDataGenerator

## Step 1 - Variable initialization

Next, we initialize the variables used throughout the notebook. After running the shell script format_dataset.sh, the dataset is moved to /img and split into three folders (/train, /valid and /test).

In [None]:
width = 224
height = 224
nb_epoch = 5
training_path = 'img/train'
validation_path = "img/valid"
test_path = "img/test"
training_batch = 10
validation_batch = 4
test_batch = 10
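# Number of classes = number of sub-folders in the training directory
nb_class = len(next(os.walk(training_path))[1])
# Human-readable labels: 10 digits, 26 uppercase and 26 lowercase letters
class_label = list(string.digits + string.ascii_uppercase + string.ascii_lowercase)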
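
The class count above relies on `os.walk` listing one sub-folder per class. The following is a minimal, self-contained sketch of that idea on a throwaway directory; the helper name `count_class_folders` and the folder names are illustrative assumptions, not part of the notebook.

```python
import os
import string
import tempfile

def count_class_folders(path):
    """Count the immediate sub-folders of `path` (one per class)."""
    # next(os.walk(path)) yields (dirpath, dirnames, filenames) for `path` itself
    return len(next(os.walk(path))[1])

# 10 digits + 26 uppercase + 26 lowercase letters
all_labels = list(string.digits + string.ascii_uppercase + string.ascii_lowercase)

with tempfile.TemporaryDirectory() as root:
    for name in ("class_a", "class_b", "class_c"):      # hypothetical class folders
        os.mkdir(os.path.join(root, name))
    open(os.path.join(root, "notes.txt"), "w").close()  # plain files are not counted
    n_found = count_class_folders(root)

print(n_found, len(all_labels))
```

If the number of folders found in the training directory differs from the number of labels, some classes are probably missing from the split.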

## Step 2 - Data loader

We build the dataset loaders with the Keras *ImageDataGenerator* class. Preprocessing the images is important for training the network properly, and the parameters (e.g., the target size) can be tuned to improve the classifier accuracy. In addition, using the image masks could further improve performance.

In [None]:
datagen = ImageDataGenerator(rescale = 1./255,
                             rotation_range = 20,
                             width_shift_range = 0.2,
                             height_shift_range = 0.2,
                             horizontal_flip = True,
                             fill_mode='nearest')

training = datagen.flow_from_directory(training_path,
                                       target_size = (width, height),
                                       batch_size = training_batch,
                                       class_mode = 'categorical')
validation = datagen.flow_from_directory(validation_path,
                                         target_size = (width, height),
                                         batch_size = validation_batch,
                                         class_mode = 'categorical')

# Same value as nb_class from Step 1; both names are used in later cells
nb_classes = len(next(os.walk(training_path))[1])
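
To make two of the generator options concrete, here is a NumPy sketch of what `rescale` and `horizontal_flip` do to a pixel array. It is an illustration on a hand-made 3x3 glyph, not the Keras implementation, and the function names are assumptions. It also hints at why mirroring may be worth re-evaluating for characters: the mirrored glyph is no longer the same shape.

```python
import numpy as np

def rescale(image, factor=1.0 / 255):
    """Map 8-bit pixel values to floats in [0, 1]."""
    return image.astype(np.float32) * factor

def horizontal_flip(image):
    """Mirror an image left to right (axis 1 is the width axis)."""
    return image[:, ::-1]

# A tiny "L"-shaped glyph: bright pixels on the left column and bottom row
glyph = np.array([[255, 0, 0],
                  [255, 0, 0],
                  [255, 255, 255]], dtype=np.uint8)

scaled = rescale(glyph)
flipped = horizontal_flip(glyph)

print(scaled.min(), scaled.max())
print(flipped)
```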

## Step 3 - Load the VGG model

In our classifier we use the pre-trained *VGG16* network, loaded here with weights trained on ImageNet. As an alternative, we could use, e.g., *ResNet*.

In [None]:
vgg_conv = VGG16(weights = 'imagenet', include_top = False, input_shape = (width, height, 3))

# Freeze the layers except the last 4 layers
for layer in vgg_conv.layers[:-4]:
    layer.trainable = False
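
The slice `[:-4]` can be illustrated without loading the network. The sketch below writes out by hand the layer names of the VGG16 convolutional base (one input layer, then five blocks of convolutions followed by a pooling layer) and shows which ones the slice leaves trainable. The list is an assumption based on the usual Keras naming; the input layer name in particular may differ, so it is worth checking against `vgg_conv.summary()`.

```python
# Layer names of the VGG16 convolutional base (include_top=False), listed by hand
vgg_layer_names = ["input"]
for block, n_conv in enumerate([2, 2, 3, 3, 3], start=1):
    vgg_layer_names += ["block%d_conv%d" % (block, k) for k in range(1, n_conv + 1)]
    vgg_layer_names.append("block%d_pool" % block)

frozen = vgg_layer_names[:-4]     # these layers get trainable = False
trainable = vgg_layer_names[-4:]  # these layers keep being updated

print(len(frozen), "frozen layers")
print("trainable:", trainable)
```

Under this assumption, only the last convolutional block is fine-tuned, while the earlier blocks keep their pre-trained weights.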

## Step 4 - Build the model

We fine-tune the VGG model by adding four layers on top of it: a *layers.Flatten()* layer, a fully connected *layers.Dense()* layer, a *layers.Dropout()* layer to reduce overfitting, and a final *layers.Dense()* layer with *softmax* activation that outputs the class probabilities.

In [None]:
# Create the model and add the VGG convolutional base model
model = models.Sequential()
model.add(vgg_conv)
 
# Add new layers
model.add(layers.Flatten())
model.add(layers.Dense(1024, activation = 'relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(nb_classes, activation = 'softmax'))
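
As a rough check on the size of the new layers, the arithmetic below estimates their parameter counts. It assumes a 224x224 input, 512 output channels and five 2x2 poolings in the VGG16 base, and 62 classes; these numbers are assumptions for illustration, and `model.summary()` remains the authoritative source.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

side = 224 // 2 ** 5               # spatial size after five 2x2 poolings
flatten_size = side * side * 512   # length of the flattened feature vector

hidden_params = dense_params(flatten_size, 1024)
output_params = dense_params(1024, 62)

print(flatten_size, hidden_params, output_params)
```

Almost all of the added parameters sit in the first dense layer, which is one reason the dropout layer is placed right after it.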

## Step 5 - Compile the model

We compile the model with the optimizer *RMSprop*. We tested *Adam* with default parameters, but it provided worse results.

In [None]:
model.compile(loss = 'categorical_crossentropy',
              optimizer = optimizers.RMSprop(lr = 1e-4),
              metrics = ['acc'])
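
For reference, the categorical cross-entropy of one sample is $-\sum_c y_c \log p_c$, where $y$ is the one-hot label and $p$ the softmax output. The NumPy sketch below is a simplified illustration of that formula, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

y_true = np.array([[0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],    # confident and correct
                   [0.6, 0.3, 0.1]])   # wrong class favoured

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

The loss only depends on the probability assigned to the true class, so the second sample is penalized more heavily than the first.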

## Step 6 - Train the model

We train the model for five epochs and validate after each epoch to monitor overfitting. We store the resulting weights in a file. Alternatively, previously computed weights can be restored with *model.load_weights('model/vgg16.h5')*.

In [None]:
history = model.fit_generator(training,
                              steps_per_epoch = len(training),
                              epochs = nb_epoch,
                              validation_data = validation,
                              validation_steps = len(validation),
                              verbose = 2)

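# Make sure the output folder exists before writing the weights
os.makedirs('model', exist_ok = True)
model.save_weights('model/vgg16.h5')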
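
The values `len(training)` and `len(validation)` are the number of batches needed to see every image once, i.e. the sample count divided by the batch size, rounded up. A small sketch of that arithmetic follows; the sample counts are hypothetical and only serve as an example.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to cover all samples once (last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

# Hypothetical dataset sizes, with the batch sizes used in this notebook
train_steps = steps_per_epoch(105, 10)
valid_steps = steps_per_epoch(104, 4)

print(train_steps, valid_steps)
```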

## Step 7 - Plot the training graphs

We plot the accuracy and the loss per epoch for the training and the validation sets.

In [None]:
plt.figure(1, figsize = (10, 8))
   
# Plot accuracy
plt.subplot(211)  
plt.plot(history.history['acc'])  
plt.plot(history.history['val_acc'])  
plt.title('model accuracy')  
plt.ylabel('accuracy')  
plt.legend(['train', 'validation'], loc='upper right')  

# Plot loss  
plt.subplot(212)  
plt.plot(history.history['loss'])  
plt.plot(history.history['val_loss'])  
plt.title('model loss')  
plt.ylabel('loss')  
plt.xlabel('epoch')  
plt.legend(['train', 'validation'], loc='upper right')  
plt.show()  
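
Besides reading the curves, the same history dictionary can be inspected numerically. The sketch below finds the epoch with the lowest validation loss and the gap between validation and training loss. The helper `best_epoch` and the numbers in `example_history` are made up for illustration; they are not results from this notebook.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return the 0-based index of the epoch with the smallest value for `key`."""
    values = history_dict[key]
    return min(range(len(values)), key=values.__getitem__)

# Made-up values shaped like history.history
example_history = {
    "loss":     [2.1, 1.2, 0.8, 0.5, 0.3],
    "val_loss": [1.9, 1.3, 1.0, 1.1, 1.3],
}

epoch = best_epoch(example_history)
gap = [v - t for t, v in zip(example_history["loss"], example_history["val_loss"])]

print("best epoch:", epoch)
print("validation - training loss:", gap)
```

A gap that keeps growing while the validation loss stops improving is the usual sign of overfitting.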

## Step 8 - Make predictions on the test set

Using the trained model, we compute the probability that each image belongs to each class (*prediction_prob*). The variable *prediction* is the one-hot encoding of the most probable class for each image. We also extract the images (in *test_images*) and the labels (in *test_label*) from the test set.

In [None]:
test = datagen.flow_from_directory(test_path,
                                   target_size = (width, height),
                                   batch_size = test_batch,
                                   shuffle = False,
                                   class_mode = 'categorical')

prediction_prob = model.predict_generator(test, steps = len(test))

# Choose the class with the highest probability
best_prediction = np.argmax(prediction_prob, axis = 1)
prediction = np.zeros((len(prediction_prob), nb_classes))

for i in range(len(best_prediction)):
    prediction[i, best_prediction[i]] = 1   

# Extract images and labels from the test set
test_images = []
test_label = np.zeros((len(prediction), nb_class))
test.reset()

i = 0
while i < len(test):
    x, y = next(test)

    for j in range(len(x)):
        test_images.append(x[j])
        # Row offset uses test_batch so it stays correct if the batch size changes
        test_label[i * test_batch + j, :] = y[j]
    i += 1
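
The conversion from probabilities to a one-hot matrix can also be written without a Python loop. The following is a minimal NumPy sketch of that alternative; `one_hot_argmax` is a hypothetical helper name and the probabilities are toy values.

```python
import numpy as np

def one_hot_argmax(prob):
    """Return a 0/1 matrix with a single 1 per row, at the most probable class."""
    best = np.argmax(prob, axis=1)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(prob.shape[1])[best]

toy_prob = np.array([[0.1, 0.7, 0.2],
                     [0.6, 0.3, 0.1]])

toy_prediction = one_hot_argmax(toy_prob)
print(toy_prediction)
```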

## Step 9 - Print accuracy

We print the accuracy of the model, i.e., the number of correct predictions divided by the total number of predictions, expressed as a percentage. The prediction used is the one-hot variable *prediction* described in Step 8. A possible improvement of this metric would also take into account the predicted probabilities.

In [None]:
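# A row of (prediction - test_label) is all zeros when the prediction is correct
# and contains a 1 otherwise, so its maximum flags the misclassified images
error = np.max(prediction - test_label, axis=1)
n_errors = int(np.sum(error))
accuracy = np.around((1 - n_errors / len(prediction)) * 100, 2)
print('Accuracy:', accuracy, '% -', n_errors, 'errors out of', len(prediction), 'images.')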
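
One way to take the probabilities into account, sketched here under our own assumptions rather than as the notebook's method, is to report the mean probability assigned to the true class and a top-k accuracy. The helper names and the toy arrays are illustrative.

```python
import numpy as np

def true_class_probability(prob, labels_one_hot):
    """Probability the model assigned to the correct class of each image."""
    return np.sum(prob * labels_one_hot, axis=1)

def top_k_accuracy(prob, labels_one_hot, k=2):
    """Fraction of images whose true class is among the k most probable classes."""
    true_idx = np.argmax(labels_one_hot, axis=1)
    top_k = np.argsort(prob, axis=1)[:, -k:]
    return float(np.mean([t in row for t, row in zip(true_idx, top_k)]))

toy_prob = np.array([[0.6, 0.3, 0.1],
                     [0.2, 0.5, 0.3],
                     [0.1, 0.4, 0.5]])
toy_label = np.eye(3)[[0, 2, 2]]   # true classes: 0, 2, 2

mean_true_prob = float(np.mean(true_class_probability(toy_prob, toy_label)))
top1 = top_k_accuracy(toy_prob, toy_label, k=1)
top2 = top_k_accuracy(toy_prob, toy_label, k=2)

print(mean_true_prob, top1, top2)
```

For visually similar characters, a top-2 figure can show whether the correct class is at least the runner-up.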

## Step 10 - Show errors

This cell makes it possible to visually inspect the images on which the classifier makes errors. This helps to analyze the results of the classifier and to improve the model accordingly.<br>
**RECOMMENDED USE:** limit the number of images shown when the number of errors is large.

In [None]:
error_index = np.where(error == 1)[0]
print(error_index)
for i in range(len(error_index)):
    a = np.argmax(prediction_prob[error_index[i], :])   # index of the predicted class
    b = np.max(prediction_prob[error_index[i], :])      # probability of that class
    print('Predict', a, 'with probability ', np.around(b * 100, 2), '%')
    plt.imshow(test_images[error_index[i]])             # show the misclassified image
    plt.show()
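
The recommendation above can be applied by slicing the error indices before looping. The sketch below also maps the predicted index to a character. It assumes that class index `i` corresponds to `class_label[i]`, which only holds if the class folders sort in that same order; `describe_errors`, `limit` and the toy probabilities are illustrative names and values.

```python
import string
import numpy as np

class_label = list(string.digits + string.ascii_uppercase + string.ascii_lowercase)

def describe_errors(error_index, prob, labels, limit=5):
    """Build one text line per misclassified image, for at most `limit` images."""
    lines = []
    for idx in error_index[:limit]:
        best = int(np.argmax(prob[idx]))
        lines.append("image %d: predicted '%s' (%.2f%%)"
                     % (idx, labels[best], prob[idx, best] * 100))
    return lines

# Toy probabilities for three images and 62 classes
toy_prob = np.zeros((3, len(class_label)))
toy_prob[0, 10] = 0.9    # index 10 is 'A' under the assumed ordering
toy_prob[2, 36] = 0.55   # index 36 is 'a' under the assumed ordering

lines = describe_errors(np.array([0, 2]), toy_prob, class_label, limit=1)
print(lines)
```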

## Bonus step - Build the model

We show here an alternative model built from scratch, which could be used instead of Steps 3 and 4. However, it yields poor accuracy unless it is trained on a large dataset.

In [None]:
# Use the models and layers modules imported in Step 0
model = models.Sequential()

# input_shape is only needed on the first layer
model.add(layers.Conv2D(128, (3, 3), input_shape = (width, height, 3)))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D(pool_size = (2, 2)))

model.add(layers.Conv2D(64, (3, 3)))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D(pool_size = (2, 2)))

model.add(layers.Conv2D(32, (3, 3)))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D(pool_size = (2, 2)))

# Classifier head: flatten, one hidden layer with dropout, softmax output
model.add(layers.Flatten())
model.add(layers.Dense(64, activation = 'relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(nb_classes, activation = 'softmax'))
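
To see how the feature maps shrink through this network, the arithmetic below tracks the spatial size after each convolution and pooling step. It assumes a 224x224 input, 3x3 convolutions without padding and 2x2 pooling, which matches the layer arguments shown above as far as we can tell; `model.summary()` gives the actual shapes.

```python
def conv_out(size, kernel=3):
    """Output size of a convolution without padding and with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of a non-overlapping pooling layer."""
    return size // pool

size = 224
sizes = []
for filters in (128, 64, 32):
    size = pool_out(conv_out(size))
    sizes.append((size, filters))

flatten_size = size * size * 32   # 32 channels after the last block

print(sizes, flatten_size)
```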