# Problem
This notebook asks whether deskewing images before applying the HOG descriptor improves the performance of Neural Network, SVM and RandomForest models.


The main task is to classify grayscale images of handwritten digits (28 pixels by 28 pixels), into their 10 
categories (0 to 9). The dataset we will use is the MNIST dataset, a classic dataset in the machine learning community, which has been 
around for almost as long as the field itself and has been very intensively studied. It's a set of 60,000 training images, plus 10,000 test 
images, assembled by the National Institute of Standards and Technology (the NIST in MNIST) in the 1980s. You can think of "solving" MNIST 
as the "Hello World" of deep learning -- it's what you do to verify that your algorithms are working as expected. As you become a machine 
learning practitioner, you will see MNIST come up over and over again, in scientific papers, blog posts, and so on.

In [1]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

In [2]:
## [deskew]
SZ=28
affine_flags = cv2.WARP_INVERSE_MAP|cv2.INTER_LINEAR

def deskew(img):
    m = cv2.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11']/m['mu02']
    M = np.float32([[1, skew, -0.5*SZ*skew], [0, 1, 0]])
    img = cv2.warpAffine(img,M,(SZ, SZ),flags=affine_flags)
    return img
## [deskew]
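To make the idea behind `deskew` concrete, the sketch below recomputes the same ratio `mu11 / mu02` with plain NumPy on a synthetic slanted stroke, then straightens the stroke by shifting each row. The helper `skew_from_moments`, the synthetic image, and the row-wise `np.roll` (a nearest-pixel stand-in for `cv2.warpAffine`) are illustrative assumptions, not part of the notebook's pipeline.

```python
import numpy as np

SZ = 28

def skew_from_moments(img):
    """Return mu11 / mu02, the slant estimate used by deskew()."""
    img = img.astype(np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    total = img.sum()
    x_bar = (xs * img).sum() / total
    y_bar = (ys * img).sum() / total
    mu11 = ((xs - x_bar) * (ys - y_bar) * img).sum()
    mu02 = ((ys - y_bar) ** 2 * img).sum()
    return mu11 / mu02

# Synthetic slanted stroke: the column index grows by one pixel every two rows.
stroke = np.zeros((SZ, SZ), dtype=np.uint8)
for y in range(4, 24):
    stroke[y, 8 + y // 2] = 255

skew = skew_from_moments(stroke)

# Nearest-pixel stand-in for the shear: shift each row by skew * (y - SZ / 2).
# np.roll wraps around, which is harmless here because the stroke stays in frame.
straightened = np.zeros_like(stroke)
for y in range(SZ):
    shift = int(np.rint(skew * (y - 0.5 * SZ)))
    straightened[y] = np.roll(stroke[y], -shift)

print("skew before:", skew)
print("skew after: ", skew_from_moments(straightened))
```

The slant estimate of the synthetic stroke is close to 0.5 (one column per two rows), and it drops close to zero after the row shifts, which is the effect the affine warp in `deskew` aims for.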


In [3]:
def showOpencvImage(image, isGray=False):
    fig = plt.figure(figsize=(6, 6))
    plt.imshow(image, cmap = 'gray')
    plt.show()

In [4]:
def openCVHOG(im):
    winSize = (20,20)
    blockSize = (10,10)
    blockStride = (5,5)
    cellSize = (10,10)
    nbins = 9
    derivAperture = 1
    winSigma = -1.
    histogramNormType = 0
    L2HysThreshold = 0.2
    gammaCorrection = 1
    nlevels = 64
    signedGradients = True

    hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,histogramNormType,L2HysThreshold,gammaCorrection,nlevels, signedGradients)
    descriptor = np.ravel(hog.compute(im))
    
    return descriptor
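The descriptors are later reshaped to 81 columns. As a rough check of where that number comes from, the sketch below counts the values HOG produces for one window from the parameters above. The helper `hog_window_length` is hypothetical and assumes the usual HOG layout (blocks slide over the window, each block holds whole cells, each cell contributes `nbins` values).

```python
def hog_window_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of HOG values produced for one detection window."""
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins

# Parameters copied from openCVHOG: 3 x 3 block positions, 1 cell per block, 9 bins.
length = hog_window_length((20, 20), (10, 10), (5, 5), (10, 10), 9)
print(length)  # 81
```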

# Data preprocessing

Raw data

In [5]:
from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Using TensorFlow backend.


Vectorised raw data

In [6]:
train_raw = train_images.reshape(len(train_images), 28 * 28)
test_raw = test_images.reshape(len(test_images), 28 * 28)

Deskewed data

In [7]:
# Deskew the 28x28 images (not the flattened vectors), then flatten each result.
train_deskewed = np.float32([deskew(im) for im in train_images]).reshape(-1, 28 * 28)
test_deskewed = np.float32([deskew(im) for im in test_images]).reshape(-1, 28 * 28)

HOG descriptor data

In [8]:
hogdata_train = np.float32([openCVHOG(im) for im in train_images]).reshape(-1,81)
hogdata_test = np.float32([openCVHOG(im) for im in test_images]).reshape(-1,81)

hogdata_train_deskewed = np.float32([openCVHOG(deskew(im)) for im in train_images]).reshape(-1,81)
hogdata_test_deskewed = np.float32([openCVHOG(deskew(im)) for im in test_images]).reshape(-1,81)

Data subset for grid search with cross-validation

In [28]:
from collections import Counter
hogdata_train_deskewed_short = hogdata_train_deskewed[:600]
train_labels_short = train_labels[:600]
print(Counter(train_labels_short))
hogdata_train_short = hogdata_train[:600]

Counter({1: 79, 9: 65, 2: 64, 7: 62, 4: 59, 3: 59, 0: 58, 6: 54, 5: 51, 8: 49})


# Model SVM

Training models on HOG descriptors computed with and without deskewing the images

In [9]:
from sklearn import svm

model_deskewed = svm.SVC(C=15.5,gamma=0.7)
model_non_deskewed = svm.SVC(C=15.5,gamma=0.7)

model_deskewed.fit(hogdata_train_deskewed, train_labels)
model_non_deskewed.fit(hogdata_train, train_labels)

pred_labels_deskewed = model_deskewed.predict(hogdata_test_deskewed)
pred_labels_non_deskewed = model_non_deskewed.predict(hogdata_test)

In [10]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print("Accuracy for deskewed images: {}\nAccuracy for non deskewed images: {}".\
      format(accuracy_score(test_labels, pred_labels_deskewed), accuracy_score(test_labels, pred_labels_non_deskewed)))

Accuracy for deskewed images: 0.9835
Accuracy for non deskewed images: 0.9756


In [11]:
cm_deskewed = confusion_matrix(test_labels, pred_labels_deskewed)
cm_non_deskewed = confusion_matrix(test_labels, pred_labels_non_deskewed)


print("Confusion matrix for deskewed images:\n{}\n\nConfusion matrix for non deskewed images:\n{}".\
      format(cm_deskewed, cm_non_deskewed))

Confusion matrix for deskewed images:
[[ 973    1    1    0    0    1    1    1    1    1]
 [   0 1127    1    0    0    0    2    4    1    0]
 [   1    1 1017    2    0    0    0   11    0    0]
 [   0    1    5  993    0    4    0    6    1    0]
 [   1    0    2    0  967    1    6    0    1    4]
 [   1    0    0    4    0  878    3    2    3    1]
 [   3    3    1    0    7    2  941    0    1    0]
 [   1    3    9    5    0    0    0 1007    2    1]
 [   0    0    1    6    6    1    0    0  949   11]
 [   2    0    1    1    6    5    1    4    6  983]]

Confusion matrix for non deskewed images:
[[ 962    2    3    0    2    2    5    1    0    3]
 [   0 1127    1    1    0    0    2    4    0    0]
 [   2    1 1003    6    0    0    0   18    2    0]
 [   0    1    7  987    1    4    0    6    4    0]
 [   2    1    2    0  960    0    3    1    4    9]
 [   1    3    1    8    0  867    4    2    4    2]
 [   8    2    3    0    3    2  939    0    1    0]
 [   0    3   19 


*   **PRECISION** = TP / (TP + FP)
*   **RECALL** = TP / (TP + FN)
*   **F1 score** = 2 * PRECISION * RECALL / (PRECISION + RECALL)
*   **ACCURACY** = SUM OF DIAGONAL ELEMENTS / SUM OF ALL ELEMENTS
*   **MACRO AVG OF PRECISION** = SUM OF PRECISIONS / NUMBER OF CLASSES
*   **WEIGHTED AVG OF PRECISION** = SUM OVER CLASSES OF PRECISION(CLASS) * WEIGHT(CLASS), where **WEIGHT(CLASS)** = CLASS SUPPORT / ALL ELEMENTS
*   **MICRO AVG OF PRECISION** = SUM(TP(CLASS)) / SUM(TP(CLASS) + FP(CLASS))
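
The definitions above can be checked against a confusion matrix directly. The following sketch applies them with NumPy to the SVM confusion matrix for deskewed images printed earlier (rows are true classes, columns are predicted classes). The variable names are illustrative.

```python
import numpy as np

# Confusion matrix for the SVM on deskewed images, copied from the output above.
cm = np.array([
    [973, 1, 1, 0, 0, 1, 1, 1, 1, 1],
    [0, 1127, 1, 0, 0, 0, 2, 4, 1, 0],
    [1, 1, 1017, 2, 0, 0, 0, 11, 0, 0],
    [0, 1, 5, 993, 0, 4, 0, 6, 1, 0],
    [1, 0, 2, 0, 967, 1, 6, 0, 1, 4],
    [1, 0, 0, 4, 0, 878, 3, 2, 3, 1],
    [3, 3, 1, 0, 7, 2, 941, 0, 1, 0],
    [1, 3, 9, 5, 0, 0, 0, 1007, 2, 1],
    [0, 0, 1, 6, 6, 1, 0, 0, 949, 11],
    [2, 0, 1, 1, 6, 5, 1, 4, 6, 983],
])

tp = np.diag(cm).astype(float)  # correctly classified samples per class
fp = cm.sum(axis=0) - tp        # predicted as the class but belonging to another
fn = cm.sum(axis=1) - tp        # belonging to the class but predicted as another
support = cm.sum(axis=1)        # number of true samples per class

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()
macro_precision = precision.mean()
weighted_precision = np.sum(precision * support / support.sum())
micro_precision = tp.sum() / (tp.sum() + fp.sum())

print("accuracy:", accuracy)
print("macro / weighted / micro precision:",
      macro_precision, weighted_precision, micro_precision)
```

For single-label multiclass problems like this one, the micro-averaged precision equals the accuracy, because every misclassified sample is counted exactly once as a false positive.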
   



# Model RandomForest

Training models on HOG descriptors computed with and without deskewing the images

In [12]:
from sklearn.ensemble import RandomForestClassifier

model_deskewed = RandomForestClassifier(max_depth=15, n_estimators=100, max_features=60)
model_non_deskewed = RandomForestClassifier(max_depth=15, n_estimators=100, max_features=60)

model_deskewed.fit(hogdata_train_deskewed, train_labels)
model_non_deskewed.fit(hogdata_train, train_labels)

pred_labels_deskewed = model_deskewed.predict(hogdata_test_deskewed)
pred_labels_non_deskewed = model_non_deskewed.predict(hogdata_test)

In [13]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print("Accuracy for deskewed images: {}\nAccuracy for non deskewed images: {}".\
      format(accuracy_score(test_labels, pred_labels_deskewed), accuracy_score(test_labels, pred_labels_non_deskewed)))

Accuracy for deskewed images: 0.9637
Accuracy for non deskewed images: 0.9482


In [14]:
cm_deskewed = confusion_matrix(test_labels, pred_labels_deskewed)
cm_non_deskewed = confusion_matrix(test_labels, pred_labels_non_deskewed)


print("Confusion matrix for deskewed images:\n{}\n\nConfusion matrix for non deskewed images:\n{}".\
      format(cm_deskewed, cm_non_deskewed))

Confusion matrix for deskewed images:
[[ 958    0    4    0    2    5    2    2    4    3]
 [   2 1123    2    2    0    0    3    2    1    0]
 [   0    1  991    4    2    3    2   15   10    4]
 [   1    2   13  979    0    3    0    7    4    1]
 [   2    0    3    0  949    1   11    1    4   11]
 [   4    0    1    4    1  868    1    2    9    2]
 [  10    1    0    0   10    6  925    0    5    1]
 [   3    1   27    9    0    1    0  977    2    8]
 [   2    0    4    7    9    9    3    3  909   28]
 [   5    0    3    0    3   11    6    3   20  958]]

Confusion matrix for non deskewed images:
[[ 946    1    2    0    6    8    7    4    1    5]
 [   1 1121    5    0    2    1    1    3    1    0]
 [   5    2  977    8    2    4    3   21    7    3]
 [   1    0   17  963    0    5    0   15    7    2]
 [   4    1    0    0  941    1   16    1    3   15]
 [   4    3    1    8    0  848    2    3   16    7]
 [  12    5    3    0    9    9  911    0    6    3]
 [   2    3   37 

# Model Neural Network

In [15]:
from keras import models
from keras import layers

#model for deskewed data
network_deskewed = models.Sequential()
network_deskewed.add(layers.Dense(512, activation='relu', input_shape=(81,)))
network_deskewed.add(layers.Dense(10, activation='softmax'))
#model for non deskewed data
network_non_deskewed = models.Sequential()
network_non_deskewed.add(layers.Dense(512, activation='relu', input_shape=(81,)))
network_non_deskewed.add(layers.Dense(10, activation='softmax'))



In [16]:
network_deskewed.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
network_non_deskewed.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

Data vectorisation (HOG on deskewed images and HOG on non deskewed images)

In [17]:
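# Note: HOG descriptors are already normalised floats, so dividing by 255 is a
# carry-over from raw-pixel preprocessing; it is kept because the results below use it.
nn_train_deskewed = np.array(hogdata_train_deskewed).reshape((60000, 81))
nn_train_deskewed = nn_train_deskewed.astype('float32') / 255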

nn_test_deskewed = np.array(hogdata_test_deskewed).reshape((10000, 81))
nn_test_deskewed = nn_test_deskewed.astype('float32') / 255

nn_train_non_deskewed = np.array(hogdata_train).reshape((60000, 81))
nn_train_non_deskewed = nn_train_non_deskewed.astype('float32') / 255

nn_test_non_deskewed = np.array(hogdata_test).reshape((10000, 81))
nn_test_non_deskewed = nn_test_non_deskewed.astype('float32') / 255

In [18]:
from keras.utils import to_categorical

encoded_train_labels = to_categorical(train_labels)
encoded_test_labels = to_categorical(test_labels)

encoded_test_labels

array([[0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 1., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
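
As a small illustration of what `to_categorical` produces and how `np.argmax` later maps network outputs back to digit labels, here is a NumPy-only sketch. The helper `one_hot` is a hypothetical stand-in, not the Keras implementation. The labels 7, 2 and 1 match the positions of the ones in the first three rows printed above.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Encode integer labels as rows with a single 1 at the label's index."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

labels = np.array([7, 2, 1])
encoded = one_hot(labels)

# argmax along the last axis recovers the integer labels, which is the same
# step applied to the predicted probabilities after training.
decoded = np.argmax(encoded, axis=-1)
print(encoded)
print(decoded)
```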

Training our models

In [19]:
network_deskewed.fit(nn_train_deskewed, encoded_train_labels, epochs=12, batch_size=128)

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.callbacks.History at 0x214957b0e80>

In [20]:
network_non_deskewed.fit(nn_train_non_deskewed, encoded_train_labels, epochs=12, batch_size=128)

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.callbacks.History at 0x21495736c50>

In [21]:
pred_probabilities_deskewed = network_deskewed.predict(nn_test_deskewed)
pred_probabilities_non_deskewed = network_non_deskewed.predict(nn_test_non_deskewed)

pred_labels_deskewed = np.argmax(pred_probabilities_deskewed,-1)
pred_labels_non_deskewed = np.argmax(pred_probabilities_non_deskewed,-1)

In [22]:
print("Accuracy score for deskewed data: {}\nAccuracy score for non deskewed data: {}".\
     format(accuracy_score(test_labels, pred_labels_deskewed), accuracy_score(test_labels, pred_labels_non_deskewed)))

Accuracy score for deskewed data: 0.9286
Accuracy score for non deskewed data: 0.9062


In [23]:
cm_deskewed = confusion_matrix(test_labels, pred_labels_deskewed)
cm_non_deskewed = confusion_matrix(test_labels, pred_labels_non_deskewed)
print("Confusion matrix for deskewed data: \n{}\n\nConfusion matrix for non deskewed data: \n{}".\
      format(cm_deskewed, cm_non_deskewed))

Confusion matrix for deskewed data: 
[[ 926    1    6    0    1    1    5    2    0   38]
 [   0 1122    2    0    1    0    2    5    2    1]
 [   0    4  951   19    3    0    0   43    9    3]
 [   0    1   24  935    0    5    0   27   12    6]
 [   1    6    4    0  936    1   13    1    1   19]
 [   2    0    1   11    1  822    3    1   27   24]
 [   9    3    1    0   15    4  920    0    4    2]
 [   5    2   33   35    2    0    0  931    4   16]
 [   1    0    7   15   13    6    4    9  827   92]
 [   9    1    3    2    3    8    5   28   34  916]]

Confusion matrix for non deskewed data: 
[[ 919    2    6    0    4    6    9    2    1   31]
 [   0 1117    2    0    4    0    3    5    3    1]
 [   3    6  931   15    3    1    0   62    9    2]
 [   0    0   20  935    0   13    0   24   10    8]
 [   7    3    1    0  927    1   21    0    2   20]
 [   1    2    1   19    2  786    2    2   50   27]
 [  21    2    1    0   17    8  902    0    6    1]
 [   5    2   53   

In [24]:
cr_deskewed = classification_report(test_labels, pred_labels_deskewed)
cr_non_deskewed = classification_report(test_labels, pred_labels_non_deskewed)

print("Classification report for deskewed data: \n{}\n\nClassification report for non deskewed data: \n{}".\
     format(cr_deskewed, cr_non_deskewed))


Classification report for deskewed data: 
             precision    recall  f1-score   support

          0       0.97      0.94      0.96       980
          1       0.98      0.99      0.99      1135
          2       0.92      0.92      0.92      1032
          3       0.92      0.93      0.92      1010
          4       0.96      0.95      0.96       982
          5       0.97      0.92      0.95       892
          6       0.97      0.96      0.96       958
          7       0.89      0.91      0.90      1028
          8       0.90      0.85      0.87       974
          9       0.82      0.91      0.86      1009

avg / total       0.93      0.93      0.93     10000


Classification report for non deskewed data: 
             precision    recall  f1-score   support

          0       0.93      0.94      0.93       980
          1       0.98      0.98      0.98      1135
          2       0.90      0.90      0.90      1032
          3       0.90      0.93      0.91      1010
       

# Conclusion
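Deskewing images before applying the HOG descriptor improved test accuracy for every model tested: SVM from 0.9756 to 0.9835, RandomForest from 0.9482 to 0.9637, and the Neural Network from 0.9062 to 0.9286.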
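
To put the three comparisons on a common scale, the sketch below takes the test accuracies printed in this notebook and derives the absolute gain and the relative reduction in error rate for each model. The dictionary layout and variable names are illustrative. The figures come from single training runs, so small differences should be read with some caution.

```python
# Test accuracies reported in this notebook: (non-deskewed, deskewed).
accuracies = {
    "SVM": (0.9756, 0.9835),
    "RandomForest": (0.9482, 0.9637),
    "Neural Network": (0.9062, 0.9286),
}

summary = {}
for name, (plain, deskewed) in accuracies.items():
    gain = deskewed - plain                # absolute accuracy gain
    error_reduction = gain / (1.0 - plain) # share of errors removed by deskewing
    summary[name] = (gain, error_reduction)
    print("{}: +{:.2f} points, {:.1f}% fewer errors".format(
        name, 100 * gain, 100 * error_reduction))
```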