# A quick comparison between a Linear Classifier and a Deep Neural Network

## MNIST classification

This notebook briefly illustrates the accuracy gain a DNN can bring to an image classifier, using the standard MNIST dataset as shipped with Keras.

In [7]:
import tensorflow as tf
mnist = tf.keras.datasets.mnist

First, let us load the data and rescale the pixel values to the range [0, 1].

In [8]:
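# Load the training and test splits (images and integer labels)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Rescale pixel values from [0, 255] to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0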
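
As a minimal sketch of what the rescaling step does, the snippet below applies the same division to a small synthetic array instead of the real dataset. The array `fake_images` and its random contents are assumptions made for illustration; only the `uint8` type and the 28x28 image shape mirror MNIST.

```python
import numpy as np

# Synthetic stand-in for a few MNIST images: uint8 pixels in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same operation as in the notebook cell: divide by the maximum pixel value.
scaled = fake_images / 255.0

# The result is a floating-point array whose values lie in [0, 1].
print(fake_images.dtype, "->", scaled.dtype)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```

Dividing by a float promotes the integer pixels to floating point, which keeps the inputs on a small, uniform scale for the optimizer.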

Next, we build a linear model. A linear model is nothing more than a neural network without any hidden layers, so we can build it with the Keras Sequential API.
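
To make the "no hidden layers" point concrete, here is a rough NumPy sketch of the computation that a flatten step followed by a single 10-unit softmax layer performs: `softmax(x W + b)`. The random inputs, the weight initialisation and the helper `softmax` are assumptions for illustration, not the weights or internals Keras uses.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.random((5, 28, 28))                 # hypothetical batch of 5 "images"
W = rng.normal(scale=0.01, size=(784, 10))  # one weight per (pixel, class) pair
b = np.zeros(10)                            # one bias per class

x_flat = x.reshape(len(x), -1)   # flatten: (5, 28, 28) -> (5, 784)
probs = softmax(x_flat @ W + b)  # single dense layer with softmax output

n_params = W.size + b.size
print(probs.shape, n_params)
```

With 784 inputs and 10 classes, such a model has 784 * 10 + 10 = 7850 trainable parameters.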

We hold out 20% of the training data for validation and train for 5 epochs.
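
The arithmetic behind the 20% hold-out can be sketched as follows. According to the Keras documentation, `validation_split` takes the validation samples from the end of the arrays, before any shuffling. The code below only illustrates that arithmetic on an index array and is not the Keras implementation.

```python
import numpy as np

n_samples = 60000        # size of the MNIST training set
validation_split = 0.2

# Number of samples held out for validation, and number left for training.
n_val = int(n_samples * validation_split)
n_train = n_samples - n_val

# The validation block is the tail of the data; the head is used for training.
indices = np.arange(n_samples)
train_idx, val_idx = indices[:n_train], indices[n_train:]

print(n_train, n_val)  # 48000 12000
```

These counts match the "Train on 48000 samples, validate on 12000 samples" line printed by `model.fit`.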

In [9]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

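# Train for 5 epochs, holding out 20% of the training data for validation
model.fit(x_train, y_train, epochs=5, validation_split=0.2)
# Report [loss, accuracy] on the test set
model.evaluate(x_test, y_test)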

Train on 48000 samples, validate on 12000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.2709509532779455, 0.9246]

This linear model reaches an accuracy of about 0.93 on the validation sample. On the test set, `model.evaluate` reports a loss of 0.271 and an accuracy of 0.9246.

Next, we move on to a deep neural network with two hidden layers of 512 units each. We use the ReLU activation function, which usually works well, and a Dropout rate of 20% after each hidden layer to limit overfitting.
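
The two ingredients named above can be sketched in a few lines of NumPy. This is an illustrative version of ReLU and of "inverted" dropout (zero a random fraction of activations during training and rescale the rest by `1 / (1 - rate)`), which is the behaviour the Keras `Dropout` layer documents. The function names and the random input are assumptions for this example.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def dropout(a, rate, rng, training=True):
    """Inverted dropout: active only during training, identity otherwise."""
    if not training:
        return a
    keep = rng.random(a.shape) >= rate   # keep each unit with probability 1 - rate
    return a * keep / (1.0 - rate)       # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 512))         # hypothetical pre-activations of a 512-unit layer
a = relu(z)
a_drop = dropout(a, 0.2, rng)

# Fraction of originally active units that were zeroed out (close to the rate).
dropped = np.mean(a_drop[a > 0] == 0)
print(round(float(dropped), 2))
```

Because the surviving activations are scaled up during training, nothing needs to change at inference time, where the layer simply passes its input through.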

We again hold out 20% of the training data for validation and train for 5 epochs.

In [10]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

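# Train for 5 epochs, holding out 20% of the training data for validation
model.fit(x_train, y_train, epochs=5, validation_split=0.2)
# Report [loss, accuracy] on the test set
model.evaluate(x_test, y_test)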

Train on 48000 samples, validate on 12000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.07859076420282073, 0.977]

This deep NN reaches an accuracy of 0.98 on the validation sample, similar to its accuracy on the training sample (0.98), which indicates very little overfitting. On the test set, `model.evaluate` reports a loss of 0.079 and an accuracy of 0.977.
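
The comparison between training and validation accuracy can be made explicit with a small helper. `model.fit` returns a `History` object whose `history` attribute is a dictionary of per-epoch metrics. The helper name `generalization_gap` is hypothetical, and the key names are an assumption: recent TensorFlow versions use `accuracy` and `val_accuracy`, while older releases use `acc` and `val_acc`. The numbers below are placeholders for illustration, not results from the run above.

```python
def generalization_gap(history, train_key="accuracy", val_key="val_accuracy"):
    """Return final-epoch training accuracy minus validation accuracy.

    `history` is a dict of per-epoch metric lists, such as the `history`
    attribute of the object returned by `model.fit`.
    """
    return history[train_key][-1] - history[val_key][-1]

# Placeholder values for illustration only; NOT results from this notebook.
example_history = {"accuracy": [0.90, 0.95], "val_accuracy": [0.91, 0.94]}
gap = generalization_gap(example_history)
print(round(gap, 2))
```

A gap close to zero suggests little overfitting, while a large positive gap means the model fits the training data noticeably better than the held-out data.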

## Conclusion

In this example, adding two hidden layers to the linear model raised test accuracy from about 0.92 to about 0.98 without noticeable overfitting. This illustrates why deep learning is attractive for image classification.