# Ungraded Lab: Improving Computer Vision Accuracy using Convolutions

## Shallow Neural Network

In the previous lessons, you saw how to do fashion recognition using a neural network containing three layers: the input layer (in the shape of the data), the output layer (in the shape of the desired output), and a single hidden layer. You experimented with how the hidden layer size, the number of training epochs, and other settings affect the final accuracy. For convenience, here's the entire code again. Run it and take note of the test accuracy printed at the end.

In [1]:
# Import all the packages needed
%matplotlib inline
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns 

plt.style.use('ggplot')

In [2]:
# Load the Fashion MNIST dataset
(training_images, training_labels), (test_images, test_labels) = datasets.fashion_mnist.load_data()

print(f'shape of training_images: {training_images.shape}')
print(f'shape of training_labels: {training_labels.shape}')
print(f'shape of test_images: {test_images.shape}')
print(f'shape of test_labels: {test_labels.shape}')


shape of training_images: (60000, 28, 28)
shape of training_labels: (60000,)
shape of test_images: (10000, 28, 28)
shape of test_labels: (10000,)


In [3]:
# Create the first model, without Conv2D layers
model_one = keras.Sequential([
    keras.layers.Flatten(input_shape = (training_images.shape[1], training_images.shape[2])),
    keras.layers.Dense(128, activation = tf.nn.relu),
    keras.layers.Dense(10, activation = tf.nn.softmax)
])
model_one.compile(
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)

In [4]:
# train model_one
model_one.fit(training_images, training_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1db88a08d60>

In [5]:
# Evaluate model
model_one.evaluate(test_images, test_labels)



[0.5891151428222656, 0.7928000092506409]
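
The cells above pass raw pixel values (0 to 255) straight to the network. A common preprocessing step, which is an assumption here and not part of the code above, is to scale the pixels to the range 0 to 1 before training. The sketch below uses a small synthetic array in place of the real dataset, and `scale_pixels` is a hypothetical helper name. Whether scaling changes the accuracy reported above is something to check by rerunning, not something this sketch shows.

```python
import numpy as np


def scale_pixels(images):
    """Convert uint8 pixel values in [0, 255] to float32 values in [0, 1]."""
    return images.astype(np.float32) / 255.0


# Synthetic stand-in with the same dtype and per-image shape as Fashion MNIST.
# The values cycle through 0..255 so both extremes are present.
sample_images = (np.arange(2 * 28 * 28) % 256).astype(np.uint8).reshape(2, 28, 28)

scaled_images = scale_pixels(sample_images)
print(scaled_images.dtype, scaled_images.min(), scaled_images.max())
```

In the notebook, the same helper could be applied to both `training_images` and `test_images` before calling `fit` and `evaluate`, so that training and evaluation see data on the same scale.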

## Convolutional Neural Network

In the model above, accuracy is typically about 89% on training and 87% on validation, although the run recorded here scored about 79% on the test set. Not bad. But how do you make that even better? One way is to use something called convolutions. We're not going into the details of convolutions in this notebook (please see the resources in the classroom), but the core idea is that they narrow down the content of the image to focus on specific parts, which will likely improve the model's accuracy.

If you've ever done image processing using a filter (like [this](https://en.wikipedia.org/wiki/Kernel_(image_processing))), then convolutions will look very familiar. In short, you take an array (usually 3x3 or 5x5) and scan it over the entire image. By changing the underlying pixels based on the formula within that matrix, you can do things like edge detection. So, for example, if you look at the above link, you'll see a 3x3 matrix that is defined for edge detection where the middle cell is 8, and all of its neighbors are -1. In this case, for each pixel, you would multiply its value by 8, then subtract the value of each neighbor. Do this for every pixel, and you'll end up with a new image that has the edges enhanced.
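
The edge-detection filter described above can be sketched in a few lines of NumPy. This is a minimal illustration, not how Keras implements `Conv2D`: the helper name `apply_kernel` and the two tiny test images are assumptions made for the example, and the filter is applied without padding, so the output is two pixels smaller in each dimension.

```python
import numpy as np


def apply_kernel(image, kernel):
    """Slide `kernel` over `image` (no padding) and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)
    return out


# Middle cell is 8, every neighbor is -1
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=np.float32)

# A uniform image has no edges
flat = np.full((5, 5), 10.0, dtype=np.float32)

# A vertical edge: columns 0-2 are dark, columns 3-4 are bright
step = np.zeros((5, 5), dtype=np.float32)
step[:, 3:] = 10.0

flat_response = apply_kernel(flat, edge_kernel)
step_response = apply_kernel(step, edge_kernel)
print(flat_response)
print(step_response)
```

The uniform image produces zeros everywhere, because the center weight of 8 cancels the eight neighbors. The step image produces non-zero values only in the columns next to the edge, which is the "edges enhanced" effect.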

This is useful for computer vision because it often highlights the features that distinguish one item from another. The model also needs less information, because it trains only on the highlighted features.


That's the concept of **Convolutional Neural Networks**. Add some layers to do convolution before you have the dense layers, and then the information going to the dense layers is more focused and possibly more accurate.

Run the code below. This is the same neural network as earlier, but this time with [Convolution](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D) and [MaxPooling](https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D) layers added first. It will take longer, but look at the impact on the accuracy.

In [6]:
# Create the model with Conv2D and MaxPooling2D layers
model_two = keras.Sequential([
    # First convolution block: 64 filters of size 3x3 on single-channel input
    keras.layers.Conv2D(
        64,
        (3, 3),
        activation = tf.nn.relu,
        input_shape = (training_images.shape[1], training_images.shape[2], 1)
    ),
    keras.layers.MaxPooling2D(2, 2),
    # Second convolution block
    keras.layers.Conv2D(
        64,
        (3, 3),
        activation = tf.nn.relu
    ),
    keras.layers.MaxPooling2D(2, 2),

    # Flatten the feature maps; input_shape is not needed on a non-first layer
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation = tf.nn.relu),
    keras.layers.Dense(10, activation = tf.nn.softmax)
])
model_two.compile(
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)
# Note: this prints the summary of model_one (the dense-only model);
# call model_two.summary() to inspect the convolutional model.
model_one.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 128)               100480    
                                                                 
 dense_1 (Dense)             (None, 10)                1290      
                                                                 
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
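
The summary printed above lists the layers of `model_one`, the dense-only network. For the convolutional stack, the output shapes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults used in the cell above (stride 1 and no padding for `Conv2D`, non-overlapping 2x2 windows for `MaxPooling2D`). The helper names are hypothetical, and the numbers are computed here, not copied from a `model_two.summary()` run.

```python
def conv2d_shape(h, w, kernel=3):
    """Height and width after a stride-1 convolution with no padding."""
    return h - kernel + 1, w - kernel + 1


def pool_shape(h, w, pool=2):
    """Height and width after non-overlapping pooling (any remainder is dropped)."""
    return h // pool, w // pool


def conv2d_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights plus one bias, for each filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, for each unit."""
    return (inputs + 1) * units


filters = 64
h, w = 28, 28
h, w = conv2d_shape(h, w)   # first Conv2D
h, w = pool_shape(h, w)     # first MaxPooling2D
h, w = conv2d_shape(h, w)   # second Conv2D
h, w = pool_shape(h, w)     # second MaxPooling2D
flat_features = h * w * filters

total_params = (
    conv2d_params(3, 1, filters)
    + conv2d_params(3, filters, filters)
    + dense_params(flat_features, 128)
    + dense_params(128, 10)
)
print((h, w), flat_features, total_params)

# Sanity check against the printed summary: the first Dense layer of model_one
# sees 784 inputs and reports 100,480 parameters.
print(dense_params(784, 128))
```

Under these assumptions the feature maps shrink from 28x28 to 5x5 before flattening, so the first dense layer receives 1,600 values per image compared with 784 in the dense-only model.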


In [7]:
# Train model_two
model_two.fit(training_images, training_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1db8c213c70>

In [None]:
# Evaluate model_two
model_two.evaluate(test_images, test_labels)
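
`model_two` declares its input shape with a trailing channel dimension of 1, while the loaded image arrays have shape `(N, 28, 28)`. The training cell above ran with the three-dimensional arrays as they are. Making the channel axis explicit is an optional step, sketched here on a synthetic array; `add_channel_axis` is a hypothetical helper name, not something defined in the lab.

```python
import numpy as np


def add_channel_axis(images):
    """Reshape grayscale images from (N, H, W) to (N, H, W, 1)."""
    return np.expand_dims(images, axis=-1)


# Synthetic stand-in for a small batch of 28x28 grayscale images
sample_batch = np.zeros((4, 28, 28), dtype=np.uint8)
sample_with_channel = add_channel_axis(sample_batch)
print(sample_with_channel.shape)
```

Applied to `training_images` and `test_images`, this would make the arrays match the `(28, 28, 1)` input shape declared on the first `Conv2D` layer without relying on any implicit reshaping.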