# CNN on MNIST-10 Dataset

### by Shailesh Patro

This assignment follows the official TensorFlow tutorial found here: https://www.tensorflow.org/tutorials/estimators/cnn


In [0]:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

In [2]:
# Import MNIST data
mnist = input_data.read_data_sets("MNIST-data/", one_hot=True)

Extracting MNIST-data/train-images-idx3-ubyte.gz
Extracting MNIST-data/train-labels-idx1-ubyte.gz
Extracting MNIST-data/t10k-images-idx3-ubyte.gz
Extracting MNIST-data/t10k-labels-idx1-ubyte.gz


Note to self: With one_hot=True each label is returned as a one-hot vector of length 10 instead of an integer class index. The digit 3, for example, becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
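
To make the encoding concrete, here is a minimal NumPy sketch of one-hot encoding. The helper name `one_hot` is my own and is not part of the tutorial or of `input_data`; it only illustrates what the labels look like after loading.

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """Return a [len(labels), n_classes] array with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, n_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([3, 0, 9])
print(example[0])  # row for the digit 3: its single 1 sits at index 3
```

Taking `argmax` along each row recovers the original digit, which is how the accuracy is computed later on.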

Defining the architecture of my network (`?` stands for the batch size):
* (Input) -> [?, 28, 28, 1]
* (Convolutional layer 1, 32 filters of size [5x5]) -> [?, 28, 28, 32]
* (ReLU 1) -> [?, 28, 28, 32]
* (Max pooling 1) -> [?, 14, 14, 32]
* (Convolutional layer 2, 64 filters of size [5x5]) -> [?, 14, 14, 64]
* (ReLU 2) -> [?, 14, 14, 64]
* (Max pooling 2) -> [?, 7, 7, 64]
* [Fully connected layer 3] -> [?, 1024]
* [ReLU 3] -> [?, 1024]
* [Dropout] -> [?, 1024]
* [Fully connected layer 4] -> [?, 10]
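
The shapes in the list above can be checked with a little arithmetic. This is a rough sketch, assuming 'SAME' padding everywhere, stride 1 for the convolutions and stride 2 for the 2x2 pooling (the settings used in the code further down). The function names are hypothetical helpers, not TensorFlow calls.

```python
import math

def same_output_size(size, stride=1):
    """Spatial output size of a 'SAME'-padded convolution or pooling op."""
    return math.ceil(size / stride)

def trace_shapes(height=28, width=28):
    """List the per-example shape after each layer of the network."""
    shapes = [("input", (height, width, 1))]
    h, w = height, width
    for name, channels in (("conv1", 32), ("conv2", 64)):
        # Convolution with stride 1 keeps height and width.
        h, w = same_output_size(h, 1), same_output_size(w, 1)
        shapes.append((name + " + relu", (h, w, channels)))
        # 2x2 max pooling with stride 2 halves height and width.
        h, w = same_output_size(h, 2), same_output_size(w, 2)
        shapes.append((name + " pool", (h, w, channels)))
    shapes.append(("flatten", (h * w * 64,)))
    shapes.append(("fc3", (1024,)))
    shapes.append(("fc4 logits", (10,)))
    return shapes

for layer, shape in trace_shapes():
    print(layer, shape)
```

The flattened size of 7 * 7 * 64 = 3136 is the first dimension of the weight matrix of fully connected layer 3.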

In [0]:
# Initialize parameters for the model
width = 28 # width of image in pixels
height = 28  # height of the image in pixels
n_inputs = width*height # number of pixels in one image/ number of inputs
n_classes = 10 # There are 10 possible classifications for this problem

In [0]:
# Functions to format variables (kernel weights and biases) and to add to graph collections
def format_weights(shape, name):
  var = tf.get_variable(name = name, dtype = tf.float32, shape = shape, initializer = tf.contrib.layers.xavier_initializer_conv2d())
  tf.add_to_collection('model_vars', var)
  tf.add_to_collection('l2', tf.reduce_sum(tf.square(var)))
  return var

def format_biases(shape, name):
  var = tf.get_variable(name = name, dtype = tf.float32, shape = shape, initializer = tf.constant_initializer(0.0))
  tf.add_to_collection('model_vars', var)
  tf.add_to_collection('l2', tf.reduce_sum(tf.square(var)))
  return var

In [0]:
class CNNmodel():
  def __init__(self, sess, weights_dict, biases_dict, iterations, batch_size, learn_rate, reg_rate,  display_steps=100, n_inputs=n_inputs, n_classes=n_classes):
    self.sess = sess

    # previously defined parameters for the model
    self.n_inputs = n_inputs
    self.n_classes = n_classes

    # get weights and biases in a dictionary for the various layers
    self.weights_dict = weights_dict
    self.biases_dict= biases_dict

    # user-defined hyperparameters
    self.iterations = iterations
    self.batch_size = batch_size
    self.display_steps = display_steps
    self.learn_rate = learn_rate
    self.reg_rate = reg_rate # L2 regularization strength (often written as lambda)

    # create placeholders for inputs, labels and the dropout keep probability
    self.x = tf.placeholder(tf.float32, [None, self.n_inputs])
    self.y = tf.placeholder(tf.float32, [None, self.n_classes])
    self.dropout = tf.placeholder(tf.float32)

    self.cnn_model_fn()

### Convolve
To create a convolution layer, I will use tf.nn.conv2d. The function takes 4-D input and filter tensors and computes a 2-D convolution.
The function takes the following inputs:

*  x in the shape [batch_size, height, width, channels]
*  weights in the shape [filter_height, filter_width, input_channels, output_channels] (for the first layer the filter/kernel is 5x5, so the shape is [5, 5, 1, 32])
*  stride - how far the window shifts along each dimension of the input tensor

What happens inside this function?
*  It flattens the filter to a 2-D matrix with shape [5\*5\*1, 32].
*  It extracts image patches from the input tensor to form a *virtual* tensor of shape `[batch, 28, 28, 5*5*1]`.
*  For each patch, it right-multiplies the filter matrix and the image patch vector.

The result is a `Tensor` (a 2-D convolution) of shape `(?, 28, 28, 32)`.
Here, the output of the first convolution layer is 32 feature maps of size [28x28] per image; 32 is the depth (number of channels) of the output.
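
The three steps listed above (flatten the filter, extract patches, multiply) can be sketched in plain NumPy. This is only an illustration under my own assumptions: stride 1, 'SAME' zero padding and odd filter sizes. The name `conv2d_same` is hypothetical, and the loop is far slower than what tf.nn.conv2d actually does.

```python
import numpy as np

def conv2d_same(x, w):
    """Stride-1, 'SAME'-padded 2-D convolution via patch extraction.

    x: [batch, height, width, in_channels]
    w: [filter_height, filter_width, in_channels, out_channels]
    """
    batch, height, width, in_ch = x.shape
    fh, fw, _, out_ch = w.shape
    ph, pw = fh // 2, fw // 2
    # Zero-pad so the output keeps the input's height and width.
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw), (0, 0)), mode="constant")
    # Step 1: flatten the filter to [fh * fw * in_ch, out_ch].
    w_mat = w.reshape(fh * fw * in_ch, out_ch)
    # Step 2: gather one flattened patch per output position.
    patches = np.empty((batch, height, width, fh * fw * in_ch), dtype=x.dtype)
    for i in range(height):
        for j in range(width):
            patches[:, i, j, :] = xp[:, i:i + fh, j:j + fw, :].reshape(batch, -1)
    # Step 3: right-multiply every patch vector by the filter matrix.
    return patches @ w_mat

x = np.random.rand(2, 28, 28, 1).astype(np.float32)
w = np.random.rand(5, 5, 1, 32).astype(np.float32)
print(conv2d_same(x, w).shape)  # (2, 28, 28, 32)
```

Each output value is the sum of a 5x5 input patch multiplied element-wise by one filter, which is why the height and width stay at 28 while the depth grows to 32.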
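
### Adjust for bias
The function defined below also uses tf.nn.bias_add. This "bias" is not a statistical bias that the model picks up while learning. It is a learned offset, one value per output channel, that is added to every position of the corresponding feature map. tf.nn.bias_add is a special case of tf.add in which the bias must be 1-D and is broadcast along the channel dimension.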
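
As a rough NumPy analogy (assuming channels-last data, as used throughout this notebook), adding a per-channel bias is just a broadcast addition. The array names are made up for the illustration.

```python
import numpy as np

feature_maps = np.zeros((2, 28, 28, 32), dtype=np.float32)  # stand-in for a conv output
bias = np.arange(32, dtype=np.float32)                      # one value per channel

# Broadcasting adds bias[c] to every spatial position of channel c.
shifted = feature_maps + bias
print(shifted.shape)  # (2, 28, 28, 32)
```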

### Apply activation function
Every negative output of the convolutional layer is replaced by 0 by the ReLU activation function; non-negative outputs pass through unchanged.
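
In NumPy terms, ReLU is an element-wise maximum with zero. A tiny sketch (the function name `relu` is my own):

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0)

activations = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(activations))
```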

### Apply max pooling
Max pooling is a form of non-linear down-sampling: the input is partitioned into rectangular regions and only the maximum value of each region is kept. Here the regions are 2x2 windows moved with a stride of 2, which halves the height and width.
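
A minimal NumPy sketch of 2x2 max pooling with stride 2, assuming the height and width are even so that no padding is needed. The helper `max_pool_2x2` is hypothetical and only mirrors what tf.nn.max_pool does for this particular setting.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on [batch, height, width, channels]."""
    batch, height, width, channels = x.shape
    # Split each spatial axis into (blocks, 2) and take the max inside each block.
    blocks = x.reshape(batch, height // 2, 2, width // 2, 2, channels)
    return blocks.max(axis=(2, 4))

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
pooled = max_pool_2x2(x)
print(pooled[0, :, :, 0])
```

For the 4x4 example, each output entry is the largest value of one non-overlapping 2x2 block, so the result is a 2x2 map.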


In [0]:
class CNNmodel(CNNmodel):
  def conv_layer(self, x, w, b, stride=1):
    x = tf.nn.conv2d(x, w, strides=[1, stride, stride, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    x = tf.nn.relu(x)
    convlayer = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return convlayer

### Creating the conv layers, fully connected layer, dropout and softmax layer

In [0]:
class CNNmodel(CNNmodel):
  def cnn_model_fn(self):
    
    # Reshape the flat 784-pixel vectors into image tensors of shape
    # [batch_size, 28, 28, 1]. The channel count is 1 because the images are
    # grayscale; -1 lets TensorFlow infer the batch size.
    x_image = tf.reshape(self.x, [-1,28,28,1])
    
    # Convolutional layer 1
    self.yhat = self.conv_layer(x_image, self.weights_dict['w1'], self.biases_dict['b1'])
    # Convolutional layer 2
    self.yhat = self.conv_layer(self.yhat, self.weights_dict['w2'], self.biases_dict['b2'])
    
    # Defining a fully connected layer to use softmax and to create the probabilities in the end.
    
    # Flatten second conv layer
    self.yhat = tf.reshape(self.yhat, [-1, self.weights_dict['w3'].get_shape().as_list()[0]])
    # Applying weights and biases
    self.yhat = tf.add(tf.matmul(self.yhat, self.weights_dict['w3']), self.biases_dict['b3'])
    # Apply the ReLU activation function
    self.yhat = tf.nn.relu(self.yhat)
    
    # Create a dropout layer - an overfitting prevention tool
    self.yhat = tf.nn.dropout(self.yhat, self.dropout)
    
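    # Output layer: produces the logits (softmax is applied inside the cost function)
    self.yhat = tf.add(tf.matmul(self.yhat, self.weights_dict['w4']), self.biases_dict['b4'])
    # Define the cost function: mean softmax cross-entropy over the batch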
    self.costs = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=self.yhat, labels=self.y))
    # Define the loss function with L2 regularization
    self.l2 = tf.reduce_sum(tf.get_collection('l2'))
    self.loss = self.costs + self.reg_rate * self.l2
    
    # Flag each case in the mini-batch that has been classified correctly
    self.correct_prediction = tf.equal(tf.argmax(self.yhat, 1), tf.argmax(self.y, 1))
    # Get accuracy as the fraction of correct cases
    self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
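
The loss defined above is the mean softmax cross-entropy plus `reg_rate` times the sum of squared variable entries. Here is a NumPy sketch of that formula, written as an illustration only: the function names are my own, and the real computation happens inside tf.nn.softmax_cross_entropy_with_logits_v2 and the 'l2' collection.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

def total_loss(logits, labels, variables, reg_rate):
    """Cross-entropy plus reg_rate times the sum of squared variable entries."""
    l2 = sum(float(np.sum(np.square(v))) for v in variables)
    return softmax_cross_entropy(logits, labels) + reg_rate * l2

logits = np.zeros((4, 10))            # a model that predicts all classes equally
labels = np.eye(10)[[3, 1, 4, 1]]     # one-hot labels for four examples
print(softmax_cross_entropy(logits, labels))
```

With uniform predictions over 10 classes the cross-entropy equals ln(10), roughly 2.30, which is a handy reference point when reading the loss values printed during training.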

In [0]:
class CNNmodel(CNNmodel):
  def train(self):
    model_vars = tf.get_collection('model_vars')
    self.optim = (tf.train.AdamOptimizer(learning_rate=self.learn_rate).minimize(self.loss, var_list=model_vars))
    self.sess.run(tf.global_variables_initializer())

    for it in range(self.iterations):
      batch = mnist.train.next_batch(self.batch_size)
      # First optimizer step on this batch; self.dropout is the keep probability
      self.sess.run([self.optim], feed_dict={self.x: batch[0], self.y: batch[1], self.dropout: 0.75})
      if it % self.display_steps == 0:
        # Report loss and accuracy on the current batch with dropout disabled
        loss = self.sess.run(self.loss, feed_dict={self.x: batch[0], self.y: batch[1], self.dropout: 1.0})
        train_accuracy = self.accuracy.eval(session=self.sess, feed_dict={self.x:batch[0], self.y: batch[1], self.dropout: 1.0})
        print("Step: %d, Training accuracy: %g, Loss: %f" % (it, float(train_accuracy), loss))
      # Second optimizer step on the same batch, with a keep probability of 0.5
      self.optim.run(session=self.sess,feed_dict={self.x: batch[0], self.y: batch[1], self.dropout: 0.5})
    acc = self.sess.run(self.accuracy, feed_dict={self.x: mnist.validation.images[:5000], self.y: mnist.validation.labels[:5000], self.dropout: 1.0})
    print("Validation Accuracy: ", acc)

  def test_accuracy(self):
    acc = self.sess.run(self.accuracy, feed_dict={self.x: mnist.test.images, self.y: mnist.test.labels, self.dropout: 1.0})
    print("Test Accuracy: ", acc)
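
The value fed to `self.dropout` is a keep probability: 1.0 disables dropout for evaluation, while smaller values drop more units during training. A rough NumPy sketch of this kind of dropout, where the kept units are scaled up by 1 / keep_prob, is shown below. The function name and the `rng` argument are my own choices for the illustration.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Keep each unit with probability keep_prob and rescale the kept units."""
    if keep_prob >= 1.0:
        return x  # evaluation: nothing is dropped
    mask = rng.random(x.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
activations = np.ones((1000, 1024))
dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(dropped.mean())  # close to 1.0 on average
```

Because of the rescaling, the same weights can be used at evaluation time with a keep probability of 1.0, as done in `test_accuracy`.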

In [9]:
sess_1 = tf.Session()

weights_1 = {
    'w1': format_weights([5, 5, 1, 32], 'w_1-1'),
    'w2': format_weights([5, 5, 32, 64], 'w_1-2'),
    'w3': format_weights([7 * 7 * 64, 1024], 'w_1-3'),
    'w4': format_weights([1024, n_classes], 'w_1-4')
}

biases_1= {
    'b1': format_biases([32], 'b_1-1'),
    'b2': format_biases([64], 'b_1-2'),
    'b3': format_biases([1024], 'b_1-3'),
    'b4': format_biases([n_classes], 'b_1-4')
}

runs_1= 600
batchsize_1 = 20
learningrate_1 = 0.03
regularizationrate_1 = 0.03


model_1 = CNNmodel(sess_1, weights_1, biases_1, runs_1, batchsize_1, learningrate_1, regularizationrate_1)

model_1.train()
model_1.test_accuracy()

sess_1.close()

Step: 0, Training accuracy: 0.25, Loss: 45.495209
Step: 100, Training accuracy: 0.55, Loss: 4.286979
Step: 200, Training accuracy: 0.55, Loss: 6.666721
Step: 300, Training accuracy: 0.6, Loss: 5.033250
Step: 400, Training accuracy: 0.6, Loss: 3.635106
Step: 500, Training accuracy: 0.55, Loss: 5.257320
Validation Accuracy:  0.4788
Test Accuracy:  0.4792


In [10]:
sess_2 = tf.Session()

weights_2 = {
    'w1': format_weights([5, 5, 1, 32], 'w_2-1'),
    'w2': format_weights([5, 5, 32, 64], 'w_2-2'),
    'w3': format_weights([7 * 7 * 64, 1024], 'w_2-3'),
    'w4': format_weights([1024, n_classes], 'w_2-4')
}

biases_2= {
    'b1': format_biases([32], 'b_2-1'),
    'b2': format_biases([64], 'b_2-2'),
    'b3': format_biases([1024], 'b_2-3'),
    'b4': format_biases([n_classes], 'b_2-4')
}

runs_2= 1000
batchsize_2 = 20
learningrate_2 = 0.003
regularizationrate_2 = 0.003


model_2 = CNNmodel(sess_2, weights_2, biases_2, runs_2, batchsize_2, learningrate_2, regularizationrate_2)

model_2.train()
model_2.test_accuracy()

sess_2.close()

Step: 0, Training accuracy: 0.2, Loss: 9.865908
Step: 100, Training accuracy: 0.95, Loss: 0.729135
Step: 200, Training accuracy: 0.95, Loss: 0.616921
Step: 300, Training accuracy: 0.95, Loss: 0.600459
Step: 400, Training accuracy: 0.9, Loss: 0.816150
Step: 500, Training accuracy: 0.8, Loss: 0.951970
Step: 600, Training accuracy: 0.95, Loss: 0.665529
Step: 700, Training accuracy: 0.95, Loss: 0.437964
Step: 800, Training accuracy: 0.95, Loss: 0.937742
Step: 900, Training accuracy: 1, Loss: 0.396490
Validation Accuracy:  0.9574
Test Accuracy:  0.9576


In [11]:
sess_3 = tf.Session()

weights_3 = {
    'w1': format_weights([5, 5, 1, 32], 'w_3-1'),
    'w2': format_weights([5, 5, 32, 64], 'w_3-2'),
    'w3': format_weights([7 * 7 * 64, 1024], 'w_3-3'),
    'w4': format_weights([1024, n_classes], 'w_3-4')
}

biases_3= {
    'b1': format_biases([32], 'b_3-1'),
    'b2': format_biases([64], 'b_3-2'),
    'b3': format_biases([1024], 'b_3-3'),
    'b4': format_biases([n_classes], 'b_3-4')
}

runs_3 = 3000
batchsize_3 = 20
learningrate_3 = 0.003
regularizationrate_3 = 0.0003


model_3 = CNNmodel(sess_3, weights_3, biases_3, runs_3, batchsize_3, learningrate_3, regularizationrate_3)

model_3.train()
model_3.test_accuracy()
sess_3.close()

Step: 0, Training accuracy: 0.25, Loss: 2.816543
Step: 100, Training accuracy: 1, Loss: 0.288369
Step: 200, Training accuracy: 0.9, Loss: 0.612343
Step: 300, Training accuracy: 0.9, Loss: 0.496815
Step: 400, Training accuracy: 0.9, Loss: 0.568156
Step: 500, Training accuracy: 0.95, Loss: 0.325526
Step: 600, Training accuracy: 1, Loss: 0.308819
Step: 700, Training accuracy: 0.95, Loss: 0.450259
Step: 800, Training accuracy: 0.95, Loss: 0.356524
Step: 900, Training accuracy: 1, Loss: 0.320757
Step: 1000, Training accuracy: 1, Loss: 0.252354
Step: 1100, Training accuracy: 1, Loss: 0.250033
Step: 1200, Training accuracy: 1, Loss: 0.256277
Step: 1300, Training accuracy: 0.95, Loss: 0.310451
Step: 1400, Training accuracy: 1, Loss: 0.252973
Step: 1500, Training accuracy: 1, Loss: 0.267030
Step: 1600, Training accuracy: 1, Loss: 0.235063
Step: 1700, Training accuracy: 1, Loss: 0.235208
Step: 1800, Training accuracy: 1, Loss: 0.256866
Step: 1900, Training accuracy: 0.95, Loss: 0.271403
Step: 20

In [12]:
sess_5 = tf.Session()

weights_5 = {
    'w1': format_weights([5, 5, 1, 32], 'w_5-1'),
    'w2': format_weights([5, 5, 32, 64], 'w_5-2'),
    'w3': format_weights([7 * 7 * 64, 1024], 'w_5-3'),
    'w4': format_weights([1024, n_classes], 'w_5-4')
}

biases_5= {
    'b1': format_biases([32], 'b_5-1'),
    'b2': format_biases([64], 'b_5-2'),
    'b3': format_biases([1024], 'b_5-3'),
    'b4': format_biases([n_classes], 'b_5-4')
}

runs_5 = 3000
batchsize_5 = 50
learningrate_5 = 0.001
regularizationrate_5 = 0.0001


model_5 = CNNmodel(sess_5, weights_5, biases_5, runs_5, batchsize_5, learningrate_5, regularizationrate_5)

model_5.train()
model_5.test_accuracy()
sess_5.close()

Step: 0, Training accuracy: 0.22, Loss: 2.667605
Step: 100, Training accuracy: 0.92, Loss: 0.257868
Step: 200, Training accuracy: 0.98, Loss: 0.169784
Step: 300, Training accuracy: 1, Loss: 0.096193
Step: 400, Training accuracy: 1, Loss: 0.107289
Step: 500, Training accuracy: 0.98, Loss: 0.163559
Step: 600, Training accuracy: 1, Loss: 0.093809
Step: 700, Training accuracy: 1, Loss: 0.092286
Step: 800, Training accuracy: 0.98, Loss: 0.106344
Step: 900, Training accuracy: 0.96, Loss: 0.217415
Step: 1000, Training accuracy: 0.98, Loss: 0.116735
Step: 1100, Training accuracy: 1, Loss: 0.104654
Step: 1200, Training accuracy: 0.94, Loss: 0.271424
Step: 1300, Training accuracy: 0.98, Loss: 0.115138
Step: 1400, Training accuracy: 0.96, Loss: 0.265512
Step: 1500, Training accuracy: 1, Loss: 0.082118
Step: 1600, Training accuracy: 0.96, Loss: 0.252909
Step: 1700, Training accuracy: 1, Loss: 0.076609
Step: 1800, Training accuracy: 1, Loss: 0.083297
Step: 1900, Training accuracy: 1, Loss: 0.084265