# Convolutional Neural Networks
Taught by Vignesh Hari, using the MNIST dataset
   
## Requirements
We'll be using TensorFlow 1.9.0.


In [19]:
import tensorflow as tf
import tensorflow.examples.tutorials.mnist.input_data as input_data

## MNIST
Downloading and using the MNIST dataset.  
The MNIST dataset contains images of handwritten digits. Each image is 28 x 28 pixels and has a single channel (i.e. grayscale).  
One-hot encoding means the label is stored as the position of a 1 in a list of zeros rather than as a number.  
E.g. with three classes: [0,0,1] instead of 2, [0,1,0] instead of 1, and so on.  
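
To make the encoding concrete, here is a minimal pure-Python sketch. The helper names `one_hot` and `decode` are made up for this illustration; they are not part of TensorFlow or of the MNIST loader, which does the encoding for us when `one_hot=True` is passed.

```python
def one_hot(label, num_classes):
    """Return a list of zeros with a single 1 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    encoded = [0] * num_classes
    encoded[label] = 1
    return encoded


def decode(encoded):
    """Recover the integer label: the index of the largest entry."""
    return max(range(len(encoded)), key=lambda i: encoded[i])


three = one_hot(3, 10)   # MNIST has 10 digit classes
print(three)             # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(decode(three))     # 3
```

Decoding by taking the index of the largest entry is the same idea as the `tf.argmax` call used later for the accuracy.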

In [20]:
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True )

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


## Input Data
The data used in the TensorFlow session enters the graph through TensorFlow placeholders. Since the images are 28 x 28 in size, each one is stored unrolled as a vector of 28 * 28 = 784 values; the `None` dimension leaves the batch size open.

In [21]:
x_inp = tf.placeholder(tf.float32, [None, 784]) 
y = tf.placeholder(tf.float32, [None, 10]) # 10 output classes (one-hot labels)

## Reshaping
To use a convolutional neural net we have to convolve + pool + fully connect (explained later).

Convolving means creating feature maps from the input by sliding small filters (kernels) over the data. For this the data has to be in its original dimensions, so we reshape it back to 28 x 28 with one channel; the -1 lets TensorFlow infer the batch size.

In [22]:
x = tf.reshape(x_inp, [-1, 28, 28, 1])
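
As a rough illustration of what unrolling and reshaping do to one image, the sketch below uses plain nested lists instead of tensors. The helpers `unroll` and `reroll` are hypothetical names for this example; the only assumption carried over from TensorFlow is that the data is laid out row by row (row-major order).

```python
HEIGHT, WIDTH = 28, 28


def unroll(image):
    """Flatten a HEIGHT x WIDTH nested list row by row into one list."""
    return [pixel for row in image for pixel in row]


def reroll(vector):
    """Rebuild the HEIGHT x WIDTH grid from a flat list of 784 values."""
    if len(vector) != HEIGHT * WIDTH:
        raise ValueError("expected %d values" % (HEIGHT * WIDTH))
    return [vector[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]


# A dummy image whose pixel value encodes its own position.
image = [[r * WIDTH + c for c in range(WIDTH)] for r in range(HEIGHT)]
flat = unroll(image)
restored = reroll(flat)

print(len(flat))                           # 784
print(restored == image)                   # True
print(flat[5 * WIDTH + 3] == image[5][3])  # True
```

The pixel at row `r`, column `c` sits at index `r * 28 + c` of the unrolled vector, which is why reshaping back loses no information.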

## Weights

Our convolution filters are 5x5. We learn 32 such filters in the first layer, and the input has one channel, so the first layer's weight has shape (5x5x1x32).  
  
The second layer has 64 filters of 5x5 each, and its input is the 32 feature maps of the previous layer, which makes its shape (5x5x32x64).

We use pooling to summarise the features produced by convolution. Pooling scales each side of the image down by the pool stride; we use 2x2 pools twice, so each side shrinks by a factor of 4 and the 28x28 image becomes a 7x7 feature map. The output of the second layer therefore has an unrolled size of 7*7*64. The number of nodes in the fully connected layer is up to you, but for now we'll use 784 (28x28).

The output layer has 784 input nodes and 10 output classes, so the shape of its weight is (784,10).


In [23]:
weight_conv1  = tf.Variable(tf.random_normal([5,5,1,32])) 
weight_conv2  = tf.Variable(tf.random_normal([5,5,32,64]))
weight_fc     = tf.Variable(tf.random_normal([7*7*64,784]))
weight_out    = tf.Variable(tf.random_normal([784,10]))
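
The shape arithmetic behind these weights can be checked with a few lines of plain Python. This is only a sketch: `same_output_size` and `count` are helper names invented here, and the rule used (output size = ceil(input size / stride) for "SAME" padding) is the one TensorFlow applies to both the convolutions and the pools in this notebook.

```python
import math


def same_output_size(size, stride):
    """Spatial size after a 'SAME'-padded op: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def count(shape):
    """Number of scalar parameters in a weight of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


size = 28
size = same_output_size(size, 1)  # conv1, stride 1 -> 28
size = same_output_size(size, 2)  # pool1, stride 2 -> 14
size = same_output_size(size, 1)  # conv2, stride 1 -> 14
size = same_output_size(size, 2)  # pool2, stride 2 -> 7
flattened = size * size * 64      # 64 feature maps of 7x7

weight_shapes = {
    "weight_conv1": (5, 5, 1, 32),
    "weight_conv2": (5, 5, 32, 64),
    "weight_fc": (flattened, 784),
    "weight_out": (784, 10),
}

print(size, flattened)  # 7 3136
for name in sorted(weight_shapes):
    print(name, weight_shapes[name], count(weight_shapes[name]))
```

Almost all of the parameters sit in `weight_fc`, which is one reason the size of the fully connected layer is worth experimenting with.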

## Biases

A bias is added to a neuron's weighted input so that the neuron can produce a non-zero output even when that input is zero. The size of each bias is the number of nodes (or filters) in that layer.


In [24]:
bias_conv1  = tf.Variable(tf.random_normal([32])) 
bias_conv2  = tf.Variable(tf.random_normal([64]))
bias_fc     = tf.Variable(tf.random_normal([784]))
bias_out    = tf.Variable(tf.random_normal([10]))

## Layer 1

In layer one we take the initial 28x28 image and convolve it with 1x1 strides, then max-pool it with a 2x2 window and 2x2 strides, producing 14x14 feature maps.


In [25]:
conv1     = tf.add(tf.nn.conv2d(x , weight_conv1 , strides=[1,1,1,1] , padding="SAME") , bias_conv1) 
max_pool1 = tf.nn.max_pool(conv1 , ksize=[1,2,2,1] , strides=[1,2,2,1] , padding="SAME") 
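
To see what these two operations compute, here is a small pure-Python sketch for a single channel and a single filter. The functions `conv2d_same` and `max_pool_2x2` are illustrative stand-ins written for this example, not TensorFlow code; they assume stride 1 for the convolution, stride 2 for the pool, and zero "SAME" padding, matching the arguments used above. Like `tf.nn.conv2d`, the convolution slides the kernel without flipping it.

```python
def conv2d_same(image, kernel):
    """Stride-1 convolution with zero padding, so the output size equals the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_top, pad_left = (kh - 1) // 2, (kw - 1) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - pad_top, c + j - pad_left
                    if 0 <= rr < h and 0 <= cc < w:  # outside the image counts as zero
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def max_pool_2x2(image):
    """Keep the largest value of each 2x2 window, moving 2 pixels at a time."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(0, h, 2):
        row = []
        for c in range(0, w, 2):
            window = [image[rr][cc]
                      for rr in range(r, min(r + 2, h))
                      for cc in range(c, min(c + 2, w))]
            row.append(max(window))
        out.append(row)
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]  # a kernel that copies the centre pixel

features = conv2d_same(image, identity)
print(max_pool_2x2(features))  # [[6.0, 8.0], [14.0, 16.0]]
```

The 4x4 input stays 4x4 after the convolution and becomes 2x2 after the pool, the same halving that takes the 28x28 MNIST image to 14x14.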

## Layer 2

In layer two we take the 14x14 feature maps from layer one and convolve them with 1x1 strides, then max-pool them with a 2x2 window and 2x2 strides, producing 7x7 feature maps.

In [26]:
conv2     = tf.add(tf.nn.conv2d(max_pool1 , weight_conv2 , strides=[1,1,1,1] , padding="SAME") , bias_conv2)
max_pool2 = tf.nn.max_pool(conv2 , ksize=[1,2,2,1] , strides=[1,2,2,1] , padding="SAME") 

## Layer 3

In layer three we unroll (flatten) the output of the convolutional layers, multiply it by the fully connected weights, add the bias, and apply the ReLU activation function to get the output values of the nodes.

In [27]:
fc = tf.reshape(max_pool2 , [-1 , 7*7*64 ])
fc = tf.nn.relu( tf.add(tf.matmul(fc , weight_fc) , bias_fc) )
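
For a single example, the fully connected step boils down to a weighted sum per node, plus a bias, followed by ReLU. The sketch below shows this with tiny made-up numbers; `dense` and `relu` are hypothetical helper names, and the 3-input, 2-node layer is chosen only to keep the arithmetic readable.

```python
def relu(values):
    """ReLU keeps positive values and replaces negative ones with 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Compute inputs . weights + biases for one example.

    `weights` has one row per input and one column per output node.
    """
    return [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        for j in range(len(biases))
    ]


inputs = [1.0, -2.0, 0.5]
weights = [[0.5, -1.0],
           [0.25, 0.5],
           [2.0, 1.0]]
biases = [0.1, -0.2]

pre_activation = dense(inputs, weights, biases)  # roughly [1.1, -1.7]
activated = relu(pre_activation)                 # roughly [1.1, 0.0]
print(pre_activation)
print(activated)
```

The second node ends up negative before the activation, so ReLU clamps it to zero; the first node passes through unchanged.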

## Layer 4

In the final layer we multiply by the output weights and add the output bias to get the final output: one raw score (logit) per class, the largest of which is the prediction.

In [28]:
out = tf.matmul(fc , weight_out) + bias_out

## Cross Entropy (cost) and Optimisation 

Finally we calculate the loss (cost) function using softmax cross entropy, where $\hat{y} = \mathrm{softmax}(out)$ is the predicted class distribution and $y$ is the one-hot label:

$$-\sum_{i} y_i \log(\hat{y}_i)$$

We train the model using the Adam optimiser built into TensorFlow.

In [29]:
cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=out, labels=y) )
optimizer = tf.train.AdamOptimizer().minimize(cost)
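
The cost for one example can be reproduced by hand. The following is a minimal sketch in plain Python, not the TensorFlow implementation: `softmax` and `cross_entropy` are names chosen for this illustration, and subtracting the largest logit before exponentiating is a common numerical-stability trick that is assumed here rather than taken from the notebook.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot_label):
    """-sum(y_i * log(p_i)); only the true class contributes for a one-hot label."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)


logits = [2.0, 1.0, 0.1]
label = [1, 0, 0]

probs = softmax(logits)
loss = cross_entropy(probs, label)
print(probs)  # roughly [0.659, 0.242, 0.099]
print(loss)   # roughly 0.417
```

A model that is completely unsure (equal logits for all 10 digits) has a loss of log(10), about 2.3, which is a useful baseline when reading loss values.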

## Accuracy 

Accuracy is the percentage of examples whose predicted class (the argmax of the output) matches the true class (the argmax of the one-hot label).

In [30]:
correct = tf.equal(tf.argmax(out, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, 'float')) * 100
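
The same calculation on a handful of made-up predictions, as a plain-Python sketch. `argmax` and `accuracy_percent` are hypothetical helpers that mirror the `tf.argmax`, `tf.equal` and `tf.reduce_mean` calls above; the scores and labels are invented for the example.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_percent(outputs, labels):
    """Percentage of rows where the predicted class equals the true class."""
    correct = [argmax(o) == argmax(l) for o, l in zip(outputs, labels)]
    return 100.0 * sum(correct) / len(correct)


outputs = [
    [0.1, 2.5, -1.0],   # predicts class 1
    [3.0, 0.2, 0.1],    # predicts class 0
    [0.0, 0.4, 0.9],    # predicts class 2
    [1.2, 0.3, 0.8],    # predicts class 0
]
labels = [
    [0, 1, 0],          # true class 1 -> correct
    [1, 0, 0],          # true class 0 -> correct
    [0, 0, 1],          # true class 2 -> correct
    [0, 0, 1],          # true class 2 -> wrong
]

print(accuracy_percent(outputs, labels))  # 75.0
```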

## Initialising a TensorFlow Session

In [31]:
sess = tf.Session()
sess.run(tf.global_variables_initializer())

## Training on the Dataset (MNIST)

We use a batch size of 128 and train for 2 epochs.

In [32]:
hm_epochs = 2
batch_size = 128

for epoch in range(hm_epochs):
    epoch_loss = 0
    # Floor division: only full batches are used in each epoch
    for _ in range(mnist.train.num_examples // batch_size):
        epoch_x, epoch_y = mnist.train.next_batch(batch_size)
        # One optimisation step; c is the mean loss of this batch
        _, c = sess.run([optimizer, cost], feed_dict={x_inp: epoch_x, y: epoch_y})
        epoch_loss += c  # summed over the batches, not averaged

    print('Completed Epoch# :  {}  : Epochs Left :  {}  : loss :  {}'.format(epoch, hm_epochs - epoch - 1, epoch_loss))

Completed Epoch# :  0  : Epochs Left :  1  : loss :  1728728.2463989258
Completed Epoch# :  1  : Epochs Left :  0  : loss :  420482.5573348999
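
The number of training steps per epoch follows from the floor division in the loop. The sketch below only illustrates that index arithmetic: `batch_ranges` is a hypothetical helper (the real `next_batch` also shuffles the data), and the figure of 55,000 training examples is an assumption based on the loader's default split, not something printed in this notebook.

```python
def batch_ranges(num_examples, batch_size):
    """Yield (start, end) index pairs for each full batch.

    Any remainder smaller than one batch is skipped, mirroring the
    floor division used in the training loop.
    """
    for b in range(num_examples // batch_size):
        yield b * batch_size, (b + 1) * batch_size


num_examples = 55000  # assumed size of mnist.train
ranges = list(batch_ranges(num_examples, 128))

print(len(ranges))                    # 429
print(ranges[-1])                     # (54784, 54912)
print(num_examples - ranges[-1][1])   # 88 examples left over
```

Because the loop adds up the cost of every batch, the printed loss is a sum over all steps in the epoch; dividing it by the number of batches gives the mean loss per batch.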


In [36]:
# Despite the label, this evaluates the trained model on the held-out test set
print('Accuracy For Training is  {} %'.format(sess.run(accuracy, feed_dict={x_inp: mnist.test.images, y: mnist.test.labels})))

 Accuracy For Training is  94.53 %
