# Architecture of our network is:
    
- (Input) -> [batch_size, 28, 28, 1]  >> Apply 32 filter of [5x5]
- (Convolutional layer 1)  -> [batch_size, 28, 28, 32]
- (ReLU 1)  -> [?, 28, 28, 32]
- (Max pooling 1) -> [?, 14, 14, 32]
- (Convolutional layer 2)  -> [?, 14, 14, 64] 
- (ReLU 2)  -> [?, 14, 14, 64] 
- (Max pooling 2)  -> [?, 7, 7, 64] 
- [fully connected layer 3] -> [1x1024]
- [ReLU 3]  -> [1x1024]
- [Drop out]  -> [1x1024]
- [fully connected layer 4] -> [1x10]


# The MNIST data

In [3]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


In [6]:
sess = tf.InteractiveSession()

In [7]:
width = 28
height = 28
flat = width * height
class_output = 10

# Input and output

In [8]:
x = tf.placeholder(tf.float32 , shape=[None,flat])
y_ = tf.placeholder(tf.float32 , shape=[None,class_output])

In [9]:
x_image = tf.reshape(x,[-1,28,28,1])

# Convolutional layer 1

In [13]:
w_conv1 = tf.Variable(tf.truncated_normal([5,5,1,32], stddev=0.1))
b_conv1 = tf.Variable(tf.constant(0.1, shape=[32]))

In [15]:
convolve1 = tf.nn.conv2d(x_image, w_conv1, strides=[1,1,1,1], padding = 'SAME') + b_conv1

# Applying RELU 1

In [16]:
h_conv1 = tf.nn.relu(convolve1)

# Applying Maxpooling

In [17]:
conv1 = tf.nn.max_pool(h_conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# Convolutional Layer 2

In [20]:
w_conv2 = tf.Variable(tf.truncated_normal([5,5,32,64], stddev=0.1))
b_conv2 = tf.Variable(tf.constant(0.1,shape=[64]))

In [21]:
convolve2 = tf.nn.conv2d(conv1, w_conv2, strides=[1,1,1,1], padding='SAME')


# Applying RELU 2

In [22]:
h_conv2 = tf.nn.relu(convolve2)

# Applying Maxpooling

In [25]:
conv2 = tf.nn.max_pool(h_conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# Fully Connected Layer 3

We need a fully connected layer to use the Softmax and create the probabilities in the end. Fully connected layers take the high-level filtered images from previous layer, that is all 64 matrices, and convert them to a flat array.

So, each matrix [7x7] will be converted to a matrix of [49x1], and then all of the 64 matrix will be connected, which make an array of size [3136x1].Then We will connect it to another layer of size [1024x1]. So, the weight 
between these 2 layers will be [3136x1024]


In [26]:
layer2_matrix = tf.reshape(conv2,[-1,7*7*64])

# Weight and Biases between layer 2 & 3

In [27]:
w_fc1 = tf.Variable(tf.truncated_normal([7*7*64,1024], stddev=0.1))
b_fc1 = tf.Variable(tf.constant(0.1,shape=[1024]))

In [28]:
fc1 = tf.matmul(layer2_matrix, w_fc1) + b_fc1

# Applying RELU 3

In [29]:
h_fc1=tf.nn.relu(fc1)

# Applying dropout layer

In [31]:
keep_probe = tf.placeholder(tf.float32)
layer_drop = tf.nn.dropout(h_fc1,keep_probe)

# Fully Connected Layer 4

In [32]:
W_fc2 = tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1))
b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))

In [34]:
fc = tf.matmul(layer_drop, W_fc2) + b_fc2

In [45]:
y_cnn = tf.nn.softmax(fc)
y_cnn

<tf.Tensor 'Softmax_1:0' shape=(?, 10) dtype=float32>

# Define function and train the model

# Loss function & optimizer

In [39]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_cnn), reduction_indices=[1]))

In [40]:
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

In [51]:
correct_prediction = tf.equal(tf.argmax(y_cnn, 1), tf.argmax(y_, 1))

In [52]:
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

In [54]:
sess.run(tf.global_variables_initializer())

# Training the model

In [63]:
for i in range(400):
    batch = mnist.train.next_batch(50)
    if i%10 == 0:
        train_accuracy = accuracy.eval(feed_dict={x:batch[0], y_:batch[1], keep_probe : 1.0})
        print("step %d,training accuracy %g" %(i,float(train_accuracy)))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_probe : 0.5})

step 0,training accuracy 0.86
step 10,training accuracy 0.84
step 20,training accuracy 0.94
step 30,training accuracy 0.9
step 40,training accuracy 0.9
step 50,training accuracy 0.82
step 60,training accuracy 0.82
step 70,training accuracy 0.9
step 80,training accuracy 0.9
step 90,training accuracy 0.86
step 100,training accuracy 0.92
step 110,training accuracy 0.9
step 120,training accuracy 0.94
step 130,training accuracy 0.92
step 140,training accuracy 1
step 150,training accuracy 0.82
step 160,training accuracy 0.96
step 170,training accuracy 0.92
step 180,training accuracy 0.94
step 190,training accuracy 0.98
step 200,training accuracy 0.9
step 210,training accuracy 0.94
step 220,training accuracy 1
step 230,training accuracy 0.94
step 240,training accuracy 0.96
step 250,training accuracy 0.92
step 260,training accuracy 0.96
step 270,training accuracy 0.92
step 280,training accuracy 0.92
step 290,training accuracy 0.96
step 300,training accuracy 0.94
step 310,training accuracy 1
st

In [65]:
n_batches = mnist.test.images.shape[0]
cumulative_accuracy = 0.0
for index in range(n_batches):
    batch = mnist.test.next_batch(50)
    cumulative_accuracy += accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_probe: 1.0})
print("test accuracy {}".format(cumulative_accuracy / n_batches))

test accuracy 0.9473039980709552
