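## Step 1: Importing TensorFlow and the dataset

Import TensorFlow and load MNIST with the `input_data` helper that ships with the TensorFlow 1.x tutorials package.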

In [1]:
import tensorflow as tf

In [2]:
from tensorflow.examples.tutorials.mnist import input_data

In [None]:
mnist = input_data.read_data_sets("MNIST_data/",one_hot=True)

## Step 2: Create a bunch of helper functions for Conv2D

1. A `conv2d` wrapper.
2. The helper functions that go with it: weight and bias initializers, max pooling, and layer builders.


In [None]:
def init_weights(shape):
    init_random_dist = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init_random_dist)

In [18]:
def init_bias(shape):
    init_bias_vals = tf.constant(0.1, shape=shape)
    return tf.Variable(init_bias_vals)

In [19]:
def conv2d(X,W):
    return tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="SAME")

In [20]:
def max_pool_2x2(X):
    return tf.nn.max_pool(X, ksize=[1,2,2,1],
                         strides=[1,2,2,1],
                         padding="SAME")

#### Why `ksize` and `strides` are `[1, 2, 2, 1]`: `[batch, height, width, channels]`

Here we only want 2x2 max pooling over the height and width of the image.

**Remember**: The first 1 is the batch dimension. You don't usually want to skip over examples in your batch, or you shouldn't have included them in the first place. :)

The last 1 is the channel (depth) dimension. You don't usually want to skip input channels, for the same reason.

The `conv2d` operator is more general, so you could create convolutions that slide the window along other dimensions, but that's not a typical use in convnets. The typical use is spatial: along height and width.
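To make the `[1, 2, 2, 1]` setting concrete, here is a minimal pure-Python sketch of 2x2 max pooling with stride 2 on a single image with a single channel. The function name `max_pool_2x2_single` and the sample values are hypothetical, chosen only for illustration; the batch and channel dimensions are left out because their window size and stride are both 1.

```python
def max_pool_2x2_single(image):
    """2x2 max pooling with stride 2 over one single-channel image.

    `image` is a list of rows (height x width). Height and width are
    assumed to be even, so no padding is needed in this sketch.
    """
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height, 2):        # stride 2 along height
        row = []
        for left in range(0, width, 2):    # stride 2 along width
            # Collect the 2x2 window and keep its largest value
            window = [image[top + dy][left + dx]
                      for dy in range(2) for dx in range(2)]
            row.append(max(window))
        pooled.append(row)
    return pooled


image = [[1, 3, 2, 0],
         [4, 2, 1, 1],
         [0, 1, 5, 6],
         [2, 2, 7, 8]]
pooled = max_pool_2x2_single(image)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes 2x2: each spatial dimension is halved, while every example and every channel would be processed independently.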

**Why reshape with -1?** In `tf.reshape`, -1 is a placeholder that says "adjust as necessary to match the size needed for the full tensor": that dimension is inferred from the total number of elements. Used for the batch dimension, it makes the code independent of the input batch size, so you can change your pipeline without adjusting the batch size everywhere in the code.
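The inference behind -1 is simple arithmetic. The sketch below is not TensorFlow code; `infer_shape` is a hypothetical helper that mimics how a single -1 can be resolved from the total element count, assuming batches of flattened 784-pixel MNIST images.

```python
def infer_shape(total_elements, shape):
    """Replace a single -1 in `shape` with the size that matches the element count."""
    if shape.count(-1) != 1:
        raise ValueError("exactly one dimension must be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if total_elements % known != 0:
        raise ValueError("shape is not compatible with the element count")
    return [total_elements // known if dim == -1 else dim for dim in shape]


# The same target shape works for any batch size
print(infer_shape(50 * 784, [-1, 28, 28, 1]))   # [50, 28, 28, 1]
print(infer_shape(128 * 784, [-1, 28, 28, 1]))  # [128, 28, 28, 1]
```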

In [21]:
def convolutional_layer(input_x, shape):
    W = init_weights(shape)
    b = init_bias([shape[3]])
    return tf.nn.relu(conv2d(input_x,W) + b)

In [22]:
def normal_full_layer(input_layer, size):
    input_size = int(input_layer.get_shape()[1])
    W = init_weights([input_size, size])
    b = init_bias([size])
    return tf.matmul(input_layer, W) + b

### Build out our CNN
- placeholders
- layers
- loss function
- optimizer
- initialize variables and run

In [23]:
# placeholders
X = tf.placeholder(tf.float32, shape=[None, 784])
y_true = tf.placeholder(tf.float32, shape=[None, 10])

In [24]:
# reshape each flat 784-pixel row back into a 28x28 image with 1 channel
x_image = tf.reshape(X, [-1, 28, 28, 1])

In [25]:
# layer 1: 5x5 filters, 1 input channel, 32 output channels
# alternative tried with 6x6 filters:
# conv_1 = convolutional_layer(x_image, shape=[6, 6, 1, 32])
# conv_1_pooling = max_pool_2x2(conv_1)
conv_1 = convolutional_layer(x_image, shape=[5, 5, 1, 32])
conv_1_pooling = max_pool_2x2(conv_1)

In [26]:
conv_2 = convolutional_layer(conv_1_pooling,shape=[5, 5, 32, 64])
conv_2_pooling = max_pool_2x2(conv_2)

In [27]:
conv_2_flat = tf.reshape(conv_2_pooling, [-1, 7*7*64])
full_layer_one = tf.nn.relu(normal_full_layer(conv_2_flat, 1024))

### Why `7x7`, and why `[-1, 7*7*64]`

Each 2x2 max pooling layer with stride 2 reduces height and width by 50%. The convolutions use `SAME` padding with stride 1, so they leave height and width unchanged. The output of pooling layer 1 (`conv_1_pooling`) therefore has shape `[batch_size, 14, 14, 32]`, and the output of pooling layer 2 (`conv_2_pooling`) has shape `[batch_size, 7, 7, 64]`: 28/2/2 = 7 for the height and 28/2/2 = 7 for the width.

**Next**, the flattening step.

In the `reshape()` operation above, the -1 signifies that the batch_size dimension will be calculated dynamically from the number of examples in our input data. Each example has 7 (pool 2 height) * 7 (pool 2 width) * 64 (pool 2 channels) features, so we want the features dimension to have a value of 7 * 7 * 64 (3136 in total). The output tensor, `conv_2_flat`, has shape `[batch_size, 3136]`.
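The shape arithmetic can be checked without building the graph. This is a small sketch that assumes the rule for `SAME` padding, where the spatial output size is `ceil(input_size / stride)`; the helper name `same_padding_output` is hypothetical.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size under SAME padding: ceil(size / stride)."""
    return math.ceil(size / stride)


height = width = 28
channels = 1
shapes = []
for out_channels in (32, 64):
    # Convolution: SAME padding with stride 1 keeps height and width
    height = same_padding_output(height, 1)
    width = same_padding_output(width, 1)
    channels = out_channels
    # 2x2 max pooling: SAME padding with stride 2 halves height and width
    height = same_padding_output(height, 2)
    width = same_padding_output(width, 2)
    shapes.append((height, width, channels))

flat_features = height * width * channels
print(shapes)         # [(14, 14, 32), (7, 7, 64)]
print(flat_features)  # 3136
```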

In [28]:
hold_prob = tf.placeholder(tf.float32)
full_one_dropout = tf.nn.dropout(full_layer_one, keep_prob=hold_prob)
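As a rough illustration of what `keep_prob` controls in the dropout cell above: each activation is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise it is set to zero. The pure-Python sketch below is an assumption-laden stand-in, not the TensorFlow implementation; the function name `dropout_sketch` and the sample activations are hypothetical.

```python
import random


def dropout_sketch(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaling survivors by 1/keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0, 2.0, 3.0, 4.0]

train_out = dropout_sketch(activations, 0.5, rng)  # training: some units dropped
eval_out = dropout_sketch(activations, 1.0, rng)   # evaluation: nothing dropped

print(train_out)
print(eval_out)  # [1.0, 2.0, 3.0, 4.0]
```

Feeding a keep probability of 1.0 turns dropout off, which is why evaluation should use 1.0 while training uses a smaller value.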

In [29]:
y_pred = normal_full_layer(full_one_dropout, 10)

### Loss function, optimizer, and variable initialization

In [31]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred))
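For intuition about the loss cell above, the quantity is the cross-entropy between the one-hot labels and the softmax of the logits, averaged over the batch. The sketch below reproduces that arithmetic in plain Python for a tiny made-up batch; `softmax_cross_entropy` is a hypothetical helper, not the TensorFlow op.

```python
import math


def softmax_cross_entropy(labels, logits):
    """Cross-entropy between one-hot labels and softmax(logits) for one example."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum = math.log(sum(math.exp(z - shift) for z in logits))
    log_probs = [z - shift - log_sum for z in logits]
    return -sum(t * lp for t, lp in zip(labels, log_probs))


# Two examples, two classes, all logits equal: the model is maximally unsure
batch_labels = [[1.0, 0.0], [0.0, 1.0]]
batch_logits = [[0.0, 0.0], [0.0, 0.0]]

losses = [softmax_cross_entropy(t, z) for t, z in zip(batch_labels, batch_logits)]
mean_loss = sum(losses) / len(losses)
print(mean_loss)  # ln(2), roughly 0.693
```

This is also why `y_pred` is left as raw logits in the network: the softmax is applied inside the loss.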

In [32]:
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
train = optimizer.minimize(cross_entropy)

In [33]:
init = tf.global_variables_initializer()

In [34]:
steps = 5000

# Build the evaluation ops once, outside the loop, so the graph does not
# grow every time accuracy is reported.
matches = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(matches, tf.float32))

with tf.Session() as sess:
    sess.run(init)

    for i in range(steps):
        batch_x, batch_y = mnist.train.next_batch(50)

        # hold_prob is the dropout keep probability used during training
        sess.run(train, feed_dict={X: batch_x, y_true: batch_y, hold_prob: 0.3})

        # print message every 50 steps
        if i % 50 == 0:
            print("Currently on step{}".format(i))
            print("Accuracy i :")
            # Evaluate on the test set with dropout disabled (keep everything)
            print(sess.run(accuracy, feed_dict={X: mnist.test.images,
                                                y_true: mnist.test.labels,
                                                hold_prob: 1.0}))
        

Currently on step0
Accuracy i :
0.0879
Currently on step50
Accuracy i :
0.6972
Currently on step100
Accuracy i :
0.8193
Currently on step150
Accuracy i :
0.8786
Currently on step200
Accuracy i :
0.8973
Currently on step250
Accuracy i :
0.9086
Currently on step300
Accuracy i :
0.9145
Currently on step350
Accuracy i :
0.9299
Currently on step400
Accuracy i :
0.9294
Currently on step450
Accuracy i :
0.9336
Currently on step500
Accuracy i :
0.9393
Currently on step550
Accuracy i :
0.9428
Currently on step600
Accuracy i :
0.9431
Currently on step650
Accuracy i :
0.9484
Currently on step700
Accuracy i :
0.9473
Currently on step750
Accuracy i :
0.9491
Currently on step800
Accuracy i :
0.9523
Currently on step850
Accuracy i :
0.9538
Currently on step900
Accuracy i :
0.9524
Currently on step950
Accuracy i :
0.954
Currently on step1000
Accuracy i :
0.9579
Currently on step1050
Accuracy i :
0.9601
Currently on step1100
Accuracy i :
0.9605
Currently on step1150
Accuracy i :
0.9617
Currently on ste

- RMSProp optimizer, 6x6 filter: 97.75%
- Adam optimizer, 5x5 filter: 99.25%