# TensorFlow Assignment: Convolutional Neural Network (CNN)

**[Duke Community Standard](http://integrity.duke.edu/standard.html): By typing your name below, you are certifying that you have adhered to the Duke Community Standard in completing this assignment.**

Name: 

### Convolutional Neural Network

Build a 2-layer CNN for MNIST digit classification. Feel free to play around with the model architecture and see how the training time and performance change, but to begin, try the following:

Image -> convolution (32 5x5 filters) -> nonlinearity (ReLU) -> (2x2 max pool) -> convolution (64 5x5 filters) -> nonlinearity (ReLU) -> (2x2 max pool) -> fully connected (256 hidden units) -> nonlinearity (ReLU) -> fully connected (10 output units) -> softmax
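To see how the tensor shapes evolve through this pipeline, here is a minimal sketch in plain Python. It assumes stride-1 convolutions and stride-2 pooling, both with "SAME" padding; the architecture line above does not specify these, so treat them as assumptions. The helper names `same_pad_out` and `trace_shapes` are made up for illustration.

```python
import math

def same_pad_out(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

def trace_shapes(height=28, width=28):
    """Return (layer name, output shape) pairs for the suggested architecture."""
    shapes = [("input", (height, width, 1))]
    for index, channels in ((1, 32), (2, 64)):
        # 5x5 convolution, stride 1 (assumed), SAME padding: spatial size unchanged
        height, width = same_pad_out(height, 1), same_pad_out(width, 1)
        shapes.append(("conv%d" % index, (height, width, channels)))
        # 2x2 max pool, stride 2 (assumed), SAME padding: spatial size halved
        height, width = same_pad_out(height, 2), same_pad_out(width, 2)
        shapes.append(("pool%d" % index, (height, width, channels)))
    shapes.append(("flatten", (height * width * 64,)))
    shapes.append(("fc1", (256,)))
    shapes.append(("fc2", (10,)))
    return shapes

for name, shape in trace_shapes():
    print(name, shape)
```

Under these assumptions the second pooling layer produces 7x7x64 feature maps, so the first fully connected layer sees 3136 inputs per image.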

Some tips:
- The CNN model might take a while to train. Depending on your machine, you might expect this to take up to half an hour. If you see your validation performance start to plateau, you can kill the training.

- Since CNNs are more complex than the logistic regression and MLP models you've worked with before, you may find it helpful to use a more advanced optimizer. Your model will train faster if you use [`tf.train.AdamOptimizer`](https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer) instead of `tf.train.GradientDescentOptimizer`. A learning rate of 1e-4 is a good starting point.
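For intuition about what the Adam optimizer does differently from plain gradient descent, the following is a rough sketch of the textbook Adam update for a single scalar parameter. It is not TensorFlow's implementation, which may differ in details such as how epsilon is applied; the function name `adam_step` is hypothetical, and the `beta1`, `beta2`, and `eps` defaults are the commonly quoted ones.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook Adam update for a scalar parameter at step t (t starts at 1)."""
    # Exponential moving averages of the gradient and of its square
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for m and v being initialised at zero
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # The step is scaled by the running gradient magnitude
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Usage: the first step has size close to lr regardless of the gradient scale
p_big, _, _ = adam_step(0.0, grad=10.0, m=0.0, v=0.0, t=1)
p_small, _, _ = adam_step(0.0, grad=0.01, m=0.0, v=0.0, t=1)
print(p_big, p_small)
```

Because each update is normalised by the running gradient magnitude, the effective step size is roughly the learning rate for every parameter, which is one reason a small value such as 1e-4 tends to work without much tuning.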

In [1]:
import tensorflow as tf
from tqdm import trange
from tensorflow.examples.tutorials.mnist import input_data

### Suppress warnings
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)

In [33]:
### YOUR CODE HERE ###

#### Convolutional layer

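def cnn2(x):
    """Return the logits of the 2-layer CNN for a batch of flattened 28x28 images."""
    # Reshape flat 784-vectors into NHWC image tensors: [batch, 28, 28, 1]
    x_cnn1 = tf.reshape(x, shape=[-1, 28, 28, 1])

    # First convolutional layer: 32 filters of size 5x5 over 1 input channel
    W1 = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
    b1 = tf.Variable(tf.zeros([32]))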

    # Apply convolutional layer
    conv1_preact = tf.nn.conv2d(x_cnn1, W1, strides=[1, 1, 1, 1], padding="SAME") + b1
    
    conv1 = tf.nn.relu(conv1_preact)

    max_pool_cnn1 = tf.nn.max_pool(conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

    W2 = tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev = 0.1))

    b2 = tf.Variable(tf.zeros([64]))
    
    conv2_preact = tf.nn.conv2d(max_pool_cnn1, W2, strides=[1, 1, 1, 1], padding="SAME") + b2
    
    conv2 = tf.nn.relu(conv2_preact)
    
    max_pool_cnn2 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
    
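    # Fully connected layer on the flattened 7x7x64 feature maps.
    # No stddev is passed here, so tf.truncated_normal uses its default of 1.0.
    W3 = tf.Variable(tf.truncated_normal([7*7*64, 256]))
    b3 = tf.Variable(tf.zeros([256]))
    x_fc = tf.reshape(max_pool_cnn2, [-1, 7*7*64])
    y_preact3 = tf.matmul(x_fc, W3) + b3
    y3 = tf.nn.relu(y_preact3)

    # Output layer: 10 logits (the softmax is applied inside the loss)
    W4 = tf.Variable(tf.truncated_normal([256, 10]))
    b4 = tf.Variable(tf.zeros([10]))
    y_preact4 = tf.matmul(y3, W4) + b4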
    
    return y_preact4


In [34]:
x = tf.placeholder(tf.float32, [None, 784])
y = cnn2(x)
y_ = tf.placeholder(tf.float32, [None, 10])

In [35]:
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=y))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Import data
mnist = input_data.read_data_sets("datasets/MNIST_data/", one_hot=True)

Extracting datasets/MNIST_data/train-images-idx3-ubyte.gz
Extracting datasets/MNIST_data/train-labels-idx1-ubyte.gz
Extracting datasets/MNIST_data/t10k-images-idx3-ubyte.gz
Extracting datasets/MNIST_data/t10k-labels-idx1-ubyte.gz
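The loss in the cell above is the mean softmax cross-entropy over a batch. As a rough illustration of what that computes, here is a plain-Python sketch; it is not how TensorFlow implements it, and the function names are hypothetical. The maximum logit is subtracted before exponentiating to avoid overflow.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between a one-hot label and softmax(logits) for one example."""
    shift = max(logits)  # subtract the max so exp() cannot overflow
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(one_hot_label, log_probs))

def mean_loss(batch_logits, batch_labels):
    """Average the per-example losses over the batch."""
    losses = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Usage: uniform logits over 10 classes give a loss of log(10)
label = [0.0] * 10
label[3] = 1.0
print(mean_loss([[0.0] * 10], [label]))
```

A network that outputs identical logits for all ten digits would therefore start at a loss of about log(10), which is a handy sanity check for the first few training steps.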


In [37]:
# Create a Session object, initialize all variables
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Train
for _ in trange(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run([cross_entropy,train_step], feed_dict={x: batch_xs, y_: batch_ys})

# Test trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Test accuracy: {0}'.format(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})))

sess.close()

100%|██████████| 1000/1000 [08:52<00:00,  1.88it/s]


Test accuracy: 0.9053000211715698
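The accuracy reported above is the fraction of test images whose largest logit falls on the same class as the one-hot label. A minimal plain-Python sketch of that calculation, with hypothetical helper names and toy data rather than the MNIST test set:

```python
def argmax(values):
    """Index of the largest entry (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy(batch_logits, batch_one_hot):
    """Fraction of rows where the predicted class equals the labelled class."""
    hits = sum(argmax(z) == argmax(y) for z, y in zip(batch_logits, batch_one_hot))
    return hits / len(batch_logits)

# Usage on a toy two-class batch: two of three predictions match the labels
toy_logits = [[0.1, 2.0], [3.0, -1.0], [0.0, 0.5]]
toy_labels = [[0, 1], [0, 1], [0, 1]]
print(accuracy(toy_logits, toy_labels))
```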


### Short answer

1\. How does the CNN compare in accuracy with yesterday's logistic regression and MLP models? How about training time?

`Training takes longer, but the accuracy is about the same.`

2\. How many trainable parameters are there in the CNN you built for this assignment?

*Note: By trainable parameters, I mean individual scalars. For example, a weight matrix that is 10x5 has 50.*

`W1: (5 * 5 * 1 + 1) * 32 = 832
 W2: (5 * 5 * 32 + 1) * 64 = 51,264
 F1: (7*7*64 + 1) * 256 = 803,072
 F2: (256 + 1) * 10 = 2,570
 Total: 857,738`
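As a cross-check, the count can be computed directly from the variable shapes defined in `cnn2`. This is a small sketch in plain Python; the shapes are copied by hand from the code above, and the names `CNN2_SHAPES` and `count_params` are made up for illustration. Note that each convolutional filter spans all of its input channels, so the second layer's filters have 5 * 5 * 32 weights each.

```python
from math import prod

# Variable shapes copied by hand from cnn2
CNN2_SHAPES = {
    "W1": (5, 5, 1, 32), "b1": (32,),
    "W2": (5, 5, 32, 64), "b2": (64,),
    "W3": (7 * 7 * 64, 256), "b3": (256,),
    "W4": (256, 10), "b4": (10,),
}

def count_params(shapes):
    """Number of scalars in each variable: the product of its dimensions."""
    return {name: prod(shape) for name, shape in shapes.items()}

counts = count_params(CNN2_SHAPES)
for name, n in counts.items():
    print(name, n)
print("total", sum(counts.values()))  # total 857738
```

Almost all of the parameters sit in the first fully connected layer (`W3`), not in the convolutional filters.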

3\. When would you use a CNN versus a logistic regression model or an MLP?

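`It depends on the difficulty of the task. If the image classification is light work, like MNIST, a logistic regression model or an MLP is enough; a CNN is worth its longer training time on harder image tasks.`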