[View in Colaboratory](https://colab.research.google.com/github/aminzabardast/Tensorflow-Tutorials/blob/dev/5_Convolutional_Neural_Network.ipynb)

# Convolutional Neural Networks With Tensor Flow

**Convolutional Neural Networks**, also known as **ConvNet**s or **CNN**s, are a type of Neural Networks that performe better in specific data types. These type of networks are specially better for image analysis.

This notebook is dedicated to creating CNNs using Tensoflow library and the specific example that will be studied is [MNIST database](http://yann.lecun.com/exdb/mnist/) for handwritten digit recognition.

## Loading Libraries and Data

The only package needed for this example is `Tensorflow`.  `MNIST` data is available as a tutorial example inside tensorflow package. 

In [1]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

By calling `input_data.read_data_sets('MNIST_data', one_hot=True)`, Tensorflow will download the data for you.

In [2]:
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.


## Function: Divide and Conquer

It is very good practice to approach your problem by dividing and conquering. Here, the problem at hand becomes easier by encapsulating each logical part into a function.

Creating Concolution, Max Pooling, and Fully Connected layers are handled by `conv2d`, `maxPool2x2`, and `fullyConnected` respectively. 

Accuracy of the mthod is calculated by `computeAccuracy`. To do this all of the test images will run through the network and the percentage of true results will be calcularted.

In [3]:
def conv2d(input_data, kernel_shape, activation_function=tf.nn.relu, name=None):
    weights = tf.Variable(tf.truncated_normal(kernel_shape, stddev=0.1))
    biases = tf.Variable(tf.constant(.1, shape=[kernel_shape[3]]))
    conv_result = tf.nn.conv2d(input_data, weights, strides=[1, 1, 1, 1], padding='SAME', name=name) + biases
    return activation_function(conv_result)

In [4]:
def maxPool2x2(input_data, name=None):
    return tf.nn.max_pool(input_data, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)

In [5]:
def fullyConnected(input_data, shape, activation_function=tf.nn.relu, name=None):
    weights = tf.Variable(tf.truncated_normal(shape, stddev=0.1))
    biases = tf.Variable(tf.zeros([1, shape[1]]) + 0.1)
    return activation_function(tf.matmul(input_data, weights) + biases)

In [6]:
def computeAccuracy(v_xs, v_ys):
    y_pre = sess.run(rfc2, feed_dict={xs: v_xs})
    correct_prediction = tf.equal(tf.argmax(y_pre,1), tf.argmax(v_ys,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
    return result

## Creating The Network and Training

The target structure in this example is not considered complicated. This is a simple CNN with two **Convolutional** layers, each followed by a **2 by 2 Max Pooling** layer. Then the result of layer will be flattened and go through Two **Fully Connected** layers. The power of CNN in Image Analysis is apparent here since this simple network can achieve $\%98$ accuracy in **MNIST** data set with simple **Stochastic Gradient Descent**.

![Graphical Representation Of Our Network](https://raw.githubusercontent.com/aminzabardast/Tensorflow-Tutorials/dev/figures/5_CNN_structure.png)

Images in **MNIST** datasset are all cropped to **28 by 28** size, which is the size of input layer. The input will be convolved by a **5 by 5** layer and takes the single chanel image to a 32 changel image. This means the input to layer two is a **28 by 28 by 32** image. This result will go through a 2 by 2 Max Pooling and becomes a **14 by 14 by 32** image. 

This **14 by 14 by 32** image will go through another convolution with **3 by 3 convolving kernel**. This results in a **14 by 14 by 64** image which will be converted to a **7 by 7 by 64** image after max pooling.

The intention behind this approach is to encode spacial data of the image into a smaller space with more channels. The information might seem scrambled afterwards but in reality this extracts the essence of the image so classification step can happen.

Additionally, the convolution operation makes the network invariant to spatial changes in the image. This means the network will not be mistaken if the digit is not centered exactly in the center, hence this is a spatial invarient approach.

Finally, **7 by 7 by 64** output will beflattened into a **3136** ($7 \times 7 \times 64$) array. This array goes through two fully connected network to be classified.

In [7]:
with tf.name_scope('Input'):
    # Intensity values are between 0 and 255.
    # This will be converted to between 0 and 1.
    xs = tf.placeholder(dtype=tf.float32, shape=[None, 784], name='Data')/255
    # MNIST data are storred in vectors and the need to be reshaped to 28*28 image.
    x_images = tf.reshape(xs,shape=[-1, 28, 28, 1], name='ReshapedData')
    
with tf.name_scope('Truth'):
    ys = tf.placeholder(dtype=tf.float32, shape=[None, 10], name='Truth')

with tf.name_scope('Conv1'):
    # 5x5 Kernel, Feature Map Input 1 (Gray scale image) and Output 32
    kernel1_shape = [5, 5, 1, 32]
    # Using kernel1 to calculate the convolved layer / Output size: 28 x 28 x 32
    rconv1 = conv2d(x_images, kernel1_shape)

with tf.name_scope('MaxPool1'):
    # Max polling the result of Conv1 layer / Output size: 14 x 14 x 32
    rpool1 = maxPool2x2(rconv1)

with tf.name_scope('Conv2'):
    # 3x3 Kernel, Feature Map Input 32 and Output 64
    kernel2_shape = [3, 3, 32, 64]
    # Using kernel2 to calculate the convolved layer / Output size: 14 x 14 x 64
    rconv2 = conv2d(rpool1, kernel2_shape)

with tf.name_scope('MaxPool2'):
    # Max polling the result of Conv2 layer / Output size: 7 x 7 x 64
    rpool2 = maxPool2x2(rconv2)

with tf.name_scope('FC1'):
    # Shape of the layer
    fc1_shape = [7*7*64, 1024]
    # Flattening From [n,7,7,64] to [n,3136]
    rpool2_flattened = tf.reshape(rpool2, shape=[-1, 7*7*64])
    rfc1 = fullyConnected(rpool2_flattened, fc1_shape)

with tf.name_scope('FC2'):
    # Shape of the layer
    fc2_shape = [1024, 10]
    rfc2 = fullyConnected(rfc1, fc2_shape, tf.nn.softmax)

with tf.name_scope('Loss'):
    # Cross Entropy as the loss function
    crossEntropy = tf.reduce_mean(-tf.reduce_sum(ys*tf.log(rfc2), reduction_indices=[1]))
    tf.summary.scalar(name='CrossEntropy', tensor=crossEntropy)

with tf.name_scope('Optimizer'):
    # Optmizing using Simple Gradient Descent
    trainStep = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(crossEntropy)

# Creating Session
sess = tf.Session()

# Initiating Variable
init = tf.global_variables_initializer()

# Merging all summaries
merged = tf.summary.merge_all()

# Creating File Writers
trainWriter = tf.summary.FileWriter(logdir='logs/train', graph=sess.graph)
testWriter = tf.summary.FileWriter(logdir='logs/test', graph=sess.graph)

# Initiating Session
sess.run(init)

# Training
for epoch in range(4000):
    # Training with 100 Images in one epoch
    trainBatchXs, trainBatchYs = mnist.train.next_batch(100)
    testBatchXs, testBatchYs = mnist.test.next_batch(100)
    
    # Forward and backward pass
    sess.run(trainStep, feed_dict={xs: trainBatchXs, ys: trainBatchYs})
    
    # Adding the state of network to logs
    trainWriter.add_summary(sess.run(merged, feed_dict={xs: trainBatchXs, ys: trainBatchYs}), epoch)
    testWriter.add_summary(sess.run(merged, feed_dict={xs: testBatchXs, ys: testBatchYs}), epoch)
    
    if (epoch+1) % 100 == 0 or epoch == 0:
        print('Epoch: {}'.format(epoch+1))

# Calculating final accuracy
accuracy = computeAccuracy(mnist.test.images, mnist.test.labels)*100
print('\nFinal Accuracy: %{0:.2f}'.format(accuracy))


Epoch: 1
Epoch: 100
Epoch: 200
Epoch: 300
Epoch: 400
Epoch: 500
Epoch: 600
Epoch: 700
Epoch: 800
Epoch: 900
Epoch: 1000
Epoch: 1100
Epoch: 1200
Epoch: 1300
Epoch: 1400
Epoch: 1500
Epoch: 1600
Epoch: 1700
Epoch: 1800
Epoch: 1900
Epoch: 2000
Epoch: 2100
Epoch: 2200
Epoch: 2300
Epoch: 2400
Epoch: 2500
Epoch: 2600
Epoch: 2700
Epoch: 2800
Epoch: 2900
Epoch: 3000
Epoch: 3100
Epoch: 3200
Epoch: 3300
Epoch: 3400
Epoch: 3500
Epoch: 3600
Epoch: 3700
Epoch: 3800
Epoch: 3900
Epoch: 4000

Final Accuracy: %98.19


## Downloading Log

In [8]:
!tar czvf logs.tar.gz logs

from google.colab import files
files.download('logs.tar.gz')

!rm -rvf logs*

logs/
logs/train/
logs/train/events.out.tfevents.1530643617.fd4b4cae378d
logs/test/
logs/test/events.out.tfevents.1530643617.fd4b4cae378d
removed 'logs/train/events.out.tfevents.1530643617.fd4b4cae378d'
removed directory 'logs/train'
removed 'logs/test/events.out.tfevents.1530643617.fd4b4cae378d'
removed directory 'logs/test'
removed directory 'logs'
removed 'logs.tar.gz'
