In [2]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
import time
import tensorflow.contrib.layers as layers
import os
from os.path import exists

In [3]:
%matplotlib inline

# To infinity and beyond
So far we have used low-level TensorFlow functions. Now we are going to learn what a convolutional neural network is.

**Neural networks** are essentially mathematical models for solving an optimization problem. They are made of neurons, the basic computational units of a neural network. A neuron takes an input (say x) and does some computation on it (say, multiplies it by a variable w and adds another variable b) to produce a value (z = wx + b). This value is passed to a non-linear function called the activation function (f) to produce the final output (the activation) of the neuron.
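
The neuron described above can be sketched in a few lines of NumPy. This is only an illustration: the names `neuron` and `relu` are made up for this example, and ReLU is just one possible choice for the activation function f.

```python
import numpy as np

def relu(z):
    # A common activation function: keep positive values, zero out the rest.
    return np.maximum(z, 0.0)

def neuron(x, w, b, activation=relu):
    # z = w*x + b, then the non-linear activation f(z).
    z = w * x + b
    return activation(z)

out_pos = neuron(x=2.0, w=0.5, b=1.0)   # z = 2.0, so the activation is 2.0
out_neg = neuron(x=2.0, w=-1.0, b=0.5)  # z = -1.5, so ReLU clips it to 0.0
print(out_pos, out_neg)
```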

**Layers**
If you stack neurons in a single line, it's called a layer, which is the next building block of neural networks.

**Layer types**
1. Fully connected layers - dot product with input tensor
2. Convolutional layers - apply convolution operation to input tensor
3. Pooling layers - reduce spatial dimensions of input tensor
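
To make the "dot product with input tensor" description concrete, here is a rough NumPy sketch of a fully connected layer. The function name and the shapes (784 inputs, 10 outputs, a batch of 4) are assumptions picked for illustration.

```python
import numpy as np

def fully_connected(x, W, b):
    # x: [batch, n_in], W: [n_in, n_out], b: [n_out]
    # Every output unit sees every input value (hence "fully connected").
    return np.dot(x, W) + b

x = np.ones((4, 784), dtype=np.float32)    # a batch of 4 flattened images
W = np.zeros((784, 10), dtype=np.float32)  # all-zero weights, for illustration only
b = np.arange(10, dtype=np.float32)
out = fully_connected(x, W, b)
print(out.shape)
```

With all-zero weights each output row is simply the bias vector, which is an easy way to check the shapes.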




# Let's take a closer look at layers
Fully connected layers follow this pattern:

![alt text]( http://cs231n.github.io/assets/nn1/neural_net2.jpeg "Fully connected network")


A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers. Every layer of a ConvNet transforms the 3D input volume to a 3D output volume of neuron activations. In this example, the red input layer holds the image, so its width and height would be the dimensions of the image, and the depth would be 3 (Red, Green, Blue channels).

![alt text]( http://cs231n.github.io/assets/cnn/cnn.jpeg "Convolutional network")


# CNN neuron connectivity pattern
This example shows an input volume in red (e.g. a 32x32x3 CIFAR-10 image) and an example volume of neurons in the first convolutional layer. Each neuron in the convolutional layer is connected only to a local region of the input volume spatially, but to the full depth (i.e. all color channels).

![alt text](http://cs231n.github.io/assets/cnn/depthcol.jpeg "Neuron example")

For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
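
The arithmetic above can be checked with a short NumPy sketch. It computes the output of one hypothetical neuron that looks at a 5x5x3 patch of a random 32x32x3 "image"; the helper name `neuron_output` and the random data are assumptions made for this example.

```python
import numpy as np

rng = np.random.RandomState(0)
image = rng.rand(32, 32, 3)   # stand-in for an RGB CIFAR-10 image
weights = rng.rand(5, 5, 3)   # one filter: 5x5 receptive field, full depth 3
bias = 0.1

def neuron_output(image, weights, bias, row, col):
    # The neuron sees only a local 5x5 window, but all 3 channels of it.
    kh, kw, _ = weights.shape
    patch = image[row:row + kh, col:col + kw, :]
    return np.sum(patch * weights) + bias

n_weights = weights.size   # 5 * 5 * 3
n_params = n_weights + 1   # plus one bias
value = neuron_output(image, weights, bias, row=0, col=0)
print(n_weights, n_params)
```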

# Pooling layers
A pooling layer reduces the spatial dimensions of a tensor. Max pooling takes the maximum value from each pooling region. Mean pooling averages each pooling region.

![alt text](https://i2.wp.com/cv-tricks.com/wp-content/uploads/2017/02/maxpool.jpg?resize=300%2C140 "Pooling example")
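
As a small illustration, both kinds of pooling can be written in NumPy for the common case of non-overlapping 2x2 regions. The helper `pool2x2` is a made-up name, and it assumes a single-channel input with even height and width.

```python
import numpy as np

def pool2x2(x, reduce_fn):
    # Split the [h, w] input into non-overlapping 2x2 blocks,
    # then reduce each block to a single value.
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [9, 2, 1, 0],
              [3, 4, 5, 6]], dtype=np.float32)

max_pooled = pool2x2(x, np.max)    # [[7, 8], [9, 6]]
mean_pooled = pool2x2(x, np.mean)  # [[4, 5], [4.5, 3]]
print(max_pooled)
print(mean_pooled)
```

Either way the 4x4 input shrinks to 2x2, which is the "reduce spatial dimensions" effect.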


# CNN architectures 

A CNN architecture tries to answer the question of how to assemble layers into a model. Basically we do this:

**INPUT -> [[CONV -> NONL]\*N -> POOL]\*M -> [FC->NONL]\*K -> FC**
Where N,M,K are some constants.
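
To see what the pattern expands to for concrete constants, here is a tiny helper (hypothetical, written only for this example) that unrolls it into a list of layer names. ReLU is assumed as the non-linearity.

```python
def expand_pattern(n, m, k, nonlinearity='RELU'):
    # INPUT -> [[CONV -> NONL]*N -> POOL]*M -> [FC -> NONL]*K -> FC
    layer_names = ['INPUT']
    for _ in range(m):
        for _ in range(n):
            layer_names += ['CONV', nonlinearity]
        layer_names.append('POOL')
    for _ in range(k):
        layer_names += ['FC', nonlinearity]
    layer_names.append('FC')
    return layer_names

pattern = expand_pattern(n=1, m=2, k=1)
print(' -> '.join(pattern))
```

With N=1, M=2, K=1 this gives two conv/pool stages followed by two fully connected layers, a common shape for a small MNIST model.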

**Popular architectures**
1. GoogLeNet aka Inception(v1, v2, v3, v4)
2. VGG
3. ResNet
4. and many more

Look at my presentation if you want an overview of popular CNN architectures: https://youtu.be/H-iHcz89M5U?t=32

If you want a good course on CNNs, look at the Stanford class CS231n: Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/

# Let's take a closer look at convolutions in TensorFlow

**Low level interface**:
```python
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)
tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)
tf.matmul(X, w) + b  # fully connected layer
```
This requires you to define the weight variables manually, so you need to know what you are doing :D. It is not well suited to rapid prototyping.


**High level interface**:
```python
tf.contrib.layers
tf.layers 
```
The developers are moving code from contrib into core, so there is some duplication. TensorFlow is changing rapidly, so be ready to change your code if you update TensorFlow.

```python
tf.contrib.layers.conv2d(*args, **kwargs)

* inputs: a Tensor of rank N+2 of shape [batch_size] + input_spatial_shape + [in_channels]
* num_outputs: integer, the number of output filters.
* kernel_size: a sequence of N positive integers specifying the spatial dimensions of the filters.
* stride: a sequence of N positive integers specifying the stride at which to compute output.
* padding: one of "VALID" or "SAME".
* activation_fn: activation function, set to None to skip it and maintain a linear activation.

tf.contrib.layers.fully_connected(*args, **kwargs)

* inputs: A tensor with at least rank 2 and a static value for the last dimension, i.e. [batch_size, depth]
* num_outputs: Integer or long, the number of output units in the layer.
* activation_fn: activation function, set to None to skip it and maintain a linear activation.

tf.contrib.layers.max_pool2d(*args, **kwargs)

* inputs: A 4-D tensor of shape [batch_size, height, width, channels] 
* kernel_size: A list of length 2: [kernel_height, kernel_width] of the pooling kernel over which the op is computed. 
* stride: A list of length 2: [stride_height, stride_width]. Can be an int if both strides are the same. Note that presently both strides must have the same value.
* padding: The padding method, either 'VALID' or 'SAME'.

```
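
The `padding` argument decides the spatial size of the output. Below is a small sketch of the usual TensorFlow size rules for "SAME" and "VALID"; the helper name `conv_output_size` is made up for this example. It then traces a 28x28 MNIST image through two 3x3 "SAME" convolutions and two 2x2 poolings with stride 2, assuming the pooling layers use "VALID" padding (which should be the default for `max_pool2d`).

```python
import math

def conv_output_size(input_size, kernel_size, stride, padding):
    # Size rule along one spatial dimension.
    if padding == 'SAME':
        return int(math.ceil(float(input_size) / stride))
    if padding == 'VALID':
        return int(math.ceil(float(input_size - kernel_size + 1) / stride))
    raise ValueError('padding must be "SAME" or "VALID"')

sizes = [28]
sizes.append(conv_output_size(sizes[-1], kernel_size=3, stride=1, padding='SAME'))   # conv1
sizes.append(conv_output_size(sizes[-1], kernel_size=2, stride=2, padding='VALID'))  # pool1
sizes.append(conv_output_size(sizes[-1], kernel_size=3, stride=1, padding='SAME'))   # conv2
sizes.append(conv_output_size(sizes[-1], kernel_size=2, stride=2, padding='VALID'))  # pool2
print(sizes)
```

So "SAME" convolutions with stride 1 keep the image size, and each 2x2 pooling halves it: 28 -> 14 -> 7.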


# Let's build our CNN model

In [4]:
tf.reset_default_graph()
def build_models(input):
    #Network pattern
    #INPUT -> [[CONV -> RELU] -> POOL]*2 -> [FC->RELU] -> FC
    
    # By default TensorFlow expects image tensors in NHWC (b01c) order: [batch, height, width, channels]
    with tf.variable_scope('cnn_model'):
        images = tf.reshape(input, shape=[-1, 28, 28, 1])
        net = layers.conv2d(images, 32, [3,3], padding='same', activation_fn=tf.nn.relu, scope='conv1')
        net = layers.max_pool2d(net, kernel_size = [2,2], stride=[2,2], scope='pool1')
        net = layers.conv2d(net, 64, [3,3], padding='same', activation_fn=tf.nn.relu, scope='conv2')
        net = layers.max_pool2d(net, kernel_size = [2,2], stride=[2,2], scope='pool2') #shape is[?,7,7,64]
        # A fully connected layer needs a 2-D tensor, so flatten [?,7,7,64] to [?,3136]
        net = layers.flatten(net)
        net = layers.fully_connected(net, 512, activation_fn=tf.nn.relu, scope='fc1')
        logits = layers.fully_connected(net, 10, activation_fn=None, scope='fc2')
        return logits

    
# Let's see what our model looks like
X = tf.placeholder(tf.float32, [None, 784], name="X_placeholder")
net = build_models(X)

gpu_opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_opts)) as sess:
    sess.run(tf.global_variables_initializer())
    for i in tf.global_variables():
        print('{} {}'.format(i.op.name, tf.reduce_mean(i).eval()))



cnn_model/conv1/weights 0.00079708
cnn_model/conv1/biases 0.0
cnn_model/conv2/weights -0.000188029
cnn_model/conv2/biases 0.0
cnn_model/fc1/weights 1.8262e-06
cnn_model/fc1/biases 0.0
cnn_model/fc2/weights -0.000754143
cnn_model/fc2/biases 0.0
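
The variables listed above are the only trainable parameters of the model. As a rough check, their sizes can be worked out by hand from the layer definitions. The helpers below are hypothetical and only do the arithmetic; they assume the flattened pool2 output has 7 * 7 * 64 values, as the shape comment in the model says.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    # (number of weights, number of biases) for one convolutional layer
    return kernel_h * kernel_w * in_channels * out_channels, out_channels

def fc_params(n_in, n_out):
    # (number of weights, number of biases) for one fully connected layer
    return n_in * n_out, n_out

counts = {
    'conv1': conv_params(3, 3, 1, 32),
    'conv2': conv_params(3, 3, 32, 64),
    'fc1': fc_params(7 * 7 * 64, 512),
    'fc2': fc_params(512, 10),
}
total = sum(w + b for w, b in counts.values())

for name in sorted(counts):
    print(name, counts[name])
print('total', total)
```

Almost all of the parameters sit in `fc1`, which is typical for this kind of small network.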


In [5]:
# Let's build a different model, this time with dropout
tf.reset_default_graph()
def build_model_dropout(input, is_training=True):
    # keep probability: layers.dropout keeps each fc1 activation with probability 0.75
    DROPOUT = 0.75
    # By default TensorFlow expects image tensors in NHWC (b01c) order: [batch, height, width, channels]
    with tf.variable_scope('cnn_model'):
        images = tf.reshape(input, shape=[-1, 28, 28, 1])
        net = layers.conv2d(images, 32, [3,3], padding='same', activation_fn=tf.nn.relu, scope='conv1')
        net = layers.max_pool2d(net, kernel_size = [2,2], stride=[2,2], scope='pool1')
        net = layers.conv2d(net, 64, [3,3], padding='same', activation_fn=tf.nn.relu, scope='conv2')
        net = layers.max_pool2d(net, kernel_size = [2,2], stride=[2,2], scope='pool2') #shape is[?,7,7,64]
        # A fully connected layer needs a 2-D tensor, so flatten [?,7,7,64] to [?,3136]
        net = layers.flatten(net)
        net = layers.fully_connected(net, 512, activation_fn=tf.nn.relu, scope='fc1')
        net = layers.dropout(net, DROPOUT,is_training=is_training)
        logits = layers.fully_connected(net, 10, activation_fn=None, scope='fc2')
        return logits
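
To show what the dropout layer does with its keep probability and the `is_training` flag, here is a NumPy sketch of "inverted" dropout. It is an illustration under assumptions, not TensorFlow's actual code: the function name and the explicit `rng` argument are made up for this example.

```python
import numpy as np

def dropout(x, keep_prob, is_training, rng):
    # At test time dropout does nothing.
    if not is_training:
        return x
    # At training time each value is kept with probability keep_prob and
    # scaled by 1/keep_prob, so the expected value stays the same.
    mask = rng.uniform(size=x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.RandomState(0)
activations = np.ones((2, 8))
train_out = dropout(activations, keep_prob=0.75, is_training=True, rng=rng)
test_out = dropout(activations, keep_prob=0.75, is_training=False, rng=rng)
print(train_out)
print(test_out)
```

This is why the model needs to know whether it is training: the same layer must randomly drop values in one mode and pass everything through in the other.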

# Now we need to write a new training script
In your spare time you can think about what other architectures you could try.
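
The training script below uses a softmax cross-entropy loss and an argmax-based accuracy. As a rough NumPy sketch of what those two quantities compute (the function names and the toy numbers are assumptions for illustration, and the loss is averaged over the batch):

```python
import numpy as np

def softmax_cross_entropy(labels, logits):
    # Numerically stable log-softmax, then the mean cross-entropy over the batch.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.sum(labels * log_probs, axis=1))

def accuracy(labels, logits):
    # Fraction of rows where the largest logit matches the one-hot label.
    return np.mean(np.argmax(logits, axis=1) == np.argmax(labels, axis=1))

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 2.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])

loss_value = softmax_cross_entropy(labels, logits)
acc_value = accuracy(labels, logits)
print(loss_value, acc_value)
```

Two of the three toy predictions are correct here. A model that outputs identical logits for every class would get a loss of log(number of classes), which is a handy sanity check for the first training steps.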

In [6]:
tf.reset_default_graph()
# Step 1: Get data
mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

# Step 2: Define parameters for the model
LEARNING_RATE = 0.01
BATCH_SIZE = 128
SKIP_STEP = 100
N_EPOCHS = 10

# Step 3: create placeholders for features and labels
# each image in the MNIST data is of shape 28*28 = 784
# therefore, each image is represented with a 1x784 tensor
# The dropout model behaves differently in training and testing, so we also
# need a boolean placeholder that tells it which mode we are in
# Use None for the batch dimension so we can change the batch size once we've built the graph
with tf.name_scope('data'):
    X = tf.placeholder(tf.float32, [None, 784], name="X_placeholder")
    y = tf.placeholder(tf.float32, [None, 10], name="Y_placeholder")
    
    is_training = tf.placeholder(tf.bool, [], name="Train_flag")
    
global_step = tf.Variable(0, dtype=tf.int32, trainable=False, name='global_step')

# Step 4: define model
# Our model is conv -> relu -> pool -> conv -> relu -> pool -> fully connected -> relu -> fully connected
# (the softmax itself is applied inside the loss function)
logits = build_models(X)
# logits = build_model_dropout(X, is_training)

# Step 5: define loss function
# use softmax cross entropy with logits as the loss function from tf.losses
with tf.name_scope('loss'):
    tf.losses.softmax_cross_entropy(y, logits=logits)
    loss = tf.losses.get_total_loss(add_regularization_losses=False)

# Step 6: define training op
# using gradient descent with learning rate of LEARNING_RATE to minimize the loss
optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss, 
                                        global_step=global_step)
# Step 7: define test op
with tf.name_scope('test_op'):
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Step 8: create the directory where checkpoint files are stored
# (makedirs also creates missing parent directories, which mkdir does not)
save_dir = './MNIST_data/test/cnn/'
if not exists(save_dir):
    os.makedirs(save_dir)
    
gpu_opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_opts)) as sess:
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    
    ckpt = tf.train.get_checkpoint_state(save_dir)
    # if that checkpoint exists, restore from checkpoint
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
    
    initial_step = global_step.eval()
    
    start_time = time.time()
    n_batches = int(mnist.train.num_examples / BATCH_SIZE)
    print("INFO:  n_batches per epoch: {} initial_step: {}".format(n_batches, initial_step))
    total_loss = 0.0
    for index in range(initial_step, n_batches * N_EPOCHS): # train the model n_epochs times
        X_batch, Y_batch = mnist.train.next_batch(BATCH_SIZE)
        _, loss_batch = sess.run([optimizer, loss], feed_dict={X: X_batch, y:Y_batch}) 
#         _, loss_batch = sess.run([optimizer, loss], feed_dict={X: X_batch, y:Y_batch, is_training:True}) 
        total_loss += loss_batch
        if (index + 1) % SKIP_STEP == 0:
            print('Average loss at step {}: {:5.3f}'.format(index + 1, total_loss / SKIP_STEP))
            total_loss = 0.0
            saver.save(sess, save_dir+'cnn_model', index)
    
    print("Optimization Finished!") # should be around 0.35 after 25 epochs
    print("Total time: {0} seconds".format(time.time() - start_time))
    
    # test the model
    
#     acc =  sess.run(accuracy, feed_dict={X: mnist.test.images, y: mnist.test.labels, is_training:False})
    acc =  sess.run(accuracy, feed_dict={X: mnist.test.images, y: mnist.test.labels})
    print('Accuracy {0}'.format(acc))
    
    # Let's list all checkpoint files
    ckpt = tf.train.get_checkpoint_state(save_dir)
    for saved_model in ckpt.all_model_checkpoint_paths:
        print(saved_model)
        

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
INFO:  n_batches per epoch: 429 initial_step: 4200
Optimization Finished!
Total time: 0.838398933411 seconds
Accuracy 0.977400124073
./MNIST_data/test/cnn/cnn_model-3799
./MNIST_data/test/cnn/cnn_model-3899
./MNIST_data/test/cnn/cnn_model-3999
./MNIST_data/test/cnn/cnn_model-4099
./MNIST_data/test/cnn/cnn_model-4199
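
The output above appears to come from a run that resumed from an existing checkpoint (`initial_step: 4200`), which would explain why it finishes in under a second and prints no "Average loss" lines. The step arithmetic can be checked in plain Python. The value 55000 for the number of training examples is an assumption (the usual size of the MNIST training split from `read_data_sets`); it is consistent with the 429 batches per epoch reported above.

```python
NUM_TRAIN_EXAMPLES = 55000  # assumption: default MNIST training split size
BATCH_SIZE = 128
N_EPOCHS = 10
SKIP_STEP = 100
initial_step = 4200         # taken from the INFO line in the output above

n_batches = NUM_TRAIN_EXAMPLES // BATCH_SIZE
total_steps = n_batches * N_EPOCHS
remaining_steps = total_steps - initial_step

# Steps at which the loop would print the average loss and save a checkpoint.
logged_steps = [index + 1 for index in range(initial_step, total_steps)
                if (index + 1) % SKIP_STEP == 0]

print(n_batches, total_steps, remaining_steps, logged_steps)
```

Only 90 steps are left after restoring, and none of them lands on a multiple of `SKIP_STEP`, so the newest checkpoint stays `cnn_model-4199`. To train from scratch again, delete the checkpoint directory first.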


# So boys and girls, now you really know kung fu :D