# Convolutional Neural Networks

## Setup

First, upgrade TensorFlow to v0.6.0 (the wheel below is the Mac, Python 2 build):
``` bash
pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py2-none-any.whl
```

## DIY Demo

<img src="tf/highres_444712191.jpeg">

In [None]:
!python tf/classify_image.py --image_file tf/highres_444712191.jpeg

# How does it work?
The figures and notation below follow the CS231n course notes: http://cs231n.github.io

#### Overview

##### Neural net:

<img src="http://cs231n.github.io/assets/nn1/neural_net2.jpeg">

##### ConvNet:

<img src="http://cs231n.github.io/assets/cnn/cnn.jpeg">



# A typical ConvNet

**Basic:**

INPUT -> [[CONV -> RELU] -> POOL] -> FC


### 1) INPUT: 
Raw pixel values of the image, e.g. 32 x 32 x 3 (width x height x color channels).

### 2) CONV: 
Convolution of the input with the learned weights (filters). Increases the depth: the output depth equals the number of filters, e.g. 32 x 32 x 12 for 12 filters.

### 3) RELU (rectified linear unit):  <img src="https://upload.wikimedia.org/math/9/d/b/9db6867e7cad45ba0853963a952a0fbc.png">
Elementwise activation function, max(0, x). Doesn't change the size.
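
As a minimal sketch of what "elementwise" means here, the plain-Python snippet below applies max(0, x) to each value of a flat list. The helper name `relu` and the sample values are illustrative only and are not part of the notebook.

```python
def relu(values):
    """Apply max(0, v) to every element; the output has the same length as the input."""
    return [max(0.0, v) for v in values]


activations = relu([-2.0, -0.5, 0.0, 1.5, 3.0])
print(activations)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Negative inputs are clamped to zero and positive inputs pass through unchanged, so the shape of the volume is preserved.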

### 4) POOL:
Downsampling along the spatial dimensions (width and height), e.g. 16 x 16 x 12. The depth is unchanged.

### 5) FC: 
Fully-connected layer. Computes the class scores and shrinks the volume into 1 x 1 x N, where N is the number of outputs (e.g. 1024 for a hidden FC layer, or the number of classes for the final one).
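
The five steps above can be traced as plain shape arithmetic. This is only a sketch: it assumes a stride-1 convolution whose padding preserves width and height, a 2 x 2 pooling window, and a final FC layer with 10 outputs. The helper names and the value 10 are assumptions made for illustration.

```python
def conv_shape(shape, n_filters):
    # Assumes stride 1 and padding that keeps width and height unchanged.
    height, width, _ = shape
    return (height, width, n_filters)


def pool_shape(shape, window=2):
    # Downsample width and height only; depth is untouched.
    height, width, depth = shape
    return (height // window, width // window, depth)


def fc_shape(shape, n_outputs):
    # A fully-connected layer collapses the whole volume into 1 x 1 x n_outputs.
    return (1, 1, n_outputs)


shape = (32, 32, 3)
trace = [("INPUT", shape)]
steps = [
    ("CONV", lambda s: conv_shape(s, n_filters=12)),
    ("RELU", lambda s: s),  # elementwise, so the shape is unchanged
    ("POOL", pool_shape),
    ("FC", lambda s: fc_shape(s, n_outputs=10)),  # 10 is an assumed class count
]
for name, layer in steps:
    shape = layer(shape)
    trace.append((name, shape))

for name, layer_shape in trace:
    print("%-5s %s" % (name, " x ".join(str(d) for d in layer_shape)))
```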

**Deep:**

INPUT -> [[CONV -> RELU] x N -> POOL?] x M -> [FC -> RELU] x K -> FC 



## Convolutional Layer:
A set of learnable filters. Each filter is spatially small (in width and height) but extends through the full depth of the input volume.
 
 
 
<img src="http://cs231n.github.io/assets/cnn/depthcol.jpeg">

<img src="http://cs231n.github.io/assets/nn1/neuron_model.jpeg">



#### Spatial arrangement:
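**Depth:** the number of filters, i.e. the number of neurons in the CONV layer that connect to the same region of the input volume.

**Stride:** the spatial shift, in pixels, between successive positions of the filter.

**Zero-padding:** pad the border of the input with zeros, which gives control over the spatial size of the output.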


**Output size:** **(W − F + 2P)/S + 1**, where:
* **W** is the input volume size
* **F** is the Conv Layer filter size
* **S** is the stride
* **P** is the amount of zero padding

***EXAMPLE***

<img src="http://cs231n.github.io/assets/cnn/stride.jpeg">



In the example above, W = 5, F = 3, P = 1 and S = 1, so the output size is (5 − 3 + 2)/1 + 1 = 5.
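
The output-size formula is easy to check in code. The sketch below is a direct transcription of (W − F + 2P)/S + 1. The function name `conv_output_size` and the choice to raise an error when the stride does not divide evenly are assumptions for illustration, not something the notebook prescribes.

```python
def conv_output_size(w, f, p, s):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    span = w - f + 2 * p
    if span % s != 0:
        # The filter positions would not tile the padded input evenly.
        raise ValueError("stride %d does not fit input %d with filter %d, padding %d"
                         % (s, w, f, p))
    return span // s + 1


print(conv_output_size(w=5, f=3, p=1, s=1))  # 5
print(conv_output_size(w=5, f=3, p=1, s=2))  # 3
```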



#### How it works

<img src="tf/conv.png">
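
To make the sliding-window computation concrete, here is a naive single-channel sketch in plain Python. It assumes a square image and a square filter, and it leaves out the bias and the sum over input depth that a real CONV layer performs. The name `conv2d_single` and the sample values are hypothetical.

```python
def conv2d_single(image, kernel, stride=1, pad=0):
    """Slide a square kernel over a square, zero-padded image (one channel)."""
    w, f = len(image), len(kernel)
    size = w + 2 * pad
    # Build the zero-padded copy of the input.
    padded = [[0.0] * size for _ in range(size)]
    for r in range(w):
        for c in range(w):
            padded[r + pad][c + pad] = image[r][c]
    out_size = (w - f + 2 * pad) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            # Dot product between the kernel and the patch under it.
            for a in range(f):
                for b in range(f):
                    total += padded[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_single(image, kernel)
print(result)  # [[-4.0, -4.0], [-4.0, -4.0]]
```

Each output value is the dot product of the filter with one patch of the input, which is why the output size follows the (W − F + 2P)/S + 1 rule.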

## Pooling Layer

<img src="http://cs231n.github.io/assets/cnn/maxpool.jpeg">

<img src="http://cs231n.github.io/assets/cnn/pool.jpeg">
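
A minimal sketch of 2 x 2 max pooling with stride 2 on a single depth slice, written in plain Python. The name `pool_max_2x2` and the sample grid are illustrative. The sketch assumes even width and height, and any trailing odd row or column would simply be dropped.

```python
def pool_max_2x2(image):
    """Keep the largest value in each non-overlapping 2 x 2 block."""
    out = []
    for r in range(0, len(image) - 1, 2):
        row = []
        for c in range(0, len(image[0]) - 1, 2):
            block = (image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1])
            row.append(max(block))
        out.append(row)
    return out


grid = [[1, 1, 2, 4],
        [5, 6, 7, 8],
        [3, 2, 1, 0],
        [1, 2, 3, 4]]
pooled = pool_max_2x2(grid)
print(pooled)  # [[6, 8], [3, 4]]
```

Width and height are halved while every depth slice is pooled independently, so the depth of the volume does not change.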

## Fully Connected Layer

<img src="http://image.slidesharecdn.com/cikm-keynote-nov2014-141125182455-conversion-gate01/95/large-scale-deep-learning-jeff-dean-39-638.jpg">
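
In a fully connected layer every output neuron sees every input value. The sketch below shows this as a weighted sum plus a bias per output, in plain Python. The function name, weights, and inputs are made-up illustrative values.

```python
def fully_connected(inputs, weights, biases):
    """weights[j] holds the weights feeding output neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


flat = [1.0, 2.0, 3.0]            # a flattened input volume
weights = [[0.5, 0.0, -0.5],      # output neuron 0
           [1.0, 1.0, 1.0]]       # output neuron 1
biases = [0.0, -1.0]
scores = fully_connected(flat, weights, biases)
print(scores)  # [-1.0, 5.0]
```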

## ConvNet Architectures


**INPUT -> [[CONV -> RELU] x N -> POOL?] x M -> [FC -> RELU] x K -> FC**

where N >= 0 (and usually N <= 3), M >= 0, and K >= 0 (and usually K < 3).

*Common patterns*:

* INPUT -> FC: a linear classifier (N = M = K = 0).
* INPUT -> CONV -> RELU -> FC
* INPUT -> [CONV -> RELU -> POOL] x 2 -> FC -> RELU -> FC. **We will build this.**
* INPUT -> [CONV -> RELU -> CONV -> RELU -> POOL] x 3 -> [FC -> RELU] x 2 -> FC
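
The general pattern can be expanded mechanically from N, M and K. The helper below is a hypothetical sketch (`convnet_pattern` and its `pool` flag, which stands in for the optional "POOL?", are not from the notebook); it just spells out the layer sequence as a string.

```python
def convnet_pattern(n, m, k, pool=True):
    """Expand INPUT -> [[CONV -> RELU] x N -> POOL?] x M -> [FC -> RELU] x K -> FC."""
    layers = ["INPUT"]
    for _ in range(m):
        layers += ["CONV", "RELU"] * n
        if pool:
            layers.append("POOL")
    layers += ["FC", "RELU"] * k
    layers.append("FC")
    return " -> ".join(layers)


print(convnet_pattern(n=0, m=0, k=0))
# INPUT -> FC
print(convnet_pattern(n=1, m=2, k=1))
# INPUT -> CONV -> RELU -> POOL -> CONV -> RELU -> POOL -> FC -> RELU -> FC
```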

# Let's do it ourselves

# Recap: A Simple Logistic Regression

In [5]:
# Load MNIST Data
import tensorflow as tf


from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [6]:
# Start an interactive session

import tensorflow as tf
sess = tf.InteractiveSession()


In [7]:
# placeholders: x holds flattened 28x28 images, y_ holds one-hot labels

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

In [8]:
# variables

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

# initialize
sess.run(tf.initialize_all_variables())


In [9]:
# predicted class probabilities

y = tf.nn.softmax(tf.matmul(x,W) + b)


# cost function

cross_entropy = -tf.reduce_sum(y_*tf.log(y))
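
For intuition, the softmax and cross-entropy expressions above can be reproduced for a single example in plain Python. This is a sketch under assumptions: the helper names and logits are illustrative, and subtracting the maximum logit before exponentiating is a common numerical-stability choice that the TensorFlow cell above does not show.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # stability trick; does not change the result
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    """-sum(y_ * log(y)), the same expression as in the cell above."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy([1.0, 0.0, 0.0], probs)
print(probs)
print(loss)
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the correct class, so it shrinks as that probability approaches 1.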


In [10]:
# train the model

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

for i in range(1000):
  batch = mnist.train.next_batch(50)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

In [11]:
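# evaluate: fraction of test images whose predicted class matches the label

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))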


# Multilayer Convolutional Network

In [12]:
# initialize weights

def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
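
`tf.truncated_normal` draws from a normal distribution but re-draws any value that falls more than two standard deviations from the mean. The plain-Python sketch below imitates that behaviour by rejection sampling. The function signature, including the `seed` argument, is an assumption made for illustration and is not the TensorFlow API.

```python
import random


def truncated_normal(n, stddev=0.1, mean=0.0, seed=None):
    """Sample n values, rejecting draws further than 2 * stddev from the mean."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        v = rng.gauss(mean, stddev)
        if abs(v - mean) <= 2 * stddev:
            values.append(v)
    return values


weights = truncated_normal(1000, stddev=0.1, seed=0)
print(max(abs(w) for w in weights) <= 0.2)  # True
```

Small random weights break the symmetry between neurons, which all-zero weights would not do in a multilayer network.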


In [13]:
# Convolution and Pooling

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')


In [14]:
# first convolutional layer

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

"""
5x5 patch
32 features
"""

'\n5x5 patch\n32 features\n'

In [15]:
# reshape x to a 4D tensor: [batch, height, width, channels]

x_image = tf.reshape(x, [-1,28,28,1])


In [16]:
# convolve x_image with the weights, add the bias, apply ReLU, then max-pool

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)


In [17]:
# Second Layer

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# 64 features for each 5x5 patch

In [18]:
# fully connected layer: the two 2x2 poolings reduce 28x28 to 7x7, with 64 feature maps

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
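
The `7 * 7 * 64` in the flattening step follows from the layer sizes. With `padding='SAME'`, the spatial output size is the input size divided by the stride, rounded up. The sketch below traces that arithmetic; the helper name `same_output_size` is hypothetical.

```python
import math


def same_output_size(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


size = 28
size = same_output_size(size, stride=1)  # conv1 keeps 28
size = same_output_size(size, stride=2)  # pool1 halves to 14
size = same_output_size(size, stride=1)  # conv2 keeps 14
size = same_output_size(size, stride=2)  # pool2 halves to 7
flat_length = size * size * 64           # 64 feature maps from the second conv layer
print(size)         # 7
print(flat_length)  # 3136
```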



In [19]:
# dropout layer

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
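
Dropout keeps each activation with probability `keep_prob` and zeroes the rest; `tf.nn.dropout` also scales the kept values by `1 / keep_prob` so the expected sum is unchanged. The following plain-Python sketch imitates that behaviour. The function signature and the explicit `rng` argument are assumptions for illustration.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1 / keep_prob; zero the rest."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10
dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(dropped)

# With keep_prob = 1.0 nothing is dropped, which is what evaluation uses.
print(dropout(activations, keep_prob=1.0, rng=rng) == activations)  # True
```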


In [20]:
# readout layer: softmax regression

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)


In [None]:
# train and evaluate

cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess.run(tf.initialize_all_variables())

for i in range(1000):
  
  batch = mnist.train.next_batch(50)
  
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
    
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

In [22]:
print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

test accuracy 0.9631
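
The accuracy figure is the fraction of examples whose highest-scoring class matches the label. Here is a plain-Python sketch of the `argmax` / `equal` / `reduce_mean` chain used above, on made-up predictions. The helper names and the data are illustrative only.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Mean of the 0/1 indicator 'predicted class equals true class'."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / float(len(correct))


predictions = [[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1],
               [0.3, 0.3, 0.4],
               [0.2, 0.5, 0.3]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 1, 0]]
print(accuracy(predictions, labels))  # 0.75
```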


# SKFlow
TensorFlow with a scikit-learn flavor!


``` bash
pip install git+git://github.com/tensorflow/skflow.git
```

In [33]:

import random
from sklearn import datasets, cross_validation, metrics

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

import skflow

### Download and load MNIST data.

mnist = input_data.read_data_sets('MNIST_data')



Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [36]:
### Linear classifier.

classifier = skflow.TensorFlowLinearClassifier(
    n_classes=10, batch_size=100, steps=1000, learning_rate=0.1)

classifier.fit(mnist.train.images, mnist.train.labels)

score = metrics.accuracy_score(mnist.test.labels, classifier.predict(mnist.test.images))

print('Accuracy: {0:f}'.format(score))



Step #1, avg. loss: 2.62986
Step #101, avg. loss: 0.97821
Step #201, avg. loss: 0.54470
Step #301, avg. loss: 0.47226
Step #401, avg. loss: 0.43006
Step #501, avg. loss: 0.40804
Step #601, epoch #1, avg. loss: 0.38521
Step #701, epoch #1, avg. loss: 0.37848
Step #801, epoch #1, avg. loss: 0.37203
Step #901, epoch #1, avg. loss: 0.36297
Accuracy: 0.912300


In [None]:
### Convolutional network

def max_pool_2x2(tensor_in):
    return tf.nn.max_pool(tensor_in, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
        padding='SAME')

def conv_model(X, y):
    X = tf.reshape(X, [-1, 28, 28, 1])
    with tf.variable_scope('conv_layer1'):
        h_conv1 = skflow.ops.conv2d(X, n_filters=32, filter_shape=[5, 5], 
                                    bias=True, activation=tf.nn.relu)
        h_pool1 = max_pool_2x2(h_conv1)
    with tf.variable_scope('conv_layer2'):
        h_conv2 = skflow.ops.conv2d(h_pool1, n_filters=64, filter_shape=[5, 5], 
                                    bias=True, activation=tf.nn.relu)
        h_pool2 = max_pool_2x2(h_conv2)
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = skflow.ops.dnn(h_pool2_flat, [1024], activation=tf.nn.relu, keep_prob=0.5)
    return skflow.models.logistic_regression(h_fc1, y)

classifier = skflow.TensorFlowEstimator(
    model_fn=conv_model, n_classes=10, batch_size=100, steps=20000,
    learning_rate=0.001)

classifier.fit(mnist.train.images, mnist.train.labels)

score = metrics.accuracy_score(mnist.test.labels, classifier.predict(mnist.test.images))

print('Accuracy: {0:f}'.format(score))

Step #1, avg. loss: 3.16238
Step #2001, epoch #3, avg. loss: 1.29067
Step #4001, epoch #7, avg. loss: 0.39970
Step #6001, epoch #10, avg. loss: 0.28387
Step #8001, epoch #14, avg. loss: 0.22945
Step #10001, epoch #18, avg. loss: 0.19442