# Deep Learning (Tensorflow)
---
Zhiang Chen

June 2016

## 1. Deep Learning
---
### Book
http://www.deeplearningbook.org/

### Tutorial
Udacity

Tensorflow Tutorial

### Architectures
Landmark ConvNet architectures:
LeNet-5 (1998, predates ImageNet),
AlexNet (ILSVRC 2012 winner),
GoogLeNet (Inception model, ILSVRC 2014 winner)

## 2. Tensorflow
---
### Atlas9
TensorFlow 0.8 (for both Python 2 and Python 3, with GPU support) is installed on atlas9. TensorFlow currently has GPU allocation problems, which can be worked around by selecting the ‘BFC’ (best-fit with coalescing) allocator or by allocating GPUs manually. See also the [comments](https://github.com/tensorflow/tensorflow/blob/30b52579f6d66071ac7cdc7179e2c4aae3c9cb88/tensorflow/core/protobuf/config.proto) in `config.proto`.

### Not-MNIST
http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html

### IPython
http://jupyter.org/

### SkFlow
[Simplified interface for TensorFlow (mimicking Scikit Learn) for Deep Learning.](https://github.com/tensorflow/skflow)
[CNN example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/skflow/mnist.py).

## 3. Simple ConvNet Example
---
Input -> ConvNet(Relu) -> MaxPool(Relu) -> FC(Relu) -> FC -> Softmax

(Stochastic Gradient Descent & No Additional Regularization)

### (1) Import packages

In [2]:
# These are all the modules we'll be using later. Make sure you can import them
# before proceeding further.
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range

### (2) Load Data

In [3]:
pickle_file = 'notMNIST.pickle'

with open(pickle_file, 'rb') as f:
  save = pickle.load(f)
  train_dataset = save['train_dataset']
  train_labels = save['train_labels']
  valid_dataset = save['valid_dataset']
  valid_labels = save['valid_labels']
  test_dataset = save['test_dataset']
  test_labels = save['test_labels']
  del save  # hint to help gc free up memory
  print('Training set', train_dataset.shape, train_labels.shape)
  print('Validation set', valid_dataset.shape, valid_labels.shape)
  print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 28, 28) (200000,)
Validation set (10000, 28, 28) (10000,)
Test set (10000, 28, 28) (10000,)


### (3) Pre-process Data

In [4]:
image_size = 28
num_labels = 10
num_channels = 1 # grayscale

import numpy as np

def reformat(dataset, labels):
  dataset = dataset.reshape(
    (-1, image_size, image_size, num_channels)).astype(np.float32)
  labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
  return dataset, labels
train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 28, 28, 1) (200000, 10)
Validation set (10000, 28, 28, 1) (10000, 10)
Test set (10000, 28, 28, 1) (10000, 10)
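The one-hot step inside `reformat` relies on NumPy broadcasting, which is easy to misread. Below is a minimal sketch of the same expression on a toy label vector; the three-class setup and the sample labels are illustrative assumptions, not the notebook's data.

```python
import numpy as np

num_labels = 3                 # small illustrative value; the notebook uses 10
labels = np.array([0, 2, 1])   # hypothetical integer class labels

# labels[:, None] has shape (3, 1) and np.arange(num_labels) has shape (3,).
# Broadcasting compares every label with every class index, giving a (3, 3)
# boolean matrix with exactly one True per row.
one_hot = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
print(one_hot)
```

Each row holds a single 1.0 in the column of its class, so `np.argmax(one_hot, 1)` recovers the original integer labels.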


### (4) Define Some Functions

In [5]:
# Percentage of examples whose predicted class (argmax) matches the one-hot label
def accuracy(predictions, labels):
  return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
          / predictions.shape[0])
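As a quick sanity check, the function can be exercised on a hand-made batch. The scores and labels below are hypothetical values chosen so that exactly three of four rows are correct.

```python
import numpy as np

def accuracy(predictions, labels):
    # Percentage of rows whose highest-scoring class matches the one-hot label.
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])

# Hypothetical scores for 4 samples over 3 classes.
predictions = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4],
                        [0.6, 0.3, 0.1]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 1, 0],   # predicted class is 2, so this row is wrong
                   [1, 0, 0]], dtype=np.float32)

result = accuracy(predictions, labels)
print(result)  # 75.0
```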

### (5) Build DNN
---
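Batch: the minibatch of examples used for each SGD step

Patch: the size of the convolution filter (kernel) in the ConvNet layer

Kernel: the size of the pooling window in the max-pool layer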
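To see how these sizes determine the input width of the first fully-connected layer, the spatial dimensions can be traced by hand. The sketch below assumes TensorFlow's documented padding rules ('SAME' gives ceil(n / stride), 'VALID' gives ceil((n - window + 1) / stride)); `output_size` is a hypothetical helper written for this note, and the strides are the ones used by the model in the next cell.

```python
import math

def output_size(in_size, window, stride, padding):
    """Spatial output size of a conv or pooling op under the assumed padding rules."""
    if padding == 'SAME':
        return int(math.ceil(in_size / float(stride)))
    if padding == 'VALID':
        return int(math.ceil((in_size - window + 1) / float(stride)))
    raise ValueError('unknown padding: %r' % padding)

image_size, patch_size, kernel_size, depth = 28, 5, 2, 16

after_conv = output_size(image_size, patch_size, 2, 'SAME')   # 5x5 patch, stride 2
after_pool = output_size(after_conv, kernel_size, 2, 'SAME')  # 2x2 kernel, stride 2
flat_features = after_pool * after_pool * depth

print(after_conv, after_pool, flat_features)  # 14 7 784
```

Each of the two stride-2 stages halves the 28x28 image, which is why the fully-connected weights have `image_size // 4` in both spatial dimensions.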

In [6]:
batch_size = 16
patch_size = 5
kernel_size = 2
depth = 16  # number of output channels of the conv layer
num_hidden = 64 # the number of hidden units in FC

graph = tf.Graph()

with graph.as_default():

  # Input data.
  tf_train_dataset = tf.placeholder(
    tf.float32, shape=(batch_size, image_size, image_size, num_channels))
  # convolution's input is a tensor of shape [batch,in_height,in_width,in_channels]
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)
  
  # Variables.
  layer1_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, num_channels, depth], stddev=0.1))
  layer1_biases = tf.Variable(tf.zeros([depth]))
  # convolution's weights are called filter in tensorflow
  # it is a tensor of shape [kernel_height,kernel_width,in_channels,out_channels]
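  # NOTE: layer2_weights is never used by model() below; the second stage is a
  # max pool, which has no weights (only layer2_biases is added to its output)
  layer2_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, depth, depth], stddev=0.1))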
  layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))
  layer3_weights = tf.Variable(tf.truncated_normal(
      [(image_size // 4) * (image_size // 4) * depth, num_hidden], stddev=0.1))
  layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))
  layer4_weights = tf.Variable(tf.truncated_normal(
      [num_hidden, num_labels], stddev=0.1))
  layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))
  
  # Model.
  def model(data):
    conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
    hidden = tf.nn.relu(conv + layer1_biases)
    
    max_pool = tf.nn.max_pool(hidden,[1,kernel_size,kernel_size,1],[1,2,2,1],'SAME')
    hidden = tf.nn.relu(max_pool+layer2_biases)
    
    # 4D -> 2D: flatten each example into a feature vector
    shape = hidden.get_shape().as_list()
    reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
    
    hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    return tf.matmul(hidden, layer4_weights) + layer4_biases
  
  # Training computation.
  logits = model(tf_train_dataset)
  loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))
    
  # Optimizer.
  optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
  
  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(logits)
  valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
  test_prediction = tf.nn.softmax(model(tf_test_dataset))
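The loss above combines a softmax with cross-entropy and averages it over the batch. The following NumPy sketch shows that computation under the usual definitions; it is not TensorFlow's implementation, and `softmax` and `mean_cross_entropy` are hypothetical helper names.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

def mean_cross_entropy(logits, one_hot_labels):
    # Batch average of -sum(label * log(softmax(logits))).
    probs = softmax(logits)
    per_example = -np.sum(one_hot_labels * np.log(probs), axis=1)
    return np.mean(per_example)

# A network that knows nothing outputs equal logits for all 10 classes.
logits = np.zeros((2, 10))
labels = np.eye(10)[[3, 7]]   # two arbitrary one-hot labels
loss = mean_cross_entropy(logits, labels)
print(loss)  # log(10), about 2.3026
```

A loss near log(10) therefore means the network is still guessing uniformly, which matches the training log in the next section: at step 500 the minibatch loss is about 2.31 while validation accuracy is still 10%.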

### (6) Train Network

In [7]:
num_steps = 5000

'''
# use GPU0
# allocate CPU
config = tf.ConfigProto()
config.gpu_options.allocator_type = 'BFC'
config.gpu_options.allow_growth = True
config.log_device_placement = True
with tf.Session(graph=graph,config=config) as session:
'''
# use CPU0
with tf.Session(graph=graph) as session:
  tf.initialize_all_variables().run()
  print('Initialized')
  for step in range(num_steps):
    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
    batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
    batch_labels = train_labels[offset:(offset + batch_size), :]
    feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
    _, l, predictions = session.run(
      [optimizer, loss, train_prediction], feed_dict=feed_dict)
    if (step % 500 == 0):
      print('Minibatch loss at step %d: %f' % (step, l))
      print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
      print('Validation accuracy: %.1f%%' % accuracy(
        valid_prediction.eval(), valid_labels))
      print('--------------------------------------')
  print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))

Initialized
Minibatch loss at step 0: 4.284102
Minibatch accuracy: 12.5%
Validation accuracy: 10.0%
--------------------------------------
Minibatch loss at step 500: 2.310783
Minibatch accuracy: 0.0%
Validation accuracy: 10.0%
--------------------------------------
Minibatch loss at step 1000: 1.662268
Minibatch accuracy: 37.5%
Validation accuracy: 48.8%
--------------------------------------
Minibatch loss at step 1500: 0.958118
Minibatch accuracy: 87.5%
Validation accuracy: 79.5%
--------------------------------------
Minibatch loss at step 2000: 0.831504
Minibatch accuracy: 81.2%
Validation accuracy: 80.1%
--------------------------------------
Minibatch loss at step 2500: 0.392806
Minibatch accuracy: 93.8%
Validation accuracy: 82.0%
--------------------------------------
Minibatch loss at step 3000: 0.386772
Minibatch accuracy: 93.8%
Validation accuracy: 83.5%
--------------------------------------
Minibatch loss at step 3500: 0.360203
Minibatch accuracy: 93.8%
Validation accuracy
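The training loop picks each minibatch with a modulo on the step counter. Here is a small sketch of that arithmetic on toy sizes; `batch_offsets` is a hypothetical helper and the 40-example dataset is an assumption chosen to make the wrap-around visible.

```python
def batch_offsets(num_examples, batch_size, num_steps):
    # Same arithmetic as the training loop: advance by one batch per step and
    # wrap around so that a slice never runs past the end of the data.
    return [(step * batch_size) % (num_examples - batch_size)
            for step in range(num_steps)]

offsets = batch_offsets(num_examples=40, batch_size=16, num_steps=5)
print(offsets)  # [0, 16, 8, 0, 16]
```

With the notebook's own numbers (200000 training examples, batch size 16, 5000 steps) the offset never wraps: the run reads 80000 examples, which is less than one pass over the training set.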

## 4. LeNet-5
---
[LeNet-5](http://www.dengfanxin.cn/wp-content/uploads/2016/03/1998Lecun.pdf) & input [dropout](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf)
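Input dropout zeroes a random subset of the input pixels during training. In the TensorFlow release used here, the second argument of `tf.nn.dropout` is the keep probability, and kept values are scaled by 1 / keep_prob. The NumPy sketch below illustrates that behaviour; it is not TensorFlow's code, and the all-ones input and fixed seed are assumptions for demonstration.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    # Keep each element with probability keep_prob and scale the survivors by
    # 1 / keep_prob so that the expected value of every element is unchanged.
    mask = rng.uniform(size=x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.RandomState(0)   # fixed seed only to make the sketch repeatable
x = np.ones((1000, 28))          # stand-in for a batch of pixel rows
dropped = dropout(x, keep_prob=0.8, rng=rng)

print(np.unique(dropped))        # the two values that occur: 0.0 and 1.25
print(dropped.mean())            # close to 1.0
```

Dropout should only be active on the training path; validation and test predictions are meant to see the unmodified input.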

In [9]:
'''
LeNet-5 (Tensorflow, CPU)
Zhiang Chen
6/2016
zxc251@case.edu
'''
import time
start_time = time.time()

batch_size = 16
patch_size = 5
kernel_size = 2
depth1 = 6 #the depth of 1st convnet
depth2 = 16 #the depth of 2nd convnet
C5_units = 120
F6_units = 84
F7_units = 10

graph = tf.Graph()

with graph.as_default():
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size, image_size, num_channels))
    # convolution's input is a tensor of shape [batch,in_height,in_width,in_channels]
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variables(weights and biases)
    C1_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, num_channels, depth1], stddev=0.1))
    # convolution's weights are called filter in tensorflow
    # it is a tensor of shape [kernel_height,kernel_width,in_channels,out_channels]
    C1_biases = tf.Variable(tf.zeros([depth1]))
                            
    # S1_weights # Sub-sampling doesn't need weights and biases
    # S1_biases
    
    C3_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, depth1, depth2], stddev=0.1))
    C3_biases = tf.Variable(tf.constant(1.0, shape=[depth2]))
                            
    # S4_weights
    # S4_biases
     
    # C5 actually is a fully-connected layer                        
    C5_weights = tf.Variable(tf.truncated_normal([5 * 5 * depth2, C5_units], stddev=0.1))
    C5_biases = tf.Variable(tf.constant(1.0, shape=[C5_units]))
         
    F6_weights = tf.Variable(tf.truncated_normal([C5_units,F6_units], stddev=0.1))
    F6_biases = tf.Variable(tf.constant(1.0, shape=[F6_units]))
                                
    # FC and logistic regression replace RBF
    F7_weights = tf.Variable(tf.truncated_normal([F6_units,F7_units], stddev=0.1))
    F7_biases = tf.Variable(tf.constant(1.0, shape=[F7_units]))

    # Model
    def model(data):
        conv = tf.nn.conv2d(data, C1_weights, [1, 1, 1, 1], padding='SAME')
        hidden = tf.nn.relu(conv + C1_biases) # relu is better than tanh
        
        max_pool = tf.nn.max_pool(hidden,[1,kernel_size,kernel_size,1],[1,2,2,1],'VALID')
        hidden = tf.nn.relu(max_pool)
                                
        conv = tf.nn.conv2d(hidden, C3_weights, [1, 1, 1, 1], padding='VALID')
        hidden = tf.nn.relu(conv + C3_biases)

        max_pool = tf.nn.max_pool(hidden,[1,kernel_size,kernel_size,1],[1,2,2,1],'VALID')
        hidden = tf.nn.relu(max_pool)
                            
        shape = hidden.get_shape().as_list()
        reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
        hidden = tf.nn.relu(tf.matmul(reshape, C5_weights) + C5_biases)
                            
        fc = tf.matmul(hidden,F6_weights)
        hidden = tf.nn.relu(fc + F6_biases)
        
        fc = tf.matmul(hidden,F7_weights)
        output = fc + F7_biases
    
        return output

    
    # Training computation.
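    # Input dropout (keep probability 0.8). The result gets its own name so that
    # feed_dict still feeds the placeholder; rebinding tf_train_dataset would make
    # feed_dict override the dropout output and silently skip dropout.
    train_input = tf.nn.dropout(tf_train_dataset, 0.8)
    logits = model(train_input)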
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))
    
    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
  
    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
    test_prediction = tf.nn.softmax(model(tf_test_dataset))
    
# training
num_steps = 15000
config = tf.ConfigProto()
config.log_device_placement = True
with tf.Session(graph=graph, config = config) as session:
  tf.initialize_all_variables().run()
  print('Initialized')
  for step in range(num_steps):
    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
    batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
    batch_labels = train_labels[offset:(offset + batch_size), :]
    feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
    _, l, predictions = session.run(
      [optimizer, loss, train_prediction], feed_dict=feed_dict)
    if (step % 500 == 0):
      print('Minibatch loss at step %d: %f' % (step, l))
      print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
      print('Validation accuracy: %.1f%%' % accuracy(
        valid_prediction.eval(), valid_labels))
      print('--------------------------------------')
  print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))
end_time = time.time()
duration = (end_time - start_time)/60
print("Execution time: %0.2f min" % duration)

Initialized
Minibatch loss at step 0: 2.904760
Minibatch accuracy: 25.0%
Validation accuracy: 10.2%
--------------------------------------
Minibatch loss at step 500: 0.231240
Minibatch accuracy: 93.8%
Validation accuracy: 80.7%
--------------------------------------
Minibatch loss at step 1000: 0.407553
Minibatch accuracy: 87.5%
Validation accuracy: 82.5%
--------------------------------------
Minibatch loss at step 1500: 0.542050
Minibatch accuracy: 87.5%
Validation accuracy: 84.4%
--------------------------------------
Minibatch loss at step 2000: 0.575230
Minibatch accuracy: 75.0%
Validation accuracy: 84.5%
--------------------------------------
Minibatch loss at step 2500: 0.302320
Minibatch accuracy: 87.5%
Validation accuracy: 85.8%
--------------------------------------
Minibatch loss at step 3000: 0.436809
Minibatch accuracy: 93.8%
Validation accuracy: 86.6%
--------------------------------------
Minibatch loss at step 3500: 0.250634
Minibatch accuracy: 100.0%
Validation accura
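For a sense of the model's size, the trainable parameters can be counted directly from the variable shapes defined in the cell above. This is a bookkeeping sketch only; the layer names and shapes are copied from that cell, and the helper variables are hypothetical.

```python
import numpy as np

patch_size, num_channels = 5, 1
depth1, depth2 = 6, 16
C5_units, F6_units, F7_units = 120, 84, 10

# Weight shapes as declared in the notebook; every layer also has one bias
# per output channel (conv layers) or output unit (fully-connected layers).
weight_shapes = [
    ('C1', (patch_size, patch_size, num_channels, depth1)),
    ('C3', (patch_size, patch_size, depth1, depth2)),
    ('C5', (5 * 5 * depth2, C5_units)),
    ('F6', (C5_units, F6_units)),
    ('F7', (F6_units, F7_units)),
]

param_counts = {}
for name, shape in weight_shapes:
    param_counts[name] = int(np.prod(shape)) + shape[-1]
    print('%s: %d' % (name, param_counts[name]))

total_params = sum(param_counts.values())
print('total: %d' % total_params)  # total: 61706
```

Most of the parameters sit in C5, the first fully-connected layer, rather than in the two convolutional layers.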

## What I am working on - Detecting objects in images
---
A ConvNet is useful for classifying objects, but on its own it cannot detect them in an image: it cannot tell where the objects are, or classify two or more objects in one image. Currently, there are two main approaches.

(1) Image Segmentation

(2) RNNs