# Homework 4
In this homework you will train your first convolutional neural network on images from SuperTux.

Development notes: 

1) If you are doing your homework in a Jupyter/IPython notebook, you may need to 'Restart & Clear Output' after making a change and re-running a cell. TensorFlow will not allow you to create multiple variables with the same name, which is what happens when you run a cell that creates a variable twice.

2) Be careful with your calls to global_variables_initializer(). If you call it after training a network, it will re-initialize your variables, erasing your training. In general, double-check the outputs of your model after all training and before turning your model in. Ending a session will discard all your variable values.

## Part 0: Setup

In [1]:
import tensorflow as tf
import numpy as np
import util

# Load the data we are giving you
def load(filename, W=64, H=64):
    data = np.fromfile(filename, dtype=np.uint8).reshape((-1, W*H*3+1))
    images, labels = data[:, :-1].reshape((-1,H,W,3)), data[:, -1]
    return images, labels

image_data, label_data = load('tux_train.dat')

print('Input shape: ' + str(image_data.shape))
print('Labels shape: ' + str(label_data.shape))

num_classes = 6

Input shape: (12257, 64, 64, 3)
Labels shape: (12257,)


## Part 1: Define your convnet

Make sure the total number of parameters is less than 100,000.
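
As a rough check on this budget, the parameter count can be worked out by hand before building the graph. The sketch below assumes each conv2d layer holds a `kernel_h x kernel_w x c_in x c_out` weight tensor plus one bias per output channel, and that pooling layers add no parameters. The helper name `conv2d_params` is made up for illustration, and the layer sizes are the ones used by the network in the next cell.

```python
def conv2d_params(kernel_h, kernel_w, c_in, c_out, bias=True):
    """Number of weights in a conv layer, plus one bias per output channel."""
    return kernel_h * kernel_w * c_in * c_out + (c_out if bias else 0)

# (kernel_h, kernel_w, c_in, c_out) for conv1 .. conv5
layers = [
    (5, 5, 3, 19),
    (5, 5, 19, 30),
    (5, 5, 30, 50),
    (3, 3, 50, 100),
    (1, 1, 100, 6),
]

per_layer = [conv2d_params(*layer) for layer in layers]
total = sum(per_layer)

print(per_layer, total)
assert total < 100000, "over the parameter budget"
```

Most of the budget goes to conv3 and conv4, so those are the layers to shrink first if a different design runs over the limit.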

In [2]:
# Let's clear the TensorFlow graph so that you don't have to restart the notebook every time you change the network
tf.reset_default_graph()

# Set up your input placeholder
inputs = tf.placeholder(tf.float32, (None,64,64,3), name='input')

# Whenever you deal with image data it's important to mean-center it first and divide by the standard deviation
white_inputs = (inputs - 100.) / 72.


# Set up your label placeholders
labels = tf.placeholder(tf.int64, (None,), name='labels')

# Step 1: define the compute graph of your CNN here
#   Use 5 conv2d layers (tf.contrib.layers.conv2d) and one pooling layer tf.contrib.layers.max_pool2d or tf.contrib.layers.avg_pool2d.
#   The output of the network should be a None x 1 x 1 x 6 tensor.
#   Make sure the last conv2d does not have a ReLU: activation_fn=None
h = tf.contrib.layers.conv2d(white_inputs, 19, (5,5), stride=2, scope="conv1")
h = tf.contrib.layers.conv2d(h, 30, (5,5), stride=2, scope="conv2")
h = tf.contrib.layers.conv2d(h, 50, (5,5), stride=2, scope="conv3")
h = tf.contrib.layers.conv2d(h, 100, (3,3), stride=2, scope="conv4")
h = tf.contrib.layers.max_pool2d(h, (3,3), stride=2, scope="pool")
h = tf.contrib.layers.conv2d(h, 6, (1,1), stride=2, activation_fn=None, scope="conv5")
# The input here should be a   None x 1 x 1 x 6   tensor
output = tf.identity(tf.contrib.layers.flatten(h), name='output')

# Step 2: use a classification loss function (from assignment 3)
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=output, labels=labels))

# Step 3: create an optimizer (from assignment 3)
optimizer = tf.train.MomentumOptimizer(0.001, 0.9)

# Step 4: use that optimizer on your loss function (from assignment 3)
opt = optimizer.minimize(loss)
correct = tf.equal(tf.argmax(output, 1), labels)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

print( "Total number of variables used ", np.sum([v.get_shape().num_elements() for v in tf.trainable_variables()]), '/', 100000 )

Total number of variables used  98980 / 100000
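
To see why the network output is 1 x 1 spatially, the feature-map size can be traced layer by layer. This is a sketch that assumes the `tf.contrib.layers` defaults: `padding='SAME'` for conv2d (output size `ceil(n / stride)`) and `padding='VALID'` for max_pool2d (output size `floor((n - k) / stride) + 1`). The helper names are illustrative only.

```python
import math

def same_out(size, stride):
    """Output size along one dimension with SAME padding."""
    return math.ceil(size / stride)

def valid_out(size, kernel, stride):
    """Output size along one dimension with VALID (no) padding."""
    return (size - kernel) // stride + 1

size = 64
trace = [size]

# conv1 .. conv4: stride 2, SAME padding
for stride in (2, 2, 2, 2):
    size = same_out(size, stride)
    trace.append(size)

# pool: 3x3 window, stride 2, VALID padding
size = valid_out(size, kernel=3, stride=2)
trace.append(size)

# conv5: 1x1 kernel, stride 2, SAME padding
size = same_out(size, 2)
trace.append(size)

print(trace)
```

Under these assumptions the map is already 1 x 1 after the pooling layer, so the stride of 2 on conv5 has no further effect on the spatial size.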


## Part 2: Training

Training might take up to 20 minutes depending on your architecture. This time around you should get close to 100% training accuracy.

In [3]:
# Batch size
BS = 32

# Start a session
sess = tf.Session()

# Set up training
sess.run(tf.global_variables_initializer())

# This is a helper function that trains your model for several epochs on shuffled data
# train_func should take a single step in the optimization and return accuracy and loss
#   accuracy, loss = train_func(batch_images, batch_labels)
# HINT: train_func should call sess.run
def train(train_func):
    # An epoch is a single pass over the training data
    for epoch in range(20):
        # Let's shuffle the data every epoch
        np.random.seed(epoch)
        np.random.shuffle(image_data)
        np.random.seed(epoch)
        np.random.shuffle(label_data)
        # Go through the entire dataset once
        accs, losss = [], []
        for i in range(0, image_data.shape[0]-BS+1, BS):
            # Train a single batch
            batch_images, batch_labels = image_data[i:i+BS], label_data[i:i+BS]
            acc, loss = train_func(batch_images, batch_labels)
            accs.append(acc)
            losss.append(loss)
        print('[%3d] Accuracy: %0.3f  \t  Loss: %0.3f'%(epoch, np.mean(accs), np.mean(losss)))


# Train convnet
print('Convnet')
train(lambda I, L: sess.run([accuracy, loss, opt], feed_dict={inputs: I, labels: L})[:2])


Convnet
[  0] Accuracy: 0.699  	  Loss: 0.824
[  1] Accuracy: 0.914  	  Loss: 0.278
[  2] Accuracy: 0.945  	  Loss: 0.181
[  3] Accuracy: 0.959  	  Loss: 0.132
[  4] Accuracy: 0.966  	  Loss: 0.111
[  5] Accuracy: 0.971  	  Loss: 0.088
[  6] Accuracy: 0.979  	  Loss: 0.067
[  7] Accuracy: 0.980  	  Loss: 0.063
[  8] Accuracy: 0.985  	  Loss: 0.048
[  9] Accuracy: 0.988  	  Loss: 0.041
[ 10] Accuracy: 0.988  	  Loss: 0.039
[ 11] Accuracy: 0.992  	  Loss: 0.026
[ 12] Accuracy: 0.992  	  Loss: 0.025
[ 13] Accuracy: 0.995  	  Loss: 0.017
[ 14] Accuracy: 0.994  	  Loss: 0.020
[ 15] Accuracy: 0.996  	  Loss: 0.013
[ 16] Accuracy: 0.994  	  Loss: 0.019
[ 17] Accuracy: 0.998  	  Loss: 0.009
[ 18] Accuracy: 0.998  	  Loss: 0.008
[ 19] Accuracy: 0.999  	  Loss: 0.005
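
Two details of the `train` helper are easy to miss: reseeding before each shuffle applies the same permutation to images and labels, and the batch range skips any final partial batch. The sketch below illustrates both. It uses Python's `random` module as a stand-in for `np.random` (an assumption made to keep the example dependency-free), the toy lists are made up, and the dataset size is the one printed in Part 0.

```python
import random

# Paired shuffle: same seed + same length => same permutation for both lists.
toy_images = list(range(10))
toy_labels = [10 * i for i in toy_images]   # label is tied to its image

random.seed(0)
random.shuffle(toy_images)
random.seed(0)
random.shuffle(toy_labels)

# Every label still belongs to the image at the same position.
still_aligned = all(l == 10 * i for i, l in zip(toy_images, toy_labels))

# Batch arithmetic for the real dataset size.
N, BS = 12257, 32
starts = range(0, N - BS + 1, BS)    # same range as in train()
num_batches = len(starts)
dropped = N - num_batches * BS       # samples left out of each epoch

print(still_aligned, num_batches, dropped)
```

Because the data is reshuffled every epoch, the sample left out by the batch range is generally a different one each time.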


## Part 3: Evaluation

### See your model

In [4]:
# Show the current graph
util.show_graph(tf.get_default_graph().as_graph_def())

### Compute the validation accuracy
The convnet still massively overfits. We will deal with this in assignment 5.

In [5]:
image_val, label_val = load('tux_val.dat')

print('Input shape: ' + str(image_val.shape))
print('Labels shape: ' + str(label_val.shape))

val_accuracy, val_loss = sess.run([accuracy, loss], feed_dict={inputs: image_val, labels: label_val})
print("ConvNet Validation Accuracy: ", val_accuracy)

Input shape: (3912, 64, 64, 3)
Labels shape: (3912,)
ConvNet Validation Accuracy:  0.941462
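
To put the overfitting in concrete terms, the printed numbers can be turned into an error count and a train/validation gap. This is only back-of-the-envelope arithmetic on the values shown above; the variable names are illustrative.

```python
num_val = 3912             # validation set size printed above
val_accuracy = 0.941462    # validation accuracy printed above
train_accuracy = 0.999     # training accuracy in the last epoch

# Approximate number of misclassified validation images
val_errors = round(num_val * (1.0 - val_accuracy))

# Gap between training and validation accuracy
gap = train_accuracy - val_accuracy

print(val_errors, round(gap, 4))
```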


## Part 4: Save Model
Please note that we also want you to turn in your ipynb for this assignment.  Zip up the ipynb along with the tfg for your submission.

In [6]:
util.save('assignment4.tfg', session=sess)