Deep Learning
=============

Assignment 4
------------

Previously in `2_fullyconnected.ipynb` and `3_regularization.ipynb`, we trained fully connected networks to classify [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) characters.

The goal of this assignment is to make the neural network convolutional.

In [2]:
# These are all the modules we'll be using later. Make sure you can import them
# before proceeding further.
from __future__ import print_function
import numpy as np
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range

In [3]:
pickle_file = 'notMNIST.pickle'

with open(pickle_file, 'rb') as f:
  save = pickle.load(f)
  train_dataset = save['train_dataset']
  train_labels = save['train_labels']
  valid_dataset = save['valid_dataset']
  valid_labels = save['valid_labels']
  test_dataset = save['test_dataset']
  test_labels = save['test_labels']
  del save  # hint to help gc free up memory
  print('Training set', train_dataset.shape, train_labels.shape)
  print('Validation set', valid_dataset.shape, valid_labels.shape)
  print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 28, 28) (200000,)
Validation set (10000, 28, 28) (10000,)
Test set (10000, 28, 28) (10000,)


Reformat into a TensorFlow-friendly shape:
- convolutions need the image data formatted as a cube (height by width by #channels)
- labels as float 1-hot encodings.

In [4]:
image_size = 28
num_labels = 10
num_channels = 1 # grayscale

def reformat(dataset, labels):
  dataset = dataset.reshape(
    (-1, image_size, image_size, num_channels)).astype(np.float32)
  labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)
  return dataset, labels
train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)

Training set (200000, 28, 28, 1) (200000, 10)
Validation set (10000, 28, 28, 1) (10000, 10)
Test set (10000, 28, 28, 1) (10000, 10)
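To make the label broadcast in `reformat` concrete, here is a small sketch of the same comparison written with plain Python lists. The helper name `one_hot` is ours and does not appear in the notebook; it mirrors `np.arange(num_labels) == labels[:, None]` by comparing every label against every class index.

```python
def one_hot(labels, num_labels):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    return [[1.0 if k == label else 0.0 for k in range(num_labels)]
            for label in labels]


# Three example labels drawn from three classes (illustrative values only).
rows = one_hot([0, 2, 1], num_labels=3)
for row in rows:
    print(row)
```

Each row sums to 1.0, which is the form `softmax_cross_entropy_with_logits` expects for its `labels` argument.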


In [5]:
def accuracy(predictions, labels):
  return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
          / predictions.shape[0])

Let's build a small network with two convolutional layers, followed by one fully connected layer. Convolutional networks are more expensive computationally, so we'll limit the network's depth and the number of fully connected nodes.
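The two stride-2 convolutions in the next cell each halve the spatial size, which is where the `image_size // 4` factors in the shape of `layer3_weights` come from. The sketch below works through that arithmetic, assuming `padding='SAME'`, for which the output size is the input size divided by the stride, rounded up. The function name is hypothetical.

```python
def same_conv_output_size(input_size, stride):
    """Spatial output size of a convolution with padding='SAME' (ceiling division)."""
    return (input_size + stride - 1) // stride


image_size, depth = 28, 16
after_conv1 = same_conv_output_size(image_size, stride=2)   # 28 -> 14
after_conv2 = same_conv_output_size(after_conv1, stride=2)  # 14 -> 7
flat_features = after_conv2 * after_conv2 * depth           # 7 * 7 * 16
print(after_conv1, after_conv2, flat_features)  # 14 7 784
```

The fully connected layer therefore receives 784 features per image.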

In [6]:
batch_size = 16
patch_size = 5
depth = 16
num_hidden = 64

graph = tf.Graph()

with graph.as_default():

  # Input data.
  tf_train_dataset = tf.placeholder(
    tf.float32, shape=(batch_size, image_size, image_size, num_channels))
  tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
  tf_valid_dataset = tf.constant(valid_dataset)
  tf_test_dataset = tf.constant(test_dataset)
  
  # Variables.
  layer1_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, num_channels, depth], stddev=0.1))
  layer1_biases = tf.Variable(tf.zeros([depth]))
  layer2_weights = tf.Variable(tf.truncated_normal(
      [patch_size, patch_size, depth, depth], stddev=0.1))
  layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))
  layer3_weights = tf.Variable(tf.truncated_normal(
      [image_size // 4 * image_size // 4 * depth, num_hidden], stddev=0.1))
  layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))
  layer4_weights = tf.Variable(tf.truncated_normal(
      [num_hidden, num_labels], stddev=0.1))
  layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))
  
  # Model.
  def model(data):
    conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
    hidden = tf.nn.relu(conv + layer1_biases)
    conv = tf.nn.conv2d(hidden, layer2_weights, [1, 2, 2, 1], padding='SAME')
    hidden = tf.nn.relu(conv + layer2_biases)
    shape = hidden.get_shape().as_list()
    reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
    hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    return tf.matmul(hidden, layer4_weights) + layer4_biases
  
  # Training computation.
  logits = model(tf_train_dataset)
  loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))
    
  # Optimizer.
  optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
  
  # Predictions for the training, validation, and test data.
  train_prediction = tf.nn.softmax(logits)
  valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
  test_prediction = tf.nn.softmax(model(tf_test_dataset))

In [7]:
num_steps = 1001

with tf.Session(graph=graph) as session:
  tf.global_variables_initializer().run()
  print('Initialized')
  for step in range(num_steps):
    offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
    batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
    batch_labels = train_labels[offset:(offset + batch_size), :]
    feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}
    _, l, predictions = session.run(
      [optimizer, loss, train_prediction], feed_dict=feed_dict)
    if (step % 50 == 0):
      print('Minibatch loss at step %d: %f' % (step, l))
      print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
      print('Validation accuracy: %.1f%%' % accuracy(
        valid_prediction.eval(), valid_labels))
  print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))

Initialized
Minibatch loss at step 0: 2.648376
Minibatch accuracy: 12.5%
Validation accuracy: 10.0%
Minibatch loss at step 50: 1.621365
Minibatch accuracy: 43.8%
Validation accuracy: 51.4%
Minibatch loss at step 100: 0.993872
Minibatch accuracy: 68.8%
Validation accuracy: 66.6%
Minibatch loss at step 150: 0.383054
Minibatch accuracy: 81.2%
Validation accuracy: 75.1%
Minibatch loss at step 200: 0.731487
Minibatch accuracy: 81.2%
Validation accuracy: 77.5%
Minibatch loss at step 250: 1.274179
Minibatch accuracy: 68.8%
Validation accuracy: 77.0%
Minibatch loss at step 300: 0.350021
Minibatch accuracy: 87.5%
Validation accuracy: 79.1%
Minibatch loss at step 350: 0.575869
Minibatch accuracy: 93.8%
Validation accuracy: 76.8%
Minibatch loss at step 400: 0.198210
Minibatch accuracy: 100.0%
Validation accuracy: 79.3%
Minibatch loss at step 450: 0.697943
Minibatch accuracy: 81.2%
Validation accuracy: 79.4%
Minibatch loss at step 500: 0.716721
Minibatch accuracy: 87.5%
Validation accuracy: 80.5%


---
Problem 1
---------

The convolutional model above uses convolutions with stride 2 to reduce the dimensionality. Replace the strides with a max pooling operation (`tf.nn.max_pool()`) of stride 2 and kernel size 2.

---
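As a reminder of what max pooling with kernel size 2 and stride 2 computes, here is a minimal sketch on a single 4x4 channel using plain Python lists. The function `max_pool_2x2` is a hypothetical helper for illustration, not the TensorFlow op, and it assumes even height and width.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 over a 2-D list with even height and width."""
    pooled = []
    for r in range(0, len(image), 2):
        row = []
        for c in range(0, len(image[0]), 2):
            # Each output value is the maximum of one non-overlapping 2x2 window.
            window = (image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled


image = [[1, 3, 2, 0],
         [4, 2, 1, 5],
         [0, 1, 7, 2],
         [3, 6, 4, 8]]
pooled = max_pool_2x2(image)
print(pooled)  # [[4, 5], [6, 8]]
```

Like a stride-2 convolution, this halves each spatial dimension, but the convolution itself can now run at stride 1 and see every pixel position.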

In [8]:
batch_size = 16
patch_size = 5
depth = 16
num_hidden = 64

graph = tf.Graph()

with graph.as_default():
    
    #Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size, image_size, num_channels))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    #Variables
    # Dimensions for conv weights are ==> patch_height * patch_width * # channels * depth
    layer1_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, num_channels, depth], stddev=0.1))
    layer1_biases = tf.Variable(tf.zeros([depth]))
    layer2_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, depth, depth], stddev=0.1))
    layer2_biases = tf.Variable(tf.constant(1.0, shape=([depth])))
    layer3_weights = tf.Variable(tf.truncated_normal(
        [(image_size // 4) * (image_size // 4) * depth, num_hidden], stddev=0.1))
    layer3_biases = tf.Variable(tf.constant(1.0, shape=([num_hidden])))
    layer4_weights = tf.Variable(tf.truncated_normal(
        [num_hidden, num_labels], stddev=0.1))
    layer4_biases = tf.Variable(tf.constant(1.0, shape=([num_labels])))
    
    def model(data):
        # Strides are given per input dimension: [batch, height, width, channels],
        # e.g. [1, 2, 2, 1] steps 2 pixels along height and width
        conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='SAME')
        max_pool = tf.nn.max_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(max_pool + layer1_biases)
        
        conv = tf.nn.conv2d(hidden, layer2_weights, [1, 1, 1, 1], padding='SAME')
        max_pool = tf.nn.max_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(max_pool + layer2_biases)
        
        shape = hidden.get_shape().as_list()
        # batch_size * # total features <== dimensions of reshape
        reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
        hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
        return tf.matmul(hidden, layer4_weights) + layer4_biases
    
    # Training computation
    logits = model(tf_train_dataset)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    
    # Optimizer
    # set arbitrary learning rate
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
    
    # Predictions for train, validation, and test datasets
    train_predictions = tf.nn.softmax(logits)
    valid_predictions = tf.nn.softmax(model(tf_valid_dataset))
    test_predictions = tf.nn.softmax(model(tf_test_dataset))    

In [9]:
steps = 1001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized')
    for step in range(steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_predictions], feed_dict=feed_dict)
        
        # print results after every 50 steps
        if step % 50 == 0:
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(valid_predictions.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_predictions.eval(), test_labels))

Initialized
Minibatch loss at step 0: 3.578502
Minibatch accuracy: 12.5%
Validation accuracy: 10.4%
Minibatch loss at step 50: 2.095940
Minibatch accuracy: 18.8%
Validation accuracy: 12.5%
Minibatch loss at step 100: 1.132037
Minibatch accuracy: 68.8%
Validation accuracy: 60.1%
Minibatch loss at step 150: 0.575631
Minibatch accuracy: 75.0%
Validation accuracy: 73.0%
Minibatch loss at step 200: 1.009418
Minibatch accuracy: 81.2%
Validation accuracy: 77.6%
Minibatch loss at step 250: 1.265356
Minibatch accuracy: 62.5%
Validation accuracy: 78.4%
Minibatch loss at step 300: 0.410024
Minibatch accuracy: 87.5%
Validation accuracy: 79.6%
Minibatch loss at step 350: 0.511104
Minibatch accuracy: 93.8%
Validation accuracy: 78.8%
Minibatch loss at step 400: 0.185245
Minibatch accuracy: 100.0%
Validation accuracy: 81.4%
Minibatch loss at step 450: 0.874217
Minibatch accuracy: 87.5%
Validation accuracy: 81.0%
Minibatch loss at step 500: 0.679845
Minibatch accuracy: 87.5%
Validation accuracy: 82.3%


---
Problem 2
---------

Try to get the best performance you can using a convolutional net. For example, look at the classic [LeNet5](http://yann.lecun.com/exdb/lenet/) architecture, add dropout, and/or add learning rate decay.

---

In [13]:
batch_size = 16
patch_size = 5
depth = 16
num_hidden = 64
keep_prob = 0.5

graph = tf.Graph()

with graph.as_default():
    
    # Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=([batch_size, image_size, image_size, num_channels]))
    tf_train_labels = tf.placeholder(tf.float32, shape=([batch_size, num_labels]))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    # Variables
    # Dimensions for conv weights are ==> patch_height * patch_width * # channels * depth
    layer1_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, num_channels, depth], stddev=0.1))
    layer1_biases = tf.Variable(tf.zeros([depth]))
    layer2_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, depth, depth], stddev=0.1))
    layer2_biases = tf.Variable(tf.constant(1.0, shape=([depth])))
    layer3_weights = tf.Variable(tf.truncated_normal(
        [(image_size // 4) * (image_size // 4) * depth, num_hidden], stddev=0.1))
    layer3_biases = tf.Variable(tf.constant(1.0, shape=([num_hidden])))
    layer4_weights = tf.Variable(tf.truncated_normal(
        [num_hidden, num_labels], stddev=0.1))
    layer4_biases = tf.Variable(tf.constant(1.0, shape=([num_labels])))
    
    def model(data):
        # Strides are given per input dimension: [batch, height, width, channels],
        # e.g. [1, 2, 2, 1] steps 2 pixels along height and width
        conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='SAME')
        max_pool = tf.nn.max_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(max_pool + layer1_biases)
        
        conv = tf.nn.conv2d(hidden, layer2_weights, [1, 1, 1, 1], padding='SAME')
        max_pool = tf.nn.max_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(max_pool + layer2_biases)
        
        shape = hidden.get_shape().as_list()
        # batch_size * # total features <== dimensions of reshape
        reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
        hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
        hidden = tf.nn.dropout(hidden, keep_prob)
        return tf.matmul(hidden, layer4_weights) + layer4_biases
    
    # Training computations
    logits = model(tf_train_dataset)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    
    # Optimizer
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
    
    # Predictions for train, validation, and test datasets
    train_predictions = tf.nn.softmax(logits)
    valid_predictions = tf.nn.softmax(model(tf_valid_dataset))
    test_predictions = tf.nn.softmax(model(tf_test_dataset))

In [14]:
steps = 1001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized')
    for step in range(steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_predictions], feed_dict=feed_dict)
        
        # print results after every 50 steps
        if step % 50 == 0:
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(valid_predictions.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_predictions.eval(), test_labels))

Initialized
Minibatch loss at step 0: 3.284095
Minibatch accuracy: 0.0%
Validation accuracy: 10.1%
Minibatch loss at step 50: 2.309104
Minibatch accuracy: 0.0%
Validation accuracy: 10.0%
Minibatch loss at step 100: 2.321524
Minibatch accuracy: 0.0%
Validation accuracy: 11.9%
Minibatch loss at step 150: 2.238084
Minibatch accuracy: 25.0%
Validation accuracy: 11.1%
Minibatch loss at step 200: 1.927131
Minibatch accuracy: 18.8%
Validation accuracy: 27.4%
Minibatch loss at step 250: 2.015536
Minibatch accuracy: 31.2%
Validation accuracy: 46.2%
Minibatch loss at step 300: 1.096430
Minibatch accuracy: 62.5%
Validation accuracy: 50.7%
Minibatch loss at step 350: 1.241962
Minibatch accuracy: 50.0%
Validation accuracy: 58.3%
Minibatch loss at step 400: 0.758139
Minibatch accuracy: 87.5%
Validation accuracy: 58.5%
Minibatch loss at step 450: 1.041650
Minibatch accuracy: 68.8%
Validation accuracy: 61.8%
Minibatch loss at step 500: 1.282104
Minibatch accuracy: 68.8%
Validation accuracy: 68.2%
[output truncated]

After adding dropout, validation accuracy at step 500 fell from 82.3% to 68.2%. Of the models tried so far, the one without dropout that uses max pooling with stride 2 and kernel size 2 gives the best accuracy. Using it as a starting point, it is worth investigating valid padding in place of same padding. Learning rate decay may also help improve accuracy.
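One thing worth checking in the dropout run: `model()` is shared by the training, validation, and test predictions, so `tf.nn.dropout` is also active when validation and test accuracy are computed, and that probably lowers the reported numbers. The sketch below shows the usual pattern in plain Python, under the assumption that dropout should only be applied during training. The function and its `training` flag are hypothetical names, not part of the notebook.

```python
import random


def dropout(activations, keep_prob, training, rng=random):
    """Inverted dropout: drop and rescale units in training, pass through otherwise."""
    if not training:
        # Evaluation: leave the activations untouched.
        return list(activations)
    # Training: keep each unit with probability keep_prob and scale it by
    # 1 / keep_prob so the expected activation stays the same.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


hidden = [0.5, 1.0, 1.5, 2.0]
train_out = dropout(hidden, keep_prob=0.5, training=True, rng=random.Random(0))
eval_out = dropout(hidden, keep_prob=0.5, training=False)
print(train_out)
print(eval_out)
```

In the notebook, one way to get the same effect would be to give `model()` a flag and call `tf.nn.dropout` only when building the training logits.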

I propose the following model to test:
   - Using a model with 2 convolutions and average pooling of stride 2 and kernel size 2
   - Using valid padding
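With valid padding the feature maps shrink at every convolution, so the input size of the fully connected layer has to be recomputed (the `new_size` expression in the next cell). The sketch below walks through that arithmetic, assuming stride-1 convolutions and 2x2 pooling with stride 2, both with `padding='VALID'`. The helper names are hypothetical.

```python
def valid_conv_output_size(input_size, patch_size):
    """Output size of a stride-1 convolution with padding='VALID'."""
    return input_size - patch_size + 1


def pool_output_size(input_size, window=2, stride=2):
    """Output size of a pooling op with padding='VALID'."""
    return (input_size - window) // stride + 1


image_size, patch_size, depth = 28, 5, 16
size = image_size
for _ in range(2):  # two conv + pool stages: 28 -> 24 -> 12 -> 8 -> 4
    size = pool_output_size(valid_conv_output_size(size, patch_size))
flat_features = size * size * depth
print(size, flat_features)  # 4 256
```

The fully connected layer then sees 256 features per image instead of the 784 produced by the same-padding models.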

In [21]:
batch_size = 16
patch_size = 5
depth = 16
num_hidden = 64

graph = tf.Graph()

with graph.as_default():
    
    #Input data
    tf_train_dataset = tf.placeholder(tf.float32, shape=(batch_size, image_size, image_size, num_channels))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)
    
    #Variables
    # Dimensions for conv weights are ==> patch_height * patch_width * # channels * depth
    layer1_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, num_channels, depth], stddev=0.1))
    layer1_biases = tf.Variable(tf.zeros([depth]))
    layer2_weights = tf.Variable(tf.truncated_normal([patch_size, patch_size, depth, depth], stddev=0.1))
    layer2_biases = tf.Variable(tf.constant(1.0, shape=([depth])))
    new_size = ((image_size - patch_size + 1) // 2 - patch_size + 1) // 2
    layer3_weights = tf.Variable(tf.truncated_normal(
        [new_size * new_size * depth, num_hidden], stddev=0.1))
    layer3_biases = tf.Variable(tf.constant(1.0, shape=([num_hidden])))
    layer4_weights = tf.Variable(tf.truncated_normal(
        [num_hidden, num_labels], stddev=0.1))
    layer4_biases = tf.Variable(tf.constant(1.0, shape=([num_labels])))
    
    def model(data):
        # Strides are given per input dimension: [batch, height, width, channels],
        # e.g. [1, 2, 2, 1] steps 2 pixels along height and width
        conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='VALID')
        avg_pool = tf.nn.avg_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
        hidden = tf.nn.relu(avg_pool + layer1_biases)
        
        conv = tf.nn.conv2d(hidden, layer2_weights, [1, 1, 1, 1], padding='VALID')
        avg_pool = tf.nn.avg_pool(conv, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
        hidden = tf.nn.relu(avg_pool + layer2_biases)
        
        shape = hidden.get_shape().as_list()
        # batch_size * # total features <== dimensions of reshape
        reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
        hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
        return tf.matmul(hidden, layer4_weights) + layer4_biases
    
    # Training computation
    logits = model(tf_train_dataset)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))
    
    # Optimizer
    # set arbitrary learning rate
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)
    
    # Predictions for train, validation, and test datasets
    train_predictions = tf.nn.softmax(logits)
    valid_predictions = tf.nn.softmax(model(tf_valid_dataset))
    test_predictions = tf.nn.softmax(model(tf_test_dataset)) 

In [22]:
steps = 20001

with tf.Session(graph=graph) as session:
    tf.global_variables_initializer().run()
    print('Initialized')
    for step in range(steps):
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run([optimizer, loss, train_predictions], feed_dict=feed_dict)
        
        # print results after every 50 steps
        if step % 50 == 0:
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(valid_predictions.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_predictions.eval(), test_labels))

Initialized
Minibatch loss at step 0: 3.460052
Minibatch accuracy: 25.0%
Validation accuracy: 10.0%
Minibatch loss at step 50: 1.826339
Minibatch accuracy: 31.2%
Validation accuracy: 50.8%
Minibatch loss at step 100: 1.170673
Minibatch accuracy: 56.2%
Validation accuracy: 60.9%
Minibatch loss at step 150: 0.936268
Minibatch accuracy: 68.8%
Validation accuracy: 66.2%
Minibatch loss at step 200: 0.863309
Minibatch accuracy: 81.2%
Validation accuracy: 73.0%
Minibatch loss at step 250: 1.081261
Minibatch accuracy: 68.8%
Validation accuracy: 74.6%
Minibatch loss at step 300: 0.591220
Minibatch accuracy: 87.5%
Validation accuracy: 75.6%
Minibatch loss at step 350: 0.678739
Minibatch accuracy: 87.5%
Validation accuracy: 74.0%
Minibatch loss at step 400: 0.352687
Minibatch accuracy: 93.8%
Validation accuracy: 78.2%
Minibatch loss at step 450: 0.825565
Minibatch accuracy: 81.2%
Validation accuracy: 77.2%
Minibatch loss at step 500: 0.820468
Minibatch accuracy: 87.5%
Validation accuracy: 78.1%
[output truncated]

Validation accuracy: 86.1%
Minibatch loss at step 4550: 0.335056
Minibatch accuracy: 87.5%
Validation accuracy: 86.4%
Minibatch loss at step 4600: 0.564620
Minibatch accuracy: 87.5%
Validation accuracy: 86.5%
Minibatch loss at step 4650: 0.953477
Minibatch accuracy: 87.5%
Validation accuracy: 86.3%
Minibatch loss at step 4700: 0.434167
Minibatch accuracy: 81.2%
Validation accuracy: 86.6%
Minibatch loss at step 4750: 0.752873
Minibatch accuracy: 62.5%
Validation accuracy: 86.3%
Minibatch loss at step 4800: 0.443651
Minibatch accuracy: 87.5%
Validation accuracy: 86.4%
Minibatch loss at step 4850: 0.240175
Minibatch accuracy: 93.8%
Validation accuracy: 86.4%
Minibatch loss at step 4900: 0.047569
Minibatch accuracy: 100.0%
Validation accuracy: 86.7%
Minibatch loss at step 4950: 0.320267
Minibatch accuracy: 93.8%
Validation accuracy: 86.5%
Minibatch loss at step 5000: 0.946181
Minibatch accuracy: 68.8%
Validation accuracy: 86.2%
Minibatch loss at step 5050: 0.308478
Minibatch accuracy: 93.8%
[output truncated]

Validation accuracy: 87.5%
Minibatch loss at step 9050: 0.288997
Minibatch accuracy: 93.8%
Validation accuracy: 87.9%
Minibatch loss at step 9100: 0.333830
Minibatch accuracy: 93.8%
Validation accuracy: 87.6%
Minibatch loss at step 9150: 0.717766
Minibatch accuracy: 68.8%
Validation accuracy: 87.4%
Minibatch loss at step 9200: 0.173999
Minibatch accuracy: 93.8%
Validation accuracy: 87.5%
Minibatch loss at step 9250: 0.786008
Minibatch accuracy: 75.0%
Validation accuracy: 87.8%
Minibatch loss at step 9300: 0.834765
Minibatch accuracy: 81.2%
Validation accuracy: 87.7%
Minibatch loss at step 9350: 0.228312
Minibatch accuracy: 87.5%
Validation accuracy: 87.4%
Minibatch loss at step 9400: 0.385215
Minibatch accuracy: 81.2%
Validation accuracy: 88.1%
Minibatch loss at step 9450: 0.372588
Minibatch accuracy: 87.5%
Validation accuracy: 87.8%
Minibatch loss at step 9500: 0.209252
Minibatch accuracy: 93.8%
Validation accuracy: 88.1%
Minibatch loss at step 9550: 0.267848
Minibatch accuracy: 87.5%

Minibatch loss at step 13500: 0.243003
Minibatch accuracy: 100.0%
Validation accuracy: 88.3%
Minibatch loss at step 13550: 0.393419
Minibatch accuracy: 93.8%
Validation accuracy: 88.6%
Minibatch loss at step 13600: 0.288635
Minibatch accuracy: 87.5%
Validation accuracy: 88.5%
Minibatch loss at step 13650: 0.470298
Minibatch accuracy: 81.2%
Validation accuracy: 88.6%
Minibatch loss at step 13700: 0.246978
Minibatch accuracy: 93.8%
Validation accuracy: 88.8%
Minibatch loss at step 13750: 0.749718
Minibatch accuracy: 81.2%
Validation accuracy: 88.6%
Minibatch loss at step 13800: 0.025338
Minibatch accuracy: 100.0%
Validation accuracy: 88.4%
Minibatch loss at step 13850: 0.178233
Minibatch accuracy: 100.0%
Validation accuracy: 88.2%
Minibatch loss at step 13900: 0.168128
Minibatch accuracy: 93.8%
Validation accuracy: 88.7%
Minibatch loss at step 13950: 0.383424
Minibatch accuracy: 87.5%
Validation accuracy: 88.3%
Minibatch loss at step 14000: 0.110274
Minibatch accuracy: 93.8%
[output truncated]

Minibatch loss at step 17950: 0.438410
Minibatch accuracy: 81.2%
Validation accuracy: 89.0%
Minibatch loss at step 18000: 0.400969
Minibatch accuracy: 81.2%
Validation accuracy: 88.7%
Minibatch loss at step 18050: 0.481948
Minibatch accuracy: 87.5%
Validation accuracy: 89.0%
Minibatch loss at step 18100: 0.602974
Minibatch accuracy: 87.5%
Validation accuracy: 88.6%
Minibatch loss at step 18150: 0.174620
Minibatch accuracy: 93.8%
Validation accuracy: 88.8%
Minibatch loss at step 18200: 0.539469
Minibatch accuracy: 93.8%
Validation accuracy: 89.0%
Minibatch loss at step 18250: 0.624055
Minibatch accuracy: 81.2%
Validation accuracy: 89.1%
Minibatch loss at step 18300: 0.195966
Minibatch accuracy: 100.0%
Validation accuracy: 88.5%
Minibatch loss at step 18350: 0.252418
Minibatch accuracy: 93.8%
Validation accuracy: 89.0%
Minibatch loss at step 18400: 0.625089
Minibatch accuracy: 81.2%
Validation accuracy: 88.8%
Minibatch loss at step 18450: 0.255877
Minibatch accuracy: 93.8%
[output truncated]

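Validation accuracy improved considerably, reaching about 89% by step 18,000, compared with 82.3% at step 500 for the max pooling model. Note that this run trains for 20,001 steps rather than 1,001, so some of the gain likely comes from the longer training: at step 500 this model was at 78.1%. We can probably eke out further improvements by adding learning rate decay and perhaps more layers. For now, this model provides sufficient accuracy using convolutions.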
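For reference, here is a minimal sketch of the exponential learning rate schedule mentioned above, written in plain Python. It computes `base_rate * decay_rate ** (step / decay_steps)`, which is the formula TensorFlow 1.x's `tf.train.exponential_decay` uses when `staircase=False`. The `decay_steps=5000` and `decay_rate=0.9` values are illustrative assumptions, not tuned settings.

```python
def exponential_decay(base_rate, step, decay_steps, decay_rate):
    """Continuous exponential decay of the learning rate."""
    return base_rate * decay_rate ** (float(step) / decay_steps)


# Start from the notebook's learning rate of 0.05.
for step in (0, 5000, 10000, 20000):
    rate = exponential_decay(0.05, step, decay_steps=5000, decay_rate=0.9)
    print(step, rate)
```

With these assumed settings the rate drops by 10% every 5,000 steps, so the last steps of a 20,001-step run would use roughly two thirds of the initial rate.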