Ok, let's test this thing and make sure it actually works!

We'll use a modified VGG net on the CIFAR-10 dataset, which consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

For expediency, we'll run this in Keras.  
For reproducibility:
CUDA 8.0
cuDNN 5.1
TensorFlow 1.2.1
Keras 2.0.8
Jupyter 1.0

In [16]:
'''
CIFAR-10 classification - VGG-net adaptation
Original dataset and info: https://www.cs.toronto.edu/~kriz/cifar.html
Code base adapted from Giuseppe Bonaccorso
See: https://www.bonaccorso.eu/2016/08/06/cifar-10-image-classification-with-keras-convnet/ for further information
'''
 
from __future__ import print_function

import numpy as np
import timeit 

from keras.callbacks import EarlyStopping
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D
from keras.optimizers import Adam
from keras.layers.pooling import MaxPooling2D
from keras.utils import to_categorical


<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 5.1.6.2 (Linux)"/>
	<meta name="created" content="2017-10-25T10:19:14.754943301"/>
	<meta name="changed" content="2017-10-25T10:23:54.153389676"/>
	<style type="text/css">
		@page { margin: 0.79in }
		p { margin-bottom: 0.1in; line-height: 120% }
	</style>
</head>
<body lang="en-US" dir="ltr">
<p style="margin-bottom: 0in; line-height: 100%"><font face="DejaVu Sans, sans-serif"><font size="4" style="font-size: 14pt">This
modified VGG has the following structure:</font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%">	<font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>Input
Layer (32x32x3)</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>	Conv-32
3x3 --&gt; Conv-64 3x3 --&gt; Maxpool-64</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>	Conv-128
3x3 --&gt; Maxpool-128</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>	Conv-128
3x3 --&gt; Maxpool-128</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>	FC
1024</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><font color="#0000ff"><font face="Liberation Sans, sans-serif"><b>	FC
10 (Output)</b></font></font></p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%">Dropout is added
after the first and last MaxPool layers and after the first FC layer.  
</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%">For testing
purposes, we will only modify the minibatch size and the dropout values
applied after the first and last MaxPool layers.</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
</body>
</html>

In [17]:
# For reproducibility
np.random.seed(1000)

if __name__ == '__main__':
    # Load the dataset
    (X_train, Y_train), (X_test, Y_test) = cifar10.load_data()

    # Create the model
    model = Sequential()

    # Block 1: two 3x3 convolutions, then 2x2 max pooling
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))  # we will adjust this dropout

    # Blocks 2 and 3: one 3x3 convolution each, then 2x2 max pooling
    model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))  # we will adjust this dropout

    # Classifier: flatten, FC 1024 with dropout, then the 10-way softmax output
    model.add(Flatten())
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))  # this dropout will remain constant
    model.add(Dense(10, activation='softmax'))
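
To see what this layer stack does to the image dimensions, here is a minimal sketch that traces output shapes and trainable-parameter counts by hand. It assumes stride-1 convolutions with 'valid' padding (the Keras default when no `padding` argument is passed, as in the code above) and non-overlapping 2x2 pooling. The helper names (`conv_out`, `pool_out`, `trace`) are made up for illustration and are not part of Keras.

```python
# Sketch: trace output shapes and trainable-parameter counts for the layer
# stack above, assuming 'valid' padding and stride-1 3x3 convolutions.

def conv_out(size, kernel=3):
    """Spatial size after a stride-1 'valid' convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping 'valid' max pooling."""
    return size // pool

# (kind, units): 'conv' = 3x3 convolution, 'pool' = 2x2 max pool, 'fc' = dense
LAYERS = [("conv", 32), ("conv", 64), ("pool", None),
          ("conv", 128), ("pool", None),
          ("conv", 128), ("pool", None),
          ("fc", 1024), ("fc", 10)]

def trace(layers, side=32, depth=3):
    """Return a list of (kind, output_shape, n_params) tuples."""
    rows = []
    flat = None  # set when we reach the first fully-connected layer
    for kind, units in layers:
        if kind == "conv":
            params = (3 * 3 * depth + 1) * units  # weights + one bias per filter
            side, depth = conv_out(side), units
            rows.append((kind, (side, side, depth), params))
        elif kind == "pool":
            side = pool_out(side)
            rows.append((kind, (side, side, depth), 0))  # pooling has no weights
        else:
            if flat is None:
                flat = side * side * depth  # length of the flattened feature map
            params = (flat + 1) * units  # weights + one bias per unit
            flat = units
            rows.append((kind, (units,), params))
    return rows

rows = trace(LAYERS)
for kind, shape, params in rows:
    print("%-4s %-14s %8d" % (kind, shape, params))
print("total params:", sum(p for _, _, p in rows))
```

Under these assumptions each convolution trims two pixels from each spatial dimension, so the feature map shrinks from 32x32 to 2x2 before the flatten. Running `model.summary()` on the real model is the authoritative check.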



<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 5.1.6.2 (Linux)"/>
	<meta name="created" content="2017-10-25T10:19:14.754943301"/>
	<meta name="changed" content="2017-10-25T10:44:21.236761251"/>
	<style type="text/css">
		@page { margin: 0.79in }
		p { margin-bottom: 0.1in; line-height: 120% }
		a:link { so-language: zxx }
	</style>
</head>
<body lang="en-US" dir="ltr">
<p style="margin-bottom: 0in; line-height: 100%"><font face="DejaVu Sans, sans-serif">The
following is a spreadsheet to estimate the memory requirements and
parameters involved in the <b>FORWARD PASS</b> of the convolutional
network.  Length, width and depth (the Pixel_L, Pixel_W and Layers
columns) describe each layer's output; colour images are RGB, so the
input has a depth of 3.  Sizes in KB are given for one image and each
subsequent layer (minibatch of 1).  Memory sizes in MB for batches of
128 and 1024 are given for comparison.</font></p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><font face="DejaVu Sans, sans-serif">These
memory requirements are easily handled by our 11GB GPU, given
the small starting size of the images and the limited number of
layers in the convolutional network.  However, with larger images
memory requirements increase quickly, and the batch size must be
chosen so as not to exceed available GPU memory.  </font>
</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><font face="DejaVu Sans, sans-serif">Note
that the largest actual memory use occurs early on, in the first
convolutional block (the Conv-64 layer), while the greatest parameter
use is at the 1024-unit fully-connected layer.  </font>
</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><font face="DejaVu Sans, sans-serif">For
this example, using a batch size of 1024, the total memory used by the
network is 166MB for the forward pass. <b>We also need to account for
the backward pass</b>, which we estimate at between 1x and 2x the
forward pass.  Taking the upper bound: 166 x 3 = 498MB, or about 0.5GB.  </font>
</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
<p style="margin-bottom: 0in; line-height: 100%"><br/>

</p>
</body>
</html>



<html>
<head>
	
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 5.1.6.2 (Linux)"/>
	<meta name="created" content="2017-10-25T10:13:33.985785587"/>
	<meta name="changed" content="2017-10-25T10:15:54.715163964"/>
	
	<style type="text/css">
		body,div,table,thead,tbody,tfoot,tr,th,td,p { font-family:"Liberation Sans"; font-size:x-small }
		a.comment-indicator:hover + comment { background:#ffd; position:absolute; display:block; border:1px solid black; padding:0.5em;  } 
		a.comment-indicator { background:red; display:inline-block; border:1px solid black; width:0.5em; height:0.5em;  } 
		comment { display:none;  } 
	</style>
	
</head>

<body>
<table cellspacing="0" border="0">
	<colgroup span="10" width="85"></colgroup>
	<tr>
		<td height="17" align="left"><br></td>
		<td align="left">Pixel_L</td>
		<td align="left">Pixel_W</td>
		<td align="left">Layers</td>
		<td align="left">Size in Kb</td>
		<td align="left">Params</td>
		<td align="left">Batch-128 Mb</td>
		<td align="left">Params-128</td>
		<td align="left">Batch-1024 Mb</td>
		<td align="left">Params-1024</td>
	</tr>
	<tr>
		<td height="17" align="left">INPUT</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="3" sdnum="1033;">3</td>
		<td align="right" sdval="3" sdnum="1033;">3</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="0.375" sdnum="1033;">0.375</td>
		<td align="right" sdval="128" sdnum="1033;">128</td>
		<td align="right" sdval="3" sdnum="1033;">3</td>
		<td align="right" sdval="1024" sdnum="1033;">1024</td>
	</tr>
	<tr>
		<td height="17" align="left">Conv-32, 3x3</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="864" sdnum="1033;">864</td>
		<td align="right" sdval="4" sdnum="1033;">4</td>
		<td align="right" sdval="110592" sdnum="1033;">110592</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="884736" sdnum="1033;">884736</td>
	</tr>
	<tr>
		<td height="17" align="left">Conv-64, 3x3</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="64" sdnum="1033;">64</td>
		<td align="right" sdval="64" sdnum="1033;">64</td>
		<td align="right" sdval="18432" sdnum="1033;">18432</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="2359296" sdnum="1033;">2359296</td>
		<td align="right" bgcolor="#FFCC00" sdval="64" sdnum="1033;">64</td>
		<td align="right" sdval="18874368" sdnum="1033;">18874368</td>
	</tr>
	<tr>
		<td height="17" align="left">Maxpool 64</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="64" sdnum="1033;">64</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="36864" sdnum="1033;">36864</td>
		<td align="right" sdval="2" sdnum="1033;">2</td>
		<td align="right" sdval="4718592" sdnum="1033;">4718592</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="37748736" sdnum="1033;">37748736</td>
	</tr>
	<tr>
		<td height="17" align="left">Conv-128, 3x3</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="16" sdnum="1033;">16</td>
		<td align="right" sdval="128" sdnum="1033;">128</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="73728" sdnum="1033;">73728</td>
		<td align="right" sdval="4" sdnum="1033;">4</td>
		<td align="right" sdval="9437184" sdnum="1033;">9437184</td>
		<td align="right" sdval="32" sdnum="1033;">32</td>
		<td align="right" sdval="75497472" sdnum="1033;">75497472</td>
	</tr>
	<tr>
		<td height="17" align="left">Maxpool 128</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="128" sdnum="1033;">128</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="147456" sdnum="1033;">147456</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="18874368" sdnum="1033;">18874368</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="150994944" sdnum="1033;">150994944</td>
	</tr>
	<tr>
		<td height="17" align="left">Conv-128, 3x3</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="128" sdnum="1033;">128</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="147456" sdnum="1033;">147456</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="18874368" sdnum="1033;">18874368</td>
		<td align="right" sdval="8" sdnum="1033;">8</td>
		<td align="right" sdval="150994944" sdnum="1033;">150994944</td>
	</tr>
	<tr>
		<td height="17" align="left">Maxpool 128</td>
		<td align="right" sdval="4" sdnum="1033;">4</td>
		<td align="right" sdval="4" sdnum="1033;">4</td>
		<td align="right" sdval="128" sdnum="1033;">128</td>
		<td align="right" sdval="2" sdnum="1033;">2</td>
		<td align="right" sdval="147456" sdnum="1033;">147456</td>
		<td align="right" sdval="0.25" sdnum="1033;">0.25</td>
		<td align="right" sdval="18874368" sdnum="1033;">18874368</td>
		<td align="right" sdval="2" sdnum="1033;">2</td>
		<td align="right" sdval="150994944" sdnum="1033;">150994944</td>
	</tr>
	<tr>
		<td height="17" align="left">FC Dense 1024 </td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="1024" sdnum="1033;">1024</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="1179648" sdnum="1033;">1179648</td>
		<td align="right" sdval="0.125" sdnum="1033;">0.125</td>
		<td align="right" sdval="150994944" sdnum="1033;">150994944</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" bgcolor="#FFCC00" sdval="1207959552" sdnum="1033;">1207959552</td>
	</tr>
	<tr>
		<td height="17" align="left">FC Dense 10  </td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="1" sdnum="1033;">1</td>
		<td align="right" sdval="10" sdnum="1033;">10</td>
		<td align="right" sdval="0" sdnum="1033;">0</td>
		<td align="right" sdval="92160" sdnum="1033;">92160</td>
		<td align="right" sdval="0" sdnum="1033;">0</td>
		<td align="right" sdval="11796480" sdnum="1033;">11796480</td>
		<td align="right" sdval="0" sdnum="1033;">0</td>
		<td align="right" sdval="94371840" sdnum="1033;">94371840</td>
	</tr>
	<tr>
		<td height="17" align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
	</tr>
	<tr>
		<td height="17" align="left">Totals</td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="left"><br></td>
		<td align="right" sdval="166" sdnum="1033;">166</td>
		<td align="right" sdval="1844065" sdnum="1033;">1844065</td>
		<td align="right" sdval="20.75" sdnum="1033;">20.75</td>
		<td align="right" sdval="236040320" sdnum="1033;">236040320</td>
		<td align="right" sdval="166" sdnum="1033;">166</td>
		<td align="right" sdval="1888322560" sdnum="1033;">1888322560</td>
	</tr>
</table>
<!-- ************************************************************************** -->
</body>

</html>
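
The spreadsheet arithmetic can be reproduced in a few lines. This is a sketch under the spreadsheet's own conventions as far as they can be read from the table: the layer dimensions are copied from it, and one byte per value is assumed (inferred from 32 x 32 x 3 = 3072 values being listed as 3 KB). The function names and the `bytes_per_value` and `backward_factor` parameters are illustrative, not from the original.

```python
# Sketch: forward-pass activation memory, following the table's conventions.

# (name, length, width, depth) copied from the table
TABLE_LAYERS = [
    ("INPUT", 32, 32, 3),
    ("Conv-32, 3x3", 32, 32, 32),
    ("Conv-64, 3x3", 32, 32, 64),
    ("Maxpool 64", 16, 16, 64),
    ("Conv-128, 3x3", 16, 16, 128),
    ("Maxpool 128", 8, 8, 128),
    ("Conv-128, 3x3", 8, 8, 128),
    ("Maxpool 128", 4, 4, 128),
    ("FC Dense 1024", 1, 1, 1024),
    ("FC Dense 10", 1, 1, 10),
]

def forward_kb(layers, bytes_per_value=1):
    """Total activation size for a single image, in KB."""
    values = sum(l * w * d for _, l, w, d in layers)
    return values * bytes_per_value / 1024.0

def training_mb(layers, batch_size, backward_factor=2, bytes_per_value=1):
    """Forward-pass MB for a batch, plus backward_factor times that again."""
    forward_mb = forward_kb(layers, bytes_per_value) * batch_size / 1024.0
    return forward_mb * (1 + backward_factor)

print("forward, 1 image (KB): %.1f" % forward_kb(TABLE_LAYERS))
print("forward + backward, batch 1024 (MB): %.1f" % training_mb(TABLE_LAYERS, 1024))
# Assumption: activations stored as 32-bit floats (Keras's default float type)
print("same, at 4 bytes per value (MB): %.1f"
      % training_mb(TABLE_LAYERS, 1024, bytes_per_value=4))
```

If activations are held as 32-bit floats, the estimate is four times larger, which still fits comfortably in 11GB for this network.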


In [18]:
# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.0001, decay=1e-6),
              metrics=['accuracy'])
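
To get a feel for what `decay=1e-6` does to the learning rate, here is a small sketch. It assumes the time-based schedule used by the Keras 2.x optimizers, lr / (1 + decay * iterations), with one iteration per minibatch update. The helper names are hypothetical.

```python
import math

def decayed_lr(base_lr, decay, iterations):
    """Time-based decay (assumed schedule): base_lr / (1 + decay * iterations)."""
    return base_lr / (1.0 + decay * iterations)

def steps_per_epoch(num_samples, batch_size):
    """Number of minibatch updates in one pass over the data."""
    return int(math.ceil(num_samples / float(batch_size)))

steps = steps_per_epoch(50000, 256)  # CIFAR-10 training set, batch size 256
for epoch in (1, 10, 74, 250):
    print("epoch %3d: lr = %.3e" % (epoch, decayed_lr(1e-4, 1e-6, epoch * steps)))
```

With a batch size of 256 there are 196 updates per epoch, so even after the full 250 epochs the learning rate would have dropped by just under 5%. At these settings the decay is a very gentle adjustment.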

In [19]:
# Train the model, timing the whole fit
tic = timeit.default_timer()
model.fit(X_train / 255.0, to_categorical(Y_train),
          batch_size=256,  # we will adjust this batch size
          shuffle=True,
          epochs=250,
          validation_data=(X_test / 255.0, to_categorical(Y_test)),
          # stop once val_loss has failed to improve by 0.001 for 3 epochs
          callbacks=[EarlyStopping(min_delta=0.001, patience=3)])
toc = timeit.default_timer()


Train on 50000 samples, validate on 10000 samples
Epoch 1/250
...
Epoch 74/2
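
Training stops well short of 250 epochs because of the `EarlyStopping(min_delta=0.001, patience=3)` callback. The sketch below is a simplified model of that rule, not Keras's actual implementation (the exact epoch at which Keras stops can differ slightly between versions). It treats an epoch as an improvement only if the validation loss beats the best value so far by more than `min_delta`, and stops after `patience` epochs in a row without one. The function name and the example loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive epochs without a sufficient improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: tiny gains below min_delta do not count
losses = [1.0, 0.9, 0.8995, 0.8999, 0.8998, 0.7]
print(early_stopping_epoch(losses))
```

With a patience of only 3, a short plateau in validation loss is enough to end the run, which is worth keeping in mind when comparing runs with different dropout and batch-size settings.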

In [20]:
# Evaluate the model
scores = model.evaluate(X_test / 255.0, to_categorical(Y_test))
secs = (toc-tic)
print('Loss: %.3f' % scores[0])
print('Accuracy: %.3f' % scores[1])
print('Time: %.1f' % secs)

Accuracy: 0.776
Time: 395.2



<html>
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
	<title></title>
	<meta name="generator" content="LibreOffice 5.1.6.2 (Linux)"/>
	<meta name="created" content="2017-10-25T10:45:48.760074450"/>
	<meta name="changed" content="2017-10-25T10:52:58.334318219"/>
</head>
<body lang="en-US" dir="ltr">
<p><font face="DejaVu Sans, sans-serif">So, it works.</font></p>
<p><font face="DejaVu Sans, sans-serif">The best results achieved over
about 60 runs were an accuracy of 0.806 and a loss of 0.567, taking about
400 seconds (6 ½ minutes).</font></p>
<p><font face="DejaVu Sans, sans-serif">I may post graphs of
variation in dropout rate vs batch size, etc., once I have finished
writing the configuration posts.</font></p>
<p><font face="DejaVu Sans, sans-serif">Sorry about the delay on
those, for those of you who are following along: the nearly finished post
was lost and I’m redoing it.</font></p>
</body>
</html>