

This is an implementation of the GoogLeNet model for image classification described in Szegedy et al. 2014.

The model presented here does not include any of the Local Response Normalization layers that were used in the published implementation.

Model script

The model run script is included here

Trained weights

The trained weights file can be downloaded from AWS using the following link: trained googlenet model weights.


This model achieves 64% top-1 and 85.5% top-5 accuracy on the validation data set.
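
For readers unfamiliar with the metric, the sketch below shows one way top-k accuracy can be computed: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name `topk_accuracy` and the toy scores are illustrative assumptions, not part of this repository or of neon.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy data: 4 samples, 6 classes (made-up values, not model output).
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],
    [0.30, 0.10, 0.40, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.20, 0.25, 0.35],
    [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],
]
labels = [1, 0, 3, 5]

top1 = topk_accuracy(scores, labels, k=1)
top3 = topk_accuracy(scores, labels, k=3)
print(top1, top3)
```

For the ImageNet figures quoted above, k would be 1 and 5 over the 1000 ILSVRC2012 classes.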

During training, the images were randomly cropped and flipped horizontally, but scale jittering and colorspace noise addition were not implemented.
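
As a rough illustration of the augmentation described above, the following sketch takes a random crop and mirrors it left to right half of the time. It is a minimal stand-alone example using nested lists; the function name `random_crop_flip`, the 50% flip probability, and the toy image size are assumptions made for illustration and do not reflect how neon's data loader implements this.

```python
import random


def random_crop_flip(image, crop_h, crop_w, rng=random):
    """Return a random crop_h x crop_w crop of `image`, mirrored left-right half the time.

    `image` is a list of rows; each row is a list of pixels (any pixel type).
    """
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)   # randint bounds are inclusive
    left = rng.randint(0, width - crop_w)
    crop = [row[left:left + crop_w] for row in image[top:top + crop_h]]
    if rng.random() < 0.5:
        crop = [row[::-1] for row in crop]  # horizontal flip
    return crop


# Toy 6x8 "image" whose pixels record their own (row, column) position.
rng = random.Random(0)
image = [[(r, c) for c in range(8)] for r in range(6)]
patch = random_crop_flip(image, crop_h=4, crop_w=5, rng=rng)
```

In the setting of this README, the crop would be 224x224 with 3 color channels per pixel.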


To run the model, the ImageNet data set must first be uploaded and converted to the format compatible with neon (see instructions). Note that there have been some changes to the format of the mean data subtraction; users with the old format may be prompted to run an update script before proceeding.

This script works with the neon commit SHA 66846b409. Make sure that your local repo is synced to this commit and run the installation procedure before proceeding.

If neon is installed into a virtualenv, make sure that it is activated before running the commands below. The commands below use the GPU backend by default, so add -b cpu if you are running on a system without a compatible GPU.

To test the model performance on the validation data set and benchmark the run times, use the following command:

python -w path/to/dataset/batches --model_file googlenet.p

Additional options are available to add features like saving checkpoints and displaying logging information; use the --help option for details. For information on generating the ILSVRC2012 data set macrobatches, check out the neon documentation page.


Training this model requires some neon features that will be released soon. These scripts will be updated to include the training procedure as soon as possible.


Machine and GPU specs:

Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
Ubuntu 14.04
CUDA Driver Version 7.0

The run times for the fprop and bprop passes and the parameter update are given in the table below. The iteration row is the combined run time for all functions in a training iteration. These results are per minibatch, with each minibatch consisting of 128 images of shape 224x224x3. The model was run 12 times; the first two passes were ignored and the last 10 were used to get the benchmark results.

|  Function   |  Mean time   |
|-------------|--------------|
| fprop       |   116 msec   |
| bprop       |   261 msec   |
| update      |    45 msec   |
| iteration   |   424 msec   |
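
The benchmarking procedure described above (12 runs, the first two discarded as warm-up, the remaining 10 averaged) can be sketched as follows. This is a minimal illustration rather than the timing code used by neon: `benchmark` and `dummy_step` are hypothetical names, and `dummy_step` is a CPU stand-in for a real training iteration. Timing real GPU work would also require synchronizing with the device before reading the clock, which this sketch omits.

```python
import time
from statistics import mean


def benchmark(fn, total_runs=12, warmup_runs=2):
    """Time `fn` total_runs times; average all but the first warmup_runs (in msec)."""
    times_ms = []
    for _ in range(total_runs):
        start = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - start) * 1000.0)
    kept = times_ms[warmup_runs:]  # drop the warm-up passes
    return mean(kept), len(kept)


def dummy_step():
    # Placeholder workload standing in for fprop + bprop + update.
    sum(i * i for i in range(10000))


mean_ms, n_kept = benchmark(dummy_step)
print("mean over %d runs: %.3f msec" % (n_kept, mean_ms))

# Throughput implied by the table: 128 images per 424 msec iteration.
batch_size = 128
iteration_ms = 424
images_per_sec = batch_size / (iteration_ms / 1000.0)  # roughly 302 images/sec
```

Discarding the first passes keeps one-time costs such as memory allocation and kernel compilation out of the reported mean.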


Going deeper with convolutions
Szegedy, Christian; Liu, Wei; Jia, Yangqing; Sermanet, Pierre; Reed, Scott; Anguelov, Dragomir;
Erhan, Dumitru; Vanhoucke, Vincent; Rabinovich, Andrew