Image Classification

Michael Mayer edited this page Dec 16, 2018 · 12 revisions

Image classification is performed using a pre-trained model (Inception V1). To get a basic understanding, you can read Image Classification using Deep Neural Networks — A beginner friendly approach using TensorFlow.

We might want to use Inception-ResNet-v2 instead of Inception V1, see Google AI Blog: Improving Inception and Image Classification in TensorFlow and CNN Architectures: LeNet, AlexNet, VGG, GoogLeNet, ResNet and more.

Natural Language Processing

As an addition step search should also use natural language processing to match search terms with similar tags/labels from the model. This is not implemented. We could use chewxy/lingo for this.

Pre-trained Models


Neural nets work best when they have many parameters, making them powerful function approximators. However, this means they must be trained on very large datasets. Because training models from scratch can be a very computationally intensive process requiring days or even weeks, we provide various pre-trained models, as listed below. These CNNs have been trained on the ILSVRC-2012-CLS image classification dataset.

In the table below, we list each model, the corresponding TensorFlow model file, the link to the model checkpoint, and the top 1 and top 5 accuracy (on the imagenet test set). Note that the VGG and ResNet V1 parameters have been converted from their original caffe formats (here and here), whereas the Inception and ResNet V2 parameters have been trained internally at Google. Also be aware that these accuracies were computed by evaluating using a single image crop. Some academic papers report higher accuracy by using multiple crops at multiple scales.

Model TF-Slim File Checkpoint Top-1 Accuracy Top-5 Accuracy
Inception V1 Code inception_v1_2016_08_28.tar.gz 69.8 89.6
Inception V2 Code inception_v2_2016_08_28.tar.gz 73.9 91.8
Inception V3 Code inception_v3_2016_08_28.tar.gz 78.0 93.9
Inception V4 Code inception_v4_2016_09_09.tar.gz 80.2 95.2
Inception-ResNet-v2 Code inception_resnet_v2_2016_08_30.tar.gz 80.4 95.3
ResNet V1 50 Code resnet_v1_50_2016_08_28.tar.gz 75.2 92.2
ResNet V1 101 Code resnet_v1_101_2016_08_28.tar.gz 76.4 92.9
ResNet V1 152 Code resnet_v1_152_2016_08_28.tar.gz 76.8 93.2
ResNet V2 50^ Code resnet_v2_50_2017_04_14.tar.gz 75.6 92.8
ResNet V2 101^ Code resnet_v2_101_2017_04_14.tar.gz 77.0 93.7
ResNet V2 152^ Code resnet_v2_152_2017_04_14.tar.gz 77.8 94.1
ResNet V2 200 Code TBA 79.9* 95.2*
VGG 16 Code vgg_16_2016_08_28.tar.gz 71.5 89.8
VGG 19 Code vgg_19_2016_08_28.tar.gz 71.1 89.8
MobileNet_v1_1.0_224 Code mobilenet_v1_1.0_224.tgz 70.9 89.9
MobileNet_v1_0.50_160 Code mobilenet_v1_0.50_160.tgz 59.1 81.9
MobileNet_v1_0.25_128 Code mobilenet_v1_0.25_128.tgz 41.5 66.3
MobileNet_v2_1.4_224^* Code mobilenet_v2_1.4_224.tgz 74.9 92.5
MobileNet_v2_1.0_224^* Code mobilenet_v2_1.0_224.tgz 71.9 91.0
NASNet-A_Mobile_224# Code nasnet-a_mobile_04_10_2017.tar.gz 74.0 91.6
NASNet-A_Large_331# Code nasnet-a_large_04_10_2017.tar.gz 82.7 96.2
PNASNet-5_Large_331 Code pnasnet-5_large_2017_12_13.tar.gz 82.9 96.2
PNASNet-5_Mobile_224 Code pnasnet-5_mobile_2017_12_13.tar.gz 74.2 91.9

^ ResNet V2 models use Inception pre-processing and input image size of 299 (use --preprocessing_name inception --eval_image_size 299 when using Performance numbers for ResNet V2 models are reported on the ImageNet validation set.

(#) More information and details about the NASNet architectures are available at this README

All 16 float MobileNet V1 models reported in the MobileNet Paper and all 16 quantized TensorFlow Lite compatible MobileNet V1 models can be found here.

(^#) More details on MobileNetV2 models can be found here.

(*): Results quoted from the paper.

Here is an example of how to download the Inception V3 checkpoint:

$ CHECKPOINT_DIR=/tmp/checkpoints
$ wget
$ tar -xvf inception_v3_2016_08_28.tar.gz
$ mv inception_v3.ckpt ${CHECKPOINT_DIR}
$ rm inception_v3_2016_08_28.tar.gz

Types of neural networks



Back to Metadata - see also: Color Extraction, Face Recognition, Geocoding

You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.