Using TensorFlow and a Support Vector Machine to Create an Image Classification Engine


This example uses a subset of the Caltech image set covering 48 labels, limited to between 40 and 80 images per label. The images are fed to a TensorFlow implementation of Inception V3 with the classification layer removed, which produces a set of labelled feature vectors.
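As a rough illustration of the feature-extraction step, the sketch below loads a frozen Inception graph and reads the activations of the layer just before the classifier. It is not the project's actual code: the tensor names `DecodeJpeg/contents:0` and `pool_3:0` are assumptions based on the commonly distributed classify_image_graph_def.pb, and the function name `extract_features` is hypothetical. It uses the `tf.compat.v1` API so that it should also load under TensorFlow 2.x.

```python
import numpy as np

# Assumed tensor names for the classic frozen Inception graph; verify them
# against your own copy of classify_image_graph_def.pb.
INPUT_TENSOR = "DecodeJpeg/contents:0"  # takes raw JPEG bytes
FEATURE_TENSOR = "pool_3:0"             # 2048-d activations before the classifier


def extract_features(image_paths, graph_path="imagenet/classify_image_graph_def.pb"):
    """Return one 2048-d feature vector per image path (hypothetical helper)."""
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf
    tf1 = tf.compat.v1

    # Parse the frozen graph definition from disk.
    graph_def = tf1.GraphDef()
    with open(graph_path, "rb") as f:
        graph_def.ParseFromString(f.read())

    features = []
    with tf.Graph().as_default() as graph:
        tf1.import_graph_def(graph_def, name="")
        with tf1.Session(graph=graph) as sess:
            feature_tensor = graph.get_tensor_by_name(FEATURE_TENSOR)
            for path in image_paths:
                with open(path, "rb") as f:
                    jpeg_bytes = f.read()
                activations = sess.run(feature_tensor, {INPUT_TENSOR: jpeg_bytes})
                # Drop the singleton batch and spatial dimensions.
                features.append(np.squeeze(activations))
    return np.array(features)
```

Calling `extract_features(["some_image.jpg"])` would then be expected to return an array with one row per image, provided the graph file and tensor names match.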

Dimensionality reduction is carried out on the 2048-d features using t-distributed stochastic neighbor embedding (t-SNE), which transforms them into 2-d features that are easy to visualize. Note that t-SNE is used only as an informative step: if points with the same color/label are mostly clustered together, there is a high chance that the features can be used to train a classifier with high accuracy.
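A minimal sketch of the t-SNE step using scikit-learn is shown here. The features are synthetic stand-ins (three made-up labels with 30 samples each) rather than real Inception outputs, and the `perplexity` value is an arbitrary choice for such a small sample.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for labelled 2048-d features: 3 labels x 30 samples,
# each label scattered around its own random centre.
n_labels, per_label, n_dims = 3, 30, 2048
centers = rng.normal(scale=5.0, size=(n_labels, n_dims))
labels = np.repeat(np.arange(n_labels), per_label)
features = centers[labels] + rng.normal(size=(labels.size, n_dims))

# Reduce to 2-d for plotting; perplexity must be smaller than the sample count.
embedding = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(features)
print(embedding.shape)
```

The two columns of `embedding` can be passed to a matplotlib scatter plot with `labels` as the color argument to check whether same-label points cluster together.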

The 2048-d labelled features are presented to a number of classifiers. Initially the project was to train a Support Vector Machine to classify images; for comparison, however, this has been extended to the following:

  • Support Vector Machine (SVM)
  • Extra Trees (ET)
  • Random Forest (RF)
  • K-Nearest Neighbor (KNN)
  • Multi-Layer Perceptron (MLP)
  • Gaussian Naive Bayes (GNB)
  • Linear Discriminant Analysis (LDA)
  • Quadratic Discriminant Analysis (QDA)

The training and validation time and the accuracy of each classifier are displayed. Most classifiers were run with their default tuning values; however, tuning was carried out, where possible, on those classifiers that fell well below 90% accuracy with their defaults, such as Extra Trees and Random Forest (initially in the 75 - 78% region).
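The comparison can be sketched with scikit-learn along the following lines. This is an illustration under stated assumptions, not the project's script: the data is synthetic (generated with `make_classification`), and the `n_estimators` and `max_iter` values are arbitrary examples rather than the tuning actually used.

```python
import time

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the labelled feature vectors.
X, y = make_classification(n_samples=400, n_features=32, n_informative=8,
                           n_redundant=0, n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Mostly default settings; n_estimators and max_iter are illustrative values.
classifiers = {
    "SVM": SVC(),
    "ET": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
    "GNB": GaussianNB(),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
}

results = {}
for name, clf in classifiers.items():
    start = time.perf_counter()
    clf.fit(X_train, y_train)              # training
    accuracy = clf.score(X_test, y_test)   # validation on held-out data
    elapsed = time.perf_counter() - start
    results[name] = (elapsed, accuracy)
    print(f"{name}: {elapsed:.2f} sec, {accuracy:.1%}")
```

Timing the fit and the scoring together mirrors the combined training/validation time reported in this README; the two could be timed separately if a breakdown is needed.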

A summary of the results is as follows (training/test time, accuracy):

  • SVM: 6.77 sec, 96.9%
  • ET: 1.52 sec, 93.2%
  • RF: 16.47 sec, 90.8%
  • KNN: 2.2 sec, 91.5%
  • MLP: 13.83 sec, 97.1%
  • GNB: 1.1 sec, 91.8%
  • LDA: 4.95 sec, 91.0%
  • QDA: 0.84 sec, 5.3% (Variables are collinear warning!)
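One likely reason for the collinearity warning on QDA (offered as an interpretation, not something verified against this project's data) is that QDA estimates a separate covariance matrix per label, and with at most 80 images per label the 2048 x 2048 estimate cannot be full rank. The sketch below checks this with random data of the same shape.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same shape as one label's worth of data: at most 80 samples, 2048 features.
n_samples, n_features = 80, 2048
X = rng.normal(size=(n_samples, n_features))

# The sample covariance is Xc.T @ Xc / (n_samples - 1), so it has the same
# rank as the centred data matrix, which is at most n_samples - 1.
X_centred = X - X.mean(axis=0)
rank = np.linalg.matrix_rank(X_centred)
print(rank, n_features)
```

Because the rank is far below the number of features, the per-label covariance is singular, which is consistent with QDA reporting collinear variables.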

Note that these results vary between runs, and are just representative.

Quick Start

  1. Unzip the curated image set. You should then have a directory called caltech_101_images in the same directory as

  2. The imagenet directory already has classify_image_graph_def.pb. If I've removed it to save space on my GitHub account, download it, unzip it, and place classify_image_graph_def.pb in a directory called 'imagenet'.

  3. Run using Python 3. The following packages are required: tensorflow, sklearn (scikit-learn), numpy, matplotlib. Run time (from scratch) was about 28 minutes on my dual-core i7 Skylake laptop.


caltech t-SNE plot

Support Vector Machine

caltech SVM confusion matrix
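For reference, a confusion matrix like the ones shown in this README can be computed with scikit-learn roughly as follows. The data here is synthetic and the three classes are made up; only the calls themselves carry over.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the labelled feature vectors.
X, y = make_classification(n_samples=300, n_features=32, n_informative=8,
                           n_redundant=0, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = SVC().fit(X_train, y_train)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, clf.predict(X_test))

# Correct predictions sit on the diagonal, so accuracy is trace / total.
accuracy = np.trace(cm) / cm.sum()
print(cm)
```

Off-diagonal entries show which labels are confused with each other, which the single accuracy figure hides.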

Extra Trees

caltech ET confusion matrix

Random Forest

caltech RF confusion matrix

K-Nearest Neighbor

caltech KNN confusion matrix

Multi-Layer Perceptron

caltech MLP confusion matrix

Gaussian Naive Bayes

caltech GNB confusion matrix

Linear Discriminant Analysis

caltech LDA confusion matrix

Quadratic Discriminant Analysis

caltech QDA confusion matrix
