
Minimal Bag of Visual Words Image Classifier

Implementation of a content-based image classifier using the bag-of-visual-words model in Python.

As the name suggests, this is only a minimal example to illustrate the general workings of such a system. The code is not optimized for speed, memory consumption, or recognition performance. For a more advanced system (as of 2008), check:

If you need state-of-the-art results for image classification, check out Keras.

The approach consists of two major steps, learning and classifying, each implemented in its own script.

The learning script generates a visual vocabulary and trains a classifier from a user-provided set of already classified images. After the learning phase, the classification script uses the generated vocabulary and the trained classifier to predict the class of any image the user passes to it.

The learning consists of:

  1. Extracting local features of all the dataset images
  2. Generating a codebook of visual words with clustering of the features
  3. Aggregating the histograms of the visual words for each of the training images
  4. Feeding the histograms to the classifier to train a model
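
Step 2 of the list above (codebook generation) can be sketched in a few lines. This is a minimal illustration of plain Lloyd's k-means in pure Python, not the code used by this repository; the function names `build_codebook` and `nearest_word` are hypothetical, and the toy 2-D points stand in for real SIFT descriptors, which have 128 dimensions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centres (the visual words)."""
    rng = random.Random(seed)
    # Initialise the centres with k randomly chosen descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        # Assignment step: group descriptors by their nearest centre.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        # Update step: move each centre to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


# Toy 2-D "descriptors" forming two well-separated groups (an assumption
# made for illustration only).
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
codebook = build_codebook(toy, k=2)
```

On this toy data the two centres settle on the means of the two groups. In practice the descriptors of all training images are pooled before clustering, and `k` (the vocabulary size) is a tuning parameter.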

The classification consists of:

  1. Extracting local features of the image to be classified
  2. Aggregating the histogram of the visual words for the image using the previously generated codebook
  3. Feeding the histogram to the classifier to predict a class for the image
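
The histogram step can be illustrated as follows. This is a hedged sketch, not this repository's implementation: `word_histogram` is a hypothetical helper, the codebook and descriptors are made-up 2-D values, and the L1 normalisation is an assumption (it makes images with different numbers of features comparable).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def word_histogram(descriptors, codebook):
    """Count descriptors per nearest visual word, then L1-normalise."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(d, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    # An image without any features yields an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook with three visual words and four image descriptors.
codebook = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
image_descriptors = [(0.1, 0.2), (4.8, 5.1), (5.2, 4.9), (0.3, -0.1)]
histogram = word_histogram(image_descriptors, codebook)
# histogram == [0.5, 0.5, 0.0]
```

The resulting vector always has one entry per visual word, so every image is described by a fixed-length feature vector regardless of how many local features it contains. That fixed length is what allows the histogram to be passed to the classifier.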

This code relies on:

  • SIFT features for local features
  • k-means for generation of the words via clustering
  • SVM as classifier using the LIBSVM library
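
LIBSVM's command-line tools read training data as sparse text lines of the form `<label> <index>:<value> ...`, with feature indices starting at 1 and in ascending order. The sketch below shows one way a visual-word histogram could be written in that format; `to_libsvm_line` is a hypothetical helper for illustration, not necessarily how this repository writes its data file.

```python
def to_libsvm_line(label, histogram):
    """Format one sample as a LIBSVM text line, omitting zero entries."""
    features = " ".join(
        f"{index}:{value:g}"  # :g keeps at most 6 significant digits
        for index, value in enumerate(histogram, start=1)
        if value != 0
    )
    return f"{label} {features}".rstrip()


line = to_libsvm_line(2, [0.5, 0.0, 0.25, 0.25])
# line == "2 1:0.5 3:0.25 4:0.25"
```

Omitting the zero entries keeps the file small, which matters because most visual words do not occur in any given image.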

Example use:

You train the classifier for a specific dataset with:

python -d path_to_folders_with_images

To classify images use:

python -c path_to_folders_with_images/codebook.file -m path_to_folders_with_images/trainingdata.svm.model images_you_want_to_classify

The dataset should have the following structure, where all the images belonging to one class are in the same folder:

|-- path_to_folders_with_images
|    |-- class1
|    |-- class2
|    |-- class3
|    └-- classN

The class folders can have any name. One example dataset would be the Caltech 101 dataset.
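
A folder layout like this can be turned into labelled training data by walking the directory tree. The following is a minimal sketch under the assumption that every sub-folder is one class and every file inside it is an image; `list_dataset` and `class_labels` are hypothetical names, and the demo builds a throwaway directory with empty files instead of real images.

```python
import tempfile
from pathlib import Path


def list_dataset(root):
    """Map each class folder name to the sorted list of files inside it."""
    dataset = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        dataset[class_dir.name] = sorted(
            f for f in class_dir.iterdir() if f.is_file())
    return dataset


def class_labels(dataset):
    """Assign an integer label to each class name, in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(dataset), start=1)}


# Demo on a temporary directory that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    layout = {"class1": ["a.jpg", "b.jpg"], "class2": ["c.jpg"]}
    for cls, names in layout.items():
        (Path(tmp) / cls).mkdir()
        for name in names:
            (Path(tmp) / cls / name).touch()
    dataset = list_dataset(tmp)
    labels = class_labels(dataset)
    counts = {name: len(files) for name, files in dataset.items()}
```

Integer labels are used here because SVM libraries such as LIBSVM expect numeric class labels; the mapping back to folder names has to be kept to report predictions in readable form.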


To install the necessary libraries, run the following code from the working directory:

# installing libsvm
wget -O libsvm.tar.gz
tar -xzf libsvm.tar.gz
mkdir libsvm
cp -r libsvm-*/* libsvm/
rm -r libsvm-*/
cd libsvm
cp tools/ ../
cd ..

# installing sift
cp sift*/sift sift


If you get the error "IOError: SIFT executable not found", try sudo apt-get install libc6-i386. sift is a 32-bit executable, and you need to install additional libraries to make it run on 64-bit systems. More info and background on the misleading error message can be found on unix.stackexchange.



Chih-Chung Chang and Chih-Jen Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at


David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

Taken from

Adapted from code contained in the LIBSVM package by Chih-Chung Chang and Chih-Jen Lin.




