Minimal Bag of Visual Words Image Classifier
Implementation of a content-based image classifier using the bag of visual words model in Python.
As the name suggests, this is only a minimal example to illustrate the general workings of such a system. The code is not optimized for speed, memory consumption, or recognition performance. For a more advanced system (state of the art as of 2008), see: https://github.com/shackenberg/phow_caltech101.py
If you need state-of-the-art results for image classification, check out Keras.
learn.py generates a visual vocabulary and trains a classifier from a user-provided set of already classified images.
After the learning phase, classify.py uses the generated vocabulary and the trained classifier to predict the class of any image passed to the script.
The learning consists of:
- Extracting local features of all the dataset images
- Generating a codebook of visual words with clustering of the features
- Aggregating the histograms of the visual words for each of the training images
- Feeding the histograms to the classifier to train a model
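The learning steps above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code in learn.py: it uses toy 2-D points in place of 128-dimensional SIFT descriptors, a plain Lloyd's k-means, and hypothetical function names (build_codebook, word_histogram).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Cluster descriptors into k visual words with plain Lloyd's k-means."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        # Assignment step: group descriptors by their nearest centre.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        # Update step: move each centre to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster went empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def word_histogram(descriptors, codebook):
    """Count how often each visual word occurs among the descriptors."""
    histogram = [0] * len(codebook)
    for d in descriptors:
        histogram[nearest_word(d, codebook)] += 1
    return histogram


# Toy 2-D "descriptors" standing in for SIFT vectors (assumption).
image_a = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1)]
image_b = [(5.1, 4.9), (4.8, 5.0), (5.2, 5.2), (0.1, 0.1)]

# The codebook is built from the descriptors of all training images.
codebook = build_codebook(image_a + image_b, k=2)
hist_a = word_histogram(image_a, codebook)
hist_b = word_histogram(image_b, codebook)
```

Each image ends up as one fixed-length histogram regardless of how many local features it contains, which is what makes it usable as classifier input.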
The classification consists of:
- Extracting local features of the image to be classified
- Aggregating the histogram of the visual words for the image using the previously generated codebook
- Feeding the histogram to the classifier to predict a class for the image
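A rough sketch of the classification steps follows. It is an illustration under stated assumptions: the codebook and training histograms are made-up values, the histogram normalisation is an assumption, and the SVM used by this project is replaced by a 1-nearest-neighbour rule purely to keep the example free of external dependencies.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptors, codebook):
    """Normalised histogram of nearest visual words for one image.

    Normalising by the descriptor count is an assumption made here so
    that images with different numbers of features stay comparable.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        word = min(range(len(codebook)),
                   key=lambda i: squared_distance(d, codebook[i]))
        counts[word] += 1
    total = sum(counts) or 1  # avoid dividing by zero for empty images
    return [c / total for c in counts]


def predict(histogram, training_set):
    """Stand-in for the SVM: label of the nearest training histogram."""
    label, _ = min(
        ((lbl, squared_distance(histogram, h)) for lbl, h in training_set),
        key=lambda pair: pair[1],
    )
    return label


# Hypothetical outputs of a previous learning phase.
codebook = [(0.0, 0.0), (5.0, 5.0)]
training_set = [("class1", [0.8, 0.2]), ("class2", [0.1, 0.9])]

# Toy descriptors of the image to be classified.
new_image = [(4.9, 5.2), (5.1, 4.7), (0.2, 0.1), (5.3, 5.0)]
histogram = quantize(new_image, codebook)
predicted = predict(histogram, training_set)
```

The important point is that the same codebook from the learning phase is reused here; no clustering happens at classification time.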
This code relies on:
- SIFT features for local features
- k-means for generation of the words via clustering
- SVM as classifier using the LIBSVM library
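LIBSVM's command-line tools read training data as text, one sample per line in the form `<label> <index>:<value> ...`, with numeric labels and ascending feature indices that conventionally start at 1. The sketch below shows how a visual word histogram could be written in that format; the helper name to_libsvm_line is hypothetical, and whether learn.py serialises its data exactly this way is an assumption.

```python
def to_libsvm_line(label, histogram):
    """Format one sample as '<label> <index>:<value> ...' (1-based indices)."""
    features = " ".join(
        f"{index}:{value:g}"
        for index, value in enumerate(histogram, start=1)
        if value != 0  # the sparse format allows zero entries to be omitted
    )
    return f"{label} {features}".rstrip()


# Class names must be mapped to numbers first, e.g. class1 -> 1, class2 -> 2.
line = to_libsvm_line(2, [3, 0, 1.5, 0])
```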
You train the classifier for a specific dataset with:
python learn.py -d path_to_folders_with_images
To classify images use:
python classify.py -c path_to_folders_with_images/codebook.file -m path_to_folders_with_images/trainingdata.svm.model images_you_want_to_classify
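For readers who want to build a similar command-line interface, the following is a hypothetical argparse sketch that accepts the same -c and -m options plus a list of images. It is not taken from classify.py; the destination names and help texts are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the classify.py invocation shown above."""
    parser = argparse.ArgumentParser(
        description="Classify images with a trained bag of visual words model."
    )
    parser.add_argument("-c", dest="codebook", required=True,
                        help="path to the codebook file")
    parser.add_argument("-m", dest="model", required=True,
                        help="path to the trained SVM model file")
    parser.add_argument("images", nargs="+", help="images to classify")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["-c", "data/codebook.file", "-m", "data/trainingdata.svm.model",
     "a.jpg", "b.jpg"]
)
```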
The dataset should have the following structure, where all the images belonging to one class are in the same folder:
    .
    |-- path_to_folders_with_images
    |    |-- class1
    |    |-- class2
    |    |-- class3
    |    ...
    |    └-- classN
The class folders can have any name. One example dataset would be the Caltech 101 dataset.
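A layout like this maps directly onto (image, label) pairs, with the folder name serving as the class label. The sketch below shows one way to collect them; the function name and the set of accepted file extensions are assumptions, not part of this project.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed; adjust to your dataset


def list_labelled_images(dataset_dir):
    """Return sorted (image_path, class_name) pairs, one class per sub-folder."""
    samples = []
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files such as a saved codebook or model
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((image_path, class_dir.name))
    return samples
```

Calling list_labelled_images("path_to_folders_with_images") would return one pair per image, labelled class1, class2, and so on according to the folder it was found in.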
To install the necessary libraries, run the following commands from the working directory:
    # installing libsvm
    wget -O libsvm.tar.gz http://www.csie.ntu.edu.tw/~cjlin/cgi-bin/libsvm.cgi?+http://www.csie.ntu.edu.tw/~cjlin/libsvm+tar.gz
    tar -xzf libsvm.tar.gz
    mkdir libsvm
    cp -r libsvm-*/* libsvm/
    rm -r libsvm-*/
    cd libsvm
    make
    cp tools/grid.py ../grid.py
    cd ..

    # installing sift
    wget http://www.cs.ubc.ca/~lowe/keypoints/siftDemoV4.zip
    unzip siftDemoV4.zip
    cp sift*/sift sift
If you get an IOError: SIFT executable not found error, try:
sudo apt-get install libc6-i386
sift is a 32-bit executable, so additional libraries are needed to run it on 64-bit systems. More information and background on the misleading error message can be found on unix.stackexchange.
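If you are unsure whether a binary is 32-bit or 64-bit, the ELF header records it: the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit files. The helper below is a small diagnostic sketch with a hypothetical name; it is not part of this project and only applies to ELF binaries as used on Linux.

```python
def elf_bitness(path):
    """Return 32 or 64 for an ELF binary, or None if the file is not ELF."""
    with open(path, "rb") as handle:
        header = handle.read(5)
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 of the ELF header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
    return {1: 32, 2: 64}.get(header[4])
```

Running elf_bitness("sift") on the downloaded executable should report 32 if the description above applies to your copy.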
Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
Adapted from easy.py contained in the LIBSVM package by Chih-Chung Chang and Chih-Jen Lin.