Classifying ImageNet: the instant Caffe way
===========================================

Caffe has both Python and MATLAB interfaces. The Python interface, pycaffe, exposes models through `caffe.Net`. This example uses the off-the-shelf Python `caffe.Classifier` interface; there is also a MATLAB example at `matlab/caffe/matcaffe_demo.m`.

Before we begin, you must compile Caffe. If you haven't yet done so, please refer to the [installation instructions](http://caffe.berkeleyvision.org/installation.html). You should also add the Caffe module to your `PYTHONPATH`, although this example adds it automatically. This example uses our pre-trained CaffeNet model, an ILSVRC12 image classifier. You can download it by running `./scripts/download_model_binary.py models/bvlc_reference_caffenet` or let the first step of this example download it for you.

Ready? Let's start.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline


# Make sure that caffe is on the Python path (set CAFFE_ROOT to your own Caffe checkout):
CAFFE_ROOT = '/home/waylonflinn/Development/caffe/'
import sys
sys.path.insert(0, CAFFE_ROOT + 'python')

import caffe

from caffe.io import load_image

# Set the right path to your model definition file, pretrained model weights,
# and the image you would like to classify.
REF_MODEL_FILE = CAFFE_ROOT + 'models/bvlc_reference_caffenet/deploy.prototxt'
REF_PRETRAINED = CAFFE_ROOT + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'

MODEL_FILE = './deploy.prototxt'
PRETRAINED = './oxford102_iter_50000.caffemodel'

RAW_DATA_DIR = './data/'
IMAGE_DATA_DIR = RAW_DATA_DIR + 'oxford102/jpg/'
SEGMENT_DATA_DIR = RAW_DATA_DIR + 'oxford102/segmim/'
FLOWER_FILE = RAW_DATA_DIR + 'imagelabels.mat'
# training, test and validation set indices
SET_FILE = RAW_DATA_DIR + 'setid.mat'
IMAGE_FILE = IMAGE_DATA_DIR + 'image_00001.jpg'

IMAGE_FORMAT = 'image_{0:05d}.jpg'
SEGMENT_FORMAT = 'segmim_{0:05d}.jpg'

### Run this to get bigger plots

In [2]:
import matplotlib
matplotlib.rcParams['savefig.dpi'] = 2 * matplotlib.rcParams['savefig.dpi']
matplotlib.rcParams['savefig.dpi']

### Import Labels and Set Definitions

In [3]:
from scipy.io import loadmat
flower_mat = loadmat(FLOWER_FILE)
labels = flower_mat['labels'][0]

set_mat = loadmat(SET_FILE)
train_ids = set_mat['trnid'][0]
test_ids = set_mat['tstid'][0]
valid_ids = set_mat['valid'][0]

In [4]:
labels

In [5]:
train_ids

Map each label to its flower name. Only the first 13 classes are listed here; any other class is reported by number only.

In [6]:
name_map = { 
            1 : 'Pink Primrose',
            2 : 'Hard-Leaved Pocket Orchid',
            3 : 'Canterbury Bells',
            4 : 'Sweet Pea',
            5 : 'English Marigold',
            6 : 'Tiger Lily',
            7 : 'Moon Orchid',
            8 : 'Bird of Paradise',
            9 : 'Monkshood',
            10: 'Globe Thistle',
            11: 'Snapdragon',
            12: 'Colts Foot',
            13: 'King Protea'}

Loading a network is easy: `caffe.Classifier` takes care of everything. Note the arguments that configure input preprocessing: `mean` switches on mean subtraction by supplying a mean array, `channel_swap` maps RGB input into the reference ImageNet model's BGR channel order, and `raw_scale` rescales the input from [0,1] to the ImageNet model's [0,255] range.

The network runs in test phase, since we are only making predictions, and we use the CPU for the computation.

In [7]:
imagenet_mean = np.load(CAFFE_ROOT + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1)
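The chained `.mean(1).mean(1)` call above reduces a mean image to one value per channel. The following is a minimal sketch of that reduction, assuming the mean file holds a (channels, height, width) array; `mean_image` is a small hypothetical stand-in, not the real ImageNet mean.

```python
import numpy as np

# Hypothetical stand-in for the mean image: 3 channels, 4 x 5 pixels.
mean_image = np.arange(3 * 4 * 5, dtype=np.float64).reshape(3, 4, 5)

# Averaging over axis 1 twice collapses height first, then width.
channel_mean = mean_image.mean(1).mean(1)

# An equivalent single call that names both spatial axes.
channel_mean_alt = mean_image.mean(axis=(1, 2))

print(channel_mean.shape)
print(np.allclose(channel_mean, channel_mean_alt))
```

Using a per-channel mean rather than a full mean image means the same three values can be subtracted from an input of any spatial size.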

In [8]:
caffe.set_mode_cpu()

In [9]:
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=imagenet_mean,
                       channel_swap=(2,1,0),
                       raw_scale=255,
                       image_dims=(256, 256))
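To make the preprocessing arguments above concrete, here is a rough NumPy sketch of what they do to a single image. It is an approximation under stated assumptions, not pycaffe's actual code: the function name `preprocess_sketch`, the example mean values, and the exact order of the steps are illustrative, and resizing and cropping are left out.

```python
import numpy as np

def preprocess_sketch(image, mean, channel_swap=(2, 1, 0), raw_scale=255.0):
    """Approximate the preprocessing of one H x W x 3 RGB image in [0, 1]."""
    chw = image.transpose(2, 0, 1)        # H x W x C -> C x H x W
    chw = chw[list(channel_swap), :, :]   # reorder channels: RGB -> BGR
    chw = chw * raw_scale                 # rescale [0, 1] -> [0, 255]
    # Subtract one mean value per channel (given in the swapped, BGR order).
    return chw - mean[:, np.newaxis, np.newaxis]

# Hypothetical 2 x 2 image that is pure red everywhere.
image = np.zeros((2, 2, 3))
image[:, :, 0] = 1.0

# Illustrative per-channel means in BGR order (assumed values).
mean = np.array([104.0, 117.0, 123.0])

out = preprocess_sketch(image, mean)
print(out.shape)
print(out[:, 0, 0])
```

After the swap, the red channel ends up last, so it is the only channel with a positive value for this toy image.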

Before we can look at an example image with Caffe's image loading helper, we need to pick one from the dataset.

### Search for Images with a Certain Label

Let's find all the images with the label `7`. This should be the Moon Orchid.

In [10]:
search_label = 7
search_label_indeces = np.nonzero(labels == search_label)[0]

This should find `40` images.

In [11]:
len(search_label_indeces)

### Construct Filename and Load Image

Now we'll load the first image that matches the label.

In [12]:
search_file_index = search_label_indeces[0]

In [13]:
# add one to index because image file numbers start at 1
file_name = IMAGE_FORMAT.format(search_file_index + 1)
test_image = IMAGE_DATA_DIR + file_name
test_image
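The label search and filename construction above can be tried on toy data. This sketch assumes a small hypothetical label array (`toy_labels`) in place of the real `labels`, and shows why the index is shifted by one before formatting.

```python
import numpy as np

# Hypothetical labels for six images; position 0 belongs to image_00001.jpg.
toy_labels = np.array([3, 7, 7, 1, 7, 2])

# np.nonzero returns a tuple with one index array per dimension,
# so [0] selects the indices along the only axis.
matching = np.nonzero(toy_labels == 7)[0]

# Image file numbers are 1-based, so shift each 0-based index by one.
file_names = ['image_{0:05d}.jpg'.format(i + 1) for i in matching]
print(file_names)
```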

In [14]:
input_image = load_image(test_image)
plt.imshow(input_image)

Time to classify. The default is to actually do 10 predictions, cropping the center and corners of the image as well as their mirrored versions, and average over the predictions:

In [15]:
# predict takes any number of images and formats them for the Caffe net automatically
output = net.predict([input_image])
predictions = output[0]
print('prediction shape: {0}'.format(predictions.shape))
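The 10-crop averaging described above can be sketched in plain NumPy. This is an illustration of the idea rather than pycaffe's implementation: `ten_crops` is a hypothetical helper, the 227 x 227 crop size is an assumption about the network's input size, and the random `per_crop` scores stand in for real network outputs.

```python
import numpy as np

def ten_crops(image, crop_h, crop_w):
    """Return 10 crops of an H x W x C image: 4 corners, center, and mirrors."""
    h, w = image.shape[:2]
    corners = [(top, left) for top in (0, h - crop_h) for left in (0, w - crop_w)]
    center = ((h - crop_h) // 2, (w - crop_w) // 2)
    crops = [image[t:t + crop_h, l:l + crop_w] for t, l in corners + [center]]
    mirrored = [crop[:, ::-1] for crop in crops]  # flip each crop left to right
    return np.stack(crops + mirrored)

image = np.random.rand(256, 256, 3)   # hypothetical resized input image
crops = ten_crops(image, 227, 227)    # assumed network input size
print(crops.shape)

# Stand-in for the network: hypothetical class scores for each of the 10 crops.
per_crop = np.random.rand(10, 102)
averaged = per_crop.mean(axis=0)      # one averaged score vector for the image
print(averaged.shape)
```

Averaging over crops and mirrors trades roughly ten times the compute for predictions that are less sensitive to where the object sits in the frame.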

In [16]:
classes = np.arange(0, len(predictions))
plt.bar(classes, predictions, 1.0)
# class numbers start at 1
predicted_class_index = predictions.argmax()
predicted_class = predicted_class_index + 1
predicted_name = name_map[predicted_class] if predicted_class in name_map else '<class only> ->'
print('predicted class: {0} ({1})'.format(predicted_name, predicted_class))
print('prediction percent: {0:.1f}%'.format(predictions[predicted_class_index] * 100))

This should be pretty close to a 100% match (since it's in the training set).

The predicted class is 7 ('Moon Orchid'), as expected.

Now let's try something a little harder: predicting an image from the test set.

We'll pick the first index from the original training set. Remember we trained on the original test set, so we'll test on the original training set. This is a little confusing, but it gave us more data to train our network on.

In [17]:
# this value is used as a 0-based index into `labels`; the file name below adds one
test_file_index = train_ids[0]
labels[test_file_index]

The first index has a label of `1` ('Pink Primrose'), so that's what we should see in the prediction.
Let's take a look at the image.

In [18]:
# add one to index because image file numbers start at 1
file_name = IMAGE_FORMAT.format(test_file_index + 1)
test_image = IMAGE_DATA_DIR + file_name
input_image = load_image(test_image)
plt.imshow(input_image)

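Now we'll make the new prediction for this image. This time we pass `oversample=False`, so only the center crop is classified instead of averaging over 10 crops.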

In [19]:
output = net.predict([input_image], oversample=False) 
predictions = output[0]
print('prediction shape: {0}'.format(predictions.shape))

In [20]:
classes = np.arange(0, len(predictions))
plt.bar(classes, predictions, 1.0)
# class numbers start at 1
predicted_class_index = predictions.argmax()
predicted_class = predicted_class_index + 1
predicted_name = name_map[predicted_class] if predicted_class in name_map else '<class only> ->'
print('predicted class: {0} ({1})'.format(predicted_name, predicted_class))
print('prediction percent: {0:.1f}%'.format(predictions[predicted_class_index] * 100))

The predicted class label is `1` ('Pink Primrose'), as expected. You can also see some much smaller secondary predictions in the bar chart above; they're tiny, so look near the axis.
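Since the secondary predictions are hard to read off the bar chart, one option is to list the top few classes by score. The sketch below uses a hypothetical five-class score vector (`toy_predictions`) rather than the real network output; the same lines should work on `predictions`.

```python
import numpy as np

# Hypothetical score vector for five classes (index 0 is class 1).
toy_predictions = np.array([0.90, 0.01, 0.06, 0.00, 0.03])

# argsort is ascending, so reverse it and keep the first top_k entries.
top_k = 3
top_indices = np.argsort(toy_predictions)[::-1][:top_k]

for index in top_indices:
    # class numbers start at 1
    print('class {0}: {1:.1f}%'.format(index + 1, toy_predictions[index] * 100))
```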