# Classification: Instant Recognition with Caffe

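In this example we'll run a pair of images through a pretrained Caffe model and read off its output probabilities. The code below loads the weights in `models/vggsms/vgg_siamese.caffemodel` rather than the bundled CaffeNet model (which is based on the network architecture of Krizhevsky et al. for ImageNet).

We'll run the network in CPU mode, set up the input preprocessing, and then inspect the output.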

### 1. Setup

* First, set up Python, `numpy`, and `matplotlib`.

In [1]:
# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import numpy as np
import matplotlib.pyplot as plt
# display plots in this notebook
%matplotlib inline

# set display defaults
plt.rcParams['figure.figsize'] = (10, 10)        # large images
plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray'  # use grayscale output rather than a (potentially misleading) color heatmap

* Load `caffe`.

In [2]:
# The caffe module needs to be on the Python path;
#  we'll add it here explicitly.
import sys
caffe_root = '../'  # this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')

import caffe
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.

* If needed, download the pretrained model weights (here `models/vggsms/vgg_siamese.caffemodel`).

In [3]:
import os
if os.path.isfile(caffe_root + 'models/vggsms/vgg_siamese.caffemodel'):
    print 'CaffeNet found.'
else:
    print 'Downloading pre-trained CaffeNet model...'
    !../scripts/download_model_binary.py ../models/vggsms

CaffeNet found.


### 2. Load net and set up input preprocessing

* Set Caffe to CPU mode and load the net from disk.

In [4]:
caffe.set_mode_cpu()

model_def = caffe_root + 'models/vggsms/deploy_orginal.prototxt'
model_weights = caffe_root + 'models/vggsms/vgg_siamese.caffemodel'

net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)

* Set up input preprocessing. (We'll use Caffe's `caffe.io.Transformer` to do this, but this step is independent of other parts of Caffe, so any custom preprocessing code may be used.)

    Our default CaffeNet is configured to take images in BGR format. Values are expected to start in the range [0, 255] and then have the mean ImageNet pixel value subtracted from them. In addition, the channel dimension is expected as the first (_outermost_) dimension.
    
    As matplotlib will load images with values in the range [0, 1] in RGB format with the channel as the _innermost_ dimension, we are arranging for the needed transformations here.
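
The transformations described above can be sketched directly in NumPy. This is a minimal illustration rather than Caffe's own implementation: the helper name `preprocess_rgb_image`, the random stand-in image, and the mean values are all assumptions made for the example.

```python
import numpy as np

def preprocess_rgb_image(image, mean_bgr):
    """Turn an H x W x 3 RGB image in [0, 1] into a 3 x H x W BGR array
    in [0, 255] with a per-channel mean subtracted.

    Hypothetical helper for illustration; not part of Caffe.
    """
    image = np.asarray(image, dtype=np.float32)
    chw = image.transpose((2, 0, 1))   # channel becomes the outermost dimension
    bgr = chw[[2, 1, 0], :, :]         # RGB -> BGR
    scaled = bgr * 255.0               # [0, 1] -> [0, 255]
    mean = np.asarray(mean_bgr, dtype=np.float32).reshape(3, 1, 1)
    return scaled - mean

rng = np.random.RandomState(0)
fake_image = rng.rand(224, 224, 3)         # stand-in for an image loaded by matplotlib
placeholder_mean = (100.0, 110.0, 120.0)   # illustrative values only, not the ImageNet mean
blob = preprocess_rgb_image(fake_image, placeholder_mean)
print(blob.shape)
```

The result has the channel-first layout the network input expects, so a single image prepared this way could be copied into one slot of the `data` blob.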

### 3. CPU classification

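* Now we're ready to perform classification. The input is a pair of images: each is resized to 224x224, converted to BGR with the channel as the outermost dimension, and the two are stacked into a single 6-channel array. We use a batch size of 1.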

In [5]:
import deepdish as dd
import lmdb
import glob2 
#import random
import PIL
from PIL import Image
import plyvel

datapath = '/home/darshan/ML/Pair_images/*.JPG'
files = glob2.glob(datapath)   # all JPG files matching the pattern
sortedfiles = sorted(files)    # sort so the pair is always read in the same order
nrfiles = len(sortedfiles)


In [6]:
print(sortedfiles)

['/home/darshan/ML/Pair_images/000040-02.JPG', '/home/darshan/ML/Pair_images/000040-03.JPG']


In [7]:
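# first image: resize to the 224x224 network input and convert to a uint8 H x W x C array
img1 = Image.open(sortedfiles[0])
img1 = img1.resize((224, 224), PIL.Image.ANTIALIAS)
img1 = np.uint8(img1)
img1 = img1[:, :, (2, 1, 0)]      # RGB -> BGR
img1 = img1.transpose((2, 0, 1))  # H x W x C -> C x H x W

# second image: same steps
img2 = np.uint8(Image.open(sortedfiles[1]).resize((224, 224), PIL.Image.ANTIALIAS))
img2 = img2[:, :, (2, 1, 0)]
img2 = img2.transpose((2, 0, 1))

# stack the pair along the channel axis: two (3, 224, 224) arrays -> (6, 224, 224)
img3 = np.concatenate((img1, img2))

datum = caffe.io.array_to_datum(img3)  # Caffe Datum built from the pair (not used in the cells below)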

print img1.shape
print img2.shape
print img3.shape
print {'data': net.blobs['data'].data.shape}

(3, 224, 224)
(3, 224, 224)
(6, 224, 224)
{'data': (1, 6, 224, 224)}


In [8]:
# create transformer for the input called 'data'
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension
#transformer.set_mean('data', mu)            # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255)      # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (5,4,3,2,1,0))  # reverse the order of all 6 channels
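
To see what the six-element channel order does, the sketch below applies the same indices to a small labelled stack. It assumes the two images were stacked as BGR followed by BGR, as in the earlier cell; the tiny arrays and their names are made up for illustration.

```python
import numpy as np

# Two tiny stand-in "images" in C x H x W layout. Each channel is filled with
# an identifying value so we can track where it ends up.
img_a = np.stack([np.full((2, 2), v) for v in (10, 11, 12)])  # B, G, R of image A
img_b = np.stack([np.full((2, 2), v) for v in (20, 21, 22)])  # B, G, R of image B
pair = np.concatenate((img_a, img_b))                         # shape (6, 2, 2)

# Same index order as the set_channel_swap call above.
reversed_pair = pair[[5, 4, 3, 2, 1, 0], :, :]
channel_ids = [int(c[0, 0]) for c in reversed_pair]
print(channel_ids)  # [22, 21, 20, 12, 11, 10]
```

Reversing all six indices flips each image from BGR to RGB and also puts the second image first. An order of `(2, 1, 0, 5, 4, 3)` would instead flip the channels within each image while keeping the image order.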

In [9]:
# set the size of the input (we can skip this if we're happy
#  with the default; we can also change it later, e.g., for different batch sizes)
net.blobs['data'].reshape(1,         # batch size
                          6,         # 6 channels: two stacked 3-channel (BGR) images
                          224, 224)  # image size is 224x224

* Apply the preprocessing we've set up to the stacked image pair.

In [10]:
#image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
#transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
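# note: img3 is already a 6 x 224 x 224 BGR uint8 array in [0, 255] (see In [7]),
#  whereas the transformer settings above assume H x W x C input in [0, 1]
transformed_image = transformer.preprocess('data', img3)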
#plt.imshow(img3)

* Let's classify it!

In [11]:
# copy the image data into the memory allocated for the net
net.blobs['data'].data[...] = transformed_image

# perform classification
output = net.forward()

output_prob = output['prob'][0]  # the output probability vector for the first image in the batch

print output_prob
#print 'predicted class is:', output_prob.argmax()
#print len(output_prob)


[ 0.92395574  0.07604422]
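
As a quick check on the printed vector, the sketch below picks the most probable entry and confirms that the values sum to about 1. It assumes the `prob` output is a softmax over two classes, which the values suggest but the notebook does not state; what each index means depends on how the model was trained.

```python
import numpy as np

# Values copied from the printed output above.
output_prob = np.array([0.92395574, 0.07604422], dtype=np.float32)

predicted_index = int(output_prob.argmax())  # index of the largest probability
total = float(output_prob.sum())             # a softmax output should sum to about 1

print(predicted_index)
print(abs(total - 1.0) < 1e-5)
```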


In [101]:
np.savetxt("foo.csv", output_prob, delimiter=",")
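
Because `output_prob` is one-dimensional, `np.savetxt` writes one value per line, so the `delimiter` argument has no visible effect. The sketch below shows both layouts and reads the file back; the temporary directory and file names are hypothetical.

```python
import os
import tempfile
import numpy as np

output_prob = np.array([0.92395574, 0.07604422])  # values from the output printed earlier

tmp_dir = tempfile.mkdtemp()
column_path = os.path.join(tmp_dir, "column.csv")  # hypothetical file names
row_path = os.path.join(tmp_dir, "row.csv")

# 1-D input: each value goes on its own line.
np.savetxt(column_path, output_prob, delimiter=",")
# 2-D input with a single row: values share one comma-separated line.
np.savetxt(row_path, output_prob.reshape(1, -1), delimiter=",")

with open(column_path) as f:
    column_lines = f.read().splitlines()
with open(row_path) as f:
    row_lines = f.read().splitlines()

restored = np.loadtxt(column_path, delimiter=",")  # read the values back
print(len(column_lines))
print(len(row_lines))
```

Reshaping to a single row is the option to use when the probabilities for one input should sit on one CSV line.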