# Example of extracting features from a pretrained neural network in Caffe using SkiCaffe

SkiCaffe is a wrapper that provides a "scikit-learn like" API to pretrained networks such as those distributed in the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo) or elsewhere (such as [DeepDetect](http://www.deepdetect.com/applications/model/)). I wanted to use these pretrained models for extracting features while also using the powerful pipelines of scikit-learn. Here we illustrate its basic use for extracting features.

In [1]:
from skicaffe import SkiCaffe

To use SkiCaffe, we need Caffe, so we have to specify where Caffe is installed. We assume the default installation of Caffe from [BVLC](https://github.com/BVLC/caffe). We specify the location of Caffe on our system with caffe_root:

In [2]:
caffe_root = '/usr/local/caffe/'
model_file = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
prototxt_file = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'

DLmodel = SkiCaffe(caffe_root = caffe_root,
                   model_prototxt_path = prototxt_file, 
                   model_trained_path = model_file, 
                   include_labels = True,
                   return_type = "pandasDF")

In scikit-learn parlance, SkiCaffe is a transformer (an estimator with a transform method), since it is meant for extracting features from images. SkiCaffe therefore inherits from the [BaseEstimator](http://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html#sklearn.base.BaseEstimator) and [TransformerMixin](http://scikit-learn.org/stable/modules/generated/sklearn.base.TransformerMixin.html#sklearn.base.TransformerMixin) classes. The two methods that we override are the fit and transform methods. SkiCaffe is a bit unusual in that the input to be transformed is not a numpy array X of shape [n_samples, n_features] but a Python list of image paths. SkiCaffe takes these image paths and "transforms" them by returning features derived from a pretrained neural net in Caffe.

The fit method loads the specified pretrained network. The transform method takes the list of image paths and returns the image features extracted from a specific layer as a numpy array (or optionally a pandas DataFrame).
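To make the fit/transform contract concrete, here is a minimal sketch of a transformer that, like SkiCaffe, accepts a list of image paths rather than a numeric array. It is not SkiCaffe's implementation: `PathFeatures` is a hypothetical stand-in that derives two toy numbers from each path string instead of running a network, so the example runs without Caffe. It assumes scikit-learn and numpy are installed.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class PathFeatures(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for a path-based feature extractor."""

    def fit(self, X=None, y=None):
        # A real extractor would load its pretrained network here.
        self.is_loaded_ = True
        return self

    def transform(self, X):
        # X is a list of path strings; return one feature row per path:
        # [length of the path, number of digit characters in the path].
        rows = [[len(p), sum(ch.isdigit() for ch in p)] for p in X]
        return np.asarray(rows, dtype=np.float32)


image_paths = ['./images/cat.jpg', './images/1404329745.jpg']

# Because the class follows the transformer contract, it can be the
# first step of a scikit-learn Pipeline.
pipeline = Pipeline([('features', PathFeatures()),
                     ('scale', StandardScaler())])
scaled = pipeline.fit_transform(image_paths)
print(scaled.shape)
```

The point of the sketch is the shape of the interface: fit returns self, transform maps a list of n paths to an array with n rows, and everything after the first pipeline step sees an ordinary numeric matrix.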

To load the pretrained network, the fit method in SkiCaffe needs the paths for the following two files:
- deploy.prototxt file: neural network definition file for prediction
- caffemodel file: the trained neural network (for example, bvlc_googlenet.caffemodel)
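Since fit depends on both files, a quick existence check beforehand can catch a mistyped path early. The helper below is a hypothetical convenience, not part of SkiCaffe; it uses only the standard library and the same default paths as the cell above.

```python
import os


def missing_files(*paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


caffe_root = '/usr/local/caffe/'
model_dir = os.path.join(caffe_root, 'models', 'bvlc_reference_caffenet')
prototxt_file = os.path.join(model_dir, 'deploy.prototxt')
model_file = os.path.join(model_dir, 'bvlc_reference_caffenet.caffemodel')

missing = missing_files(prototxt_file, model_file)
if missing:
    print('Missing model files: %s' % ', '.join(missing))
```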

The current setup of SkiCaffe is very ImageNet-centric, so other supporting files from the default installation of Caffe are used (see the documentation for more details and change as needed).





## Load pretrained BVLC reference model

In this example we use the BVLC reference model, with the prototxt file and trained caffemodel file specified above. We "fit" the model by loading the pretrained network. We can also use the layer_sizes attribute of our model to see what the different layers and their sizes are.

In [3]:
DLmodel.fit()
print('Number of layers: %d' % len(DLmodel.layer_sizes))
DLmodel.layer_sizes

caffe imported successfully
Number of layers: 15


[('data', (10, 3, 227, 227)),
 ('conv1', (10, 96, 55, 55)),
 ('pool1', (10, 96, 27, 27)),
 ('norm1', (10, 96, 27, 27)),
 ('conv2', (10, 256, 27, 27)),
 ('pool2', (10, 256, 13, 13)),
 ('norm2', (10, 256, 13, 13)),
 ('conv3', (10, 384, 13, 13)),
 ('conv4', (10, 384, 13, 13)),
 ('conv5', (10, 256, 13, 13)),
 ('pool5', (10, 256, 6, 6)),
 ('fc6', (10, 4096)),
 ('fc7', (10, 4096)),
 ('fc8', (10, 1000)),
 ('prob', (10, 1000))]
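
Each entry pairs a layer name with the shape of its output blob. The first dimension is the batch size (10 here); the remaining dimensions belong to a single image. As a rough sketch, the snippet below multiplies out those remaining dimensions to show how many values each layer holds per image. Whether SkiCaffe flattens convolutional layers in exactly this way is an assumption; for the fully connected layers the count is simply the layer width.

```python
# A few entries copied from the layer_sizes output above.
layer_sizes = [('data', (10, 3, 227, 227)),
               ('conv5', (10, 256, 13, 13)),
               ('pool5', (10, 256, 6, 6)),
               ('fc7', (10, 4096)),
               ('fc8', (10, 1000))]


def values_per_image(shape):
    """Multiply every dimension except the leading batch dimension."""
    count = 1
    for dim in shape[1:]:
        count *= dim
    return count


per_image = {name: values_per_image(shape) for name, shape in layer_sizes}
for name, _ in layer_sizes:
    print('%-6s %d' % (name, per_image[name]))
```

For example, pool5 holds 256 x 6 x 6 = 9216 values per image, more than twice the 4096 values of fc7.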

## Specify list of images and extract features

In [4]:
image_paths = ['./images/cat.jpg', 
               './images/1404329745.jpg']

We can now "transform" these images by extracting their features from our pretrained network. To pick a layer, we specify its name (listed above) with layer_name. Here we use the output of the last fully connected layer, "fc8": the call below does not pass a layer name, and the returned columns are named fc8.0 through fc8.999. If you like DataFrames (like me), you can optionally request a pandas DataFrame as the return type, as we did with return_type = "pandasDF" when constructing the model.

In [5]:
image_features = DLmodel.transform(X = image_paths)
image_features.head()

| | pred.class | pred.conf | fc8.0 | fc8.1 | fc8.2 | fc8.3 | fc8.4 | fc8.5 | fc8.6 | fc8.7 | ... | fc8.990 | fc8.991 | fc8.992 | fc8.993 | fc8.994 | fc8.995 | fc8.996 | fc8.997 | fc8.998 | fc8.999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | n02123045 tabby, tabby cat | 0.301984 | -3.951654 | 3.974992 | -2.126802 | -1.871261 | -2.66216 | -1.35372 | -1.741305 | 0.325119 | ... | 0.965206 | -1.478447 | -1.339074 | -2.622084 | -2.790083 | -0.731835 | -2.223104 | -3.972739 | 3.412313 | 5.382214 |
| 1 | n09472597 volcano | 0.301601 | -0.368596 | -2.329629 | -2.272709 | -1.236161 | 0.337776 | -1.146151 | 0.180243 | -1.866235 | ... | 1.175359 | 0.087868 | -0.151791 | -2.187589 | -1.585882 | 1.581705 | -1.362859 | -0.552606 | 5.315578 | -0.433723 |
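
The returned DataFrame mixes label columns (pred.class, pred.conf) with the feature columns (fc8.0 through fc8.999). To hand only the numeric features to a downstream model, the feature columns can be selected by their prefix. The sketch below rebuilds a three-feature slice of the output above by hand so that it runs without Caffe; it assumes pandas is installed.

```python
import pandas as pd

# First three feature columns of the output above, copied by hand.
image_features = pd.DataFrame({
    'pred.class': ['n02123045 tabby, tabby cat', 'n09472597 volcano'],
    'pred.conf': [0.301984, 0.301601],
    'fc8.0': [-3.951654, -0.368596],
    'fc8.1': [3.974992, -2.329629],
    'fc8.2': [-2.126802, -2.272709],
})

# Keep only columns whose name starts with "fc8.".
feature_columns = image_features.filter(regex=r'^fc8\.')
X = feature_columns.values  # plain numpy array, one row per image
print(X.shape)
```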


Now let's get the features again, but this time from an earlier layer and returned as a numpy array. We also do not care about the labels, so we will not extract those either:

In [6]:
DLmodel.include_labels = False
DLmodel.return_type = 'numpy_array'
image_features = DLmodel.transform(X = image_paths, layer_name='fc7')
image_features

array([[ 0.        ,  0.        ,  0.        , ...,  0.        ,
         4.88522577,  6.54640722],
       [ 0.        ,  0.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ]], dtype=float32)

## Load pretrained ResNet model

Let's now get features using a different model. We will use a ResNet-50 model that we include in this repo as an example. In this example we will also ensure that the returned features DataFrame will have an "image_paths" column: 

In [7]:
caffe_root = '/usr/local/caffe/'
model_file = './models/ResNet-50-model.caffemodel'
prototxt_file = './models/ResNet-50-deploy.prototxt'

ResNet = SkiCaffe(caffe_root = caffe_root,
                  model_prototxt_path = prototxt_file, 
                  model_trained_path = model_file, 
                  include_labels = False,
                  include_image_paths = True,
                  return_type = "pandasDF")

In [8]:
ResNet.fit()
print('Number of layers: %d' % len(ResNet.layer_sizes))
ResNet.layer_sizes

caffe imported successfully
Number of layers: 106


[('data', (1, 3, 224, 224)),
 ('conv1', (1, 64, 112, 112)),
 ('pool1', (1, 64, 56, 56)),
 ('pool1_pool1_0_split_0', (1, 64, 56, 56)),
 ('pool1_pool1_0_split_1', (1, 64, 56, 56)),
 ('res2a_branch1', (1, 256, 56, 56)),
 ('res2a_branch2a', (1, 64, 56, 56)),
 ('res2a_branch2b', (1, 64, 56, 56)),
 ('res2a_branch2c', (1, 256, 56, 56)),
 ('res2a', (1, 256, 56, 56)),
 ('res2a_res2a_relu_0_split_0', (1, 256, 56, 56)),
 ('res2a_res2a_relu_0_split_1', (1, 256, 56, 56)),
 ('res2b_branch2a', (1, 64, 56, 56)),
 ('res2b_branch2b', (1, 64, 56, 56)),
 ('res2b_branch2c', (1, 256, 56, 56)),
 ('res2b', (1, 256, 56, 56)),
 ('res2b_res2b_relu_0_split_0', (1, 256, 56, 56)),
 ('res2b_res2b_relu_0_split_1', (1, 256, 56, 56)),
 ('res2c_branch2a', (1, 64, 56, 56)),
 ('res2c_branch2b', (1, 64, 56, 56)),
 ('res2c_branch2c', (1, 256, 56, 56)),
 ('res2c', (1, 256, 56, 56)),
 ('res2c_res2c_relu_0_split_0', (1, 256, 56, 56)),
 ('res2c_res2c_relu_0_split_1', (1, 256, 56, 56)),
 ('res3a_branch1', (1, 512, 28, 28)),
 ('r

In [9]:
image_features = ResNet.transform(X = image_paths)
image_features.head()

| | image_paths | fc1000.0 | fc1000.1 | fc1000.2 | fc1000.3 | fc1000.4 | fc1000.5 | fc1000.6 | fc1000.7 | fc1000.8 | ... | fc1000.990 | fc1000.991 | fc1000.992 | fc1000.993 | fc1000.994 | fc1000.995 | fc1000.996 | fc1000.997 | fc1000.998 | fc1000.999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ./images/cat.jpg | -1.422729 | 1.202515 | -4.054061 | -2.982319 | -2.029956 | -0.502533 | -1.860753 | 0.857842 | 1.985428 | ... | 0.521646 | 0.397216 | 0.230053 | 0.05036 | -1.623829 | -1.071765 | -0.017579 | -0.923358 | 3.147756 | 4.90351 |
| 1 | ./images/1404329745.jpg | -0.560677 | -1.725651 | -0.995183 | -0.201458 | -0.613417 | -2.232322 | -0.607065 | -1.143935 | -2.150558 | ... | -0.027323 | -0.578616 | 0.019842 | -2.18331 | -1.461334 | 0.499992 | -1.939194 | -1.392281 | 0.628841 | 0.852022 |
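
With the image_paths column present, each feature row can be tied back to its source image. One way to use this, sketched below on a hand-copied two-feature slice of the output above, is to index the DataFrame by path. The slice and the variable names are for illustration only, and pandas is assumed to be installed.

```python
import pandas as pd

# First two feature columns of the output above, copied by hand.
image_features = pd.DataFrame({
    'image_paths': ['./images/cat.jpg', './images/1404329745.jpg'],
    'fc1000.0': [-1.422729, -0.560677],
    'fc1000.1': [1.202515, -1.725651],
})

# Use the path as the row label so features can be looked up by image.
by_path = image_features.set_index('image_paths')
cat_features = by_path.loc['./images/cat.jpg']
print(cat_features['fc1000.0'])
```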
