# Using deep features to build an image classifier

# Import the required libraries

The code below reads both data files from the working directory, so download them first.

Training data: https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv  
Test data: https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv

In [1]:
import numpy as np
import pandas as pd
%matplotlib inline

# Load a common image analysis dataset

We will use a popular benchmark dataset in computer vision called CIFAR-10.  

(We've reduced the data to just four categories: 'cat', 'bird', 'automobile', and 'dog'.)

This dataset is already split into a training set and test set.  

In [2]:
image_train = pd.read_csv('image_train_data.csv')
image_test = pd.read_csv('image_test_data.csv')

# Exploring the image data

In [3]:
image_train['image'].head()

0    Height: 32 Width: 32
1    Height: 32 Width: 32
2    Height: 32 Width: 32
3    Height: 32 Width: 32
4    Height: 32 Width: 32
Name: image, dtype: object

# Train a classifier on the raw image pixels

We start by training a classifier on just the raw pixels of each image.

In [4]:
from sklearn.linear_model import LogisticRegressionCV
from sklearn import preprocessing

In [5]:
raw_pixel_model = LogisticRegressionCV()
le = preprocessing.LabelEncoder()

In [6]:
image_train['image_array'] = image_train['image_array'].apply(lambda x :[int(i) for i in x[1:-1].split(' ')])
image_test['image_array'] = image_test['image_array'].apply(lambda x :[int(i) for i in x[1:-1].split(' ')])
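
The parsing step above can be illustrated on a single string. This is a minimal sketch, assuming each cell holds a bracketed, space-separated string such as '[73 77 58]'; the helper name `parse_int_array` is hypothetical and not part of the notebook. Calling `split()` with no argument, unlike `split(' ')`, tolerates repeated spaces.

```python
def parse_int_array(text):
    """Parse a bracketed, whitespace-separated string into a list of ints.

    Hypothetical helper for illustration; assumes input like '[73 77 58]'.
    """
    inner = text.strip()[1:-1]  # drop the surrounding brackets
    # split() with no argument collapses runs of whitespace,
    # so '77  58' does not produce an empty token.
    return [int(token) for token in inner.split()]


sample = '[73 77  58 71]'  # made-up sample with a doubled space
print(parse_int_array(sample))  # [73, 77, 58, 71]
```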

In [7]:
train_image_array = [i for i in image_train['image_array'].values ]

In [8]:
train_y = le.fit_transform(image_train.label)
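
To make the label encoding concrete, here is a plain-Python sketch of the mapping that scikit-learn's LabelEncoder produces: it sorts the distinct class names and assigns each one its position. The helper name `fit_label_mapping` is hypothetical, and the label list is a made-up sample using this dataset's four categories.

```python
def fit_label_mapping(labels):
    """Map each distinct label to its index in sorted order."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}


sample_labels = ['bird', 'cat', 'cat', 'dog', 'automobile']  # made-up sample
mapping = fit_label_mapping(sample_labels)
encoded = [mapping[name] for name in sample_labels]

print(mapping)  # {'automobile': 0, 'bird': 1, 'cat': 2, 'dog': 3}
print(encoded)  # [1, 2, 2, 3, 0]
```

Inverting the dictionary recovers the class names, which is the role `le.inverse_transform` plays when predictions are printed later on.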

In [9]:
raw_pixel_model.fit(train_image_array,train_y)

LogisticRegressionCV(Cs=10, class_weight=None, cv=None, dual=False,
           fit_intercept=True, intercept_scaling=1.0, max_iter=100,
           multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
           refit=True, scoring=None, solver='lbfgs', tol=0.0001, verbose=0)

# Make a prediction with the simple model based on raw pixels

In [10]:
#image_test[0:3]['image'].show()

In [11]:
test_image_array =  [i for i in image_test['image_array'].values]
test_y = le.transform(image_test.label)

In [12]:
image_test[0:3]['label']

0           cat
1    automobile
2           cat
Name: label, dtype: object

In [13]:
le.inverse_transform(raw_pixel_model.predict(test_image_array[0:3]))

array(['dog', 'cat', 'dog'], dtype=object)

The model makes wrong predictions for all three images.
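
The comparison can also be checked programmatically. A minimal sketch, using the three true labels and the three predictions shown in the outputs above:

```python
true_labels = ['cat', 'automobile', 'cat']  # from the label output above
predicted = ['dog', 'cat', 'dog']           # from the prediction output above

# Element-wise comparison of each prediction with its true label.
matches = [t == p for t, p in zip(true_labels, predicted)]
num_correct = sum(matches)

print(matches)                                         # [False, False, False]
print(num_correct, 'of', len(true_labels), 'correct')  # 0 of 3 correct
```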

# Evaluating raw pixel model on test data

In [14]:
from sklearn.metrics import accuracy_score

In [15]:
true_label = le.transform(image_test['label'])
raw_pred_label = raw_pixel_model.predict(test_image_array)

In [16]:
accuracy_score(true_label,raw_pred_label)

0.43874999999999997

The accuracy of this model is poor: only about 44%.
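
For context, accuracy is the fraction of test examples whose predicted label matches the true label, and a useful reference point is the accuracy of always predicting the most common class. The sketch below computes both on made-up labels; the function names and the toy data are hypothetical and not part of the notebook. If the four classes were equally represented (an assumption, not verified here), that baseline would be 25%.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def majority_baseline(true_labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(true_labels).most_common(1)[0][1]
    return most_common_count / len(true_labels)


# Toy data for illustration only.
y_true = ['cat', 'dog', 'bird', 'automobile', 'cat', 'dog', 'bird', 'automobile']
y_pred = ['cat', 'cat', 'bird', 'dog', 'dog', 'dog', 'cat', 'automobile']

print(accuracy(y_true, y_pred))   # 0.5
print(majority_baseline(y_true))  # 0.25
```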

# Can we improve the model using deep features?

We only have 2005 training examples, which is too few to train a deep neural network effectively. Instead, we will use transfer learning: we take deep features computed by a network trained on the full ImageNet dataset and train a simple classifier on those features using this small dataset.
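
The idea of transfer learning can be sketched in a few lines. This is an illustration under stated assumptions, not the notebook's method: the "pretrained network" is replaced by a fixed random projection with a ReLU (a hypothetical stand-in), the data is synthetic, and a plain LogisticRegression is the simple model trained on top. The point is only that the feature extractor stays frozen while the small classifier is fitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for a pretrained network: a FIXED random projection followed by
# ReLU. This is a hypothetical placeholder; a real extractor would be the
# hidden layers of a network trained on ImageNet.
frozen_weights = rng.normal(size=(20, 64))


def extract_features(inputs):
    """Apply the frozen extractor; its weights are never updated."""
    return np.maximum(inputs @ frozen_weights, 0.0)


# Small synthetic two-class dataset (made up for illustration).
n_per_class = 50
class_0 = rng.normal(loc=0.0, size=(n_per_class, 20))
class_1 = rng.normal(loc=3.0, size=(n_per_class, 20))
X = np.vstack([class_0, class_1])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Only the simple classifier on top of the features is trained.
features = extract_features(X)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(features, y)

print(features.shape)  # (100, 64)
print(classifier.score(features, y))
```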

In [17]:
len(image_train)

2005

## Computing deep features for our images

The two commented-out lines below show how to compute deep features with GraphLab Create. This computation takes a little while, so the features have already been computed and saved as a column in the data you loaded.

(If you would like to compute such deep features yourself and have a GPU on your machine, use the GPU-enabled GraphLab Create, which will be significantly faster for this task.)

In [18]:
# deep_learning_model = graphlab.load_model('http://s3.amazonaws.com/GraphLab-Datasets/deeplearning/imagenet_model_iter45')
# image_train['deep_features'] = deep_learning_model.extract_features(image_train)

As we can see, the column deep_features already contains the pre-computed deep features for this data. 

In [19]:
image_train.head()

| Unnamed: 0 | id | image | label | deep_features | image_array |
|---|---|---|---|---|---|
| 0 | 24 | Height: 32 Width: 32 | bird | [0.242872 1.09545 0 0.39363 0 0 11.8949 0 0 0 ... | [73, 77, 58, 71, 68, 50, 77, 69, 44, 120, 116,... |
| 1 | 33 | Height: 32 Width: 32 | cat | [0.525088 0 0 0 0 0 9.94829 0 0 0 0 0 1.01264 ... | [7, 5, 8, 7, 5, 8, 5, 4, 6, 7, 4, 7, 11, 5, 9,... |
| 2 | 36 | Height: 32 Width: 32 | cat | [0.566016 0 0 0 0 0 9.9972 0 0 0 1.38345 0 0.7... | [169, 122, 65, 131, 108, 75, 193, 196, 192, 21... |
| 3 | 70 | Height: 32 Width: 32 | dog | [1.1298 0 0 0.778194 0 0.758051 9.83053 0 0 0.... | [154, 179, 152, 159, 183, 157, 165, 189, 162, ... |
| 4 | 90 | Height: 32 Width: 32 | bird | [1.71787 0 0 0 0 0 9.33936 0 0 0 0 0 0.412137 ... | [216, 195, 180, 201, 178, 160, 210, 184, 164, ... |


# Given the deep features, let's train a classifier

In [20]:
image_train['deep_features']=image_train['deep_features'].apply(lambda x:[float(i) for i in x[1:-1].split(' ')])
image_test['deep_features']=image_test['deep_features'].apply(lambda x:[float(i) for i in x[1:-1].split(' ')])

In [21]:
train_deep_features = [i for i in image_train['deep_features'].values]

In [22]:
deep_features_model = LogisticRegressionCV()
deep_features_model.fit(train_deep_features,train_y)

LogisticRegressionCV(Cs=10, class_weight=None, cv=None, dual=False,
           fit_intercept=True, intercept_scaling=1.0, max_iter=100,
           multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,
           refit=True, scoring=None, solver='lbfgs', tol=0.0001, verbose=0)

# Apply the deep features model to first few images of test set

In [23]:
#image_test[0:3]['image'].show()

In [24]:
test_deep_features = [i for i in image_test['deep_features'].values]

In [25]:
le.inverse_transform(deep_features_model.predict(test_deep_features[0:3]))

array(['cat', 'automobile', 'cat'], dtype=object)

The classifier with deep features gets all of these images right!

# Compute test_data accuracy of deep_features_model

As the result below shows, deep features provide significantly better accuracy (about 81%).

In [26]:
deep_pred = deep_features_model.predict(test_deep_features)
true_label =le.transform(image_test['label'])
accuracy_score(true_label,deep_pred)

0.80774999999999997
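
To put the two results side by side, the short calculation below takes the two accuracy values printed in this notebook (0.43875 for raw pixels, 0.80775 for deep features) and derives the absolute gain and the relative reduction in error. The variable names are chosen here for illustration.

```python
raw_pixel_accuracy = 0.43875     # printed by the raw-pixel model above
deep_feature_accuracy = 0.80775  # printed by the deep-features model above

absolute_gain = deep_feature_accuracy - raw_pixel_accuracy

raw_error = 1.0 - raw_pixel_accuracy
deep_error = 1.0 - deep_feature_accuracy
relative_error_reduction = (raw_error - deep_error) / raw_error

print(round(absolute_gain, 3))             # 0.369
print(round(relative_error_reduction, 3))  # 0.657
```

In other words, deep features raise test accuracy by about 37 percentage points and remove roughly two thirds of the raw-pixel model's errors.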