# Using deep features to build an image classifier

## Fire up turicreate

In [1]:
import turicreate

# Load a common image analysis dataset

We will use a popular benchmark dataset in computer vision called CIFAR-10.  

(We've reduced the data to just four categories: 'cat', 'bird', 'automobile', and 'dog'.)

This dataset is already split into a training set and test set.  

In [2]:
image_train = turicreate.SFrame('./image_train_data')
image_test = turicreate.SFrame('./image_test_data')

In [3]:
image_train

id,image,label,deep_features,image_array
24,Height: 32 Width: 32,bird,"[0.24287176132202148, 1.0954537391662598, 0.0, ...","[73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ..."
33,Height: 32 Width: 32,cat,"[0.5250879526138306, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ..."
36,Height: 32 Width: 32,cat,"[0.5660159587860107, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ..."
70,Height: 32 Width: 32,dog,"[1.129795789718628, 0.0, 0.0, 0.7781944870948792, ...","[154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ..."
90,Height: 32 Width: 32,bird,"[1.7178692817687988, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ..."
97,Height: 32 Width: 32,automobile,"[1.5781855583190918, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ..."
107,Height: 32 Width: 32,dog,"[0.0, 0.0, 0.22067785263061523, ...","[97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ..."
121,Height: 32 Width: 32,bird,"[0.0, 0.23753464221954346, ...","[93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ..."
136,Height: 32 Width: 32,automobile,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.57378625869751, ...","[35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ..."
138,Height: 32 Width: 32,bird,"[0.6589357256889343, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ..."


In [4]:
image_test

id,image,label,deep_features,image_array
0,Height: 32 Width: 32,cat,"[1.1346900463104248, 0.0, 0.0, 0.0, ...","[158.0, 112.0, 49.0, 159.0, 111.0, 47.0, ..."
6,Height: 32 Width: 32,automobile,"[0.2313588261604309, 0.0, 0.0, 0.0, 0.0, ...","[160.0, 37.0, 13.0, 185.0, 49.0, 11.0, 20 ..."
8,Height: 32 Width: 32,cat,"[0.0, 0.0, 0.034419238567352295, ...","[23.0, 19.0, 23.0, 19.0, 21.0, 28.0, 21.0, 16.0, ..."
9,Height: 32 Width: 32,automobile,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 11.6065092086792, ...","[217.0, 215.0, 209.0, 210.0, 208.0, 202.0, ..."
12,Height: 32 Width: 32,dog,"[0.3223174810409546, 0.0, 1.2493335008621216, 0.0, ...","[91.0, 64.0, 30.0, 82.0, 58.0, 30.0, 87.0, 73.0, ..."
16,Height: 32 Width: 32,dog,"[0.0, 0.0, 0.34735703468322754, ...","[95.0, 76.0, 78.0, 92.0, 77.0, 78.0, 89.0, 77.0, ..."
24,Height: 32 Width: 32,dog,"[1.3155765533447266, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[136.0, 134.0, 118.0, 142.0, 141.0, 126.0, ..."
25,Height: 32 Width: 32,bird,"[0.0, 0.31728875637054443, ...","[100.0, 103.0, 74.0, 68.0, 91.0, 65.0, 116.0, ..."
31,Height: 32 Width: 32,dog,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.260188102722168, ...","[127.0, 130.0, 81.0, 130.0, 133.0, 88.0, ..."
33,Height: 32 Width: 32,dog,"[0.1307867169380188, 0.7276672124862671, 0.0, ...","[118.0, 113.0, 81.0, 122.0, 117.0, 83.0, ..."


# Train a classifier on the raw image pixels

We start by training a classifier on just the raw pixels of the image.

In [7]:
raw_pixel_model = turicreate.logistic_classifier.create(image_train,
                                                        target = 'label',
                                                        features = ['image_array'])

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.



# Make a prediction with the simple model based on raw pixels

In [8]:
image_test[0:3]['image']   # actual value

dtype: Image
Rows: 3
['Height: 32 Width: 32', 'Height: 32 Width: 32', 'Height: 32 Width: 32']

In [9]:
image_test[0:3]['label']   # actual value

dtype: str
Rows: 3
['cat', 'automobile', 'cat']

In [10]:
raw_pixel_model.predict(image_test[0:3])   # predicted value

dtype: str
Rows: 3
['bird', 'cat', 'bird']

### The model makes wrong predictions for all three images!
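The same comparison can be done programmatically instead of by eye. Here is a minimal sketch in plain Python, with the labels and predictions copied from the cell outputs above into ordinary lists. The helper name `count_matches` is hypothetical and not part of turicreate.

```python
# Labels and predictions copied from the cell outputs above.
actual = ['cat', 'automobile', 'cat']
predicted = ['bird', 'cat', 'bird']

def count_matches(actual, predicted):
    """Return how many predictions equal the corresponding true label."""
    return sum(a == p for a, p in zip(actual, predicted))

correct = count_matches(actual, predicted)
print(f"{correct} of {len(actual)} predictions are correct")
```

For these three images the count is zero, which matches what we saw by inspection.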

# Evaluating raw pixel model on test data

In [11]:
raw_pixel_model.evaluate(image_test)

{'accuracy': 0.4895,
 'auc': 0.7345449999999989,
 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     dog      |    automobile   |   79  |
 |     dog      |       bird      |  235  |
 |     cat      |    automobile   |  136  |
 |     cat      |       cat       |  280  |
 |     cat      |       dog       |  419  |
 |     dog      |       dog       |  496  |
 |     bird     |    automobile   |  129  |
 |  automobile  |    automobile   |  657  |
 |     bird     |       cat       |  147  |
 |     dog      |       cat       |  190  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.,
 'f1_score': 0.48539350464034947,
 'log_loss': 1.187935018685503,
 'precision

### The accuracy of this model is poor: only about 49%.
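Accuracy and the confusion matrix are closely related: accuracy is the sum of the counts where `target_label` equals `predicted_label`, divided by the sum of all counts. The sketch below shows that relationship in plain Python. The function name `accuracy_from_confusion` and the toy counts are assumptions for illustration only. They are not the notebook's numbers, since only 10 of the 16 rows are printed above.

```python
def accuracy_from_confusion(rows):
    """Compute accuracy from a list of (target_label, predicted_label, count) tuples."""
    total = sum(count for _, _, count in rows)
    correct = sum(count for target, predicted, count in rows if target == predicted)
    return correct / total

# Hypothetical two-class counts, for illustration only.
toy_rows = [
    ('cat', 'cat', 30),
    ('cat', 'dog', 20),
    ('dog', 'cat', 10),
    ('dog', 'dog', 40),
]

toy_accuracy = accuracy_from_confusion(toy_rows)
print(toy_accuracy)  # 70 correct out of 100 examples
```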

# Can we improve the model using deep features?

We only have 2,005 data points, which is too few to train a deep neural network effectively. Instead, we will use transfer learning: taking deep features computed by a network trained on the full ImageNet dataset, we will train a simple model on this small dataset.
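To make the recipe concrete, here is a minimal sketch of it in plain Python: a fixed feature extractor that is never updated, followed by a small logistic regression trained on its outputs. Everything in it is a toy assumption. `frozen_features` merely stands in for a pretrained network, the data are made up, and the training loop is a generic gradient descent, not how turicreate implements its classifier.

```python
import math

def frozen_features(x):
    """Stand-in for a pretrained network: a fixed mapping that is never trained."""
    return [x[0] + x[1], x[0] - x[1]]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=200):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for f, y in zip(features, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            for j, fj in enumerate(f):
                grad_w[j] += err * fj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, f):
    """Return class 1 when the linear score is positive, else class 0."""
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

# Made-up raw inputs; the label is 1 when the two values sum to a positive number.
raw = [[-1.0, -0.5], [-0.6, -0.9], [-0.8, -0.2],
       [0.7, 0.9], [0.5, 0.6], [1.0, 0.3]]
labels = [0, 0, 0, 1, 1, 1]

# Extract features once with the frozen extractor, then train only the simple model.
features = [frozen_features(x) for x in raw]
w, b = train_logistic(features, labels)
predictions = [predict(w, b, f) for f in features]
print(predictions)
```

The point of the design is that only `w` and `b` are learned from the small dataset, while the feature extractor is reused unchanged.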

In [12]:
len(image_train)

2005

## Computing deep features for our images

The two lines below allow us to compute deep features.  This computation takes a little while, so we have already computed them and saved the results as a column in the data you loaded. 

(Note that if you would like to compute such deep features and have a GPU on your machine, you should use the GPU-enabled turicreate, which will be significantly faster for this task.)

In [15]:
# deep_learning_model = turicreate.load_model('http://s3.amazonaws.com/turicreate-Datasets/deeplearning/imagenet_model_iter45')
# image_train['deep_features'] = deep_learning_model.extract_features(image_train)

As we can see, the column deep_features already contains the pre-computed deep features for this data. 

In [16]:
image_train.head()

id,image,label,deep_features,image_array
24,Height: 32 Width: 32,bird,"[0.24287176132202148, 1.0954537391662598, 0.0, ...","[73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ..."
33,Height: 32 Width: 32,cat,"[0.5250879526138306, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ..."
36,Height: 32 Width: 32,cat,"[0.5660159587860107, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ..."
70,Height: 32 Width: 32,dog,"[1.129795789718628, 0.0, 0.0, 0.7781944870948792, ...","[154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ..."
90,Height: 32 Width: 32,bird,"[1.7178692817687988, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ..."
97,Height: 32 Width: 32,automobile,"[1.5781855583190918, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ..."
107,Height: 32 Width: 32,dog,"[0.0, 0.0, 0.22067785263061523, ...","[97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ..."
121,Height: 32 Width: 32,bird,"[0.0, 0.23753464221954346, ...","[93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ..."
136,Height: 32 Width: 32,automobile,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.57378625869751, ...","[35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ..."
138,Height: 32 Width: 32,bird,"[0.6589357256889343, 0.0, 0.0, 0.0, 0.0, 0.0, ...","[205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ..."


# Given the deep features, let's train a classifier

In [17]:
deep_features_model = turicreate.logistic_classifier.create(image_train,
                                                            features = ['deep_features'],
                                                            target = 'label'
                                                           )

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.



# Apply the deep features model to first few images of test set

In [19]:
image_test[0:3]['image']

dtype: Image
Rows: 3
['Height: 32 Width: 32', 'Height: 32 Width: 32', 'Height: 32 Width: 32']

In [21]:
image_test[0:3]['label']   # actual value

dtype: str
Rows: 3
['cat', 'automobile', 'cat']

In [22]:
deep_features_model.predict(image_test[0:3])   # predicted value

dtype: str
Rows: 3
['cat', 'automobile', 'cat']

### The classifier with deep features gets all of these images right!

# Compute test_data accuracy of deep_features_model

As we can see, deep features give us significantly better accuracy (about 80%, up from about 49% with raw pixels).

In [23]:
deep_features_model.evaluate(image_test)

{'accuracy': 0.7955,
 'auc': 0.9428215416666658,
 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     dog      |    automobile   |   12  |
 |     bird     |       dog       |   56  |
 |     dog      |       cat       |  224  |
 |  automobile  |       cat       |   21  |
 |     dog      |       bird      |   48  |
 |     cat      |    automobile   |   20  |
 |     cat      |       cat       |  706  |
 |     bird     |       cat       |  125  |
 |     cat      |       dog       |  208  |
 |     dog      |       dog       |  716  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.,
 'f1_score': 0.7967920194728516,
 'log_loss': 0.5895928637984982,
 'precision
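
To put the two results side by side, here is a short calculation using the accuracy values reported by `evaluate` in this notebook (0.4895 for raw pixels, 0.7955 for deep features). The variable names are chosen only for illustration.

```python
raw_pixel_accuracy = 0.4895       # from raw_pixel_model.evaluate(image_test)
deep_features_accuracy = 0.7955   # from deep_features_model.evaluate(image_test)

# Absolute gain in accuracy, in percentage points.
gain_points = (deep_features_accuracy - raw_pixel_accuracy) * 100

# Fraction of the raw-pixel model's errors that the deep-features model removes.
raw_error = 1 - raw_pixel_accuracy
deep_error = 1 - deep_features_accuracy
relative_error_reduction = (raw_error - deep_error) / raw_error

print(f"Gain: {gain_points:.1f} percentage points")
print(f"Relative error reduction: {relative_error_reduction:.1%}")
```

The gain is about 30.6 percentage points, which means the deep-features model removes roughly 60% of the errors made by the raw-pixel model.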