### Transfer Learning Example

#### Goal
- Take the Kaggle dogs-vs-cats dataset
- Use a pretrained Xception model to predict an ImageNet label for each image
- Use a lookup list found online to translate each ImageNet label into a dog or cat classification
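
The notebook starts from a CSV of predictions that was produced elsewhere. As a rough sketch of how such a file might be generated, the following assumes TensorFlow's bundled Keras, a hypothetical image directory, and file names of the form `cat.123.jpg`. The helper names `truth_from_filename` and `predict_top1` are illustrative and are not taken from the original run; module paths may differ between Keras versions.

```python
import csv
import os


def truth_from_filename(filename):
    """Kaggle training files are named like 'cat.11665.jpg' or 'dog.10844.jpg'."""
    return os.path.basename(filename).split('.')[0]


def predict_top1(image_dir, out_csv='raw_prediction.csv'):
    """Write the top-1 ImageNet prediction for every image in image_dir (hypothetical helper)."""
    # Imported lazily so the helper above stays usable without TensorFlow installed.
    import numpy as np
    from tensorflow.keras.applications.xception import (
        Xception, decode_predictions, preprocess_input)
    from tensorflow.keras.preprocessing import image

    model = Xception(weights='imagenet')  # pretrained weights, no further training
    with open(out_csv, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['id', 'class', 'prob', 'truth', 'file'])
        for name in sorted(os.listdir(image_dir)):
            # Xception expects 299x299 RGB input
            img = image.load_img(os.path.join(image_dir, name), target_size=(299, 299))
            batch = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
            # decode_predictions yields (wordnet_id, class_name, probability) tuples
            wnid, cls, prob = decode_predictions(model.predict(batch), top=1)[0][0]
            writer.writerow([wnid, cls, float(prob), truth_from_filename(name), name])
```

Calling `predict_top1('train')` on a directory of training images would write one row per image with the same `id`, `class`, `prob`, `truth`, and `file` columns that the notebook reads later.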

#### Summary of result
Predicting ImageNet labels for the 25000 images in the dogs-vs-cats training dataset directly with Xception, without any additional training, gives:
- 23156 correct predictions
- 1190 images assigned an ImageNet label that is neither a dog nor a cat category (1056 cats, 134 dogs)
- 17 dogs misclassified as cat
- 637 cats misclassified as dog
- Accuracy and error rate as follows

In [16]:
# Accuracy and error rate; errors = 1190 unknown + 654 dog/cat confusions (17 + 637)
23156/25000, (1190 + 654) / 25000

(0.92624, 0.07376)
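
The arithmetic above can be reproduced from the six grouped counts reported in this notebook's summary output. The sketch below hard-codes those counts; the variable names, and the extra figure for accuracy among images that received a dog or cat label, are additions for illustration.

```python
# Counts copied from the grouped summary in this notebook: (truth, pred) -> cases
counts = {
    ('cat', 'cat'): 10807, ('cat', 'dog'): 637, ('cat', 'unknown'): 1056,
    ('dog', 'cat'): 17, ('dog', 'dog'): 12349, ('dog', 'unknown'): 134,
}

total = sum(counts.values())
correct = sum(n for (truth, pred), n in counts.items() if truth == pred)
unknown = sum(n for (truth, pred), n in counts.items() if pred == 'unknown')
confused = total - correct - unknown  # dog predicted as cat, or cat as dog

accuracy = correct / total
error_rate = (unknown + confused) / total
# Accuracy restricted to images that received a dog or cat label at all
accuracy_when_labelled = correct / (total - unknown)

print(total, correct, unknown, confused)          # 25000 23156 1190 654
print(round(accuracy, 5), round(error_rate, 5))   # 0.92624 0.07376
```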

#### Ref
- ImageNet labels to dog-vs-cat map: https://github.com/zygmuntz/kaggle-cats-and-dogs/
- Specifically:
  - https://github.com/zygmuntz/kaggle-cats-and-dogs/blob/master/overfeat/compute_train_acc.py
  - https://github.com/zygmuntz/kaggle-cats-and-dogs/blob/master/overfeat/predict.py

In [2]:
# Forward slashes avoid backslash-escape problems in the path; the with-block closes the file
with open('kaggle-cats-and-dogs/overfeat/data/cats.txt') as cf:
    cats = cf.read().splitlines()

In [3]:
# Same pattern for the dog label list
with open('kaggle-cats-and-dogs/overfeat/data/dogs.txt') as df:
    dogs = df.read().splitlines()

In [4]:
dog_labels = [x.split(',')[0] for x in dogs]
cat_labels = [x.split(',')[0] for x in cats]
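
One possible way to package the two label lists is a single dictionary, which gives constant-time lookups and skips blank entries left over from the file read. This is only a sketch: `build_lookup` and `to_binary_label` are hypothetical helpers, and the sample labels are illustrative stand-ins for the contents of `dogs.txt` and `cats.txt`.

```python
def build_lookup(dog_labels, cat_labels):
    """Map each ImageNet label to 'dog' or 'cat'; blank entries are skipped."""
    lookup = {label.strip(): 'dog' for label in dog_labels if label.strip()}
    lookup.update({label.strip(): 'cat' for label in cat_labels if label.strip()})
    return lookup


def to_binary_label(imagenet_class, lookup):
    """Translate a Keras-style class name such as 'grey_fox' to dog/cat/unknown."""
    return lookup.get(imagenet_class.replace('_', ' '), 'unknown')


# Illustrative entries only; the real lists come from dogs.txt and cats.txt
sample_dogs = ['schipperke', 'kelpie', 'grey fox', '']
sample_cats = ['tabby', 'Egyptian cat', '']
lookup = build_lookup(sample_dogs, sample_cats)

print(to_binary_label('grey_fox', lookup))      # dog
print(to_binary_label('Egyptian_cat', lookup))  # cat
print(to_binary_label('doormat', lookup))       # unknown
```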

In [7]:
# The predictions were generated beforehand on a GPU cluster with the Keras pretrained Xception model
import pandas as pd
raw = pd.read_csv('raw_prediction.csv')

In [8]:
pred = []

In [9]:
for r in range(raw.shape[0]):
    cls = raw['class'][r].replace('_', ' ')  # Predicted class names use underscores; the lookup lists use spaces
    if cls in dog_labels:
        pred.append('dog')
    elif cls in cat_labels:
        pred.append('cat')
    else:
        pred.append('unknown')

In [11]:
raw['pred'] = pred

In [13]:
# Summary
raw.groupby(['truth','pred']).size()

truth  pred   
cat    cat        10807
       dog          637
       unknown     1056
dog    cat           17
       dog        12349
       unknown      134
dtype: int64

In [36]:
# Total error rate
(1190 + 654) / 25000

0.07376

In [35]:
raw[(raw.pred != 'unknown') & (raw.pred != raw.truth)]

Unnamed: 0.1,Unnamed: 0,id,class,prob,truth,file,pred
49,49,n02104365,schipperke,0.129365,cat,cat.11665.jpg,dog
61,61,n02105412,kelpie,0.156135,cat,cat.10107.jpg,dog
79,79,n02120505,grey_fox,0.175790,cat,cat.11871.jpg,dog
85,85,n02086910,papillon,0.077606,cat,cat.12098.jpg,dog
99,99,n02120505,grey_fox,0.268015,cat,cat.5539.jpg,dog
120,120,n02095314,wire-haired_fox_terrier,0.502985,cat,cat.599.jpg,dog
122,122,n02106382,Bouvier_des_Flandres,0.229874,cat,cat.10609.jpg,dog
129,129,n02120505,grey_fox,0.087596,cat,cat.3931.jpg,dog
157,157,n02113186,Cardigan,0.574266,cat,cat.6748.jpg,dog
185,185,n02113186,Cardigan,0.108588,cat,cat.6703.jpg,dog
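
To see which ImageNet classes drive the cat-to-dog confusions, the predicted classes can be tallied. The sketch below uses only the ten rows displayed above, so the counts describe this excerpt and not the full set of 654 confusions.

```python
from collections import Counter

# (class, truth, pred) for the ten misclassified rows displayed above
shown_rows = [
    ('schipperke', 'cat', 'dog'),
    ('kelpie', 'cat', 'dog'),
    ('grey_fox', 'cat', 'dog'),
    ('papillon', 'cat', 'dog'),
    ('grey_fox', 'cat', 'dog'),
    ('wire-haired_fox_terrier', 'cat', 'dog'),
    ('Bouvier_des_Flandres', 'cat', 'dog'),
    ('grey_fox', 'cat', 'dog'),
    ('Cardigan', 'cat', 'dog'),
    ('Cardigan', 'cat', 'dog'),
]

class_counts = Counter(cls for cls, truth, pred in shown_rows if truth != pred)
for cls, n in class_counts.most_common(2):
    print(cls, n)
# grey_fox 3
# Cardigan 2
```

On the full frame, a similar tally could be obtained by calling `value_counts()` on the `class` column of the filtered rows.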


In [14]:
raw[raw.pred == 'unknown']

Unnamed: 0.1,Unnamed: 0,id,class,prob,truth,file,pred
15,15,n02133161,American_black_bear,0.258392,dog,dog.10844.jpg,unknown
34,34,n04599235,wool,0.109830,cat,cat.5418.jpg,unknown
42,42,n03223299,doormat,0.292160,cat,cat.5844.jpg,unknown
44,44,n02492660,howler_monkey,0.298832,cat,cat.8455.jpg,unknown
67,67,n03803284,muzzle,0.929156,dog,dog.6159.jpg,unknown
78,78,n04493381,tub,0.174189,cat,cat.7460.jpg,unknown
88,88,n02909870,bucket,0.230651,cat,cat.10096.jpg,unknown
118,118,n01675722,banded_gecko,0.328046,cat,cat.1865.jpg,unknown
132,132,n01883070,wombat,0.571488,dog,dog.8457.jpg,unknown
141,141,n02797295,barrow,0.382083,cat,cat.329.jpg,unknown


In [39]:
# index=False avoids writing yet another 'Unnamed: 0' index column into the file
raw.to_csv('processed_prediction.csv', index=False)