# Accuracy Assessment

To judge how well our classifier performs, we need metrics computed on images it has never seen before. Here we have collected a set of "testing images" that were not used in training.

We organize our testing images in the same file structure as our training dataset:

+ Testing_Images 
    + Bus
        - busimage.jpg
    + Fedex
        - fedeximage.jpg
    + Other
        - otherimage.jpg
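
Because the folder name is the ground-truth class, the layout can be turned into a file-to-label lookup directly. The following is a minimal sketch, not the notebook's own method: `labels_from_folders` is a hypothetical helper, the folder names `Bus`, `Fedex`, and `Other` follow the keys used later in the notebook code, and lower-casing the folder name to get the label is an assumption. It builds a throwaway copy of the layout in a temporary directory so it runs anywhere.

```python
import tempfile
from pathlib import Path


def labels_from_folders(rootdir):
    """Map each file name to the lower-cased name of its parent class folder.

    Hypothetical helper: only looks one level below rootdir.
    """
    labels = {}
    for class_dir in sorted(Path(rootdir).iterdir()):
        if not class_dir.is_dir():
            continue
        for image_path in sorted(class_dir.iterdir()):
            if image_path.is_file():
                labels[image_path.name] = class_dir.name.lower()
    return labels


# Recreate the layout in a temporary directory (empty placeholder files).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Testing_Images"
    for folder, image in [("Bus", "busimage.jpg"),
                          ("Fedex", "fedeximage.jpg"),
                          ("Other", "otherimage.jpg")]:
        (root / folder).mkdir(parents=True)
        (root / folder / image).touch()
    actual = labels_from_folders(root)

print(actual)
# {'busimage.jpg': 'bus', 'fedeximage.jpg': 'fedex', 'otherimage.jpg': 'other'}
```

A flat lookup like this assumes file names are unique across class folders, which holds for timestamped captures but would need checking for other datasets.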
        
        
We then need to read in our predicted labels and compare them to the actual labels:

### Organize Predicted and Actual

In [197]:
import scripts.label_image
import sys
import os
import time
from datetime import datetime

import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import psutil
from PIL import Image, ImageDraw, ImageFont
# metrics used in the Accuracy Measures section
from sklearn.metrics import accuracy_score, cohen_kappa_score


df = pd.read_csv(r'/home/mmann1123/Dropbox/Apps/predicted_labels.csv')



In [198]:
# limit to input of interest
df = df[['Path','Date','Class','Prob']]
# add file name column
df['File'] = df['Path'].map(lambda a: os.path.basename(a))
df.head()

| | Path | Date | Class | Prob | File |
|---|---|---|---|---|---|
| 0 | /home/mmann1123/Dropbox/Apps/PiCameraLogger/16... | 2018-03-16 10:53:00 | largecar | 0.381579 | Picapture_16_03_2018-10:53:00.jpg |
| 1 | /home/mmann1123/Dropbox/Apps/PiCameraLogger/16... | 2018-03-16 15:53:00 | fedex | 0.9347 | Picapture_16_03_2018-15:53:00.jpg |
| 2 | /home/mmann1123/Dropbox/Apps/PiCameraLogger/16... | 2018-03-16 12:38:00 | largecar | 0.372817 | Picapture_16_03_2018-12:38:00.jpg |
| 3 | /home/mmann1123/Dropbox/Apps/PiCameraLogger/16... | 2018-03-16 07:23:00 | car | 0.30412 | Picapture_16_03_2018-07:23:00.jpg |
| 4 | /home/mmann1123/Dropbox/Apps/PiCameraLogger/16... | 2018-03-16 19:15:00 | largecar | 0.281663 | Picapture_16_03_2018-19:15:00.jpg |


In [199]:
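# map directory and files into labels
from functools import reduce  # builtin on Python 2, must be imported on Python 3

def get_directory_structure(rootdir):
    """Create a nested dictionary that represents the folder structure of rootdir."""
    tree = {}  # renamed from `dir` to avoid shadowing the builtin
    rootdir = rootdir.rstrip(os.sep)
    start = rootdir.rfind(os.sep) + 1
    for path, dirs, files in os.walk(rootdir):
        folders = path[start:].split(os.sep)
        # files become keys with value None
        subdir = dict.fromkeys(files)
        # walk down the tree built so far to the parent of the current folder
        parent = reduce(dict.get, folders[:-1], tree)
        parent[folders[-1]] = subdir
    return tree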

dir_dict = get_directory_structure(r'./Testing_Images')

In [200]:
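# find the classifier output for all Bus testing images
# (.copy() avoids pandas' SettingWithCopyWarning when adding a column)
buses = df[df['File'].isin(dir_dict['Testing_Images']['Bus'].keys())].copy()
# add the label implied by the test folder (stored in column 'Pred')
buses['Pred'] = 'bus'
print(buses)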

                                                    Path                 Date  \
147    /home/mmann1123/Dropbox/Apps/PiCameraLogger/16...  2018-03-16 09:36:00   
454    /home/mmann1123/Dropbox/Apps/PiCameraLogger/16...  2018-03-16 12:26:00   
483    /home/mmann1123/Dropbox/Apps/PiCameraLogger/16...  2018-03-16 09:45:00   
685    /home/mmann1123/Dropbox/Apps/PiCameraLogger/16...  2018-03-16 09:14:00   
12355  /home/mmann1123/Dropbox/Apps/PiCameraLogger/29...  2018-03-29 09:14:00   
12456  /home/mmann1123/Dropbox/Apps/PiCameraLogger/29...  2018-03-29 07:15:00   
13503  /home/mmann1123/Dropbox/Apps/PiCameraLogger/22...  2018-03-22 11:11:00   
13672  /home/mmann1123/Dropbox/Apps/PiCameraLogger/22...  2018-03-22 16:34:00   
13800  /home/mmann1123/Dropbox/Apps/PiCameraLogger/22...  2018-03-22 16:27:01   
14920  /home/mmann1123/Dropbox/Apps/PiCameraLogger/14...  2018-04-14 08:11:00   
14934  /home/mmann1123/Dropbox/Apps/PiCameraLogger/14...  2018-04-14 08:48:00   
15056  /home/mmann1123/Dropb

In [201]:
# find the classifier output for all Fedex testing images
Fedex = df[df['File'].isin(dir_dict['Testing_Images']['Fedex'].keys())].copy()
# add the label implied by the test folder (stored in column 'Pred')
Fedex['Pred'] = 'fedex'
#print(Fedex)

In [202]:
# find the classifier output for all Other testing images
Other = df[df['File'].isin(dir_dict['Testing_Images']['Other'].keys())].copy()
# add the label implied by the test folder (stored in column 'Pred')
Other['Pred'] = 'other'

# rename the classifier output in 'Class' to other unless it is bus or fedex
Other.loc[(Other.Class != 'fedex') & (Other.Class != 'bus'), 'Class'] = 'other'
#print(Other)

In [203]:
# Create unified df with all predictions and actual
# (DataFrame.append was removed in pandas 2.0; pd.concat gives the same result)
pred_act = pd.concat([buses, Fedex, Other])
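
The three per-folder cells above could also be collapsed into a single join on the file name. The sketch below illustrates the idea on made-up rows; `preds` and `folder_labels` are hypothetical stand-ins for `df` and a file-to-folder table, not the notebook's data. Note one deliberate difference: it folds every classifier output other than bus or fedex into `other` for all rows, whereas the cells above do so only for images in the Other folder.

```python
import pandas as pd

# Hypothetical classifier output, shaped like the df used above.
preds = pd.DataFrame({
    "File": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
    "Class": ["bus", "fedex", "largecar", "car"],
})

# Hypothetical folder labels for the held-out images (d.jpg is not a test image).
folder_labels = pd.DataFrame({
    "File": ["a.jpg", "b.jpg", "c.jpg"],
    "Pred": ["bus", "bus", "other"],
})

# An inner join keeps only images that appear in both tables.
joined = preds.merge(folder_labels, on="File", how="inner")

# Collapse every classifier output other than bus or fedex into 'other'.
keep = joined["Class"].isin(["bus", "fedex"])
joined.loc[~keep, "Class"] = "other"

print(joined)
```

Whether to collapse classes globally is a design choice: doing so keeps the confusion matrix square, at the cost of hiding which specific class (for example largecar) an image was assigned.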

### Accuracy Measures

In [204]:
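# Confusion matrix, with each column normalized to sum to 100%
# rows: classifier output ('Class'); columns: test-folder label ('Pred')
pd.crosstab(pred_act['Class'], pred_act['Pred'], colnames=['Predicted']).apply(lambda r: 100.0 * r / r.sum())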

| Class \ Predicted | bus | fedex | other |
|---|---|---|---|
| bus | 85.0 | 0.0 | 4.761905 |
| fedex | 15.0 | 90.0 | 9.52381 |
| largecar | 0.0 | 10.0 | 0.0 |
| other | 0.0 | 0.0 | 85.714286 |
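
The `.apply(lambda r: 100.0 * r/r.sum())` call above divides each column by its own total, so every column sums to 100. As a small sketch on made-up labels (the `toy` frame is hypothetical, not the notebook's data), the same column normalization is available through the `normalize` argument of `pd.crosstab`:

```python
import pandas as pd

# Hypothetical labels, not the notebook's data.
toy = pd.DataFrame({
    "Class": ["bus", "bus", "fedex", "fedex", "other", "bus"],
    "Pred":  ["bus", "bus", "bus",   "fedex", "other", "other"],
})

# normalize="columns" divides every column by its total; scale to percent.
table = pd.crosstab(toy["Class"], toy["Pred"], normalize="columns") * 100
print(table.round(1))

# Each column now sums to 100.
print(table.sum(axis=0))
```

Reading such a table column by column answers the question "of the images carrying this column label, what share ended up in each row class?"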


In [205]:
# Overall accuracy 
accuracy_score(pred_act['Class'], pred_act['Pred'])    

0.86885245901639341

In [206]:
# Kappa 
cohen_kappa_score(pred_act['Class'], pred_act['Pred'])    
    

0.80657946888624654
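
For intuition about the two scores above: overall accuracy is the share of images where the two labels agree, and Cohen's kappa rescales that agreement by what would be expected by chance, $\kappa = (p_o - p_e) / (1 - p_e)$. The sketch below computes both by hand on made-up labels; `accuracy_and_kappa` is a hypothetical helper, and the lists are not the notebook's data. It should agree with the unweighted scikit-learn functions used above.

```python
from collections import Counter


def accuracy_and_kappa(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    n = len(labels_a)
    # Observed agreement p_o: share of items where both labels match.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement p_e: sum over classes of the product of marginal shares.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    # Undefined (division by zero) if both lists contain a single identical class.
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, kappa


# Hypothetical labels: 5 of 6 agree.
a = ["bus", "bus", "fedex", "fedex", "other", "other"]
b = ["bus", "fedex", "fedex", "fedex", "other", "other"]
acc, kappa = accuracy_and_kappa(a, b)
print(acc, kappa)  # accuracy is 5/6, kappa is 0.75
```

Kappa is lower than accuracy here because a third of the agreement (p_e = 1/3) would be expected from the class frequencies alone.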

The confusion matrix shows some confusion between the fedex and bus classes: 15% of the bus column falls in the fedex row. We could try retraining the model using inception_v3, which should provide more accurate results. Alternatively, since both classes are indications of congestion, we can leave the model as is.