# Evaluation

After cleaning our test set with object detection, we extracted features from all the data set images using a pre-trained ResNet18 model. We then used the feature vectors to perform classification with the K-Nearest Neighbors (K-NN) algorithm.
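To make the classification step concrete, here is a minimal brute-force K-NN sketch on toy feature vectors. It is an illustration only: the function name `knn_predict`, the Euclidean metric, the tie-breaking rule and the toy data are assumptions, not the project's actual implementation.

```python
import numpy as np

def knn_predict(train_feats, train_labels, test_feats, k=5):
    """Brute-force K-NN: Euclidean distance plus majority vote (illustrative sketch)."""
    # pairwise distances between every test and every train vector: shape (n_test, n_train)
    diff = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))
    # indices of the k closest train vectors for each test vector
    nn_idx = np.argsort(dists, axis=1, kind="stable")[:, :k]
    nn_dist = np.take_along_axis(dists, nn_idx, axis=1)
    nn_labels = train_labels[nn_idx]
    preds = []
    for row in nn_labels:
        values, counts = np.unique(row, return_counts=True)
        # majority vote; on a tie the smallest label wins (an arbitrary choice)
        preds.append(values[np.argmax(counts)])
    return np.array(preds), nn_idx, nn_dist

# toy 2-D "feature vectors" for two hypothetical classes, 10 and 20
train_feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                        [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
train_labels = np.array([10, 10, 10, 20, 20, 20])
test_feats = np.array([[0.05, 0.05], [4.9, 5.0]])

preds, nn_idx, nn_dist = knn_predict(train_feats, train_labels, test_feats, k=3)
print(preds)
```

The three return values loosely mirror the three result files loaded below (predicted class, distances and nearest-neighbor indices), although how those files were actually produced is not shown in this notebook.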

In this notebook we'll analyze the results of that classification.

In [86]:
# imports for code 
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
from tqdm import tqdm 

In [91]:
# load the following csv files as dataframe 
url_test ='https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/test.csv'
url_test_more_classes1 = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/more_classes/test_more_classes1.csv'
url_test_more_classes2 = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/more_classes/test_more_classes2.csv'
url_test_more_classes3 = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/more_classes/test_more_classes3.csv'
url_train = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/train/train.csv' 
url_detctedt_landmarks = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/landmark_classifier/landmarks_csv_files/keep_v3_openimage.csv'
url_clean_v3 = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/clean_test_v3.csv'
url_clean_v3_v4 = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/data/test/clean_test_v3_v4.csv'
url_pred = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/feature_extraction/results_csv/predicted_class_embedded_test.csv'
url_dist = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/feature_extraction/results_csv/dist_embedded_test.csv' 
url_nn = 'https://raw.githubusercontent.com/matankleiner/Identify-Known-Sites-in-Photo-Album/master/feature_extraction/results_csv/nearest_neighbor_embedded_test.csv'

test_df = pd.read_csv(url_test) 
test_more_classes1_df = pd.read_csv(url_test_more_classes1)
test_more_classes2_df = pd.read_csv(url_test_more_classes2)
test_more_classes3_df = pd.read_csv(url_test_more_classes3)
train_df = pd.read_csv(url_train)
detctedt_landmarks_df = pd.read_csv(url_detctedt_landmarks)
clean_v3_df = pd.read_csv(url_clean_v3)
clean_v3_v4_df = pd.read_csv(url_clean_v3_v4)
pred_df = pd.read_csv(url_pred)
dist_df = pd.read_csv(url_dist)
nn_df = pd.read_csv(url_nn)

In [92]:
def change_df(df): 
    """
    Drop the leftover "Unnamed: 0" index column and insert the test set ids as the first column. 
    Param: 
        df (pd.DataFrame): The dataframe to change 
    Return: 
        df (pd.DataFrame): The changed dataframe 
    """
    df = df.drop("Unnamed: 0", axis=1)
    df.insert(0, "id", test_df["id"], True) 
    return df 

pred_df = change_df(pred_df)
pred_df = pred_df.rename(columns={"0": "prediction"})
dist_df = change_df(dist_df)
nn_df = change_df(nn_df)

### Prediction 

In [96]:
# due to the way the test set is organized we split it into 4 different test sets
# convert the "landmarks" column of each test set to np.int64 so it can be compared with the predictions
# (the conversion result has to be assigned back, otherwise it is discarded)
for df in (test_df, test_more_classes1_df, test_more_classes2_df, test_more_classes3_df):
    df["landmarks"] = df["landmarks"].astype(np.int64)
    
# check if any of the given prediction is correct (in each one of the test sets)
pred_series1 = test_df["landmarks"] == pred_df["prediction"]
pred_series2 = test_more_classes1_df["landmarks"] == pred_df["prediction"]
pred_series3 = test_more_classes2_df["landmarks"] == pred_df["prediction"]
pred_series4 = test_more_classes3_df["landmarks"] == pred_df["prediction"]
correct_pred = len(pred_series1[pred_series1].index) + len(pred_series2[pred_series2].index) + \
               len(pred_series3[pred_series3].index) + len(pred_series4[pred_series4].index)
print ("There are {} correct prediction which is {:.2f}% accuracy out of all the landmarks in the test set."\
       .format(correct_pred, correct_pred / test_df[test_df.landmarks != 0].shape[0] * 100))
print("\nThe accuracy out of all the images in the test set is {:.2f}%".format(correct_pred / test_df.shape[0] * 100))

There are 275 correct prediction which is 16.95% accuracy out of all the landmarks in the test set.

The accuracy out of all the images in the test set is 0.23%


Using feature vectors and K-NN classification, we managed to predict **275 landmarks correctly**, which is **16.95% accuracy** out of all the landmarks in the test set.

However, the test set consists mainly of out-of-domain images, so if we calculate our accuracy out of all the images in the test set (i.e., the given test set and the one we cleaned using object detection), it is only **0.23%**.
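The difference between the two figures comes only from the denominator. A small sketch on made-up labels shows the calculation, assuming (as in the filter `test_df.landmarks != 0` used above) that label 0 marks an image without a landmark. The helper name `accuracy_report` and the toy values are hypothetical.

```python
import pandas as pd

def accuracy_report(truth, pred):
    """Return (correct, % accuracy over landmark images, % accuracy over all images)."""
    truth = pd.Series(truth).reset_index(drop=True)
    pred = pd.Series(pred).reset_index(drop=True)
    correct = int((truth == pred).sum())
    n_landmarks = int((truth != 0).sum())   # images that actually contain a landmark
    n_images = len(truth)                   # all images, including out-of-domain ones
    return correct, 100 * correct / n_landmarks, 100 * correct / n_images

# toy example: 6 out-of-domain images (label 0) and 4 landmark images
truth = [0, 0, 0, 0, 0, 0, 7, 7, 9, 3]
pred = [5, 7, 9, 3, 7, 5, 7, 9, 9, 5]

correct, acc_landmarks, acc_all = accuracy_report(truth, pred)
print(correct, acc_landmarks, acc_all)
```

In this toy case the same 2 correct predictions give 50% over the landmark images but only 20% over all images, which is the same effect seen in the real numbers.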

Next, we'll calculate the accuracy on the cleaned versions of the test set. One of them was cleaned using the YOLOv3 object detector and the other was cleaned using both YOLOv3 and YOLOv4.
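Restricting the evaluation to a cleaned subset amounts to keeping only the rows whose `id` survived the cleaning. The sketch below shows the idea on toy data. The ids, labels and the name `clean_df` are made up, and the sketch assumes the ground truth and the predictions share the same row order.

```python
import pandas as pd

# hypothetical predictions and ground truth, aligned row by row
pred_df = pd.DataFrame({"id": ["a", "b", "c", "d", "e"],
                        "prediction": [7, 9, 3, 7, 5]})
truth = pd.Series([7, 0, 3, 0, 9], name="landmarks")

# hypothetical ids that were kept after cleaning with an object detector
clean_df = pd.DataFrame({"id": ["a", "c", "e"]})

is_correct = truth == pred_df["prediction"]   # per-image correctness
kept = pred_df["id"].isin(clean_df["id"])     # True for images in the cleaned set

correct_in_clean = int((is_correct & kept).sum())
n_landmarks_in_clean = int(((truth != 0) & kept).sum())
print(correct_in_clean, n_landmarks_in_clean,
      100 * correct_in_clean / n_landmarks_in_clean)
```

Combining boolean masks with `&` gives the same counts as filtering each series by index, while avoiding the intermediate dataframes.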

In [105]:
# check for the correct prediction in the clean_v3 test set 
pred_clean_v3 = pred_df[pred_df["id"].isin(clean_v3_df["id"])]
pred_clean_v3_1 = pred_series1[pred_series1.index.isin(pred_clean_v3.index)]
pred_clean_v3_2 = pred_series2[pred_series2.index.isin(pred_clean_v3.index)]
pred_clean_v3_3 = pred_series3[pred_series3.index.isin(pred_clean_v3.index)]
pred_clean_v3_4 = pred_series4[pred_series4.index.isin(pred_clean_v3.index)]

correct_pred_v3 = len(pred_clean_v3_1[pred_clean_v3_1].index) + len(pred_clean_v3_2[pred_clean_v3_2].index) + \
                  len(pred_clean_v3_3[pred_clean_v3_3].index) + len(pred_clean_v3_4[pred_clean_v3_4].index)   
                                            
print ("There are {} correct prediction which is {:.2f}% accuracy out of all the landmarks in the clean test set "
        "using YOLO v3.".format(correct_pred_v3, correct_pred_v3/clean_v3_df[clean_v3_df.landmarks != "0"].shape[0]*100))
print("\nThe accuracy out of all the images in this clean test set is {:.2f}%"\
      .format(correct_pred_v3 / clean_v3_df.shape[0] * 100))

There are 275 correct prediction which is 17.08% accuracy out of all the landmarks in the clean test set using YOLO v3.

The accuracy out of all the images in this clean test set is 0.30%


In [107]:
# check for the correct prediction in the clean_v3_v4 test set 
pred_clean_v3_v4 = pred_df[pred_df["id"].isin(clean_v3_v4_df["id"])]

pred_clean_v3_v4_1 = pred_series1[pred_series1.index.isin(pred_clean_v3_v4.index)]
pred_clean_v3_v4_2 = pred_series2[pred_series2.index.isin(pred_clean_v3_v4.index)]
pred_clean_v3_v4_3 = pred_series3[pred_series3.index.isin(pred_clean_v3_v4.index)]
pred_clean_v3_v4_4 = pred_series4[pred_series4.index.isin(pred_clean_v3_v4.index)]

correct_pred_v3_v4 = len(pred_clean_v3_v4_1[pred_clean_v3_v4_1].index)+len(pred_clean_v3_v4_2[pred_clean_v3_v4_2].index)+\
                     len(pred_clean_v3_v4_3[pred_clean_v3_v4_3].index)+len(pred_clean_v3_v4_4[pred_clean_v3_v4_4].index)   

print ("There are {} correct prediction which is {:.2f}% accuracy out of all the landmarks in the clean test set "
        "using YOLO v3 and\nYOLO v4.".format(correct_pred_v3_v4,\
        correct_pred_v3_v4 / clean_v3_v4_df[clean_v3_v4_df.landmarks != "0"].shape[0] * 100))
print("\nThe accuracy out of all the images in this clean test set is {:.2f}%"\
      .format(correct_pred_v3_v4 / clean_v3_v4_df.shape[0] * 100))

There are 273 correct prediction which is 17.13% accuracy out of all the landmarks in the clean test set using YOLO v3 and
YOLO v4.

The accuracy out of all the images in this clean test set is 0.32%


As we can see, using the cleaned test sets improves the accuracy, both out of all the landmarks (from 16.95% to 17.08% and 17.13%) and out of all the images (from 0.23% to 0.30% and 0.32%). However, the improvement is small.

The best results we received are on the test set cleaned using YOLO v3 and YOLO v4. In this test set we correctly predicted **273 landmarks**, which is **17.13% accuracy** out of all the landmarks in this test set and **0.32% accuracy** out of all the images in this test set.

### Nearest Neighbor examination 

In [109]:
# nn_df holds the index of each matching neighbor in the train set; we would like to replace it with the matching class 
for col in ["0", "1", "2", "3", "4"]:
    nn_df[col] = train_df.loc[nn_df[col], "landmark_id"].values

In [110]:
nn_df

| | id | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|---|
| 0 | e324e0f3e6d9e504 | 42422 | 79959 | 138982 | 93154 | 147263 |
| 1 | d9e17c5f3e0c47b3 | 14968 | 41941 | 95885 | 117418 | 38746 |
| 2 | 1a748a755ed67512 | 5156 | 164193 | 164193 | 67109 | 84309 |
| 3 | 537bf9bdfccdafea | 48328 | 69301 | 136675 | 158991 | 136675 |
| 4 | 13f4c974274ee08b | 136675 | 202793 | 25369 | 187755 | 188686 |
| ... | ... | ... | ... | ... | ... | ... |
| 117222 | e351c3e672c25fbd | 47663 | 190441 | 23777 | 23777 | 56062 |
| 117223 | 5426472625271a4d | 54785 | 54785 | 54785 | 54785 | 113750 |
| 117224 | 7b6a585405978398 | 171111 | 112512 | 200128 | 21500 | 142109 |
| 117225 | d885235ba249cf5d | 162403 | 162403 | 162403 | 115930 | 136675 |


In [131]:
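# count the test images whose ground-truth landmark equals the class of the neighbor stored in column "4"
first_neighbor_series = test_df["landmarks"] == nn_df["4"]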
print ("There are {} correct prediction.".format(len(first_neighbor_series[first_neighbor_series].index)))

There are 139 correct prediction.
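The same comparison can be run for every neighbor column at once, and also as an "any of the neighbors" check. The sketch below does this on toy data. The column names follow `nn_df`, but the values and the variable names `truth` and `nn_classes` are made up for illustration.

```python
import pandas as pd

# hypothetical ground truth and neighbor classes (one column per neighbor)
truth = pd.Series([7, 3, 9, 0])
nn_classes = pd.DataFrame({
    "0": [7, 1, 2, 5],
    "1": [7, 3, 2, 5],
    "2": [4, 3, 9, 6],
})

# compare every neighbor column with the ground truth, row by row
matches = nn_classes.eq(truth, axis=0)

per_column = matches.sum()                # correct matches for each neighbor column
any_hit = int(matches.any(axis=1).sum())  # images where at least one neighbor matches
print(per_column.tolist(), any_hit)
```

Counting matches per column shows which neighbor position carries the most signal, and the any-neighbor count gives an upper bound on what a vote over those neighbors could achieve.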
