# Improving Accuracy

You won't always have a perfect dataset and perfect test subjects. Imagine you have only one picture per subject in your dataset and must find one person's face among thousands of images. The results can get messy very quickly. In this notebook, I will show a way to tackle this problem.

In [1]:
import cv2
import matplotlib.pyplot as plt
import keras_vggface as kv
import modules.utils as utils
import tensorflow as tf
import os
import pandas as pd
import numpy as np
import nmslib

In [2]:
# Declare a FacePreprocess instance.
from modules.FacePreprocess import FacePreprocess
ssd_model = r'./models/ssd/deploy.prototxt.txt'
ssd_weights = r'./models/ssd/res10_300x300_ssd_iter_140000.caffemodel'
processor = FacePreprocess(ssd_model, ssd_weights)

In [3]:
# Load the facial embedding model of your choice (here: VGGFace with a ResNet-50 backbone)
model = kv.VGGFace(
    model='resnet50', 
    include_top=False, 
    input_shape=(224, 224, 3), 
    pooling='avg'
)
input_size = (224, 224)

In [4]:
# ==========================================================================
# Compute Embeddings
id_list = pd.DataFrame(columns=['name', 'file'])
embeddings = []
for id in os.listdir('./dataset/train/'):
    folder = './dataset/train/{}/'.format(id)
    
    # use img_1 for every subject
    file = 'img_1.jpg'
    try:
        filepath = os.path.join(folder, file)
        processed_img = processor.preproc(cv2.imread(filepath))[0][0]
        embedding = model.predict(utils.resize(processed_img, input_size), verbose=False)[0,:]

        id_list.loc[len(id_list.index)] = [id, filepath]
        embeddings.append(embedding)
    except Exception:  # e.g. unreadable image or no face detected
        print('Failed to process/predict {}'.format(filepath))
print('='*10 + '\n{} images processed.'.format(len(embeddings)))

# ==========================================================================
# Initialize nmslib
index_time_params = {'M': 15, 'indexThreadQty': 4, 'efConstruction': 100, 'post' : 0}

# l2 dist.
index_l2 = nmslib.init(
    method = 'hnsw', # hierarchical navigable small world graph
    space = 'l2', # euclidean
    data_type = nmslib.DataType.DENSE_VECTOR
) 
index_l2.addDataPointBatch(embeddings)
index_l2.createIndex(index_time_params)

# cosine simil.
index_cos = nmslib.init(
    method = 'hnsw', # hierarchical navigable small world graph
    space = 'cosinesimil', # cosine
    data_type = nmslib.DataType.DENSE_VECTOR
) 
index_cos.addDataPointBatch(embeddings)
index_cos.createIndex(index_time_params)

5 images processed.


Let's test this model on the videos in `./dataset/test/test_2/`. Each video is named after the subject it shows.

In [5]:
results_all = pd.DataFrame(
    columns = ['ID', 'Model', 'Dist.', 'True', 'False', 'Avg. Confidence', 'Std. Confidence'], 
)

for file in os.listdir('./dataset/test/test_2/'):
    video_path = './dataset/test/test_2/'+file
    id = file.replace('.mp4', '')

    cap = cv2.VideoCapture(video_path)
    count = {
        'l2':{'True': 0, 'False':0, 'conf':[]},
        'cosine':{'True': 0, 'False':0, 'conf':[]},
    }
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # end of video (or read failure): stop reading frames
            break
        try:
            img = frame.copy()
            faces = processor.preproc(img)

            if len(faces)>0:
                for face in faces:
                    # target embeddings
                    target = model.predict(utils.resize(face[0], input_size), verbose=False)[0,:]
                    target = np.array(target, dtype='f')
                    target = np.expand_dims(target, axis=0)

                    # l2
                    neighbors, distances = index_l2.knnQueryBatch(target, k=1, num_threads=4)[0]
                    name = id_list['name'][neighbors[0]]
                    if name == id:
                        count['l2']['True'] += 1
                        count['l2']['conf'].append(distances[0])
                    else:
                        count['l2']['False'] += 1

                    # cosinesimil
                    neighbors, distances = index_cos.knnQueryBatch(target, k=1, num_threads=4)[0]
                    name = id_list['name'][neighbors[0]]
                    if name == id:
                        count['cosine']['True'] += 1
                        count['cosine']['conf'].append(distances[0])
                    else:
                        count['cosine']['False'] += 1
        except:
            break
    cap.release()

    results_all.loc[len(results_all)] = [id, 'resnet50', 'l2', count['l2']['True'], count['l2']['False'], np.average(count['l2']['conf']), np.std(count['l2']['conf'])]
    results_all.loc[len(results_all)] = [id, 'resnet50', 'cosinesimil', count['cosine']['True'], count['cosine']['False'], np.average(count['cosine']['conf']), np.std(count['cosine']['conf'])]
results_all

| | ID | Model | Dist. | True | False | Avg. Confidence | Std. Confidence |
|---|---|---|---|---|---|---|---|
| 0 | irene | resnet50 | l2 | 174 | 80 | 6621.646484 | 763.319153 |
| 1 | irene | resnet50 | cosinesimil | 151 | 103 | 0.322694 | 0.038077 |
| 2 | joy | resnet50 | l2 | 540 | 3 | 8066.996094 | 1063.527222 |
| 3 | joy | resnet50 | cosinesimil | 542 | 1 | 0.333638 | 0.036382 |
| 4 | seulgi | resnet50 | l2 | 211 | 7 | 6007.01416 | 985.237366 |
| 5 | seulgi | resnet50 | cosinesimil | 213 | 5 | 0.244937 | 0.048893 |
| 6 | wendy | resnet50 | l2 | 91 | 116 | 9710.950195 | 1389.878418 |
| 7 | wendy | resnet50 | cosinesimil | 93 | 114 | 0.418397 | 0.064799 |
| 8 | yeri | resnet50 | l2 | 320 | 94 | 7113.11084 | 1012.91864 |
| 9 | yeri | resnet50 | cosinesimil | 340 | 74 | 0.327458 | 0.094571 |


In [6]:
accuracy = np.sum(results_all['True'])/(np.sum(results_all['True'])+np.sum(results_all['False']))*100
print('Accuracy: {:0.2f} %'.format(accuracy))

Accuracy: 81.75 %


In [7]:
output_path = './output/real_time_face_recognition/results.xlsx'
with pd.ExcelWriter(output_path, engine='openpyxl', mode='a', if_sheet_exists='replace') as writer:  
    results_all.to_excel(writer, sheet_name='single sample', index=False)

As we can see, with only one image per subject the accuracy drops to 81.75% (for reference, with 5 images per subject it is 85.45%). With only 5 subjects in the dataset the results are still tolerable, but as more people are added to the dataset, the results will only get worse. Note that the "Confidence" columns hold the nearest-neighbor distance of the correct matches, so a lower value means a closer match.
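The aggregate accuracy hides large differences between subjects. As a minimal sketch, the snippet below copies the True/False counts from the results table above into a plain dictionary and breaks the accuracy down per ID. The names `counts` and `per_subject_accuracy` are hypothetical helpers introduced for illustration, not part of the notebook.

```python
# (True, False) counts per subject, summed over the l2 and cosinesimil rows
# of the results table above.
counts = {
    'irene':  (174 + 151, 80 + 103),
    'joy':    (540 + 542, 3 + 1),
    'seulgi': (211 + 213, 7 + 5),
    'wendy':  (91 + 93, 116 + 114),
    'yeri':   (320 + 340, 94 + 74),
}


def per_subject_accuracy(counts):
    """Return {subject: accuracy in percent} from (true, false) counts."""
    return {
        name: true / (true + false) * 100
        for name, (true, false) in counts.items()
    }


accuracy_by_id = per_subject_accuracy(counts)

# Overall accuracy, computed the same way as in the notebook cell above.
total_true = sum(true for true, _ in counts.values())
total_false = sum(false for _, false in counts.values())
overall = total_true / (total_true + total_false) * 100

for name, acc in sorted(accuracy_by_id.items(), key=lambda item: item[1]):
    print('{:<8s} {:6.2f} %'.format(name, acc))
print('{:<8s} {:6.2f} %'.format('overall', overall))
```

In this table, wendy and irene have the lowest per-subject accuracy, while joy and seulgi are recognized almost every time.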

## Method: Add confidence threshold

## Method: Double verification

For this method, we will use both distance metrics (L2 and cosine similarity) at once. If both indexes return the same ID for a face, we'll add it to the tally; otherwise we'll count the face as unknown.

![fig](./assets/fig3.svg)
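The decision rule described above can be sketched independently of the embedding model and the nmslib indexes. This is only an illustration under stated assumptions: `double_verify` and `tally` are hypothetical helper names, the `'unknown'` label is an assumed placeholder, and the input is assumed to be a list of (L2 result, cosine result) name pairs, one pair per detected face.

```python
def double_verify(l2_name, cos_name, unknown_label='unknown'):
    """Return the ID only if both nearest-neighbor lookups agree."""
    return l2_name if l2_name == cos_name else unknown_label


def tally(predictions, true_id, unknown_label='unknown'):
    """Count correct, incorrect and unknown faces for one test video.

    predictions: iterable of (l2_name, cos_name) pairs, one per detected face.
    true_id:     the subject actually shown in the video.
    """
    counts = {'True': 0, 'False': 0, 'Unknown': 0}
    for l2_name, cos_name in predictions:
        name = double_verify(l2_name, cos_name, unknown_label)
        if name == unknown_label:
            counts['Unknown'] += 1   # the two indexes disagree
        elif name == true_id:
            counts['True'] += 1      # both agree on the right subject
        else:
            counts['False'] += 1     # both agree on the wrong subject
    return counts


# Hypothetical per-face results for a video of 'joy' (made-up values).
example = [('joy', 'joy'), ('joy', 'joy'), ('joy', 'irene'), ('wendy', 'wendy')]
print(tally(example, true_id='joy'))
# -> {'True': 2, 'False': 1, 'Unknown': 1}
```

In the notebook's test loop, the pair for each face would come from the names looked up via `index_l2` and `index_cos`. Disagreements then land in the unknown bucket instead of being counted as a match for either index.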