# Improving Accuracy

You won't always have a perfect dataset or perfect test subjects. Imagine a situation where you have only one picture per subject in your dataset, and you have to find one person's face among thousands of images. The results can get messy very quickly. In this notebook, I will show a way to tackle this problem.

In [6]:
import cv2
import matplotlib.pyplot as plt
import keras_vggface as kv
import modules.utils as utils
import tensorflow as tf
import os
import pandas as pd
import numpy as np
import nmslib

In [7]:
# Declare a FacePreprocess instance.
from modules.FacePreprocess import FacePreprocess
ssd_model = r'./models/ssd/deploy.prototxt.txt'
ssd_weights = r'./models/ssd/res10_300x300_ssd_iter_140000.caffemodel'
processor = FacePreprocess(ssd_model, ssd_weights)

In [8]:
# Use the facial embedding model you want to use
model = kv.VGGFace(
    model='resnet50', 
    include_top=False, 
    input_shape=(224, 224, 3), 
    pooling='avg'
)
input_size = (224, 224)

In [9]:
# ==========================================================================
# Compute Embeddings
id_list = pd.DataFrame(columns=['name', 'file'])
embeddings = []
for subject in os.listdir('./dataset/train/'):
    folder = './dataset/train/{}/'.format(subject)

    # use img_1 for every subject
    filepath = os.path.join(folder, 'img_1.jpg')
    try:
        img = cv2.imread(filepath)
        if img is None:
            # cv2.imread returns None instead of raising when it cannot read a file
            raise FileNotFoundError(filepath)
        processed_img = processor.preproc(img)[0][0]
        embedding = model.predict(utils.resize(processed_img, input_size), verbose=False)[0,:]

        id_list.loc[len(id_list.index)] = [subject, filepath]
        embeddings.append(embedding)
    except Exception as e:
        # skip this subject but report why it failed
        print('Failed to process/predict {}: {}'.format(filepath, e))
print('='*10 + '\n{} images processed.'.format(len(embeddings)))

# ==========================================================================
# Initialize nmslib
index_time_params = {'M': 15, 'indexThreadQty': 4, 'efConstruction': 100, 'post' : 0}

# l2 dist.
index_l2 = nmslib.init(
    method = 'hnsw', # hierarchical navigable small world graph
    space = 'l2', # euclidean
    data_type = nmslib.DataType.DENSE_VECTOR
) 
index_l2.addDataPointBatch(embeddings)
index_l2.createIndex(index_time_params)

# cosine simil.
index_cos = nmslib.init(
    method = 'hnsw', # hierarchical navigable small world graph
    space = 'cosinesimil', # cosine
    data_type = nmslib.DataType.DENSE_VECTOR
) 
index_cos.addDataPointBatch(embeddings)
index_cos.createIndex(index_time_params)

5 images processed.


Let's test this model on the example from [notebook 4](4.%20real%20time%20face%20recognition.ipynb).

In [11]:
# original clip from https://youtu.be/Ia3x_X_OX58?si=aA5GdMpRGcCar2xF 
video_path = './dataset/test/test_2/joy.mp4'

# one row per distance metric, every tally starting at 0
results = pd.DataFrame(
    0,
    columns = ['count', 'irene', 'seulgi', 'wendy', 'joy', 'yeri'], 
    index = ['l2', 'cosinesimil']
)

# load video
cap = cv2.VideoCapture(video_path)

# read video frame-by-frame
while(cap.isOpened()):
    ret, frame = cap.read()
    if ret:
        img = frame.copy()
        faces = processor.preproc(img)

        if len(faces)>0:
            for face in faces:
                # every detected face counts once for each metric;
                # .loc[row, column] updates `results` itself, whereas chained
                # indexing (results[col][row]) may update a temporary copy
                results.loc['l2', 'count'] += 1
                results.loc['cosinesimil', 'count'] += 1

                # target embeddings
                target = model.predict(utils.resize(face[0], input_size), verbose=False)[0,:]
                target = np.array(target, dtype='f')
                target = np.expand_dims(target, axis=0)

                # l2
                neighbors, distances = index_l2.knnQueryBatch(target, k=1, num_threads=4)[0]
                name = id_list['name'][neighbors[0]]
                results.loc['l2', name] += 1

                # cosinesimil
                neighbors, distances = index_cos.knnQueryBatch(target, k=1, num_threads=4)[0]
                name = id_list['name'][neighbors[0]]
                results.loc['cosinesimil', name] += 1
    else:
        break

cap.release()

In [12]:
results

|  | count | irene | seulgi | wendy | joy | yeri |
|---|---|---|---|---|---|---|
| l2 | 543 | 0 | 0 | 0 | 540 | 3 |
| cosinesimil | 543 | 0 | 0 | 0 | 542 | 1 |


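As we can see, with only one image per subject, the accuracy isn't perfect: assuming every face in the clip is Joy's, L2 distance labeled 3 of the 543 detected faces as yeri, and cosine similarity labeled 1 as yeri. We only have 5 subjects in our dataset, so the results aren't that bad here. But as more people are added to the dataset, the results are likely to get worse.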
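To put numbers on this, the tallies can be turned into an accuracy per metric. The snippet below is a minimal sketch: the counts are copied by hand from the results table, and it assumes that every detected face in the clip belongs to joy (the clip is named `joy.mp4`, but the notebook does not verify this). `tallies` and `accuracy` are names introduced here for illustration only.

```python
# Counts copied by hand from the results table above.
tallies = {
    'l2': {'count': 543, 'joy': 540, 'yeri': 3},
    'cosinesimil': {'count': 543, 'joy': 542, 'yeri': 1},
}

def accuracy(row, expected='joy'):
    """Share of detected faces assigned to the expected subject.

    Assumption: every detected face really is `expected`.
    """
    return row[expected] / row['count']

for metric, row in tallies.items():
    print('{}: {:.2%}'.format(metric, accuracy(row)))
# l2: 99.45%
# cosinesimil: 99.82%
```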

## Method: Double verification

For this method, we will query both indexes (L2 and cosine similarity) for every face. If both return the same ID, we'll add it to the tally; otherwise, we'll count the face as unknown.

![fig](./assets/fig3.svg)
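The decision rule described above can be sketched as a small helper. This is a minimal sketch, not necessarily the notebook's final implementation: `double_verify` and `tally_double_verified` are hypothetical names, and the name pairs in `example` are made-up stand-ins for the IDs that `index_l2.knnQueryBatch` and `index_cos.knnQueryBatch` would yield (through `id_list['name']`) for each detected face.

```python
from collections import Counter

SUBJECTS = ['irene', 'seulgi', 'wendy', 'joy', 'yeri']

def double_verify(name_l2, name_cos, unknown='unknown'):
    """Return the shared ID if both metrics agree, otherwise `unknown`."""
    return name_l2 if name_l2 == name_cos else unknown

def tally_double_verified(predictions, subjects=SUBJECTS):
    """Count verified IDs over (l2_name, cosine_name) pairs, one pair per face."""
    counts = Counter({label: 0 for label in ['unknown', *subjects]})
    for name_l2, name_cos in predictions:
        counts[double_verify(name_l2, name_cos)] += 1
    return counts

# Made-up nearest-neighbor names for three faces (not real results).
example = [('joy', 'joy'), ('yeri', 'joy'), ('joy', 'joy')]
print(dict(tally_double_verified(example)))
# {'unknown': 1, 'irene': 0, 'seulgi': 0, 'wendy': 0, 'joy': 2, 'yeri': 0}
```

Agreement only filters out faces on which the two metrics disagree. If both return the same wrong ID, the face is still counted under that ID.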

In [14]:
results_DV = pd.DataFrame(
    columns = ['unknown', 'irene', 'seulgi', 'wendy', 'joy', 'yeri'], 
    
)
results_DV.fillna(0, inplace=True)

In [15]:
results_DV

|  | unknown | irene | seulgi | wendy | joy | yeri |
|---|---|---|---|---|---|---|
