### Face Similarity using a pretrained FaceNet model ### 
In this notebook, we use a pretrained **FaceNet** network to build a simple face similarity system. To test it, we generate 128-d encodings for 10,000 randomly selected celebrities out of almost 100,000 available ones. This data was collected by Microsoft and is publicly available at https://exposing.ai/msceleb/. I used only 10,000 images to build the encoding database because I only wanted to try out the FaceNet model, but filling the `celeb_images` folder with the rest of the available images would give a larger pool of candidates and likely improve the quality of the matches.

*Here is an interesting article on the chances of having a doppelganger/lookalike somewhere in the world:*
https://www.bbc.com/future/article/20160712-you-are-surprisingly-likely-to-have-a-living-doppelganger


In [20]:
import cv2  # needed below for cv2.imread and cv2.cvtColor
import os
import numpy as np
from numpy import genfromtxt
import pandas as pd
import tensorflow as tf
from PIL import Image
import json

%matplotlib inline
%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [28]:
# example of loading the keras facenet model
from keras.models import load_model
# load the model
model = load_model('facenet_keras.h5/model/facenet_keras.h5')
# summarize input and output shape
print(model.inputs)
print(model.outputs)


[<KerasTensor: shape=(None, 160, 160, 3) dtype=float32 (created by layer 'input_1')>]
[<KerasTensor: shape=(None, 128) dtype=float32 (created by layer 'Bottleneck_BatchNorm')>]


In [29]:
# OpenCV loads images in BGR order, so convert to RGB before further processing
img = cv2.cvtColor(cv2.imread("images/danielle.png"), cv2.COLOR_BGR2RGB)

In [30]:
img.shape

(96, 96, 3)

In [31]:
image = Image.fromarray(img)
image = image.resize((160, 160))
img = np.asarray(image)
img.shape

(160, 160, 3)

In [32]:
img = img.astype('float32')
# standardize pixel values across channels (global)
mean, std = img.mean(), img.std()
img = (img - mean) / std
img = np.expand_dims(img, axis=0)
img.shape

(1, 160, 160, 3)

In [33]:
prediction = model.predict(img)
prediction.shape

(1, 128)

In [16]:
def process_img(path):
    """
    Load an image and prepare it for the FaceNet model:
    RGB order, 160x160 pixels, standardized, with a batch dimension.
    """
    # OpenCV loads images in BGR order, so convert to RGB
    img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    image = Image.fromarray(img)
    image = image.resize((160, 160))
    img = np.asarray(image)
    img = img.astype('float32')
    # standardize pixel values across channels (global)
    mean, std = img.mean(), img.std()
    img = (img - mean) / std
    # add the batch dimension: (160, 160, 3) -> (1, 160, 160, 3)
    img = np.expand_dims(img, axis=0)
    return img

In [17]:
database = {}

# First time generating encodings
for name in os.listdir('celeb_images/'):
    i = int(name.split('.')[0])
    database[i] = model.predict(process_img('celeb_images/'+name))
    if i%1000==0:
        print ("\tGenerated " + str(i) + " encodings...")  # KEEP TRACK OF PROGRESS
        


Saving generated database of 128-d encoded features.

In [18]:
# Save the database object as JSON so we don't have to regenerate
# the encodings when reusing it

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)
    
with open('data.json', 'w') as fp:
    json.dump(database, fp, cls=NumpyEncoder)

In [21]:
with open('data.json', 'r') as fp:
    database = json.load(fp)
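
One thing to keep in mind when reloading: JSON has no integer keys and no NumPy arrays, so after the round trip the keys come back as strings and each encoding comes back as a nested list. Below is a minimal sketch of that behaviour, using a made-up two-entry database and a hypothetical helper `restore_database` (neither is part of the original notebook).

```python
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        # Serialize NumPy arrays as nested lists
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

def restore_database(raw):
    """Hypothetical helper: turn JSON-loaded data back into {int: ndarray}."""
    return {int(k): np.asarray(v, dtype='float32') for k, v in raw.items()}

# Toy stand-in for the real database: two fake (1, 128) encodings
toy = {1: np.zeros((1, 128), dtype='float32'),
       2: np.ones((1, 128), dtype='float32')}

# Round trip through JSON: keys become strings, arrays become lists
raw = json.loads(json.dumps(toy, cls=NumpyEncoder))
restored = restore_database(raw)
```

Whether to convert back is a matter of taste. NumPy will also accept the nested lists directly in arithmetic, but converting once up front keeps the types predictable.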

In [22]:
def conv_to_string(i):
    """
    Convert an image id to its zero-padded file name, e.g. 42 -> '000042.jpg'
    """
    # zfill pads to 6 digits and, unlike a while loop, cannot hang on longer ids
    return str(i).zfill(6) + '.jpg'

In [25]:
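img = model.predict(process_img('enter_path_to_desired_persons_face'))
minV = float('inf')  # smallest distance found so far
minKey = None        # key of the closest database entry

print (img)
for key, k_img in database.items():
    # values loaded from JSON are nested lists, so convert them back to arrays
    value = np.linalg.norm(np.asarray(k_img) - img)
    if value < minV:
        minV = value
        minKey = key

similar = Image.open('celeb_images/' + conv_to_string(minKey))
similar.show()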
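
The loop above compares the query against one encoding at a time. With the encodings stacked into a single array, the same Euclidean nearest-neighbour search can be done in one NumPy call. The following is a minimal sketch rather than part of the original notebook: `find_closest` is a hypothetical helper, and the toy database uses random vectors in place of real FaceNet encodings.

```python
import numpy as np

def find_closest(query, database):
    """Return (key, distance) of the database entry closest to `query`.

    `query` has shape (1, 128); `database` maps keys to (1, 128)
    encodings stored as arrays or nested lists.
    """
    keys = list(database.keys())
    # Stack all encodings into one (N, 128) matrix
    encodings = np.asarray([np.asarray(database[k]).reshape(-1) for k in keys])
    # Euclidean distance from the query to every row at once
    distances = np.linalg.norm(encodings - np.asarray(query).reshape(1, -1), axis=1)
    best = int(np.argmin(distances))
    return keys[best], float(distances[best])

# Toy data: random vectors standing in for real encodings
rng = np.random.default_rng(0)
toy_db = {str(i): rng.normal(size=(1, 128)).tolist() for i in range(5)}

# A query that is a slightly perturbed copy of entry '3'
query = np.asarray(toy_db['3']) + 0.01
key, dist = find_closest(query, toy_db)
```

In the notebook this would be called as `find_closest(img, database)`, with the returned key passed to `conv_to_string` to locate the matching image file.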

### Now, to see some results ### 
Now that we have implemented this simple face similarity model, let's test it out. I chose a picture of my wonderful friend and compared it against our dataset. 
<img src="test/digi.png"> <center> V </center> <img src="test/garyVee.jpg">

To be honest, I see quite a resemblance between my friend and Gary Vee, especially considering that we have only 10,000 images in our dataset and how many combinations of facial features are possible. Now you can use this notebook to see which celebrity looks most similar to you.

### That's it. This was just a simple notebook to showcase the power of computer vision and pretrained models, and ways to have fun with them.