Face-Recognition-in-Real-Time

Create your own database, compile the triplet loss with the pre-trained FaceNet model, and run real-time face recognition on localhost by typing "flask run" in the command prompt.

Classification vs One Shot Learning

In a classic Convolutional Neural Network (CNN) we train the layers to learn features of the objects we want to classify. This requires a lot of data during training, for example when classifying various animals or vehicles. If we then want to classify a new animal, we have to re-train the entire network or use transfer learning.

In "One Shot Learning" we need only one training example per class/category. For this we use a "Siamese" network (Siamese means twins). The input to the network is an image and the output is an encoding of that image. A second image is fed through the same network to obtain its encoding, and we then calculate the distance between the two encodings. If this distance is less than a threshold, the two images show the same person; otherwise they do not. The distance is calculated by the following formula:

$$d(x^{(1)}, x^{(2)}) = \lVert f(x^{(1)}) - f(x^{(2)}) \rVert^2$$
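As a minimal sketch of the comparison described above, the snippet below computes the squared L2 distance between two encodings with NumPy and applies a threshold. The function names, the toy three-dimensional vectors, and the threshold of 0.7 are illustrative assumptions, not values taken from this repository.

```python
import numpy as np

def squared_distance(enc1, enc2):
    """Squared L2 distance between two encoding vectors."""
    diff = np.asarray(enc1, dtype=float) - np.asarray(enc2, dtype=float)
    return float(np.sum(diff ** 2))

def is_same_person(enc1, enc2, threshold=0.7):
    # threshold is a hypothetical value; it has to be tuned on real data
    return squared_distance(enc1, enc2) < threshold

# Toy encodings standing in for the network outputs f(x(1)) and f(x(2))
a = np.array([0.1, 0.2, 0.3])
b = np.array([0.1, 0.25, 0.3])   # close to a
c = np.array([0.9, -0.4, 0.5])   # far from a

print(is_same_person(a, b))
print(is_same_person(a, c))
```

In practice the encodings would come from the trained network rather than being typed in by hand.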

The network is trained with the triplet loss function. During training we have three images: an anchor image of a person (A), a positive image of the same person (P), and a negative image of a different person (N).

The triplet loss function is calculated by the following formula:

$$L(A, P, N) = \max\left(\lVert f(A) - f(P) \rVert^2 - \lVert f(A) - f(N) \rVert^2 + \alpha,\ 0\right)$$

α: a hyperparameter (the margin) that helps prevent the trivial solution, in which the network outputs the same encoding for every image.

For example, with α = 0.2 we want the distance between the anchor and the negative image to exceed the distance between the anchor and the positive image by at least this margin.
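To make the role of the margin concrete, here is a small NumPy sketch of the triplet loss for a single triplet. It is an illustration written for this README, not the code in triplet_loss_inception.py; the function name and the toy two-dimensional encodings are assumptions.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Triplet loss for one (anchor, positive, negative) triplet of encodings."""
    pos_dist = np.sum((f_a - f_p) ** 2)   # squared distance anchor-positive
    neg_dist = np.sum((f_a - f_n) ** 2)   # squared distance anchor-negative
    return float(max(pos_dist - neg_dist + alpha, 0.0))

anchor = np.array([0.0, 1.0])
positive = np.array([0.0, 0.9])
negative = np.array([1.0, 0.0])

# The negative is already much farther away than the positive: loss is zero.
easy = triplet_loss(anchor, positive, negative)

# The negative is as close as it can be (identical to the anchor): loss is positive.
hard = triplet_loss(anchor, positive, anchor)

# Trivial solution: every encoding is the zero vector, so the loss equals alpha.
zeros = np.zeros(2)
trivial = triplet_loss(zeros, zeros, zeros)

print(easy, hard, trivial)
```

The last case shows why the margin matters: without α, mapping every image to the same encoding would give a loss of zero.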

The repository's implementation of the triplet loss is in triplet_loss_inception.py.

This program compiles the pre-trained FaceNet model with the triplet loss and stores the model as triplet_loss_model.h5.

The next step is to create your own database; for this you need a single image of every individual. The function creating_database() generates a dictionary that stores each name as the key and the corresponding image embedding as the value.
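The shape of such a database can be sketched as follows. This is not the creating_database() function from app.py: build_database and fake_encode are hypothetical names, and fake_encode is only a stand-in for the FaceNet forward pass so that the example runs without the model.

```python
import numpy as np

def fake_encode(image):
    """Hypothetical stand-in for FaceNet: flatten the image and L2-normalise it."""
    vec = np.asarray(image, dtype=float).ravel()
    return vec / np.linalg.norm(vec)

def build_database(images_by_name, encode=fake_encode):
    """Map each person's name to the encoding of their single reference image."""
    return {name: encode(image) for name, image in images_by_name.items()}

# Tiny 2x2 arrays standing in for one reference photo per person
images = {
    "alice": np.array([[1.0, 2.0], [3.0, 4.0]]),
    "bob": np.array([[4.0, 3.0], [2.0, 1.0]]),
}

database = build_database(images)
print(sorted(database))          # names are the keys
print(database["alice"].shape)   # each value is one encoding vector
```

With the real model, encode would load the image, run it through FaceNet, and return the resulting embedding.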

The creating_database() function is in app.py. After creating the database you can grab an image from your webcam using OpenCV. A Multi-task Cascaded Convolutional Network (MTCNN) is used to detect faces in the image. After detecting a face, you crop the image and feed it into the FaceNet model to calculate its encoding. This encoding is compared with the encodings from the database using the L2 norm of their difference. If the distance between the calculated encoding and a stored encoding is less than the threshold, the program outputs the name of that person; otherwise it outputs "Not in database".
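The lookup step can be sketched as below, leaving out the webcam capture, MTCNN detection, and FaceNet encoding. The function name identify, the choice to return the closest entry, the toy encodings, and the threshold of 0.7 are assumptions made for illustration and may differ from what app.py does.

```python
import numpy as np

def identify(encoding, database, threshold=0.7):
    """Return (name, distance) of the closest database entry, or "Not in database"."""
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = float(np.linalg.norm(encoding - stored))  # L2 distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist < threshold:
        return best_name, best_dist
    return "Not in database", best_dist

# Toy database: name -> encoding
database = {
    "alice": np.array([1.0, 0.0]),
    "bob": np.array([0.0, 1.0]),
}

match = identify(np.array([0.9, 0.1]), database)      # close to alice
unknown = identify(np.array([-1.0, -1.0]), database)  # far from everyone

print(match[0])
print(unknown[0])
```

Returning the distance alongside the name makes it easier to pick a sensible threshold by inspecting real matches and mismatches.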

References: Coursera: https://www.coursera.org/learn/convolutional-neural-networks?specialization=deep-learning

More on MTCNN: https://towardsdatascience.com/how-does-a-face-detection-program-work-using-neural-networks-17896df8e6ff

NikhilNandoskar/Face-Recognition-in-Real-Time