If you want to try the project right away, go to the 04_open_face folder.
Initially I decided to train a standard CNN model on faces, as we would do for object detection. It was a disaster. Two distinct kinds of objects (a cat and a dog, for example) have features that are easy to tell apart, whereas the faces of any two people look far more alike. That makes it difficult for the model to capture the relationship between the features and the label, which means we need a better set of features to represent faces than the ones a regular CNN generates. That is why we use a pretrained network designed specifically to generate face embeddings (features) that best distinguish one face from another.
I tried to code it myself, but because of performance issues I ended up relying on Adrian's blog.
The process of facial recognition can be broken down into 3 steps:
- Face detection: detecting and extracting faces from the images
- Generating embeddings (extracting features from faces): converting the detected faces into embeddings (embeddings = features)
- Training a model on those embeddings: using an ML model to learn to classify the embeddings
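To make the flow between the three steps concrete, here is a minimal sketch of how they could be chained. The function name `recognize_faces` and the three callables it takes are hypothetical placeholders for illustration, not code from this repo; any detector, embedding network, and classifier with these shapes could be plugged in.

```python
from typing import Any, Callable, Iterable, List, Sequence, Tuple

# Hypothetical types, used only for this illustration.
Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Embedding = Sequence[float]       # fixed-length feature vector


def recognize_faces(
    image: Any,
    detect: Callable[[Any], Iterable[Box]],
    embed: Callable[[Any, Box], Embedding],
    classify: Callable[[Embedding], str],
) -> List[Tuple[Box, str]]:
    """Run the three stages on one image and return (box, label) pairs."""
    results = []
    for box in detect(image):            # step 1: face detection
        embedding = embed(image, box)    # step 2: face -> embedding
        label = classify(embedding)      # step 3: embedding -> identity
        results.append((box, label))
    return results
```

Keeping each stage behind its own callable makes it easy to swap one method for another at any step without touching the rest.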
Different methods were tried for each step of the process; they are described below.
Unlike simpler computer vision tasks, in face recognition the subject is not a whole entity but a part of one (i.e. we are not detecting humans or objects, we are recognizing a part of the subject). So before we send the image further down the pipeline, we extract our subject, the face, which makes recognition easier for the later stages.
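As a rough illustration of the extraction step, the sketch below crops a face region out of an image given a bounding box. It assumes the image is a plain list of pixel rows and the box is `(x1, y1, x2, y2)` with exclusive end coordinates; `crop_face` is a hypothetical helper, not a function from this repo. With a NumPy image the same idea is a single slice, `image[y1:y2, x1:x2]`.

```python
def _clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def crop_face(image, box):
    """Return the sub-image inside box, clamped to the image borders.

    image: list of rows, each row a list of pixel values (assumed layout).
    box:   (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box
    # Detectors can return boxes that spill outside the frame, so clamp first.
    x1, x2 = _clamp(x1, 0, width), _clamp(x2, 0, width)
    y1, y2 = _clamp(y1, 0, height), _clamp(y2, 0, height)
    return [row[x1:x2] for row in image[y1:y2]]
```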
Implementations can be found in the detection directory
Theory-: Video
Theory-: Post
Source-: Post
Using ResNet is the best approach in terms of both time taken and accuracy. (Accuracy here means robustness to pose: if you tilt your head slightly in any direction, the Haar cascade does not detect the face as reliably as ResNet does.)
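A small sketch of how the raw output of such a detector might be turned into pixel boxes. This assumes the ResNet detector is an SSD-style network like the ResNet-10 face detector used with OpenCV's DNN module, where each output row is `[image_id, class_id, confidence, x1, y1, x2, y2]` with coordinates normalized to the 0..1 range. The helper name and the 0.5 threshold are my own choices for illustration.

```python
def boxes_from_detections(rows, image_width, image_height, min_confidence=0.5):
    """Convert SSD-style detection rows into pixel boxes.

    rows: iterable of [image_id, class_id, confidence, x1, y1, x2, y2],
          with coordinates normalized to 0..1 (assumed output layout).
    Returns a list of (x1, y1, x2, y2, confidence) tuples in pixels.
    """
    boxes = []
    for row in rows:
        confidence = row[2]
        if confidence < min_confidence:
            continue  # drop weak detections
        x1 = int(row[3] * image_width)
        y1 = int(row[4] * image_height)
        x2 = int(row[5] * image_width)
        y2 = int(row[6] * image_height)
        boxes.append((x1, y1, x2, y2, confidence))
    return boxes
```

The confidence threshold is the main knob here: raising it reduces false positives, lowering it catches more tilted or partially hidden faces.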
LBPH is not the best method for producing embeddings for accurate facial recognition, but since it is one of the approaches I tried, I am mentioning it here.
Theory-: Post
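For intuition, here is a minimal sketch of the basic 3x3 local binary pattern that LBPH is built on. Each pixel is compared with its 8 neighbours, the comparison results form an 8-bit code, and the codes are counted in a histogram. Full LBPH implementations typically use a circular neighbourhood and concatenate histograms from a grid of cells, which this sketch leaves out. The function names and the clockwise, top-left-first bit order are choices made for this example.

```python
def lbp_code(patch):
    """Local binary pattern of the centre pixel of a 3x3 grayscale patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2], patch[1][2],
        patch[2][2], patch[2][1], patch[2][0], patch[1][0],
    ]
    code = 0
    for value in neighbours:
        # Append a 1 bit when the neighbour is at least as bright as the centre.
        code = (code << 1) | (1 if value >= center else 0)
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of an image."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The resulting histogram is the feature vector; two faces are compared by comparing their histograms.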
We produce embeddings (features) from the images using a pretrained network that is trained specifically for this purpose. We consider these embeddings a better representation of the face, and we then train a model on them to predict whose face it is.
Source-: Post
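To show why embeddings are a convenient representation, the sketch below compares two of them by Euclidean distance: embeddings of the same person should end up closer together than embeddings of different people. It assumes each embedding is a fixed-length list of floats (OpenFace models output 128 values per face); the helper names are hypothetical and the toy vectors in the usage note are made up.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else list(vector)


def embedding_distance(a, b):
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For example, with made-up 2-D vectors, `embedding_distance([0.0, 0.0], [3.0, 4.0])` gives the familiar 3-4-5 triangle distance. In practice a classifier trained on the embeddings usually replaces a hand-picked distance threshold.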
I used an SVM as the model, trained on the embeddings generated with the OpenFace implementation.
After trying different methods (and learning many implementations along the way), I ended up using the implementation from this post. You can go to the final face recognition folder in the repo to try it yourself by training it on your own face and testing it out.
I have wrapped this model as an API so that it can be served on different platforms; you can check that out here.