GitHub - codeaway23/face-similarity: Find out whether two images are of the same person or not

Problem

Find out if two images are of the same person or not. The model is trained on this dataset.

Usage

Please make sure the latest versions of keras, keras-preprocessing, and pandas are installed. Certain functions might not work otherwise.

Download the dataset and place it in the 'data' directory inside the main directory. A few data samples are included in that directory to show the expected folder structure.
To train the model, run the following command from the main directory:

python3 train.py

After training, to generate results using a trained model, run the following command from the main directory:

python3 final.py

I couldn't upload the data or my trained model file due to size limits.

Approach

A multi-input, single-output model for binary classification is built.
The model uses the Inception-V3 network as its convolutional backbone.
ImageNet weights are transferred to the backbone to initialise training.
The dataset is turned into lists of image combinations: one list for pairs of images of the same person, the other for pairs of images of different people.
Not all of the data could be used for training due to hardware limitations. Instead, a specific number of combinations is extracted from the dataset to build the train and test dataframes.
The dataframes are fed into the multi_input_generator function in utils, which augments data for multi-input models.
This data is then used to train the model.
Finally, in the final.py script, a similarity metric is computed and a prediction is made on a pair of images drawn from the dataset itself.
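
The pair-building step described above can be illustrated with a minimal sketch. It is not the code from train.py or utils.py: the function names build_pairs and sample_balanced, the dictionary layout (person name mapped to a list of image paths), and the label convention (1 for same person, 0 for different people) are all assumptions made for illustration. It also materialises every combination in memory, which is only practical for a small subset of a real dataset.

```python
import itertools
import random


def build_pairs(images_by_person):
    """Return (same_pairs, different_pairs) as lists of (path_a, path_b).

    images_by_person is a hypothetical mapping: person name -> list of image paths.
    """
    # Every unordered pair of images belonging to one person.
    same_pairs = []
    for paths in images_by_person.values():
        same_pairs.extend(itertools.combinations(sorted(paths), 2))

    # Every pairing of an image of one person with an image of another person.
    different_pairs = []
    for person_a, person_b in itertools.combinations(sorted(images_by_person), 2):
        different_pairs.extend(
            itertools.product(images_by_person[person_a], images_by_person[person_b])
        )
    return same_pairs, different_pairs


def sample_balanced(same_pairs, different_pairs, n_pairs, seed=0):
    """Draw n_pairs rows (path_a, path_b, label), half same and half different.

    Label 1 means "same person", 0 means "different people" (assumed convention).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    half = n_pairs // 2
    if half > len(same_pairs) or half > len(different_pairs):
        raise ValueError("not enough pairs to draw a balanced sample")
    rows = [(a, b, 1) for a, b in rng.sample(same_pairs, half)]
    rows += [(a, b, 0) for a, b in rng.sample(different_pairs, half)]
    rng.shuffle(rows)
    return rows


if __name__ == "__main__":
    # Toy data; the paths are placeholders, not files from the dataset.
    toy = {
        "person_a": ["person_a/1.jpg", "person_a/2.jpg", "person_a/3.jpg"],
        "person_b": ["person_b/1.jpg", "person_b/2.jpg"],
    }
    same, different = build_pairs(toy)
    for row in sample_balanced(same, different, n_pairs=4, seed=1):
        print(row)
```

Rows of this shape could then be loaded into a pandas dataframe with two image-path columns and one label column. Drawing an equal number of rows from each list is one simple way to keep the two classes balanced.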

Results

For the following parameters:

nb_epochs = 1
train_image_pairs = 4000
test_image_pairs = 500

Best validation accuracy achieved: 72%

Best model achieved a validation accuracy of 76.4% after 5 epochs of training.

Limitations and Remedies

More approaches to building the model could be tested, such as different backbones. (I experimented with ResNet50 and Inception-V3; Inception-V3 came out victorious.)
Try a deeper network with more trainable parameters.
The model doesn't generalise well yet. Performance varies a lot across different PRNG seeds.
The model should be trained for more epochs and on a bigger subset of the data.
Better or more efficient ways of screening combinations should be explored to make the dataset more balanced.
Image augmentation hampers the training.
The similarity metric is based on the sigmoid activation value of the last layer. Instead, similarity metrics used in image processing, such as SSIM or earth mover's distance, could be used.
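
To make the sigmoid-based similarity metric concrete, here is a minimal sketch of how a raw last-layer value can be turned into a score and a same/different decision. It is not taken from final.py: the function names, the 0.5 threshold, and the convention that scores near 1 mean "same person" are assumptions.

```python
import math


def sigmoid(logit):
    """Numerically stable logistic function, mapping any real value into (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def predict_same_person(score, threshold=0.5):
    """Return True when the similarity score reaches the threshold.

    Assumes scores close to 1 mean "same person"; the 0.5 cut-off is a
    hypothetical default, not a value taken from the repository.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return score >= threshold


if __name__ == "__main__":
    # The logits below are made-up inputs, not model outputs.
    for logit in (-2.0, 0.0, 2.0):
        score = sigmoid(logit)
        print(logit, round(score, 3), predict_same_person(score))
```

Because the score comes from the classifier itself, it inherits the model's sensitivity to training conditions, which is one reason an independent image-similarity measure may be worth comparing against.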
