Determine whether two images show the same person. The model is trained on this dataset.
Please make sure the latest versions of keras, keras-preprocessing, and pandas are installed. Certain functions might not work otherwise.
- Download the dataset and place it in the 'data' directory inside the main directory. A few data samples are included in that directory to show the expected folder structure.
- To train the model, run the following command from the main directory:
python3 train.py
- After training, to generate results using a trained model, run the following command from the main directory:
python3 final.py
I couldn't upload the data or my trained model file due to size limits.
- A multi-input single output model for binary classification is built
- The model uses the Inception-V3 network as its convolutional backbone
- ImageNet weights are transferred to the backbone to initialise training
- The dataset is turned into two lists of image combinations: one for pairs of images of the same person, the other for pairs of images of different people.
- Not all of the data could be used for training due to hardware limitations. Instead, a fixed number of combinations is extracted from the dataset to build the train and test dataframes.
- The dataframes are fed into the multi_input_generator function in utils which augments data for multi-input models.
- This data is then used to train the model.
- Finally, in the final.py script, a similarity metric is found and a prediction is made on a pair of images drawn from the dataset itself.
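The pair-building step described above can be sketched roughly as follows. This is a minimal illustration and not the code in train.py: the function name `build_pairs`, its arguments, and the `(path_a, path_b, label)` row layout are assumptions made for the example. The returned rows could then be wrapped in a pandas dataframe before being passed to the generator.

```python
import itertools
import random
from collections import defaultdict


def build_pairs(labelled_images, n_pairs, seed=0):
    """Build a balanced list of (path_a, path_b, label) rows.

    labelled_images: iterable of (image_path, person_id) tuples (hypothetical input format).
    label is 1 for two images of the same person and 0 for different people.
    """
    by_person = defaultdict(list)
    for path, person in labelled_images:
        by_person[person].append(path)

    # Same-person pairs: every 2-combination within one person's images.
    same = [
        pair
        for paths in by_person.values()
        for pair in itertools.combinations(sorted(paths), 2)
    ]

    # Different-person pairs: one image from each of two distinct people.
    people = sorted(by_person)
    different = [
        (a, b)
        for p, q in itertools.combinations(people, 2)
        for a in by_person[p]
        for b in by_person[q]
    ]

    # Draw the same number of pairs from each list so the classes stay balanced.
    rng = random.Random(seed)
    k = min(n_pairs // 2, len(same), len(different))
    rows = [(a, b, 1) for a, b in rng.sample(same, k)]
    rows += [(a, b, 0) for a, b in rng.sample(different, k)]
    rng.shuffle(rows)
    return rows
```

Enumerating every different-person pair grows quadratically with the dataset, so for a large dataset it may be preferable to sample such pairs directly instead of materialising the full list.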
For the following parameters:
- nb_epochs = 1
- train_image_pairs = 4000
- test_image_pairs = 500
Best validation accuracy achieved: 72%
Best model achieved a validation accuracy of 76.4% after 5 epochs of training.
- More approaches to building the model can be tested, such as different backbones. (I experimented with ResNet50 and Inception-V3; Inception-V3 performed better.)
- Try a deeper network with more trainable parameters.
- The model doesn't generalise well yet. Performance varies a lot across different PRNG seeds.
- Should train for more epochs and on a bigger subset of the data.
- Should try finding ways to screen out combinations better/more efficiently to make the dataset more balanced.
- Image augmentation hampers the training.
- The similarity metric is based on the sigmoid activation value of the last layer. Similarity metrics from image processing, such as SSIM or earth mover's distance, could be used instead.
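As a rough illustration of the last idea, the sketch below computes the earth mover's distance between two one-dimensional histograms, for example grayscale intensity histograms of the two images. It is not part of this repository: the function name `emd_1d` is hypothetical, and comparing intensity histograms is only one assumed way such a metric could be applied here.

```python
from itertools import accumulate


def emd_1d(hist_a, hist_b):
    """Earth mover's distance between two 1-D histograms over the same bins.

    Assumes a unit distance between adjacent bins. Both histograms are
    normalised to a total mass of 1 before comparison.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")

    pa = [v / total_a for v in hist_a]
    pb = [v / total_b for v in hist_b]

    # In one dimension, the distance equals the summed absolute difference
    # between the two cumulative distributions.
    return sum(abs(ca - cb) for ca, cb in zip(accumulate(pa), accumulate(pb)))
```

A lower value means more similar histograms, so unlike the sigmoid score it would need to be inverted or thresholded before being read as a "same person" prediction.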