
Labeling of images #98

Closed
sanjanajainsj opened this issue Jan 17, 2017 · 6 comments

Comments

@sanjanajainsj

Hello,
I want to train on the pre-trained Lightened CNN model. I would like to know how the labeling should be done. Is it binary labeling, with 0 for similar images and 1 for non-similar ones? Or do we have to provide a similarity score between each image pair of the same subject? Thank you in advance.

@AlfredXiangWu
Owner

We train the model with softmax loss only, treating it as a classification task. At test time, we extract the 256-d feature vector from the "eltwise_fc1" layer, and then compute the cosine similarity of two feature vectors as the similarity score.
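For reference, a minimal sketch of the scoring step described above, assuming the two 256-d vectors have already been extracted from the "eltwise_fc1" layer (e.g. via pycaffe's `net.blobs["eltwise_fc1"].data`):

```python
import numpy as np

def cosine_similarity(f1, f2):
    """Cosine similarity between two feature vectors (e.g. 256-d
    "eltwise_fc1" outputs); ranges from -1 to 1, higher = more similar."""
    f1 = np.asarray(f1, dtype=np.float64)
    f2 = np.asarray(f2, dtype=np.float64)
    return float(np.dot(f1, f2) / (np.linalg.norm(f1) * np.linalg.norm(f2)))
```

A pair is then accepted or rejected by thresholding this score; the threshold itself is chosen on a validation set.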

@sanjanajainsj
Author

Thank you for the reply. I have another question. If we are training on one image at a time, how do we provide the softmax loss in the label file? Is there a way to compute the loss to specify for each image that we train?

@AlfredXiangWu
Owner

Do you mean that the batch size is set to 1? The batch size doesn't influence the computation of softmax loss.

@sanjanajainsj
Author

Sorry, I am a beginner here, so I might not have understood it properly. From what I understand, we provide an LMDB of images to the model, and the LMDB is created from a txt file with labels, for example label 1 for one subject, label 2 for another subject, and so on. So do we provide the label as 1, 2, etc. based on the subject number? Or should it be something related to the softmax loss value?

@AlfredXiangWu
Owner

Training the CNN as a classification task is enough. It is similar to MNIST or ImageNet training.
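As a concrete illustration of this classification-style labeling, here is a hedged sketch of building a Caffe-style image list file (the format consumed by `convert_imageset` when creating the LMDB): each line is an image path followed by an integer subject ID, and that integer is the class label the softmax loss trains against. Subject names and paths below are made up for illustration:

```python
# Map each subject to its image paths (illustrative names only).
subjects = {
    "subject_a": ["subject_a/001.jpg", "subject_a/002.jpg"],
    "subject_b": ["subject_b/001.jpg"],
}

# One line per image: "relative/path <integer class id>".
# All images of the same subject share one class id.
lines = [
    f"{path} {class_id}"
    for class_id, name in enumerate(sorted(subjects))
    for path in subjects[name]
]

with open("train_list.txt", "w") as fh:
    fh.write("\n".join(lines) + "\n")
```

No loss values appear in the file; Caffe computes the softmax loss from these integer labels during training.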

@sanjanajainsj
Author

Hello Sir,
Thank you so much for the reply. I tried fine-tuning the network on another dataset. However, when testing, the accuracy of my fine-tuned model dropped compared to the original pre-trained model you provided. I have attached a sample label file below. I used labels like 70000001, ..., 7000006, ... and wanted to know if this is correct. Since CASIA-Webface used labels up to 65735300, I assumed that numbers beyond that should be used as labels, to ensure the new subjects are treated as different from the original ones. Is this the right way to do it?

Also, I converted the images to grayscale, since the model supports that, and I did not do any preprocessing after obtaining the face images (which are of lower quality than CASIA). Could that be a reason for the low accuracy? During testing, I got an accuracy of 65.84% on RGB images and 62.86% on grayscale images with the fine-tuned model, versus 69.52% on RGB and 65.62% on grayscale with the originally provided model.

label.txt
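On the grayscale point above: a minimal sketch of converting RGB face crops to the single-channel input a grayscale model expects, using the standard BT.601 luminance weights. This is a common choice, not necessarily the exact conversion used for the original training data, so treat it as an assumption:

```python
import numpy as np

def to_grayscale(rgb):
    """(H, W, 3) uint8 RGB -> (H, W) uint8 grayscale via ITU-R BT.601
    luminance weights (0.299 R + 0.587 G + 0.114 B)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return gray.round().astype(np.uint8)
```

If the fine-tuning images were converted with different weights (or a plain channel average) than the pre-training data, that mismatch could contribute to the accuracy drop.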
