
Embeddings classification ability #134

Closed
lodemo opened this issue Jan 26, 2017 · 17 comments
lodemo commented Jan 26, 2017

Hello,
I wanted to compare different face recognition techniques in a classification context.
For this I reused the classification experiment from Openface (https://github.com/cmusatyalab/openface/blob/master/evaluation/lfw-classification.py) and integrated Facenet into it.
The results are, however, not what I expected: even with only 10 different people the accuracy is only around 0.5. Are the resulting image embeddings different from Openface's in their ability to be used directly for classification with a classifier like an SVM?

At the moment I am feeding a single image to the network like this and using the resulting embedding as the representation for classification/evaluation:

imgs = np.reshape(img, (1, 160, 160, 3)) # img is (160, 160, 3)
feed_dict = { images_placeholder:imgs }
emb = sess.run(embeddings, feed_dict=feed_dict)
rep = emb[0]

Or is the error perhaps in the evaluation of the results? I am currently using the same accuracy calculation as Openface, the accuracy_score function from sklearn.metrics.

Regards

davidsandberg (Owner) commented:

The loss used for training is different in Openface compared to this implementation, but I don't think that should cause any major differences in how the resulting embedding can be used.
Could it be something related to scaling/normalization of the input image?

lodemo commented Jan 26, 2017

Thank you for the suggestion.
I use the MTCNN-aligned LFW dataset with size 160, as described in "Validate on LFW".
In the Openface script, the data is read with cv2 as follows:

img = cv2.imread(imgPath)
img = cv2.resize(img, (size, size))
...
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

which seems to load the image with cv2.imread as BGR and convert it to RGB.
Is RGB correct for Facenet? The size should not change, as I specified 160.
I will look into it further anyway.

lodemo commented Jan 27, 2017

I looked into it further and replaced the data reading with the routine used in Facenet.
I had also missed prewhitening completely, which was probably the cause, or scipy's misc.imread() returns different data than cv2.imread()... A rough sketch of the corrected preprocessing is below.
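
This is only a sketch under my assumptions: facenet.prewhiten from this repository (roughly per-image zero-mean / unit-variance normalization), misc.imread for loading, and image_path as an illustrative variable name; sess, embeddings and images_placeholder are the same as in my first snippet.

import numpy as np
from scipy import misc
import facenet

img = misc.imread(image_path)     # MTCNN-aligned 160x160 RGB image (image_path is illustrative)
img = facenet.prewhiten(img)      # the normalization step I had missed
imgs = np.reshape(img, (1, 160, 160, 3))
emb = sess.run(embeddings, feed_dict={images_placeholder: imgs})
rep = emb[0]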

After this the results improved significantly, even more than I expected, compared to the Openface results.

[accuracy plot attached]

What would you say is the cause of such a difference compared to Openface, given that the embedding dimensions are the same? The different loss function?

davidsandberg (Owner) commented:

Hi @lodemo!
That's a very nice result! It would be nice to see how many distractors are needed before the accuracy for "Facenet LinearSVC" starts to degrade.
As for the differences, I guess there are a few. I have never managed to get training with triplet loss to work very well. Training the model as a classifier seems to be much easier, but it becomes impossible if the number of classes is too large. Also, I never managed to get the NN4 model to converge, so the solution was to use the Inception-ResNet models instead. And finally, the alignment of the training images is different, and MTCNN is very robust to partial occlusion, silhouettes etc., which makes the training data more diverse.
Which model did you use in this experiment? I have just uploaded a model trained on MS-Celeb-1M that performs 4 times better than the 20161116-234200 model in terms of LFW errors.

lodemo commented Jan 28, 2017

Hi,
I used the 20161116-234200 model from the link in "Validate on LFW", but I will try it again with the new model!

What exactly do you mean by distractors? A higher number of people? I plan to do that in the next run.

lodemo commented Jan 30, 2017

Hi,
I tried running it with the 20170117-215115 model but encountered an error related to the
model-20170117-215115.ckpt-285000.data-00000-of-00001 file.

Error:
tensorflow/core/framework/op_kernel.cc:975] Data loss: Unable to open table file 20170117-215115/model-20170117-215115.ckpt-285000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

Is the file corrupt, or does it need to be loaded differently than the 20161116-234200 model?

davidsandberg (Owner) commented:

Hi,
The new restore code appends the last part to the file name, so only the first part should be given, i.e.
20170117-215115/model-20170117-215115.ckpt-285000 in this case.
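
A minimal restore sketch under that convention (the .meta file name below is illustrative; only the checkpoint prefix, without the .data-00000-of-00001 suffix, is passed to restore()):

import tensorflow as tf

# Load the graph definition first; this meta file name is an assumption on my part.
saver = tf.train.import_meta_graph('20170117-215115/model-20170117-215115.meta')
with tf.Session() as sess:
    # Pass only the checkpoint prefix; TensorFlow locates the .data/.index files itself.
    saver.restore(sess, '20170117-215115/model-20170117-215115.ckpt-285000')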

lodemo commented Jan 30, 2017

Thank you,
it seems to work. I will run it and post results later.

Regards

lodemo commented Jan 31, 2017

I ran the scripts with the new model 20170117-215115, and it seems to have improved the results further.

The improvement is most prominent at higher numbers of people, which I increased up to 1600 (the number of people in LFW with 2 or more images).

Model 20161116-234200:
400 people: 0.992176 accuracy
1600 people: 0.864177 accuracy

Model 20170117-215115:
400 people: 0.997066 accuracy
1600 people: 0.926279 accuracy

[accuracy plots attached]

What is also interesting is the training time needed for the above results compared with Openface, both training a LinearSVC with the one-vs-rest scheme:

[training time plot attached]

As the dimensions are the same, the SVM seems to take longer to fit the Facenet features.

hardfish82 commented:

@lodemo It's amazing work! I followed https://github.com/cmusatyalab/openface/blob/master/evaluation/lfw-classification.py and added the Facenet model (20170117-215115), but my result is not as good as yours. What did you improve? The C parameter of LinearSVC?

[accuracy plot attached]

lodemo commented Feb 9, 2017

Actually no, I left C=1, but I used the one-vs-rest scheme explicitly:
LinearSVC(C=1, multi_class='ovr')
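
For completeness, a minimal sketch of the classification step as I run it (X_train/X_test stand for the embedding arrays produced by the network and y_train/y_test for the person labels; those names are illustrative):

from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# One-vs-rest linear SVM on the face embeddings, with C left at 1.
clf = LinearSVC(C=1, multi_class='ovr')
clf.fit(X_train, y_train)                          # X_*: (n_samples, embedding_dim)
acc = accuracy_score(y_test, clf.predict(X_test))  # same accuracy metric as the Openface script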

Did you apply all the preprocessing for the Facenet images?
The RGB conversion, cropping and prewhitening of the MTCNN-aligned images are important.

I am currently also running an evaluation of Openface and Facenet on the YouTube Faces database.
I will post results when finished.

hardfish82 commented:

I used the SVC API like this:
cls = SVC(kernel='linear', C=1)
Just now I tested your scheme:
cls = LinearSVC(C=1, multi_class='ovr')
The result improves slightly:

[accuracy plot attached]

I did the data preprocessing according to the facenet code: in the function getData() I added a new mode 'facenet' and used the facenet API to load the data:

def getData(lfwPpl, nPpl, nImgs, mode, resize=96):
    X, y = [], []

    personNum = 0
    for (person, nTotalImgs) in lfwPpl[:nPpl]:
        imgs = sorted(os.listdir(person))
        for imgPath in imgs[:nImgs]:
            imgPath = os.path.join(person, imgPath)
            img = cv2.imread(imgPath)
            img = cv2.resize(img, (resize, resize))
            if mode == 'grayscale':
                img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            elif mode == 'rgb':
                img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            elif mode == 'facenet':
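                # facenet.load_data reads the image itself and prewhitens by default;
                # the two False flags disable random crop/flip (my understanding of its signature).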
                img = facenet.load_data([imgPath], False, False, resize)[0]
            else:
                assert 0

            X.append(img)
            y.append(personNum)

        personNum += 1

    X = np.array(X)
    y = np.array(y)
    return (X, y)

lodemo commented Feb 9, 2017

I see, it's probably because you resize the images for Facenet to 96; with the pre-trained model it should be 160. Or do you call getData with resize=160 for Facenet?

Did you align the images to 160 with MTCNN?

hardfish82 commented:

Yes, the LFW images are aligned to size 160x160 and I load the Facenet data with resize=160; otherwise a shape mismatch error would be raised.

    net = facenet_model.FaceNet("/home/x0269/models/20170131-234652")
    facenetGPUsvmDf = cacheToFile(cache)(openfaceExp)(lfwPpl, net, cls, 'facenet', 160)

Did you filter the LFW images with https://github.com/cmusatyalab/openface/blob/master/util/prune-dataset.py? I did, with --numImagesThreshold=5, so persons with fewer than 5 images are filtered out. The total number of remaining persons is 423.

lodemo commented Feb 9, 2017

Ah no, I did not prune the images.

In getLfwPplSorted the people are returned sorted by number of images, so selecting the first 100 people should give the same result as with the pruned dataset, I think?

lodemo commented Feb 9, 2017

You can try running my version of the script and see if anything changes:
https://gist.github.com/lodemo/f49ac4a7402d2de3163cf5adfad79d43

I split most methods into separate ones for Openface and Facenet; a little redundant, but it works for now.

hardfish82 commented:

Thanks @lodemo. The difference is in the alignment method. Following https://cmusatyalab.github.io/openface/demo-3-classifier/, I used dlib alignment, and my earlier result was based on the dlib-aligned faces.
I just tried the MTCNN alignment method from https://github.com/davidsandberg/facenet/wiki/Validate-on-LFW and got the same result as yours.
It's amazing! Thanks once more.

[accuracy plot attached]
