
Validate with MegaFace Challenge dataset #275

Closed
se7oluti0n opened this issue May 15, 2017 · 19 comments

@se7oluti0n

se7oluti0n commented May 15, 2017

Hello, I have downloaded the MegaFace challenge data for evaluating face recognition on both the identification and verification problems.
I think testing on LFW alone is not enough, for two reasons:

  1. LFW tests only the verification problem
  2. LFW is not a big dataset

The MegaFace challenge provides a large dataset for testing face recognition against 1M distractors, with FaceScrub as the probe set. Here are some results: http://megaface.cs.washington.edu/results/facescrubresults.html

I have also written code to extract features and convert them to the MegaFace format.
But I worry that the MegaFace input images are aligned differently than in the FaceNet code.

Could you help me check this code, @davidsandberg?
Here is the link:
https://gist.github.com/se7oluti0n/8ff161505721b6c4ab25ccfe7996fd1a
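
For reference, the core of my extraction step is just the standard frozen-graph inference from this repo (a minimal sketch; the tensor names are the ones compare.py uses, and writing the features out in the devkit's binary matrix format is left to the devkit's own helpers):

```python
import tensorflow as tf

def load_model(pb_path):
    # load a frozen facenet model, e.g. the 20170512 .pb file
    with tf.gfile.GFile(pb_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

def extract_embeddings(sess, images):
    # images: float32 array of shape (N, 160, 160, 3), aligned and prewhitened
    graph = tf.get_default_graph()
    feed = {graph.get_tensor_by_name('input:0'): images,
            graph.get_tensor_by_name('phase_train:0'): False}
    return sess.run(graph.get_tensor_by_name('embeddings:0'), feed_dict=feed)
```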

@se7oluti0n
Author

Here are some sample images from MegaFace. It includes FaceScrub and FGNet for the known subjects, and MegaFace for the unknown subjects (distractors):
https://drive.google.com/file/d/0B51U-GiVgFCrcWI2bllteWVjMzA/view?usp=sharing

@se7oluti0n
Author

se7oluti0n commented May 18, 2017

I have tested the latest model, 20170512, on MegaFace challenge 1 (FaceScrub). Here are the results compared with other methods (for both the identification and the verification problem with 1M distractors). The blue line (david) shows this repo's result. Please check this, @davidsandberg.

[Plot: Verification on 1M distractors]

[Plot: Identification on 1M distractors]

@davidsandberg
Owner

Hi @se7oluti0n,
Very nice plots!! I'm a bit surprised by the low identification rate at a small number of distractors (e.g. 10). I haven't looked at the details of how the test is set up, but I don't have any good explanation for this.
Can you make a PR to add the test script to the repo so I can have a look?

@ugtony

ugtony commented May 18, 2017

Hi, @se7oluti0n,
I also ran the MegaFace challenge with the official script (run_experiment.py), but I don't know how to plot the results as you did. Could you tell me how you did it?

I used a model trained on casia-webface and got these rank-1 identification rates:
Distractor = 10, Rank1 = 0.9849
Distractor = 100, Rank1 = 0.9597
Distractor = 1,000, Rank1 = 0.9132
Distractor = 10,000, Rank1 = 0.8396
Distractor = 100,000, Rank1 = 0.7192
Distractor = 1,000,000, Rank1 = 0.5612

In your second graph, the blue line starts at a low identification rate (~0.92). This is probably because you didn't check the MTCNN-detected regions against the ground truth for the FaceScrub dataset.

@se7oluti0n
Author

se7oluti0n commented May 18, 2017

@davidsandberg @ugtony I'm a bit confused because my curve doesn't have the same shape as the others. Maybe there are some mistakes in my steps. With 10 distractors, the rank-1 accuracy should be as high as ~0.98.

The detailed steps:

@ugtony When I ran the test with the aligned FaceScrub images from the MegaFace site, the results were not good, maybe because they are aligned differently than MTCNN does it. So I did the alignment with MTCNN myself, but I did not check against the ground truth.
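
For the plotting part, this is roughly what my script does (a minimal sketch; the JSON layout with "cmc" as [ranks, rates] and the file name pattern are assumptions based on my devkit output, so check yours):

```python
import json
import matplotlib.pyplot as plt

sizes = [10, 100, 1000, 10000, 100000, 1000000]
rank1 = []
for n in sizes:
    # hypothetical file name; use whatever run_experiment.py actually wrote
    with open('cmc_megaface_facescrub_model_%d_1.json' % n) as f:
        data = json.load(f)
    rank1.append(data['cmc'][1][0])  # identification rate at rank 1

plt.semilogx(sizes, rank1, label='david')
plt.xlabel('Number of distractors')
plt.ylabel('Rank-1 identification rate')
plt.legend()
plt.show()
```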

@davidsandberg
Owner

Thanks!
My guess is that the distractor set should be aligned using MTCNN as well. For the model to be able to discriminate well between faces, the embeddings need to be precise for the distractors too. Without alignment the embeddings will be "noisy", which I guess could impact performance.

@siebertlooije

@se7oluti0n Thanks for this, very clear explanation!

@ugtony

ugtony commented May 18, 2017

@se7oluti0n,
Since there may be more than one face in a FaceScrub image, the probe set will contain a few wrong images when align_dataset_mtcnn.py is used without modification. That is why the identification rate is only ~0.92 with 10 distractors: a probe can't be matched correctly when the probe image itself is wrong.

Therefore, if multiple faces are detected by align_dataset_mtcnn.py, you should choose the one that overlaps most with the ground-truth bounding box. The bbox info is listed in the .json files, the same files you should use when face detection fails.
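
Something like this (a minimal sketch; how the ground-truth box is stored in the .json files is an assumption, so adapt the reading part):

```python
def iou(box_a, box_b):
    # boxes as (x1, y1, x2, y2)
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def pick_best_detection(detections, gt_box):
    # detections: list of (x1, y1, x2, y2) boxes from MTCNN,
    # gt_box: ground-truth box read from the probe's .json file
    return max(detections, key=lambda det: iou(det, gt_box))
```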

Thanks for explaining how the figures are plotted.


@davidsandberg
I think the measured performance looks better than it really is when the MegaFace distractor set is not aligned; that should be the reason why se7oluti0n's curve doesn't drop so much at 1,000,000 distractors.


In short, in my opinion, the left portion of se7oluti0n's identification curve is lower than the others' because some probe images are wrong. The right portion doesn't decline as much because the probe/gallery sets are well aligned but the distractor set isn't.

@se7oluti0n
Author

se7oluti0n commented May 18, 2017

@ugtony Thanks for the clear explanation. I wonder which loss (center loss or triplet loss) you used when you retrained on casia-webface? Was it the cleaned version, casia-maxpy-clean?

@davidsandberg The results look very competitive with other state-of-the-art methods, e.g. NTechLab's FindFace (http://fusion.kinja.com/this-face-recognition-company-is-causing-havoc-in-russi-1793856482).

  • For the center loss version, if we use a larger dataset, will the performance improve? The MegaFace challenge also provides quite a large dataset (4M images of 650k identities).
  • How about the triplet loss version? I have not trained it yet, but it seems to have trouble converging. Recently I found a paper that improves the triplet loss; could you have a look at https://arxiv.org/pdf/1703.07737.pdf? (A sketch of its mining strategy is below.)
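
The key idea in that paper is "batch hard" mining: for each anchor, take the hardest positive and the hardest negative within the batch. A minimal numpy sketch of the loss (my own illustration, not code from this repo):

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    # embeddings: (N, D), labels: (N,) identity ids; the batch must contain
    # several images per identity for the mining to make sense
    dists = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=2)
    same = labels[:, None] == labels[None, :]
    losses = []
    for i in range(len(labels)):
        pos = same[i].copy()
        pos[i] = False                          # exclude the anchor itself
        hardest_pos = dists[i][pos].max()       # farthest same-identity sample
        hardest_neg = dists[i][~same[i]].min()  # closest different-identity sample
        losses.append(max(hardest_pos - hardest_neg + margin, 0.0))
    return float(np.mean(losses))
```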

@ugtony

ugtony commented May 18, 2017

I used center loss to train my classifier.
I used the original casia-webface. I checked a few of the images removed in casia-maxpy-clean before, and some of them seem to be correct, so I guess it was cleaned by some automatic algorithm rather than manually. Besides, I didn't get better performance with it.
By the way, the performance of my model on LFW is 0.989.

@ugtony

ugtony commented May 19, 2017

Hi @davidsandberg,

I used the code shared by @se7oluti0n to plot my result (a model trained with facenet_train_classifier.py on casia-webface) on MegaFace challenge 1. My result is plotted in red; please take a look.

The performance is competitive when the false positive rate is above 0.001 and the number of distractors is below 10,000, but it becomes worse than the nearby curves when the false positive rate is low and the number of distractors is high.


Any idea why this happens? Maybe it's just because my training dataset is smaller than the others'.

@se7oluti0n
Author

@ugtony @davidsandberg
I found a mistake I made when extracting features for the distractor images:

  • In preprocessing, I used tf.image.resize_image_with_crop_or_pad(image, image_size, image_size) without noticing that the distractor images are not aligned, and that the challenge does provide alignment results in the .json files.
  • I made the same mistake on the aligned FaceScrub images downloaded from MegaFace, which have a size of 300x300.

This is the result using the aligned FaceScrub set downloaded from the MegaFace challenge, but with the distractors still unaligned (raw images). I will update the results after aligning the distractors; the corrected preprocessing is sketched below.
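
The corrected preprocessing, roughly (a minimal sketch: crop with the bbox from the challenge .json, resize to the model input size, then prewhiten as in facenet.py; the (x, y, width, height) bbox layout is an assumption):

```python
import numpy as np
from scipy import misc  # scipy.misc.imread/imresize, as used by facenet at the time

def prewhiten(x):
    # same per-image normalization as facenet.prewhiten
    mean, std = np.mean(x), np.std(x)
    std_adj = np.maximum(std, 1.0 / np.sqrt(x.size))
    return (x - mean) / std_adj

def preprocess(path, bbox, image_size=160):
    img = misc.imread(path)
    x, y, w, h = bbox  # hypothetical layout, taken from the .json file
    face = img[y:y + h, x:x + w, :]
    face = misc.imresize(face, (image_size, image_size))
    return prewhiten(face.astype(np.float32))
```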


@ugtony

ugtony commented May 19, 2017

It's good to know that the TP rate increases to ~0.75 at FP = 10^-6. That is pretty close to the "CenterLoss" performance (76.51%) reported in the paper "A Light CNN for Deep Face Representation with Noisy Labels".

I guess the identification rate would drop to somewhere around 0.65 after the distractor images are aligned.

@se7oluti0n
Author

se7oluti0n commented May 19, 2017

@ugtony You are right. I guess that without alignment the distractor images are not really facial images, so it is easier to separate the probes from the distractors.

My latest results are very similar to the results in the original center loss paper, "A Discriminative Feature Learning Approach for Deep Face Recognition" (~65% for identification and ~76% for verification).

@yao5461

yao5461 commented Nov 12, 2017

@ugtony @se7oluti0n Hi, do you know how to define a new scoring function when evaluating a model on MegaFace? I want to use cosine similarity instead of Euclidean distance to measure how close two faces are.
Thanks! :)
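
Note that the embeddings from this repo are already L2-normalized, so squared Euclidean distance and cosine similarity give the same ranking: ||a - b||^2 = 2 - 2·cos(a, b). If you still want an explicit cosine score, a minimal numpy sketch (how to plug it into the devkit's scoring hook is devkit-specific):

```python
import numpy as np

def cosine_scores(probe, gallery):
    # probe: (D,) vector, gallery: (N, D) matrix
    p = probe / np.linalg.norm(probe)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return g.dot(p)  # in [-1, 1]; higher means more similar
```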

@Not-IITian

Not-IITian commented Feb 7, 2019

Hi,

Is it possible to evaluate a model with top-k accuracy on the MegaFace challenge, for k other than 1, e.g. k = 3, 5, 10?

Thanks
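
The CMC curve the devkit reports is by definition the identification rate at each rank, so top-k can be read directly off it. Outside the devkit, a minimal numpy sketch (the score-matrix layout here is hypothetical, not the devkit's format):

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    # scores: (num_probes, num_gallery) similarity matrix,
    # labels: (num_probes,) index of the correct gallery entry for each probe
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = (topk == labels[:, None]).any(axis=1)
    return hits.mean()
```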

@SeanSyue

SeanSyue commented Jul 8, 2019

@se7oluti0n Could you please renew the link to your evaluation kit? The original link seems to be broken.

@ghost

ghost commented Sep 20, 2019

Hi,
Do we need to normalize the FaceScrub and MegaFace images before feeding them to the model to extract features? Thank you for reading.

@FlyingAnt2018

Hello! May I have your MegaFace dataset? I applied for it at http://megaface.cs.washington.edu/dataset/download.html, but there has been no reply. Thanks!
