How can you reach 99.86% while the upper limit accuracy is 99.83% on LFW? #9

Closed
KaleidoZhouYN opened this issue Nov 3, 2017 · 7 comments


@KaleidoZhouYN

commented Nov 3, 2017

There are a few questions I want to ask:
1.
As far as I know, there are 10 erroneous pairs among all 6,000 pairs, so the upper limit should be 1 − 10/6000 ≈ 99.83%, yet the accuracy reported in your paper is 99.86%.
2.
Can you share the training dataset and network that let the softmax loss reach such an extremely high accuracy (99.75%) on LFW? Softmax loss is not a very strongly constrained loss, and the result should be around 98+% when training on MS-Celeb-1M.
3.
What's the difference between your COCO loss and the combination of A-Softmax loss and L2-constrained softmax loss?

@sciencefans

Owner

commented Nov 3, 2017

Hi @KaleidoZhouYN ,
Since a few people have asked me about implementation details in recent days, especially at ICCV, I will update this repository and add some materials for face recognition, including the network structure, pre-processed testing data, and features of the LFW faces.
And for your questions,

  1. Thanks for pointing this out. In our evaluation code, instead of evaluating on the 10 splits, we directly verify all 6,000 pairs and report the ROC result. We don't think this procedure should change the result, but we are still checking, and thanks again.
    Update: There are 10 incorrect pairs in LFW, but only the 6 matching pairs are 'really incorrect'. The faces in the other 4 mismatching pairs are labeled with the wrong identities, but the pairs are still mismatching. So the upper-limit accuracy is (6000 − 6)/6000 = 99.90%.
  2. As described in Sec. 6.2, the training data is a subset of MS1M from which we removed all identities that appear in LFW, as well as some identities with little clean data. Also as mentioned in Sec. 6.2, the backbone is a single Inception-ResNet. Sorry to hear about your results trained on MS1M. Here are some tips: face-verification results are very sensitive not only to the network structure but also to the alignment accuracy, the cropped region, the augmentation strategy, and hyperparameters such as the dropout ratio and weight decay. During our exploration we tried plenty of structures (ResNet-x, Inception-vx, VGG+MaxOut, DenseNet, SENet, etc.) supervised by softmax loss, and the mean accuracy was 99.4+% for all of them.
  3. In the final FC layer, the logit |w||x|cos⟨w,x⟩ can be seen as a linear classifier. Both A-Softmax and COCO focus on constraining cos⟨w,x⟩, where ⟨w,x⟩ is the angle between a class centroid and a sample. A-Softmax implements this by multiplying the angle θ by a factor m (m > 1) while keeping |w| and |x|. In COCO, we directly optimize cos⟨w,x⟩.
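The two upper-limit figures in point 1 can be checked with trivial arithmetic (a quick sketch, not part of the evaluation code):

```python
# Upper-limit LFW verification accuracy under two readings of the label errors.
total_pairs = 6000
all_errors = 10    # all 10 mislabeled pairs counted as unrecoverable
real_errors = 6    # only the 6 matching pairs are truly unrecoverable

naive_limit = (total_pairs - all_errors) / total_pairs
corrected_limit = (total_pairs - real_errors) / total_pairs

print(f"{naive_limit:.4%}")      # 99.8333%
print(f"{corrected_limit:.4%}")  # 99.9000%
```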
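The contrast in point 3 can be sketched numerically. This is a minimal NumPy illustration, not the paper's implementation; the scale factor `alpha` and all tensor shapes here are assumptions made for the example:

```python
import numpy as np

def cosine_logits(x, W, alpha=30.0):
    """COCO-style logit: L2-normalize the feature x and each class
    centroid (row of W), so the logit reduces to alpha * cos<w, x>.
    alpha is an assumed scale hyperparameter for this sketch."""
    x_n = x / np.linalg.norm(x)                         # |x| -> 1
    W_n = W / np.linalg.norm(W, axis=1, keepdims=True)  # |w_k| -> 1 per class
    return alpha * (W_n @ x_n)                          # alpha * cos<w_k, x>

def plain_logits(x, W):
    """Ordinary softmax logit: |w||x|cos<w, x>, magnitudes included."""
    return W @ x

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # one feature vector
W = rng.normal(size=(3, 4))   # centroids for 3 hypothetical classes
print(cosine_logits(x, W))    # bounded by +/- alpha
print(plain_logits(x, W))     # unbounded, depends on |w| and |x|
```

With the magnitudes normalized away, the only quantity the classifier can move is the cosine itself, which is the constraint both losses share; A-Softmax instead reshapes the angle with the margin m.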
@KaleidoZhouYN

Author

commented Nov 3, 2017

@sciencefans OK, so you didn't use MTCNN for face alignment. I know that MTCNN produces a lot of noisy aligned images. Thanks for your reply.

@sciencefans

Owner

commented Nov 3, 2017

@KaleidoZhouYN
Yes, we found that alignment accuracy is very important. We use landmarks predicted by RSA; its code will be released soon. Interestingly, after we replaced RSA with another alignment algorithm provided by SenseTime, the accuracy on LFW dropped to 99.50%.

@KaleidoZhouYN

Author

commented Nov 3, 2017

@sciencefans And did you add BatchNorm after each conv layer?

@sciencefans

Owner

commented Nov 3, 2017

@KaleidoZhouYN

Author

commented Nov 3, 2017

@sciencefans All right... I'd call that softmax plus BN.
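The "softmax plus BN" recipe discussed above amounts to normalizing each convolution's output per channel before the nonlinearity. A minimal NumPy sketch of the batch-norm forward pass (training-mode statistics only; the shapes and gamma/beta values are hypothetical, not from the paper):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Batch-normalize conv activations of shape (N, C, H, W):
    per-channel mean/variance over the batch and spatial dims,
    followed by the learned affine transform (gamma, beta)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4, 5, 5))  # fake conv output
gamma, beta = np.ones(4), np.zeros(4)
y = batchnorm_forward(x, gamma, beta)
print(y.mean(), y.std())  # roughly 0 and 1 after normalization
```

At inference time a real implementation would use running averages of the statistics instead of per-batch moments; this sketch omits that.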

@Erdos001


commented Dec 28, 2017

So, is the COCO loss the combination of a special case of A-Softmax (m = 1) and the L2 constraint?
Also, I found some formulation mistakes in your paper:
