
acc is always about 0.5 using mobilenetface #189

Closed
tianxingyzxq opened this issue Apr 23, 2018 · 21 comments

@tianxingyzxq

testing verification..
(12000, 128)
infer time 30.116359
[lfw][6000]XNorm: 38.367005
[lfw][6000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(14000, 128)
infer time 35.065952
[cfp_fp][6000]XNorm: 38.365932
[cfp_fp][6000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(12000, 128)
infer time 30.366434
[agedb_30][6000]XNorm: 38.366582
[agedb_30][6000]Accuracy-Flip: 0.50000+-0.00000
[6000]Accuracy-Highest: 0.51533

The training command is:
CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network y1 --loss-type 4 --margin-s 128 --margin-m 0.5 --per-batch-size 128 --emb-size 128 --data-dir ../datasets/faces_ms1m_112x112 --wd 0.00004 --fc7-wd-mult 10.0 --prefix ../model-mobilefacenet-128

@nttstar
Collaborator

nttstar commented Apr 23, 2018

I have no GPU server to test it right now. The author told me he ran the experiments by fine-tuning (train with the softmax loss first, then fine-tune with the ArcFace loss).
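For anyone wanting to try that two-stage idea, here is a minimal MXNet sketch of reusing the softmax-stage weights when the margin-loss model is created. The checkpoint prefix and epoch are placeholders, and this is only the generic MXNet fine-tuning recipe, not the exact code path of train_softmax.py (which exposes the same idea through its own command-line options).

```python
# Hedged sketch of the general fine-tuning recipe in MXNet (placeholder prefix/epoch):
# load the checkpoint saved by the softmax stage and reuse its weights, letting any
# new layers initialize from scratch.
import mxnet as mx

sym, arg_params, aux_params = mx.model.load_checkpoint('../model-softmax', 20)

mod = mx.mod.Module(symbol=sym, context=[mx.gpu(0)],
                    data_names=['data'], label_names=['softmax_label'])
mod.bind(data_shapes=[('data', (128, 3, 112, 112))],
         label_shapes=[('softmax_label', (128,))])
# allow_missing=True lets a changed loss head (e.g. a new ArcFace fc7) start fresh
# while the backbone keeps the pretrained weights.
mod.init_params(initializer=mx.init.Xavier(), arg_params=arg_params,
                aux_params=aux_params, allow_missing=True)
# ...continue training from here with the margin loss.
```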

@tianxingyzxq
Author

To pursue ultimate performance, MobileFaceNet, MobileFaceNet (112 × 96), and MobileFaceNet (96 × 96) are further trained by ArcFace loss on the cleaned training set of MS-Celeb-1M database [5] with 3.8M images from 85K subjects.
So the further training is fine-tuning, not training from scratch?

@tianxingyzxq
Author

I used ms1m to train the MobileFaceNet network with softmax; the first verification test still reports 0.5:
INFO:root:Epoch[0] Batch [1980] Speed: 785.83 samples/sec acc=0.004492
lr-batch-epoch: 0.1 1999 0
testing verification..
(12000, 128)
infer time 5.902649
[lfw][2000]XNorm: 3.298171
[lfw][2000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(14000, 128)
infer time 6.765173
[cfp_fp][2000]XNorm: 3.160100
[cfp_fp][2000]Accuracy-Flip: 0.50000+-0.00000
testing verification..
(12000, 128)
infer time 5.85961
[agedb_30][2000]XNorm: 3.269653
[agedb_30][2000]Accuracy-Flip: 0.50000+-0.00000
[2000]Accuracy-Highest: 0.50000

@marcosly

@tianxingyzxq I faced the same problem. However, I got different results after several hours, once the learning rate dropped to 0.001:
lr-batch-epoch: 0.001 6241 18
testing verification..
(12000, 128)
infer time 8.014871
[lfw][140000]XNorm: 33.146315
[lfw][140000]Accuracy-Flip: 0.98867+-0.00552
testing verification..
(14000, 128)
infer time 9.476789
[cfp_fp][140000]XNorm: 29.224738
[cfp_fp][140000]Accuracy-Flip: 0.84671+-0.02232
testing verification..
(12000, 128)
infer time 8.189307
[agedb_30][140000]XNorm: 33.785845
[agedb_30][140000]Accuracy-Flip: 0.88883+-0.02323

@visionxyz

Same here, always 0.5.

@lmmcc

lmmcc commented Apr 25, 2018

(training log screenshot omitted)
I am also puzzled by the result. My result looks like the screenshot: training has reached epoch 10 and the training accuracy is still 0. The LFW result is not as good as other researchers'; the highest it reaches is 0.79.

@nttstar
Collaborator

nttstar commented Apr 25, 2018

I will provide a pretrained model soon.

@visionxyz

@lmmcc I face the same problem when training with ResNet-101: training acc is low but the LFW result is good.

@wsx276166228

@lmmcc I have the same problem as you! LFW accuracy = 0.991, but train accuracy = 0.000000. Did you ever solve the issue?

@wsx276166228

@nttstar I trained MobileFaceNet with two 1080 Ti GPUs and set batch_size to 256 per GPU. After 20000 batches, the LFW accuracy is 0.991 but the train accuracy is 0.00000. Is this phenomenon normal? If it is a problem, do you know what causes it? Thanks!
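One possible (unconfirmed) explanation for low training accuracy alongside high verification accuracy: with a large additive angular margin, the true-class logit is penalized by m during training, so an accuracy metric computed on those penalized logits can stay near zero even when the embedding angles are already well separated. A tiny numpy sketch of that effect, using made-up angles:

```python
# Illustration only (made-up angles, not from any real run): with ArcFace the
# training-time logit for the true class is s*cos(theta + m) rather than s*cos(theta),
# so an accuracy metric computed on these penalized logits can report ~0 even when
# the embeddings are already good enough for verification.
import numpy as np

s, m = 64.0, 0.5                     # scale and additive angular margin (typical values)
theta_true = np.radians(35.0)        # hypothetical angle to the true class weight
theta_wrong = np.radians(55.0)       # hypothetical angle to the closest wrong class

train_logit_true = s * np.cos(theta_true + m)  # margin-penalized logit used during training
eval_logit_true = s * np.cos(theta_true)       # margin-free cosine, what verification reflects
logit_wrong = s * np.cos(theta_wrong)

print(train_logit_true < logit_wrong)  # True: the penalized true class loses the argmax
print(eval_logit_true > logit_wrong)   # True: without the margin the true class wins easily
```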

@Wisgon

Wisgon commented May 11, 2018

@wsx276166228 What final accuracy did you get on lfw, cfp_fp, and agedb_30?

@saxenauts

@nttstar Hi, did you get a chance to upload the pretrained model for MobileFaceNet? Similar problem here. Thanks.

@wsx276166228

@Wisgon lfw=0.985 cfp_fp=0.854 agedb_30=0.921

@pribadihcr

+1

@Wisgon

Wisgon commented May 15, 2018

I have changed the learning rate from 0.1 (the default) to 0.01, but the training acc is still 0. I'm still running to see the final result.

@wsx276166228

@ShiyangZhang What do you mean? What learning rate did you set? Thanks!

@tp-nan

tp-nan commented May 16, 2018

@wsx276166228 I had forgotten the learning rate decay schedule from the original paper. With it, I now get much better accuracy on lfw, cfp_fp, and agedb_30, but it is still not good enough; maybe my batch size (364) is too small. I'll report the accuracy later.
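If it helps anyone, here is a minimal MXNet sketch of the kind of step decay being described. The step counts are placeholders, and the training script configures its own schedule through command-line options rather than this exact code.

```python
# Hedged sketch of step learning-rate decay in MXNet; the step values are placeholders.
import mxnet as mx

# Multiply the learning rate by 0.1 at these (hypothetical) update counts.
schedule = mx.lr_scheduler.MultiFactorScheduler(step=[100000, 140000, 160000], factor=0.1)

opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9, wd=0.00004,
                       lr_scheduler=schedule)
# Pass `opt` (or the equivalent command-line flags) to the training loop so the
# learning rate actually drops instead of staying at its initial value.
```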

@chenkingwen

@wsx276166228 Same problem here. Have you solved it?

@chenhuan19871014

Same here, always near 0.5. Does anyone know how to solve it?

@nttstar nttstar closed this as completed Sep 18, 2018
@staceycy

Same problem. Does anyone have any ideas?

@baishiruyue

@tianxingyzxq @nttstar I face the same issue: accuracy on the three validation sets is always 0.5. I checked the model's params file and found that some layers' parameters are near zero, so the model cannot be trained any further. How did you solve this? Thanks.
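In case it helps others check for the same symptom, here is a hedged debugging sketch (the checkpoint file name is a placeholder) for listing near-zero layers in a saved .params file:

```python
# Debugging sketch only: list parameters in a saved MXNet .params file whose
# values have collapsed toward zero. The checkpoint file name is a placeholder.
import mxnet as mx

params = mx.nd.load('../model-mobilefacenet-128-0020.params')  # keys look like 'arg:conv_1_weight'
for name, array in sorted(params.items()):
    mean_abs = array.abs().mean().asscalar()
    if mean_abs < 1e-6:
        print('possibly collapsed:', name, 'mean |w| =', mean_abs)
```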
