epoch and training accuracy are always 0 #40

Closed
jolinlinlin opened this issue Apr 11, 2019 · 7 comments
Comments

@jolinlinlin

I have a strange problem when I run `python train_nets.py --class_number=85742`: the epoch and training accuracy are always 0 and only the reg_loss is decreasing, yet the accuracy on LFW is increasing. Can someone tell me how to solve it? Thank you very much! The terminal log is as follows:

`epoch 0, total_step 22050, total loss is 48.44 , inference loss is 48.34, reg_loss is 0.09, training accuracy is 0.000000, time 243.119 samples/sec
epoch 0, total_step 22100, total loss is 47.52 , inference loss is 47.42, reg_loss is 0.09, training accuracy is 0.000000, time 226.503 samples/sec
epoch 0, total_step 22150, total loss is 47.11 , inference loss is 47.01, reg_loss is 0.09, training accuracy is 0.000000, time 219.780 samples/sec
epoch 0, total_step 22200, total loss is 47.53 , inference loss is 47.43, reg_loss is 0.09, training accuracy is 0.000000, time 247.627 samples/sec
epoch 0, total_step 22250, total loss is 51.30 , inference loss is 51.21, reg_loss is 0.09, training accuracy is 0.000000, time 244.979 samples/sec
epoch 0, total_step 22300, total loss is 48.17 , inference loss is 48.08, reg_loss is 0.09, training accuracy is 0.000000, time 241.888 samples/sec
epoch 0, total_step 22350, total loss is 45.93 , inference loss is 45.84, reg_loss is 0.09, training accuracy is 0.000000, time 231.988 samples/sec
epoch 0, total_step 22400, total loss is 45.68 , inference loss is 45.59, reg_loss is 0.09, training accuracy is 0.000000, time 225.902 samples/sec
epoch 0, total_step 22450, total loss is 44.99 , inference loss is 44.90, reg_loss is 0.09, training accuracy is 0.000000, time 250.815 samples/sec
epoch 0, total_step 22500, total loss is 45.48 , inference loss is 45.39, reg_loss is 0.09, training accuracy is 0.000000, time 251.646 samples/sec
epoch 0, total_step 22550, total loss is 49.16 , inference loss is 49.08, reg_loss is 0.09, training accuracy is 0.000000, time 227.408 samples/sec
epoch 0, total_step 22600, total loss is 46.27 , inference loss is 46.19, reg_loss is 0.09, training accuracy is 0.000000, time 227.355 samples/sec
epoch 0, total_step 22650, total loss is 47.00 , inference loss is 46.91, reg_loss is 0.09, training accuracy is 0.000000, time 242.682 samples/sec
epoch 0, total_step 22700, total loss is 49.39 , inference loss is 49.30, reg_loss is 0.09, training accuracy is 0.000000, time 253.599 samples/sec
epoch 0, total_step 22750, total loss is 48.45 , inference loss is 48.37, reg_loss is 0.09, training accuracy is 0.000000, time 229.287 samples/sec
epoch 0, total_step 22800, total loss is 47.40 , inference loss is 47.31, reg_loss is 0.09, training accuracy is 0.000000, time 241.944 samples/sec
epoch 0, total_step 22850, total loss is 44.57 , inference loss is 44.48, reg_loss is 0.09, training accuracy is 0.000000, time 244.938 samples/sec
epoch 0, total_step 22900, total loss is 47.02 , inference loss is 46.93, reg_loss is 0.09, training accuracy is 0.000000, time 244.988 samples/sec
epoch 0, total_step 22950, total loss is 46.18 , inference loss is 46.10, reg_loss is 0.09, training accuracy is 0.000000, time 227.201 samples/sec
epoch 0, total_step 23000, total loss is 48.87 , inference loss is 48.78, reg_loss is 0.09, training accuracy is 0.000000, time 211.159 samples/sec
epoch 0, total_step 23050, total loss is 44.74 , inference loss is 44.65, reg_loss is 0.09, training accuracy is 0.000000, time 235.120 samples/sec
epoch 0, total_step 23100, total loss is 47.30 , inference loss is 47.22, reg_loss is 0.09, training accuracy is 0.000000, time 239.130 samples/sec
epoch 0, total_step 23150, total loss is 46.05 , inference loss is 45.96, reg_loss is 0.08, training accuracy is 0.000000, time 238.544 samples/sec
epoch 0, total_step 23200, total loss is 46.19 , inference loss is 46.10, reg_loss is 0.08, training accuracy is 0.000000, time 214.053 samples/sec
epoch 0, total_step 23250, total loss is 44.94 , inference loss is 44.86, reg_loss is 0.08, training accuracy is 0.000000, time 256.440 samples/sec
epoch 0, total_step 23300, total loss is 49.61 , inference loss is 49.53, reg_loss is 0.08, training accuracy is 0.000000, time 233.729 samples/sec
epoch 0, total_step 23350, total loss is 49.41 , inference loss is 49.32, reg_loss is 0.08, training accuracy is 0.000000, time 248.436 samples/sec
epoch 0, total_step 23400, total loss is 45.00 , inference loss is 44.92, reg_loss is 0.08, training accuracy is 0.000000, time 253.218 samples/sec
epoch 0, total_step 23450, total loss is 47.28 , inference loss is 47.19, reg_loss is 0.08, training accuracy is 0.000000, time 227.988 samples/sec
epoch 0, total_step 23500, total loss is 45.08 , inference loss is 45.00, reg_loss is 0.08, training accuracy is 0.000000, time 238.994 samples/sec
epoch 0, total_step 23550, total loss is 47.95 , inference loss is 47.87, reg_loss is 0.08, training accuracy is 0.000000, time 220.209 samples/sec
epoch 0, total_step 23600, total loss is 46.57 , inference loss is 46.49, reg_loss is 0.08, training accuracy is 0.000000, time 251.595 samples/sec
epoch 0, total_step 23650, total loss is 47.15 , inference loss is 47.07, reg_loss is 0.08, training accuracy is 0.000000, time 246.071 samples/sec
epoch 0, total_step 23700, total loss is 45.79 , inference loss is 45.71, reg_loss is 0.08, training accuracy is 0.000000, time 252.965 samples/sec
epoch 0, total_step 23750, total loss is 46.13 , inference loss is 46.05, reg_loss is 0.08, training accuracy is 0.000000, time 223.322 samples/sec
epoch 0, total_step 23800, total loss is 46.05 , inference loss is 45.97, reg_loss is 0.08, training accuracy is 0.000000, time 256.571 samples/sec
epoch 0, total_step 23850, total loss is 44.58 , inference loss is 44.50, reg_loss is 0.08, training accuracy is 0.000000, time 227.609 samples/sec
epoch 0, total_step 23900, total loss is 45.56 , inference loss is 45.47, reg_loss is 0.08, training accuracy is 0.000000, time 241.311 samples/sec
epoch 0, total_step 23950, total loss is 46.53 , inference loss is 46.45, reg_loss is 0.08, training accuracy is 0.000000, time 230.220 samples/sec
epoch 0, total_step 24000, total loss is 46.23 , inference loss is 46.15, reg_loss is 0.08, training accuracy is 0.000000, time 238.740 samples/sec

Iteration 24000 testing...
thresholds max: 0.05 <=> min: 0.01
total time 13.504s to evaluate 12000 images of lfw
Accuracy: 0.711+-0.014
Validation rate: 0.02667+-0.01135 @ FAR=0.00100
fpr and tpr: 0.975 0.989
Area Under Curve (AUC): 0.785
Equal Error Rate (EER): 0.291

epoch 0, total_step 24050, total loss is 47.09 , inference loss is 47.01, reg_loss is 0.08, training accuracy is 0.000000, time 229.386 samples/sec
epoch 0, total_step 24100, total loss is 48.51 , inference loss is 48.43, reg_loss is 0.08, training accuracy is 0.000000, time 244.295 samples/sec
epoch 0, total_step 24150, total loss is 45.82 , inference loss is 45.75, reg_loss is 0.08, training accuracy is 0.000000, time 244.469 samples/sec
epoch 0, total_step 24200, total loss is 46.87 , inference loss is 46.79, reg_loss is 0.08, training accuracy is 0.000000, time 219.603 samples/sec
epoch 0, total_step 24250, total loss is 47.36 , inference loss is 47.29, reg_loss is 0.08, training accuracy is 0.000000, time 257.455 samples/sec
epoch 0, total_step 24300, total loss is 46.20 , inference loss is 46.12, reg_loss is 0.08, training accuracy is 0.000000, time 250.922 samples/sec
epoch 0, total_step 24350, total loss is 45.01 , inference loss is 44.93, reg_loss is 0.08, training accuracy is 0.000000, time 252.173 samples/sec
epoch 0, total_step 24400, total loss is 45.65 , inference loss is 45.57, reg_loss is 0.08, training accuracy is 0.000000, time 247.256 samples/sec
epoch 0, total_step 24450, total loss is 45.98 , inference loss is 45.90, reg_loss is 0.08, training accuracy is 0.000000, time 233.881 samples/sec
epoch 0, total_step 24500, total loss is 45.13 , inference loss is 45.05, reg_loss is 0.08, training accuracy is 0.000000, time 240.512 samples/sec
epoch 0, total_step 24550, total loss is 45.88 , inference loss is 45.80, reg_loss is 0.08, training accuracy is 0.000000, time 233.891 samples/sec
epoch 0, total_step 24600, total loss is 47.03 , inference loss is 46.96, reg_loss is 0.08, training accuracy is 0.000000, time 225.433 samples/sec
epoch 0, total_step 24650, total loss is 45.02 , inference loss is 44.94, reg_loss is 0.08, training accuracy is 0.000000, time 247.749 samples/sec
epoch 0, total_step 24700, total loss is 46.08 , inference loss is 46.00, reg_loss is 0.08, training accuracy is 0.000000, time 226.536 samples/sec
epoch 0, total_step 24750, total loss is 47.53 , inference loss is 47.45, reg_loss is 0.08, training accuracy is 0.000000, time 233.217 samples/sec
epoch 0, total_step 24800, total loss is 46.37 , inference loss is 46.29, reg_loss is 0.08, training accuracy is 0.000000, time 227.516 samples/sec
epoch 0, total_step 24850, total loss is 47.47 , inference loss is 47.40, reg_loss is 0.08, training accuracy is 0.000000, time 224.324 samples/sec
epoch 0, total_step 24900, total loss is 43.75 , inference loss is 43.67, reg_loss is 0.08, training accuracy is 0.000000, time 235.426 samples/sec
epoch 0, total_step 24950, total loss is 46.13 , inference loss is 46.05, reg_loss is 0.08, training accuracy is 0.000000, time 230.768 samples/sec
epoch 0, total_step 25000, total loss is 45.00 , inference loss is 44.92, reg_loss is 0.08, training accuracy is 0.000000, time 227.717 samples/sec
epoch 0, total_step 25050, total loss is 47.16 , inference loss is 47.08, reg_loss is 0.08, training accuracy is 0.000000, time 233.962 samples/sec
epoch 0, total_step 25100, total loss is 44.87 , inference loss is 44.79, reg_loss is 0.08, training accuracy is 0.000000, time 234.183 samples/sec
epoch 0, total_step 25150, total loss is 46.57 , inference loss is 46.50, reg_loss is 0.08, training accuracy is 0.000000, time 234.577 samples/sec
epoch 0, total_step 25200, total loss is 45.61 , inference loss is 45.53, reg_loss is 0.07, training accuracy is 0.000000, time 237.438 samples/sec
epoch 0, total_step 25250, total loss is 45.36 , inference loss is 45.29, reg_loss is 0.07, training accuracy is 0.000000, time 226.352 samples/sec
epoch 0, total_step 25300, total loss is 48.13 , inference loss is 48.05, reg_loss is 0.07, training accuracy is 0.000000, time 248.282 samples/sec
epoch 0, total_step 25350, total loss is 44.71 , inference loss is 44.63, reg_loss is 0.07, training accuracy is 0.000000, time 232.164 samples/sec
epoch 0, total_step 25400, total loss is 47.49 , inference loss is 47.42, reg_loss is 0.07, training accuracy is 0.000000, time 242.975 samples/sec
epoch 0, total_step 25450, total loss is 43.46 , inference loss is 43.39, reg_loss is 0.07, training accuracy is 0.000000, time 254.416 samples/sec
epoch 0, total_step 25500, total loss is 47.37 , inference loss is 47.30, reg_loss is 0.07, training accuracy is 0.000000, time 231.277 samples/sec
epoch 0, total_step 25550, total loss is 49.20 , inference loss is 49.13, reg_loss is 0.07, training accuracy is 0.000000, time 230.822 samples/sec
epoch 0, total_step 25600, total loss is 45.28 , inference loss is 45.21, reg_loss is 0.07, training accuracy is 0.000000, time 234.142 samples/sec
epoch 0, total_step 25650, total loss is 43.09 , inference loss is 43.02, reg_loss is 0.07, training accuracy is 0.000000, time 217.183 samples/sec
epoch 0, total_step 25700, total loss is 45.45 , inference loss is 45.38, reg_loss is 0.07, training accuracy is 0.000000, time 233.817 samples/sec
epoch 0, total_step 25750, total loss is 43.53 , inference loss is 43.46, reg_loss is 0.07, training accuracy is 0.000000, time 218.992 samples/sec
epoch 0, total_step 25800, total loss is 44.83 , inference loss is 44.76, reg_loss is 0.07, training accuracy is 0.000000, time 255.731 samples/sec
epoch 0, total_step 25850, total loss is 44.84 , inference loss is 44.77, reg_loss is 0.07, training accuracy is 0.000000, time 224.062 samples/sec
epoch 0, total_step 25900, total loss is 45.74 , inference loss is 45.67, reg_loss is 0.07, training accuracy is 0.000000, time 250.024 samples/sec
epoch 0, total_step 25950, total loss is 46.41 , inference loss is 46.34, reg_loss is 0.07, training accuracy is 0.000000, time 229.824 samples/sec
epoch 0, total_step 26000, total loss is 46.05 , inference loss is 45.97, reg_loss is 0.07, training accuracy is 0.000000, time 231.275 samples/sec

Iteration 26000 testing...
thresholds max: 0.06 <=> min: 0.01
total time 14.038s to evaluate 12000 images of lfw
Accuracy: 0.719+-0.022
Validation rate: 0.02467+-0.01447 @ FAR=0.00100
fpr and tpr: 0.970 0.987
Area Under Curve (AUC): 0.800
Equal Error Rate (EER): 0.278
`

@dear-john

dear-john commented Apr 11, 2019

I have the same problem. I have checked all the issues, both open and closed, but found no solution.
Have you solved this problem? Thanks!

Here is my log:
epoch 0, total_step 50, total loss is 65.83 , inference loss is 65.70, reg_loss is 0.14, training accuracy is 0.000000
epoch 0, total_step 100, total loss is 72.24 , inference loss is 72.11, reg_loss is 0.14, training accuracy is 0.000000
epoch 0, total_step 150, total loss is 80.90 , inference loss is 80.76, reg_loss is 0.14, training accuracy is 0.000000
epoch 0, total_step 200, total loss is 79.09 , inference loss is 78.95, reg_loss is 0.14, training accuracy is 0.000000
epoch 0, total_step 250, total loss is 81.85 , inference loss is 81.70, reg_loss is 0.14, training accuracy is 0.000000
epoch 0, total_step 300, total loss is 83.06 , inference loss is 82.92, reg_loss is 0.15, training accuracy is 0.000000
epoch 0, total_step 350, total loss is 83.29 , inference loss is 83.14, reg_loss is 0.15, training accuracy is 0.000000
epoch 0, total_step 400, total loss is 87.50 , inference loss is 87.35, reg_loss is 0.15, training accuracy is 0.000000

......
......

Iteration 2000 testing...
thresholds max: 0 <=> min: 0.0
Accuracy: 0.500+-0.000
Validation rate: 1.00000+-0.00000 @ FAR=1.00000
fpr and tpr: 0.998 0.998
thresholds max: 0 <=> min: 0.0
Accuracy: 0.500+-0.000
Validation rate: 1.00000+-0.00000 @ FAR=1.00000
fpr and tpr: 0.998 0.998
thresholds max: 0 <=> min: 0.0
Accuracy: 0.500+-0.000
Validation rate: 1.00000+-0.00000 @ FAR=1.00000
fpr and tpr: 0.998 0.998
thresholds max: 0 <=> min: 0.0
Accuracy: 0.500+-0.000
Validation rate: 1.00000+-0.00000 @ FAR=1.00000
fpr and tpr: 0.998 0.998

@dear-john

dear-john commented Apr 11, 2019

I use the dataset at https://pan.baidu.com/s/1S6LJZGdqcZRle1vlcMzHOQ#list/path=%2F , listed in the README, to train and validate. Could this problem be caused by using a different dataset?

@jolinlinlin
Author

> I use the dataset at https://pan.baidu.com/s/1S6LJZGdqcZRle1vlcMzHOQ#list/path=%2F , listed in the README, to train and validate. Could this problem be caused by using a different dataset?

Keep running and the epoch will change, but the accuracy is still always 0 and I don't know why. Maybe there is something wrong with the code.
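On the epoch counter staying at 0: if the counter only advances after a full pass over the training set (an assumption about train_nets.py, not something shown in this thread), a large dataset needs many steps before epoch 1 appears. A minimal sketch in plain Python; the image count and batch size below are assumed values for illustration, and `steps_per_epoch` and `epoch_at_step` are hypothetical helpers:

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of training steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

def epoch_at_step(total_step, num_samples, batch_size):
    """Zero-based epoch index that a given global step falls into."""
    return total_step // steps_per_epoch(num_samples, batch_size)

# Assumed values, not taken from this thread:
NUM_SAMPLES = 5_822_653  # assumed image count of the training set
BATCH_SIZE = 64          # hypothetical batch size

print(steps_per_epoch(NUM_SAMPLES, BATCH_SIZE))       # 90979
print(epoch_at_step(26000, NUM_SAMPLES, BATCH_SIZE))  # 0
print(epoch_at_step(91000, NUM_SAMPLES, BATCH_SIZE))  # 1
```

Under these assumptions, step 26000 from the log above is still well inside the first pass, which would be consistent with the epoch changing only after running longer.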

@dear-john

epoch 0, total_step 5850, total loss is 22.87 , inference loss is 22.83, reg_loss is 0.04, training accuracy is 0.000000
epoch 0, total_step 5900, total loss is 24.14 , inference loss is 24.10, reg_loss is 0.04, training accuracy is 0.000000
epoch 0, total_step 5950, total loss is 24.09 , inference loss is 24.05, reg_loss is 0.04, training accuracy is 0.000000
epoch 0, total_step 6000, total loss is 23.81 , inference loss is 23.77, reg_loss is 0.04, training accuracy is 0.000000

Iteration 6000 testing...
thresholds max: 0.12 <=> min: 0.07
Accuracy: 0.853+-0.020
Validation rate: 0.15367+-0.01683 @ FAR=0.00100
fpr and tpr: 0.965 0.986
thresholds max: 0.12 <=> min: 0.07
Accuracy: 0.825+-0.012
Validation rate: 0.09743+-0.02178 @ FAR=0.00114
fpr and tpr: 0.969 0.987
thresholds max: 0.12 <=> min: 0.07
Accuracy: 0.628+-0.011
Validation rate: 0.00486+-0.00339 @ FAR=0.00114
fpr and tpr: 0.966 0.973
thresholds max: 0.12 <=> min: 0.07
Accuracy: 0.667+-0.010
Validation rate: 0.01267+-0.00879 @ FAR=0.00100
fpr and tpr: 0.969 0.978
epoch 0, total_step 6050, total loss is 24.41 , inference loss is 24.37, reg_loss is 0.04, training accuracy is 0.000000
epoch 0, total_step 6100, total loss is 23.74 , inference loss is 23.70, reg_loss is 0.04, training accuracy is 0.000000

This is my current result. My server crashed twice, which is why the epoch is 0 and the total loss is about 24 rather than something larger, e.g. 60.
As you can see, the accuracy is increasing and the loss is decreasing as training continues, so the problem may lie in how the training accuracy is evaluated.
As long as the validation accuracy is increasing, I think we do not need to worry too much.
But if the training accuracy can be brought back to normal, I would be glad to fix it.
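One way to test whether the training-accuracy evaluation is at fault is to recompute top-1 accuracy by hand on a batch of logits and labels and compare it with the logged value. A minimal sketch in plain Python; `top1_accuracy` is a hypothetical helper, not a function from this repository, and how train_nets.py actually computes the value is not shown in this thread:

```python
def top1_accuracy(logits, labels):
    """Fraction of rows whose largest logit sits at the labelled index."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for row, label in zip(logits, labels):
        # Index of the largest logit in this row (first one wins on ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)

# Tiny hand-made batch: 3 samples, 4 classes.
logits = [
    [0.1, 2.0, 0.3, 0.0],  # argmax is 1
    [1.5, 0.2, 0.1, 0.4],  # argmax is 0
    [0.0, 0.1, 0.2, 3.0],  # argmax is 3
]
labels = [1, 2, 3]
print(top1_accuracy(logits, labels))  # 2 of the 3 rows match their label
```

If a hand computation like this on a real training batch also gives 0, the logged number is probably accurate rather than a reporting bug.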

@sirius-ai sirius-ai pinned this issue May 24, 2019
@sirius-ai sirius-ai unpinned this issue May 24, 2019
@sirius-ai
Owner

At the beginning, an accuracy of 0 is normal: the class number is so large and the loss is still so big that the network can hardly classify anything correctly. Once the test validation rate rises above 0.6, you will find that the training accuracy is no longer always 0.
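A rough sanity check of this explanation, as a sketch under assumptions: the cross-entropy of a uniform guess over C classes is ln C, and a sample whose softmax cross-entropy is above ln C gives its true class a probability below 1/C, so it cannot be the top-1 prediction. The code assumes the logged inference loss is a mean softmax cross-entropy (in nats) over the same logits used for the accuracy, which this thread does not confirm; the helper names are hypothetical:

```python
import math

NUM_CLASSES = 85742  # from --class_number in the original command

def chance_level_loss(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return math.log(num_classes)

def can_be_top1_correct(sample_loss, num_classes):
    """A sample can only be a top-1 hit if its true-class probability,
    exp(-sample_loss), is at least 1 / num_classes."""
    return sample_loss <= math.log(num_classes)

print(round(chance_level_loss(NUM_CLASSES), 2))  # 11.36
print(can_be_top1_correct(45.0, NUM_CLASSES))    # False
print(can_be_top1_correct(5.0, NUM_CLASSES))     # True (possible, not guaranteed)
```

With losses in the 40s, as in the logs above, a typical sample is far above this bound, so a reported training accuracy of 0 is what one would expect at that stage.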

@siyiding1216

> At the beginning, an accuracy of 0 is normal: the class number is so large and the loss is still so big that the network can hardly classify anything correctly. Once the test validation rate rises above 0.6, you will find that the training accuracy is no longer always 0.

What final training accuracy did you get? I could only reach around 20% after about 25 epochs.

@dangnh0611

@siyiding1216 I have this problem too. Did you fix it?
