Training on VOC from Scratch #49
After much debugging, I found that part of my issue was that I was training without initializing the weights, so my predictions quickly converged to NaNs. I decided to retrain, initializing with the ImageNet weights like so:
Meanwhile, I evaluated on the validation and train+ sets every 5 epochs to track training progress. Performance on the validation set began to stabilize at around 45 mIOU after roughly 250 epochs, so I then began reducing the learning rate like so:
Now, after a total of about 410 epochs (I started reducing the learning rate from epoch 240), I am still only achieving a maximum of 54.77 mIOU on the validation set. This is much lower than the results presented in the paper. Any advice on how to improve would be greatly appreciated.
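For readers following along, the "reduce the learning rate once validation mIOU stops improving" idea can be sketched in plain Python. This is only an illustration and not the schedule used in this thread: the function name `step_on_plateau`, the `patience`, `factor`, and `min_delta` values, and the example history are all hypothetical.

```python
# Hypothetical helper, not taken from the repository under discussion.
def step_on_plateau(history, lr, patience=3, factor=0.1, min_delta=0.5):
    """Return a reduced learning rate if validation mIOU has plateaued.

    history   -- validation mIOU values, one per evaluation (e.g. every 5 epochs)
    patience  -- how many recent evaluations to compare against the earlier best
    min_delta -- minimum improvement (in mIOU points) that counts as progress
    """
    if len(history) <= patience:
        return lr  # not enough evaluations yet to judge a plateau
    best_before = max(history[:-patience])
    recent_best = max(history[-patience:])
    if recent_best < best_before + min_delta:
        return lr * factor  # no meaningful gain recently: decay the rate
    return lr


# Made-up numbers: mIOU flattens out near 45, so the rate is reduced.
plateaued = step_on_plateau([30.0, 40.0, 44.8, 45.0, 45.1, 44.9], lr=0.01)
# Made-up numbers: mIOU is still climbing, so the rate is left alone.
improving = step_on_plateau([30.0, 35.0, 40.0, 45.0], lr=0.01)
```

A rule like this only decides when to decay; whether a lower rate actually closes the gap to the paper's numbers depends on the rest of the training setup.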
Hi @mcever, I'm also trying to reproduce the results on the VOC 2012 dataset. Have you managed to reproduce the results reported in the paper? If so, could you share your training command?
Hi,
I am attempting to train this network on VOC from scratch, essentially trying to recreate the pre-trained weights available for download; however, after 70+ epochs, my model is still just predicting background for an mIOU of 3.49%. Here is the command I am running to train:
Inside data/VOCdevkit/VOC2012 I have the original download of JPEGImages and SegmentationClass, which provides the full color segmentation images. Any help would be much appreciated.
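As a sanity check on the 3.49% figure, here is a minimal sketch of how mean IoU behaves when a model predicts only background. It assumes 21 classes (background plus the 20 VOC object classes) and an ignore label of 255 for void pixels; the function `mean_iou` and the toy pixel counts are hypothetical, and the 73% background share is an illustrative assumption rather than a measured statistic.

```python
# Hypothetical helper, not taken from the repository under discussion.
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes, from flat sequences of predicted and true labels."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip void pixels (assumed to be labeled 255)
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of both the predicted and true class.
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in the prediction or the ground truth.
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious)


# Toy ground truth: 73 background pixels plus 27 pixels spread over classes 1..20.
target = [0] * 73 + list(range(1, 21)) + list(range(1, 8))
pred = [0] * len(target)  # a model that predicts background everywhere

miou = mean_iou(pred, target, num_classes=21)
print(f"{100 * miou:.2f}%")
```

With these assumed counts, background IoU is 0.73 and every other class scores 0, so the mean is 0.73 / 21, or roughly 3.5%. An mIOU in that range is therefore consistent with an all-background prediction rather than with a partially trained model.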
Here's a snippet of output that may or may not help, showing fcn_valid moving a lot. I'm not entirely sure what the output means, so any explanation on what it is could be useful.
2019-04-11 15:00:09,073 Host Epoch[78] Batch [66-67] Speed: 11.93 samples/sec fcn_valid=0.623302
2019-04-11 15:00:10,056 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:10,058 Host Labels: 0 0.6 -1.0
Waited for 2.59876251221e-05 seconds
2019-04-11 15:00:10,075 Host Epoch[78] Batch [67-68] Speed: 11.98 samples/sec fcn_valid=0.644102
2019-04-11 15:00:10,076 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:11,055 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:11,056 Host Labels: 0 0.6 -1.0
Waited for 3.50475311279e-05 seconds
2019-04-11 15:00:11,074 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:11,077 Host Epoch[78] Batch [68-69] Speed: 11.98 samples/sec fcn_valid=0.632405
2019-04-11 15:00:12,056 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:12,058 Host Labels: 0 0.6 -1.0
Waited for 2.50339508057e-05 seconds
2019-04-11 15:00:12,074 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:12,077 Host Epoch[78] Batch [69-70] Speed: 12.00 samples/sec fcn_valid=0.775874
2019-04-11 15:00:13,057 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:13,058 Host Labels: 0 0.6 -1.0
Waited for 2.59876251221e-05 seconds
2019-04-11 15:00:13,074 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:13,077 Host Epoch[78] Batch [70-71] Speed: 12.01 samples/sec fcn_valid=0.562744
2019-04-11 15:00:14,056 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:14,058 Host Labels: 0 0.6 -1.0
Waited for 0.000184059143066 seconds
2019-04-11 15:00:14,074 Host Labels: 0 0.6 -1.0
2019-04-11 15:00:14,075 Host Epoch[78] Batch [71-72] Speed: 12.03 samples/sec fcn_valid=0.552027