Unable to reproduce seg_hrnet_w18_small_v1 #51

Closed
alvinwan opened this issue Sep 3, 2019 · 7 comments

Comments

@alvinwan

alvinwan commented Sep 3, 2019

Thanks for 27488d4; the configuration file is very helpful. That said, training on 4 GPUs as prescribed, I'm unable to reproduce the Cityscapes validation accuracy of 70.3% reported at https://github.com/HRNet/HRNet-Semantic-Segmentation#small-models (I attained 65.21%).

Is https://github.com/HRNet/HRNet-Semantic-Segmentation/blob/master/experiments/cityscapes/seg_hrnet_w18_small_v1_512x1024_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml verbatim the file used to produce 70.3% or does it need further hyperparameter tuning? (I'm on the pytorch-v1.1 branch.)

In case it's helpful (although I doubt it's informative), here are the per-class IoUs for the retrained w18-v1 model:

Loss: 0.179, MeanIU:  0.6509, Best_mIoU:  0.6521
[0.97245895 0.79921705 0.8969752  0.43651182 0.47062117 0.56336364
 0.57983322 0.68906234 0.91533262 0.60986547 0.93415257 0.74804671
 0.46804914 0.91671634 0.4241423  0.58802203 0.24108752 0.41514963
 0.69802723]
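
As a quick sanity check, the MeanIU in the log line can be recomputed from the per-class values. The sketch below assumes the 19 entries follow the standard Cityscapes training-class order (road through bicycle); that ordering is an assumption, since the log itself does not name the classes.

```python
# Per-class IoUs copied verbatim from the log above.
ious = [
    0.97245895, 0.79921705, 0.8969752, 0.43651182, 0.47062117, 0.56336364,
    0.57983322, 0.68906234, 0.91533262, 0.60986547, 0.93415257, 0.74804671,
    0.46804914, 0.91671634, 0.4241423, 0.58802203, 0.24108752, 0.41514963,
    0.69802723,
]

# ASSUMPTION: standard Cityscapes train-ID order; not confirmed by the log.
CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky", "person",
    "rider", "car", "truck", "bus", "train", "motorcycle", "bicycle",
]

# mIoU is the unweighted mean over classes.
mean_iou = sum(ious) / len(ious)

# The three weakest classes, lowest IoU first.
worst = sorted(zip(CLASSES, ious), key=lambda pair: pair[1])[:3]

print(f"mean IoU: {mean_iou:.4f}")
for name, iou in worst:
    print(f"  {name}: {iou:.4f}")
```

Under the assumed ordering, the lowest scores fall on rare classes, which is where a model trained without a good initialization would be expected to lag most.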
@sunke123
Member

sunke123 commented Sep 6, 2019

Hi, sorry for the late reply.
This is my training log for seg_hrnet_w18_small_v1: https://1drv.ms/u/s!Aus8VCZ_C_33gSQ7irYs1DZy68yv?e=6fiuJN

The model 'hrnet_w18_for_mb' is the same as hrnet_w18_small_v1.
Please check it out.
I think I used the same settings as you.
If you are using pytorch-v1.2, try running this code on pytorch-v1.1 instead. My friends tell me that they got worse performance with pytorch-v1.2.

@alvinwan
Author

alvinwan commented Sep 6, 2019

No problem, thanks for replying! I noticed that the config in your training log includes CLASS_BALANCE: True, whereas the current YAML does not set this variable (by default, lib/config/default.py sets this variable to false). I will try retraining with the class balance variable set to true, and if that works, I'll make a PR with the change.
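
Spotting a difference like this by eye is error-prone, so here is a minimal sketch of a recursive diff over two configs represented as nested dicts. The function name `diff_configs` and the example dicts are hypothetical; only `LOSS.CLASS_BALANCE` comes from this thread, and the `TRAIN.LR` entry is purely illustrative.

```python
def diff_configs(a, b, prefix=""):
    """Return {dotted_key: (value_in_a, value_in_b)} for every differing leaf."""
    missing = "<unset>"  # placeholder for keys present on one side only
    out = {}
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key, missing), b.get(key, missing)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Both sides are sections: recurse with an extended dotted prefix.
            out.update(diff_configs(va, vb, path + "."))
        elif va != vb:
            out[path] = (va, vb)
    return out


# Hypothetical stand-ins for the config dumped in a training log and the YAML file.
logged = {"LOSS": {"CLASS_BALANCE": True}, "TRAIN": {"LR": 0.01}}
yaml_cfg = {"LOSS": {}, "TRAIN": {"LR": 0.01}}

differences = diff_configs(logged, yaml_cfg)
print(differences)
```

A key reported as `<unset>` falls back to the project's defaults, so the defaults file should be checked before concluding that the effective values differ.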

@alvinwan
Author

alvinwan commented Sep 9, 2019

For posterity, I was unable to reproduce seg_hrnet_w18_small_v1's reported accuracy of 0.7026. Training with pytorch 1.1 + CLASS_BALANCE: True did improve on my initial accuracy above by about 1.5 percentage points. However, I obtained only 0.6688 and 0.6674 (~3.4 points short).

Edit: CLASS_BALANCE doesn't change anything, as the default is already true.

_C.LOSS.CLASS_BALANCE = True
Looks like the improvement came from downgrading pytorch 1.2 to pytorch 1.1.

@sunke123 would you happen to have the mean/std of your runs?
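
For reference, the mean, spread, and gap for the two results quoted above can be computed as follows. This is only a sketch of the arithmetic; with two samples the standard deviation is a rough indication at best.

```python
import statistics

reported = 0.7026          # accuracy reported for seg_hrnet_w18_small_v1
runs = [0.6688, 0.6674]    # the two results quoted in this comment

mean = statistics.mean(runs)
std = statistics.stdev(runs)            # sample standard deviation (n - 1)
gap_points = (reported - mean) * 100    # shortfall in percentage points

print(f"mean={mean:.4f} std={std:.4f} gap={gap_points:.2f} points")
```

The spread between the two results is far smaller than the gap to the reported number, which suggests a systematic difference in setup and not just run-to-run noise.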

alvinwan changed the title from "Unable to reproduce w18-v1" to "Unable to reproduce seg_hrnet_w18_small_v1" on Sep 9, 2019
@sunke123
Member

I only ran it once.
I think the difference of 3.4% is too large.
Could you share your training log with me?
I will double-check it.

@alvinwan
Author

alvinwan commented Sep 10, 2019

@sunke123 Thanks for offering to take a look. I think I just figured it out, oops: Looking at your logs again, I noticed the first epoch results in 30% mIOU (whereas my first epoch results in 10% mIOU), and your logs contain an extra line:

Is that a mobilenet-v3 checkpoint pretrained on imagenet? If so, that likely explains the discrepancy (if so, oops, sorry). Just in case that's not the issue, here are both training logs:

I realize I didn't initialize from the pretrained Imagenet weights, as stated in the README. Sorry for the bother -- I'll try again and update here.
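
A small pre-flight check can catch this kind of slip before a long training run. The sketch below is hypothetical: `pretrained_status` is not part of the repository, and the path passed to it stands in for whatever the config points at as pretrained weights.

```python
import os


def pretrained_status(path):
    """Describe whether a pretrained-weights path looks usable."""
    if not path:
        return "no pretrained path set: training starts from random init"
    if not os.path.isfile(path):
        return f"pretrained file not found: {path}"
    return f"pretrained weights found: {path}"


# Hypothetical usage: an empty path means no ImageNet initialization.
print(pretrained_status(""))
```

Logging this message at startup would leave a line in the training log that makes a missing initialization easy to spot when comparing two runs.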

@alvinwan
Author

@sunke123 sorry for bothering, that fixed it! haha. I appreciate your help. These are probably the first reproducible results I've ever seen. The code is clean, the results are reproducible... couldn't ask for more.

@sunke123
Member

Congrats! hh~
Thanks for your attention.
If you have any questions, please feel free to contact us.
