How to use OHEM loss function? #15
@qijiezhao Thanks for your attention. I have updated run_local.sh. The OHEM hyperparameters are --ohem-thres 0.7 --ohem-keep 100000.
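For context, a minimal sketch of how these two hyperparameters typically interact in OHEM-style hard-pixel mining (a hypothetical helper for illustration, not this repo's actual implementation): pixels whose ground-truth-class probability falls below the threshold count as "hard", and if there are fewer hard pixels than the keep count, the lowest-confidence valid pixels are retained instead.

```python
import numpy as np

def ohem_mask(probs, labels, thres=0.7, min_kept=100000, ignore_index=255):
    """Hypothetical OHEM pixel-selection sketch.

    probs:  (N, C, H, W) softmax probabilities
    labels: (N, H, W) integer class labels
    Returns a boolean (N, H, W) mask of pixels to keep for the loss.
    """
    n, c, h, w = probs.shape
    flat_labels = labels.reshape(-1)
    valid = flat_labels != ignore_index
    # probability the model assigns to the ground-truth class at each pixel
    flat_probs = probs.transpose(0, 2, 3, 1).reshape(-1, c)
    gt_prob = np.full(flat_labels.shape, 1.0)
    gt_prob[valid] = flat_probs[valid, flat_labels[valid]]
    # hard pixels: ground-truth probability below the threshold
    hard = gt_prob < thres
    if hard.sum() < min_kept:
        # too few hard pixels: fall back to the min_kept lowest-confidence
        # valid pixels (ignored pixels have gt_prob 1.0, so they sort last)
        order = np.argsort(gt_prob)
        kept = min(min_kept, int(valid.sum()))
        hard = np.zeros_like(hard)
        hard[order[:kept]] = True
    return (hard & valid).reshape(labels.shape)
```

The cross-entropy loss is then averaged only over the masked pixels, so easy (high-confidence) pixels stop dominating the gradient.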
Yeah, I reproduced the results of PSPNet, FCN_NonLocal, Deeplabv3, and CCNet with a 769x769 input size (±0.2%). I think these steps are not that troublesome.
Great! Does FCN_NonLocal use ResNet-101 as the backbone? Can you tell me the performance of FCN_NonLocal? I don't have a V100 to run FCN_NonLocal with ResNet-101.
Yes, FCN_NonLocal with ResNet-101. I tuned a lot of different settings to maximize the performance; the highest is around 79.97 (val set, single scale; however, with many more parameters than CCNet). The tricks include tuning the key channels (pruning), decreasing unnecessary channels, etc. In addition, if I use the segmentation toolbox code to run the OHEM loss function, the modification is also:
Ok, thanks for the information. |
Thanks for your help; my command is:
Hi, Zilong: I have evaluated a normal FCN-non-local with OHEM, and it improves by 0.7 points over training without OHEM. However, when I switch to my own method, it gets no improvement; I think my method may conflict with OHEM. Have you ever met this problem when designing CCNet? I want to keep up long-term communication with you on semantic segmentation problems. Can you add me as a WeChat friend? My ID is zhaoqijie8356. Thanks.
Hi, I also met the same problem: using OHEM brought no improvement to my own method either. It seems that OHEM may be sensitive to the network architecture.
Hi, @qijiezhao Thanks |
Sure. The env is: 2 V100 GPUs, PyTorch 0.4.1, 4 images per GPU.
@qijiezhao Thanks very much |
@speedinghzl Can you share the exact learning rate and the number of training steps for finetuning this model?
Hi, Zilong:
Thanks for your contribution to this amazing repo! I really appreciate it!
By the way, I want to reproduce the highest score in your paper: 81.4 on the Cityscapes test set (with 4 V100 GPUs).
I have already reproduced the default-setting single-scale result on the val set: 79.7 on my machine.
To reach 81.4, as far as I know, besides the default settings, the OHEM loss function and multi-scale inference should also be used. But when I re-train the code with --ohem True, the evaluation results drop a lot - about 10 points.
So my question is: what else should I modify to achieve the result reported in the original paper?
Thanks a lot for your help!