
Getting very low mIoU on Cityscapes #89

Open
sreeragh-ar opened this issue Dec 1, 2020 · 3 comments
sreeragh-ar commented Dec 1, 2020

I observed that the mIoU does not improve beyond 28% on Cityscapes (on the pytorch-1.1 branch).

Changes made to the run configs:

  • Reduced batch size to 4
  • Number of GPUs: 2
  • Reduced input size to 512×512
  • Trained for 120,000 iterations

I trained in phases. For example, I trained the model for 20,000 iterations, evaluated it, and recorded the learning rate at that point.

Then I restarted training with:
--restore-from ./snapshots/CS_scenes_20000.pth
LR={learning rate after 20,000 iterations}
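
For context, here is a minimal sketch of how that resume learning rate could be computed, assuming the common "poly" schedule lr = base_lr · (1 − iter/max_iter)^0.9 used by this family of segmentation repos; the base LR of 0.01 and power of 0.9 are assumptions, not values read from this codebase:

```python
# Minimal sketch, assuming the common "poly" LR schedule; base_lr and
# power are assumptions, not values taken from this repository.
def poly_lr(base_lr: float, cur_iter: int, max_iter: int, power: float = 0.9) -> float:
    """Polynomially decayed learning rate at iteration cur_iter."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power

# Example: resuming after 20,000 of 120,000 iterations with base LR 0.01.
print(poly_lr(0.01, 20_000, 120_000))  # ≈ 0.00849
```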

Am I missing something?

sreeragh-ar (Author) commented:

@speedinghzl

An update:
I modified the evaluation script to use the predict_whole() function instead of predict_sliding(), and I obtained an mIoU of 57.09% with the same model.
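
For anyone comparing the two paths, here is a minimal sketch of what whole-image inference does conceptually; the function name matches the repo, but the exact signature and the 19-class Cityscapes output are assumptions:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict_whole(net, image: torch.Tensor) -> torch.Tensor:
    """Whole-image inference sketch: (1, 3, H, W) -> (1, C, H, W) logits."""
    h, w = image.shape[-2:]
    out = net(image)
    if isinstance(out, (list, tuple)):  # some repos return multiple heads
        out = out[0]                    # which element is "main" varies by repo
    # Upsample the downsampled logits back to the full input resolution.
    return F.interpolate(out, size=(h, w), mode='bilinear', align_corners=True)
```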

I also tried the provided trained model shared on the home page
(the 2+OHEM model: https://drive.google.com/file/d/1eiX1Xf1o16DvQc3lkFRi4-Dk7IBVspUQ/view).
Evaluating it with the same evaluation-script changes gave an mIoU of 48.59% instead of the expected ~80%.

Could you please help me figure out what I am missing?

speedinghzl (Owner) commented:

@sreeragh-ar Thanks for your results.
The performance gap between predict_whole() and predict_sliding() may be caused by the size of the input image.

If the input image is small, the network may not see enough context for classification when using predict_sliding().

The provided trained model was trained with a 769×769 input size, so a 769×769 input may already be enough to obtain rich context information. With predict_sliding(), the input size at test time matches the training phase, which is good for performance.

It is a trade-off between rich context and consistency between the training and testing settings. This is just my guess; any discussion is welcome.
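
To make the trade-off concrete, here is a minimal sketch of sliding-window inference with overlapping crops whose logits are averaged; the 769×769 crop matches the discussion above, but the stride, helper names, and averaging details are assumptions rather than this repo's exact code:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict_sliding(net, image, num_classes=19, crop=769, stride=513):
    """Sliding-window sketch: (1, 3, H, W) -> (1, num_classes, H, W) logits.

    Assumes H >= crop and W >= crop (true for 1024x2048 Cityscapes frames).
    Each crop only sees local context, which is why small training crops
    can starve the network of context at test time.
    """
    _, _, h, w = image.shape
    logits = torch.zeros(1, num_classes, h, w)
    counts = torch.zeros(1, 1, h, w)
    rows = math.ceil((h - crop) / stride) + 1
    cols = math.ceil((w - crop) / stride) + 1
    for r in range(rows):
        for c in range(cols):
            y = min(r * stride, h - crop)   # clamp the last tile to the edge
            x = min(c * stride, w - crop)
            out = net(image[:, :, y:y + crop, x:x + crop])
            if isinstance(out, (list, tuple)):
                out = out[0]
            out = F.interpolate(out, size=(crop, crop), mode='bilinear',
                                align_corners=True)
            logits[:, :, y:y + crop, x:x + crop] += out.cpu()
            counts[:, :, y:y + crop, x:x + crop] += 1
    return logits / counts  # average the overlapping predictions
```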

sreeragh-ar (Author) commented Dec 26, 2020:

@speedinghzl
Thanks for the explanation.

I have one more query.

I downloaded the official CCNet trained model and evaluated it with the official cityscapesScripts, but I was unable to reproduce the expected result.

  • Used single-scale prediction, i.e., predict_whole().

  • Set --whole=True to use the full 1024×2048 input size for rich context information.

  • Saved the predictions as PNG files after converting train ids to class ids (e.g., train id 0 → class id 7, 1 → 8, 2 → 11, etc.); see the sketch after this list.

  • Used csEvalPixelLevelSemanticLabeling from cityscapesScripts and obtained a low mIoU of 0.486 (i.e., 48.6%).

  • The issue should not be in training, since I used the downloaded trained model.

  • The issue should not be in the mIoU calculation, since I used the official Cityscapes script.

  • The part I am doubtful about is the prediction step, but I am unable to figure out the exact issue.
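
For reference, here is a minimal sketch of the train-id to class-id conversion using the mapping shipped with cityscapesScripts; the function name and output path are placeholders, and the evaluation script's file-naming conventions still apply:

```python
import numpy as np
from PIL import Image
from cityscapesscripts.helpers.labels import trainId2label

def save_for_cityscapes_eval(pred_train_ids: np.ndarray, out_path: str) -> None:
    """Convert a (H, W) array of train ids (0..18) to Cityscapes label ids
    (0 -> 7 road, 1 -> 8 sidewalk, 2 -> 11 building, ...) and save as PNG."""
    label_ids = np.zeros_like(pred_train_ids, dtype=np.uint8)
    for train_id, label in trainId2label.items():
        if 0 <= train_id <= 18:          # skip the 255 / -1 ignore entries
            label_ids[pred_train_ids == train_id] = label.id
    Image.fromarray(label_ids).save(out_path)  # plain 8-bit grayscale PNG
```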

Could you please share the exact run_local.sh and evaluate.py files that give the expected result? (A branch or commit id is also fine.)
