
Cannot reproduce the PQ on the ADE20K dataset when training from scratch #14

Closed
wjgaas opened this issue Dec 4, 2022 · 8 comments
Labels
question Further information is requested

Comments

@wjgaas

wjgaas commented Dec 4, 2022

I cannot reach PQ 48 on the ADE20K dataset with the Swin-L backbone; I only get PQ 46. How can I reproduce the result you report in the paper?

@praeclarumjj3 praeclarumjj3 added the question Further information is requested label Dec 4, 2022
@praeclarumjj3
Member

Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

@wjgaas
Author

wjgaas commented Dec 5, 2022

> Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

Thanks for your reply. I trained thrice; the results are 46.31, 46.27, and 46.52 PQ on the ADE20K dataset with the Swin-L backbone, using the config you provide in the README (https://github.com/SHI-Labs/OneFormer/blob/main/configs/ade20k/swin/oneformer_swin_large_bs16_160k.yaml).
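For context, the spread across these three runs is small compared to the gap to the paper's number, which suggests a systematic difference rather than run-to-run noise. A quick check on the scores quoted above:

```python
# Sanity check on the run-to-run spread of the three PQ scores quoted above.
from statistics import mean, stdev

pq_runs = [46.31, 46.27, 46.52]  # PQ from three training runs (Swin-L, ADE20K)
print(f"mean PQ: {mean(pq_runs):.2f}")           # 46.37
print(f"sample std dev: {stdev(pq_runs):.2f}")   # 0.13
print(f"gap to paper's PQ 48: {48.0 - max(pq_runs):.2f}")  # 1.48
```

A standard deviation of roughly 0.13 across runs is an order of magnitude smaller than the ~1.5-point gap to the reported result.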

@praeclarumjj3
Member

Hi @wjgaas, could you also share the scores for AP and mIoU metrics? Also, to be sure, you are using the same environment as suggested in the installation instructions, right?

@praeclarumjj3
Member

praeclarumjj3 commented Dec 7, 2022

Hi @wjgaas, were you able to resolve this issue? If not, could you share your training logs? Also, did you try training with loading the https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_large_patch4_window12_384_22kto1k.pth weights?
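Skipping the ImageNet-pretrained backbone weights is a common cause of a PQ gap of this size, since the backbone would then start from random initialization. In Detectron2-style configs, the pretrained checkpoint is typically referenced via the `MODEL.WEIGHTS` key; a sketch (the exact key layout in OneFormer's YAML may differ):

```yaml
# Sketch of a Detectron2-style config pointing at the pretrained backbone
# checkpoint; verify the exact key path against the OneFormer config files.
MODEL:
  WEIGHTS: "swin_large_patch4_window12_384_22kto1k.pth"
```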

@praeclarumjj3
Member

Closing this due to inactivity. Feel free to re-open if you still face issues.

@achen46

achen46 commented Mar 4, 2023

> Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

> Thanks for your reply. I trained thrice; the results are 46.31, 46.27, and 46.52 PQ on the ADE20K dataset with the Swin-L backbone, using the config you provide in the README (https://github.com/SHI-Labs/OneFormer/blob/main/configs/ade20k/swin/oneformer_swin_large_bs16_160k.yaml).

I also experience the same issue.

@achen46

achen46 commented Mar 4, 2023

In general, results should not vary this much across identical runs (even over 10 runs, the variance should remain reasonable), or even with slightly different dependencies.

I see your paper was accepted to CVPR, and congratulations on that, but this is a serious issue, and I hope the authors address it.

A good first step would be to publish all the training logs.

@praeclarumjj3
Member

Hi @achen46, thank you for your interest in our work.

Please share your logs and exact details on your environment (GPU architecture and model, CUDA toolkit version, PyTorch, Torchvision, Detectron, and NATTEN versions + their compiled CUDA versions), so we can help you. That is the first piece of information any issue on an open-source repository requires. Simply stating that "it does not work with exactly following instructions" does not help.
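For anyone gathering the details requested above, a short (hypothetical) helper like the following collects most of them in one shot; package names and attributes are guesses for what a typical setup exposes:

```python
import platform
import importlib

def report_environment():
    """Collect version info for the packages requested in the thread.

    Packages that are not installed are reported as such rather than
    raising, so the snippet runs in any environment.
    """
    info = {"python": platform.python_version()}
    for pkg in ("torch", "torchvision", "detectron2", "natten"):
        try:
            mod = importlib.import_module(pkg)
            info[pkg] = getattr(mod, "__version__", "unknown")
        except ImportError:
            info[pkg] = "not installed"
    # CUDA details are only available when torch is importable.
    try:
        import torch
        info["cuda_compiled"] = torch.version.cuda
        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
    except ImportError:
        pass
    return info

for key, value in report_environment().items():
    print(f"{key}: {value}")
```

Pasting this output alongside the training logs covers the environment half of the request.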

We ran an experiment with a fresh clone of the same code (this GitHub repo) that you are having issues with, and we got the following numbers: PQ: 50.5, AP: 36.2, mIoU (s.s./m.s.): 56.6/57.6 (trained yesterday on 03/05/2023). The results are better than our reported numbers in our CVPR paper with PQ: 49.8, AP: 35.9, mIoU (s.s./m.s.): 57.0/57.7 (trained 7 months ago on 08/14/2022), where we only ran three times and reported the best number.

You can find the WandB logs for the original and reproduced runs here: WandB logs. We also share the training log with step-wise loss values for your reference and environment setup details to help your experiments.
