
Cannot reproduce the PQ on the ADE20K dataset when training from scratch #14

Closed
wjgaas opened this issue Dec 4, 2022 · 8 comments
Labels
question Further information is requested

Comments

@wjgaas

wjgaas commented Dec 4, 2022

I cannot reach PQ 48 on the ADE20K dataset with the Swin-L backbone; I only get PQ 46. How can I reproduce the result you report in the paper?

@praeclarumjj3 praeclarumjj3 added the question Further information is requested label Dec 4, 2022
@praeclarumjj3
Member

Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

@wjgaas
Author

wjgaas commented Dec 5, 2022

> Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

Thanks for your reply. I trained thrice; the results are 46.31, 46.27, and 46.52 PQ on the ADE20K dataset with the Swin-L backbone, using the config you provide in the README (https://github.com/SHI-Labs/OneFormer/blob/main/configs/ade20k/swin/oneformer_swin_large_bs16_160k.yaml).
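For context, the spread across these three runs is small compared to the gap to the paper's number, which suggests a systematic difference rather than run-to-run noise. A quick check on the scores quoted above:

```python
# Sanity check on the run-to-run spread of the three PQ scores quoted above.
from statistics import mean, stdev

pq_runs = [46.31, 46.27, 46.52]  # PQ from three training runs (Swin-L, ADE20K)
print(f"mean PQ: {mean(pq_runs):.2f}")           # 46.37
print(f"sample std dev: {stdev(pq_runs):.2f}")   # 0.13
print(f"gap to paper's PQ 48: {48.0 - max(pq_runs):.2f}")  # 1.48
```

A standard deviation of roughly 0.13 across runs is an order of magnitude smaller than the ~1.5-point gap to the reported result.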

@praeclarumjj3
Member

Hi @wjgaas, could you also share the scores for AP and mIoU metrics? Also, to be sure, you are using the same environment as suggested in the installation instructions, right?

@praeclarumjj3
Member

praeclarumjj3 commented Dec 7, 2022

Hi @wjgaas, were you able to resolve this issue? If not, could you share your training logs? Also, did you try training with loading the https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_large_patch4_window12_384_22kto1k.pth weights?
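Skipping the ImageNet-pretrained backbone weights is a common cause of a PQ gap of this size, since the backbone would then start from random initialization. In Detectron2-style configs, the pretrained checkpoint is typically referenced via the `MODEL.WEIGHTS` key; a sketch (the exact key layout in OneFormer's YAML may differ):

```yaml
# Sketch of a Detectron2-style config pointing at the pretrained backbone
# checkpoint; verify the exact key path against the OneFormer config files.
MODEL:
  WEIGHTS: "swin_large_patch4_window12_384_22kto1k.pth"
```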

@praeclarumjj3
Member

Closing this due to inactivity. Feel free to re-open if you still face issues.

@achen46

achen46 commented Mar 4, 2023

> Hi @wjgaas, thanks for your interest in our work. How many times did you train? You may need to retrain to accommodate the variance in performance. We trained all our models thrice and reported the best results.

> Thanks for your reply. I trained thrice; the results are 46.31, 46.27, and 46.52 PQ on the ADE20K dataset with the Swin-L backbone, using the config you provide in the README (https://github.com/SHI-Labs/OneFormer/blob/main/configs/ade20k/swin/oneformer_swin_large_bs16_160k.yaml).

I also experience the same issue.

@achen46

achen46 commented Mar 4, 2023

In general, results should not vary this much across identical runs (even over 10 runs, the variance should remain reasonable), or even with slightly different dependencies.

I see your paper was accepted to CVPR, and congratulations on that, but this is a serious issue, and I hope the authors address it.

A good first step would be to publish all the training logs.

@praeclarumjj3
Member

Hi @achen46, thank you for your interest in our work.

Please share your logs and exact details on your environment (GPU architecture and model, CUDA toolkit version, PyTorch, Torchvision, Detectron, and NATTEN versions + their compiled CUDA versions), so we can help you. That is the first piece of information any issue on an open-source repository requires. Simply stating that "it does not work with exactly following instructions" does not help.
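For anyone gathering the details requested above, a short (hypothetical) helper like the following collects most of them in one shot; package names and attributes are guesses for what a typical setup exposes:

```python
import platform
import importlib

def report_environment():
    """Collect version info for the packages requested in the thread.

    Packages that are not installed are reported as such rather than
    raising, so the snippet runs in any environment.
    """
    info = {"python": platform.python_version()}
    for pkg in ("torch", "torchvision", "detectron2", "natten"):
        try:
            mod = importlib.import_module(pkg)
            info[pkg] = getattr(mod, "__version__", "unknown")
        except ImportError:
            info[pkg] = "not installed"
    # CUDA details are only available when torch is importable.
    try:
        import torch
        info["cuda_compiled"] = torch.version.cuda
        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
    except ImportError:
        pass
    return info

for key, value in report_environment().items():
    print(f"{key}: {value}")
```

Pasting this output alongside the training logs covers the environment half of the request.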

We ran an experiment with a fresh clone of the same code (this GitHub repo) that you are having issues with, and we got the following numbers: PQ: 50.5, AP: 36.2, mIoU (s.s./m.s.): 56.6/57.6 (trained yesterday on 03/05/2023). The results are better than our reported numbers in our CVPR paper with PQ: 49.8, AP: 35.9, mIoU (s.s./m.s.): 57.0/57.7 (trained 7 months ago on 08/14/2022), where we only ran three times and reported the best number.

You can find the WandB logs for the original and reproduced runs here: WandB logs. We also share the training log with step-wise loss values for your reference and environment setup details to help your experiments.
