
About comparison with other SOTA methods #15

Closed
Luffy03 opened this issue Apr 27, 2022 · 9 comments

Comments

@Luffy03

Luffy03 commented Apr 27, 2022

Sorry for bothering you again!
First, I have re-trained several models, i.e., CPS, U2PL, and AEL, but my results are far below their reported numbers. Among these SOTA methods with public code, your ST++ is the only one for which I can reproduce the expected results (e.g., 74.15 for 1/16 Pascal). I appreciate it!
I found several different settings among the earlier methods, i.e., a stronger backbone (resnet_stem, a deep-stem ResNet), auxiliary decoders, a larger crop size, the OHEM loss (a rough sketch of OHEM follows this comment), SyncBN, and several training tricks, which make comparisons extremely unfair.
So I wonder: did the reviewers ask you to compare with these methods? Is it necessary to re-train your ST++ under the same settings? Just out of curiosity...
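
For readers skimming this thread: a minimal sketch of the OHEM cross-entropy mentioned above, assuming per-pixel logits of shape (N, C, H, W); the `thresh` and `min_kept` values are illustrative, and the official repos tune them per dataset.

```python
import torch
import torch.nn.functional as F

def ohem_ce(logits, target, thresh=0.7, min_kept=100_000, ignore_index=255):
    """Average the loss over 'hard' pixels only: those whose ground-truth
    probability is below `thresh`, keeping at least `min_kept` pixels."""
    pixel_loss = F.cross_entropy(
        logits, target, ignore_index=ignore_index, reduction='none').flatten()
    prob = F.softmax(logits, dim=1)
    # probability the model assigns to the ground-truth class at each pixel
    gt = target.clamp(0, logits.shape[1] - 1).unsqueeze(1)
    gt_prob = prob.gather(1, gt).squeeze(1).flatten()
    valid = target.flatten() != ignore_index
    hard = valid & (gt_prob < thresh)
    if hard.sum() < min_kept and valid.any():
        # fall back to the min_kept lowest-confidence valid pixels
        k = min(min_kept, int(valid.sum()))
        idx = torch.topk(gt_prob.masked_fill(~valid, 1.0), k, largest=False).indices
        return pixel_loss[idx].mean()
    return pixel_loss[hard].mean()
```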

@LiheYoung
Owner

LiheYoung commented Apr 27, 2022

Hi, thanks a lot for your appreciation; it encourages me a lot.

Actually, I have also noticed that some recent methods adopt several extra techniques, just as you mentioned. Moreover, they also perform sliding-window evaluation on Cityscapes, which may boost the final performance by ~2%.
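
For context, this is roughly what sliding-window evaluation does; a minimal sketch, assuming the model returns logits already upsampled to the input resolution (the crop and stride values are illustrative, not copied from any particular repo):

```python
import torch

@torch.no_grad()
def slide_inference(model, image, num_classes, crop=769, stride=513):
    """Average logits over overlapping crops of a full-resolution image."""
    _, _, h, w = image.shape
    logits = image.new_zeros((1, num_classes, h, w))
    count = image.new_zeros((1, 1, h, w))
    h_grids = max(h - crop + stride - 1, 0) // stride + 1
    w_grids = max(w - crop + stride - 1, 0) // stride + 1
    for i in range(h_grids):
        for j in range(w_grids):
            top = min(i * stride, max(h - crop, 0))
            left = min(j * stride, max(w - crop, 0))
            patch = image[:, :, top:top + crop, left:left + crop]
            logits[:, :, top:top + crop, left:left + crop] += model(patch)
            count[:, :, top:top + crop, left:left + crop] += 1
    return logits / count
```

Unlike resizing the whole image, the overlapping crops see every region at the training resolution, which is presumably where the ~2% gain on Cityscapes comes from.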

Back to your questions: we were asked to compare with CPS and AEL during our submission. What we can do is report our results when incorporating some of these techniques. Besides, you could report your reproduced CPS/AEL/U2PL results in your work. I think you could also re-train ST++ with the same advanced techniques for a fair comparison.

@Luffy03
Author

Luffy03 commented Apr 28, 2022

OK, thanks very much!
By the way, I have a few more questions about the selective re-training. As shown in the attached figures, the ablation study for ST++ is only conducted on the Pascal dataset in the arXiv version of your paper. However, I found that this strategy brings only a limited gain on Cityscapes. I think that because Cityscapes images are so large, image-level selection is a weak signal there (a rough sketch of the image-level scoring follows the screenshots). I also found that blind two-stage re-training achieves results close to ST++ on Cityscapes, which is quite different from the situation on Pascal.
So, will an ablation study on Cityscapes be reported in an updated version of ST++? Or could you please provide more details?
[Screenshot 2022-04-28 093719]
[Screenshot 2022-04-28 093832]
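
For context, a minimal sketch of the image-level selection scoring under discussion, assuming pseudo masks have already been saved from several intermediate checkpoints plus the final one (the function names and the top-half split are illustrative):

```python
import numpy as np

def mean_iou(mask_a, mask_b, num_classes):
    """Mean IoU between two label maps, skipping classes absent from both."""
    ious = []
    for c in range(num_classes):
        a, b = mask_a == c, mask_b == c
        union = np.logical_or(a, b).sum()
        if union > 0:
            ious.append(np.logical_and(a, b).sum() / union)
    return float(np.mean(ious)) if ious else 1.0

def stability_score(ckpt_masks, final_mask, num_classes=21):
    """Agreement of intermediate-checkpoint masks with the final mask;
    a higher score suggests a more reliable unlabeled image."""
    return float(np.mean([mean_iou(m, final_mask, num_classes)
                          for m in ckpt_masks]))

# Rank all unlabeled images by stability_score and re-train first on the
# top-ranked half.
```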

@LiheYoung
Owner

Thanks for your results on Cityscapes!

Actually, we did not conduct ablations on Cityscapes, since the training process is too time-consuming and unaffordable for us. I guess that on Cityscapes, selective training based on grids may work better, e.g., dividing the whole image into 2x2 grids and selecting reliable grids rather than whole images.
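
A minimal sketch of that grid-level variant (not part of any released code; it reuses `mean_iou` from the sketch above and scores each cell of a 2x2 partition separately):

```python
import numpy as np

def grid_scores(ckpt_masks, final_mask, num_classes=19, grid=2):
    """Stability score per cell of a grid x grid partition of one image."""
    h, w = final_mask.shape
    gh, gw = h // grid, w // grid
    scores = np.zeros((grid, grid))
    for i in range(grid):
        for j in range(grid):
            sl = (slice(i * gh, (i + 1) * gh), slice(j * gw, (j + 1) * gw))
            scores[i, j] = np.mean([mean_iou(m[sl], final_mask[sl], num_classes)
                                    for m in ckpt_masks])
    return scores
```

Selection could then keep reliable cells rather than whole images, which may suit the large Cityscapes frames better.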

@Luffy03
Author

Luffy03 commented Apr 28, 2022

Thanks for your advice, I will give it a try.

@LiheYoung
Owner

LiheYoung commented Apr 29, 2022

Hi @Luffy03,

Could you tell me your reproduced results for these three methods? And did you adopt the same training settings as in their papers, such as the batch size and crop size? Thanks a lot.

@Luffy03
Author

Luffy03 commented Apr 30, 2022


Hi, limited by GPU memory, I reproduced these methods under the earlier settings (i.e., 321 and 721 crop sizes, the standard ResNet, no OHEM, and the same augmentations), and I added sliding-window evaluation for Cityscapes. All results are reported with DeepLab + ResNet-101, batch size 16 for Pascal and 12 for Cityscapes. For a fair comparison, the auxiliary decoder is used only for CPS, not for the others. Interestingly, I find the CPS loss itself is useless, while the two decoders and the CutMix loss are useful; I discard CutMix for fairness (a sketch of the CPS loss follows the screenshot). The code is copied from their official implementations and adapted into my framework, so there may be mistakes. I will spend more time on these experiments.
[Screenshot 2022-04-30 225734]
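
For context, a minimal sketch of the cross pseudo supervision loss discussed above, assuming two independently initialized networks that produce per-pixel logits on the same unlabeled batch (names are hypothetical):

```python
import torch.nn.functional as F

def cps_loss(logits1, logits2):
    """Each network is supervised by the other's hard pseudo labels."""
    pseudo1 = logits1.argmax(dim=1).detach()  # pseudo labels from net 1
    pseudo2 = logits2.argmax(dim=1).detach()  # pseudo labels from net 2
    return F.cross_entropy(logits1, pseudo2) + F.cross_entropy(logits2, pseudo1)
```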

@Luffy03
Author

Luffy03 commented Apr 30, 2022


I have found some mistakes in my reproduced AEL and U2PL code compared with the official implementations, so the results reported above may not be accurate. Maybe you can reproduce them yourself; I will try to fix mine in one or two weeks...

@LiheYoung
Owner

Thanks for sharing your detailed experiments! In my own experience, I also find the CPS loss fails to bring an obvious improvement.

After all, a fair comparison is necessary, and I am really looking forward to your future results~

@LiheYoung
Owner

LiheYoung commented May 1, 2022

By the way, apart from what you mentioned, AEL and U2PL also adopt the more advanced DeepLabv3+ decoder, and their ResNet output stride is set to 8 instead of 16 in their official code.
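
For reference, with torchvision's ResNet the output stride can be switched via `replace_stride_with_dilation`; a minimal sketch of the two configurations (usage here is illustrative, not copied from either repo):

```python
import torchvision

# Plain ResNet-101 has output stride 32. Dilating the last stage gives
# output stride 16; dilating the last two stages gives output stride 8,
# the setting used in the AEL/U2PL official code.
backbone_os16 = torchvision.models.resnet101(
    replace_stride_with_dilation=[False, False, True])
backbone_os8 = torchvision.models.resnet101(
    replace_stride_with_dilation=[False, True, True])
```

Output stride 8 keeps feature maps twice as large in each spatial dimension, which usually improves segmentation accuracy at a higher memory cost.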
