The epoch_IoU of the retrained refinement network only reaches 0.35 on the DeepGlobe dataset #18

DwRolin · 2022-06-02T16:48:25Z

I tried to retrain the segmentation backbone and the refinement network following the guidelines in the README: https://github.com/VinAIResearch/MagNet#training-backbone-networks.
The best_mIoU of the retrained FPN backbone is 0.6363, which is close to the baseline IoU of 0.6722 shown in the README.

Given this, the retrained refinement network should perform about as well with the retrained backbone as it does with the pretrained backbone.
When retraining the refinement network, the epoch_IoU with the pretrained backbone changed as in the following image,

and the epoch_IoU with the retrained backbone changed as in the following image.

With the retrained backbone, the epoch_IoU only reaches 0.35.
I tried to find the difference between the pretrained backbone and the retrained backbone.
I separated the validation part from backbone/train.py to evaluate the performance of the pretrained backbone: https://github.com/DwRolin/temp_code/blob/main/eval_pretrain.py
Strangely, the MeanIU of the pretrained backbone is only 0.07.
I would like to know what causes this contradiction and how to make the retrained refinement network work well.

DwRolin · 2022-06-02T16:53:33Z

The following is the training log for the FPN backbone.
resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484_2022-05-05-16-57_train.log

hmchuong · 2022-06-05T14:34:44Z

Hi,
Are you working on the DeepGlobe dataset? Can you check that CUDNN.ENABLED is the same in both the backbone training script and your refinement training script? It should be True.
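A minimal sketch of the check suggested above, assuming each script's settings are available as a nested dict with a `CUDNN` section holding `BENCHMARK`, `DETERMINISTIC`, and `ENABLED` keys. The helper name `cudnn_mismatches` and the two example dicts are hypothetical stand-ins, not part of the repository, and how each script actually exposes these settings may differ.

```python
from typing import Any, Dict, List, Tuple


def cudnn_mismatches(
    backbone_cfg: Dict[str, Any],
    refine_cfg: Dict[str, Any],
    section: str = "CUDNN",
) -> List[Tuple[str, Any, Any]]:
    """Return (key, backbone_value, refinement_value) for each differing setting."""
    backbone = backbone_cfg.get(section, {})
    refine = refine_cfg.get(section, {})
    diffs = []
    # Walk the union of keys so a setting missing on one side is also reported
    # (its value shows up as None).
    for key in sorted(set(backbone) | set(refine)):
        if backbone.get(key) != refine.get(key):
            diffs.append((key, backbone.get(key), refine.get(key)))
    return diffs


# Hypothetical settings, as they might look after parsing each script's config.
backbone_cfg = {"CUDNN": {"BENCHMARK": True, "DETERMINISTIC": False, "ENABLED": True}}
refine_cfg = {"CUDNN": {"BENCHMARK": True, "DETERMINISTIC": False, "ENABLED": False}}

mismatches = cudnn_mismatches(backbone_cfg, refine_cfg)
for key, backbone_value, refine_value in mismatches:
    print(f"CUDNN.{key}: backbone={backbone_value}, refinement={refine_value}")
```

An empty result means the two scripts agree on every cuDNN setting; any reported key is a candidate cause for a mismatch between backbone and refinement results.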

DwRolin · 2022-06-06T07:40:11Z

Thank you for your thoughtful reply!
I set CUDNN.ENABLED to true, and the issue is resolved.
I cloned this code about six months ago; at that time, CUDNN.ENABLED was set to false.
I am also curious why CUDNN.ENABLED is set to false in hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml, while it is set to true in resnet_fpn_train_612x612_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml.

hmchuong · 2022-06-07T01:20:03Z

Hi,

The reason is that BatchNorm2d behavior depends on cuDNN.

hmchuong closed this as completed Jun 7, 2022
