
segmentation training problems #20

Closed
Eli-YiLi opened this issue Nov 24, 2020 · 5 comments

@Eli-YiLi

  1. It seems that you use the train set to train the segmentation model. Why not use trainaug?
  2. Following the setting in Training the segmentation code #11, my results are 61.5 mIoU training with trainaug and 56.7 with train. Why does it differ so much from the results in the paper? (Note that the weights are from ilsvrc-cls_rna-a1_cls1000_ep-0001.params, and the test resolution is (1024*512) * [0.5, 0.75, 1.0, 1.25, 1.5, 1.75].)
  3. Why does performance drop after applying CRF in the RW step?
@YudeWang
Owner

  1. `parser.add_argument("--train_list", default="voc12/train_aug.txt", type=str)`
  2. The setting in Training the segmentation code #11 is for the retrain step, which is not included in this repository. There are three steps: SEAM, RW, and retrain. Also, I wonder why the test resolution is 1024x512; the images in PASCAL VOC are much smaller than that.
  3. CRF does not always improve performance, especially when the prediction is not good enough.
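For readers following the multi-scale flip testing discussed in this thread: it averages class-score maps over several scales and horizontal flips. A minimal numpy sketch, assuming a hypothetical `model_fn` that maps an HxWx3 image to an HxWxC score map (nearest-neighbor resize stands in for proper bilinear interpolation):

```python
import numpy as np

def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbor resize of an HxWx... array (stand-in for bilinear)."""
    h, w = arr.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return arr[rows][:, cols]

def multiscale_flip_predict(model_fn, img,
                            ratios=(0.5, 0.75, 1.0, 1.25, 1.5, 1.75)):
    """Average class-score maps over scales and horizontal flips.

    model_fn: hypothetical callable mapping HxWx3 image -> HxWxC scores.
    """
    h, w = img.shape[:2]
    acc = None
    for r in ratios:
        scaled = resize_nearest(img, max(1, int(h * r)), max(1, int(w * r)))
        for flip in (False, True):
            x = scaled[:, ::-1] if flip else scaled
            score = model_fn(x)
            if flip:
                score = score[:, ::-1]       # flip scores back to original layout
            score = resize_nearest(score, h, w)
            acc = score if acc is None else acc + score
    return acc / (2 * len(ratios))
```

This mirrors what `MultiScaleFlipAug` in the config below drives, but the name `model_fn` and the resize helper are illustrative, not mmsegmentation API.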

@Eli-YiLi
Author

Thanks for your reply.

  1. In the retrain phase, which dataset do you use, train or trainaug?
  2. I tried some stronger segmentation algorithms such as PSPNet and DeepLabv3 in mmsegmentation, with ResNet-101 and WideResNet-38 backbones, but the highest mIoU is 62.5 (PSPNet with ResNet-101). Their img_scale is (2048, 512).

@Eli-YiLi
Author

Specifically, the data part of the config is as follows:

```python
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=True,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
```

The fully supervised mIoU on VOC12 is 79.95, so it should be better than DeepLabv1.

Could you please give me some suggestions? Thanks.
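As an aside on the mIoU numbers quoted in this thread: the standard metric is mean intersection-over-union accumulated in a confusion matrix, skipping the ignore label 255 (the `seg_pad_val` in the config above). A minimal numpy sketch:

```python
import numpy as np

def miou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU from flat integer label arrays; ignore_index pixels are skipped."""
    mask = gt != ignore_index
    pred, gt = pred[mask], gt[mask]
    # Confusion matrix: rows = ground truth class, cols = predicted class.
    conf = np.bincount(gt * num_classes + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    iou = tp / np.maximum(union, 1)
    # Average only over classes that actually appear.
    return iou[union > 0].mean()
```

This is the textbook definition, not the exact evaluation script used by mmsegmentation or this repository.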

@YudeWang
Owner

YudeWang commented Nov 24, 2020

@Eli-YiLi

  1. I use the trainaug set in the retrain step.
  2. A 448x448 or 513x513 randomly cropped patch is enough for VOC images. Because the pseudo labels are not good enough, advanced models like PSPNet/DeepLabv3/v3+ will overfit to these low-quality pseudo labels, leading to performance degradation. I use DeepLabv1 for retraining, and the setting is given in Training the segmentation code #11.
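The pad-then-random-crop step described above (padded label pixels set to 255 so the loss ignores them, matching `seg_pad_val` in the config posted earlier) can be sketched in numpy; the function name and signature here are illustrative, not the repository's code:

```python
import numpy as np

def random_crop(img, label, crop_size=448, pad_val=0, seg_pad_val=255, rng=None):
    """Pad image/label up to crop_size if needed, then take a random crop.

    Padded label pixels get seg_pad_val (255) so the loss ignores them.
    """
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    pad_h, pad_w = max(0, crop_size - h), max(0, crop_size - w)
    if pad_h or pad_w:
        img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)),
                     constant_values=pad_val)
        label = np.pad(label, ((0, pad_h), (0, pad_w)),
                       constant_values=seg_pad_val)
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    return (img[top:top + crop_size, left:left + crop_size],
            label[top:top + crop_size, left:left + crop_size])
```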

@Eli-YiLi
Author

Thanks a lot.
