Bug in forward #3

Closed

Haochen-Wang409 opened this issue Oct 13, 2021 · 6 comments

@Haochen-Wang409 commented Oct 13, 2021

Hi, I am trying to run your code and found a bug.

At the line below, the batch size of images_unsup_strong is 1, which BatchNorm2d does not allow when the model is in train mode:
https://github.com/hzhupku/SemiSeg-AEL/blob/main/train.py#L351

The error message is:

File "../../train.py", line 497, in <module>
    main()
  File "../../train.py", line 134, in main
    labeled_epoch, model_teacher, trainloader_unsup, criterion_cons, class_criterion, cutmix_bank)
  File "../../train.py", line 344, in train
    preds_student_unsup = model(images_unsup_strong)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/wanghaochen/semiseg/semseg/models/model_helper.py", line 48, in forward
    pred_head = self.decoder([f1, f2,feat1, feat2])
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/wanghaochen/semiseg/semseg/models/decoder.py", line 54, in forward
    aspp_out = self.aspp(x4)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/wanghaochen/semiseg/semseg/models/base.py", line 46, in forward
    feat1 = F.upsample(self.conv1(x), size=(h, w), mode='bilinear', align_corners=True)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 539, in forward
    bn_training, exponential_average_factor, self.eps)
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/functional.py", line 2147, in batch_norm
    _verify_batch_size(input.size())
  File "/mnt/cache/share/spring/conda_envs/miniconda3/envs/s0.3.4/lib/python3.6/site-packages/torch/nn/functional.py", line 2114, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
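
For reference, here is a minimal, self-contained sketch (not taken from this repo) that triggers the same check with the input shape shown in the traceback:

```python
import torch
import torch.nn as nn

# In train mode, BatchNorm2d computes batch statistics, so it needs more than
# one value per channel. An input of shape (1, 256, 1, 1) has exactly one value
# per channel, which is what raises the ValueError above.
bn = nn.BatchNorm2d(256).train()
x = torch.randn(1, 256, 1, 1)
bn(x)  # ValueError: Expected more than 1 value per channel when training, ...
```
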
@hzhupku (Owner) commented Oct 13, 2021

Did you change the batch size?

@Haochen-Wang409 (Author)

I changed the batch size in config.yaml to 8.

@wqhIris commented Jan 28, 2022

Hi @Haochen-Wang409,
I also ran into this bug. Did you manage to solve it?

@Amos1109

Hi, did you manage to solve this issue?

@Haochen-Wang409 (Author)

Oh, I solved the problem by using multiple GPUs, i.e., launching the training with python -m torch.distributed.launch.
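
For example, something along these lines (a sketch only; the number of processes and whatever arguments train.py expects depend on your setup):

```bash
# --nproc_per_node sets the number of processes (one per GPU).
# The arguments to train.py are omitted here; pass whatever your config requires.
python -m torch.distributed.launch --nproc_per_node=4 train.py
```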

@Amos1109

@Haochen-Wang409 Thank you! Is there a way to fix this without using distributed training?
