TypeError: zip argument #1 must support iteration #9

Open

xjtuzhjm opened this issue Aug 8, 2021 · 3 comments
xjtuzhjm commented Aug 8, 2021

At the end of training, when the eval() function is entered for testing, the following error occurs:

return type(out)(map(gather_map, zip(*outputs)))
TypeError: zip argument #1 must support iteration

Here is the detailed output from the training run:

INFO:root:Munch({'batch_size': 32, 'workers': 0, 'nepoch': 100, 'model_name': 'vrcnet', 'load_model': None, 'start_epoch': 0, 'num_points': 2048, 'work_dir': 'log/', 'flag': 'debug', 'loss': 'cd', 'manual_seed': None, 'use_mean_feature': False, 'step_interval_to_print': 500, 'epoch_interval_to_save': 1, 'epoch_interval_to_val': 1, 'varying_constant': '0.01, 0.1, 0.5, 1', 'varying_constant_epochs': '5, 15, 30', 'lr': 0.0001, 'lr_decay': True, 'lr_decay_interval': 40, 'lr_decay_rate': 0.7, 'lr_step_decay_epochs': None, 'lr_step_decay_rates': None, 'lr_clip': 1e-06, 'optimizer': 'Adam', 'weight_decay': 0, 'betas': '0.9, 0.999', 'layers': '1, 1, 1, 1', 'distribution_loss': 'KLD', 'knn_list': '16', 'pk': 10, 'local_folding': True, 'points_label': True, 'num_coarse_raw': 1024, 'num_fps': 2048, 'num_coarse': 2048, 'save_vis': False, 'eval_emd': False})
(62400, 2048, 3)
(2400, 2048, 3) (62400,)
(41600, 2048, 3)
(1600, 2048, 3) (41600,)
INFO:root:Length of train dataset:62400
INFO:root:Length of test dataset:41600
INFO:root:Random Seed: 785
Jitting Chamfer 3D
Loaded JIT 3D CUDA chamfer distance
Loaded JIT 3D CUDA emd
INFO:root:vrcnet_cd_debug_2021-08-08T14:50:26 train [0: 0/1950]  loss_type: cd, fine_loss: 0.183416 total_loss: 4.883119 lr: 0.000100 alpha: 0.01
INFO:root:vrcnet_cd_debug_2021-08-08T14:50:26 train [0: 500/1950]  loss_type: cd, fine_loss: 0.041089 total_loss: 0.644301 lr: 0.000100 alpha: 0.01
INFO:root:vrcnet_cd_debug_2021-08-08T14:50:26 train [0: 1000/1950]  loss_type: cd, fine_loss: 0.039373 total_loss: 0.594741 lr: 0.000100 alpha: 0.01
INFO:root:vrcnet_cd_debug_2021-08-08T14:50:26 train [0: 1500/1950]  loss_type: cd, fine_loss: 0.034346 total_loss: 0.527504 lr: 0.000100 alpha: 0.01
INFO:root:Saving net...
INFO:root:Testing...
Traceback (most recent call last):
  File "/home/zhjp/project/MVP_Benchmark/completion/train.py", line 214, in <module>
    train()
  File "/home/zhjp/project/MVP_Benchmark/completion/train.py", line 153, in train
    val(net, epoch, val_loss_meters, dataloader_test, best_epoch_losses)
  File "/home/zhjp/project/MVP_Benchmark/completion/train.py", line 171, in val
    result_dict = net(inputs, gt, prefix="val")
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
    return self.gather(outputs, self.output_device)
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 62, in gather_map
    for k in out))
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 62, in <genexpr>
    for k in out))
  File "/home/zhjp/miniconda3/envs/mvp/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
TypeError: zip argument #1 must support iteration

Process finished with exit code 1

I am training with four GPUs. I have read online that this error is related to multi-GPU training, but after searching for a long time I still have not found a correct solution.

inputs = inputs.float().cuda()
gt = gt.float().cuda()
inputs = inputs.transpose(2, 1).contiguous()
result_dict = net(inputs, gt, prefix="val")   # this is the line that fails
for k, v in val_loss_meters.items():
    v.update(result_dict[k].mean().item(), curr_batch_size)
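Judging from where the traceback fails, a likely cause (an assumption, since the model's forward is not shown in this thread) is that in "val" mode the forward returns a dict containing plain Python numbers, e.g. values already converted with .item(). DataParallel's gather recurses into the dict and calls zip(*outputs) on each value that is not a tensor, and a float is not iterable. A minimal sketch that reproduces the same error on CPU, with hypothetical keys:

# Sketch only: plain floats in the per-replica output dict break DataParallel's gather.
from torch.nn.parallel.scatter_gather import gather

# Hypothetical outputs, as if returned by two GPU replicas.
replica_outputs = [
    {"cd_p": 0.041, "cd_t": 0.038},  # replica 0
    {"cd_p": 0.043, "cd_t": 0.052},  # replica 1
]

# gather() recurses into the dict; for the float values it falls through to
# zip(*outputs), and zip(0.041, 0.043) raises:
# TypeError: zip argument #1 must support iteration
gather(replica_outputs, 0)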
@Yuchen-Tao

Hello, has this problem been solved? I am facing the same issue when trying to train on four GPUs.

@zhujunli1993

Me too. I encountered the same error when using 2 GPUs for training and validating the model. Does anyone know how to fix it?

@184688164

Perhaps you can try changing "result_dict = net(inputs, gt, prefix="val")" in the train.py file to "result_dict = net.module.forward(inputs, gt, prefix="val")".
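For reference, a hedged sketch of how that suggested change might look inside val() in train.py, based on the snippet posted above (not verified against the repository). Calling the wrapped module directly skips DataParallel's gather step, so the output dict is never passed to scatter_gather.gather, but it also means validation runs on a single GPU:

inputs = inputs.float().cuda()
gt = gt.float().cuda()
inputs = inputs.transpose(2, 1).contiguous()
# Call the underlying module directly so DataParallel does not try to gather the dict.
result_dict = net.module.forward(inputs, gt, prefix="val")
for k, v in val_loss_meters.items():
    v.update(result_dict[k].mean().item(), curr_batch_size)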
