AttributeError: 'MMDistributedDataParallel' object has no attribute '_use_replicated_tensor_module' #179

@xuch98

Something goes wrong when I run the command below to train the model on my own dataset:
bash ./dist_train.sh configs/mask_rcnn_efficientvit_m4_fpn_1x_coco.py 4 --cfg-options model.backbone.pretrained=./runs/efficientvit_m4.pth
All I have done is convert my dataset to COCO format and download the pretrained checkpoint. The error is raised during the evaluation step after the first training epoch. Here is the full log:

2023-06-12 20:57:08,304 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2023-06-12 20:57:08,304 - mmdet - INFO - Checkpoints will be saved to /home/xc/transform/Cream/EfficientViT/downstream/work_dirs/mask_rcnn_efficientvit_m4_fpn_1x_coco by HardDiskBackend.
2023-06-12 20:57:10,480 - mmdet - INFO - Saving checkpoint at 1 epochs
[ ] 0/81, elapsed: 0s, ETA:Traceback (most recent call last):
File "/home/xc/transform/Cream/EfficientViT/downstream/./train.py", line 245, in <module>
main()
File "/home/xc/transform/Cream/EfficientViT/downstream/./train.py", line 234, in main
train_detector(
File "/home/xc/transform/Cream/EfficientViT/downstream/mmdet_custom/apis/train.py", line 184, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 58, in train
self.call_hook('after_train_epoch')
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/base_runner.py", line 317, in call_hook
getattr(hook, fn_name)(self)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/hooks/evaluation.py", line 271, in after_train_epoch
self._do_evaluate(runner)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmdet/core/evaluation/eval_hooks.py", line 126, in _do_evaluate
results = multi_gpu_test(
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmdet/apis/test.py", line 109, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1535, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 160, in _run_ddp_forward
self._use_replicated_tensor_module else self.module
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'MMDistributedDataParallel' object has no attribute '_use_replicated_tensor_module'
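
Reading the last frames of the traceback, mmcv's `_run_ddp_forward` reads `self._use_replicated_tensor_module`, and the installed torch module does not define that attribute, so `__getattr__` raises. My guess (an assumption, not confirmed) is a version mismatch between mmcv and PyTorch. Below is a minimal sketch of that failure mode using stand-in classes; `FakeModule`, `FakeDDP`, `FakeMMDDP`, `_replica`, and the guarded variant are all hypothetical names for illustration, not the real torch or mmcv code and not an official fix.

```python
class FakeModule:
    """Stand-in for a module base class: unknown attributes raise AttributeError."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        raise AttributeError(
            "'{}' object has no attribute '{}'".format(type(self).__name__, name)
        )


class FakeDDP(FakeModule):
    """Stand-in for a DDP wrapper that does NOT define the flag."""

    def __init__(self, module):
        self.module = module


class FakeMMDDP(FakeDDP):
    """Stand-in for a subclass that assumes the parent defines the flag."""

    def run_forward_unguarded(self):
        # Same shape as the failing expression in the traceback:
        # the flag is evaluated first and is missing, so this raises.
        return self._replica if self._use_replicated_tensor_module else self.module

    def run_forward_guarded(self):
        # Hypothetical guard: treat a missing flag as False.
        if getattr(self, "_use_replicated_tensor_module", False):
            return self._replica
        return self.module


wrapper = FakeMMDDP(module="inner-model")

try:
    wrapper.run_forward_unguarded()
    error_message = None
except AttributeError as err:
    error_message = str(err)

guarded_result = wrapper.run_forward_guarded()
print(error_message)
print(guarded_result)
```

The unguarded call fails the same way as in my log, while the guarded call falls back to the wrapped module. Whether patching mmcv like this is safe, or whether matching the mmcv and PyTorch versions is the right fix, is something I have not verified.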
