Description
Training fails when I run the command below to train the model on my own dataset.
bash ./dist_train.sh configs/mask_rcnn_efficientvit_m4_fpn_1x_coco.py 4 --cfg-options model.backbone.pretrained=./runs/efficientvit_m4.pth
All I have done is convert my dataset to COCO format and download the pretrained checkpoint. The full log is below; the error is raised during the evaluation that runs after the first training epoch:
2023-06-12 20:57:08,304 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
2023-06-12 20:57:08,304 - mmdet - INFO - Checkpoints will be saved to /home/xc/transform/Cream/EfficientViT/downstream/work_dirs/mask_rcnn_efficientvit_m4_fpn_1x_coco by HardDiskBackend.
2023-06-12 20:57:10,480 - mmdet - INFO - Saving checkpoint at 1 epochs
[ ] 0/81, elapsed: 0s, ETA:
Traceback (most recent call last):
File "/home/xc/transform/Cream/EfficientViT/downstream/./train.py", line 245, in <module>
main()
File "/home/xc/transform/Cream/EfficientViT/downstream/./train.py", line 234, in main
train_detector(
File "/home/xc/transform/Cream/EfficientViT/downstream/mmdet_custom/apis/train.py", line 184, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/epoch_based_runner.py", line 58, in train
self.call_hook('after_train_epoch')
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/base_runner.py", line 317, in call_hook
getattr(hook, fn_name)(self)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/runner/hooks/evaluation.py", line 271, in after_train_epoch
self._do_evaluate(runner)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmdet/core/evaluation/eval_hooks.py", line 126, in _do_evaluate
results = multi_gpu_test(
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmdet/apis/test.py", line 109, in multi_gpu_test
result = model(return_loss=False, rescale=True, **data)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1535, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/mmcv/parallel/distributed.py", line 160, in _run_ddp_forward
self._use_replicated_tensor_module else self.module
File "/home/xc/anaconda3/envs/seg/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'MMDistributedDataParallel' object has no attribute '_use_replicated_tensor_module'
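The traceback fails inside mmcv's `_run_ddp_forward`, which reads `self._use_replicated_tensor_module` on the DDP wrapper and does not find it. One possible explanation (an assumption, not something this log confirms) is that the installed mmcv and PyTorch versions do not match each other. A minimal sketch for collecting the relevant versions to attach to this report follows; the list of distribution names and the helper `installed_versions` are chosen here for illustration and are not part of the repository.

```python
from importlib import metadata

# Distribution names assumed to be relevant to this traceback (illustrative list).
PACKAGES = ("torch", "mmcv", "mmcv-full", "mmdet")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same conda environment used for training (`seg` in the paths above) should show which mmcv build is paired with which PyTorch release, which can then be compared against the compatibility notes in the mmcv installation documentation.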