'DistributedDataParallel' object has no attribute 'model' #90

Closed
aixin200005 opened this issue Jun 16, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@aixin200005

🐛 Bug

Hello, I trained on a custom dataset with 7 classes, following the Train Custom Data tutorial. However, the following error occurred:
Traceback (most recent call last):
  File "train.py", line 400, in <module>
    train(hyp)
  File "train.py", line 203, in train
    check_best_possible_recall(dataset, anchors=model.model[-1].anchor_grid, thr=hyp['anchor_t'])
  File "/home/aixin/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __getattr__
    type(self).__name__, name))
AttributeError: 'DistributedDataParallel' object has no attribute 'model'

Environment

  • OS: [Ubuntu18.04]
  • GPU [NVIDIA TITAN Xp * 4]
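
Note for readers who hit the same traceback: PyTorch's `DataParallel` and `DistributedDataParallel` wrappers keep the wrapped network in a `.module` attribute, so attributes of the original model are not reachable directly on the wrapper. The following is a minimal sketch of that behaviour. It uses plain-Python stand-in classes (an assumption made so it runs without torch or GPUs), not the real torch classes or the repository's `Model`.

```python
class Model:
    """Stand-in for the detection model; `model` mimics its list of layers."""
    def __init__(self):
        self.model = ["backbone", "detect_layer"]


class DistributedDataParallel:
    """Stand-in wrapper that, like the torch class, keeps the network in `.module`."""
    def __init__(self, module):
        self.module = module


wrapped = DistributedDataParallel(Model())

try:
    wrapped.model  # same access pattern as the failing line in train.py
except AttributeError as err:
    message = str(err)

# Reaching through the wrapper first gives access to the layers.
last_layer = wrapped.module.model[-1]
```

Here `message` carries the same "has no attribute 'model'" text as the traceback above, while `wrapped.module.model[-1]` succeeds.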
@aixin200005 aixin200005 added the bug Something isn't working label Jun 16, 2020
@github-actions
Contributor

github-actions bot commented Jun 16, 2020

Hello @aixin200005, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue; otherwise we cannot help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@ChenYingpeng

@aixin200005 You can change the anchor check in train.py to the code below.

```python
if device.type != 'cpu' and torch.cuda.device_count() > 1 and torch.distributed.is_available():
    check_best_possible_recall(dataset, anchors=model.module.model[-1].anchor_grid, thr=hyp['anchor_t'])
else:
    check_best_possible_recall(dataset, anchors=model.model[-1].anchor_grid, thr=hyp['anchor_t'])
```

@aixin200005
Author

@ChenYingpeng
Thank you for your suggestion, but I still get an error after making the change you suggested:
Traceback (most recent call last):
  File "train_aixin.py", line 400, in <module>
    train(hyp)
  File "train_aixin.py", line 195, in train
    check_best_possible_recall(dataset, anchors=model.module.model[-1].anchor_grid, thr=hyp['anchor_t'])
  File "/home/aixin/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __getattr__
    type(self).__name__, name))
AttributeError: 'Model' object has no attribute 'module'

@glenn-jocher
Member

@ChenYingpeng @aixin200005 Yes, I see. This is caused by the anchor check trying to pull anchors from a single-GPU model during training. I'll see what I can do to fix this. Thank you for the suggestion, @ChenYingpeng.
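
One way to make the anchor check independent of how the model is wrapped is to unwrap it first and decide from the object itself rather than from the GPU count. The sketch below is an illustration under assumptions and is not necessarily what the fix does: `unwrap_model` is a hypothetical helper name, and the two classes are plain-Python stand-ins so the example runs without torch or GPUs.

```python
class Model:
    """Stand-in for the detection model; `model` mimics its list of layers."""
    def __init__(self):
        self.model = ["backbone", "detect_layer"]


class DistributedDataParallel:
    """Stand-in wrapper that, like the torch class, keeps the network in `.module`."""
    def __init__(self, module):
        self.module = module


def unwrap_model(model):
    """Hypothetical helper: return the wrapped network, or `model` if it is not wrapped."""
    # Decide from the object itself, not from the GPU count, so the branch
    # always matches how the model was actually constructed.
    if type(model).__name__ in ("DataParallel", "DistributedDataParallel"):
        return model.module
    return model


plain = Model()
wrapped = DistributedDataParallel(plain)

# The same expression works for both the plain and the wrapped model.
plain_last = unwrap_model(plain).model[-1]
wrapped_last = unwrap_model(wrapped).model[-1]
```

With real torch models, an `isinstance` check against `torch.nn.DataParallel` and `torch.nn.parallel.DistributedDataParallel` would likely be more robust than comparing class names.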

@glenn-jocher
Member

This should now be resolved in the latest commit. Please git pull and try again.
