
AttributeError: module 'torchvision.models' has no attribute 'get_model' #6761

Closed
satpalsr opened this issue Oct 13, 2022 · 7 comments

@satpalsr

🐛 Describe the bug

Reproduce with Colab

I am trying to execute RetinaNet training with

torchrun --nproc_per_node=1 /content/vision/references/detection/train.py\
    --dataset coco --data-path=/content/vision/dataset --model retinanet_resnet50_fpn --epochs 26\
    --lr-steps 16 22 --aspect-ratio-group-factor 3 --lr 0.01 --weights-backbone ResNet50_Weights.IMAGENET1K_V1

but get an AttributeError: module 'torchvision.models' has no attribute 'get_model'.

Complete trace:

| distributed init (rank 0): env://
Namespace(amp=False, aspect_ratio_group_factor=3, batch_size=2, data_augmentation='hflip', data_path='/content/vision/dataset', dataset='coco', device='cuda', dist_backend='nccl', dist_url='env://', distributed=True, epochs=26, gpu=0, lr=0.01, lr_gamma=0.1, lr_scheduler='multisteplr', lr_step_size=8, lr_steps=[16, 22], model='retinanet_resnet50_fpn', momentum=0.9, norm_weight_decay=None, opt='sgd', output_dir='.', print_freq=20, rank=0, resume='', rpn_score_thresh=None, start_epoch=0, sync_bn=False, test_only=False, trainable_backbone_layers=None, use_copypaste=False, use_deterministic_algorithms=False, weight_decay=0.0001, weights=None, weights_backbone='ResNet50_Weights.IMAGENET1K_V1', workers=4, world_size=1)
Loading data
loading annotations into memory...
Done (t=14.51s)
creating index...
index created!
loading annotations into memory...
Done (t=2.34s)
creating index...
index created!
Creating data loaders
Using [0, 0.5, 0.6299605249474366, 0.7937005259840997, 1.0, 1.2599210498948732, 1.5874010519681994, 2.0, inf] as bins for aspect ratio quantization
Count of instances per bin: [  104   982 24236  2332  8225 74466  5763  1158]
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:566: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  cpuset_checked))
Creating model
Traceback (most recent call last):
  File "/content/vision/references/detection/train.py", line 311, in <module>
    main(args)
  File "/content/vision/references/detection/train.py", line 222, in main
    model = torchvision.models.get_model(
AttributeError: module 'torchvision.models' has no attribute 'get_model'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 1179) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 761, in main
    run(args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 755, in run
    )(*cmd_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/content/vision/references/detection/train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2022-10-13_08:26:16
  host      : 291a9c949d94
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 1179)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Versions

PyTorch version: 1.12.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.22.6
Libc version: glibc-2.26

Python version: 3.7.14 (default, Sep  8 2022, 00:06:44)  [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.10.133+-x86_64-with-Ubuntu-18.04-bionic
Is CUDA available: True
CUDA runtime version: 11.2.152
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 460.32.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.1.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.1.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.12.1+cu113
[pip3] torchaudio==0.12.1+cu113
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.13.1
[pip3] torchvision==0.13.1+cu113
[conda] Could not collect
@NicolasHug
Member

Hi @satpalsr
get_model() is only available on the dev branch (main) and in the nightly release of torchvision. It will be included in the upcoming release in a few weeks. Mind you though, it's still marked as Beta right now, so backward compatibility isn't guaranteed.
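As a rough sketch (assuming a torchvision 0.13.x or newer install, and reusing the model and backbone-weights names from the command above), the failing call can be guarded so that it falls back to the per-model builder on stable releases:

import torchvision
from torchvision.models import ResNet50_Weights

if hasattr(torchvision.models, "get_model"):
    # nightly / >= 0.14: the generic builder used by references/detection/train.py
    model = torchvision.models.get_model(
        "retinanet_resnet50_fpn", weights=None, weights_backbone=ResNet50_Weights.IMAGENET1K_V1
    )
else:
    # stable 0.13.x: call the per-model constructor directly
    model = torchvision.models.detection.retinanet_resnet50_fpn(
        weights=None, weights_backbone=ResNet50_Weights.IMAGENET1K_V1
    )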

@satpalsr
Author

satpalsr commented Oct 13, 2022

I cloned the repo and ran python setup.py install to install it. Shouldn't that have fixed it?

@NicolasHug
Member

Yes, installing from source should make get_model() available.
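A quick way to sanity-check which build actually ends up being imported (a sketch; the exact dev version string will differ per commit):

import torchvision

print(torchvision.__version__)                    # a source/nightly build reports a dev version (e.g. 0.14.0a0+<hash>), not 0.13.1
print(hasattr(torchvision.models, "get_model"))   # should print True once the new build is the one being used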

@satpalsr
Author

But I am having trouble building it in Colab.

@NicolasHug
Member

Looking at your logs, the source build is failing. You'll also need to install the nightly version of torch core (check out the instructions in our README). But your best bet is to install the nightly version of torchvision instead of building from source.

@BUGUANLAN

I met the same problem. My torchvision version is 0.2.2; I know that version is too old, so torchvision doesn't have 'get_model'. Have you resolved it? My GPU is very old, so I cannot update my torchvision, and I don't know how to work around it. Can you share your idea?

@datumbox
Contributor

@BUGUANLAN The get_model() method was added in v0.14. If you can't upgrade to the latest version due to hardware constraints, then I think the best option for you would be to fetch the models using the legacy idiom:

torchvision.models.__dict__[model_name](**kwargs)
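For example, with a hypothetical model name and keyword argument (on an old release such as 0.2.2 only the classification models exist and the supported kwargs are limited):

import torchvision

model_name = "resnet18"                                           # illustrative; any name present in torchvision.models
model = torchvision.models.__dict__[model_name](pretrained=True)  # pretrained is the pre-0.13 style keyword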

Given there is nothing to resolve, I'll be closing this issue.
