KeyError: 'module_list.85.Conv2d.weight' #657

Samjith888 · 2019-11-25T12:08:33Z

Got the following error:

$ python train.py --data data/coco.data --cfg cfg/yolov3.cfg
Namespace(accumulate=2, adam=False, arc='default', batch_size=32, bucket='', cache_images=False, cfg='cfg/yolov3.cfg', data='data/coco.data', device='', epochs=273, evolve=False, img_size=416, img_weights=False, multi_scale=False, name='', nosave=False, notest=False, prebias=False, rect=False, resume=False, transfer=False, var=None, weights='weights/ultralytics49.pt')
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1070', total_memory=8116MB)

Traceback (most recent call last):
  File "train.py", line 444, in <module>
    train()  # train normally
  File "train.py", line 111, in train
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
  File "train.py", line 111, in <dictcomp>
    chkpt['model'] = {k: v for k, v in chkpt['model'].items() if model.state_dict()[k].numel() == v.numel()}
KeyError: 'module_list.85.Conv2d.weight'
(base)

glenn-jocher · 2019-11-25T20:51:49Z

@Samjith888 your command automatically loads the ultralytics49.pt backbone, which requires yolov3-spp.cfg. You must remove the backbone by using --weights '', or specify a weights-cfg combination that is compatible.

This error is caused by a user supplying incompatible --weights and --cfg arguments. To solve this you must specify no weights (i.e. random initialization of the model) using --weights '' and any --cfg, or use a --cfg that is compatible with your --weights. If none are specified, the defaults are --weights ultralytics49.pt and --cfg cfg/yolov3-spp.cfg.
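The KeyError in the traceback comes from indexing `model.state_dict()[k]` with a checkpoint key that the model built from the cfg does not contain. Below is a minimal sketch of a guarded version of that filter. `FakeTensor`, `filter_compatible`, and the example keys and sizes are hypothetical stand-ins (plain Python objects instead of torch tensors), not code from this repo.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a torch tensor; only numel() is modelled."""
    size: int

    def numel(self) -> int:
        return self.size


def filter_compatible(checkpoint_sd, model_sd):
    """Keep checkpoint entries whose key exists in the model with the same element count."""
    return {
        k: v
        for k, v in checkpoint_sd.items()
        if k in model_sd and model_sd[k].numel() == v.numel()
    }


# Made-up sizes for illustration: the checkpoint has a layer the model lacks.
checkpoint_sd = {
    "module_list.0.Conv2d.weight": FakeTensor(864),
    "module_list.85.Conv2d.weight": FakeTensor(131072),
}
model_sd = {"module_list.0.Conv2d.weight": FakeTensor(864)}

kept = filter_compatible(checkpoint_sd, model_sd)
print(sorted(kept))  # ['module_list.0.Conv2d.weight']
```

Skipping unmatched keys avoids the exception but also loads fewer pretrained layers, so passing a matching --weights and --cfg pair remains the actual fix.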

Compatible --weights --cfg combinations:

python3 train.py --weights yolov3.pt --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3.weights --cfg cfg/yolov3.cfg
python3 train.py --weights yolov3-spp.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights ultralytics49.pt --cfg cfg/yolov3-spp.cfg
python3 train.py --weights ultralytics68.pt --cfg cfg/yolov3-spp.cfg
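The pairs listed above can be turned into a quick pre-flight check. This is only a sketch: `COMPATIBLE_CFG` and `check_pair` are hypothetical names, the table contains just the combinations from this comment, and any weights file not in the table is reported as unverified rather than incompatible.

```python
import os

# Pairs copied from the list above: weights file name -> cfg it goes with.
COMPATIBLE_CFG = {
    "yolov3.pt": "cfg/yolov3.cfg",
    "yolov3.weights": "cfg/yolov3.cfg",
    "yolov3-spp.pt": "cfg/yolov3-spp.cfg",
    "ultralytics49.pt": "cfg/yolov3-spp.cfg",
    "ultralytics68.pt": "cfg/yolov3-spp.cfg",
}


def check_pair(weights: str, cfg: str) -> bool:
    """Return True if the pair is known to be compatible.

    Empty weights (random initialization) work with any cfg.
    Weights files missing from the table return False, meaning "not verified".
    """
    if not weights:
        return True
    expected = COMPATIBLE_CFG.get(os.path.basename(weights))
    return expected == cfg


# The command from the original report: default weights with yolov3.cfg.
print(check_pair("weights/ultralytics49.pt", "cfg/yolov3.cfg"))      # False
print(check_pair("weights/ultralytics49.pt", "cfg/yolov3-spp.cfg"))  # True
print(check_pair("", "cfg/yolov3.cfg"))                              # True
```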

To train from scratch (randomly initialized weights), use:

python3 train.py --weights '' --cfg cfg/*.cfg  # any cfg will work here

ultralytics49.pt is currently the highest performing YOLOv3 model (trained from scratch using this repo) available at the default img-size of 416 (see #310), which is the reason it is used as the default backbone.

hanrui15765510320 · 2019-11-30T11:26:41Z

If I don't want to use pretrained weights, what should I do?

okanlv · 2019-11-30T11:32:13Z

As @glenn-jocher said,

You must remove the backbone by using --weights ''

hanrui15765510320 · 2019-12-02T12:35:35Z

Thanks, bro.

daddydrac · 2019-12-02T22:48:01Z

I ran this: python3 train.py --data data/custom.data --cfg cfg/yolov3-spp-r.cfg

And got:

AssertionError: Target classes exceed model classes

What am I missing?

glenn-jocher · 2020-01-16T19:10:01Z

I'll close this issue for now as the original issue appears to have been resolved, and/or no activity has been seen for some time. Feel free to comment if this is not the case.

rohan-pradhan · 2020-01-22T23:48:01Z

Hi guys,

I'm trying to train on a custom CFG (therefore should be using a random initialization of weights). I understand that to do this we should set --weights ''

Unfortunately, even when I do that, it keeps trying to download the weights and I get this error:
Exception: '' missing, try downloading from https://drive.google.com/open?id=1LezFG5g3BCW6iYaV89B2i64cqEUZD7e0

This is the full command I am using to train:
python train.py --weights '' --cfg cfg/yolov3-custom.cfg --data data/coco1.data

Any help would be great - thanks!

glenn-jocher · 2020-01-23T06:33:39Z

@rohan-pradhan no space: --weights ''

$ python3 train.py --weights '' --data coco16.data

Namespace(accumulate=4, adam=False, arc='default', batch_size=16, bucket='', cache_images=False, cfg='cfg/yolov3-spp.cfg', data='coco16.data', device='', epochs=273, evolve=False, img_size=[416], multi_scale=False, name='', nosave=False, notest=False, rect=False, resume=False, single_cls=False, var=None, weights='')
Using CPU

Caching labels (16 found, 0 missing, 0 empty, 0 duplicate, for 16 images): 100%|█████████████████████████████| 16/16 [00:00<00:00, 2515.70it/s]
Caching labels (16 found, 0 missing, 0 empty, 0 duplicate, for 16 images): 100%|█████████████████████████████| 16/16 [00:00<00:00, 5567.35it/s]
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Using 8 dataloader workers
Starting training for 273 epochs...

     Epoch   gpu_mem      GIoU       obj       cls     total   targets  img_size
     0/272        0G       7.7      13.3      7.87      28.9       211       416: 100%|██████████████████████████| 1/1 [01:05<00:00, 65.12s/it]
               Class    Images   Targets         P         R   mAP@0.5        F1:   0%|                                  | 0/1 [00:00<?, ?it/s]

rohan-pradhan · 2020-01-23T16:07:16Z

Thanks for the quick response, Glenn. Unfortunately, even when I copy and paste your command it still gives the same error.

>python train.py --weights '' --data coco1.data
Namespace(accumulate=4, adam=False, arc='default', batch_size=16, bucket='', cache_images=False, cfg='cfg/yolov3-spp.cfg', data='coco1.data', device='', epochs=273, evolve=False, img_size=416, img_weights=False, multi_scale=False, name='', nosave=False, notest=False, prebias=False, rect=False, resume=False, transfer=False, var=None, weights="''")
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11264MB)

2020-01-23 11:02:59.119516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Downloading https://pjreddie.com/media/files/''
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404 Not Found
'rm' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
  File "train.py", line 463, in <module>
    train()  # train normally
  File "train.py", line 108, in train
    attempt_download(weights)
  File "C:\Users\Rohan\Documents\Development\Thesis\yolov3\models.py", line 454, in attempt_download
    raise Exception(msg)
Exception: '' missing, try downloading from https://drive.google.com/open?id=1LezFG5g3BCW6iYaV89B2i64cqEUZD7e0

Not sure why it is treating '' as a string.

rohan-pradhan · 2020-01-23T16:08:47Z

Figured it out! Changed it to --weights "" and it seemed to work.

Thanks again!

glenn-jocher · 2020-01-23T17:01:55Z

@rohan-pradhan ah interesting. What's your OS?

rohan-pradhan · 2020-01-23T17:05:38Z

@glenn-jocher I'm running Windows 10 in a Conda environment (Anaconda Prompt).

glenn-jocher · 2020-01-23T19:59:44Z

@rohan-pradhan hmm ok. Perhaps it's Windows.
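The Namespace output in the failing run shows weights="''", meaning the two quote characters reached the script as literal text. That matches how cmd.exe behaves: it does not treat single quotes as quoting characters, while POSIX shells strip them before the program sees the argument. Below is a minimal sketch of normalizing the argument inside the script instead of relying on the shell. `normalize_weights` is a hypothetical helper, not part of train.py.

```python
import argparse


def normalize_weights(value: str) -> str:
    """Strip surrounding quote characters that the shell passed through literally."""
    return value.strip().strip("'\"")


parser = argparse.ArgumentParser()
parser.add_argument("--weights", type=normalize_weights, default="")

# What cmd.exe hands to Python for: --weights ''
windows_args = parser.parse_args(["--weights", "''"])
# What a POSIX shell hands to Python for the same command line
posix_args = parser.parse_args(["--weights", ""])

print(repr(windows_args.weights), repr(posix_args.weights))  # '' ''
```

With this in place, both `--weights ''` and `--weights ""` would resolve to an empty string on either platform, and ordinary paths pass through unchanged.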

sunset326 · 2020-10-02T07:26:05Z

Hi guys,
when I run any of the following commands:

python train.py --data data/rbc.data --cfg cfg/yolov3.cfg --weights ""
python train.py --data data/rbc.data --cfg cfg/yolov3.cfg --weights ''
python train.py --data data/rbc.data --cfg cfg/yolov3.cfg --weights weights/yolov3.pt
python train.py --data data/rbc.data --cfg cfg/yolov3.cfg --weights weights/yolov3.weights

the same error occurs, as shown below.
My PyTorch version is 1.5.1 with torchvision 0.6.0.

Traceback (most recent call last):
  File "train.py", line 431, in <module>
    train(hyp)  # train normally
  File "train.py", line 164, in train
    model, optimizer = amp.initialize(model, optimizer, opt_level='O1', verbosity=0)
  File "/home/anaconda2/envs/Maskrcnn_Benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/frontend.py", line 339, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/home/anaconda2/envs/Maskrcnn_Benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/_initialize.py", line 228, in _initialize
    handle = amp_init(loss_scale=properties.loss_scale, verbose=(_amp_state.verbosity == 2))
  File "/home/anaconda2/envs/Maskrcnn_Benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/amp.py", line 101, in init
    try_caching, verbose)
  File "/home/anaconda2/envs/Maskrcnn_Benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/wrap.py", line 33, in cached_cast
    if not utils.has_func(mod, fn):
  File "/home/anaconda2/envs/Maskrcnn_Benchmark/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/utils.py", line 132, in has_func
    if isinstance(mod, torch.nn.backends.backend.FunctionBackend):
AttributeError: module 'torch.nn' has no attribute 'backends'

glenn-jocher · 2020-10-04T15:20:58Z

@sunset326 update torch to latest version.

sunset326 · 2020-10-05T02:32:57Z

@sunset326 update torch to latest version.

Thanks, brother.
I have solved the problem: requirements.txt says Python >= 3.7, so I updated my Python and the problem no longer occurs.
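The fix reported here was moving to a Python version that meets the stated requirement (>= 3.7). Below is a small sketch of checking the interpreter version up front. `MIN_PYTHON` and `python_ok` are hypothetical names, and the 3.7 floor is taken from this comment rather than verified against the repo.

```python
import sys

# Minimum version as reported in this thread (assumption, not checked against the repo).
MIN_PYTHON = (3, 7)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# The traceback above came from a python3.6 environment.
print(python_ok((3, 6, 9)))  # False
print(python_ok((3, 7, 0)))  # True

if not python_ok():
    print("Warning: Python %d.%d or newer is expected" % MIN_PYTHON)
```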

glenn-jocher · 2024-01-28T14:34:58Z

@sunset326 Great to hear that updating Python resolved the issue! If you have any more questions or run into further issues, feel free to ask. Happy training! 🚀

Samjith888 added the bug Something isn't working label Nov 25, 2019

This was referenced Nov 25, 2019

KeyError: 'module_list.85.Conv2d.weight' #650

Closed

Error when using last.pt for object detection after custom training #500

Closed

glenn-jocher added question Further information is requested and removed bug Something isn't working labels Nov 25, 2019

YMkai mentioned this issue Dec 23, 2019

If a pre-trained model nessary? #739

Closed

glenn-jocher closed this as completed Jan 16, 2020

bzburr mentioned this issue Apr 23, 2020

resume error #1089

Closed

anhnktp mentioned this issue Apr 25, 2020

Pretrained yolov3-spp-ultralytics.pt is not compatible with cfg/yolov3-asff.cfg #1097

Closed

kuangbo mentioned this issue Aug 13, 2020

How to add the SE attention module to Yolov3-spp-ultralytics？ #1451

Closed

jacklee-scau mentioned this issue Dec 30, 2020

FasterRCNN 训练错误 WZMIAOMIAO/deep-learning-for-image-processing#111

Closed
