Key mismatch while loading the model? #28

Jumabek · 2018-05-31T07:29:53Z

I am having an issue loading the trained checkpoint into the FPNSSD512 model.
How can I fix it?

RuntimeError: Error(s) in loading state_dict for FPNSSD512:
	Missing key(s) in state_dict: "fpn.conv1.weight", "fpn.bn1.running_var", "fpn.bn1.bias", "fpn.bn1.running_mean", "fpn.bn1.weight", "fpn.layer1.0.conv1.weight", "fpn.layer1.0.bn1.running_var", "fpn.layer1.0.bn1.bias", "fpn.layer1.0.bn1.running_mean", "fpn.layer1.0.bn1.weight", "fpn.layer1.0.conv2.weight", "fpn.layer1.0.bn2.running_var", "fpn.layer1.0.bn2.bias", 

        Unexpected key(s) in state_dict: "module.fpn.conv1.weight", "module.fpn.bn1.weight", "module.fpn.bn1.bias", "module.fpn.bn1.running_mean", "module.fpn.bn1.running_var", "module.fpn.layer1.0.conv1.weight"


Jumabek · 2018-05-31T08:36:12Z

Adding the following code before loading the checkpoint solved the issue:

if device == 'cuda':
    net = torch.nn.DataParallel(net)
    cudnn.benchmark = True

ahkarami · 2018-06-23T10:49:50Z

Dear @Jumabek,
I also have the issue you reported.
My script is something like this:

import torch
import torch.backends.cudnn as cudnn
from models.fpnssd.net import FPNSSD512


# Print the PyTorch Version:
print(torch.__version__)  # 0.4.0


# *************** Parameters **************** #
# Check use GPU or not
use_gpu = torch.cuda.is_available()  # use GPU
if use_gpu:
    device = torch.device("cuda:0")  
else:
    device = torch.device("cpu")


# ** Loading Pre-Trained Weights:
net = FPNSSD512(num_classes=20).to(device)
net = torch.nn.DataParallel(net)
cudnn.benchmark = True
# download pre-trained weights from:
# https://drive.google.com/open?id=1yy_kUnm_hZR3uk9yLcaQSMwxVn7wApTU
net.load_state_dict(torch.load('./fpnssd512_20_trained.pth'))
net.eval()

However, I get the error you reported. Would you please help me address this issue?

ahkarami · 2018-06-26T05:58:35Z

Dear @kuangliu,
Would you please answer my above question?

Jumabek · 2018-07-03T13:11:52Z

@ahkarami sorry for the late reply.
I do not fully understand the issue, but can you run the code below?
I added net = torch.nn.DataParallel(net) after loading the model.

import torch
import torch.backends.cudnn as cudnn
from models.fpnssd.net import FPNSSD512


# Print the PyTorch Version:
print(torch.__version__)  # 0.4.0


# *************** Parameters **************** #
# Check use GPU or not
use_gpu = torch.cuda.is_available()  # use GPU
if use_gpu:
    device = torch.device("cuda:0")  
else:
    device = torch.device("cpu")


# ** Loading Pre-Trained Weights:
net = FPNSSD512(num_classes=20).to(device)
cudnn.benchmark = True
# download pre-trained weights from:
# https://drive.google.com/open?id=1yy_kUnm_hZR3uk9yLcaQSMwxVn7wApTU
net.load_state_dict(torch.load('./fpnssd512_20_trained.pth'))
net = torch.nn.DataParallel(net)
net.eval()

ahkarami · 2018-07-04T06:13:00Z

Dear @Jumabek,
Thank you for your reply, and sorry for the inconvenience. I tested your recommended script, but unfortunately the error remains. The error is:

Traceback (most recent call last):
  File "/home/user/TorchCV/Attempt1.py", line 54, in <module>
    net.load_state_dict(torch.load('./fpnssd512_20_trained.pth'))
  File "/opt/pytorch4/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FPNSSD512:
	Missing key(s) in state_dict: "fpn.conv1.weight", "fpn.bn1.running_mean", "fpn.bn1.running_var", ...
	Unexpected key(s) in state_dict: "extractor.conv1.weight", "extractor.bn1.weight", "extractor.bn1.bias", ....

Process finished with exit code 1

It is worth noting that I tested the above code on a system which has just one GTX 1080 Ti GPU (with CUDA 9.0 & cuDNN 7).

dearleiii · 2018-07-20T18:39:14Z

Hi, I followed your code and it seems to have helped me solve the unexpected key issue,
but I'm wondering why the issue occurs in the first place.
Why does DataParallel help to solve it?

zacario-li · 2018-08-14T03:20:13Z

@ahkarami I have run into the same issue as you. Have you fixed it yet?

ahkarami · 2018-08-14T06:41:17Z

Dear @zacario-li,
Unfortunately, I couldn't resolve the issue. I can train and test the model on my own GPU (i.e., my trained models load correctly), but the released pre-trained model has the above issue. I think the problem is that the pre-trained model was trained on a multi-GPU machine, whereas we now want to use it on a machine with just one GPU. In that case, wrapping the model with torch.nn.DataParallel(net) should fix the problem, but as we saw, it does not!
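For the multi-GPU case described above, one common alternative to wrapping the model is to rename the checkpoint keys before loading. The following is a minimal sketch, assuming the checkpoint is a plain mapping from key names to tensors. `strip_prefix` is a hypothetical helper (not part of TorchCV or PyTorch), and the integer values are placeholders for tensors. It only addresses a `module.` prefix; it would not resolve the `fpn.` versus `extractor.` mismatch shown in the traceback earlier in this thread.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Placeholder values stand in for tensors; key names come from the error message.
checkpoint = OrderedDict([
    ("module.fpn.conv1.weight", 0),
    ("module.fpn.bn1.weight", 1),
])
cleaned = strip_prefix(checkpoint)
print(list(cleaned))
```

With a real checkpoint, the cleaned mapping could then be passed to load_state_dict on the unwrapped model.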

root-master · 2018-10-10T05:22:04Z

If you want to load the weights after wrapping the model in DataParallel, use:
net.module.load_state_dict(pretrained_weights)
If you want to load the weights before wrapping the model in DataParallel, use:
net.load_state_dict(pretrained_weights)
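The two cases above can be reproduced with a tiny stand-in model. This is a minimal sketch, assuming a reasonably recent PyTorch: `nn.Linear` stands in for FPNSSD512, and the weights are created in memory rather than read from a checkpoint file. It shows that DataParallel keeps the wrapped model in its `module` attribute, which is why the wrapper's keys gain a `module.` prefix.

```python
import torch
import torch.nn as nn

# Small stand-in for the real network (assumption: any nn.Module behaves the same here).
net = nn.Linear(4, 2)
pretrained_weights = nn.Linear(4, 2).state_dict()  # keys: "weight", "bias"

# Before wrapping: the keys match the bare model, so load directly.
net.load_state_dict(pretrained_weights)

# After wrapping: the wrapper's own keys are prefixed with "module.".
wrapped = nn.DataParallel(net)
print(list(wrapped.state_dict()))

# Unprefixed weights therefore have to go through the inner model.
wrapped.module.load_state_dict(pretrained_weights)

# A state dict taken from the wrapper (prefixed keys) loads into the wrapper as-is.
wrapped.load_state_dict(wrapped.state_dict())
```

Loading prefixed keys into the bare model, or unprefixed keys into the wrapper, produces the missing/unexpected key error from the first post.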

silkylove · 2018-10-17T15:15:11Z

Dear @ahkarami ,
I think the pretrained fpnssd model provided by @kuangliu does not match /models/fpnssd/net.py. Actually, he said that he just replaced vgg16 with fpn50 in ssd512, which is /models/ssd/net.py. So you cannot use the model created by /models/fpnssd/net.py to load weights saved from /models/ssd/net.py, because the keys do not match.
The solution to use his provided pretrained model is to train his ssd512 model with fpn50, not the fpnssd512 model in /models/fpnssd/net.py.
Also, it seems that he did not put all of his examples on this GitHub repo, or he deleted something before pushing.
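A mismatch like this can be spotted by comparing key names before calling load_state_dict. Here is a rough sketch, assuming both key lists are available (for example from the model's state_dict() and from the loaded checkpoint). `diff_keys` and `top_level_names` are hypothetical helpers, and the short key lists are copied from the error message earlier in this thread.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring load_state_dict's report."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)
    unexpected = sorted(ckpt_set - model_set)
    return missing, unexpected


def top_level_names(keys):
    """Collect the first dotted component of each key, e.g. 'fpn' or 'extractor'."""
    return sorted({key.split(".", 1)[0] for key in keys})


# Key names taken from the traceback in this thread.
model_keys = ["fpn.conv1.weight", "fpn.bn1.running_mean", "fpn.bn1.running_var"]
checkpoint_keys = ["extractor.conv1.weight", "extractor.bn1.weight", "extractor.bn1.bias"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print(top_level_names(missing))     # ['fpn']
print(top_level_names(unexpected))  # ['extractor']
```

Different top-level names on the two sides suggest a different architecture, whereas a lone `module` among the unexpected names would suggest a DataParallel prefix instead.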

ahkarami · 2018-10-17T18:09:14Z

Dear @silkylove,
Thank you very much for your useful information. Could you load & use his pre-trained network?
If yes, would you please release its loading code?

silkylove · 2018-10-18T03:42:51Z

Dear @ahkarami ,
OK, I will release the code once I get performance similar to his pretrained fpnssd512 model.

ahkarami · 2018-10-18T05:58:36Z

Thank you very much @silkylove.

silkylove · 2018-10-20T03:19:14Z

@ahkarami
Please check my code.
https://github.com/silkylove/ObjectDetection/tree/master/example/fpnssd
I also uploaded the training log for Adam with 100 epochs, which reaches 73.95 mAP so far. I am now training with SGD for 200 epochs, which I think will reach a higher mAP; I will release that training log later.
Also, you can uncomment this line in eval.py https://github.com/silkylove/ObjectDetection/blob/master/example/fpnssd/eval.py#L25 to get his pretrained model's performance (about 56 mAP). Make sure not to use DataParallel.

ahkarami · 2018-10-20T16:27:03Z

Dear @silkylove,
Thank you very much for your time. Your implemented and modified code is really valuable. It would also be great if you uploaded your pre-trained model (e.g., to Google Drive).

silkylove · 2018-10-21T03:29:31Z

@ahkarami
I uploaded the SGD training and eval logs. With SGD, I can only get around 76% mAP for now. The pretrained model is here.

ahkarami · 2018-10-21T19:25:08Z

@silkylove,
Thank you very much.

Jumabek closed this as completed May 31, 2018

Jumabek reopened this Jul 3, 2018

dearleiii mentioned this issue Jul 20, 2018

RuntimeError: Error(s) in loading state_dict for APXM_conv3: Missing key(s) in state_dict: "main.0.bias", dearleiii/PIRM-2018-SISR-Challenge#22

Open

epeterson12 mentioned this issue Oct 5, 2018

Train on multiple GPUs NRCan/geo-deep-learning#17

Closed

innat mentioned this issue Oct 15, 2020

RuntimeError: Error(s) in loading state_dict for DataParallel WenjiaWang0312/TextZoom#13

Closed
