
size mismatch during load_state_dict() #4

Open
hannwi opened this issue Aug 10, 2022 · 6 comments
@hannwi

hannwi commented Aug 10, 2022

Hello, I appreciate your awesome work.
I want to try evaluation, but there is an error when calling load_state_dict() in evaluate.py. The error message is as follows:

size mismatch for layer6.conv2.0.weight: copying a param with shape torch.Size([48, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 256, 3, 3]).

It seems that the parameter shapes in the pretrained model 'LIP_epoch_149.pth' and the model constructed for evaluation differ in some layers. Could you check this issue?

Thank you!
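A quick way to see every offending layer at once is to diff the parameter shapes of the checkpoint against the freshly built model. Below is a minimal sketch, assuming the two state dicts have first been reduced to `{name: shape}` mappings; the helper name `find_shape_mismatches` is made up for illustration, not part of the repository.

```python
def find_shape_mismatches(ckpt_shapes, model_shapes):
    """Compare {param_name: shape} mappings; return a (key, ckpt_shape,
    model_shape) triple for every parameter whose shapes disagree."""
    mismatches = []
    for key, shape in ckpt_shapes.items():
        # drop the 'module.' prefix that nn.DataParallel adds when saving
        if key.startswith('module.'):
            key = key[len('module.'):]
        if key in model_shapes and model_shapes[key] != shape:
            mismatches.append((key, shape, model_shapes[key]))
    return mismatches

# In practice the two mappings would come from something like:
#   ckpt_shapes  = {k: tuple(v.shape) for k, v in torch.load(path, map_location='cpu').items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
ckpt  = {'module.layer6.conv2.0.weight': (48, 256, 1, 1)}
model = {'layer6.conv2.0.weight': (48, 256, 3, 3)}
print(find_shape_mismatches(ckpt, model))
# → [('layer6.conv2.0.weight', (48, 256, 1, 1), (48, 256, 3, 3))]
```

This lists all mismatches in one pass instead of surfacing them one at a time through load_state_dict() errors.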

@tjpulkl
Owner

tjpulkl commented Aug 11, 2022

I downloaded the project and reran the code, and I did not encounter the issue you mentioned.
Please make sure that you create the snapshots folder and put the trained model in it.
(screenshot attached)

@hannwi
Author

hannwi commented Aug 17, 2022

Thanks for your reply!
I downloaded the pretrained model from Google Drive and put it in the folder as below:
(screenshot: 2022-08-17, 4:24 PM)
Is there any chance that the model on Google Drive is different from the one on Baidu Drive?

@tjpulkl
Owner

tjpulkl commented Aug 17, 2022 via email

@olream

olream commented Nov 20, 2022

When I do evaluation, I have a similar problem. Here is my code:

```python
import torch
from copy import deepcopy

model = Res_Deeplab(num_classes=20)
print(f'params: {sum(p.numel() for p in model.parameters() if p.requires_grad)}')

state_dict = model.state_dict().copy()
state_dict_old = torch.load('CDGNet/LIP_epoch_149.pth')

for key, nkey in zip(state_dict_old.keys(), state_dict.keys()):
    if key != nkey:
        # remove the 'module.' prefix from 'key'
        state_dict[key[7:]] = deepcopy(state_dict_old[key])
    else:
        state_dict[key] = deepcopy(state_dict_old[key])

model.load_state_dict(state_dict)
```

This raises:

RuntimeError: OrderedDict mutated during iteration

So I modified the code:

```python
import collections
import torch
from copy import deepcopy

model = Res_Deeplab(num_classes=20)
print(f'params: {sum(p.numel() for p in model.parameters() if p.requires_grad)}')

state_dict = model.state_dict().copy()
state_dict_new = collections.OrderedDict()
state_dict_old = torch.load('CDGNet/LIP_epoch_149.pth')

for key, nkey in zip(state_dict_old.keys(), state_dict.keys()):
    if key != nkey:
        # remove the 'module.' prefix from 'key'
        state_dict_new[key[7:]] = deepcopy(state_dict_old[key])
    else:
        state_dict_new[key] = deepcopy(state_dict_old[key])

model.load_state_dict(state_dict_new)

This raises:

RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict: "layer5.stages.0.2.0.weight", "layer5.stages.0.2.0.bias", "layer5.stages.0.2.0.running_mean", "layer5.stages.0.2.0.running_var", "layer5.stages.1.2.0.weight", "layer5.stages.1.2.0.bias", "layer5.stages.1.2.0.running_mean", "layer5.stages.1.2.0.running_var", "layer5.stages.2.2.0.weight", "layer5.stages.2.2.0.bias", "layer5.stages.2.2.0.running_mean", "layer5.stages.2.2.0.running_var", "layer5.stages.3.2.0.weight", "layer5.stages.3.2.0.bias", "layer5.stages.3.2.0.running_mean", "layer5.stages.3.2.0.running_var", "layer5.bottleneck.1.0.weight", "layer5.bottleneck.1.0.bias", "layer5.bottleneck.1.0.running_mean", "layer5.bottleneck.1.0.running_var", "edge_layer.conv1.1.0.weight", "edge_layer.conv1.1.0.bias", "edge_layer.conv1.1.0.running_mean", "edge_layer.conv1.1.0.running_var", "edge_layer.conv2.1.0.weight", "edge_layer.conv2.1.0.bias", "edge_layer.conv2.1.0.running_mean", "edge_layer.conv2.1.0.running_var", "edge_layer.conv3.1.0.weight", "edge_layer.conv3.1.0.bias", "edge_layer.conv3.1.0.running_mean", "edge_layer.conv3.1.0.running_var", "layer6.conv1.1.0.weight", "layer6.conv1.1.0.bias", "layer6.conv1.1.0.running_mean", "layer6.conv1.1.0.running_var", "layer6.conv2.1.0.weight", "layer6.conv2.1.0.bias", "layer6.conv2.1.0.running_mean", "layer6.conv2.1.0.running_var", "layer6.conv3.1.0.weight", "layer6.conv3.1.0.bias", "layer6.conv3.1.0.running_mean", "layer6.conv3.1.0.running_var", "layer6.conv3.3.0.weight", "layer6.conv3.3.0.bias", "layer6.conv3.3.0.running_mean", "layer6.conv3.3.0.running_var", "layer6.addCAM.0.weight", "layer6.addCAM.1.0.weight", "layer6.addCAM.1.0.bias", "layer6.addCAM.1.0.running_mean", "layer6.addCAM.1.0.running_var", "layer7.1.0.weight", "layer7.1.0.bias", "layer7.1.0.running_mean", "layer7.1.0.running_var", "sq4.0.weight", "sq4.1.0.weight", "sq4.1.0.bias", "sq4.1.0.running_mean", "sq4.1.0.running_var", "sq5.0.weight", "sq5.1.0.weight", "sq5.1.0.bias", "sq5.1.0.running_mean", "sq5.1.0.running_var", "f9.0.weight", 
"f9.1.0.weight", "f9.1.0.bias", "f9.1.0.running_mean", "f9.1.0.running_var", "hwAttention.gamma", "hwAttention.beta", "hwAttention.conv_hgt1.0.weight", "hwAttention.conv_hgt1.1.weight", "hwAttention.conv_hgt1.1.bias", "hwAttention.conv_hgt1.1.running_mean", "hwAttention.conv_hgt1.1.running_var", "hwAttention.conv_hgt2.0.weight", "hwAttention.conv_hgt2.1.weight", "hwAttention.conv_hgt2.1.bias", "hwAttention.conv_hgt2.1.running_mean", "hwAttention.conv_hgt2.1.running_var", "hwAttention.conv_hwPred1.0.weight", "hwAttention.conv_hwPred1.0.bias", "hwAttention.conv_hwPred2.0.weight", "hwAttention.conv_hwPred2.0.bias", "hwAttention.conv_upDim1.0.weight", "hwAttention.conv_upDim1.0.bias", "hwAttention.conv_upDim2.0.weight", "hwAttention.conv_upDim2.0.bias", "hwAttention.cmbFea.0.weight", "hwAttention.cmbFea.1.weight", "hwAttention.cmbFea.1.bias", "hwAttention.cmbFea.1.running_mean", "hwAttention.cmbFea.1.running_var", "L.weight", "L.bias".
Unexpected key(s) in state_dict: "layer5.stages.0.2.weight", "layer5.stages.0.2.bias", "layer5.stages.0.2.running_mean", "layer5.stages.0.2.running_var", "layer5.stages.1.2.weight", "layer5.stages.1.2.bias", "layer5.stages.1.2.running_mean", "layer5.stages.1.2.running_var", "layer5.stages.2.2.weight", "layer5.stages.2.2.bias", "layer5.stages.2.2.running_mean", "layer5.stages.2.2.running_var", "layer5.stages.3.2.weight", "layer5.stages.3.2.bias", "layer5.stages.3.2.running_mean", "layer5.stages.3.2.running_var", "layer5.bottleneck.1.weight", "layer5.bottleneck.1.bias", "layer5.bottleneck.1.running_mean", "layer5.bottleneck.1.running_var", "edge_layer.conv1.1.weight", "edge_layer.conv1.1.bias", "edge_layer.conv1.1.running_mean", "edge_layer.conv1.1.running_var", "edge_layer.conv2.1.weight", "edge_layer.conv2.1.bias", "edge_layer.conv2.1.running_mean", "edge_layer.conv2.1.running_var", "edge_layer.conv3.1.weight", "edge_layer.conv3.1.bias", "edge_layer.conv3.1.running_mean", "edge_layer.conv3.1.running_var", "layer6.conv1.1.weight", "layer6.conv1.1.bias", "layer6.conv1.1.running_mean", "layer6.conv1.1.running_var", "layer6.conv2.1.weight", "layer6.conv2.1.bias", "layer6.conv2.1.running_mean", "layer6.conv2.1.running_var", "layer6.conv3.1.weight", "layer6.conv3.1.bias", "layer6.conv3.1.running_mean", "layer6.conv3.1.running_var", "layer6.conv3.3.weight", "layer6.conv3.3.bias", "layer6.conv3.3.running_mean", "layer6.conv3.3.running_var", "layer7.1.weight", "layer7.1.bias", "layer7.1.running_mean", "layer7.1.running_var".
size mismatch for layer6.conv2.0.weight: copying a param with shape torch.Size([48, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 256, 3, 3]).
size mismatch for layer6.conv3.0.weight: copying a param with shape torch.Size([256, 304, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 304, 3, 3]).
size mismatch for layer7.0.weight: copying a param with shape torch.Size([256, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 1024, 3, 3]).
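For the 'module.' renaming specifically, rebuilding the dict up front is simpler and sidesteps the zip-based key pairing entirely; a sketch follows, with the helper name my own invention. Note, though, that the remaining missing/unexpected pairs (e.g. layer5.bottleneck.1.0.weight expected vs. layer5.bottleneck.1.weight present) and the 1x1-vs-3x3 shape mismatches suggest the checkpoint was saved from a slightly different model definition, which no key renaming can repair.

```python
from collections import OrderedDict

def strip_dataparallel_prefix(state_dict):
    """Return a new OrderedDict with any leading 'module.' removed from
    keys, leaving the original mapping untouched."""
    return OrderedDict(
        (k[len('module.'):] if k.startswith('module.') else k, v)
        for k, v in state_dict.items()
    )

# Usage sketch with the names from this thread:
#   ckpt = torch.load('CDGNet/LIP_epoch_149.pth', map_location='cpu')
#   model.load_state_dict(strip_dataparallel_prefix(ckpt))
print(list(strip_dataparallel_prefix(OrderedDict([('module.a', 1), ('b', 2)])).items()))
# → [('a', 1), ('b', 2)]
```

Passing strict=False to load_state_dict() would also let the load proceed past missing/unexpected keys for inspection, but it cannot get around the size mismatches.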

@birdortyedi

(quotes the code and errors from @olream's comment above)

Same problem for me.

@ashwinvaswani

@tjpulkl I am facing the same problem. What could be the reason? Please advise!
