
[BUG] Checkpoint Mismatch for ResMLP #867

Closed
ShoufaChen opened this issue Sep 13, 2021 · 1 comment

ShoufaChen (Contributor) commented Sep 13, 2021

Hi,

Thanks for your great work.

I ran into a checkpoint mismatch when loading the resmlp-12 model:

```
Missing key(s) in state_dict:
    "stem.proj.weight", "stem.proj.bias",
    "blocks.0.ls1", "blocks.0.ls2",
    "blocks.0.linear_tokens.weight", "blocks.0.linear_tokens.bias",
    "blocks.0.mlp_channels.fc1.weight", "blocks.0.mlp_channels.fc1.bias",
    "blocks.0.mlp_channels.fc2.weight", "blocks.0.mlp_channels.fc2.bias",
    ... (the same eight keys repeat for blocks.1 through blocks.11)

Unexpected key(s) in state_dict:
    "patch_embed.proj.weight", "patch_embed.proj.bias",
    "blocks.0.gamma_1", "blocks.0.gamma_2",
    "blocks.0.attn.weight", "blocks.0.attn.bias",
    "blocks.0.mlp.fc1.weight", "blocks.0.mlp.fc1.bias",
    "blocks.0.mlp.fc2.weight", "blocks.0.mlp.fc2.bias",
    ... (the same eight keys repeat for blocks.1 through blocks.11)
```
ShoufaChen added the bug label on Sep 13, 2021
rwightman (Collaborator) commented Sep 13, 2021

@ShoufaChen Weights from the Facebook ResMLP impl need to be remapped, since my model impl was done before the official release and I support three model variants (Mixer / ResMLP / gMLP) with the same model class. The remap is done automatically for weights loaded via the pretrained checkpoint mechanism, but if you load a checkpoint manually you need to remap the keys yourself with https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/mlp_mixer.py#L333-L347
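
In other words, manual loading just needs the official checkpoint's keys renamed to timm's naming. A minimal sketch of that remap, inferred from the missing/unexpected key lists above (the canonical version is the linked checkpoint_filter_fn in timm/models/mlp_mixer.py; the checkpoint filename and the 'model' nesting are assumptions):

```python
import torch
import timm

def remap_resmlp_keys(state_dict):
    """Rename official ResMLP checkpoint keys to timm's MlpMixer naming.

    Mapping inferred from the key mismatch above:
      patch_embed.*      -> stem.*
      gamma_1 / gamma_2  -> ls1 / ls2                (layer-scale parameters)
      blocks.N.attn.*    -> blocks.N.linear_tokens.* (token-mixing linear)
      blocks.N.mlp.*     -> blocks.N.mlp_channels.*  (channel MLP)
    """
    out = {}
    for k, v in state_dict.items():
        k = k.replace('patch_embed.', 'stem.')
        k = k.replace('gamma_1', 'ls1').replace('gamma_2', 'ls2')
        k = k.replace('.attn.', '.linear_tokens.')
        k = k.replace('.mlp.', '.mlp_channels.')
        out[k] = v
    return out

model = timm.create_model('resmlp_12_224', pretrained=False)
ckpt = torch.load('resmlp_12_no_dist.pth', map_location='cpu')  # hypothetical path
ckpt = ckpt.get('model', ckpt)  # some official checkpoints nest weights under 'model'
model.load_state_dict(remap_resmlp_keys(ckpt))
```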
