Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What are the Training Arguments for ImageNet Pre-Trained Model? #38

Open
LeighDavis opened this issue Mar 10, 2021 · 5 comments
Open

What are the Training Arguments for ImageNet Pre-Trained Model? #38

LeighDavis opened this issue Mar 10, 2021 · 5 comments

Comments

@LeighDavis
Copy link

Getting following error message when running the trained ImageNet model for image classification on my machine, which I downloaded from author's Dropbox link posted in this repo's readme link:

model.load_state_dict(torch.load(PATH, map_location=torch.device("cpu"))[\'state_dict\'])\n', ' File "C:\\Program Files\\Python36\\lib\\site-packages\\torch\\nn\\modules\\module.py", line 1052, in load_state_dict\n self.__class__.__name__, "\\n\\t".join(error_msgs)))\n', 'RuntimeError: Error(s) in loading state_dict for DataParallel:\n\tMissing key(s) in state_dict: "module.features.denseblock_1.denselayer_1.conv_1._count", "module.features.denseblock_1.denselayer_1.conv_1._stage", "module.features.denseblock_1.denselayer_1.conv_1._mask", "module.features.denseblock_1.denselayer_2.conv_1._count", "module.features.denseblock_1.denselayer_2.conv_1._stage", "module.features.denseblock_1.denselayer_2.conv_1._mask", "module.features.denseblock_1.denselayer_3.conv_1._count", "module.features.denseblock_1.denselayer_3.conv_1._stage", "module.features.denseblock_1.denselayer_3.conv_1._mask", "module.features.denseblock_1.denselayer_4.conv_1._count", "module.features.denseblock_1.denselayer_4.conv_1._stage", "module.features.denseblock_1.denselayer_4.conv_1._mask", "module.features.denseblock_2.denselayer_1.conv_1._count", "module.features.denseblock_2.denselayer_1.conv_1._stage", "module.features.denseblock_2.denselayer_1.conv_1._mask", "module.features.denseblock_2.denselayer_2.conv_1._count", "module.features.denseblock_2.denselayer_2.conv_1._stage", "module.features.denseblock_2.denselayer_2.conv_1._mask", "module.features.denseblock_2.denselayer_3.conv_1._count", "module.features.denseblock_2.denselayer_3.conv_1._stage", "module.features.denseblock_2.denselayer_3.conv_1._mask", "module.features.denseblock_2.denselayer_4.conv_1._count", "module.features.denseblock_2.denselayer_4.conv_1._stage", "module.features.denseblock_2.denselayer_4.conv_1._mask", "module.features.denseblock_2.denselayer_5.conv_1._count", "module.features.denseblock_2.denselayer_5.conv_1._stage", "module.features.denseblock_2.denselayer_5.conv_1._mask", "module.features.denseblock_2.denselayer_6.conv_1._count", "module.features.denseblock_2.denselayer_6.conv_1._stage", "module.features.denseblock_2.denselayer_6.conv_1._mask", "module.features.denseblock_3.denselayer_1.conv_1._count", "module.features.denseblock_3.denselayer_1.conv_1._stage", "module.features.denseblock_3.denselayer_1.conv_1._mask", "module.features.denseblock_3.denselayer_2.conv_1._count", "module.features.denseblock_3.denselayer_2.conv_1._stage", "module.features.denseblock_3.denselayer_2.conv_1._mask", "module.features.denseblock_3.denselayer_3.conv_1._count", "module.features.denseblock_3.denselayer_3.conv_1._stage", "module.features.denseblock_3.denselayer_3.conv_1._mask", "module.features.denseblock_3.denselayer_4.conv_1._count", "module.features.denseblock_3.denselayer_4.conv_1._stage", "module.features.denseblock_3.denselayer_4.conv_1._mask", "module.features.denseblock_3.denselayer_5.conv_1._count", "module.features.denseblock_3.denselayer_5.conv_1._stage", "module.features.denseblock_3.denselayer_5.conv_1._mask", "module.features.denseblock_3.denselayer_6.conv_1._count", "module.features.denseblock_3.denselayer_6.conv_1._stage", "module.features.denseblock_3.denselayer_6.conv_1._mask", "module.features.denseblock_3.denselayer_7.conv_1._count", "module.features.denseblock_3.denselayer_7.conv_1._stage", "module.features.denseblock_3.denselayer_7.conv_1._mask", "module.features.denseblock_3.denselayer_8.conv_1._count", "module.features.denseblock_3.denselayer_8.conv_1._stage", "module.features.denseblock_3.denselayer_8.conv_1._mask", "module.features.denseblock_4.denselayer_1.conv_1._count", "module.features.denseblock_4.denselayer_1.conv_1._stage", "module.features.denseblock_4.denselayer_1.conv_1._mask", "module.features.denseblock_4.denselayer_2.conv_1._count", "module.features.denseblock_4.denselayer_2.conv_1._stage", "module.features.denseblock_4.denselayer_2.conv_1._mask", "module.features.denseblock_4.denselayer_3.conv_1._count", "module.features.denseblock_4.denselayer_3.conv_1._stage", "module.features.denseblock_4.denselayer_3.conv_1._mask", "module.features.denseblock_4.denselayer_4.conv_1._count", "module.features.denseblock_4.denselayer_4.conv_1._stage", "module.features.denseblock_4.denselayer_4.conv_1._mask", "module.features.denseblock_4.denselayer_5.conv_1._count", "module.features.denseblock_4.denselayer_5.conv_1._stage", "module.features.denseblock_4.denselayer_5.conv_1._mask", "module.features.denseblock_4.denselayer_6.conv_1._count", "module.features.denseblock_4.denselayer_6.conv_1._stage", "module.features.denseblock_4.denselayer_6.conv_1._mask", "module.features.denseblock_4.denselayer_7.conv_1._count", "module.features.denseblock_4.denselayer_7.conv_1._stage", "module.features.denseblock_4.denselayer_7.conv_1._mask", "module.features.denseblock_4.denselayer_8.conv_1._count", "module.features.denseblock_4.denselayer_8.conv_1._stage", "module.features.denseblock_4.denselayer_8.conv_1._mask", "module.features.denseblock_4.denselayer_9.conv_1._count", "module.features.denseblock_4.denselayer_9.conv_1._stage", "module.features.denseblock_4.denselayer_9.conv_1._mask", "module.features.denseblock_4.denselayer_10.conv_1._count", "module.features.denseblock_4.denselayer_10.conv_1._stage", "module.features.denseblock_4.denselayer_10.conv_1._mask", "module.features.denseblock_5.denselayer_1.conv_1._count", "module.features.denseblock_5.denselayer_1.conv_1._stage", "module.features.denseblock_5.denselayer_1.conv_1._mask", "module.features.denseblock_5.denselayer_2.conv_1._count", "module.features.denseblock_5.denselayer_2.conv_1._stage", "module.features.denseblock_5.denselayer_2.conv_1._mask", "module.features.denseblock_5.denselayer_3.conv_1._count", "module.features.denseblock_5.denselayer_3.conv_1._stage", "module.features.denseblock_5.denselayer_3.conv_1._mask", "module.features.denseblock_5.denselayer_4.conv_1._count", "module.features.denseblock_5.denselayer_4.conv_1._stage", "module.features.denseblock_5.denselayer_4.conv_1._mask", "module.features.denseblock_5.denselayer_5.conv_1._count", "module.features.denseblock_5.denselayer_5.conv_1._stage", "module.features.denseblock_5.denselayer_5.conv_1._mask", "module.features.denseblock_5.denselayer_6.conv_1._count", "module.features.denseblock_5.denselayer_6.conv_1._stage", "module.features.denseblock_5.denselayer_6.conv_1._mask", "module.features.denseblock_5.denselayer_7.conv_1._count", "module.features.denseblock_5.denselayer_7.conv_1._stage", "module.features.denseblock_5.denselayer_7.conv_1._mask", "module.features.denseblock_5.denselayer_8.conv_1._count", "module.features.denseblock_5.denselayer_8.conv_1._stage", "module.features.denseblock_5.denselayer_8.conv_1._mask", "module.classifier.weight", "module.classifier.bias". \n\tUnexpected key(s) in state_dict: "module.features.denseblock_1.denselayer_1.conv_1.index", "module.features.denseblock_1.denselayer_2.conv_1.index", "module.features.denseblock_1.denselayer_3.conv_1.index", "module.features.denseblock_1.denselayer_4.conv_1.index", "module.features.denseblock_2.denselayer_1.conv_1.index", "module.features.denseblock_2.denselayer_2.conv_1.index", "module.features.denseblock_2.denselayer_3.conv_1.index", "module.features.denseblock_2.denselayer_4.conv_1.index", "module.features.denseblock_2.denselayer_5.conv_1.index", "module.features.denseblock_2.denselayer_6.conv_1.index", "module.features.denseblock_3.denselayer_1.conv_1.index", "module.features.denseblock_3.denselayer_2.conv_1.index", "module.features.denseblock_3.denselayer_3.conv_1.index", "module.features.denseblock_3.denselayer_4.conv_1.index", "module.features.denseblock_3.denselayer_5.conv_1.index", "module.features.denseblock_3.denselayer_6.conv_1.index", "module.features.denseblock_3.denselayer_7.conv_1.index", "module.features.denseblock_3.denselayer_8.conv_1.index", "module.features.denseblock_4.denselayer_1.conv_1.index", "module.features.denseblock_4.denselayer_2.conv_1.index", "module.features.denseblock_4.denselayer_3.conv_1.index", "module.features.denseblock_4.denselayer_4.conv_1.index", "module.features.denseblock_4.denselayer_5.conv_1.index", "module.features.denseblock_4.denselayer_6.conv_1.index", "module.features.denseblock_4.denselayer_7.conv_1.index", "module.features.denseblock_4.denselayer_8.conv_1.index", "module.features.denseblock_4.denselayer_9.conv_1.index", "module.features.denseblock_4.denselayer_10.conv_1.index", "module.features.denseblock_5.denselayer_1.conv_1.index", "module.features.denseblock_5.denselayer_2.conv_1.index", "module.features.denseblock_5.denselayer_3.conv_1.index", "module.features.denseblock_5.denselayer_4.conv_1.index", "module.features.denseblock_5.denselayer_5.conv_1.index", "module.features.denseblock_5.denselayer_6.conv_1.index", "module.features.denseblock_5.denselayer_7.conv_1.index", "module.features.denseblock_5.denselayer_8.conv_1.index", "module.classifier.index", "module.classifier.linear.weight", "module.classifier.linear.bias". \n\tsize mismatch for module.features.denseblock_1.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([32, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).\n\tsize mismatch for module.features.denseblock_1.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([8, 4, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).\n\tsize mismatch for module.features.denseblock_1.denselayer_2.conv_1.conv.weight: copying a param with shape torch.Size([32, 3, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 24, 1, 1]).\n\tsize mismatch for module.features.denseblock_1.denselayer_2.conv_2.conv.weight: copying a param with shape torch.Size([8, 4, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).\n\tsize mismatch for module.features.denseblock_1.denselayer_3.conv_1.conv.weight: copying a param with shape torch.Size([32, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 32, 1, 1]).\n\tsize mismatch for module.features.denseblock_1.denselayer_3.conv_2.conv.weight: copying a param with shape torch.Size([8, 4, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).\n\tsize mismatch for module.features.denseblock_1.denselayer_4.conv_1.conv.weight: copying a param with shape torch.Size([32, 5, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 40, 1, 1]).\n\tsize mismatch for module.features.denseblock_1.denselayer_4.conv_2.conv.weight: copying a param with shape torch.Size([8, 4, 3, 3]) from checkpoint, the shape in current model is torch.Size([8, 8, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([64, 6, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 48, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_2.conv_1.conv.weight: copying a param with shape torch.Size([64, 8, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_2.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_3.conv_1.conv.weight: copying a param with shape torch.Size([64, 10, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 80, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_3.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_4.conv_1.conv.weight: copying a param with shape torch.Size([64, 12, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 96, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_4.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_5.conv_1.conv.weight: copying a param with shape torch.Size([64, 14, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 112, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_5.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_2.denselayer_6.conv_1.conv.weight: copying a param with shape torch.Size([64, 16, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]).\n\tsize mismatch for module.features.denseblock_2.denselayer_6.conv_2.conv.weight: copying a param with shape torch.Size([16, 8, 3, 3]) from checkpoint, the shape in current model is torch.Size([16, 16, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([128, 18, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 144, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_2.conv_1.conv.weight: copying a param with shape torch.Size([128, 22, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 176, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_2.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_3.conv_1.conv.weight: copying a param with shape torch.Size([128, 26, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 208, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_3.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_4.conv_1.conv.weight: copying a param with shape torch.Size([128, 30, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 240, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_4.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_5.conv_1.conv.weight: copying a param with shape torch.Size([128, 34, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 272, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_5.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_6.conv_1.conv.weight: copying a param with shape torch.Size([128, 38, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 304, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_6.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_7.conv_1.conv.weight: copying a param with shape torch.Size([128, 42, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 336, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_7.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_3.denselayer_8.conv_1.conv.weight: copying a param with shape torch.Size([128, 46, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 368, 1, 1]).\n\tsize mismatch for module.features.denseblock_3.denselayer_8.conv_2.conv.weight: copying a param with shape torch.Size([32, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 32, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([256, 50, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 400, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_2.conv_1.conv.weight: copying a param with shape torch.Size([256, 58, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 464, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_2.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_3.conv_1.conv.weight: copying a param with shape torch.Size([256, 66, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 528, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_3.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_4.conv_1.conv.weight: copying a param with shape torch.Size([256, 74, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 592, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_4.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_5.conv_1.conv.weight: copying a param with shape torch.Size([256, 82, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 656, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_5.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_6.conv_1.conv.weight: copying a param with shape torch.Size([256, 90, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 720, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_6.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_7.conv_1.conv.weight: copying a param with shape torch.Size([256, 98, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 784, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_7.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_8.conv_1.conv.weight: copying a param with shape torch.Size([256, 106, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 848, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_8.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_9.conv_1.conv.weight: copying a param with shape torch.Size([256, 114, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 912, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_9.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_4.denselayer_10.conv_1.conv.weight: copying a param with shape torch.Size([256, 122, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 976, 1, 1]).\n\tsize mismatch for module.features.denseblock_4.denselayer_10.conv_2.conv.weight: copying a param with shape torch.Size([64, 32, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([512, 130, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1040, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_1.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_2.conv_1.conv.weight: copying a param with shape torch.Size([512, 146, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1168, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_2.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_3.conv_1.conv.weight: copying a param with shape torch.Size([512, 162, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1296, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_3.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_4.conv_1.conv.weight: copying a param with shape torch.Size([512, 178, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1424, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_4.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_5.conv_1.conv.weight: copying a param with shape torch.Size([512, 194, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1552, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_5.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_6.conv_1.conv.weight: copying a param with shape torch.Size([512, 210, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1680, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_6.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_7.conv_1.conv.weight: copying a param with shape torch.Size([512, 226, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1808, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_7.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n\tsize mismatch for module.features.denseblock_5.denselayer_8.conv_1.conv.weight: copying a param with shape torch.Size([512, 242, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1936, 1, 1]).\n\tsize mismatch for module.features.denseblock_5.denselayer_8.conv_2.conv.weight: copying a param with shape torch.Size([128, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]).\n']

This is the training argument I have used in my image classification prediction script:
args = parser.parse_args(["--model", "condensenet_converted", "-b", "64", "-j", "20", "imagenet", "--stages", "4-6-8-10-8", "--growth", "8-16-32-64-128", "--gpu", "0"]). I have tried both, (C=G=4) and (C=G=8) pre-trained models from this repo. Thank you.

@lizhenstat
Copy link

@LeighDavis Hi, same questions here, did you find the answer? Thanks

@lizhenstat
Copy link

@LeighDavis, Hi, I find the training setting for imagenet in script.sh, it is

python main.py --model condensenet -b 256 -j 20 /PATH/TO/IMAGENET --epochs 120 --stages 4-6-8-10-8 --growth 8-16-32-64-128 --group-1x1 4 --group-3x3 4 --condense-factor 4 --bottleneck 4 --resume --group-lasso 0.00001 --gpu 0,1,2,3

@LeighDavis
Copy link
Author

LeighDavis commented Mar 23, 2021

Hi @lizhenstat, thanks for following up. Yes, I had tried those arguments from that training command from script.sh file but I kept getting similar shape errors. Did these arguments work for you?

@lizhenstat
Copy link

@LeighDavis Hi, to evaluate from the pre-trained converted mode, you can use the following code

python main.py --model condensenet_converted -b 64 -j 20 /PATH/TO/IMAGENET \
--stages 4-6-8-10-8 --growth 8-16-32-64-128 --gpu 0 --resume \
--evaluate-from /PATH/TO/CONVERTED/MODEL

Your log shows the mismatch of the parameters in the same layer
Since the converted model is rearranged to a group convolution, and on ImageNet, the condense factor is taking
the value of 2, which is equivalent to pruning 50% of the total filters.

I evaluate condensenet-74 on my one 1080-ti successfully.

@ShichenLiu
Copy link
Owner

@lizhenstat Thanks for addressing @LeighDavis 's questions. I currently have no ImageNet installed on my machine, so I am really glad you can help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants