Load pretrained Imagenet models by pytorch #347

Kongsea · 2020-01-17T12:29:25Z

When I load an ImageNet-pretrained PyTorch model in a fine-tuning task, the following error is raised: AssertionError: Checkpoint does not contain classy_state_dict. How can I load an ImageNet model to initialize the backbone of a Classy model?
Thank you.


mannatsingh · 2020-01-17T20:33:35Z

Hi @Kongsea ! If I understand the problem correctly, you are trying to load a pretrained model from torchvision (or elsewhere) and fine tune it with Classy Vision.

The recommended approach for this is the following -

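import torchvision.models as models
from classy_vision.models import ClassyModelWrapper
from classy_vision.tasks import ClassificationTask

# Load the ImageNet-pretrained weights from torchvision.
my_model = models.resnet18(pretrained=True)

# Wrap the plain PyTorch model so that Classy Vision can train it.
my_classy_model = ClassyModelWrapper(my_model)
my_task = ClassificationTask().set_model(my_classy_model)
...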

Note that I'm doing two things here -

Converting a PyTorch model to a ClassyModel using ClassyModelWrapper
Instead of using a FineTuningTask, I am using a regular ClassificationTask

Let me know if that gives you any trouble!

Kongsea · 2020-01-19T07:08:39Z

Thank you @mannatsingh for your advice.
However, after wrapping the model with ClassyModelWrapper, when I add a custom head to it using model.set_heads({"block3-2": {head.unique_id: head}}), the following error is raised:

Exception has occurred: ValueError
block block3-2 does not exist or can not be attached

So how can I replace the default head (the fully connected layer) with one that matches my custom dataset?
Thank you.

Kongsea · 2020-01-19T12:17:59Z

Finally, I found we can first build the block and then set the model head using:

resnet_model.build_attachable_block("block3-2", resnet_model.model.layer4)
resnet_model.set_heads({"block3-2": {head.unique_id: head}})

However, this raises a new question: how can I freeze some layers of the pretrained ResNet model and train only the last layers and the fc head?
Could you give me some advice? @mannatsingh Thank you.

mannatsingh · 2020-01-21T15:54:20Z

@Kongsea unfortunately, attaching heads to a model requires the user to implement the model themselves (so that the output of the heads is propagated). So you will not be able to replace the original "head" with a ClassyHead. If you do want to accomplish that, you should copy the source implementation of your model and make changes to it in accordance with https://github.com/facebookresearch/ClassyVision/blob/master/classy_vision/models/resnext.py. This piece is intentionally not documented yet since it is liable to change in future versions.

With regard to training only the last layers and the fc head while freezing the rest, that isn't supported yet. What is supported is freezing the whole trunk and just training the fc head (https://classyvision.ai/tutorials/fine_tuning). Feel free to create a feature request for this!
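Outside of Classy Vision, partial freezing can be sketched in plain PyTorch by switching off requires_grad on the parameters that should stay fixed. The following is a minimal sketch under stated assumptions, not a Classy Vision feature: freeze_all_but is a hypothetical helper, and the toy model only borrows the submodule names "layer3", "layer4" and "fc" from torchvision's ResNet so that it runs without downloading weights.

```python
import torch
from torch import nn


def freeze_all_but(model: nn.Module, trainable_prefixes: tuple) -> list:
    """Freeze every parameter whose name does not start with a given prefix.

    Returns the names of the parameters that remain trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        keep = name.startswith(trainable_prefixes)
        param.requires_grad_(keep)
        if keep:
            trainable.append(name)
    return trainable


# Toy stand-in for a pretrained network (hypothetical, for illustration only).
model = nn.Sequential()
model.add_module("layer3", nn.Linear(8, 8))
model.add_module("layer4", nn.Linear(8, 8))
model.add_module("fc", nn.Linear(8, 4))

# Keep only the last block and the fc head trainable.
trainable_names = freeze_all_but(model, ("layer4.", "fc."))

# Hand only the still-trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)
```

Note that frozen BatchNorm layers still update their running statistics while the module is in training mode, so a real fine-tuning setup may also want to put those layers in eval mode.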

P4ppenheimer · 2020-03-23T16:57:47Z

Hello all,

I'm basically in the same situation as Kongsea: I would like a simple way to use a pretrained model, ideally configurable via the config file, with the option to freeze the trunk (or any other combination of layers).

I'm using the standard classy_train.py file and just replacing the model, which is loaded via the config, with a pretrained version:

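import torchvision
from classy_vision.heads import FullyConnectedHead
from classy_vision.models import ClassyModelWrapper
from classy_vision.tasks import build_task

# `config` has already been loaded by classy_train.py.
task = build_task(config)

pretrained_resnet = torchvision.models.resnet50(pretrained=True, progress=True)
head = FullyConnectedHead(unique_id="default_head", num_classes=101, in_plane=2048)
pretrained_resnet = ClassyModelWrapper(pretrained_resnet)

pretrained_resnet.build_attachable_block("block3-2", pretrained_resnet.model.layer4)
pretrained_resnet.set_heads({"block3-2": {head.unique_id: head}})

task.set_model(pretrained_resnet)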

But that doesn't allow me to freeze any layers.

@mannatsingh I have a couple of questions:

Is there a plan to release a feature that allows easy use of ImageNet-pretrained models via the config, and if so, what would be the time horizon?
Is there something wrong with not being able to replace the original "head" with a ClassyHead? If we attach a new ClassyHead, the original head should be ignored for optimizing, correct?
What exactly do you mean by "copy the source implementation of your model and make changes to it in accordance with [..]"? I mean sure, I could reimplement a model, as a ClassyModel, but how would I get the weights into it?
What would be the best way to use pretrained models with the option to freeze the trunk and just train the fc layer?
Thanks in advance

mannatsingh · 2020-03-24T02:50:14Z

Hi @P4ppenheimer , responding to your messages inline -

I'm using the standard classy_train.py file and just replace the model, which is loaded via the config, by a pretrained version with

task = build_task(config)

pretrained_resnet = torchvision.models.resnet50(pretrained=True, progress=True)
head = FullyConnectedHead(unique_id="default_head", num_classes=101, in_plane=2048)
pretrained_resnet = ClassyModelWrapper(pretrained_resnet)
    
pretrained_resnet.build_attachable_block("block3-2", pretrained_resnet.model.layer4)
pretrained_resnet.set_heads({"block3-2": {head.unique_id: head}})

task.set_model(pretrained_resnet)

But that doesn't allow me to freeze any layers.

This doesn't work because the heads can only be attached to ClassyBlocks within a ClassyModel, and this requires a model to define ClassyBlocks. At the moment, you can only set heads for the ResNe(X)t and DenseNet models within Classy Vision. What you're trying to use is a ResNet from torchvision, which will not work.

@mannatsingh I have a couple of questions:

Is there a plan to release a feature that allows easy use of ImageNet-pretrained models via the config, and if so, what would be the time horizon?

This would require us to host our own pre-trained models. We don't have an ETA for this, but I'm getting feedback from multiple people who are interested in this, so we can discuss internally to try to get an ETA.

Is there something wrong with not being able to replace the original "head" with a ClassyHead? If we attach a new ClassyHead, the original head should be ignored for optimizing, correct?

The concept of a head (ClassyHead) is specific to Classy Vision. There is no way to tell which module is a head in a regular nn.Module, or where in the model a new ClassyHead can be attached. I realize that there are no docs for this - we need to improve our docs, but the exact API is still in flux.

What exactly do you mean by "copy the source implementation of your model and make changes to it in accordance with [..]"? I mean sure, I could reimplement a model, as a ClassyModel, but how would I get the weights into it?

You would have to write some code to 1) support attaching heads to that model by editing its implementation 2) modify the state dictionary of the checkpoint and make it compatible with the new model.
I would say that this is only doable by advanced PyTorch users who know what's happening under the hood inside a model's state.
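To give a rough idea of step 2, here is a minimal sketch of renaming state dictionary keys so that they match a re-implemented model. Everything in it is an assumption for illustration: remap_state_dict_keys is a hypothetical helper, the key names and the prefix mapping are made up, and the real mapping depends on how the two implementations name their modules. The values are placeholder strings standing in for tensors.

```python
from collections import OrderedDict


def remap_state_dict_keys(state_dict, prefix_map):
    """Return a copy of state_dict with key prefixes renamed per prefix_map.

    Keys that match no prefix are copied unchanged.
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped


# Hypothetical key names; placeholder strings stand in for weight tensors.
source = OrderedDict([("layer4.0.conv1.weight", "w0"), ("fc.weight", "w1")])
prefix_map = {"layer4.": "blocks.3.", "fc.": "head.fc."}
converted = remap_state_dict_keys(source, prefix_map)
```

The converted dictionary would then be passed to the new model's load_state_dict. Calling it with strict=False returns the missing and unexpected keys, which helps spot prefixes that the mapping did not cover.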

What would be the best way to use pretrained models with the option to freeze the trunk and just train the fc layer?

You would need a pre-trained ClassyModel to do this easily. The other way is mentioned in my previous response.

The big thing I'm getting from this is that pre-trained classy models would be really helpful, and we should consider providing them to our users soon. cc @aadcock , @vreis

mannatsingh · 2020-04-09T01:19:46Z

@P4ppenheimer , @Kongsea I've just landed a series of commits which makes what you intended to do possible - you can attach heads to any PyTorch module! This probably needs to go in a tutorial, will get to that when I get some time. But in the meantime, the following works -

import torch
from torchvision import models

from classy_vision.heads import FullyConnectedHead
from classy_vision.models import ClassyModel

# Any PyTorch module works here.
model = models.resnet50()

# Wrap the module so that heads can be attached to it.
classy_model = ClassyModel.from_model(model)

# Attach a 100-class head to the output of layer4 (2048 channels in ResNet-50).
head = FullyConnectedHead(unique_id="a", num_classes=100, in_plane=2048)
classy_model.set_heads({"layer4": {"a": head}})

input = torch.ones((1, 3, 224, 224))
classy_model(input).shape  # torch.Size([1, 100])

See #465 for a full description.

P4ppenheimer · 2020-04-09T09:40:51Z

Hello @mannatsingh,
thanks for reacting so quickly to our request.
Best,
Simon

CaiyuZhang · 2020-04-25T08:17:16Z

@mannatsingh
Why doesn't ClassyModel.from_model(model) work? I get the error AttributeError: type object 'ClassyModel' has no attribute 'from_model'

mannatsingh · 2020-04-26T04:52:08Z

@CaiyuZhang my guess would be that you're on an old version of the code. You can check out and install the latest master and the issue should go away. If it doesn't, can you run the following and let me know what version you see -

import classy_vision
classy_vision.__version__

We're also planning to release our next version next week, so you will be able to get it using pip install classy_vision after that as well.
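If importing the package itself is in doubt, the installed version can also be read from the package metadata with the standard library. This is only a sketch: installed_version is a hypothetical helper, and the distribution name is assumed to match the pip install classy_vision command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints None when the distribution is not installed in this environment.
version = installed_version("classy_vision")
print("classy_vision version:", version)
```

When several copies are installed (for example a pip release next to a source checkout), the metadata version can differ from the module that Python actually imports; printing classy_vision.__file__ shows which copy is in use.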

CaiyuZhang · 2020-04-27T08:58:00Z

@mannatsingh Yeah, I considered that too, but I had actually updated classy_vision before running my code, and the error still occurred. I will check the version. No worries, I have loaded my trained models through other methods. Thanks for your reply.

Kongsea changed the title ~~Load Imagenet pretrained models by pytorch~~ Load pretrained Imagenet models by pytorch Jan 17, 2020

mannatsingh added the query Query about how to accomplish something label Jan 17, 2020

mannatsingh closed this as completed Jan 24, 2020
