Load pretrained Imagenet models by pytorch #347

Closed
Kongsea opened this issue Jan 17, 2020 · 11 comments
Labels
query Query about how to accomplish something

Comments

@Kongsea

Kongsea commented Jan 17, 2020

When I load a pretrained ImageNet model from PyTorch in a fine-tuning task, an AssertionError: Checkpoint does not contain classy_state_dict is raised. How can I load an ImageNet model to initialize the backbone of a Classy model?
Thank you.

@Kongsea Kongsea changed the title Load Imagenet pretrained models by pytorch Load pretrained Imagenet models by pytorch Jan 17, 2020
@mannatsingh
Contributor

Hi @Kongsea ! If I understand the problem correctly, you are trying to load a pretrained model from torchvision (or elsewhere) and fine-tune it with Classy Vision.

The recommended approach for this is the following -

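import torchvision.models as models
from classy_vision.models import ClassyModelWrapper
from classy_vision.tasks import ClassificationTask

# Load a torchvision ResNet-18 with ImageNet-pretrained weights.
my_model = models.resnet18(pretrained=True)

# Wrap the plain PyTorch module so Classy Vision can use it as a ClassyModel.
my_classy_model = ClassyModelWrapper(my_model)

# Use a regular ClassificationTask rather than a FineTuningTask.
my_task = ClassificationTask().set_model(my_classy_model)
...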

Note that I'm doing two things here -

  • Converting a PyTorch model to a ClassyModel using ClassyModelWrapper
  • Instead of using a FineTuningTask, I am using a regular ClassificationTask

Let me know if that gives you any trouble!

@mannatsingh mannatsingh added the query Query about how to accomplish something label Jan 17, 2020
@Kongsea
Author

Kongsea commented Jan 19, 2020

Thank you @mannatsingh for your advice.
However, after wrapping the model with ClassyModelWrapper, when I add a custom head to it using model.set_heads({"block3-2": {head.unique_id: head}}), the following error is raised:

Exception has occurred: ValueError
block block3-2 does not exist or can not be attached

So how can I change the default head (the fully connected layer) to match my custom dataset?
Thank you.

@Kongsea
Author

Kongsea commented Jan 19, 2020

Finally, I found we can first build the block and then set the model head using:

resnet_model.build_attachable_block("block3-2", resnet_model.model.layer4)
resnet_model.set_heads({"block3-2": {head.unique_id: head}})

However, we now have a new question: how can we freeze some layers of the pretrained ResNet model and train only the last layers and the fc head?
Could you give me some advice? @mannatsingh Thank you.

@mannatsingh
Contributor

@Kongsea unfortunately, allowing support to attach heads to a model requires a user to implement the model themselves (to propagate the output of the heads). So you will not be able to replace the original "head" with a ClassyHead. If you do want to accomplish that, you should copy the source implementation of your model and make changes to it in accordance with https://github.com/facebookresearch/ClassyVision/blob/master/classy_vision/models/resnext.py. This piece is intentionally not documented yet since it is liable to change in future versions.

With regard to freezing part of the trunk and training only the last layers and the fc head, that isn't supported yet. What is supported is freezing the whole trunk and just training the fc head (https://classyvision.ai/tutorials/fine_tuning). Feel free to create a feature request for this!
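
For readers who need partial freezing in the meantime, the underlying PyTorch mechanism is to set requires_grad = False on the parameters that should stay fixed and to give the optimizer only the remaining ones. The following is a minimal sketch in plain PyTorch, not a Classy Vision API. The helper name freeze_all_but and the toy model with layer1, layer2 and fc are hypothetical stand-ins for a real ResNet, and whether a Classy Vision task and optimizer respect these flags is an assumption to verify against the installed version.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def freeze_all_but(model, trainable_prefixes):
    """Freeze every parameter except those under the given submodule names.

    Returns the names of the parameters that remain trainable.
    (Hypothetical helper, written for illustration only.)
    """
    trainable = []
    for name, param in model.named_parameters():
        keep = any(name.startswith(prefix + ".") for prefix in trainable_prefixes)
        param.requires_grad = keep
        if keep:
            trainable.append(name)
    return trainable


# Toy stand-in for a pretrained trunk followed by an fc head.
model = nn.Sequential(
    OrderedDict(
        [
            ("layer1", nn.Linear(8, 8)),
            ("layer2", nn.Linear(8, 8)),
            ("fc", nn.Linear(8, 3)),
        ]
    )
)

# Train only the last block and the head; layer1 stays frozen.
trainable = freeze_all_but(model, ["layer2", "fc"])

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)
print(trainable)
```

With a torchvision ResNet the same call would take prefixes such as layer4 and fc. Note that BatchNorm layers inside frozen blocks still update their running statistics while the model is in training mode, so they may need to be switched to eval() separately.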

@P4ppenheimer

Hello all,

I'm basically in the same situation as Kongsea: I would like a simple way to use a pretrained model, ideally configurable via the config file, with the option to freeze the trunk (or any other combination of layers).

I'm using the standard classy_train.py file and just replace the model, which is loaded via the config, by a pretrained version with

task = build_task(config)

pretrained_resnet = torchvision.models.resnet50(pretrained=True, progress=True)
head = FullyConnectedHead(unique_id="default_head", num_classes=101, in_plane=2048)
pretrained_resnet = ClassyModelWrapper(pretrained_resnet)
    
pretrained_resnet.build_attachable_block("block3-2", pretrained_resnet.model.layer4)
pretrained_resnet.set_heads({"block3-2": {head.unique_id: head}})

task.set_model(pretrained_resnet)

But that doesn't allow me to freeze any layers.

@mannatsingh I have a couple of questions:

  • Is there a plan to release a feature that allows easy use of ImageNet-pretrained models via the config, and if so, what would be the time horizon?
  • Is there something wrong with not being able to replace the original "head" with a ClassyHead? If we attach a new ClassyHead, the original head should be ignored for optimizing, correct?
  • What exactly do you mean by "copy the source implementation of your model and make changes to it in accordance with [..]"? I mean sure, I could reimplement a model, as a ClassyModel, but how would I get the weights into it?
  • What would be the best way to use pretrained models with the option to freeze the trunk and just train the fc layer?
    Thanks in advance

@mannatsingh
Contributor

Hi @P4ppenheimer , responding to your messages inline -

I'm using the standard classy_train.py file and just replace the model, which is loaded via the config, by a pretrained version with

task = build_task(config)

pretrained_resnet = torchvision.models.resnet50(pretrained=True, progress=True)
head = FullyConnectedHead(unique_id="default_head", num_classes=101, in_plane=2048)
pretrained_resnet = ClassyModelWrapper(pretrained_resnet)
    
pretrained_resnet.build_attachable_block("block3-2", pretrained_resnet.model.layer4)
pretrained_resnet.set_heads({"block3-2": {head.unique_id: head}})

task.set_model(pretrained_resnet)

But that doesn't allow me to freeze any layers.

This doesn't work because the heads can only be attached to ClassyBlocks within a ClassyModel, and this requires a model to define ClassyBlocks. At the moment, you can only set heads for the ResNe(X)t and DenseNet models within Classy Vision. What you're trying to use is a ResNet from torchvision, which will not work.

@mannatsingh I have a couple of questions:

  • Is there a plan to release a feature that allows easy use of ImageNet-pretrained models via the config, and if so, what would be the time horizon?

This would require us to host our own pre-trained models. We don't have an ETA for this, but I'm getting feedback from multiple people who are interested in it, so we can discuss internally to try to get an ETA.

  • Is there something wrong with not being able to replace the original "head" with a ClassyHead? If we attach a new ClassyHead, the original head should be ignored for optimizing, correct?

The concept of a head (ClassyHead) is specific to Classy Vision. There is no way to tell which module is a head in a regular nn.Module, or where in the model a new ClassyHead can be attached. I realize that there are no docs for this - we need to improve our docs, but the exact API is still in flux.
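
To make the point about attachment concrete: a plain nn.Module exposes named submodules but says nothing about which one is the "head", so any wrapper has to be told where to tap the features. One generic way to do that in plain PyTorch is a forward hook. The sketch below is an illustration of that idea only and is not claimed to be how Classy Vision implements heads; the class name HeadOnBlock and the toy trunk are hypothetical.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


class HeadOnBlock(nn.Module):
    """Run `trunk`, capture the output of one named submodule, feed it to `head`.

    Hypothetical wrapper for illustration; not part of Classy Vision.
    """

    def __init__(self, trunk, block_name, head):
        super().__init__()
        self.trunk = trunk
        self.head = head
        self._features = None
        # Look the block up by its dotted name and record its output on each call.
        block = dict(trunk.named_modules())[block_name]
        block.register_forward_hook(self._capture)

    def _capture(self, module, inputs, output):
        self._features = output

    def forward(self, x):
        # The trunk's own classifier still runs, but its output is discarded.
        self.trunk(x)
        return self.head(self._features)


# Toy trunk standing in for a torchvision ResNet (conv features, pool, fc).
trunk = nn.Sequential(
    OrderedDict(
        [
            ("features", nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())),
            ("pool", nn.AdaptiveAvgPool2d(1)),
            ("flatten", nn.Flatten()),
            ("fc", nn.Linear(8, 1000)),
        ]
    )
)

# New 5-class head attached to the output of the "features" block.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 5))
model = HeadOnBlock(trunk, "features", head)

out = model(torch.ones(2, 3, 16, 16))
print(out.shape)
```

In this sketch the original fc layer still executes on every forward pass, but because its output is never used in the result, it receives no gradient from a loss computed on the new head's output.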

  • What exactly do you mean by "copy the source implementation of your model and make changes to it in accordance with [..]"? I mean sure, I could reimplement a model, as a ClassyModel, but how would I get the weights into it?

You would have to write some code to 1) support attaching heads to that model by editing its implementation 2) modify the state dictionary of the checkpoint and make it compatible with the new model.
I would say that this is only doable by advanced PyTorch users who know what's happening under the hood inside a model's state.
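
Step 2 usually comes down to renaming the keys of the checkpoint's state dictionary so that they match the parameter names of the re-implemented model. A minimal sketch of that renaming follows. It uses strings in place of tensors so that it runs without PyTorch, and the helper remap_state_dict_keys as well as the target names initial_block and blocks.0 are hypothetical, chosen only to show the mechanics; the real names depend on the model being written.

```python
from collections import OrderedDict


def remap_state_dict_keys(state_dict, renames, drop_prefixes=()):
    """Return a copy of `state_dict` with key prefixes renamed or dropped.

    renames: mapping of old key prefix -> new key prefix.
    drop_prefixes: keys starting with any of these are left out, e.g. the
    original classifier when a new head will be trained from scratch.
    (Hypothetical helper, written for illustration only.)
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(tuple(drop_prefixes)):
            continue
        new_key = key
        for old_prefix, new_prefix in renames.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped


# Stand-in for a torchvision-style checkpoint (values would be tensors).
checkpoint = OrderedDict(
    [
        ("conv1.weight", "w0"),
        ("layer1.0.conv1.weight", "w1"),
        ("fc.weight", "w2"),
        ("fc.bias", "w3"),
    ]
)

remapped = remap_state_dict_keys(
    checkpoint,
    renames={"conv1.": "initial_block.conv1.", "layer1.": "blocks.0."},
    drop_prefixes=("fc.",),
)
print(list(remapped))
```

With real tensors, the result can be passed to the new model's load_state_dict with strict=False; the returned missing_keys and unexpected_keys then show which names still fail to line up.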

  • What would be the best way to use pretrained models with the option to freeze the trunk and just train the fc layer?

You would need a pre-trained ClassyModel to do this easily. The other way is mentioned in my previous response.

The big thing I'm getting from this is that pre-trained classy models would be really helpful, and we should consider providing them to our users soon. cc @aadcock , @vreis

@mannatsingh
Contributor

mannatsingh commented Apr 9, 2020

@P4ppenheimer , @Kongsea I've just landed a series of commits which makes what you intended to do possible - you can attach heads to any PyTorch module! This probably needs to go in a tutorial, will get to that when I get some time. But in the meantime, the following works -

import torch
from torchvision import models

from classy_vision.heads import FullyConnectedHead
from classy_vision.models import ClassyModel

# Any PyTorch module can be used here.
model = models.resnet50()

# Convert the module to a ClassyModel so that heads can be attached.
classy_model = ClassyModel.from_model(model)

# Attach a 100-class head to the output of layer4 (2048 channels in ResNet-50).
head = FullyConnectedHead(unique_id="a", num_classes=100, in_plane=2048)
classy_model.set_heads({"layer4": {"a": head}})

x = torch.ones((1, 3, 224, 224))
classy_model(x).shape  # torch.Size([1, 100])

See #465 for a full description.

@P4ppenheimer

Hello @mannatsingh,
thanks for reacting so quickly to our request.
Best,
Simon

@CaiyuZhang

@mannatsingh
why doesn't ClassyModel.from_model(model) work? I get the error AttributeError: type object 'ClassyModel' has no attribute 'from_model'

@mannatsingh
Contributor

@CaiyuZhang my guess would be that you're on an old version of the code. You can check out and install the latest master and the issue should go away. If it doesn't work, can you run the following and let me know what version you see -

import classy_vision
classy_vision.__version__

We're also planning to release our next version next week, so you will be able to get it using pip install classy_vision after that as well.
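
When an upgrade seems to have no effect, a common cause is that the interpreter is importing a different copy of the package than the one that was updated. The sketch below is a generic diagnostic built on the standard library (importlib.metadata, Python 3.8+). The helper name describe_install is hypothetical, and the distribution name classy_vision is taken from the pip command above; it reports the installed version and the file the module is actually loaded from, and returns None for either when the package cannot be found.

```python
import importlib
import importlib.metadata


def describe_install(dist_name, module_name):
    """Report the installed version and import location of a package.

    (Hypothetical helper, written for illustration only.)
    """
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    try:
        module = importlib.import_module(module_name)
        location = getattr(module, "__file__", None)
    except ImportError:
        location = None
    return {"version": version, "location": location}


# Run this with the same interpreter that runs the training script.
print(describe_install("classy_vision", "classy_vision"))
```

If the reported location points at an older checkout or a different environment than expected, the AttributeError above would be consistent with the old code still being imported.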

@CaiyuZhang

@mannatsingh Yeah, I thought of that too, but I actually updated classy_vision before running my code, and the error still happened. I will check the version. No worries, I have loaded my trained models through other methods. Thanks for your reply.

4 participants