
Support more flexible ways to instantiate models in torchvision.models, e.g., remove fc layers, support for pretrained=True and num_classes!=1000 #2200

Closed
lijiaqi opened this issue May 11, 2020 · 5 comments

Comments

@lijiaqi

lijiaqi commented May 11, 2020

🚀 Feature

Support more flexible ways to instantiate models in torchvision.models,

e.g., instantiate ResNet-50 only for feature extraction without FC layer like Keras style:

torchvision.models.resnet50(include_top=False)

load pretrained weights when the class number doesn't equal 1000 (ImageNet):

torchvision.models.resnet50(pretrained=True, num_classes=10)

Motivation

In many situations, we need more flexible ways to instantiate models in torchvision.models.

For example, when fine-tuning a ResNet-50 classification model on a dataset of 10 classes, we need torchvision.models.resnet50(pretrained=True, num_classes=10), but this is not supported now. In the current implementation, num_classes must be 1000 when pretrained=True. To support this case, we should allow a partial copy of the pretrained weights (load all weights except the FC layer).
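A minimal sketch of what that partial copy looks like today with the existing pretrained=True API (illustrative only; the fc.* key names are those used by torchvision's ResNet):

import torchvision

# Build a 10-class model, then load every pretrained tensor except the FC
# layer, whose shape ([1000, 2048]) no longer matches the new head.
model = torchvision.models.resnet50(num_classes=10)
pretrained = torchvision.models.resnet50(pretrained=True).state_dict()
pretrained = {k: v for k, v in pretrained.items() if not k.startswith("fc.")}
model.load_state_dict(pretrained, strict=False)  # strict=False: fc.* are missing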

Another example: sometimes we need to instantiate the models in torchvision.models as backbones, which means the FC layer is no longer needed. This case could be supported with an additional argument, like include_top=False in Keras.
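A minimal sketch of the current workaround, dropping the head by hand rather than via an include_top-style flag (illustrative only):

import torch
import torchvision

# Keep everything up to and including the global average pool; drop the fc.
resnet = torchvision.models.resnet50(pretrained=True)
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
features = backbone(torch.randn(1, 3, 224, 224))  # -> [1, 2048, 1, 1]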

Pitch

A possible solution is to modify some code in the model constructors. At least two more features should be realized:

  1. Support for loading partial weights: when num_classes != 1000, all weights can still be loaded except for the last FC layer.
  2. Support for backbone mode (FC layer removed): when an argument (like include_top) is set to False, the last layer (the FC layer) will be removed.

We can apply these modifications to many basic models in torchvision.models; a rough sketch of one possible builder is given below.
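A rough, hypothetical sketch of one possible builder supporting the first feature (not the actual torchvision implementation; it reuses model_urls and torch.hub.load_state_dict_from_url as they exist in torchvision today):

import torch
import torchvision

def resnet50_flexible(pretrained=False, num_classes=1000, **kwargs):
    # Build the usual ResNet-50 topology, but with an arbitrary head size.
    model = torchvision.models.resnet.ResNet(
        torchvision.models.resnet.Bottleneck, [3, 4, 6, 3],
        num_classes=num_classes, **kwargs)
    if pretrained:
        state_dict = torch.hub.load_state_dict_from_url(
            torchvision.models.resnet.model_urls["resnet50"])
        if num_classes != 1000:
            # The FC shapes no longer match, so drop them and load the rest.
            state_dict.pop("fc.weight")
            state_dict.pop("fc.bias")
        model.load_state_dict(state_dict, strict=False)
    return model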

@fmassa
Member

fmassa commented May 11, 2020

Hi,

Thanks for opening the issue to start the discussion!

I have a few worries about the current approach:

  • include_top: what is a "top"? Is it the last FC layer? Should it also include the last layer4 block or not? What about models like VGG, whose final classifier has 3 MLP layers: should the top be only the last layer or all 3? There is no single answer to that, and instead of providing a solution that only works in a single case, it's IMO preferable to let users decide themselves what they want to remove / include.
  • loading partial weights: the same question applies. What if users want to change not only the number of classes, but also the internal number of features of the last classifier?

For those reasons, I'm inclined to say that we shouldn't be providing those high-level model manipulation blocks (which are limited in scope), but instead encourage users to modify the models as their framework / application needs.
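As a minimal illustration of that kind of user-side modification (a sketch, not an official recipe), fine-tuning to 10 classes only needs the head swapped:

import torch
import torchvision

# Instantiate the pretrained model, then replace whatever is needed.
model = torchvision.models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new 10-class head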

Thoughts?

@lijiaqi
Author

lijiaqi commented May 12, 2020

Yes, I do understand your worries. But in my opinion:

  1. In many CV tasks, we remove all fc layers (and perhaps also the pooling layer after the last conv layer) from these classification models (ResNet-50, VGG-16, etc.). The remaining fully convolutional backbones serve as feature extractors. I think we can add two arguments like the following to control that:
  • something like include_fc={True, False} to remove all fc layers, softmax classifiers, ...

  • something like pooling={None, avg, adaptive} to control whether a pooling layer is added at the end, and its type

  2. I think that, in most cases, loading partial weights as described above is enough. At least, supporting pretrained=True and num_classes!=1000 at the same time would be a big plus for flexibility.

There are already some discussions about this:
#184, #190,
and the discussion around #173

@lijiaqi lijiaqi changed the title Support more flexible ways to instantiate models in torchvision.models, Support more flexible ways to instantiate models in torchvision.models, e.g., remove fc layers, support for pretrained=True and num_classes!=1000 May 12, 2020
@fmassa
Member

fmassa commented May 12, 2020

Another issue that is related: #2152

I think that adding those extra arguments like include_fc={True, False} or pooling={None, avg, adaptive} makes the API fairly complicated without making it much more generic.

One could arguably want to control the number of fc layers as well (maybe adding more than the initial one?). Given the generality of the task (model surgery), I would prefer to keep the implementation simple, as that makes it easier for users to modify the code themselves.

This is because a PyTorch model definition couples the layers (in __init__) with their execution (in forward), so anything added / removed in __init__ must also be handled in forward, increasing complexity.
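A tiny illustration of that coupling (assuming the stock resnet50 forward, which still calls self.fc):

import torch
import torchvision

model = torchvision.models.resnet50()
del model.fc  # removes the module from the model's state...
model(torch.randn(1, 3, 224, 224))  # ...but forward() still references self.fc -> AttributeError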

@lijiaqi
Author

lijiaqi commented May 13, 2020

@fmassa OK. I will fork and make the modifications in my own repository.

@fmassa
Member

fmassa commented May 15, 2020

@ljqcava instead of forking torchvision (and having to install your forked version), the way I would recommend doing it is one of the following:

Subclass the model that you want to extend (preferred)

Here is an example:

import torchvision

# A ResNet without the classification head: delete fc in __init__ and stop
# forward() after layer4, so the model returns the backbone feature map.
class MyResNet(torchvision.models.resnet.ResNet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        del self.fc

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        return x
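A rough usage sketch for the subclass above (an illustration relying on model_urls and torch.hub.load_state_dict_from_url as they exist in current torchvision): pass ResNet-50's block layout and load the pretrained weights non-strictly, since the fc entries no longer have a destination.

import torch
import torchvision

model = MyResNet(torchvision.models.resnet.Bottleneck, [3, 4, 6, 3])
state_dict = torch.hub.load_state_dict_from_url(
    torchvision.models.resnet.model_urls["resnet50"])
model.load_state_dict(state_dict, strict=False)  # fc.* keys are ignored
features = model(torch.randn(1, 3, 224, 224))  # -> [1, 2048, 7, 7]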

Copy-paste the whole resnet.py file

This is the solution for maximum flexibility. The benefit of copying the file instead of forking torchvision is that you can still use torchvision in your code (and the new updates that come with it), and just import your custom resnet implementation.
For example

import torchvision
import my_resnet

As such, I'm closing the issue but let me know if you have further questions.
