
Support more flexible ways to instantiate models in torchvision.models, e.g., remove fc layers, support for pretrained=True and num_classes!=1000 #2200

Closed
lijiaqi opened this issue May 11, 2020 · 5 comments

Comments

@lijiaqi

lijiaqi commented May 11, 2020

🚀 Feature

Support more flexible ways to instantiate models in torchvision.models,

e.g., instantiate ResNet-50 only for feature extraction without FC layer like Keras style:

torchvision.models.resnet50(include_top=False)

load pretrained weights when the class number doesn't equal 1000 (ImageNet):

torchvision.models.resnet50(pretrained=True, num_classes=10)

Motivation

In many situations, we need more flexible ways to instantiate models in torchvision.models.

For example, when fine-tuning a ResNet-50 classification model on a dataset of 10 classes, we need torchvision.models.resnet50(pretrained=True, num_classes=10), but this is not supported now. In the current implementation, num_classes must be 1000 when pretrained=True. To support this case, we should allow a partial copy of the pretrained weights (load all weights except the FC layer).
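A minimal sketch of what that partial copy looks like today with the existing pretrained=True API (illustrative only; the fc.* key names are those used by torchvision's ResNet):

import torchvision

# Build a 10-class model, then load every pretrained tensor except the FC
# layer, whose shape ([1000, 2048]) no longer matches the new head.
model = torchvision.models.resnet50(num_classes=10)
pretrained = torchvision.models.resnet50(pretrained=True).state_dict()
pretrained = {k: v for k, v in pretrained.items() if not k.startswith("fc.")}
model.load_state_dict(pretrained, strict=False)  # strict=False: fc.* are missing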

Another example: sometimes we need to instantiate the models in torchvision.models as backbones, which means the FC layer is no longer needed. This case could be supported with an additional argument, like include_top=False in Keras.
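A minimal sketch of the current workaround, dropping the head by hand rather than via an include_top-style flag (illustrative only):

import torch
import torchvision

# Keep everything up to and including the global average pool; drop the fc.
resnet = torchvision.models.resnet50(pretrained=True)
backbone = torch.nn.Sequential(*list(resnet.children())[:-1])
features = backbone(torch.randn(1, 3, 224, 224))  # -> [1, 2048, 1, 1]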

Pitch

A possible solution is to modify some code in the model constructors. At least two more features should be realized:

  1. Support for loading partial weights: when num_classes != 1000, all weights can still be loaded except for the last FC layer.
  2. Support for backbone mode (FC layer removed): when an argument (like include_top) is set to False, the last layer (the FC layer) will be removed.

We can apply these modifications to many basic models in torchvision.models; a rough sketch of one possible builder is given below.
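A rough, hypothetical sketch of one possible builder supporting the first feature (not the actual torchvision implementation; it reuses model_urls and torch.hub.load_state_dict_from_url as they exist in torchvision today):

import torch
import torchvision

def resnet50_flexible(pretrained=False, num_classes=1000, **kwargs):
    # Build the usual ResNet-50 topology, but with an arbitrary head size.
    model = torchvision.models.resnet.ResNet(
        torchvision.models.resnet.Bottleneck, [3, 4, 6, 3],
        num_classes=num_classes, **kwargs)
    if pretrained:
        state_dict = torch.hub.load_state_dict_from_url(
            torchvision.models.resnet.model_urls["resnet50"])
        if num_classes != 1000:
            # The FC shapes no longer match, so drop them and load the rest.
            state_dict.pop("fc.weight")
            state_dict.pop("fc.bias")
        model.load_state_dict(state_dict, strict=False)
    return model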

@fmassa
Member

fmassa commented May 11, 2020

Hi,

Thanks for opening the issue to start the discussion!

I have a few worries about the current approach:

  • include_top: what is a "top"? Is it the last FC layer? Should it also include the last layer4 block or not? What about models like VGG, whose final classifier has 3 MLP layers: should the top be only the last layer or all 3? There is no single answer to that, and instead of providing a solution that only works in a single case, it's IMO preferable to let users decide themselves what they want to remove / include.
  • loading partial weights: the same question applies. What if users want to change not only the number of classes, but also the internal number of features of the last classifier?

For those reasons, I'm inclined to say that we shouldn't be providing those high-level model manipulation blocks (which are limited in scope), but instead encourage users to modify the models as their framework / application needs.
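As a minimal illustration of that kind of user-side modification (a sketch, not an official recipe), fine-tuning to 10 classes only needs the head swapped:

import torch
import torchvision

# Instantiate the pretrained model, then replace whatever is needed.
model = torchvision.models.resnet50(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # new 10-class head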

Thoughts?

@lijiaqi
Author

lijiaqi commented May 12, 2020

Yes, I do understand your worries. But in my opinion:

  1. In many CV tasks, we remove all fc layers (and perhaps also the pooling layer after the last conv layer) from these classification models (ResNet-50, VGG-16, etc.). The remaining fully convolutional backbones serve as feature extractors. I think we can add two arguments like the following to control that:
  • something like include_fc={True, False} to remove all fc layers, softmax classifiers, ...

  • something like pooling={None, avg, adaptive} to control whether a pooling layer is added at the end, and its type

  2. I think that, in most cases, loading partial weights as described above is enough. At least, supporting pretrained=True and num_classes!=1000 at the same time would be a big plus for flexibility.

There are already some discussions about this:
#184, #190,
and the discussion around #173

@lijiaqi lijiaqi changed the title Support more flexible ways to instantiate models in torchvision.models, Support more flexible ways to instantiate models in torchvision.models, e.g., remove fc layers, support for pretrained=True and num_classes!=1000 May 12, 2020
@fmassa
Member

fmassa commented May 12, 2020

Another issue that is related: #2152

I think that adding those extra arguments like include_fc={True, False} or pooling={None, avg, adaptive} makes the API fairly complicated without making it much more generic.

One could arguably want to control the number of fc layers as well (maybe adding more than the initial one?). Given the generality of the task (model surgery), I would prefer to keep the implementation simple, as that makes it easier for users to modify the code themselves.

This is because a PyTorch model definition couples the layers (in __init__) with their execution (in forward), so anything added / removed in __init__ must also be handled in forward, increasing complexity.
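A tiny illustration of that coupling (assuming the stock resnet50 forward, which still calls self.fc):

import torch
import torchvision

model = torchvision.models.resnet50()
del model.fc  # removes the module from the model's state...
model(torch.randn(1, 3, 224, 224))  # ...but forward() still references self.fc -> AttributeError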

@lijiaqi
Author

lijiaqi commented May 13, 2020

@fmassa OK. I will fork and make the modifications in my own repository.

@fmassa
Member

fmassa commented May 15, 2020

@ljqcava instead of forking torchvision (and having to install your forked version), the way I would recommend doing it is one of the following:

Subclass the model that you want to extend (preferred)

Here is an example:

import torchvision

# A ResNet without the classification head: delete fc in __init__ and stop
# forward() after layer4, so the model returns the backbone feature map.
class MyResNet(torchvision.models.resnet.ResNet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        del self.fc

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        return x
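A rough usage sketch for the subclass above (an illustration relying on model_urls and torch.hub.load_state_dict_from_url as they exist in current torchvision): pass ResNet-50's block layout and load the pretrained weights non-strictly, since the fc entries no longer have a destination.

import torch
import torchvision

model = MyResNet(torchvision.models.resnet.Bottleneck, [3, 4, 6, 3])
state_dict = torch.hub.load_state_dict_from_url(
    torchvision.models.resnet.model_urls["resnet50"])
model.load_state_dict(state_dict, strict=False)  # fc.* keys are ignored
features = model(torch.randn(1, 3, 224, 224))  # -> [1, 2048, 7, 7]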

Copy-paste the whole resnet.py file

This is the solution for maximum flexibility. The benefit of copying the file instead of forking torchvision is that you can still use torchvision in your code (and the new updates that come with it), and just import your custom resnet implementation.
For example

import torchvision
import my_resnet

As such, I'm closing the issue but let me know if you have further questions.
