
Any Pre-trained models on ImageNet #13

Closed
zhanghang1989 opened this issue Aug 24, 2017 · 6 comments

@zhanghang1989

Are there pre-trained models ready to use?

@gpleiss
Owner

gpleiss commented Aug 24, 2017

We don't have pre-trained models that were trained using this implementation. It should be possible to convert the existing pre-trained ImageNet models to work with this implementation.

@ZhengRui
Contributor

ZhengRui commented Sep 8, 2017

I converted densenet121, in case anyone is interested; you can download it from here, but it requires some changes to the DenseNetEfficient class:

from collections import OrderedDict

import torch.nn as nn
import torch.nn.functional as F

# _DenseBlock and _Transition are the existing helper modules from this implementation.

class DenseNetEfficient(nn.Module):
    r"""Densenet-BC model class, based on
    `"Densely Connected Convolutional Networks" <https://arxiv.org/pdf/1608.06993.pdf>`

    This model uses shared memory allocations for the outputs of batch norm and
    concat operations, as described in `"Memory-Efficient Implementation of DenseNets"`.

    Args:
        growth_rate (int) - how many filters to add each layer (`k` in paper)
        block_config (list of ints) - how many layers in each dense block
        num_init_features (int) - the number of filters to learn in the first convolution layer
        bn_size (int) - multiplicative factor for the number of bottleneck layer features
          (i.e. bn_size * k features in the bottleneck layer)
        drop_rate (float) - dropout rate after each dense layer
        num_classes (int) - number of classification classes
    """
    def __init__(self, growth_rate=12, block_config=(16, 16, 16), compression=0.5,
                 num_init_features=24, bn_size=4, drop_rate=0,
                 num_classes=10, cifar=True):

        super(DenseNetEfficient, self).__init__()
        assert 0 < compression <= 1, 'compression of densenet should be between 0 and 1'
        self.avgpool_size = 8 if cifar else 7

        # First convolution
        if cifar:
            self.features = nn.Sequential(OrderedDict([
                ('conv0', nn.Conv2d(3, num_init_features, kernel_size=3, stride=1, padding=1, bias=False)),
            ]))
        else:
            self.features = nn.Sequential(OrderedDict([
                ('conv0', nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),
            ]))
            self.features.add_module('norm0', nn.BatchNorm2d(num_init_features))
            self.features.add_module('relu0', nn.ReLU(inplace=True))
            self.features.add_module('pool0', nn.MaxPool2d(kernel_size=3, stride=2, padding=1,
                                                           ceil_mode=False))


        # Each denseblock
        num_features = num_init_features
        for i, num_layers in enumerate(block_config):
            block = _DenseBlock(num_layers=num_layers,
                                num_input_features=num_features,
                                bn_size=bn_size, growth_rate=growth_rate,
                                drop_rate=drop_rate)
            self.features.add_module('denseblock%d' % (i + 1), block)
            num_features = num_features + num_layers * growth_rate
            if i != len(block_config) - 1:
                trans = _Transition(num_input_features=num_features,
                                    num_output_features=int(num_features
                                                            * compression))
                self.features.add_module('transition%d' % (i + 1), trans)
                num_features = int(num_features * compression)

        # Final batch norm
        self.features.add_module('norm_final', nn.BatchNorm2d(num_features))

        # Linear layer
        self.classifier = nn.Linear(num_features, num_classes)

    def forward(self, x):
        features = self.features(x)
        out = F.relu(features, inplace=True)
        out = F.avg_pool2d(out, kernel_size=self.avgpool_size).view(
            features.size(0), -1)
        out = self.classifier(out)
        return out

Then you can instantiate the densenet121 model and load the converted weights:

densenet = DenseNetEfficient(growth_rate=32, block_config=[6,12,24,16], num_classes=1000, cifar=False, num_init_features=64)
densenet.load_state_dict(torch.load(pretrained_model_path))
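
Once the weights are loaded, a quick sanity check might look like the sketch below. The 224x224 input size and eval mode are assumptions (the usual conditions for torchvision-style ImageNet weights), and the random tensor stands in for a real, normalized image:

import torch
import torch.nn.functional as F

densenet.eval()  # disable dropout / use BatchNorm running statistics
x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed ImageNet image
with torch.no_grad():
    logits = densenet(x)
probs = F.softmax(logits, dim=1)
print(probs.topk(5))  # top-5 probabilities and class indices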

@gpleiss
Owner

gpleiss commented Sep 11, 2017

Thanks @ZhengRui! Any chance you can make a PR adding those changes?

@ZhengRui
Contributor

ZhengRui commented Sep 11, 2017

@gpleiss Pull request sent, you may take a look. I also made the multi-GPU and single-GPU versions use the same module names, so models can be shared between them.
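
To illustrate what matching module names buys you, here is a minimal sketch; the DenseNetEfficientMulti name below is assumed for illustration only (check the repository for the actual multi-GPU class name). Because both variants register their submodules under the same names, their state_dicts share the same keys and weights transfer directly:

# Assumed name: DenseNetEfficientMulti stands in for the repository's multi-GPU variant.
kwargs = dict(growth_rate=32, block_config=[6, 12, 24, 16],
              num_classes=1000, cifar=False, num_init_features=64)
single = DenseNetEfficient(**kwargs)
multi = DenseNetEfficientMulti(**kwargs)

# Identical module names mean identical state_dict keys...
assert set(single.state_dict().keys()) == set(multi.state_dict().keys())
# ...so a checkpoint from one loads into the other with no key remapping.
multi.load_state_dict(single.state_dict())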

@gpleiss
Owner

gpleiss commented Sep 22, 2017

Merged #19 - closes this issue

@mingminzhen

I have converted the pretrained models to PyTorch efficient models. You can download them from here, or convert them yourself.
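
For anyone who wants to do the conversion by hand, here is a rough sketch of the general approach, assuming you start from torchvision's pretrained densenet121 and the modified DenseNetEfficient class posted earlier in this thread. Only the norm5 -> norm_final rename is taken from that class definition; the rest of the key mapping would need to be worked out by comparing the two key sets printed below:

from torchvision import models

pretrained = models.densenet121(pretrained=True).state_dict()
target = DenseNetEfficient(growth_rate=32, block_config=[6, 12, 24, 16],
                           num_classes=1000, cifar=False, num_init_features=64)

print(sorted(pretrained.keys())[:5])           # key names torchvision provides
print(sorted(target.state_dict().keys())[:5])  # key names this implementation expects

def rename_key(key):
    # Placeholder mapping; extend after inspecting the printed key sets.
    return key.replace('features.norm5', 'features.norm_final')

converted = {rename_key(k): v for k, v in pretrained.items()}
target.load_state_dict(converted)  # strict by default; raises on any key mismatch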
