Named VGG Feature Extractors

The networks provided here are the same (as in, the same weights and everything) as in torchvision.models.vgg, but they are built differently, so that you can extract lists of features in a single call like this:

r11, r31, r51 = net.forward(targets=['relu1_1', 'relu3_1', 'relu5_1'])

This is mostly useful for Neural Style Transfer, where we often want to query sets of features from a network in one go (see the style-transfer sketch at the end of the Usage section). For this purpose, there is also a function vgg19_normalized which loads the weights provided by Leon Gatys in his own implementation on GitHub.

Installation

pip install pytorch-vgg-named

Usage

Use this like the regular torchvision.models.vgg module:

#!/usr/bin/env python3

import vgg_named
net = vgg_named.vgg19(pretrained=True).eval()
print(net)
# SequentialExtractor(
#   (conv1_1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
#   (relu1_1): ReLU(inplace)
#   (conv1_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
#   (relu1_2): ReLU(inplace)
#   (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   (conv2_1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
#   (relu2_1): ReLU(inplace)
#   (conv2_2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
#   (relu2_2): ReLU(inplace)
#   (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   [...]
#   (conv5_4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
#   (relu5_4): ReLU(inplace)
#   (pool5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
#   (AdaPool): AdaptiveAvgPool2d(output_size=(7, 7))
#   (flatten): Flatten()
#   (fc6): Linear(in_features=25088, out_features=4096, bias=True)
#   (relu_fc6): ReLU(inplace)
#   (drop_fc6): Dropout(p=0.5)
#   (fc7): Linear(in_features=4096, out_features=4096, bias=True)
#   (relu_fc7): ReLU(inplace)
#   (drop_fc7): Dropout(p=0.5)
#   (fc8): Linear(in_features=4096, out_features=1000, bias=True)
# )

# create a small batch of inputs
import torch
images = torch.randn(4, 3, 224, 224)

with torch.no_grad():
  # call like a regular vgg network
  fc8 = net.forward(images)

  # extract one specific set of features
  c42 = net.forward(images, targets='conv4_2')
  print(f'c42.shape = {c42.shape}')
  # c42.shape = torch.Size([4, 512, 28, 28])

  # extract a list of features. Note that the elements do not have to be in any
  # particular order. Duplicates are allowed too.
  r31, c31, po5, fc6 = net.forward(images, targets=['relu3_1', 'conv3_1', 'pool5', 'fc6'])
  print(f'r31.shape = {r31.shape}\n'
        f'c31.shape = {c31.shape}\n'
        f'po5.shape = {po5.shape}\n'
        f'fc6.shape = {fc6.shape}')
  # r31.shape = torch.Size([4, 256, 56, 56])
  # c31.shape = torch.Size([4, 256, 56, 56])
  # po5.shape = torch.Size([4, 512, 7, 7])
  # fc6.shape = torch.Size([4, 4096])

  # if you only need the first few layers and want to shave some MB off the GPU
  # memory, you can prune the network:
  net.prune('conv2_1')
  print(net)
  # SequentialExtractor(
  #   (conv1_1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  #   (relu1_1): ReLU(inplace)
  #   (conv1_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  #   (relu1_2): ReLU(inplace)
  #   (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  #   (conv2_1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  # )
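
Since the main use case is Neural Style Transfer, here is a minimal sketch of how the multi-target call could feed a Gram-matrix style representation. The gram helper and the choice of style_layers below are illustrative assumptions, not part of this package:

import torch
import vgg_named

def gram(features):
    # channel-wise inner products of the flattened feature maps,
    # normalized by the number of entries
    b, c, h, w = features.shape
    f = features.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

net = vgg_named.vgg19(pretrained=True).eval()  # unpruned network
style_layers = ['relu1_1', 'relu2_1', 'relu3_1', 'relu4_1', 'relu5_1']

with torch.no_grad():
    style_image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed style image
    style_feats = net.forward(style_image, targets=style_layers)
    style_grams = [gram(f) for f in style_feats]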

For the normalized model only, the preprocessing should be done like this:

import torch
import torchvision.transforms as tvt
prep = tvt.Compose([tvt.ToTensor(),
                    # turn to BGR
                    tvt.Lambda(lambda x: x[torch.LongTensor([2, 1, 0])]),
                    # subtract ImageNet mean
                    tvt.Normalize(mean=[0.40760392, 0.45795686, 0.48501961],
                                  std=[1, 1, 1]),
                    # scale to expected input range
                    tvt.Lambda(lambda x: x.mul_(255))])

And postprocessing accordingly:

postp = tvt.Compose([tvt.Lambda(lambda x: x / 255),  # don't use in-place
                     # add back the ImageNet mean
                     tvt.Normalize(mean=[-0.40760392, -0.45795686, -0.48501961],
                                   std=[1, 1, 1]),
                     # turn back to RGB
                     tvt.Lambda(lambda x: x[torch.LongTensor([2, 1, 0])]),
                     tvt.Lambda(lambda x: torch.clamp(x, 0, 1)),
                     tvt.ToPILImage()])
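
As a small sketch of how the two transforms fit together (the file name is just a placeholder):

from PIL import Image

img = Image.open('style.jpg').convert('RGB')  # placeholder path
x = prep(img)      # BGR, mean-subtracted, scaled to [0, 255]
# ... feed x.unsqueeze(0) through the normalized network here ...
out = postp(x)     # back to an RGB PIL image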

For the standard VGG models, use the canonical PyTorch normalization instead!
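
That is the usual ImageNet normalization from the torchvision documentation; a minimal sketch (the resize/crop steps follow the common torchvision recipe):

import torchvision.transforms as tvt

prep_standard = tvt.Compose([tvt.Resize(256),
                             tvt.CenterCrop(224),
                             tvt.ToTensor(),
                             # ImageNet mean/std used by the torchvision VGG weights
                             tvt.Normalize(mean=[0.485, 0.456, 0.406],
                                           std=[0.229, 0.224, 0.225])])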

License

The files in this project are derived from the PyTorch repository and are published under the same BSD-style license.
