## Replacing Fully-Connected by Equivalent Convolutional Layers

In [1]:
import torch

In [2]:
inputs = torch.tensor([[[[1, 2],
                         [3, 4]]]], dtype=torch.float32)

# batch_size, channels, height, width
inputs.shape

torch.Size([1, 1, 2, 2])

### Fully Connected

A fully connected layer that maps the 4 input features to 2 outputs would be computed as follows:

In [3]:
fc = torch.nn.Linear(4, 2)

# fc weight shape: [out_features, in_features]
print(fc.weight.size())
print(fc.bias.size())

weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                        [1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])

fc.weight.data = weights
fc.bias.data = bias

fc

torch.Size([2, 4])
torch.Size([2])


Linear(in_features=4, out_features=2, bias=True)

In [4]:
torch.relu(fc(inputs.view(-1, 4)))

tensor([[14.9000, 19.0000]], grad_fn=<ReluBackward0>)
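To see where these two numbers come from, the layer's output can be reproduced by hand as `x @ W.T + b` followed by ReLU. The following is a minimal sketch using the same weight and bias values as above; the names `x`, `W`, `b`, and `manual` are introduced here only for illustration.

```python
import torch

# The 2x2 input flattened row by row into 4 features
x = torch.tensor([[1., 2., 3., 4.]])

# Same values as the weights and bias assigned to the fc layer
W = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                  [1.5, 1.6, 1.7, 1.8]])
b = torch.tensor([1.9, 2.0])

# A linear layer computes x @ W^T + b; ReLU is applied afterwards
manual = torch.relu(x @ W.T + b)
print(manual)
```

The first output is 1.1·1 + 1.2·2 + 1.3·3 + 1.4·4 + 1.9 = 14.9 and the second is 1.5·1 + 1.6·2 + 1.7·3 + 1.8·4 + 2.0 = 19.0, which should match the result of the `fc` layer.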

### Convolution with Kernels equal to the input size

We can obtain the same outputs with a convolutional layer whose kernel size equals the spatial size (height x width) of the input, so that each kernel covers the whole input and produces a single value per output channel:

In [5]:
# kernel size = spatial size (height, width) of the input, here (2, 2)
conv = torch.nn.Conv2d(in_channels=1, out_channels=2,
                       kernel_size=tuple(inputs.shape[2:]))

print(conv.weight.size())
print(conv.bias.size())

torch.Size([2, 1, 2, 2])
torch.Size([2])


In [6]:
conv.weight.data = weights.view(2, 1, 2, 2)
conv.bias.data = bias

In [7]:
torch.relu(conv(inputs))

tensor([[[[14.9000]],

         [[19.0000]]]], grad_fn=<ReluBackward0>)
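The equivalence can also be checked programmatically rather than by comparing printed values. The sketch below rebuilds both layers so that it runs on its own; it copies the parameters with `copy_` inside `torch.no_grad()` instead of assigning to `.data`, which is a stylistic choice made here and not part of the cells above.

```python
import torch

inputs = torch.tensor([[[[1., 2.],
                         [3., 4.]]]])
weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                        [1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])

fc = torch.nn.Linear(4, 2)
conv = torch.nn.Conv2d(in_channels=1, out_channels=2, kernel_size=(2, 2))

with torch.no_grad():
    # Load the same parameters into both layers
    fc.weight.copy_(weights)
    fc.bias.copy_(bias)
    conv.weight.copy_(weights.view(2, 1, 2, 2))
    conv.bias.copy_(bias)

    fc_out = torch.relu(fc(inputs.view(-1, 4)))   # shape [1, 2]
    conv_out = torch.relu(conv(inputs))           # shape [1, 2, 1, 1]

# Drop the trailing 1x1 spatial dimensions before comparing
same = torch.allclose(fc_out, conv_out.flatten(start_dim=1))
print(same)
```

This works because `inputs.view(-1, 4)` flattens the 2x2 input row by row, and `weights.view(2, 1, 2, 2)` reshapes each weight row into a 2x2 kernel in the same order, so every weight multiplies the same input element in both layers.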

### Convolution with 1x1 Kernels

Similarly, we can replace the fully connected layer with a convolutional layer that uses 1x1 kernels if we reshape the input so that each of the num_inputs features becomes its own channel, that is, into a num_inputs x 1 x 1 image:

In [8]:
conv = torch.nn.Conv2d(in_channels=4, out_channels=2, 
                       kernel_size=(1, 1))
conv.weight.data = weights.view(2, 4, 1, 1)
conv.bias.data = bias
torch.relu(conv(inputs.view(1, 4, 1, 1)))

tensor([[[[14.9000]],

         [[19.0000]]]], grad_fn=<ReluBackward0>)
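The same check can be repeated for the 1x1 case with arbitrary parameters and more than one example. The following is a small sketch under the assumption that a randomly initialized `Linear(4, 2)` layer and a batch of three random 2x2 inputs are representative; the seed, the batch size, and the variable names are chosen here for illustration only.

```python
import torch

torch.manual_seed(0)

fc = torch.nn.Linear(4, 2)
conv = torch.nn.Conv2d(in_channels=4, out_channels=2, kernel_size=(1, 1))

with torch.no_grad():
    # Reuse the fc parameters: each weight row becomes one 4-channel 1x1 kernel
    conv.weight.copy_(fc.weight.view(2, 4, 1, 1))
    conv.bias.copy_(fc.bias)

    batch = torch.randn(3, 1, 2, 2)            # three 1-channel 2x2 inputs
    fc_out = fc(batch.view(-1, 4))             # shape [3, 2]
    conv_out = conv(batch.view(-1, 4, 1, 1))   # shape [3, 2, 1, 1]

# Compare after removing the 1x1 spatial dimensions
match = torch.allclose(fc_out, conv_out.flatten(start_dim=1), atol=1e-6)
print(match)
```

A small tolerance is used in the comparison because the two layers may accumulate the floating-point sums in a different order.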