<center><h1>Transposed Convolutions</h1></center>

In feed-forward neural networks with standard convolutional layers, we typically reduce the spatial size of the input, going from a large image or signal to a smaller representation (e.g., class scores).

However, in generative models (e.g., image generation), we often need to go in the opposite direction, i.e., from a low-dimensional representation (a latent vector) back to a high-dimensional output (a full-sized image).

*Some generative models expand input dimensions using gradient-based optimization, relying on the backward pass of convolution layers. But there's a more direct way to do this during the forward pass: by using transposed convolution layers.*

A transposed convolution maps shapes in the opposite direction to a standard convolution, effectively expanding the input. It is not an exact inverse: it restores the size of the original signal, not its values.

A convolution can be seen as a series of inner products, whereas a transposed convolution can
be seen as a weighted sum of translated kernels.


We can compare the results of a standard and a transposed convolution on a simple 1D example,
where $a$ is the input and $w$ is the kernel (weight tensor):

In [13]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [10]:
a = torch.tensor([[[1., 0., 1., 1., 0., 1., 0.]]])
w = torch.tensor([[[1., 2., 3.]]])
print("Shape of a: ",a.shape, "Shape of w: ", w.shape)


Shape of a:  torch.Size([1, 1, 7]) Shape of w:  torch.Size([1, 1, 3])


In [11]:
# Standard 1D convolution
Standard_conv_out = F.conv1d(a, w)
print("Output: ", Standard_conv_out, "Shape of standard conv output: ",Standard_conv_out.shape)

Output:  tensor([[[4., 5., 3., 4., 2.]]]) Shape of standard conv output:  torch.Size([1, 1, 5])


The tensor produced by the standard convolution is **smaller** than the input: with stride 1 and no padding, a kernel of size 3 shortens a length-7 input to 7 - 3 + 1 = 5.

In [12]:
# Transposed convolution:
transpose_conv_out = F.conv_transpose1d(a, w)
print("Output: ", transpose_conv_out, "Shape of transpose conv output: ",transpose_conv_out.shape)


Output:  tensor([[[1., 2., 4., 3., 5., 4., 2., 3., 0.]]]) Shape of transpose conv output:  torch.Size([1, 1, 9])


The tensor produced by the transposed convolution is **larger** than the input: with stride 1 and no padding, a kernel of size 3 extends a length-7 input to 7 + 3 - 1 = 9.
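
To connect these two outputs with the "inner products" and "weighted sum of translated kernels" descriptions, here is a minimal sketch that recomputes both results by hand. The names `conv_manual` and `tconv_manual` are illustrative and not part of PyTorch; the sketch assumes stride 1, no padding, and a single channel.

```python
import torch
import torch.nn.functional as F

a = torch.tensor([1., 0., 1., 1., 0., 1., 0.])
w = torch.tensor([1., 2., 3.])
k = w.numel()

# Standard convolution: one inner product between the kernel and each input window.
conv_manual = torch.stack(
    [torch.dot(a[i:i + k], w) for i in range(a.numel() - k + 1)]
)

# Transposed convolution: place a copy of the kernel at offset i,
# scale it by a[i], and sum all the copies.
tconv_manual = torch.zeros(a.numel() + k - 1)
for i in range(a.numel()):
    tconv_manual[i:i + k] += a[i] * w

print(conv_manual)   # tensor([4., 5., 3., 4., 2.])
print(tconv_manual)  # tensor([1., 2., 4., 3., 5., 4., 2., 3., 0.])
```

Both hand-computed tensors match the outputs of `F.conv1d` and `F.conv_transpose1d` printed above.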

This is a crucial step in understanding transposed convolution:

- A transposed convolution layer computes, in its forward pass, the same operation that a standard convolution performs in its backward pass when it propagates gradients to its input.

- This is why they're used in generative models to expand a compact representation into a larger output.
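
The link to the backward pass can be checked numerically. The sketch below assumes a stand-in input `x` of length 9 (its values do not matter, because convolution is linear) and reuses the earlier tensor as a hypothetical upstream gradient `g`; both names are chosen for illustration only.

```python
import torch
import torch.nn.functional as F

w = torch.tensor([[[1., 2., 3.]]])
x = torch.zeros(1, 1, 9, requires_grad=True)        # stand-in input, length 9
g = torch.tensor([[[1., 0., 1., 1., 0., 1., 0.]]])  # upstream gradient, length 7

y = F.conv1d(x, w)   # forward pass of a standard convolution: length 9 -> 7
y.backward(g)        # backward pass: gradient flows from length 7 back to 9

grad_via_backward = x.grad
grad_via_forward = F.conv_transpose1d(g, w)  # same mapping, done as a forward pass

print(torch.allclose(grad_via_backward, grad_via_forward))  # True
```

The gradient that autograd computes for the input of `F.conv1d` is the same tensor that `F.conv_transpose1d` produces directly.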



The class `nn.ConvTranspose1d` wraps this operation in an `nn.Module`.

We create a `ConvTranspose1d` module with:

- 1 input channel and 1 output channel

- kernel size = 3

In [15]:
m = nn.ConvTranspose1d(1, 1, kernel_size=3)
a = torch.tensor([[[1., 0., 1., 1., 0., 1., 0.]]])

We then manually set the parameters:
- The bias is set to zero, which makes the operation purely linear.

- The kernel is set to [1, 2, 1].


In [16]:
with torch.no_grad():
    m.bias.zero_()
    m.weight.copy_(torch.tensor([[[1., 2., 1.]]]))

In [19]:
y = m(a)
print("Output: ", y, "Shape of transpose conv output: ",y.shape)

Output:  tensor([[[1., 2., 2., 3., 3., 2., 2., 1., 0.]]],
       grad_fn=<ConvolutionBackward0>) Shape of transpose conv output:  torch.Size([1, 1, 9])
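
As an alternative to zeroing the bias by hand, the module can be built without a bias term. This is a minimal sketch using the `bias=False` constructor argument of `nn.ConvTranspose1d`; it is not the approach taken in the cells above, only an equivalent variant.

```python
import torch
import torch.nn as nn

# bias=False means no bias parameter is created at all (m.bias is None).
m = nn.ConvTranspose1d(1, 1, kernel_size=3, bias=False)
with torch.no_grad():
    m.weight.copy_(torch.tensor([[[1., 2., 1.]]]))

a = torch.tensor([[[1., 0., 1., 1., 0., 1., 0.]]])
y = m(a)
print(y.detach())  # tensor([[[1., 2., 2., 3., 3., 2., 2., 1., 0.]]])
```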


Just like regular convolutions, transposed convolutions use:

- Stride: Controls the spacing between output positions.

- Padding: Controls cropping of the output.

- Dilation: Expands the kernel by inserting gaps (makes it sparse).

But there’s a twist:

- In standard convolution, stride and padding are defined with respect to the input.

- In transposed convolution, they are defined with respect to the output.

This affects how large the output will be, and how it is cropped or expanded:

- Stride in transposed convolution: Consecutive kernel copies are placed `stride` positions apart in the output, which spreads the influence of the input values over a wider region and enlarges the output.

- Padding in transposed convolution: It removes `padding` elements from each end of the output, rather than adding zeros around the input as in standard convolution.

- Dilation: Same behavior as in convolution, stretches the kernel by inserting zeros between its elements.
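
The PyTorch documentation for `ConvTranspose1d` gives the output length as $L_{out} = (L_{in} - 1) \cdot \text{stride} - 2 \cdot \text{padding} + \text{dilation} \cdot (\text{kernel\_size} - 1) + \text{output\_padding} + 1$. The sketch below checks that formula against `F.conv_transpose1d` for a few settings. The helper `expected_length` and the list `configs` are hypothetical names introduced for this illustration.

```python
import torch
import torch.nn.functional as F

def expected_length(l_in, kernel_size, stride=1, padding=0, dilation=1, output_padding=0):
    """Output length of a 1D transposed convolution (formula from the PyTorch docs)."""
    return ((l_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)

a = torch.tensor([[[1., 0., 1., 1., 0., 1., 0.]]])  # length 7
w = torch.tensor([[[1., 2., 3.]]])                  # kernel size 3

configs = [
    dict(stride=1, padding=0, dilation=1),  # baseline
    dict(stride=2, padding=0, dilation=1),  # kernel copies placed 2 apart
    dict(stride=1, padding=1, dilation=1),  # one element cropped from each end
    dict(stride=1, padding=0, dilation=2),  # kernel stretched to span 5 positions
]

results = []
for cfg in configs:
    out = F.conv_transpose1d(a, w, **cfg)
    results.append((out.shape[-1], expected_length(7, 3, **cfg)))
    print(cfg, "->", out.shape[-1])
# Output lengths, in order: 9, 15, 7, 11
```

Increasing the stride or the dilation lengthens the output, while padding shortens it by cropping both ends.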