## Input Shape of Tensor in Neural Network for PyTorch

------

### Understanding `nn.Linear()` Layer

`nn.Linear()` takes two required parameters, `in_features` and `out_features`: the number of features in each input sample and in each output sample.
On construction it creates a weight matrix and, unless `bias=False`, a bias vector, both randomly initialized.

```
fc = nn.Linear(in_features=4, out_features=3, bias=False)
```

Parameters:

 - **in_features** – size of each input sample (i.e. size of x)
 - **out_features** – size of each output sample (i.e. size of y)
 - **bias** – If set to False, the layer will not learn an additive bias. Default: True

------------------

Note that the weights `W` have shape `(out_features, in_features)` and biases `b` have shape `(out_features)`. They are initialized randomly and can be changed later (e.g. during the training of a Neural Network they are updated by some optimization algorithm).
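
As a quick check of the shapes stated above, the following minimal sketch builds a layer with the default `bias=True` and inspects its parameters. The variable names (`fc_with_bias`, `n_params`) are illustrative only and do not come from the original notes.

```python
import torch
from torch import nn

# Same sizes as the example above, but keeping the default bias=True.
fc_with_bias = nn.Linear(in_features=4, out_features=3)

weight_shape = tuple(fc_with_bias.weight.shape)  # (out_features, in_features)
bias_shape = tuple(fc_with_bias.bias.shape)      # (out_features,)

# Total number of learnable values: 3 * 4 weights + 3 biases.
n_params = sum(p.numel() for p in fc_with_bias.parameters())

print(weight_shape, bias_shape, n_params)  # (3, 4) (3,) 15
print(fc_with_bias.weight.requires_grad)   # True: an optimizer can update it
```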


The `nn.Linear(in_features=4, out_features=3, bias=False)` call above defines a linear layer that accepts 4 input features and converts them into 3 output features, so it maps a 4-dimensional space to a 3-dimensional space. A weight matrix is needed to perform this operation, but where is the weight matrix in this example?


The PyTorch `nn.Linear` class uses the numbers 4 and 3 passed to the constructor to create a 3 x 4 weight matrix. Let's verify this by looking at the PyTorch source code.

https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/linear.py#L79

```py
def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()
```
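
The same thing can be checked at runtime. This is a small sketch, assuming the `bias=False` layer from the earlier example; it compares the layer's output with the matrix product `x @ W.T` written out by hand. The batch of 2 random samples is an arbitrary choice for illustration.

```python
import torch
from torch import nn

fc = nn.Linear(in_features=4, out_features=3, bias=False)
x = torch.randn(2, 4)          # a batch of 2 samples with 4 features each

y_layer = fc(x)                # what the layer computes, shape [2, 3]
y_manual = x @ fc.weight.T     # the same affine map written by hand

print(fc.weight.shape)         # torch.Size([3, 4])
print(fc.bias)                 # None, because bias=False
print(torch.allclose(y_layer, y_manual, atol=1e-6))
```

The weight is stored as `(out_features, in_features)`, which is why the manual version needs the transpose.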

### Conv2d() Layer

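The constructor is typically called like this (the values shown for `stride`, `padding`, and `bias` are the defaults):

#### `conv = nn.Conv2d(in_channels, out_channels, kernel_size=(n, n), stride=1, padding=0, bias=True)`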

Parameters
* in_channels (int) - Number of channels in the input image
* out_channels (int) - Number of channels produced by the convolution

In `Conv2d` you specify the input and output channels, the kernel size, and optional arguments such as padding; you do not specify the output size. The output height and width are determined by the input size together with `kernel_size`, `stride`, `padding`, etc.
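
To make the dependence on `kernel_size`, `stride`, and `padding` concrete, here is a sketch of the output-size formula given in the `nn.Conv2d` documentation, checked against a real layer. The helper `conv2d_out_size` is a hypothetical name introduced here, and the layer sizes are arbitrary.

```python
import torch
from torch import nn

def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Hypothetical helper: output length along one spatial axis."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5, stride=2, padding=1)
x = torch.randn(4, 3, 28, 28)
out = conv(x)

expected_hw = conv2d_out_size(28, kernel_size=5, stride=2, padding=1)
print(expected_hw)  # 13
print(out.shape)    # torch.Size([4, 8, 13, 13])
```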

An image in PyTorch has three dimensions, [channel, height, width], so an RGB image is [3, height, width]. To get a 3-channel image as the result, use a convolution whose input channel count matches your input (3) and whose output channel count is also 3: `nn.Conv2d(3, 3, kernel_size)`, where `kernel_size` is whatever filter size you choose.

Each output channel is produced by one kernel that spans all input channels: a [1, k, k] kernel for a grayscale input, [3, k, k] for RGB. To get more than one output channel, the layer applies `out_channels` different kernels and concatenates their responses along the channel dimension, so the result has shape [out_channels, h, w].

For instance, assume your input has 10 channels, [batch_size, 10, h, w], and you want 3 output channels, [batch_size, 3, h, w] (the spatial size stays the same only with suitable padding). In this case we need 3 different filters, each of size [10, k, k]. Each one produces an output of size [batch_size, 1, h, w], and these are concatenated into an output of [batch_size, 3, h, w]. So the weight tensor of the layer has shape [3, 10, k, k].
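
The 10-channel example can be checked directly. In this sketch `k = 3`, the batch size of 8, and the 32 x 32 spatial size are assumptions chosen for illustration; `padding=k // 2` keeps the height and width unchanged for an odd kernel with stride 1.

```python
import torch
from torch import nn

k = 3  # assumed kernel size
conv = nn.Conv2d(in_channels=10, out_channels=3, kernel_size=k, padding=k // 2)

x = torch.randn(8, 10, 32, 32)   # [batch_size, 10, h, w]
out = conv(x)

print(conv.weight.shape)  # torch.Size([3, 10, 3, 3]): 3 filters of size [10, k, k]
print(conv.bias.shape)    # torch.Size([3]): one bias per output channel
print(out.shape)          # torch.Size([8, 3, 32, 32])
```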

-----------


### Various Tensor sizes

    import torch
    from torch import nn
    from torch import Tensor
    import torch.nn.functional as F

    torch.Size([16])
        # 1-D Tensor: [batch_size]
        # used for target labels or predictions.
    torch.Size([16, 256])
        # 2-D Tensor: [batch_size, num_features (aka: C * H * W)]
        # used for nn.Linear() input.
    torch.Size([10, 1, 2048])
        # 3-D Tensor: [batch_size, channels, num_features (aka: H * W)]
        # when used as nn.Conv1d() input.
        # (but [seq_len, batch_size, num_features]
        # if feeding an RNN).
    torch.Size([16, 3, 28, 28])
        # 4-D Tensor: [batch_size, channels, height, width]
        # used for nn.Conv2d() input.
    torch.Size([32, 1, 5, 15, 15])
        # 5-D Tensor: [batch_size, channels, depth, height, width]
        # used for nn.Conv3d() input.
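
As a rough illustration of the list above, the sketch below feeds a random tensor of each size into the matching layer type. The output channel counts, kernel sizes, and the 10-class target are arbitrary choices made for this example, not values from the original notes.

```python
import torch
from torch import nn

x2d = torch.randn(16, 256)             # [batch_size, num_features]
x3d = torch.randn(10, 1, 2048)         # [batch_size, channels, length]
x4d = torch.randn(16, 3, 28, 28)       # [batch_size, channels, height, width]
x5d = torch.randn(32, 1, 5, 15, 15)    # [batch_size, channels, depth, height, width]

out_linear = nn.Linear(256, 10)(x2d)               # [16, 10]
out_conv1d = nn.Conv1d(1, 4, kernel_size=3)(x3d)   # [10, 4, 2046]
out_conv2d = nn.Conv2d(3, 8, kernel_size=3)(x4d)   # [16, 8, 26, 26]
out_conv3d = nn.Conv3d(1, 2, kernel_size=3)(x5d)   # [32, 2, 3, 13, 13]

# A 1-D tensor of class indices, [batch_size], serves as the target.
targets = torch.randint(0, 10, (16,))
loss = nn.CrossEntropyLoss()(out_linear, targets)  # scalar tensor

for t in (out_linear, out_conv1d, out_conv2d, out_conv3d):
    print(t.shape)
```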






The key step is between the last convolution and the first `Linear` block.

- `Conv2d` outputs a tensor of shape `[batch_size, n_features_conv, height, width]` whereas

- `Linear` expects `[batch_size, n_features_lin]`.


To make the two align you need to "stack" the 3 dimensions `[n_features_conv, height, width]` into one `[n_features_lin]`. It follows that `n_features_lin == n_features_conv * height * width`. In the original code this "stacking" is achieved by

    x = x.view(-1, self.num_flat_features(x))


### Most Important Rule while Transitioning from Conv2D Layer to a Linear Layer

When moving from a convolutional layer's output to a linear layer's input, the conv output, which is a 4-D tensor, must be reshaped to a 2-D tensor using `view`.

For example, a conv output of [32, 21, 50, 50] should be “flattened” to become a [32, 21 * 50 * 50] tensor.

The `in_features` of the linear layer should then also be set to `21 * 50 * 50`.
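
A minimal sketch of this rule, using the [32, 21, 50, 50] shape from the example. The 64 output features are an arbitrary choice, and `torch.flatten` is shown as an equivalent alternative to `view`.

```python
import torch
from torch import nn

conv_out = torch.randn(32, 21, 50, 50)             # stand-in for a conv layer output

flat_view = conv_out.view(conv_out.size(0), -1)    # keep the batch dim, merge the rest
flat_fn = torch.flatten(conv_out, start_dim=1)     # same result via torch.flatten

fc = nn.Linear(21 * 50 * 50, 64)                   # in_features must equal 52500
out = fc(flat_view)

print(flat_view.shape)                  # torch.Size([32, 52500])
print(torch.equal(flat_view, flat_fn))  # True
print(out.shape)                        # torch.Size([32, 64])
```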



---------------------


### Calculate the dimensions for nn.Linear()


Two arguments matter in every stack of `nn.Linear` layers, no matter how many layers deep your network is: the very first argument (the `in_features` of the first layer) and the very last argument (the `out_features` of the last layer).
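
To illustrate, here is a sketch of a small stack of linear layers. The hidden sizes (128, 64), the 784 input features, and the 10 classes are assumptions for this example. Only the first `in_features` and the last `out_features` are dictated by the data; each value in between just has to match its neighbour.

```python
import torch
from torch import nn

num_input_features = 28 * 28   # first argument of the first layer: set by the data
num_classes = 10               # last argument of the last layer: set by the task

mlp = nn.Sequential(
    nn.Linear(num_input_features, 128),
    nn.ReLU(),
    nn.Linear(128, 64),        # in_features must equal the previous out_features
    nn.ReLU(),
    nn.Linear(64, num_classes),
)

x = torch.randn(5, num_input_features)   # a batch of 5 flattened samples
logits = mlp(x)
print(logits.shape)  # torch.Size([5, 10])
```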

----------------

If you want to pass your 28 x 28 image into a linear layer, you have to know two things.

First, your 28 x 28 pixel image can’t be input as a [28, 28] tensor, because `nn.Linear` will read it as a batch of 28 samples with 28 features each. Since the layer expects an input of [batch_size, num_features], you have to reshape the image.

Second, your batch size passes unchanged through all your layers. No matter how your data changes as it passes through a network, your first dimension will end up being your batch_size, even if you never see that number explicitly written anywhere in your network module’s definition.

Use `view()` to change your tensor’s dimensions.

`image = image.view(batch_size, -1)`

You supply your batch_size as the first number, and then “-1” basically tells PyTorch, “you figure out this other number for me… please.” Your tensor will now feed properly into any linear layer. Now we’re talking!
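
The pitfall with an unreshaped [28, 28] tensor can be seen in a short sketch. The layer sizes here are assumptions for illustration: a `Linear(28, 10)` layer accepts the raw image without error but treats each row as a separate sample, which is rarely what is intended.

```python
import torch
from torch import nn

image = torch.randn(28, 28)

# Unreshaped: read as 28 samples with 28 features each.
row_layer = nn.Linear(28, 10)
per_row_out = row_layer(image)
print(per_row_out.shape)          # torch.Size([28, 10])

# Reshaped: one sample with 784 features.
flat = image.view(1, -1)          # -1 is inferred as 784
full_layer = nn.Linear(784, 10)
out = full_layer(flat)
print(out.shape)                  # torch.Size([1, 10])

# The same call works for a whole batch; -1 is again inferred as 784.
batch = torch.randn(16, 1, 28, 28)
print(batch.view(16, -1).shape)   # torch.Size([16, 784])
```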

So, to initialize the very first argument of your linear layer, pass it the number of features of your input data. For a 28 x 28 image, the reshaped tensor is of size [1, 784] (1 * 28 * 28).

Example: resize with `view()` to fit into a linear layer

```py
import torch

batch_size = 1
# Simulate a 28 x 28 pixel, grayscale "image"
input_image = torch.randn(1, 28, 28)

# Use view() to get [batch_size, num_features].
# -1 calculates the missing value given the other dim.
input_image = input_image.view(batch_size, -1) # torch.Size([1, 784])

# Initialize the linear layer.
fc = torch.nn.Linear(784, 10)

# Pass in the simulated image to the layer.
output = fc(input_image)
print(output.shape)

# torch.Size([1, 10])
```

------------------------------------

This is why a `Linear` layer cannot be placed in a `Sequential` directly after the conv layers: the input to the linear layer first needs to be reshaped. A common pattern is to create two sections: one `Sequential` for all the conv layers (and their pooling, activations, …), whose output is reshaped to a flat tensor of shape batch_size x ?, and a second section of linear layers that receives it. (Alternatively, an `nn.Flatten()` module can perform the reshape inside a single `Sequential`.)
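
A sketch of the two-section layout follows. The class name `TwoStageNet` and all layer sizes are hypothetical, chosen to match a 32 x 32 single-channel input. The last few lines show one alternative available in current PyTorch: placing an `nn.Flatten()` module between the two sections so that everything fits in a single `Sequential`.

```python
import torch
from torch import nn

class TwoStageNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Section 1: conv layers with their activations and pooling.
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Section 2: linear layers that expect a flat [batch_size, 400] input.
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 10),
        )

    def forward(self, x):
        x = self.features(x)        # [B, 16, 5, 5] for a 32 x 32 input
        x = x.view(x.size(0), -1)   # reshape between the two sections
        return self.classifier(x)

net = TwoStageNet()
x = torch.randn(32, 1, 32, 32)
out = net(x)
print(out.shape)  # torch.Size([32, 10])

# Alternative: nn.Flatten() does the reshape as a module.
single = nn.Sequential(net.features, nn.Flatten(), net.classifier)
out_single = single(x)
print(torch.allclose(out, out_single))
```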

-------------------------------

Also note the following:

`Linear(50*50, 64)` is exactly the same as `Linear(2500, 64)`.

Writing the number 2500 as 50*50 does not tell the layer to accept an input of shape [50, 50].
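
This can be demonstrated with a short sketch. The batch size of 8 is arbitrary, and the `RuntimeError` is what PyTorch raises for the shape mismatch; its exact message varies between versions, so it is not reproduced here.

```python
import torch
from torch import nn

fc_a = nn.Linear(50 * 50, 64)
fc_b = nn.Linear(2500, 64)
same_shape = fc_a.weight.shape == fc_b.weight.shape
print(same_shape)  # True: both hold a [64, 2500] weight

x = torch.randn(8, 50, 50)       # last dimension is 50, not 2500

try:
    fc_a(x)                      # the layer only looks at the last dimension
    raised = False
except RuntimeError:
    raised = True
print(raised)  # True

out = fc_a(x.view(8, -1))        # flatten to [8, 2500] first
print(out.shape)  # torch.Size([8, 64])
```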

-------------------------------


### Example from PyTorch https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html

#### While Transitioning from a convolutional layer output to a linear layer input


    class Net(nn.Module):

        def __init__(self):
            super(Net, self).__init__()
            # 1 input image channel, 6 output channels, 5x5 square convolution
            # kernel
            self.conv1 = nn.Conv2d(1, 6, 5)
            self.conv2 = nn.Conv2d(6, 16, 5)
            # an affine operation: y = Wx + b
            self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension; 16*5*5 = 400
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # Max pooling over a (2, 2) window
            x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
            # If the size is a square, you can specify with a single number
            x = F.max_pool2d(F.relu(self.conv2(x)), 2)
            print('4-D Tensor Shape of x BEFORE Linear layer & before reshaping ', x.shape)
            print('Required Shape Conversion of x to a 2-D Tensor for feeding Linear layer ', (x.view(x.shape[0], -1)).shape)
            # x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
            x = x.view(x.shape[0], -1)
            print('Shape of x AFTER re-shaping before feeding to first Linear layer ', x.shape)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x


    net = Net()
    input = torch.randn(32, 1, 32, 32)
    output = net(input)

Output:

    4-D Tensor Shape of x BEFORE Linear layer & before reshaping  torch.Size([32, 16, 5, 5])
    Required Shape Conversion of x to a 2-D Tensor for feeding Linear layer  torch.Size([32, 400])
    Shape of x AFTER re-shaping before feeding to first Linear layer  torch.Size([32, 400])
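
Rather than working out `16 * 5 * 5` by hand, one option is to push a dummy input through the conv part once and read off the flattened size. This is a sketch under the assumption of a 32 x 32 single-channel input, as in the example above; the names `dummy` and `n_flat` are illustrative.

```python
import torch
from torch import nn
import torch.nn.functional as F

conv1 = nn.Conv2d(1, 6, 5)
conv2 = nn.Conv2d(6, 16, 5)

# Spatial size by hand: 32 -> conv 5x5 -> 28 -> pool -> 14 -> conv 5x5 -> 10 -> pool -> 5
with torch.no_grad():
    dummy = torch.zeros(1, 1, 32, 32)            # one fake image of the expected size
    x = F.max_pool2d(F.relu(conv1(dummy)), 2)    # [1, 6, 14, 14]
    x = F.max_pool2d(F.relu(conv2(x)), 2)        # [1, 16, 5, 5]

n_flat = x.view(1, -1).size(1)   # number of features per sample after flattening
fc1 = nn.Linear(n_flat, 120)

print(x.shape)   # torch.Size([1, 16, 5, 5])
print(n_flat)    # 400
```

If the input resolution changes, only the dummy tensor needs updating and `n_flat` follows automatically.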
