
## Common Errors In PyTorch Sequential Models


PyTorch's `nn.Sequential` offers a streamlined method for constructing neural networks. This sequential approach is particularly effective for crafting straightforward architectures where each layer directly feeds into the subsequent one. However, when building these models, it's crucial to exercise caution, as inadvertent layer sizing issues can result in runtime errors or unexpected model behavior.

We'll consider the following common error scenarios when constructing a PyTorch Sequential model:

* Incorrect layer sizing, both in the input layer and after flattening a convolutional layer
* Accidental layer duplication
* Forgetting to flatten between convolutional and linear layers

These scenarios show why layer dimensions need careful attention and verification when building neural network architectures.

### Error Caused By Incorrect Input Layer Size

Let's start with imports, then define and run a PyTorch model.

In [None]:
import torch
import torch.nn as nn

In [None]:
# Define a model
model = torch.nn.Sequential()
linear1 = nn.Linear(in_features=3200, out_features=128)
model.append(linear1)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear2 = nn.Linear(in_features=128, out_features=64)
model.append(linear2)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear3 = nn.Linear(in_features=64, out_features=10)
model.append(linear3)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(32, 100))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)


# output
Sequential(
  (0): Linear(in_features=3200, out_features=128, bias=True)
  (1): ReLU()
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=128, out_features=64, bias=True)
  (4): ReLU()
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=64, out_features=10, bias=True)
  (7): ReLU()
  (8): Dropout(p=0.5, inplace=False)
)
Input shape: torch.Size([32, 100])

# Error Message
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[4], line 22
     19 print("Input shape:", input_tensor.shape)
     21 # Run the model
---> 22 output = model(input_tensor)
     23 print("Output shape:", output.shape)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/linear.py:116, in Linear.forward(self, input)
    115 def forward(self, input: Tensor) -> Tensor:
--> 116     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x100 and 3200x128)

The code execution fails with a `RuntimeError`. When interpreting a Python stack trace, begin at the **bottom** and work your way **up**. The last line names the exception type and its message; here it is a `RuntimeError`, signaling a problem during model execution. PyTorch tracebacks tend to be long because each layer call passes through several library functions. Those intermediate frames, which point into files under `torch/nn/modules`, are rarely the most important part of the message. Continuing **upwards**, the frame at the top points to the notebook cell and the line `output = model(input_tensor)`. The key is to focus on the code that was written in the notebook.

The line `output = model(input_tensor)` triggers the error as it feeds the input data to the model for the forward pass. During the forward pass, the model sequentially applies layers and operations to compute output predictions. The failure occurs due to misaligned layer dimensions. The error message provides critical information: `mat1 and mat2 shapes cannot be multiplied (32x100 and 3200x128)`.

At the heart of deep learning is matrix algebra, which has specific rules for matrix multiplication. One crucial rule is that the number of columns in the first matrix (mat1) must equal the number of rows in the second matrix (mat2). That condition is not met here: mat1, the input batch, has 100 columns, while mat2, built from the first layer's weights, has 3200 rows.
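The rule can be checked directly on plain tensors. The following is a minimal sketch, not part of the original notebook; the names `batch`, `good_weights`, and `bad_weights` are illustrative, and the shapes mirror the ones in the error message.

```python
import torch

batch = torch.randn(32, 100)           # mat1: 32 rows, 100 columns
good_weights = torch.randn(100, 128)   # mat2 with 100 rows: compatible with mat1
bad_weights = torch.randn(3200, 128)   # mat2 with 3200 rows: incompatible with mat1

# Columns of mat1 (100) equal rows of mat2 (100), so the product is defined
result = batch @ good_weights
print(result.shape)  # torch.Size([32, 128])

# Columns of mat1 (100) differ from rows of mat2 (3200), so PyTorch raises
try:
    batch @ bad_weights
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)
```

The result has as many rows as mat1 and as many columns as mat2, which is why a batch of 32 samples stays a batch of 32 after each `Linear` layer.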

This error commonly stems from incorrectly specifying layer dimensions, particularly in the input layer. The value passed to the `in_features` argument must match the number of features in each input sample, that is, the size of the input's last dimension. Dimensional mismatches often lead to runtime errors or unexpected model behavior.

To resolve this issue, review the code to find the dimensional inconsistency. Printing the model and the shape of the data helps with debugging. Here, the fix is to set `in_features=100` in the first layer so that it matches the size of the input tensor. Let's define another version of the model with that change.

In [None]:
# Define a revised model
model = torch.nn.Sequential()
linear1 = nn.Linear(
    in_features=100, out_features=128
)  # This line was changed to match input size
model.append(linear1)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear2 = nn.Linear(in_features=128, out_features=64)
model.append(linear2)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear3 = nn.Linear(in_features=64, out_features=10)
model.append(linear3)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(32, 100))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Linear(in_features=100, out_features=128, bias=True)
  (1): ReLU()
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=128, out_features=64, bias=True)
  (4): ReLU()
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=64, out_features=10, bias=True)
  (7): ReLU()
  (8): Dropout(p=0.5, inplace=False)
)
Input shape: torch.Size([32, 100])
Output shape: torch.Size([32, 10])
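
Printing shapes, as suggested above, can be taken one step further by recording the output shape after every layer. The helper below is a sketch under assumptions: `trace_shapes` is a hypothetical name, not a PyTorch function, and it relies only on the fact that an `nn.Sequential` can be iterated layer by layer. The smaller three-layer model is used purely for illustration.

```python
import torch
import torch.nn as nn


def trace_shapes(model, input_tensor):
    """Pass the input through each layer in turn and record the output shapes.

    Stops at the first layer that raises a RuntimeError and records the message.
    """
    shapes = [("input", tuple(input_tensor.shape))]
    x = input_tensor
    with torch.no_grad():  # gradients are not needed for a shape check
        for index, layer in enumerate(model):
            label = f"{index}: {layer.__class__.__name__}"
            try:
                x = layer(x)
            except RuntimeError as err:
                shapes.append((label, f"FAILED: {err}"))
                break
            shapes.append((label, tuple(x.shape)))
    return shapes


model = nn.Sequential(
    nn.Linear(in_features=100, out_features=128),
    nn.ReLU(),
    nn.Linear(in_features=128, out_features=10),
)
shapes = trace_shapes(model, torch.randn(32, 100))
for label, shape in shapes:
    print(label, shape)

# A model with the wrong input size fails at layer 0
bad_model = nn.Sequential(nn.Linear(in_features=3200, out_features=128))
bad_shapes = trace_shapes(bad_model, torch.randn(32, 100))
print(bad_shapes[-1])
```

Because the trace stops at the first failing layer, the last entry shows which layer to inspect.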

### Error Caused By Adding The Same Layer Twice

Let's build another PyTorch model, this time a Convolutional Neural Network (CNN).

In [None]:
# Define a model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
model.append(conv1)  # Add the same layer again
model.append(torch.nn.ReLU())
model.append(max_pool1)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Input shape: torch.Size([1, 3, 224, 224])

# Error Message
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[9], line 18
     15 print("Input shape:", input_tensor.shape)
     17 # Run the model
---> 18 output = model(input_tensor)
     19 print("Output shape:", output.shape)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/conv.py:460, in Conv2d.forward(self, input)
    459 def forward(self, input: Tensor) -> Tensor:
--> 460     return self._conv_forward(input, self.weight, self.bias)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/conv.py:456, in Conv2d._conv_forward(self, input, weight, bias)
    452 if self.padding_mode != 'zeros':
    453     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    454                     weight, bias, self.stride,
    455                     _pair(0), self.dilation, self.groups)
--> 456 return F.conv2d(input, weight, bias, self.stride,
    457                 self.padding, self.dilation, self.groups)

RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[1, 16, 112, 112] to have 3 channels, but got 16 channels instead

Executing this cell also raises a `RuntimeError`. The message "Given groups=1, weight of size [16, 3, 3, 3], expected input[1, 16, 112, 112] to have 3 channels, but got 16 channels instead" is a clue that the dimensions do not line up. The first pass through `conv1` produces 16 channels, but the second copy of `conv1` still expects 3 input channels. The cause is that the same layer was accidentally added to the model twice, which is typically a copy-and-paste mistake.

To resolve this issue, you'll need to carefully review your code and identify where this dimensional inconsistency occurs. Pay particular attention to the layer where you might have accidentally duplicated a component, leading to unexpected channel dimensions.
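
One way to spot an accidental duplicate is to look for the same layer object at more than one position. The sketch below assumes a hypothetical helper, `find_repeated_layers`, which is not part of PyTorch. It skips layers without parameters, since reusing a `ReLU` or `MaxPool2d` object is harmless, and it will also flag deliberate weight sharing, so treat its output as a hint rather than a verdict.

```python
import torch.nn as nn


def find_repeated_layers(model):
    """Return (first_index, repeated_index) pairs for layers with parameters
    that appear more than once in a Sequential model."""
    first_seen = {}
    repeats = []
    for index, layer in enumerate(model):
        has_parameters = next(layer.parameters(), None) is not None
        if not has_parameters:
            continue  # stateless layers such as ReLU or MaxPool2d can be shared
        if id(layer) in first_seen:
            repeats.append((first_seen[id(layer)], index))
        else:
            first_seen[id(layer)] = index
    return repeats


# Rebuild the model from the example above, including the duplicated conv1
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
max_pool1 = nn.MaxPool2d(2, 2)
model = nn.Sequential()
model.append(conv1)
model.append(nn.ReLU())
model.append(max_pool1)
model.append(conv1)  # same object as position 0
model.append(nn.ReLU())
model.append(max_pool1)

repeats = find_repeated_layers(model)
print(repeats)  # [(0, 3)]
```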

The code can be fixed by adding a different layer that has the appropriate dimensions.

In [None]:
# Define a revised model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
# Define and add a new layer instead of the same previous one
conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
model.append(conv2)
model.append(torch.nn.ReLU())
max_pool2 = nn.MaxPool2d(2, 2)
model.append(max_pool2)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 32, 56, 56])

**Task 2.1.2:** Fix a `RuntimeError` by not adding the same layer twice.

In [None]:
# Define a model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=2, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, padding=1)
model.append(conv2)
model.append(torch.nn.ReLU())
max_pool2 = nn.MaxPool2d(2, 2)
model.append(max_pool2)
conv3 = nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3, padding=1)
model.append(conv3)
model.append(torch.nn.ReLU())
max_pool3 = nn.MaxPool2d(2, 2)
model.append(max_pool3)
conv4 = nn.Conv2d(in_channels=8, out_channels=1, kernel_size=2, padding=1)

model.append(conv4)  # Fixed: append conv4 here instead of appending conv3 a second time
model.append(torch.nn.ReLU())
max_pool4 = nn.MaxPool2d(2, 2)
model.append(max_pool4)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 1, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)


### Error Caused By Forgetting to Flatten

Let's construct another model with multiple convolutional layers (Conv2d) followed by several fully connected layers (Linear).

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Linear(in_features=32 * 56 * 56, out_features=128))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=128, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Linear(in_features=100352, out_features=128, bias=True)
  (7): ReLU()
  (8): Linear(in_features=128, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])

# Error
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[14], line 19
     16 print("Input shape:", input_tensor.shape)
     18 # Run the model
---> 19 output = model(input_tensor)
     20 print("Output shape:", output.shape)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/linear.py:116, in Linear.forward(self, input)
    115 def forward(self, input: Tensor) -> Tensor:
--> 116     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1792x56 and 100352x128)

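Upon execution, this model generates a `RuntimeError`. Let's analyze this error message as we did in the previous example. Concentrate on the matrix dimension mismatch: "mat1 and mat2 shapes cannot be multiplied (1792x56 and 100352x128)". The `Linear` layer treats the last dimension of its input (56) as the feature dimension and collapses the remaining dimensions into rows (1 * 32 * 56 = 1792), which is why mat1 is reported as 1792x56.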

The error in the code stems from attempting to feed the output of the max pooling layer (`MaxPool2d`) directly into fully connected layers (`Linear`). This fails because convolutional and pooling layers produce a 4D tensor (`batch_size`, `channels`, `height`, `width`), while this `Linear` layer needs all 32 * 56 * 56 features in the last dimension of its input, as in a 2D tensor (`batch_size`, `features`).

To resolve this issue, the tensor needs to be flattened before it enters the fully connected layers. This flattening step transforms the 4D output from the convolutional layers into a 2D tensor that the Linear layers can process. Without this crucial step, the dimensions of the data flowing through your model become incompatible, leading to the observed error.
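
The effect of flattening can be seen on a standalone tensor. This is a small sketch with illustrative variable names; `pooled` stands in for the output of the second `MaxPool2d` layer in the model above.

```python
import torch
import torch.nn as nn

pooled = torch.randn(1, 32, 56, 56)  # (batch_size, channels, height, width)

# Flatten keeps the batch dimension and merges the remaining three
flattened = nn.Flatten()(pooled)
print(flattened.shape)  # torch.Size([1, 100352])

# Without Flatten, Linear sees only the last dimension as features,
# so the 4D tensor behaves like 1 * 32 * 56 rows of 56 values each
as_rows = pooled.reshape(-1, pooled.shape[-1])
print(as_rows.shape)  # torch.Size([1792, 56])
```

The second shape matches the `1792x56` reported in the error message, and the first matches the `in_features=32 * 56 * 56` that the `Linear` layer was built for.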

Here is the corrected code that flattens the tensor before passing it to the fully connected layers.

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())  # Flatten the tensor before passing to Linear layers
model.append(nn.Linear(in_features=32 * 56 * 56, out_features=128))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=128, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

**Task 2.1.3:** Fix a `RuntimeError` caused by forgetting to flatten.

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))

model.append(torch.nn.Flatten())  # Added: flatten the tensor before the Linear layers
model.append(nn.Linear(in_features=64 * 28 * 28, out_features=256))  # Only works after Flatten

model.append(nn.ReLU())
model.append(nn.Linear(in_features=256, out_features=64))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=64, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

### Error Caused By Incorrect Layer Size After Flattening

Let's look at another example of a convolutional layer followed by a linear layer.

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())
model.append(nn.Linear(in_features=16 * 224 * 224, out_features=32))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=32, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Flatten(start_dim=1, end_dim=-1)
  (4): Linear(in_features=802816, out_features=32, bias=True)
  (5): ReLU()
  (6): Linear(in_features=32, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])

# Error output
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[19], line 17
     14 print("Input shape:", input_tensor.shape)
     16 # Run the model
---> 17 output = model(input_tensor)
     18 print("Output shape:", output.shape)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File /usr/local/lib/python3.11/site-packages/torch/nn/modules/linear.py:116, in Linear.forward(self, input)
    115 def forward(self, input: Tensor) -> Tensor:
--> 116     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x200704 and 802816x32)

This model generates a `RuntimeError` when attempting to execute the cell due to a matrix dimension mismatch: "mat1 and mat2 shapes cannot be multiplied (1x200704 and 802816x32)."

The issue arises from incorrectly specifying the size of the linear layer after flattening from a convolutional layer. The code specifies `model.append(nn.Linear(in_features=16 * 224 * 224, out_features=32))`, but the `in_features` value is incorrect. While 16 is the correct number based on the output channels from the `Conv2d` layer, the spatial dimensions should be 112x112 (half of 224x224 due to the `MaxPool2d(2, 2)` layer). The correct value for `in_features` is 16 * 112 * 112 = 200704, which matches the `1x200704` in the error message.
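
The spatial size after each layer can also be worked out by hand. The sketch below uses the standard output-size formula for `Conv2d` and `MaxPool2d`, restricted to dilation 1 and `ceil_mode=False` (the defaults used in this notebook); `conv_output_size` is a hypothetical helper name, not a PyTorch function.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dimension for Conv2d or MaxPool2d,
    assuming dilation=1 and ceil_mode=False."""
    return (size + 2 * padding - kernel_size) // stride + 1


# Conv2d(kernel_size=3, padding=1) keeps 224 unchanged
after_conv = conv_output_size(224, kernel_size=3, padding=1)
# MaxPool2d(2, 2) halves it
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)
# Flattened feature count: channels * height * width
in_features = 16 * after_pool * after_pool

print(after_conv, after_pool, in_features)  # 224 112 200704
```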

Let's update the model with this change and try running it again.

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())
model.append(
    nn.Linear(in_features=16 * 112 * 112, out_features=32)
)  # in_features changed to the flattened size: 16 channels * 112 * 112
model.append(nn.ReLU())
model.append(nn.Linear(in_features=32, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

# output
Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Flatten(start_dim=1, end_dim=-1)
  (4): Linear(in_features=200704, out_features=32, bias=True)
  (5): ReLU()
  (6): Linear(in_features=32, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 10])
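
As an alternative to computing the size by hand, the flattened size can be measured by passing a dummy input through the layers up to and including `Flatten`. This is a sketch of one possible workflow, not the notebook's own method; it assumes the input size (3 x 224 x 224) is known in advance, and `n_features` is an illustrative name.

```python
import torch
import torch.nn as nn

# Build the convolutional part first, ending with Flatten
model = nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Flatten())

# Measure the flattened size with a dummy input of the expected shape
with torch.no_grad():
    n_features = model(torch.zeros(1, 3, 224, 224)).shape[1]
print(n_features)  # 200704

# Use the measured size for the first Linear layer
model.append(nn.Linear(in_features=n_features, out_features=32))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=32, out_features=10))

output = model(torch.randn(1, 3, 224, 224))
print("Output shape:", output.shape)  # Output shape: torch.Size([1, 10])
```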

**Task 2.1.4:** Fix a `RuntimeError` caused by incorrect dimensions after flattening.

In [None]:
# Define a model
model = torch.nn.Sequential()
model.append(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3)
)
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
model.append(nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Flatten())

# Incorrect: 128 * 7 * 7 does not match the flattened size
# model.append(nn.Linear(in_features=128 * 7 * 7, out_features=1000))
# Correct: 256 channels * 14 * 14 after the last pooling layer
model.append(nn.Linear(in_features=256 * 14 * 14, out_features=1000))


model.append(nn.ReLU())
model.append(nn.Linear(in_features=1000, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)