<p>
  <b>AI Lab: Deep Learning for Computer Vision</b><br>
  <b><a href="https://www.wqu.edu/">WorldQuant University</a></b>
</p>

<div class="alert alert-success" role="alert">
  <p>
    <center><b>Usage Guidelines</b></center>
  </p>
  <p>
    This file is licensed under <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International</a>.
  </p>
  <p>
    You <b>can</b>:
    <ul>
      <li><span style="color: green">✓</span> Download this file</li>
      <li><span style="color: green">✓</span> Post this file in public repositories</li>
    </ul>
    You <b>must always</b>:
    <ul>
      <li><span style="color: green">✓</span> Give credit to <a href="https://www.wqu.edu/">WorldQuant University</a> for the creation of this file</li>
      <li><span style="color: green">✓</span> Provide a <a href="https://creativecommons.org/licenses/by-nc-nd/4.0/">link to the license</a></li>
    </ul>
    You <b>cannot</b>:
    <ul>
      <li><span style="color: red">✗</span> Create derivatives or adaptations of this file</li>
      <li><span style="color: red">✗</span> Use this file for commercial purposes</li>
    </ul>
  </p>
  <p>
    Failure to follow these guidelines is a violation of your terms of service and could lead to your expulsion from WorldQuant University and the revocation of your certificate.
  </p>
</div>

### Error Caused By Incorrect Input Layer Size

Let's start with imports, then define and run a PyTorch model.

In [1]:
import torch
import torch.nn as nn

In [2]:
# Define a model
model = torch.nn.Sequential()
linear1 = nn.Linear(in_features=3200, out_features=128)
model.append(linear1)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear2 = nn.Linear(in_features=128, out_features=64)
model.append(linear2)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear3 = nn.Linear(in_features=64, out_features=10)
model.append(linear3)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(32, 100))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Linear(in_features=3200, out_features=128, bias=True)
  (1): ReLU()
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=128, out_features=64, bias=True)
  (4): ReLU()
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=64, out_features=10, bias=True)
  (7): ReLU()
  (8): Dropout(p=0.5, inplace=False)
)
Input shape: torch.Size([32, 100])


RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x100 and 3200x128)

The code fails with a `RuntimeError`. When reading a Python stack trace, start at the **bottom** and work your way **up**: the last line names the exception and its message, and the frames above it show the chain of calls that led there. PyTorch traces tend to be longer than ordinary Python ones because the call passes through several library functions. Here the last line reports a `RuntimeError`, which signals a problem during model execution. Scanning **upwards**, we reach the line `output = model(input_tensor)` near the top of the trace. The frames below it are calls inside the PyTorch library; they matter less than the line that was written in the notebook.

The line `output = model(input_tensor)` triggers the error as it feeds the input data to the model for the forward pass. During the forward pass, the model sequentially applies layers and operations to compute output predictions. The failure occurs due to misaligned layer dimensions. The error message provides critical information: `mat1 and mat2 shapes cannot be multiplied (32x100 and 3200x128)`.

At the heart of deep learning is matrix algebra. Matrix algebra has very specific rules for matrix multiplication. One crucial rule is that the number of columns in the first matrix (mat1) must equal the number of rows in the second matrix (mat2). This condition is not met in the current input, where the first matrix has 100 columns and the second matrix has 3200 rows.
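The shapes in the error message can be reproduced directly. The snippet below is a minimal sketch that reuses the sizes from the failing example; the variable names (`layer`, `batch`, `compatible`) are chosen here for illustration only. It shows that `nn.Linear` stores its weight as `(out_features, in_features)` and multiplies the input by the transpose, which is why `mat2` appears as `3200x128`.

```python
import torch
import torch.nn as nn

# Same sizes as the failing example above.
layer = nn.Linear(in_features=3200, out_features=128)
batch = torch.randn(32, 100)

# nn.Linear stores its weight as (out_features, in_features) ...
print(layer.weight.shape)

# ... and multiplies the input by the transpose, so mat2 is (3200, 128).
mat1_shape = tuple(batch.shape)
mat2_shape = tuple(layer.weight.T.shape)

# The product is defined only if columns of mat1 == rows of mat2.
compatible = mat1_shape[1] == mat2_shape[0]
print(mat1_shape, mat2_shape, compatible)
```

Here `compatible` is `False` because 100 does not equal 3200, which is exactly the condition the `RuntimeError` reports.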

This error commonly stems from incorrectly specified layer dimensions, particularly in the input layer. The value passed to the `in_features` argument must equal the number of features in each input sample, that is, the size of the last dimension of the input tensor. Here each sample has 100 features, but the first layer expects 3200.

To resolve this issue, let's carefully review the code to identify the dimensional inconsistency. Printing the model and the shape of the data can help with debugging. This error can be fixed by setting `in_features=100` in the first layer so that it matches the size of the input tensor. Let's define another version of the model with that change.

In [3]:
# Define a revised model
model = torch.nn.Sequential()
linear1 = nn.Linear(
    in_features=100, out_features=128
)  # This line was changed to match input size
model.append(linear1)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear2 = nn.Linear(in_features=128, out_features=64)
model.append(linear2)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear3 = nn.Linear(in_features=64, out_features=10)
model.append(linear3)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(32, 100))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Linear(in_features=100, out_features=128, bias=True)
  (1): ReLU()
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=128, out_features=64, bias=True)
  (4): ReLU()
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=64, out_features=10, bias=True)
  (7): ReLU()
  (8): Dropout(p=0.5, inplace=False)
)
Input shape: torch.Size([32, 100])
Output shape: torch.Size([32, 10])


The code now runs because the model can make a successful forward pass.

Now is your chance to try to solve a similar error.

**Task 2.1.1:** Modify the model to match the input tensor size.

In [18]:
# Define a revised model
model = torch.nn.Sequential()
linear1 = nn.Linear(in_features=10, out_features=64)
model.append(linear1)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear2 = nn.Linear(in_features=64, out_features=32)
model.append(linear2)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
linear3 = nn.Linear(in_features=32, out_features=1)
model.append(linear3)
model.append(torch.nn.ReLU())
model.append(torch.nn.Dropout())
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(32, 10))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Linear(in_features=10, out_features=64, bias=True)
  (1): ReLU()
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=64, out_features=32, bias=True)
  (4): ReLU()
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=32, out_features=1, bias=True)
  (7): ReLU()
  (8): Dropout(p=0.5, inplace=False)
)
Input shape: torch.Size([32, 10])
Output shape: torch.Size([32, 1])


### Error Caused By Adding The Same Layer Twice

Let's build another PyTorch model, this time a Convolutional Neural Network (CNN).

In [19]:
# Define a model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
model.append(conv1)  # Add the same layer again
model.append(torch.nn.ReLU())
model.append(max_pool1)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Input shape: torch.Size([1, 3, 224, 224])


RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[1, 16, 112, 112] to have 3 channels, but got 16 channels instead

When the cell is executed, it also raises a `RuntimeError`. The message "Given groups=1, weight of size [16, 3, 3, 3], expected input[1, 16, 112, 112] to have 3 channels, but got 16 channels instead" shows that the dimensions do not line up: a layer whose weights expect 3 input channels received a tensor with 16 channels. This happened because the same layer, `conv1`, was added to the model twice. The first pass through `conv1` produces 16 channels, so the second pass no longer receives the 3 channels it expects. Mistakes like this are typically caused by copy-and-paste.

To resolve this issue, you'll need to carefully review your code and identify where this dimensional inconsistency occurs. Pay particular attention to the layer where you might have accidentally duplicated a component, leading to unexpected channel dimensions.
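One way to spot an accidental duplicate is to check whether two positions in the model hold the very same object. The sketch below is illustrative only: it rebuilds a shortened version of the failing model, and the names `same_object`, `positions`, and `repeated` are introduced here rather than taken from the lesson.

```python
import torch.nn as nn

conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
model = nn.Sequential(conv1, nn.ReLU(), nn.MaxPool2d(2, 2), conv1, nn.ReLU())

# Positions 0 and 3 hold the very same object, not two similar layers.
same_object = model[0] is model[3]
print(same_object)

# Collect the positions at which each module object appears.
positions = {}
for index, layer in enumerate(model):
    positions.setdefault(id(layer), []).append(index)
repeated = [found for found in positions.values() if len(found) > 1]
print(repeated)
```

Reusing a layer without weights, such as `MaxPool2d`, is harmless. A repeated `Conv2d` or `Linear` layer deserves a second look, because its input size is fixed when it is created.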

The code can be fixed by adding a different layer that has the appropriate dimensions.

In [9]:
# Define a revised model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
# Define and add a new layer instead of the same previous one
conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
model.append(conv2)
model.append(torch.nn.ReLU())
max_pool2 = nn.MaxPool2d(2, 2)
model.append(max_pool2)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 32, 56, 56])


Now try to fix a similar mistake yourself, using the traceback information to find the mismatched dimensions.

**Task 2.1.2:** Fix a `RuntimeError` by not adding the same layer twice.

In [20]:
# Define a model
model = torch.nn.Sequential()
conv1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=2, padding=1)
model.append(conv1)
model.append(torch.nn.ReLU())
max_pool1 = nn.MaxPool2d(2, 2)
model.append(max_pool1)
conv2 = nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, padding=1)
model.append(conv2)
model.append(torch.nn.ReLU())
max_pool2 = nn.MaxPool2d(2, 2)
model.append(max_pool2)
conv3 = nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3, padding=1)
model.append(conv3)
model.append(torch.nn.ReLU())
max_pool3 = nn.MaxPool2d(2, 2)
model.append(max_pool3)
conv4 = nn.Conv2d(in_channels=8, out_channels=1, kernel_size=2, padding=1)

model.append(conv4)
model.append(torch.nn.ReLU())
max_pool4 = nn.MaxPool2d(2, 2)
model.append(max_pool4)
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 1, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(1, 8, kernel_size=(2, 2), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(16, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU()
  (8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (9): Conv2d(8, 1, kernel_size=(2, 2), stride=(1, 1), padding=(1, 1))
  (10): ReLU()
  (11): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Input shape: torch.Size([1, 1, 224, 224])
Output shape: torch.Size([1, 1, 14, 14])


### Error Caused By Forgetting to Flatten 

Let's construct another model with multiple convolutional layers (Conv2d) followed by several fully connected layers (Linear). 

In [22]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Linear(in_features=32 * 56 * 56, out_features=128))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=128, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Linear(in_features=100352, out_features=128, bias=True)
  (7): ReLU()
  (8): Linear(in_features=128, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])


RuntimeError: mat1 and mat2 shapes cannot be multiplied (1792x56 and 100352x128)

Upon execution, this model generates a `RuntimeError`. Let's analyze this error message as we did in the previous example. Concentrate on the matrix dimension mismatch: "mat1 and mat2 shapes cannot be multiplied (1792x56 and 100352x128)".

The error stems from feeding the output of the max pooling layer (`MaxPool2d`) directly into the fully connected layers (`Linear`). This fails because `Conv2d` and `MaxPool2d` layers produce a 4D tensor (`batch_size`, `channels`, `height`, `width`), while the `Linear` layers here expect a 2D tensor (`batch_size`, `features`). A `Linear` layer operates on the last dimension only, so PyTorch treats the 4D tensor of shape (1, 32, 56, 56) as 1 × 32 × 56 = 1792 rows of 56 features each, which is where `1792x56` in the message comes from.

To resolve this issue, the tensor needs to be flattened before it enters the fully connected layers. This flattening step transforms the 4D output from the convolutional layers into a 2D tensor that the Linear layers can process. Without this crucial step, the dimensions of the data flowing through your model become incompatible, leading to the observed error.
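The effect of flattening can be seen on a stand-in tensor. The following is a small sketch, assuming a random tensor with the same shape as the output of the second `MaxPool2d` layer; the names `pooled`, `rows_seen_by_linear`, and `flat` are illustrative.

```python
import torch
import torch.nn as nn

# Stand-in for the output of the second MaxPool2d layer above.
pooled = torch.randn(1, 32, 56, 56)

# Without flattening, Linear treats only the last dimension as features
# and folds all leading dimensions into rows: 1 * 32 * 56 = 1792 rows of 56.
rows_seen_by_linear = pooled.reshape(-1, pooled.shape[-1])
print(rows_seen_by_linear.shape)  # torch.Size([1792, 56])

# Flatten keeps the batch dimension and merges the rest: 32 * 56 * 56 = 100352.
flat = nn.Flatten()(pooled)
print(flat.shape)  # torch.Size([1, 100352])
```

The first shape matches `mat1` in the error message, and the second matches the `in_features` value that the first `Linear` layer expects.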

Here is the corrected code that flattens the tensor before passing it to the fully connected layers.

In [23]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())  # Flatten the tensor before passing to Linear layers
model.append(nn.Linear(in_features=32 * 56 * 56, out_features=128))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=128, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Flatten(start_dim=1, end_dim=-1)
  (7): Linear(in_features=100352, out_features=128, bias=True)
  (8): ReLU()
  (9): Linear(in_features=128, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 10])


Have a try yourself.

**Task 2.1.3:** Fix a `RuntimeError` caused by forgetting to flatten.

In [24]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Flatten())  # ✅ FIX: Flatten before Linear
model.append(nn.Linear(in_features=64 * 28 * 28, out_features=256))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=256, out_features=64))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=64, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU()
  (8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (9): Flatten(start_dim=1, end_dim=-1)
  (10): Linear(in_features=50176, out_features=256, bias=True)
  (11): ReLU()
  (12): Linear(in_features=256, out_features=64, bias=True)
  (13): ReLU()
  (14): Linear(in_features=64, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 10])


### Error Caused By Incorrect Layer Size After Flattening

Let's look at another example of a convolutional layer followed by a linear layer.

In [26]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())
model.append(nn.Linear(in_features=16 * 224 * 224, out_features=32))
model.append(nn.ReLU())
model.append(nn.Linear(in_features=32, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Flatten(start_dim=1, end_dim=-1)
  (4): Linear(in_features=802816, out_features=32, bias=True)
  (5): ReLU()
  (6): Linear(in_features=32, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])


RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x200704 and 802816x32)

This model generates a `RuntimeError` when attempting to execute the cell due to a matrix dimension mismatch: "mat1 and mat2 shapes cannot be multiplied (1x200704 and 802816x32)." 

The issue arises from incorrectly specifying the size of the linear layer that follows the flattening step. The code specifies `model.append(nn.Linear(in_features=16 * 224 * 224, out_features=32))`, but the `in_features` value is incorrect. While 16 is the correct number of channels, taken from the output of the `Conv2d` layer, the spatial dimensions are 112x112 rather than 224x224, because the `MaxPool2d(2, 2)` layer halves both the height and the width. The correct value for `in_features` is therefore 16 * 112 * 112 = 200704, which matches the `1x200704` in the error message.
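The spatial size after each layer can also be worked out by hand. The helper below is a sketch, not part of PyTorch: `output_size` is a hypothetical function that applies the standard output-size formula for `Conv2d` and `MaxPool2d`, assuming `dilation=1` and `ceil_mode=False` as in the models above.

```python
def output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a Conv2d or MaxPool2d layer, assuming dilation=1."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 224
size = output_size(size, kernel_size=3, padding=1)  # Conv2d keeps 224
size = output_size(size, kernel_size=2, stride=2)   # MaxPool2d(2, 2) halves it
in_features = 16 * size * size
print(size, in_features)  # 112 200704
```

Applying the same helper layer by layer to a deeper network gives the value needed for the first `Linear` layer without running the model.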

Let's update the model with this change and try running it again.

In [27]:
# Define a model
model = torch.nn.Sequential()
model.append(nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1))
model.append(torch.nn.ReLU())
model.append(nn.MaxPool2d(2, 2))
model.append(torch.nn.Flatten())
model.append(
    nn.Linear(in_features=16 * 112 * 112, out_features=32)
)  # in_features changed to match the flattened output size (16 * 112 * 112)
model.append(nn.ReLU())
model.append(nn.Linear(in_features=32, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (3): Flatten(start_dim=1, end_dim=-1)
  (4): Linear(in_features=200704, out_features=32, bias=True)
  (5): ReLU()
  (6): Linear(in_features=32, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 10])


After making the change to the input dimensions of the linear layer, the model can be successfully executed.
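An alternative to computing the size by hand is to measure it. The sketch below assumes the model is split into two hypothetical parts, `features` and `classifier`, which is a choice made here for illustration rather than the structure used in the lesson. A single dummy image is passed through the convolutional part to find how many features the first `Linear` layer will receive.

```python
import torch
import torch.nn as nn

features = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Flatten(),
)

# Pass one dummy image through the feature layers to measure their output.
with torch.no_grad():
    n_features = features(torch.zeros(1, 3, 224, 224)).shape[1]
print(n_features)

classifier = nn.Sequential(
    nn.Linear(in_features=n_features, out_features=32),
    nn.ReLU(),
    nn.Linear(in_features=32, out_features=10),
)
model = nn.Sequential(features, classifier)
output = model(torch.randn(1, 3, 224, 224))
print(output.shape)
```

The measured value depends on the input size, so the dummy image should have the same height and width as the real data.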

Your turn to debug a similar issue.

**Task 2.1.4:** Fix a `RuntimeError` caused by incorrect dimensions after flattening.

In [28]:
# Define a model
model = torch.nn.Sequential()
model.append(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3)
)
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
model.append(nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1))
model.append(nn.ReLU())
model.append(nn.MaxPool2d(kernel_size=2, stride=2))
model.append(nn.Flatten())

model.append(nn.Linear(in_features=256 * 14 * 14, out_features=1000))

model.append(nn.ReLU())
model.append(nn.Linear(in_features=1000, out_features=10))
print(model)

# Create a random input tensor
input_tensor = torch.randn(size=(1, 3, 224, 224))
print("Input shape:", input_tensor.shape)

# Run the model
output = model(input_tensor)
print("Output shape:", output.shape)

Sequential(
  (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
  (1): ReLU()
  (2): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (4): ReLU()
  (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (6): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): ReLU()
  (8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (9): Flatten(start_dim=1, end_dim=-1)
  (10): Linear(in_features=50176, out_features=1000, bias=True)
  (11): ReLU()
  (12): Linear(in_features=1000, out_features=10, bias=True)
)
Input shape: torch.Size([1, 3, 224, 224])
Output shape: torch.Size([1, 10])


In summary, PyTorch error messages are helpful, although they are often lengthy. Read them thoroughly, focusing on the parts that relate directly to your own code, since they usually contain hints for resolving the issue. Many of these errors stem from layer sizes that do not match the data flowing into them. Printing the model and the shape of the data can help locate the layer whose size is defined incorrectly.
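The advice to print sizes can be taken one step further by printing the shape after every layer. The following is a minimal sketch: `trace_shapes` is a hypothetical helper written for this example, and it assumes the model is an `nn.Sequential` whose layers can be applied one after another.

```python
import torch
import torch.nn as nn


def trace_shapes(model, x):
    """Run x through each layer in turn and report the shape after each one.

    Returns the index of the first layer that raises a RuntimeError,
    or None if every layer succeeds.
    """
    for index, layer in enumerate(model):
        try:
            x = layer(x)
        except RuntimeError as error:
            print(f"({index}) {layer.__class__.__name__} failed: {error}")
            return index
        print(f"({index}) {layer.__class__.__name__}: {tuple(x.shape)}")
    return None


broken = nn.Sequential(
    nn.Linear(in_features=3200, out_features=128),
    nn.ReLU(),
    nn.Linear(in_features=128, out_features=10),
)
failed_at = trace_shapes(broken, torch.randn(32, 100))
```

For this model the helper stops at layer 0, the same layer that the first example in this lesson identified from the error message.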

---
This file &#169; 2024 by [WorldQuant University](https://www.wqu.edu/) is licensed under [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/).