# Skip Connection Challenge

Skip connections make very deep neural networks trainable by giving gradients an uninterrupted path through the network. The objective of this question is to study the issues that arise when adding a skip connection to a CNN.
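
As a rough illustration of the gradient-flow claim, the sketch below builds a tiny residual block `y = x + F(x)` and zeroes the convolution inside `F`, so that any gradient reaching the input must have travelled through the identity path. The class name `TinyResidualBlock` and the tensor sizes are assumptions made for this example and are not part of the assignment.

```python
import torch
import torch.nn as nn


class TinyResidualBlock(nn.Module):
    """Hypothetical block computing y = x + relu(conv(x))."""

    def __init__(self, channels):
        super().__init__()
        # padding=1 with a 3x3 kernel keeps height and width unchanged,
        # so the output of the conv can be added to x directly.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return x + self.relu(self.conv(x))


block = TinyResidualBlock(channels=4)

# Zero the conv so the learned branch contributes nothing to the output.
nn.init.zeros_(block.conv.weight)
nn.init.zeros_(block.conv.bias)

x = torch.randn(2, 4, 8, 8, requires_grad=True)
y = block(x)
y.sum().backward()

# The gradient still reaches x, carried entirely by the identity path.
print(torch.equal(x.grad, torch.ones_like(x)))  # True
```

Without the `x +` term, the same zeroed block would pass no gradient back to `x` at all.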

In [1]:
# Import necessary libraries
import torch
import torch.nn as nn


# For reproducibility
torch.manual_seed(42)


<torch._C.Generator at 0x1212096f0>

# Q3a: Implementation of Skip Connection



Consider the basic CNN structure shown in the code block below. The objective is to implement a skip connection from the output of Layer 1 to the output of Layer 3, so that Layer 1's output (a1) is added to Layer 3's output (a3). The difficulty is a shape mismatch: a1 and a3 differ in both channel count and spatial size.
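
The mismatch can be reproduced with plain tensors. This is a minimal sketch that assumes a batch size of 1 and uses the shapes this architecture produces for a1 and a3, written in PyTorch's (B, C, H, W) layout; the flag `added` is a name introduced only for this example.

```python
import torch

# Stand-ins for the two activations, in (B, C, H, W) layout.
a1 = torch.randn(1, 16, 16, 16)  # output of Layer 1
a3 = torch.randn(1, 64, 4, 4)    # output of Layer 3

try:
    _ = a3 + a1
    added = True
except RuntimeError as err:
    # Broadcasting cannot reconcile 16 vs 64 channels or 16x16 vs 4x4 maps.
    added = False
    print(type(err).__name__)

print(added)  # False
```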

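First answer the following questions. Assume that the batch size is B and that CIFAR-10 images are RGB with a height and width of 32 pixels each. The shapes below are given per sample as height x width x channels; in PyTorch's layout the corresponding tensor is (B, channels, height, width).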

- **Shape of tensor a1_**: 32x32x16


- **Shape of tensor a1**: 16x16x16



- **Shape of tensor a2_**: 16x16x32



- **Shape of tensor a2**: 8x8x32


- **Shape of tensor a3_**: 8x8x64



- **Shape of tensor a3**: 4x4x64
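
The shapes listed above can be checked with the standard output-size formula for convolution and pooling, `floor((size + 2*padding - kernel) / stride) + 1`. The sketch below applies the formula layer by layer and then compares the result against real PyTorch layers; the helper `conv_out` and the batch size of 2 are assumptions made for this example.

```python
import torch
import torch.nn as nn


def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Predict (H, W, C) before and after pooling for each of the three layers.
h = 32
shapes = []
for out_channels in (16, 32, 64):
    h_conv = conv_out(h, kernel=3, stride=1, padding=1)  # 3x3 conv, padding 1
    h_pool = conv_out(h_conv, kernel=2, stride=2)        # 2x2 max pool
    shapes.append(((h_conv, h_conv, out_channels), (h_pool, h_pool, out_channels)))
    h = h_pool

for pre, post in shapes:
    print(pre, post)

# Cross-check the predictions against actual layers.
x = torch.randn(2, 3, 32, 32)
in_channels = 3
for pre, post in shapes:
    out_channels = pre[2]
    x = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1)(x)
    assert tuple(x.shape[1:]) == (out_channels, pre[0], pre[1])
    x = nn.MaxPool2d(2, 2)(x)
    assert tuple(x.shape[1:]) == (out_channels, post[0], post[1])
    in_channels = out_channels
```

The convolutions preserve height and width because a 3x3 kernel with padding 1 and stride 1 gives `(H + 2 - 3) + 1 = H`; only the pooling layers halve the spatial size.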




In [2]:
class SkipConnectionCNN(nn.Module):
    def __init__(self):
        super().__init__()
        
        # Layer 1: Conv2d(in_channels = 3, out_channels = 16, kernel_size=3, stride=1, padding=1)
        self.conv1 = nn.Conv2d(in_channels = 3, out_channels = 16, kernel_size=3, stride=1, padding=1)
        # MaxPool2d(2, 2)
        self.pool1 = nn.MaxPool2d(2, 2)
        
        # Layer 2: Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels = 16, out_channels = 32, kernel_size = 3, stride=1, padding=1)

        # MaxPool2d(2, 2)
        self.pool2 = nn.MaxPool2d(2, 2)

        # Layer 3: Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, stride=1, padding=1)
                
        # Activation function
        self.relu = nn.ReLU()

        # Skip connection from the output of Layer 1 (a1) to the output of Layer 3 (a3).
        # a1 is (B, 16, 16, 16) and a3 is (B, 64, 4, 4), so a1 must be projected before the addition.
        self.SkipConv = nn.Conv2d(in_channels=16, out_channels=64, kernel_size=1)  # 1x1 conv: 16 -> 64 channels
        self.SkipPool = nn.MaxPool2d(4, 4)  # 4x4 max pool: 16x16 -> 4x4 spatial size


    def forward(self, x):
        
        # Main path (shapes given as B x C x H x W)
        # Input to Layer 1
        a1_ = self.relu(self.conv1(x))     # B x 16 x 32 x 32
        a1 = self.pool1(a1_)               # B x 16 x 16 x 16
        # Layer 1 to Layer 2
        a2_ = self.relu(self.conv2(a1))    # B x 32 x 16 x 16
        a2 = self.pool2(a2_)               # B x 32 x 8 x 8
        # Layer 2 to Layer 3
        a3_ = self.relu(self.conv3(a2))    # B x 64 x 8 x 8
        a3 = self.pool2(a3_)               # B x 64 x 4 x 4 (pool2 is reused; MaxPool2d has no parameters)

        # Skip connection from Layer 1 to Layer 3.
        # a3 + a1 fails directly because the shapes differ, so project a1 first.
        # Match the channel dimension: 16 -> 64
        a1_skip_ = self.SkipConv(a1)       # B x 64 x 16 x 16
        # Match the spatial dimensions (height and width): 16x16 -> 4x4
        a1_skip = self.SkipPool(a1_skip_)  # B x 64 x 4 x 4

        a3skip = a3 + a1_skip

        return a3skip
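
The 1x1 convolution followed by 4x4 max pooling used above is one way to make a1 compatible with a3. As a hedged aside, a single 1x1 convolution with stride 4 produces the same output shape. The sketch below compares the two on a stand-in tensor; the names `pooled_projection` and `strided_projection` are assumptions made for this example and are not part of the model.

```python
import torch
import torch.nn as nn

a1 = torch.randn(1, 16, 16, 16)  # stand-in for the output of Layer 1

# Approach used in the model: 1x1 conv for channels, then 4x4 max pool.
pooled_projection = nn.Sequential(
    nn.Conv2d(16, 64, kernel_size=1),
    nn.MaxPool2d(4, 4),
)

# Alternative: a 1x1 conv with stride 4 handles channels and spatial size at once.
strided_projection = nn.Conv2d(16, 64, kernel_size=1, stride=4)

print(pooled_projection(a1).shape)   # torch.Size([1, 64, 4, 4])
print(strided_projection(a1).shape)  # torch.Size([1, 64, 4, 4])
```

Both variants have the same number of parameters. The strided version reads only one pixel from each 4x4 window, while the pooled version takes the maximum over the whole window, so the two are not numerically equivalent.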

# Q3b: Test the model

In [3]:
# Initialize model
skip_model = SkipConnectionCNN()

# Test the implementation
def test_network(model):
    # Create a random input tensor with CIFAR-10 dimensions (batch of 1, 3 channels, 32x32)
    x = torch.randn(1, 3, 32, 32)

    # Forward pass
    output = model(x)
    
    # Print shapes
    print(f"Input shape: {x.shape}")
    print(f"Output shape: {output.shape}")

# Run test
test_network(skip_model)

Input shape: torch.Size([1, 3, 32, 32])
Output shape: torch.Size([1, 64, 4, 4])
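
The test above uses a batch of one and checks only the output shape. A slightly broader check, sketched below, runs a larger batch and confirms that the skip path's 1x1 convolution receives a gradient. To keep the block runnable on its own, the model is restated compactly as `CompactSkipCNN`; that class, its attribute names, and the batch size of 8 are assumptions made for this example.

```python
import torch
import torch.nn as nn


class CompactSkipCNN(nn.Module):
    """Compact restatement of SkipConnectionCNN so this block runs standalone."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(2, 2)   # stateless, so one instance is shared
        self.relu = nn.ReLU()
        self.skip_conv = nn.Conv2d(16, 64, kernel_size=1)
        self.skip_pool = nn.MaxPool2d(4, 4)

    def forward(self, x):
        a1 = self.pool(self.relu(self.conv1(x)))
        a2 = self.pool(self.relu(self.conv2(a1)))
        a3 = self.pool(self.relu(self.conv3(a2)))
        return a3 + self.skip_pool(self.skip_conv(a1))


torch.manual_seed(0)
model = CompactSkipCNN()

batch = torch.randn(8, 3, 32, 32)  # batch of 8 CIFAR-10-sized inputs
out = model(batch)
out.sum().backward()

skip_grad_norm = model.skip_conv.weight.grad.norm().item()
print(out.shape)           # torch.Size([8, 64, 4, 4])
print(skip_grad_norm > 0)  # True: the skip path takes part in training
```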
