# Teacher's Assignment No. 14 - Q1

***Author:*** *Ofir Paz* $\qquad$ ***Version:*** *12.05.2024* $\qquad$ ***Course:*** *22961 - Deep Learning*

Welcome to question 1 of the fourth assignment of the course *Deep Learning*. \
In this question, we will implement the *SplitLinear* network layer and carry out various gradient calculations related to it.

## Imports

First, we will import the required packages for this assignment.
- [pytorch](https://pytorch.org/) - One of the most fundamental and well-known tensor-handling libraries.
- [numpy](https://numpy.org) - The fundamental package for scientific computing with Python.
- [pandas](https://pandas.pydata.org) - Library to handle data in Python.
- [matplotlib](https://matplotlib.org) - Library to plot graphs in Python.

In [1]:
import torch  # pytorch.
import torch.nn as nn  # neural network module.
import torch.nn.functional as F  # functional module.
import numpy as np  # numpy.
import matplotlib.pyplot as plt  # plotting.
from typing import Literal  # type hinting.

## SplitLinear Implementation

We will start with the implementation of the *SplitLinear* layer, using pytorch.

In [2]:
class SplitLinear(nn.Module):
    '''SplitLinear layer.
    
    The SplitLinear layer splits the input tensor in half, applies the same
    linear transformation (shared weights) to each half, concatenates the
    results, and applies ReLU.
    '''
    def __init__(self, input_size: int, output_size: int) -> None:
        '''
        Constructor for the SplitLinear layer.

        Args:
            input_size (int) - Size of the input tensor (Assumes even).
            output_size (int) - Output size of the linear transformation applied to
                each half; the layer's output has 2 * output_size features.
        '''
        super(SplitLinear, self).__init__()
        self.linear = nn.Linear(input_size // 2, output_size)

        # Use Xavier initialization for the weights.
        # The reasoning for this choice is given in the video.
        nn.init.xavier_uniform_(self.linear.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        '''
        Forward pass of the layer.

        Args:
            x (torch.Tensor) - Input tensor.
                Assumes shape (batch_size, #features), where #features is even.

        Returns:
            torch.Tensor - Output tensor.
        '''

        # Split the input tensor in half.
        x1, x2 = torch.chunk(x, 2, dim=1)

        # Apply the same linear transformation (shared weights) to each half.
        x1, x2 = self.linear(x1), self.linear(x2)

        # Concatenate the results and apply ReLU.
        x = F.relu(torch.cat([x1, x2], dim=1))

        return x
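
To make the data flow explicit, the forward pass above can also be sketched functionally. This is only a minimal sketch and not part of the assignment's code: `split_linear_forward`, `weight` and `bias` are hypothetical names introduced here for illustration, and the weights are drawn at random rather than with Xavier initialization.

```python
import torch
import torch.nn.functional as F


def split_linear_forward(
    x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor
) -> torch.Tensor:
    """Hypothetical functional version of the SplitLinear forward pass."""
    # Split the features into two halves along dim=1.
    x1, x2 = torch.chunk(x, 2, dim=1)

    # The same (weight, bias) pair is applied to both halves.
    y1 = F.linear(x1, weight, bias)
    y2 = F.linear(x2, weight, bias)

    # Concatenate along the feature dimension and apply ReLU.
    return F.relu(torch.cat([y1, y2], dim=1))


torch.manual_seed(0)
batch_size, input_size, output_size = 4, 6, 5

x = torch.randn(batch_size, input_size)
weight = torch.randn(output_size, input_size // 2)  # shape (5, 3)
bias = torch.zeros(output_size)

out = split_linear_forward(x, weight, bias)
print(out.shape)  # torch.Size([4, 10])
```

Because one linear transformation is applied to each half and the two results are concatenated, the output has `2 * output_size` features, which agrees with the 10 output columns in the `SplitLinear(6, 5)` example.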

In [3]:
# Example of a single pass through the `SplitLinear` layer.
split_linear = SplitLinear(6, 5)

# Random input tensor.
x = torch.randn(2, 6)

# Forward pass (done step by step instead of calling `.forward`, so each stage can be printed).
print(f"Input:\n{x = }")
x1, x2 = torch.chunk(x, 2, dim=1)
print(f"Split:\n{x1 = }\n{x2 = }")
x1, x2 = split_linear.linear(x1), split_linear.linear(x2)
print(f"Linear:\n{x1 = }\n{x2 = }")
x = F.relu(torch.cat([x1, x2], dim=1))
print(f"Output:\n{x = }")

Input:
x = tensor([[-0.3024, -0.3163, -1.8870,  1.1589,  1.2414,  0.0218],
        [ 0.4037,  0.8021, -1.2792, -0.1575, -0.2317,  1.0961]])
Split:
x1 = tensor([[-0.3024, -0.3163, -1.8870],
        [ 0.4037,  0.8021, -1.2792]])
x2 = tensor([[ 1.1589,  1.2414,  0.0218],
        [-0.1575, -0.2317,  1.0961]])
Linear:
x1 = tensor([[-0.1756,  0.6182, -1.6125,  0.4275, -0.1730],
        [ 0.1825,  0.0253, -1.1398,  0.8968,  0.0883]],
       grad_fn=<AddmmBackward0>)
x2 = tensor([[ 0.5599, -0.5488, -0.1931,  0.8926, -0.0974],
        [ 0.0303, -0.8973,  0.9043, -0.7078, -0.1352]],
       grad_fn=<AddmmBackward0>)
Output:
x = tensor([[0.0000, 0.6182, 0.0000, 0.4275, 0.0000, 0.5599, 0.0000, 0.0000, 0.8926,
         0.0000],
        [0.1825, 0.0253, 0.0000, 0.8968, 0.0883, 0.0303, 0.0000, 0.9043, 0.0000,
         0.0000]], grad_fn=<ReluBackward0>)
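
Since both halves pass through the same `linear` module, its parameters receive gradient contributions from both halves. The sketch below checks this with autograd under assumed settings: the loss is simply the sum of the layer's outputs, and `weight`, `bias`, `z1`, `z2`, `g1`, `g2` and the `manual_*` names are hypothetical, chosen only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Same shapes as the example above: batch of 2, 6 input features, 5 outputs per half.
x = torch.randn(2, 6)
weight = torch.randn(5, 3, requires_grad=True)
bias = torch.zeros(5, requires_grad=True)

# Forward pass with shared parameters.
x1, x2 = torch.chunk(x, 2, dim=1)
z1 = F.linear(x1, weight, bias)  # pre-activation of the first half
z2 = F.linear(x2, weight, bias)  # pre-activation of the second half

# Assumed loss for illustration: the sum of all outputs.
loss = F.relu(torch.cat([z1, z2], dim=1)).sum()
loss.backward()

# Manual gradients: the ReLU passes gradient only where the pre-activation is positive.
g1 = (z1 > 0).float()
g2 = (z2 > 0).float()
manual_weight_grad = g1.T @ x1 + g2.T @ x2          # shape (5, 3)
manual_bias_grad = g1.sum(dim=0) + g2.sum(dim=0)    # shape (5,)

print(torch.allclose(weight.grad, manual_weight_grad, atol=1e-6))
print(torch.allclose(bias.grad, manual_bias_grad, atol=1e-6))
```

With $g_i = \mathbb{1}[z_i > 0]$ and this choice of loss, the check corresponds to $\partial L / \partial W = g_1^\top x_1 + g_2^\top x_2$ and $\partial L / \partial b = \sum_{\text{batch}} (g_1 + g_2)$: the shared parameters accumulate the gradients of both halves.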
