## CS5242 Neural Networks and Deep Learning

## Coding sample - Solution

*Instructions* <br>
Name: Please add your name here, e.g. JOHN SMITH<br>
Answers: Please write your answers directly in this notebook by completing the code sections marked with  
`# YOUR CODE STARTS HERE`  
`# YOUR CODE` (it can span one or multiple lines)  
`# YOUR CODE ENDS HERE`. <br>
Remark: If certain conditions of a question (e.g., hyperparameter values) are not stated, you are free to choose them yourself.  


## Exercise: Implement an MLP with sigmoid activation

Implement a simple two-layer MLP (multi-layer perceptron) in PyTorch without using the pre-built modules `torch.nn.Linear` or `torch.nn.Sigmoid`.

For the hidden linear layer, use a weight tensor `W1` with shape (in_features, hidden_size) and a bias tensor `b1` with shape (hidden_size,).

For the output linear layer, use a weight tensor `W2` with shape (hidden_size, out_features) and a bias tensor `b2` with shape (out_features,).
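
The shape convention above implies that the input is multiplied from the left, i.e. `x @ W1 + b1` for an input of shape (batch_size, in_features). The following is a minimal sketch of that shape bookkeeping; the concrete sizes are arbitrary values picked for illustration and are not prescribed by the exercise.

```python
import torch

# Hypothetical sizes chosen only for this illustration.
batch_size, in_features, hidden_size = 4, 5, 10

x = torch.randn(batch_size, in_features)
W1 = torch.randn(in_features, hidden_size)
b1 = torch.randn(hidden_size)

# (batch_size, in_features) @ (in_features, hidden_size) -> (batch_size, hidden_size);
# b1 is broadcast across the batch dimension.
hidden_pre = x @ W1 + b1
print(hidden_pre.shape)

# torch.nn.Linear stores its weight as (out_features, in_features),
# so copying it into a tensor shaped like W1 requires a transpose.
linear = torch.nn.Linear(in_features, hidden_size)
print(linear.weight.shape, linear.weight.T.shape)
```

Storing the weights as (in_features, hidden_size) keeps the forward pass a plain left-to-right matrix product, at the cost of a transpose when comparing against `torch.nn.Linear`.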

The activation after the hidden linear layer is the sigmoid function defined as: $\textrm{sigmoid}(x) = 1 / (1 + \exp(-x))$.

**Hints**:

`torch.exp(x)`: Returns a new tensor with the exponential of the elements of the input tensor (shape unchanged).
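
As a small illustration of how the sigmoid definition and the `torch.exp` hint fit together, the sketch below builds the activation elementwise. The helper name `manual_sigmoid` is an assumption introduced here; `torch.sigmoid` is used only to cross-check the result, not as part of the implementation.

```python
import torch

def manual_sigmoid(x):
    # Elementwise 1 / (1 + exp(-x)); the output has the same shape as x.
    return 1.0 / (1.0 + torch.exp(-x))

z = torch.tensor([-2.0, 0.0, 2.0])
s = manual_sigmoid(z)

print(s.shape == z.shape)                   # shape is unchanged
print(s[1].item())                          # sigmoid(0) is 0.5
print(torch.allclose(s, torch.sigmoid(z)))  # agrees with the built-in function
```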



In [1]:
%reset -f
import datetime
import torch

class MyMLP:
    """
    A simple MLP with one hidden layer using manual matrix multiplication
    and manual sigmoid, without torch.nn.Linear or torch.sigmoid.
    """
    def __init__(self, input_dim, hidden_dim, output_dim):
        # Weights and biases are randomly initialized for:
        #  1) Hidden layer: (input_dim  -> hidden_dim)
        #  2) Output layer: (hidden_dim -> output_dim)
        self.W1 = torch.randn(input_dim, hidden_dim, requires_grad=False)
        self.b1 = torch.randn(hidden_dim, requires_grad=False)
        self.W2 = torch.randn(hidden_dim, output_dim, requires_grad=False)
        self.b2 = torch.randn(output_dim, requires_grad=False)

    def forward(self, x):
        """
        Forward pass for x of shape (batch_size, input_dim):
            hidden = sigmoid(x @ W1 + b1)
            output = hidden @ W2 + b2
        """
        ##########################
        # YOUR CODE STARTS HERE
        hidden = torch.matmul(x, self.W1) + self.b1
        hidden = 1.0 / (1.0 + torch.exp(-hidden))  # Implemented sigmoid
        out = torch.matmul(hidden, self.W2) + self.b2
        # YOUR CODE ENDS HERE
        ##########################
        return out

print('Timestamp:',datetime.datetime.now().strftime("%y-%m-%d--%H-%M-%S"))


Timestamp: 26-02-09--14-43-40


In [2]:
class ReferenceMLP(torch.nn.Module):
    """
    Reference MLP using official PyTorch modules for comparison.
    """
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.fc1 = torch.nn.Linear(input_dim, hidden_dim, bias=True)
        self.sigmoid = torch.nn.Sigmoid()
        self.fc2 = torch.nn.Linear(hidden_dim, output_dim, bias=True)

    def forward(self, x):
        hidden = self.sigmoid(self.fc1(x))
        return self.fc2(hidden)

# Use the same dimensions for both models
input_dim, hidden_dim, output_dim = 5, 10, 2

# Create random input data
torch.manual_seed(0)
x = torch.randn(4, input_dim)

# Create our MLP
my_mlp = MyMLP(input_dim, hidden_dim, output_dim)

# Create the reference MLP
ref_mlp = ReferenceMLP(input_dim, hidden_dim, output_dim)

# Copy parameters from the reference MLP to our MLP for a fair comparison.
# torch.nn.Linear stores its weight as (out_features, in_features), hence the transpose.
with torch.no_grad():
    my_mlp.W1[:] = ref_mlp.fc1.weight.T
    my_mlp.b1[:] = ref_mlp.fc1.bias
    my_mlp.W2[:] = ref_mlp.fc2.weight.T
    my_mlp.b2[:] = ref_mlp.fc2.bias

# Get outputs
my_output = my_mlp.forward(x)
ref_output = ref_mlp(x)

# Compare
diff = (my_output - ref_output).abs().max().item()
# print("Difference:", diff)

# Check if the difference is very small (float32 rounding alone can approach 1e-7)
if diff < 1e-6:
    print("Well Done!")
else:
    print("Try again.")


Well Done!
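
The check above compares the maximum absolute difference against a fixed threshold. A possible alternative is `torch.allclose`, which combines an absolute and a relative tolerance. The sketch below applies it to a single linear layer; the layer sizes and tolerance values are assumptions chosen for illustration, not values required by the exercise.

```python
import torch

torch.manual_seed(0)

# Hypothetical layer and input sizes for this illustration.
fc = torch.nn.Linear(5, 10)
x = torch.randn(4, 5)

with torch.no_grad():
    ref = fc(x)                          # built-in linear layer
    manual = x @ fc.weight.T + fc.bias   # same computation written by hand

# torch.allclose tests |manual - ref| <= atol + rtol * |ref| elementwise.
print(torch.allclose(manual, ref, rtol=1e-5, atol=1e-6))
```

A relative tolerance scales with the magnitude of the outputs, which makes the comparison less sensitive to float32 rounding than a single absolute cutoff.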
