
# ➡️ **LAB01:** Develop multi-layer perceptrons, activation functions, and loss functions from scratch

In this computer assignment, you will:
- Implement common **activation functions** from scratch.
- **Visualize** their behavior.
- Develop a simple **Multi-Layer Perceptron (MLP)**.
- Choose and implement the appropriate loss function.
- Train the MLP to solve the **XOR problem**.

In [None]:
import torch
import torch.nn as nn
import torch.utils.data as data  # provides data.DataLoader, used in Task 5
import matplotlib.pyplot as plt

from utils.lab01 import *
%matplotlib inline

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")



## ✍️ Task 1: Implement activation functions

Complete the formulas for **Sigmoid**, **Tanh**, **ReLU**, **Leaky ReLU**, and **ELU** below.
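
For reference, the standard textbook definitions of the five functions in the code cell are listed here. The symbol $\alpha$ corresponds to the `alpha` argument; turning each formula into tensor operations is left as the exercise.

$$
\begin{aligned}
\sigma(x) &= \frac{1}{1 + e^{-x}} \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \\
\mathrm{ReLU}(x) &= \max(0, x) \\
\mathrm{LeakyReLU}_{\alpha}(x) &= \max(0, x) + \alpha \min(0, x) \\
\mathrm{ELU}_{\alpha}(x) &= \begin{cases} x & x > 0 \\ \alpha\,(e^{x} - 1) & x \le 0 \end{cases}
\end{aligned}
$$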


In [None]:
def sigmoid(x):
    # TODO: Implement the Sigmoid function
    return "FILL_IN"

def tanh(x):
    # TODO: Implement the Tanh function
    return "FILL_IN"

def relu(x):
    # TODO: Implement the ReLU function
    return "FILL_IN"

def leaky_relu(x, alpha=0.01):
    # TODO: Implement the Leaky ReLU function
    return "FILL_IN"

def elu(x, alpha=1.0):
    # TODO: Implement the ELU function
    return "FILL_IN"

In [None]:
activations = {
    "Sigmoid": sigmoid,
    "Tanh": tanh,
    "ReLU": relu,
    "Leaky ReLU": lambda x: leaky_relu(x, alpha=0.1),
    "ELU": lambda x: elu(x, alpha=1.0),
}


## 📊 Task 2: Visualize activation functions

Use the helper function below to visualize the implemented activation functions.
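
The helper plots each function together with its gradient, which it obtains from `get_grads` (imported from `utils.lab01`, implementation not shown here). As a rough sketch of how such gradients can be computed with autograd, the hypothetical function `grads_via_autograd` below is an assumption for illustration and not the lab's actual helper. It is demonstrated on PyTorch's built-in `torch.tanh`.

```python
import torch

def grads_via_autograd(act_fn, x):
    """Hypothetical stand-in for get_grads: elementwise derivative of act_fn at x."""
    # Work on a detached copy so the caller's tensor is left untouched
    x = x.clone().detach().requires_grad_(True)
    y = act_fn(x)
    # Each output depends only on its own input, so the gradient of the sum
    # with respect to x is the elementwise derivative
    y.sum().backward()
    return x.grad

x = torch.linspace(-5, 5, 200)
g = grads_via_autograd(torch.tanh, x)

# Compare with the analytic derivative 1 - tanh(x)^2
print(torch.allclose(g, 1 - torch.tanh(x) ** 2, atol=1e-6))
```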


In [None]:
def vis_act_fn():
    x = torch.linspace(-5, 5, 200)
    
    for name, act in activations.items():
        plt.figure(figsize=(8, 6))
        try:
            y = act(x)
            y_grads = get_grads(act, x)
            plt.plot(x, y, linewidth=3, label=name)
            plt.plot(x, y_grads, linewidth=3, label=f"Gradient of {name}")
        except Exception:  # e.g. the activation is still a FILL_IN placeholder
            plt.plot([], [], label=f"{name} (not implemented)")
        plt.title(name)
        plt.xlabel("x")
        plt.ylabel("f(x)")
        plt.legend()
        plt.grid(True)
        plt.show()

# Visualize the activation functions
vis_act_fn()



## 🧩 Task 3: Create the neural network

**Create a multi-layer perceptron:** In the `NeuralNetwork` class below, the affine linear transformations of the input layer and the output layer are missing. On the lines that assign `self.linear1` and `self.linear2`, replace `FILL_IN` with a linear transformation that has the correct parameters. Refer to the [PyTorch nn documentation](https://pytorch.org/docs/stable/nn.html).
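
As a small illustration of how an affine layer maps shapes in PyTorch, the snippet below applies `nn.Linear` to a random batch. The sizes (3 inputs, 5 outputs, batch of 8) are arbitrary values chosen for the example and are not the ones needed in the class.

```python
import torch
import torch.nn as nn

# An affine map y = x @ W.T + b from 3 input features to 5 output features
layer = nn.Linear(in_features=3, out_features=5)

batch = torch.randn(8, 3)   # 8 samples, 3 features each
out = layer(batch)

print(out.shape)            # batch dimension is kept, feature dimension changes
print(layer.weight.shape)   # weights are stored as (out_features, in_features)
print(layer.bias.shape)     # one bias per output feature
```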


In [None]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_inputs, num_hidden, num_outputs, act_fn):
        super().__init__()

        # TODO: Implement an affine linear transformation with the correct parameters
        self.linear1 = "FILL_IN"

        self.act_fn = act_fn

        # TODO: Implement an affine linear transformation with the correct parameters
        self.linear2 = "FILL_IN"

    def forward(self, x):
        x = self.linear1(x)
        x = self.act_fn(x)
        x = self.linear2(x)
        return x


## 🧮 Task 4: Choose and implement the appropriate loss function

Decide which loss function to use for this application. Implement the loss function from scratch in the code below.
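
One way to gain confidence in a from-scratch loss is to compare it with the corresponding PyTorch built-in on random data. The sketch below shows that pattern with mean squared error, chosen purely as an illustration and not necessarily the right loss for this task. The name `mse_from_scratch` is hypothetical.

```python
import torch
import torch.nn.functional as F

def mse_from_scratch(preds, targets):
    # Elementwise loss first, reduced with a mean at the end
    loss = (preds - targets) ** 2
    return loss.mean()

torch.manual_seed(0)
preds = torch.randn(16)                        # arbitrary model outputs
targets = torch.randint(0, 2, (16,)).float()   # arbitrary binary targets

ours = mse_from_scratch(preds, targets)
reference = F.mse_loss(preds, targets)         # built-in with default mean reduction

print(torch.allclose(ours, reference))
```

The same comparison can be repeated with whichever loss you choose, swapping in the matching function from `torch.nn.functional`.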

In [None]:
def loss_function(preds, targets):
    # TODO: Implement the loss function
    loss = "FILL_IN"
    return loss.mean()


## 🧠 Task 5: Train and evaluate the neural network on XOR dataset

Try out different activation functions and observe how they influence the performance and decision boundaries.


In [None]:
for name, act in activations.items():
    print(f"\n--- Training with {name} ---")
    model = NeuralNetwork(2, 4, 1, act_fn=act).to(device)
    loss_module = loss_wrapper(loss_function)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Training dataset
    train_dataset = XORDataset(size=2500)
    train_loader = data.DataLoader(train_dataset, batch_size=128, shuffle=True)
    train_model(model, optimizer, train_loader, loss_module, device)

    # Test dataset
    test_dataset = XORDataset(size=500)
    test_loader = data.DataLoader(test_dataset, batch_size=128, shuffle=False)
    predictions, accuracy = eval_model(model, test_loader, device)
    print(f"Test Accuracy: {accuracy * 100:.2f}%")

    _ = visualize_boundary(model, test_dataset.data, predictions, device, f"{name} Decision Boundary")
    plt.show()



## 💭 Summary & Reflection

After completing this exercise, you should be able to:

- Understand why **non-linear activation functions** are essential in neural networks.
- See how **different activation functions** affect model learning and decision boundaries.
- Implement and visualize activation functions from their mathematical formulas.
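
To make the first point concrete, the sketch below checks numerically that two stacked linear layers with no activation in between collapse into a single affine map, so depth adds no expressive power without a non-linearity. The layer sizes mirror the `NeuralNetwork(2, 4, 1, ...)` call used in Task 5; the variable names are chosen for this example only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear1 = nn.Linear(2, 4)
linear2 = nn.Linear(4, 1)
x = torch.randn(10, 2)

# Two linear layers applied back to back, with no activation in between
stacked = linear2(linear1(x))

# The equivalent single affine map: W = W2 W1, b = W2 b1 + b2
W = linear2.weight @ linear1.weight
b = linear2.weight @ linear1.bias + linear2.bias
collapsed = x @ W.T + b

print(torch.allclose(stacked, collapsed, atol=1e-6))
```

Since XOR is not linearly separable, a network that reduces to one affine map cannot solve it, whatever the hidden size.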

**Discussion:**  
Which activation function worked best for the XOR problem? Why do you think that is?
