# Assignment 2 - Perceptron

## Exercise

Given an input tensor X of shape (m, 784), where m is the number of examples
and 784 is the number of features (input neurons), a weight tensor W of shape
(784, 10), and a bias tensor b of shape (10,), compute the output of the network
for each example in the batch, calculate the error, and update the weights and
biases accordingly.
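
For reference, the three steps can be written out as below. The exercise does not name a loss function, so the squared-error loss $L$ is an assumption here, and the symbols $\hat{y}$, $E$, and $\delta$ are introduced only for illustration.

$$
\begin{aligned}
Z &= XW + b, \qquad \hat{y} = \sigma(Z), \qquad E = y - \hat{y} \\
L &= \tfrac{1}{2} \sum_{i,j} E_{ij}^2, \qquad \delta = E \odot \hat{y} \odot (1 - \hat{y}) \\
\frac{\partial L}{\partial W} &= -X^\top \delta, \qquad \frac{\partial L}{\partial b_j} = -\sum_{i} \delta_{ij} \\
W &\leftarrow W + \mu\, X^\top \delta, \qquad b_j \leftarrow b_j + \mu \sum_{i} \delta_{ij}
\end{aligned}
$$

Because $E$ is defined as $y - \hat{y}$, the minus sign of the gradient is absorbed into $\delta$, which is why the update adds $\mu X^\top \delta$ rather than subtracting it.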

In [65]:
import torch

The activation function is the sigmoid function.

In [66]:
def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

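The derivative of the sigmoid function, written in terms of the sigmoid's output: the argument x is expected to be sigmoid(z), not the pre-activation z.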

In [67]:
def sigmoid_derivative(x):
    return x * (1 - x)
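
A quick way to see what this function expects is to compare it against PyTorch's autograd. The sketch below is only an illustration: the test points in z and the names from_output and from_input are made up for this check. It suggests the formula matches the true derivative only when it is given the sigmoid output.

```python
import torch

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

# Arbitrary pre-activation values chosen for illustration
z = torch.tensor([-2.0, 0.0, 3.0], requires_grad=True)
s = sigmoid(z)

# Autograd computes d(sigmoid)/dz at each point
s.sum().backward()

# Intended use: pass the sigmoid output
from_output = sigmoid_derivative(s.detach())
# Misuse: pass the raw pre-activation
from_input = sigmoid_derivative(z.detach())

matches_autograd = torch.allclose(from_output, z.grad, atol=1e-6)
misuse_matches = torch.allclose(from_input, z.grad, atol=1e-6)
print(matches_autograd, misuse_matches)
```
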

Write a Python function named train_perceptron that takes X, W, b, y_true
(the true labels), and mu (the learning rate) as inputs and returns the updated
weights and biases after applying one forward and one backward propagation step.

In [68]:

def train_perceptron(X, W, b, y_true, mu):

    # Forward propagation
    Z = torch.matmul(X, W) + b
    y_pred = sigmoid(Z) # Y_prediction

    # Calculate error
    error = y_true - y_pred

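    # Backward propagation (sigmoid_derivative takes the activation y_pred, not Z)
    dy_pred = error * sigmoid_derivative(y_pred)
    dW = torch.matmul(X.T, dy_pred)
    db = torch.sum(dy_pred, dim=0)

    # Update weights and biases. Since error = y_true - y_pred, adding
    # mu * dW moves W in the direction that reduces the squared error.
    W = W + mu * dW
    b = b + mu * db

    return W, b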


- X: A 2D PyTorch tensor of shape (m, 784) containing the input features.
- W: A 2D PyTorch tensor of shape (784, 10) containing the initial weights for the 10 perceptrons.
- b: A 1D PyTorch tensor of shape (10,) containing the initial biases for the 10 perceptrons.
- y_true: A 2D PyTorch tensor of shape (m, 10) containing the true labels for each of the m examples.
- mu: A float representing the learning rate.
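
One way to sanity-check the manual backward pass is to compare it with autograd. The sketch below assumes the loss being minimized is half the summed squared error, which the exercise does not state explicitly. The batch size of 4, the small random initialization, and the names delta, w_match, and b_match are all chosen just for this check.

```python
import torch

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

torch.manual_seed(0)
X = torch.rand((4, 784))
# Small initial weights (an assumption) keep the sigmoid away from saturation
W = (0.01 * torch.randn((784, 10))).requires_grad_()
b = torch.zeros(10, requires_grad=True)
y_true = torch.randint(2, (4, 10)).float()

# Assumed loss: 0.5 * sum of squared errors
y_pred = sigmoid(torch.matmul(X, W) + b)
loss = 0.5 * torch.sum((y_true - y_pred) ** 2)
loss.backward()

# Manual quantities, computed the same way as in train_perceptron
with torch.no_grad():
    delta = (y_true - y_pred) * y_pred * (1 - y_pred)
    dW = torch.matmul(X.T, delta)
    db = torch.sum(delta, dim=0)

# dW and db should equal the negative gradients, so W + mu * dW is a descent step
w_match = torch.allclose(dW, -W.grad, atol=1e-5)
b_match = torch.allclose(db, -b.grad, atol=1e-5)
print(w_match, b_match)
```
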

In [69]:
# Example Usage
m = 1  # Number of examples
n_inputs = 784
n_outputs = 10

In [70]:
# Initialize random input, weights, biases, and true labels (0 or 1 for simplicity)
X = torch.rand((m, n_inputs)) 
W = torch.rand((n_inputs, n_outputs))
b = torch.rand((n_outputs,))
y_true = torch.randint(2, (m, n_outputs)).float()
mu = 0.01
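
It is worth checking what this initialization does to the sigmoid. Every entry of X and W lies in [0, 1), so each pre-activation is a sum of 784 non-negative products, roughly 196 on average. The sketch below fixes a seed (my addition, for reproducibility) and inspects the forward pass: in float32 the sigmoid evaluates to exactly 1.0 at such values, the derivative factor becomes 0, and dW is all zeros.

```python
import torch

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

torch.manual_seed(0)  # assumption: fixed seed, not part of the original cell
X = torch.rand((1, 784))
W = torch.rand((784, 10))
b = torch.rand((10,))
y_true = torch.randint(2, (1, 10)).float()

Z = torch.matmul(X, W) + b
y_pred = sigmoid(Z)
delta = (y_true - y_pred) * sigmoid_derivative(y_pred)
dW = torch.matmul(X.T, delta)

print("smallest pre-activation:", Z.min().item())
print("all outputs saturated at 1.0:", torch.all(y_pred == 1.0).item())
print("nonzero entries in dW:", torch.count_nonzero(dW).item())
```

When this happens the update leaves W and b unchanged, no matter how many epochs are run. A smaller, zero-centered initialization such as 0.01 * torch.randn((n_inputs, n_outputs)) is one common way to keep Z near 0 at the start; that is a suggestion, not part of the assignment.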

In [71]:
# Train the perceptron
W, b = train_perceptron(X, W, b, y_true, mu)

In [72]:
# Perform multiple epochs of training
num_epochs = 1000
for epoch in range(num_epochs):
    W, b = train_perceptron(X, W, b, y_true, mu)
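
To see whether the epochs are actually reducing the error, the loss can be recorded at each step. This is a minimal sketch under a few assumptions of mine: the helper squared_error is hypothetical, the loss is taken to be half the summed squared error, and the weights start from a small zero-centered initialization so the sigmoid begins away from saturation.

```python
import torch

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

def train_perceptron(X, W, b, y_true, mu):
    y_pred = sigmoid(torch.matmul(X, W) + b)
    error = y_true - y_pred
    dy_pred = error * sigmoid_derivative(y_pred)
    W = W + mu * torch.matmul(X.T, dy_pred)
    b = b + mu * torch.sum(dy_pred, dim=0)
    return W, b

def squared_error(X, W, b, y_true):
    # Hypothetical helper: half the summed squared error (assumed loss)
    y_pred = sigmoid(torch.matmul(X, W) + b)
    return 0.5 * torch.sum((y_true - y_pred) ** 2).item()

torch.manual_seed(0)
X = torch.rand((1, 784))
W = 0.01 * torch.randn((784, 10))  # assumption: small zero-centered init
b = torch.zeros(10)
y_true = torch.randint(2, (1, 10)).float()
mu = 0.01

losses = []
for epoch in range(100):
    losses.append(squared_error(X, W, b, y_true))
    W, b = train_perceptron(X, W, b, y_true, mu)
losses.append(squared_error(X, W, b, y_true))

print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```
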

In [73]:
print(W)
print(b)

tensor([[0.7843, 0.1468, 0.8887,  ..., 0.8562, 0.8415, 0.7061],
        [0.1024, 0.9148, 0.8114,  ..., 0.0350, 0.1300, 0.8546],
        [0.4367, 0.8223, 0.5734,  ..., 0.9211, 0.7102, 0.4785],
        ...,
        [0.0453, 0.4569, 0.4895,  ..., 0.8321, 0.4848, 0.0270],
        [0.4836, 0.2888, 0.7145,  ..., 0.3558, 0.3413, 0.2327],
        [0.7077, 0.1726, 0.9730,  ..., 0.4158, 0.2398, 0.0315]])
tensor([0.4814, 0.9189, 0.0692, 0.1991, 0.4397, 0.5098, 0.3682, 0.4321, 0.9452,
        0.6864])
