# Linear Neural Networks

In this notebook, we will explore the mathematical foundations and implementation of Linear Neural Networks. Linear models are the simplest form of neural networks and are primarily used for linear regression tasks.


## Theoretical Background

### Overview
Linear Neural Networks are composed of layers in which each neuron computes a weighted sum of its inputs plus a bias, with no nonlinear activation applied.

**Type of Function**: Linear

**Nature**: Continuous

**Behavior**: The output is a linear combination of the input features plus a bias term.

### Mathematical Formulation


The output \( y \) is given by:
\[ y = XW + b \]
where \( X \) is the input, \( W \) is the weight matrix, and \( b \) is the bias term, added to every row of \( XW \).
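
As a minimal sketch of this formula, the cell below evaluates \( XW + b \) directly and compares it with `nn.Linear`. The sizes (`n_samples`, `n_in`, `n_out`) and the random values are illustrative assumptions, not taken from the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes chosen only for illustration
n_samples, n_in, n_out = 5, 3, 2
X = torch.randn(n_samples, n_in)   # one row per sample
W = torch.randn(n_in, n_out)       # weight matrix
b = torch.randn(n_out)             # bias, broadcast across the rows of X @ W

y = X @ W + b
print(y.shape)  # torch.Size([5, 2])

# nn.Linear stores its weight as (out_features, in_features), so W is transposed
layer = nn.Linear(n_in, n_out)
with torch.no_grad():
    layer.weight.copy_(W.T)
    layer.bias.copy_(b)

same = torch.allclose(layer(X), y, atol=1e-5)
print("nn.Linear matches X @ W + b:", same)
```

The transpose is the only difference between the formula as written and PyTorch's storage convention.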


## Implementation in PyTorch

In [1]:
import torch 
from torch import nn
from torch.optim import SGD

### Mathematical function

In [2]:
#quadratic polynomial function
def f(x):
    return x**2 + 1

x = torch.tensor(4.0, requires_grad=True)
y = f(x)
print("f(x) = ", y.item())

#gradient
y.backward()
print("df/dx = ", x.grad.item())

f(x) =  17.0
df/dx =  8.0
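
The autograd result can be cross-checked numerically. The sketch below uses a central finite difference; the helper name `central_difference` and the step size `h` are assumptions made for this illustration.

```python
def f(x):
    return x**2 + 1

def central_difference(func, x0, h=1e-4):
    # Approximates df/dx at x0 from two nearby function evaluations
    return (func(x0 + h) - func(x0 - h)) / (2 * h)

approx = central_difference(f, 4.0)
print("finite-difference df/dx ~=", approx)
```

For a quadratic, the central difference is exact up to floating-point rounding, so the value should agree closely with the analytic derivative \( 2x = 8 \).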


In [3]:
#linear polynomial function
def lin_F(x):
    W = torch.tensor([1.0], requires_grad=True)
    b = torch.tensor([1.0], requires_grad=True)
    
    assert x.shape[-1] == W.shape[0], """
    Invalid shape. (mxn)(nxp) = (mxp). Check shape. W.shape == 1
    """
    return x@W + b, W, b

## Understanding Backpropagation

In [4]:
# Using PyTorch's .grad and .backward functions
x = torch.tensor([2.0], requires_grad=True)
y_true = torch.tensor([10.0])

#forward pass
y_pred, W, b = lin_F(x)
print("y =", y_pred.item())

#calculate the loss : mean squared error
Loss = ((y_true - y_pred)**2).mean()

#gradient via PyTorch
Loss.backward()
print("Loss = ", Loss.item())
# x.grad holds dL/dx; it equals dL/dy_pred here only because W = 1
print("dL/dx (pT)= ", x.grad.item())

# Gradient of loss w.r.t W and b
print("dL/dW (pT)= ", W.grad.item())
print("dL/db (pT)= ", b.grad.item())

y = 3.0
Loss =  49.0
dL/dx (pT)=  -14.0
dL/dW (pT)=  -28.0
dL/db (pT)=  -14.0
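
Note that `x.grad` is the gradient with respect to the input `x`, not the prediction. One way to read the gradient at the intermediate tensor `y_pred` is to call `retain_grad()` before the backward pass. The sketch below inlines the same values used by `lin_F` (W = 1, b = 1) so that it is self-contained.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
W = torch.tensor([1.0], requires_grad=True)
b = torch.tensor([1.0], requires_grad=True)
y_true = torch.tensor([10.0])

y_pred = x @ W + b
y_pred.retain_grad()   # keep the gradient of this non-leaf tensor
loss = ((y_true - y_pred) ** 2).mean()
loss.backward()

print("dL/dy_pred =", y_pred.grad.item())  # -14.0
print("dL/dx      =", x.grad.item())       # -14.0, equal only because W = 1
```

With a different weight the two values would differ, since \( \partial L / \partial x = (\partial L / \partial y_{pred}) \cdot W \).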


In [5]:
# Calculating gradients manually

# Fresh forward pass (lin_F creates new W and b, both equal to 1)
y_pred, W, b = lin_F(x)
print("y =", y_pred.item())

Loss = ((y_true - y_pred)**2).mean()

#differentiating Loss w.r.t y_pred (chain-rule)
u = y_true - y_pred
v = u**2

du = -1 
dv = 2*u
dv_dypred = du*dv

dL = dv_dypred.mean()
print("dL/dy_pred = ", dL.item())

#differentiating Loss w.r.t W and b (chain-rule): dy_pred/dW = x, dy_pred/db = 1
dW = dL * x
dB = dL*1

print("dL/dW = ", dW.item())
print("dL/db = ", dB.item())

y = 3.0
dL/dy_pred =  -14.0
dL/dW =  -28.0
dL/db =  -14.0
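
The printed values can be traced by hand. With \( x = 2 \), \( W = 1 \), \( b = 1 \) and \( y_{true} = 10 \), as in the cells above, a worked version of the chain rule reads:

\[ y_{pred} = xW + b = 2 \cdot 1 + 1 = 3, \qquad L = (y_{true} - y_{pred})^2 = 49 \]
\[ \frac{\partial L}{\partial y_{pred}} = -2\,(y_{true} - y_{pred}) = -2 \cdot 7 = -14 \]
\[ \frac{\partial L}{\partial W} = \frac{\partial L}{\partial y_{pred}} \cdot x = -28, \qquad \frac{\partial L}{\partial b} = \frac{\partial L}{\partial y_{pred}} \cdot 1 = -14 \]
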


## Creating a simple two-layer neural network without an activation function

In [6]:
X = torch.tensor([1., 1., 1., 1.], requires_grad=True)
y = torch.tensor([12.])  # float target, matching the dtype of the predictions

In [25]:
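# Defining a two-layer network: 1 input feature -> 4 hidden units -> 1 output.
# F.linear(input, weight, bias) computes input @ weight.T + bias,
# with weight of shape (out_features, in_features).

W1 = torch.tensor([[1.], [1.], [1.], [1.]], requires_grad=True)  # (4, 1)
b1 = torch.tensor([1.], requires_grad=True)                      # broadcast to all 4 hidden units
W2 = torch.tensor([[1., 1., 1., 1.]], requires_grad=True)        # (1, 4)
b2 = torch.tensor([1.0], requires_grad=True)

# X.view(-1, 1) has shape (4, 1): a batch of 4 samples with 1 feature each
hidden = torch.nn.functional.linear(X.view(-1,1), W1, b1)  # (4, 4)
y_pred = torch.nn.functional.linear(hidden, W2, b2)        # (4, 1)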
y_pred.retain_grad()

Loss = ((y - y_pred)**2).mean()
print("Loss = ", Loss.item()) 

Loss.backward()

print("dL/dy_pred (pT)= ", y_pred.grad)

# Gradient of loss w.r.t W and b
print("dL/dW1 (pT)= ", W1.grad)
print("dL/db1 (pT)= ", b1.grad)

print("dL/dW2 (pT)= ", W2.grad)
print("dL/db2 (pT)= ", b2.grad)

hidden

Loss =  9.0
dL/dy_pred (pT)=  tensor([[-1.5000],
        [-1.5000],
        [-1.5000],
        [-1.5000]])
dL/dW1 (pT)=  tensor([[-6.],
        [-6.],
        [-6.],
        [-6.]])
dL/db1 (pT)=  tensor([-24.])
dL/dW2 (pT)=  tensor([[-12., -12., -12., -12.]])
dL/db2 (pT)=  tensor([-6.])


tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]], grad_fn=<AddmmBackward0>)
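
Because no activation is applied between the two layers, their composition is itself a single affine map. The sketch below illustrates this with the same all-ones parameters; the names `W_eff` and `b_eff` are introduced here only for illustration.

```python
import torch
import torch.nn.functional as F

X = torch.ones(4, 1)                       # 4 samples, 1 feature
W1, b1 = torch.ones(4, 1), torch.ones(1)   # layer 1: 1 -> 4
W2, b2 = torch.ones(1, 4), torch.ones(1)   # layer 2: 4 -> 1

two_layer = F.linear(F.linear(X, W1, b1), W2, b2)

# (X W1^T + b1) W2^T + b2  =  X (W2 W1)^T + (W2 b1 + b2)
W_eff = W2 @ W1                            # shape (1, 1)
b_eff = W2 @ b1.expand(4) + b2             # shape (1,)
one_layer = F.linear(X, W_eff, b_eff)

print(two_layer.flatten())   # tensor([9., 9., 9., 9.])
print(one_layer.flatten())   # tensor([9., 9., 9., 9.])
```

This is why stacking linear layers without a nonlinearity adds parameters but not expressive power.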

In [19]:
print("Loss = ", Loss.item()) 

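u = y - y_pred
v = u**2

du = -1
dv = 2*u
dv_dypred = du*dv  # per-sample derivative d(u^2)/dy_pred = -2*(y - y_pred)

# Note: the mean of the per-sample derivatives equals the sum of the
# per-element gradients (4 * -1.5 = -6.0, the same value as dL/db2).
# Autograd's per-element gradient is dv_dypred / N, with N = 4 samples.
dL = dv_dypred.mean()
print("dL/dy_pred = ", dL.item())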

# #differentiating Loss w.r.t W (chain-rule)
# dW = dL * x
# dB = dL*1

# print("dL/dW = ", dW.item())
# print("dL/db = ", dB.item())

Loss =  9.0
dL/dy_pred =  -6.0
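
The manual backward pass can be carried through both layers and compared with autograd. The following is a sketch that rebuilds the same network so it runs on its own; the `d_*` names are illustrative, and the per-element gradient includes the \( 1/N \) factor from the mean.

```python
import torch
import torch.nn.functional as F

X = torch.ones(4, 1)                       # 4 samples, 1 feature
y = torch.tensor([12.])
W1 = torch.ones(4, 1, requires_grad=True)
b1 = torch.ones(1, requires_grad=True)
W2 = torch.ones(1, 4, requires_grad=True)
b2 = torch.ones(1, requires_grad=True)

hidden = F.linear(X, W1, b1)               # (4, 4)
y_pred = F.linear(hidden, W2, b2)          # (4, 1)
loss = ((y - y_pred) ** 2).mean()
loss.backward()

with torch.no_grad():
    n = y_pred.numel()
    d_ypred = -2 * (y - y_pred) / n        # (4, 1), each entry -1.5
    d_W2 = d_ypred.T @ hidden              # (1, 4)
    d_b2 = d_ypred.sum(dim=0)              # (1,)
    d_hidden = d_ypred @ W2                # (4, 4)
    d_W1 = d_hidden.T @ X                  # (4, 1)
    d_b1 = d_hidden.sum().reshape(1)       # b1 is shared by every hidden entry

for name, manual, auto in [("W1", d_W1, W1.grad), ("b1", d_b1, b1.grad),
                           ("W2", d_W2, W2.grad), ("b2", d_b2, b2.grad)]:
    print(name, "matches autograd:", torch.allclose(manual, auto))
```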


## Example Use Case

### Dataset
We use a synthetic dataset for demonstration purposes.

### Preprocessing
No preprocessing is required for this simple dataset.

### Training the Model
Training the Linear Neural Network using the synthetic dataset.
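
A minimal training sketch is given below. The dataset is an assumption made for illustration: points on the line \( y = 3x + 2 \) with small Gaussian noise. The learning rate and epoch count are likewise illustrative choices, not values from the notebook.

```python
import torch
from torch import nn
from torch.optim import SGD

torch.manual_seed(0)

# Hypothetical synthetic data: y = 3x + 2 plus small Gaussian noise
true_w, true_b = 3.0, 2.0
X_train = torch.linspace(-1, 1, 100).unsqueeze(1)            # (100, 1)
y_train = true_w * X_train + true_b + 0.05 * torch.randn_like(X_train)

model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.MSELoss()
optimizer = SGD(model.parameters(), lr=0.1)

for epoch in range(200):
    optimizer.zero_grad()                      # clear gradients from the previous step
    loss = loss_fn(model(X_train), y_train)    # forward pass and MSE
    loss.backward()                            # backpropagation
    optimizer.step()                           # gradient descent update

learned_w = model.weight.item()
learned_b = model.bias.item()
print(f"learned w = {learned_w:.3f}, learned b = {learned_b:.3f}, final loss = {loss.item():.5f}")
```

The learned parameters should land close to the assumed slope and intercept, with the remaining loss on the order of the noise variance.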


## Visualization

## Conclusion and Insights

In this notebook, we have explored the fundamentals of Linear Neural Networks and implemented a simple model in PyTorch. Linear models are useful for understanding the basic principles of neural networks and serve as a foundation for more complex architectures.


In [22]:
import torch
import torch.nn.functional as F

# Input data
X = torch.tensor([1., 1., 1., 1.], requires_grad=True)
# Weights and bias
W = torch.tensor([[1.], [1.], [1.], [1.]], requires_grad=True)
b = torch.tensor([1.], requires_grad=True)
# True output for MSE calculation
y_true = torch.tensor([5.], requires_grad=False)  # no need for gradients

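# Perform the linear transformation using the functional API.
# X.view(-1, 1) has shape (4, 1), so F.linear treats it as 4 samples with 1 feature
# and W (shape (4, 1)) as 4 output units: y_pred has shape (4, 4).
y_pred = F.linear(X.view(-1, 1), W, b)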

# Define the loss function (MSE)
loss = (y_true - y_pred).pow(2).mean()

# Compute the gradients
loss.backward()

# Output the gradients
print("Gradient with respect to X:", X.grad)
print("Gradient with respect to W:", W.grad)
print("Gradient with respect to b:", b.grad)
y_pred

Gradient with respect to X: tensor([-1.5000, -1.5000, -1.5000, -1.5000])
Gradient with respect to W: tensor([[-1.5000],
        [-1.5000],
        [-1.5000],
        [-1.5000]])
Gradient with respect to b: tensor([-6.])


tensor([[2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.],
        [2., 2., 2., 2.]], grad_fn=<AddmmBackward0>)
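
The 4x4 output follows from the weight convention of `F.linear`, which expects `weight` with shape `(out_features, in_features)`. The sketch below contrasts that reading with an assumed alternative in which `X` is a single sample with four features; the names `W_col` and `W_row` are illustrative.

```python
import torch
import torch.nn.functional as F

X = torch.ones(4)
b = torch.ones(1)

# Reading 1 (as in the cell above): 4 samples, 1 feature, 4 output units
W_col = torch.ones(4, 1)                        # (out_features=4, in_features=1)
batch_out = F.linear(X.view(-1, 1), W_col, b)
print(batch_out.shape)                          # torch.Size([4, 4])

# Reading 2 (assumed alternative): 1 sample, 4 features, 1 output unit
W_row = torch.ones(1, 4)                        # (out_features=1, in_features=4)
single_out = F.linear(X.view(1, -1), W_row, b)
print(single_out)                               # tensor([[5.]])
```

Which reading is appropriate depends on whether the four values are meant as separate samples or as features of one sample.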