# Assignment 1



### NAME   
#### Thallapally Nimisha

### RollNo  
#### CS22B1082

### Date  
#### 20/01/2025


### Question 2

**Implement the feedforward and backpropagation learning algorithm for multi layer perceptrons in Python for the question provided in the attached image.**<br>

Use the weights and biases as given.

- Implement the forward pass.
- Compute the loss between the predicted output and the actual output using an appropriate loss function (MSE).
- Compute the gradients of the loss function with respect to the weights and biases using the chain rule.
- Update the weights and biases.
- Iterate over multiple epochs, performing forward propagation, loss calculation, backpropagation, and parameter updates in each epoch until convergence (the actual output matches the target output).


In [71]:
import numpy as np

### Define Sigmoid Function 

In [72]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
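
The backpropagation steps later in this notebook rely on the identity σ'(x) = σ(x)(1 − σ(x)). The following is a minimal sketch that checks this identity against a central finite difference. The helper name `sigmoid_derivative`, the sample points, and the step size `h` are illustrative assumptions and are not part of the assignment code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Hypothetical helper: derivative written in terms of the sigmoid itself
    s = sigmoid(x)
    return s * (1 - s)

x = np.linspace(-4.0, 4.0, 9)   # illustrative sample points
h = 1e-5                        # illustrative finite-difference step
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

# Largest disagreement between the closed form and the numerical estimate
max_gap = np.max(np.abs(numeric - sigmoid_derivative(x)))
print(max_gap)
```

A very small `max_gap` indicates that `output * (1 - output)` can stand in for the derivative whenever `output` is already a sigmoid activation.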

### Initialise Weight Vector 

In [73]:
# Initialize weights 

w1 = np.array([[0.5, 1.5, 0.8], [0.8, 0.2, -1.6]])
w2 = np.array([[0.9, -1.7, 1.6], [1.2, 2.1, -0.2]])  


### Define input Feature Vector and target output

In [74]:
X1 = np.array([1, 0.7, 1.2])  # Input (1x3, including bias x0=1)
T = np.array([1.0, 0.0])         # Target output (1x2)

### Forward Pass

In [75]:
# Forward pass
def forward_pass(X , w):
    return sigmoid(np.dot(w,X))

### Calculating Error(MSE)

In [76]:
# Compute loss (Mean Squared Error)
def compute_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)
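
As a small worked example, the loss can be evaluated on the first prediction that appears in the training log further down (Epoch 1). This is only a sketch; the variable names `first_prediction` and `half_sse` are illustrative. It also shows that, for a two-element output, the mean of the squared errors equals half their sum.

```python
import numpy as np

def compute_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

T = np.array([1.0, 0.0])
# Output printed for Epoch 1 in the training log of this notebook
first_prediction = np.array([0.44137071, 0.95637774])

loss = compute_loss(T, first_prediction)
# Half the sum of squared errors, for comparison with the mean
half_sse = 0.5 * np.sum((T - first_prediction) ** 2)
print(loss, half_sse)
```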

### Back Propagation

In [77]:
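def back_propagation(w, X, del_vector, learning_rate):
    # Gradient-descent update: w[j, i] -= eta * delta[j] * X[i].
    # Note: w is modified in place, so the array passed in is changed too.
    n, m = w.shape
    for j in range(n):          # neurons in this layer
        for i in range(m):      # inputs to this layer (index 0 is the bias input)
            w[j, i] -= learning_rate * del_vector[j] * X[i]
    return w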
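
The double loop above applies the update `w[j, i] -= learning_rate * del_vector[j] * X[i]`, which is an outer product of the delta vector and the layer input. A possible vectorized sketch is shown below. The name `update_weights_vectorized` and the sample values of `X` and `delta` are hypothetical, and unlike the loop version this sketch returns a new array instead of modifying `w` in place.

```python
import numpy as np

def back_propagation(w, X, del_vector, learning_rate):
    # Loop version, as in the cell above (updates w in place)
    n, m = w.shape
    for j in range(n):
        for i in range(m):
            w[j, i] -= learning_rate * del_vector[j] * X[i]
    return w

def update_weights_vectorized(w, X, del_vector, learning_rate):
    # Hypothetical helper: same update via an outer product, returns a copy
    return w - learning_rate * np.outer(del_vector, X)

w = np.array([[0.9, -1.7, 1.6], [1.2, 2.1, -0.2]])
X = np.array([1.0, 0.5, -0.25])      # illustrative input, bias first
delta = np.array([0.1, -0.2])        # illustrative delta vector

looped = back_propagation(w.copy(), X, delta, 0.5)
vectorized = update_weights_vectorized(w, X, delta, 0.5)
same = np.allclose(looped, vectorized)
print(same)
```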

### Calculating the Error Vector for Hidden Layers

  - For each neuron \(j\) in the current layer:
     - Add the weighted sum of the error signals from the next layer.
     - Use \(W[i, j+1]\) because column 0 of the weight matrix, \(W[i, 0]\), holds the bias weight, and no error is propagated back through the constant bias input.

  - **Formula** : $$
\delta[j] = \sum_{i=0}^{m-1} \delta_{k+1}[i] \cdot W[i, j+1]
$$
  - **Final Formula** : $$
\delta[j] = \left( \sum_{i=0}^{m-1} \delta_{k+1}[i] \cdot W[i, j+1] \right) \cdot \text{output}[j] \cdot (1 - \text{output}[j])
$$

####  Explanation:
- **\(i\)**: indexes the neurons in the **next layer** (layer \(k+1\)); \(m\) is the number of neurons in that layer.
- **\(j\)**: indexes the neuron in the **current layer** (layer \(k\)) whose delta is being calculated.

### Formula Breakdown:
- For each neuron \( j \) in the current layer, \(\delta[j]\) is computed by summing over all neurons \(i\) in the next layer \(k+1\). Each term in the sum is the error signal of the \(i\)th neuron in the next layer multiplied by the weight \( W[i, j+1] \) that connects it to the \(j\)th neuron in the current layer.
- This summation gives the weighted sum of error signals for neuron \(j\), which is then scaled by the sigmoid derivative in the final formula.


- **Apply the derivative of the activation function:**
   ```python
   del_vector[j] *= output[j] * (1 - output[j])
   ```
   - Multiply the accumulated value by the derivative of the sigmoid function

In [78]:
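def calculate_del_vector(W, del_vector_k1, output):
    # W has shape (neurons in next layer, 1 + neurons in current layer);
    # column 0 holds the bias weights and is skipped via the j+1 offset.
    n_current = W.shape[1] - 1
    del_vector = np.zeros(n_current)
    for j in range(n_current):
        for i in range(len(del_vector_k1)):
            del_vector[j] += del_vector_k1[i] * W[i, j+1]
        # Scale by the sigmoid derivative output * (1 - output)
        del_vector[j] *= output[j] * (1 - output[j])
    return del_vector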
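
The final formula can also be written as a matrix product: drop the bias column of the weight matrix, transpose it, multiply by the next layer's deltas, and scale by the sigmoid derivative. The sketch below compares a loop version with that form on a non-square example. The names `hidden_delta_loop` and `hidden_delta_vectorized`, the layer sizes, and the random test values are assumptions made for illustration.

```python
import numpy as np

def hidden_delta_loop(W, delta_next, output):
    # Loop form of the final formula; one delta per current-layer neuron
    delta = np.zeros(len(output))
    for j in range(len(output)):
        for i in range(len(delta_next)):
            delta[j] += delta_next[i] * W[i, j + 1]   # j+1 skips the bias column
        delta[j] *= output[j] * (1 - output[j])
    return delta

def hidden_delta_vectorized(W, delta_next, output):
    # Hypothetical helper: W[:, 1:] removes the bias column before transposing
    return (W[:, 1:].T @ delta_next) * output * (1 - output)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))                 # 3 next-layer neurons, bias + 4 current-layer neurons
delta_next = rng.normal(size=3)
output = rng.uniform(0.1, 0.9, size=4)      # stand-in sigmoid activations

loop_result = hidden_delta_loop(W, delta_next, output)
vector_result = hidden_delta_vectorized(W, delta_next, output)
same = np.allclose(loop_result, vector_result)
print(same)
```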

### Feed Forward Step

The code first performs a forward pass through the network layers. 
- For layer 1, the input X (which already includes the bias input x0 = 1) is multiplied by the weights W1 and passed through the sigmoid to get output1. 
- For layer 2, a bias input of 1 is prepended to output1 to form X2, and the weights W2 are applied to X2 to compute the final output (output2).

The formula for the feedforward process is:

$$
\text{output1} = \sigma(W_1 \cdot X)
$$

$$
X_2 = [1, \text{output1}]
$$

$$
\text{output2} = \sigma(W_2 \cdot X_2)
$$
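
As a worked example, the forward pass can be evaluated once with the initial weights and input defined earlier in this notebook. This is a sketch that repeats those definitions so it runs on its own; the result should match the output printed for Epoch 1 in the training log, since that output is computed before any weight update.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Initial weights, input and layout as defined in the notebook (bias weight in column 0)
w1 = np.array([[0.5, 1.5, 0.8], [0.8, 0.2, -1.6]])
w2 = np.array([[0.9, -1.7, 1.6], [1.2, 2.1, -0.2]])
X1 = np.array([1, 0.7, 1.2])

output1 = sigmoid(np.dot(w1, X1))         # hidden-layer activations
X2 = np.array([1.0, *output1])            # prepend the bias input for layer 2
output2 = sigmoid(np.dot(w2, X2))         # network output
print(output1, output2)
```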

### Backpropagation Step

Backpropagation computes the gradient of the loss with respect to the weights by applying the chain rule. The error term \( \delta_2 \) for the second layer is the derivative of the loss with respect to that layer's weighted input. It is obtained by multiplying the error (output2 - t) elementwise by the derivative of the sigmoid activation:

$$
\delta_2 = ( \text{output2} - t ) \cdot \text{output2} \cdot (1 - \text{output2})
$$

Then, the weights \( W_2 \) are updated using gradient descent with learning rate \( \eta \), where \( X_2 \) is the bias-augmented input to layer 2:

$$
W_2 = W_2 - \eta \cdot \delta_2 \cdot X_2^T
$$

The hidden-layer error term \( \delta_1 \) is obtained from \( \delta_2 \) with the hidden-layer formula given earlier, and \( W_1 \) is updated in the same way:

$$
W_1 = W_1 - \eta \cdot \delta_1 \cdot X^T
$$
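
One way to sanity-check the output-layer delta is a numerical gradient check. The sketch below assumes the loss is half the sum of squared errors (for two outputs this equals the mean used in `compute_loss`), so that the gradient with respect to the second-layer weights is the outer product of the delta and the layer input. The input `X2`, the step size `h`, and the helper name `half_sse` are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def half_sse(W2, X2, t):
    # Assumed loss: 0.5 * sum of squared errors at the output layer
    y = sigmoid(W2 @ X2)
    return 0.5 * np.sum((y - t) ** 2)

W2 = np.array([[0.9, -1.7, 1.6], [1.2, 2.1, -0.2]])
X2 = np.array([1.0, 0.6, 0.3])     # illustrative layer-2 input, bias first
t = np.array([1.0, 0.0])

# Analytic gradient from the delta rule
y = sigmoid(W2 @ X2)
delta2 = (y - t) * y * (1 - y)
analytic = np.outer(delta2, X2)

# Central finite differences, one weight at a time
h = 1e-6
numeric = np.zeros_like(W2)
for j in range(W2.shape[0]):
    for i in range(W2.shape[1]):
        W_plus, W_minus = W2.copy(), W2.copy()
        W_plus[j, i] += h
        W_minus[j, i] -= h
        numeric[j, i] = (half_sse(W_plus, X2, t) - half_sse(W_minus, X2, t)) / (2 * h)

max_gap = np.max(np.abs(numeric - analytic))
print(max_gap)
```

If `max_gap` is close to zero, the delta formula and the weight update direction agree with the numerical gradient of the assumed loss.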

### Convergence Check

The convergence condition checks whether the absolute difference between the target output \( t \) and the actual output \( \text{output2} \) is below a predefined threshold for every element \( i \):

$$
| \text{output2}_i - t_i | < 1 \times 10^{-2} \quad \text{for all } i
$$
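
The check is applied to each output element separately. As a small illustration, the sketch below evaluates it on two outputs taken from this notebook: the output logged at Epoch 3676 and the final output printed after training. The helper name `has_converged` is hypothetical.

```python
import numpy as np

def has_converged(output, target, tol=1e-2):
    # True only if every element is within tol of its target
    return bool(np.all(np.abs(output - target) < tol))

T = np.array([1.0, 0.0])
late_output = np.array([0.98986949, 0.0103927])     # logged at Epoch 3676
final_output = np.array([0.99023658, 0.00999979])   # printed after training

print(has_converged(late_output, T))
print(has_converged(final_output, T))
```

The Epoch 3676 output misses the threshold on both elements by a small margin, while the final output is within 0.01 of the target on both.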


In [79]:
def multilayer_perceptron(X, W1, W2, t, learning_rate):
    epoch = 0
    while True:
        epoch += 1

        # Feed forward
        output1 = forward_pass(X, W1)
        X2 = np.array([1.0, *output1])  # Input for layer 2
        output2 = forward_pass(X2, W2)

        # Check for convergence
        if np.all(np.abs(output2 - t) < 1e-2):
            print(f"Converged after {epoch} epochs")
            break

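        # Backpropagation
        delta2 = (output2 - t) * output2 * (1 - output2)
        W2 = back_propagation(W2, X2, delta2, learning_rate)

        # Note: delta1 is computed from W2 after the update above; textbook
        # backpropagation uses the pre-update W2 at this step.
        delta1 = calculate_del_vector(W2, delta2, output1)
        W1 = back_propagation(W1, X, delta1, learning_rate)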

        # Print progress
        print(f"Epoch {epoch}: Output = {output2}")
        print(f"W1: {W1}")
        print(f"W2: {W2}")
    
    return W1, W2,output2
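
The loop above runs until the convergence check passes, so it would not terminate if the network failed to converge. A possible variant is sketched below with an epoch limit and vectorized updates. The function name `train`, the `max_epochs` and `tol` parameters, and the extra epoch return value are assumptions introduced for illustration. The sketch keeps the same update order as the loop above (the hidden delta is computed after the second-layer weights are updated) and works on copies, so the arrays passed in are left unchanged.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train(X, W1, W2, t, learning_rate, tol=1e-2, max_epochs=20000):
    # Copy so the caller's weight arrays are not modified
    W1, W2 = W1.copy(), W2.copy()
    out2 = None
    for epoch in range(1, max_epochs + 1):
        # Feed forward
        out1 = sigmoid(W1 @ X)
        X2 = np.concatenate(([1.0], out1))   # prepend the bias input
        out2 = sigmoid(W2 @ X2)

        # Elementwise convergence check
        if np.all(np.abs(out2 - t) < tol):
            return W1, W2, out2, epoch

        # Backpropagation, same ordering as the loop above
        delta2 = (out2 - t) * out2 * (1 - out2)
        W2 = W2 - learning_rate * np.outer(delta2, X2)
        delta1 = (W2[:, 1:].T @ delta2) * out1 * (1 - out1)
        W1 = W1 - learning_rate * np.outer(delta1, X)
    # Epoch limit reached without convergence
    return W1, W2, out2, None

w1 = np.array([[0.5, 1.5, 0.8], [0.8, 0.2, -1.6]])
w2 = np.array([[0.9, -1.7, 1.6], [1.2, 2.1, -0.2]])
X1 = np.array([1, 0.7, 1.2])
T = np.array([1.0, 0.0])

W1_final, W2_final, output, epochs = train(X1, w1, w2, T, 0.5)
print(epochs, output)
```

Returning `None` for the epoch count lets the caller tell a converged run apart from one that hit the limit.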


In [80]:
W1, W2,output = multilayer_perceptron(X1, w1, w2, T, 0.5)

Epoch 1: Output = [0.44137071 0.95637774]
W1: [[ 0.48928025  1.49249617  0.7871363 ]
 [ 0.8229341   0.21605387 -1.57247908]]
W2: [[ 0.96886855 -1.63630762  1.61879366]
 [ 1.18005027  2.08154969 -0.20544412]]
Epoch 2: Output = [0.48071757 0.95442284]
W1: [[ 0.47892982  1.48525087  0.77471578]
 [ 0.84552115  0.2318648  -1.54537462]]
W2: [[ 1.03368231 -1.57650867  1.63735805]
 [ 1.15929163  2.0623972  -0.21138994]]
Epoch 3: Output = [0.51850378 0.95230896]
W1: [[ 0.46899918  1.47829943  0.76279902]
 [ 0.86741089  0.24718763 -1.51910693]]
W2: [[ 1.09378691 -1.52118633  1.65539794]
 [ 1.13766631  2.04249252 -0.2178806 ]]
Epoch 4: Output = [0.55387849 0.95002063]
W1: [[ 0.45949453  1.47164617  0.75139344]
 [ 0.88835515  0.2618486  -1.49397382]]
W2: [[ 1.14890458 -1.47057303  1.672693  ]
 [ 1.11511214  2.02178153 -0.22495775]]
Epoch 5: Output = [0.58633523 0.94753906]
W1: [[ 0.45038709  1.46527096  0.7404645 ]
 [ 0.90821357  0.2757495  -1.47014372]]
W2: [[ 1.19907099 -1.42461242  1.68910477]


Epoch 799: Output = [0.97858376 0.02336501]
W1: [[ 0.09040363  1.21328254  0.30848436]
 [ 1.81869831  0.91308882 -0.37756203]]
W2: [[ 2.2181754  -0.56313768  2.32369205]
 [-1.70185656 -0.34762722 -1.99474709]]
Epoch 800: Output = [0.97859671 0.02334892]
W1: [[ 0.09039801  1.21327861  0.30847762]
 [ 1.81880836  0.91316585 -0.37742996]]
W2: [[ 2.21839955 -0.56296116  2.3238896 ]
 [-1.70212278 -0.34783687 -1.99498171]]
Epoch 801: Output = [0.97860963 0.02333286]
W1: [[ 0.09039241  1.21327469  0.3084709 ]
 [ 1.81891826  0.91324278 -0.37729809]]
W2: [[ 2.21862343 -0.56278486  2.32408691]
 [-1.70238864 -0.34804623 -1.99521602]]
Epoch 802: Output = [0.97862253 0.02331684]
W1: [[ 0.09038683  1.21327078  0.3084642 ]
 [ 1.819028    0.9133196  -0.3771664 ]]
W2: [[ 2.21884704 -0.56260877  2.32428399]
 [-1.70265414 -0.34825531 -1.99545003]]
Epoch 803: Output = [0.97863541 0.02330084]
W1: [[ 0.09038127  1.21326689  0.30845753]
 [ 1.81913758  0.91339631 -0.3770349 ]]
W2: [[ 2.21907039 -0.56243288  2.

Epoch 1697: Output = [0.98516481 0.01557101]
W1: [[ 0.08941774  1.21259242  0.30730129]
 [ 1.8817982   0.95725874 -0.30184216]]
W2: [[ 2.35508512 -0.4553825   2.44563739]
 [-1.85787331 -0.47042114 -2.13369328]]
Epoch 1698: Output = [0.9851691  0.01556615]
W1: [[ 0.08941888  1.21259321  0.30730265]
 [ 1.88184523  0.95729166 -0.30178572]]
W2: [[ 2.35519346 -0.45529723  2.44573483]
 [-1.85799257 -0.470515   -2.13380054]]
Epoch 1699: Output = [0.98517338 0.0155613 ]
W1: [[ 0.08942002  1.21259401  0.30730402]
 [ 1.88189224  0.95732457 -0.30172932]]
W2: [[ 2.35530175 -0.45521201  2.44583221]
 [-1.85811177 -0.47060881 -2.13390773]]
Epoch 1700: Output = [0.98517766 0.01555646]
W1: [[ 0.08942116  1.21259481  0.30730539]
 [ 1.88193921  0.95735745 -0.30167295]]
W2: [[ 2.35540997 -0.45512684  2.44592954]
 [-1.85823089 -0.47070256 -2.13401486]]
Epoch 1701: Output = [0.98518194 0.01555162]
W1: [[ 0.0894223   1.21259561  0.30730676]
 [ 1.88198615  0.95739031 -0.30161661]]
W2: [[ 2.35551813 -0.4550417

Epoch 2100: Output = [0.98664031 0.01391648]
W1: [[ 0.0900267   1.21301869  0.30803204]
 [ 1.89859475  0.96901632 -0.2816863 ]]
W2: [[ 2.39437631 -0.42445473  2.48105929]
 [-1.90079513 -0.50420672 -2.17238804]]
Epoch 2101: Output = [0.98664344 0.013913  ]
W1: [[ 0.0900285   1.21301995  0.3080342 ]
 [ 1.8986318   0.96904226 -0.28164184]]
W2: [[ 2.39446431 -0.42438544  2.48113882]
 [-1.90089057 -0.50428186 -2.17247429]]
Epoch 2102: Output = [0.98664657 0.01390952]
W1: [[ 0.09003031  1.21302122  0.30803637]
 [ 1.89866883  0.96906818 -0.2815974 ]]
W2: [[ 2.39455228 -0.42431618  2.48121831]
 [-1.90098596 -0.50435696 -2.1725605 ]]
Epoch 2103: Output = [0.9866497  0.01390605]
W1: [[ 0.09003212  1.21302248  0.30803854]
 [ 1.89870584  0.96909409 -0.28155299]]
W2: [[ 2.39464021 -0.42424696  2.48129777]
 [-1.9010813  -0.50443203 -2.17264666]]
Epoch 2104: Output = [0.98665282 0.01390258]
W1: [[ 0.09003392  1.21302375  0.30804071]
 [ 1.89874283  0.96911998 -0.2815086 ]]
W2: [[ 2.39472809 -0.4241777

Epoch 2522: Output = [0.98779363 0.01264337]
W1: [[ 0.09086761  1.21360733  0.30904114]
 [ 1.91269356  0.97888549 -0.26476772]]
W2: [[ 2.42829186 -0.39774615  2.51176875]
 [-1.93736581 -0.53300621 -2.20550151]]
Epoch 2523: Output = [0.98779602 0.01264075]
W1: [[ 0.09086975  1.21360883  0.3090437 ]
 [ 1.91272376  0.97890663 -0.26473149]]
W2: [[ 2.42836542 -0.3976882   2.51183548]
 [-1.93744469 -0.53306834 -2.20557308]]
Epoch 2524: Output = [0.9877984  0.01263813]
W1: [[ 0.09087189  1.21361033  0.30904627]
 [ 1.91275394  0.97892776 -0.26469527]]
W2: [[ 2.42843895 -0.39763028  2.51190219]
 [-1.93752354 -0.53313046 -2.20564462]]
Epoch 2525: Output = [0.98780079 0.01263552]
W1: [[ 0.09087404  1.21361182  0.30904884]
 [ 1.91278411  0.97894888 -0.26465907]]
W2: [[ 2.42851245 -0.39757238  2.51196888]
 [-1.93760236 -0.53319255 -2.20571613]]
Epoch 2526: Output = [0.98780317 0.0126329 ]
W1: [[ 0.09087618  1.21361332  0.30905141]
 [ 1.91281427  0.97896999 -0.26462288]]
W2: [[ 2.42858593 -0.3975145

Epoch 2870: Output = [0.98854875 0.01181881]
W1: [[ 0.09163853  1.21414697  0.30996623]
 [ 1.9224661   0.98572627 -0.25304068]]
W2: [[ 2.45230497 -0.37882613  2.53358272]
 [-1.96302494 -0.55322312 -2.22881072]]
Epoch 2871: Output = [0.98855072 0.01181667]
W1: [[ 0.0916408   1.21414856  0.30996896]
 [ 1.92249224  0.98574457 -0.25300931]]
W2: [[ 2.45236977 -0.37877507  2.53364165]
 [-1.96309393 -0.55327749 -2.22887348]]
Epoch 2872: Output = [0.98855269 0.01181452]
W1: [[ 0.09164308  1.21415016  0.3099717 ]
 [ 1.92251838  0.98576286 -0.25297795]]
W2: [[ 2.45243454 -0.37872402  2.53370057]
 [-1.9631629  -0.55333184 -2.22893621]]
Epoch 2873: Output = [0.98855466 0.01181238]
W1: [[ 0.09164536  1.21415175  0.30997443]
 [ 1.9225445   0.98578115 -0.2529466 ]]
W2: [[ 2.45249929 -0.378673    2.53375946]
 [-1.96323184 -0.55338617 -2.22899892]]
Epoch 2874: Output = [0.98855663 0.01181024]
W1: [[ 0.09164763  1.21415334  0.30997716]
 [ 1.92257061  0.98579943 -0.25291527]]
W2: [[ 2.45256401 -0.3786219

Epoch 3672: Output = [0.98986402 0.01039856]
W1: [[ 0.09351508  1.21546055  0.31221809]
 [ 1.94070064  0.99849044 -0.23115924]]
W2: [[ 2.49822644 -0.34261545  2.57545293]
 [-2.01162455 -0.59154557 -2.27312261]]
Epoch 3673: Output = [0.98986539 0.0103971 ]
W1: [[ 0.09351745  1.21546221  0.31222094]
 [ 1.9407205   0.99850435 -0.2311354 ]]
W2: [[ 2.49827727 -0.34257534  2.57549939]
 [-2.01167804 -0.59158777 -2.27317149]]
Epoch 3674: Output = [0.98986676 0.01039563]
W1: [[ 0.09351982  1.21546387  0.31222378]
 [ 1.94074037  0.99851826 -0.23111156]]
W2: [[ 2.49832809 -0.34253524  2.57554583]
 [-2.01173151 -0.59162996 -2.27322036]]
Epoch 3675: Output = [0.98986812 0.01039416]
W1: [[ 0.09352218  1.21546553  0.31222662]
 [ 1.94076022  0.99853216 -0.23108773]]
W2: [[ 2.4983789  -0.34249516  2.57559227]
 [-2.01178497 -0.59167214 -2.27326921]]
Epoch 3676: Output = [0.98986949 0.0103927 ]
W1: [[ 0.09352455  1.21546719  0.31222946]
 [ 1.94078007  0.99854605 -0.23106391]]
W2: [[ 2.4984297  -0.3424550

In [81]:
print("output before mapping")
print(output)
output = np.round(output).astype(int)
print("output after mapping")
print(output)

output before mapping
[0.99023658 0.00999979]
output after mapping
[1 0]


In [82]:
print("The final Weight vectors are : \n")
print(W1)
print(W2)

The final Weight vectors are : 

[[ 0.09419442  1.2159361   0.31303331]
 [ 1.94616879  1.00231815 -0.22459746]]
[[ 2.51228286 -0.33152243  2.58830795]
 [-2.02639032 -0.60319838 -2.28662633]]
