In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.DataFrame([[8,8,4],[7,9,5],[6,10,6],[5,12,7]], columns=['cgpa', 'profile_score', 'lpa'])

In [3]:
df

Unnamed: 0,cgpa,profile_score,lpa
0,8,8,4
1,7,9,5
2,6,10,6
3,5,12,7


In [4]:
def initialize_parameters(layer_dims):
  
  np.random.seed(3)
  parameters = {}
  L = len(layer_dims)         

  for l in range(1, L):

    parameters['W' + str(l)] = np.ones((layer_dims[l-1], layer_dims[l]))*0.1
    parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
      

  return parameters

### Function: `initialize_parameters`

This function initializes the **weights and biases** for a fully connected neural network.

#### Example call
```python
initialize_parameters([2, 2, 1])
```

#### Meaning of `[2, 2, 1]`

* Input layer: 2 neurons
* Hidden layer: 2 neurons
* Output layer: 1 neuron

#### How the function works

* `layer_dims` defines the network structure
* `L = len(layer_dims)` gives the number of layers
* The loop runs `L - 1` times (input layer has no weights)

For `[2, 2, 1]`, the loop runs 2 times:

* Creates `W1 (2×2)` and `b1 (2×1)`
* Creates `W2 (2×1)` and `b2 (1×1)`

#### Initialization values

* All weights are initialized to `0.1`
* All biases are initialized to `0`

#### One-line summary

This function creates all the weights and biases with correct shapes so that a neural network can start training.

In [5]:
initialize_parameters([2,2,1])

{'W1': array([[0.1, 0.1],
        [0.1, 0.1]]),
 'b1': array([[0.],
        [0.]]),
 'W2': array([[0.1],
        [0.1]]),
 'b2': array([[0.]])}
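
As a quick check on the shape rule, the sketch below reruns the same initialization on a different, purely hypothetical layout `[3, 4, 2]` and collects the shape of every parameter. The function body repeats the notebook's logic so the block runs on its own; the layer sizes are an assumption chosen only to make the dimensions easy to tell apart.

```python
import numpy as np

def initialize_parameters(layer_dims):
    # Same convention as the notebook: W_l has shape (n_prev, n_current)
    parameters = {}
    for l in range(1, len(layer_dims)):
        parameters['W' + str(l)] = np.ones((layer_dims[l-1], layer_dims[l])) * 0.1
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

# Hypothetical layer sizes, not taken from the notebook
params = initialize_parameters([3, 4, 2])
shapes = {name: value.shape for name, value in params.items()}
print(shapes)
```

Each `W` takes its row count from the previous layer and its column count from the current layer, which is why `linear_forward` transposes it before multiplying.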

In [10]:
def linear_forward(A_prev, W, b):
  
  Z = np.dot(W.T, A_prev) + b
  
  return Z



This function computes the **linear part of forward propagation** for one layer in a neural network.

---


### Purpose

It calculates the weighted sum of inputs plus bias **before applying the activation function**.

---

### Mathematical representation

\[
Z = W^T A_{prev} + b
\]

---

### Parameters

* **A_prev**: Activations (or inputs) from the previous layer
* **W**: Weight matrix of the current layer
* **b**: Bias vector of the current layer

---

### Shape intuition

If:

* `A_prev` has shape `(n_prev, m)`
* `W` has shape `(n_prev, n_current)`
* `b` has shape `(n_current, 1)`

Then:

* `Z` has shape `(n_current, m)`

---

### Example

For 2 inputs and 1 neuron:

```python
A_prev = [[x1],
          [x2]]

W = [[w1],
     [w2]]

b = [[b]]
```

\[
Z = w_1 x_1 + w_2 x_2 + b
\]

---

### One-line summary

This function computes the linear transformation `Z` used in forward propagation of a neural network.
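
To make the shape rule concrete, here is a minimal sketch that calls `linear_forward` on the first training example (`cgpa = 8`, `profile_score = 8`) with the initial weights of `0.1` and zero biases. The function is repeated so the block runs on its own; the variable names are illustrative.

```python
import numpy as np

def linear_forward(A_prev, W, b):
    # W is stored as (n_prev, n_current), so it is transposed before the product
    return np.dot(W.T, A_prev) + b

A_prev = np.array([[8.0], [8.0]])   # shape (n_prev=2, m=1)
W = np.ones((2, 2)) * 0.1           # shape (n_prev=2, n_current=2)
b = np.zeros((2, 1))                # shape (n_current=2, 1)

Z = linear_forward(A_prev, W, b)
print(Z.shape)   # (2, 1)
print(Z)         # each entry is 0.1*8 + 0.1*8 = 1.6
```

The result has one row per neuron in the current layer and one column per training example.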



In [11]:
# Forward Prop
def L_layer_forward(X, parameters):

  A = X
  L = len(parameters) // 2                  # number of layers in the neural network
  
  for l in range(1, L+1):
    A_prev = A 
    Wl = parameters['W' + str(l)]
    bl = parameters['b' + str(l)]
    #print("A"+str(l-1)+": ", A_prev)
    #print("W"+str(l)+": ", Wl)
    #print("b"+str(l)+": ", bl)
    #print("--"*20)

    A = linear_forward(A_prev, Wl, bl)
    #print("A"+str(l)+": ", A)
    #print("**"*20)
          
  return A,A_prev

## Function: `L_layer_forward`

This function performs **forward propagation through all layers** of a neural network.

---

### Code
```python
# Forward Prop
def L_layer_forward(X, parameters):

    A = X
    L = len(parameters) // 2   # number of layers in the neural network
  
    for l in range(1, L+1):
        A_prev = A 
        Wl = parameters['W' + str(l)]
        bl = parameters['b' + str(l)]

        A = linear_forward(A_prev, Wl, bl)
          
    return A, A_prev
```

---

### Purpose

It passes the input data **layer by layer** through the network and computes the final output.

---

### Step-by-step explanation

#### 1. Input initialization

```python
A = X
```

* `X` is the input data
* `A` stores the activations (initially, input itself)

---

#### 2. Number of layers

```python
L = len(parameters) // 2
```

* Each layer has `W` and `b`
* Total layers = number of parameter pairs

Example:

```python
parameters = initialize_parameters([2, 2, 1])   # keys: 'W1', 'b1', 'W2', 'b2'
L = len(parameters) // 2                        # 4 // 2 = 2
```

---

#### 3. Loop through layers

```python
for l in range(1, L+1):
```

* Runs once per layer
* `l = 1` → first hidden layer
* `l = L` → output layer

---

#### 4. Store previous activation

```python
A_prev = A
```

* Saves activation from the previous layer

---

#### 5. Extract parameters

```python
Wl = parameters['W' + str(l)]
bl = parameters['b' + str(l)]
```

* Gets weights and biases for layer `l`

---

#### 6. Linear forward step

```python
A = linear_forward(A_prev, Wl, bl)
```

* Computes:
  \[
  A = W_l^T A_{prev} + b_l
  \]
* Updates activation for the current layer

---

#### 7. Return values

```python
return A, A_prev
```

* `A` → final output of the network
* `A_prev` → activation of the second-last layer

---

### One-line summary

This function performs forward propagation across all layers of the neural network and returns the final output.
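
Because every step is a matrix product, the same function should also accept several examples at once when they are stacked as columns. The sketch below assumes the four rows of `df` arranged as a `(2, 4)` matrix and the initial parameters (weights `0.1`, biases `0`); `X_batch` is a hypothetical name that does not appear in the notebook.

```python
import numpy as np

def linear_forward(A_prev, W, b):
    return np.dot(W.T, A_prev) + b

def L_layer_forward(X, parameters):
    A = X
    L = len(parameters) // 2
    for l in range(1, L + 1):
        A_prev = A
        A = linear_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)])
    return A, A_prev

parameters = {
    'W1': np.full((2, 2), 0.1), 'b1': np.zeros((2, 1)),
    'W2': np.full((2, 1), 0.1), 'b2': np.zeros((1, 1)),
}

# Columns are training examples: (cgpa, profile_score) for the four rows of df
X_batch = np.array([[8, 7, 6, 5],
                    [8, 9, 10, 12]], dtype=float)

y_hat, A1 = L_layer_forward(X_batch, parameters)
print(A1.shape, y_hat.shape)   # (2, 4) (1, 4)
print(y_hat)
```

The biases have a single column and are broadcast across all four examples by NumPy.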



In [12]:
X = df[['cgpa', 'profile_score']].values[0].reshape(2,1) # Shape(no of features, no. of training example)
y = df[['lpa']].values[0][0]

# Parameter initialization
parameters = initialize_parameters([2,2,1])

y_hat,A1 = L_layer_forward(X, parameters)


## Example: Forward Propagation with Real Data

This code shows how **input data is passed through a neural network** using the functions you defined.

---

### Step 1: Input data `X`

* You select **2 features**:

  * `cgpa`
  * `profile_score`
* `.values[0]` → takes the **first training example**
* `.reshape(2,1)` → converts it to a column vector

📐 Shape of `X`:

```
(2, 1)
```

Meaning:

* 2 features
* 1 training example

---

### Step 2: True output `y`

```python
y = df[['lpa']].values[0][0]
```

* Extracts the **actual label (target value)** for the same data point
* This is the **ground truth** used later to compute loss

---

### Step 3: Initialize parameters

```python
parameters = initialize_parameters([2,2,1])
```

Creates weights and biases for a network with:

```
2 inputs → 2 hidden neurons → 1 output neuron
```

Parameters created:

* `W1 (2×2)`, `b1 (2×1)`
* `W2 (2×1)`, `b2 (1×1)`

All weights = `0.1`, all biases = `0`

---

### Step 4: Forward propagation

```python
y_hat, A1 = L_layer_forward(X, parameters)
```

What happens internally:

1. **Layer 1 (hidden layer)**
   \[
   A_1 = W_1^T X + b_1
   \]

2. **Layer 2 (output layer)**
   \[
   \hat{y} = W_2^T A_1 + b_2
   \]

---

### Returned values

* `y_hat` → **predicted output** (model prediction)
* `A1` → **activation from hidden layer**

---

### Shape intuition

| Variable | Shape |
| -------- | ----- |
| `X`      | (2,1) |
| `A1`     | (2,1) |
| `y_hat`  | (1,1) |

---

### One-line summary

This code takes one data point, initializes a 2–2–1 neural network, and performs forward propagation to compute the predicted output.
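
For the first row of `df` (`cgpa = 8`, `profile_score = 8`) and the initial parameters (all weights `0.1`, all biases `0`), the two layers can be worked out by hand. This is only the arithmetic of the two formulas above, with the numbers substituted:

\[
A_1 = W_1^T X + b_1
= \begin{bmatrix} 0.1 & 0.1 \\ 0.1 & 0.1 \end{bmatrix}
\begin{bmatrix} 8 \\ 8 \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} 1.6 \\ 1.6 \end{bmatrix}
\]

\[
\hat{y} = W_2^T A_1 + b_2 = 0.1 \cdot 1.6 + 0.1 \cdot 1.6 + 0 = 0.32
\]

The hidden-layer value agrees with the `A1` array printed by the notebook, and the prediction of `0.32` is far from the target `y = 4`, which is what the later parameter update tries to correct.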



In [13]:
y_hat = y_hat[0][0]


### Line: `y_hat = y_hat[0][0]`

This line converts the model output from a **2D NumPy array** into a **single scalar value**.

---

### Before this line

After forward propagation:

```python
y_hat, A1 = L_layer_forward(X, parameters)
```

`y_hat` looks like this:

```text
[[value]]
```

📐 Shape:

```
(1, 1)
```

Example:

```python
y_hat = [[0.85]]
```

---

### Why this happens

Neural networks work with **matrices**, even when:

* There is only **one output neuron**
* There is only **one training example**

So the output is kept as a `(1,1)` array.

---

### What `[0][0]` does

```python
y_hat = y_hat[0][0]
```

* First `[0]` → selects the first row
* Second `[0]` → selects the first column

Result:

```python
y_hat = 0.85
```

Now `y_hat` is a **scalar**, not an array.

---

### Why convert to scalar?

* Easier to:

  * Print values
  * Compare with `y`
  * Compute loss manually
* Useful for **single-example predictions**

---

### One-line summary

This line extracts the scalar prediction value from a `(1,1)` NumPy array returned by the neural network.
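
For comparison, the sketch below shows three ways to pull the single value out of a `(1, 1)` array. The notebook uses the first; the other two are standard NumPy alternatives shown here only for illustration. The value `0.32` is what the forward pass produces for the first example with the initial parameters.

```python
import numpy as np

y_hat = np.array([[0.32]])   # (1, 1) array, as returned by the forward pass

a = y_hat[0][0]    # chained indexing, as in the notebook
b = y_hat[0, 0]    # single tuple index, same element
c = y_hat.item()   # returns a plain Python float; only valid for one-element arrays

print(a, b, c)
print(type(a).__name__, type(c).__name__)
```

Indexing returns a NumPy scalar, while `.item()` converts to a built-in `float`; for printing, comparing with `y`, or computing the loss by hand, either works.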



In [14]:
A1


array([[1.6],
       [1.6]])


## What is `A1`?

`A1` is the **activation (output) of the hidden layer** in your neural network.

---

### Where does `A1` come from?

From this line:

```python
y_hat, A1 = L_layer_forward(X, parameters)
```

Inside `L_layer_forward`:

* `A_prev` is updated at each layer
* At the **last iteration**, before computing the output layer,
  `A_prev` holds the activation of the **hidden layer**
* That value is returned as `A1`

---

### In your network structure

```python
initialize_parameters([2, 2, 1])
```

Architecture:

```
Input (2) → Hidden (2) → Output (1)
```

So:

* `A1` = output of the **hidden layer**
* `y_hat` = output of the **final layer**

---

### Mathematical meaning

Hidden layer computation:

\[
A_1 = W_1^T X + b_1
\]

(no activation function applied yet in your code)

---

### Shape of `A1`

```text
A1 shape = (2, 1)
```

Why?

* 2 hidden neurons
* 1 training example

Example:

```python
A1 =
[[a1],
 [a2]]
```

Each value corresponds to **one hidden neuron’s output**.

---

### Why `A1` is useful

* Needed for **backpropagation**
* Used to compute gradients for `W2`
* Helps understand what the hidden layer learned

---

### One-line summary

`A1` is the hidden layer’s output (activation) produced during forward propagation, and it is later used for backpropagation.



In [16]:
def update_parameters(parameters, y, y_hat, A1, X):

    lr = 0.001
    error_grad = 2 * (y - y_hat)

    # ----- Update W2 -----
    parameters['W2'][0][0] += lr * error_grad * A1[0][0]
    parameters['W2'][1][0] += lr * error_grad * A1[1][0]

    # ----- Update b2 -----
    parameters['b2'][0][0] += lr * error_grad


    # ----- Update W1 (hidden neuron 1) -----
    parameters['W1'][0][0] += lr * error_grad * parameters['W2'][0][0] * X[0][0]
    parameters['W1'][0][1] += lr * error_grad * parameters['W2'][0][0] * X[1][0]
    parameters['b1'][0][0] += lr * error_grad * parameters['W2'][0][0]


    # ----- Update W1 (hidden neuron 2) -----
    parameters['W1'][1][0] += lr * error_grad * parameters['W2'][1][0] * X[0][0]
    parameters['W1'][1][1] += lr * error_grad * parameters['W2'][1][0] * X[1][0]
    parameters['b1'][1][0] += lr * error_grad * parameters['W2'][1][0]

    return parameters



## Explanation of `update_parameters` (Corrected Version)

This function performs **manual backpropagation** and updates the **weights and biases** of a
**2 → 2 → 1 neural network** using **one training example**.

---

## Network Setup
- Architecture: `2 → 2 → 1`
- Loss function: Mean Squared Error (MSE)
\[
L = (y - \hat{y})^2
\]
- Learning rate: `0.001`
- No activation functions (pure linear model)

---

## Step 0: Error term

```python
lr = 0.001
error_grad = 2 * (y - y_hat)
```

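From MSE:
\[
\frac{\partial L}{\partial \hat{y}} = -2 (y - \hat{y})
\]

`error_grad` stores the negative of this derivative, `2 * (y - y_hat)`. That is why every update below uses `+=`: adding `lr * error_grad * (...)` is the same as subtracting `lr` times the gradient. Its magnitude tells **how wrong the prediction is**, and its sign gives the direction to update.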

---

## Step 1: Update Output Layer Parameters (`W2`, `b2`)

### Update `W2`

```python
parameters['W2'][0][0] += lr * error_grad * A1[0][0]
parameters['W2'][1][0] += lr * error_grad * A1[1][0]
```

Explanation:
\[
\frac{\partial L}{\partial W_2} = -2 (y - \hat{y}) \cdot A_1
\]

(The `+=` with `error_grad` subtracts this gradient.)

Each output weight is updated using:

* Prediction error
* Corresponding hidden neuron output

---

### Update `b2`

```python
parameters['b2'][0][0] += lr * error_grad
```

Explanation:
\[
\frac{\partial L}{\partial b_2} = -2 (y - \hat{y})
\]

Bias depends **only on the error**, not on inputs or weights.

---

## Step 2: Update Hidden Layer Parameters (`W1`, `b1`)

Here we apply the **chain rule**.

\[
\frac{\partial L}{\partial W_1}
=
\frac{\partial L}{\partial \hat{y}}
\cdot
\frac{\partial \hat{y}}{\partial A_1}
\cdot
\frac{\partial A_1}{\partial W_1}
\]
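
Written out entry by entry, the three factors become explicit. The indexing below is an assumption that follows the convention of `linear_forward` (`Z = W^T A_prev + b`), so that \(W_{1,ij}\) connects input \(i\) to hidden neuron \(j\) and \(W_{2,j}\) connects hidden neuron \(j\) to the output:

\[
A_{1,j} = \sum_i W_{1,ij}\, x_i + b_{1,j},
\qquad
\hat{y} = \sum_j W_{2,j}\, A_{1,j} + b_2
\]

\[
\frac{\partial L}{\partial \hat{y}} = -2 (y - \hat{y}),
\qquad
\frac{\partial \hat{y}}{\partial A_{1,j}} = W_{2,j},
\qquad
\frac{\partial A_{1,j}}{\partial W_{1,ij}} = x_i
\]

\[
\frac{\partial L}{\partial W_{1,ij}} = -2 (y - \hat{y})\, W_{2,j}\, x_i,
\qquad
\frac{\partial L}{\partial b_{1,j}} = -2 (y - \hat{y})\, W_{2,j}
\]

A gradient-descent step subtracts the learning rate times these derivatives, which is equivalent to adding `lr * 2 * (y - y_hat) * ...` as the code does.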

---

### Update `W1` (Hidden Neuron 1)

```python
parameters['W1'][0][0] += lr * error_grad * parameters['W2'][0][0] * X[0][0]
parameters['W1'][0][1] += lr * error_grad * parameters['W2'][0][0] * X[1][0]
```

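Explanation:

* Error flows back through `W2`
* Then through the input `X`

Because `linear_forward` computes `W.T @ A_prev`, entry `W1[i][j]` connects input `i` to hidden neuron `j`, so the chain rule gives a step proportional to `W2[j][0] * X[i][0]`. The diagonal entries `W1[0][0]` and `W1[1][1]` follow this pattern; the off-diagonal updates (`W1[0][1]` here, `W1[1][0]` below) pair `W2` and `X` the other way round. The `W2` values used are also the ones already updated in Step 1, not the values from the forward pass.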

---

### Update `b1` (Hidden Neuron 1)

```python
parameters['b1'][0][0] += lr * error_grad * parameters['W2'][0][0]
```

---

### Update `W1` (Hidden Neuron 2)

```python
parameters['W1'][1][0] += lr * error_grad * parameters['W2'][1][0] * X[0][0]
parameters['W1'][1][1] += lr * error_grad * parameters['W2'][1][0] * X[1][0]
```

---

### Update `b1` (Hidden Neuron 2)

```python
parameters['b1'][1][0] += lr * error_grad * parameters['W2'][1][0]
```

---

## Key Assumptions

* Single training example
* Linear layers only
* Manual gradient computation
* Hardcoded for `2 → 2 → 1`

---

## One-line Summary

This function manually applies **backpropagation using the chain rule** and updates all weights and biases of a **2–2–1 neural network** to reduce prediction error.
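
The element-by-element updates can also be written with matrix products. The sketch below is a hypothetical alternative, not the notebook's function: `update_parameters_vectorized` is an invented name, it recomputes the forward pass internally, and it makes two assumptions that differ from the cell above. It uses the `W2` values from before the update when propagating the error to the hidden layer, and it pairs `W1[i][j]` with `X[i] * W2[j]`.

```python
import numpy as np

def update_parameters_vectorized(parameters, y, X, lr=0.001):
    """Hypothetical vectorized SGD step for the 2-2-1 linear network."""
    W1, b1 = parameters['W1'], parameters['b1']
    W2, b2 = parameters['W2'], parameters['b2']

    # Forward pass, same convention as linear_forward: Z = W.T @ A_prev + b
    A1 = W1.T @ X + b1                 # (2, 1)
    y_hat = W2.T @ A1 + b2             # (1, 1)

    delta = 2 * (y - y_hat)            # negative of dL/dy_hat, shape (1, 1)
    back = W2 @ delta                  # error sent to the hidden layer, (2, 1), pre-update W2

    # Gradient-descent steps: add lr times the negative gradient
    parameters['W2'] = W2 + lr * (A1 @ delta.T)   # (2, 1)
    parameters['b2'] = b2 + lr * delta            # (1, 1)
    parameters['W1'] = W1 + lr * (X @ back.T)     # (2, 2); entry [i][j] uses X[i] * W2[j]
    parameters['b1'] = b1 + lr * back             # (2, 1)
    return parameters

parameters = {
    'W1': np.full((2, 2), 0.1), 'b1': np.zeros((2, 1)),
    'W2': np.full((2, 1), 0.1), 'b2': np.zeros((1, 1)),
}
X = np.array([[8.0], [8.0]])   # first row of df: cgpa=8, profile_score=8
y = 4.0                        # lpa for that row

parameters = update_parameters_vectorized(parameters, y, X)
print(parameters['W2'].ravel())
print(parameters['W1'])
```

On this first example `W2` and `b2` come out the same as in the notebook (`0.111776` and `0.00736`), while `W1` becomes `0.105888` rather than `0.10658137`, because the notebook's version multiplies by the already-updated `W2`.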



In [17]:
update_parameters(parameters,y,y_hat,A1,X)

{'W1': array([[0.10658137, 0.10658137],
        [0.10658137, 0.10658137]]),
 'b1': array([[0.00082267],
        [0.00082267]]),
 'W2': array([[0.111776],
        [0.111776]]),
 'b2': array([[0.00736]])}

In [19]:
X = df[['cgpa', 'profile_score']].values[0].reshape(2,1) # Shape(no of features, no. of training example)
y = df[['lpa']].values[0][0]

# Parameter initialization
parameters = initialize_parameters([2,2,1])

y_hat,A1 = L_layer_forward(X,parameters)
y_hat = y_hat[0][0]

update_parameters(parameters,y,y_hat,A1,X)

parameters

{'W1': array([[0.10658137, 0.10658137],
        [0.10658137, 0.10658137]]),
 'b1': array([[0.00082267],
        [0.00082267]]),
 'W2': array([[0.111776],
        [0.111776]]),
 'b2': array([[0.00736]])}

In [20]:
X = df[['cgpa', 'profile_score']].values[1].reshape(2,1) # Shape(no of features, no. of training example)
y = df[['lpa']].values[1][0]

y_hat,A1 = L_layer_forward(X,parameters)
y_hat = y_hat[0][0]

update_parameters(parameters,y,y_hat,A1,X)

parameters

{'W1': array([[0.11481311, 0.11716504],
        [0.11481311, 0.11716504]]),
 'b1': array([[0.00199863],
        [0.00199863]]),
 'W2': array([[0.12751067],
        [0.12751067]]),
 'b2': array([[0.01658246]])}

In [21]:
X = df[['cgpa', 'profile_score']].values[2].reshape(2,1) # Shape(no of features, no. of training example)
y = df[['lpa']].values[2][0]

y_hat,A1 = L_layer_forward(X,parameters)
y_hat = y_hat[0][0]

update_parameters(parameters,y,y_hat,A1,X)

parameters

{'W1': array([[0.12458335, 0.13344878],
        [0.12461077, 0.13349447]]),
 'b1': array([[0.00362701],
        [0.00363158]]),
 'W2': array([[0.1477752 ],
        [0.14818986]]),
 'b2': array([[0.02760173]])}

In [22]:
X = df[['cgpa', 'profile_score']].values[3].reshape(2,1) # Shape(no of features, no. of training example)
y = df[['lpa']].values[3][0]

y_hat,A1 = L_layer_forward(X,parameters)
y_hat = y_hat[0][0]

update_parameters(parameters,y,y_hat,A1,X)

parameters

{'W1': array([[0.13562189, 0.15994127],
        [0.13579618, 0.16033944]]),
 'b1': array([[0.00583472],
        [0.00586866]]),
 'W2': array([[0.17460429],
        [0.1769274 ]]),
 'b2': array([[0.04024579]])}

In [23]:
# epochs implementation

parameters = initialize_parameters([2,2,1])
epochs = 5

for i in range(epochs):

  Loss = []

  for j in range(df.shape[0]):

    X = df[['cgpa', 'profile_score']].values[j].reshape(2,1) # Shape(no of features, no. of training example)
    y = df[['lpa']].values[j][0]

    # Forward pass on example j, then one parameter update


    y_hat,A1 = L_layer_forward(X,parameters)
    y_hat = y_hat[0][0]

    update_parameters(parameters,y,y_hat,A1,X)

    Loss.append((y-y_hat)**2)

  print('Epoch - ',i+1,'Loss - ',np.array(Loss).mean())

parameters

Epoch -  1 Loss -  26.28249792398698
Epoch -  2 Loss -  19.438253848220803
Epoch -  3 Loss -  10.139874435827522
Epoch -  4 Loss -  3.385561305106485
Epoch -  5 Loss -  1.3198454128484565


{'W1': array([[0.273603  , 0.3993222 ],
        [0.28787155, 0.42586102]]),
 'b1': array([[0.02885522],
        [0.03133223]]),
 'W2': array([[0.42574893],
        [0.50219328]]),
 'b2': array([[0.11841278]])}