> <div>
> <h1>Deep Learning Backpropagation Algorithm for Classification with Sigmoid Activation</h1>

> <h2>1. Forward Pass</h2>
> <ul>
    <li><strong>Input Layer:</strong> Pass input data through the network.</li>
    <li><strong>Hidden Layers:</strong> For each neuron, compute the weighted sum of inputs and apply the sigmoid activation function.
        <ul>
            <li><strong>Sigmoid Function:</strong> <code>&sigma;(z) = 1 / (1 + e<sup>-z</sup>)</code></li>
        </ul>
    </li>
    <li><strong>Output Layer:</strong> Compute the final output using the same process.</li>
</ul>

> <h2>2. Calculate Loss</h2>
> <ul>
    <li>Compare the network’s output to the true label using a loss function.
        <ul>
            <li><strong>Cross-Entropy Loss:</strong>
                <p>
                    <code>L = - &sum; (y log(&#x3B7;) + (1 - y) log(1 - &#x3B7;))</code>
                </p>
                where <code>&#x3B7;</code> is the predicted output and <code>y</code> is the true label.
            </li>
        </ul>
    </li>
</ul>

> <h2>3. Backward Pass (Backpropagation)</h2>
> <ul>
    <li><strong>Output Layer:</strong>
        <ul>
            <li>Compute the gradient of the loss with respect to the output (error term).</li>
            <li>For sigmoid activation, adjust the error term using the derivative of the sigmoid function.
                <ul>
                    <li><strong>Derivative of Sigmoid:</strong> <code>&sigma;'<sub>sigmoid</sub>(z) = &sigma;(z) &times; (1 - &sigma;(z))</code></li>
                </ul>
            </li>
        </ul>
    </li>
    <li><strong>Hidden Layers:</strong>
        <ul>
            <li>Propagate the error backward through the network.</li>
            <li>Compute gradients of the loss with respect to each weight using the error term and the derivative of the sigmoid function.</li>
            <li>Update weights using the computed gradients.</li>
        </ul>
    </li>
</ul>

> <h2>4. Update Weights</h2>
> <ul>
    <li>Adjust the weights of the neurons using the gradients calculated during backpropagation.
        <ul>
            <li><strong>Gradient Descent Update Rule:</strong>
                <p>
                    <code>w = w - &eta; &times; (&#x2202;L / &#x2202;w)</code>
                </p>
                where <code>&eta;</code> is the learning rate.
            </li>
        </ul>
    </li>
</ul>

> <h2>5. Repeat</h2>
> <ul>
    <li>Iterate through the forward pass, loss calculation, backpropagation, and weight update steps for multiple epochs until the network performs satisfactorily.</li>
</ul>
</div>

In [1]:
import numpy as np
import pandas as pd

In [2]:
# DataFrame
df = pd.DataFrame([[8,8,1],[7,9,1],[6,10,0],[5,5,0]], columns=['cgpa', 'profile_score', 'placed'])
df.head()

Unnamed: 0,cgpa,profile_score,placed
0,8,8,1
1,7,9,1
2,6,10,0
3,5,5,0


> <div class = "markdown-google-sans"><h1>Function 1 - Parameter Initialization</h1>

In [3]:
def initialize_parameters(ann_dim):
    """
    Initialize parameters for the neural network.

    Parameters:
    ann_dim (list of int): Dimensions of the layers in the neural network, where each element represents the number of neurons in that layer.

    Returns:
    parameters (dict): A dictionary containing weights ('w') and biases ('b') for each layer.
    """
    parameters = {}
    # Iterate through each layer except the input layer
    for i in range(1, len(ann_dim)):
        # Initialize weights for layer i
        # Weights are initialized with small values, here 0.1
        parameters['w' + str(i)] = np.ones((ann_dim[i-1], ann_dim[i])) * 0.1
        # Initialize biases for layer i
        # Biases are initialized to zero
        parameters['b' + str(i)] = np.zeros((1, ann_dim[i]))

    return parameters


In [4]:
parameters = initialize_parameters([2,2,1])
parameters

{'w1': array([[0.1, 0.1],
        [0.1, 0.1]]),
 'b1': array([[0., 0.]]),
 'w2': array([[0.1],
        [0.1]]),
 'b2': array([[0.]])}

> <div class = "markdown-google-sans"><h1>Function 2 - Forward Propagation</h1>

In [5]:
def forward_propagation(X, parameters):
    """
    Perform forward propagation through the neural network.

    Parameters:
    X (numpy array): Input data of shape (m, n_x), where m is the number of examples and n_x is the number of features.
    parameters (dict): Dictionary containing weights ('w') and biases ('b') for each layer.

    Returns:
    A (numpy array): Output of the neural network after forward propagation.
    A_prev (numpy array): Output of the previous layer (before the last layer) to be used in backpropagation.
    """
    # Set initial activation to the input data
    A = X
    # Calculate the number of layers in the network
    n = len(parameters) // 2

    # Iterate through each layer of the network
    for i in range(1, n + 1):
        # Save the activation from the previous layer
        A_prev = A
        # Compute the linear part of the activation for layer i
        z = np.dot(A_prev, parameters['w'+str(i)]) + parameters['b'+str(i)]
        # Apply sigmoid activation function
        y_hat = 1 / (1 + np.exp(-z))
        # Set the current layer's activation to be used in the next iteration
        A = y_hat

    # Return the final output and the activation from the previous layer
    return A, A_prev

In [7]:
A, A_prev = forward_propagation(df.iloc[:1, :-1], parameters)
print(A)
print(A_prev)

[[0.54150519]]
[[0.83201839 0.83201839]]


> <div class = "markdown-google-sans"><h1>Function 3 - Updating Weights & Biases</h1>

In [57]:
def update_parameters(parameters,y,y_hat,A1,X):
    # Layer 2 - Parameters
    parameters['w2'][0][0] += (0.01 * (y - y_hat)*A1[0][0])
    parameters['w2'][1][0] += (0.01 * (y - y_hat)*A1[0][1])
    parameters['b2'][0][0] += (0.01 * (y - y_hat))

    # Layer 1 - Parameters
    parameters['w1'][0][0] += (0.01 * (y - y_hat)*parameters['w2'][0][0]*A1[0][0]*(1-A1[0][0])*X[0][0])
    parameters['w1'][0][1] += (0.01 * (y - y_hat)*parameters['w2'][0][0]*A1[0][0]*(1-A1[0][0])*X[0][1])
    parameters['b1'][0][0] += (0.01 * (y - y_hat)*parameters['w2'][0][0]*A1[0][0]*(1-A1[0][0]))

    parameters['w1'][1][0] += (0.01 * (y - y_hat)*parameters['w2'][1][0]*A1[0][1]*(1-A1[0][1])*X[0][0])
    parameters['w1'][1][1] += (0.01 * (y - y_hat)*parameters['w2'][1][0]*A1[0][1]*(1-A1[0][1])*X[0][1])
    parameters['b1'][0][1] += (0.01 * (y - y_hat)*parameters['w2'][1][0]*A1[0][1]*(1-A1[0][1]))

In [9]:
update_parameters(parameters, df['placed'][0], A[0][0], A_prev, np.array(df.iloc[:1, :-1]))

In [10]:
parameters

{'w1': array([[0.1005322, 0.1005322],
        [0.1005322, 0.1005322]]),
 'b1': array([[6.65255093e-05, 6.65255093e-05]]),
 'w2': array([[0.10381476],
        [0.10381476]]),
 'b2': array([[0.00458495]])}

In [58]:
# Initialize parameters with the specified dimensions [2,2,1]
parameters = initialize_parameters([2,2,1])

# Iterate for a specified number of epochs
for i in range(10):
    # Initialize a list to keep track of the loss for each iteration
    loss = []

    # Iterate through each sample in the dataset
    for j in range(df.shape[0]):
        # Forward Propagation: Compute the output 'A' and the previous layer's output 'A_prev'
        A, A_prev = forward_propagation(df.iloc[:j+1, :-1], parameters)

        # Compute the loss for the current sample
        # Loss is the squared difference between the actual value and the predicted value
        l = - (df['placed'][j]*np.log(A[0][0])) - ((1-df['placed'][j])*np.log(1-A[0][0]))
        # Append the loss value to the loss list
        loss.append(l)

        # Update the parameters based on the current sample
        # Uses the actual value, the predicted value, the previous layer's output, and the input features
        update_parameters(parameters, df['placed'][j], A[0][0], A_prev, np.array(df.iloc[:j+1, :-1]))

    # Print the mean loss for the current epoch
    print(np.array(loss).mean())

0.6995793814769459
0.699403737188172
0.6992368341080972
0.6990782345893999
0.6989275235481633
0.6987843072461035
0.6986482121420933
0.6985188838088761
0.6983959859111069
0.6982791992410817


<hr>

> <div class = "markdown-google-sans"><h1>Backpropogation using Tensorflow - Keras</h1>

In [13]:
import tensorflow as tf
from tensorflow import keras
from keras import Sequential
from keras.layers import InputLayer, Dense

In [47]:
model = Sequential()

# model.add(InputLayer(shape = (1, 2)))
model.add(Dense(2, activation = 'sigmoid', input_dim = 2))
model.add(Dense(1, activation = 'sigmoid'))

model.summary()

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


In [48]:
model.get_weights()

[array([[ 0.12149262,  1.0377032 ],
        [-0.3761341 ,  0.33478928]], dtype=float32),
 array([0., 0.], dtype=float32),
 array([[0.70839894],
        [0.6414827 ]], dtype=float32),
 array([0.], dtype=float32)]

In [49]:
# Define weights and biases manually
weights_for_layer1 = np.array([[0.1, 0.1], [0.1, 0.1]])
biases_for_layer1 = np.array([0.1, 0.1])
weights_for_layer2 = np.array([[0.1], [0.1]])
biases_for_layer2 = np.array([0.1])

# Combine them into the correct format
parameters_format = [weights_for_layer1, biases_for_layer1, weights_for_layer2, biases_for_layer2]

# Set the weights
model.set_weights(parameters_format)

In [50]:
model.get_weights()

[array([[0.1, 0.1],
        [0.1, 0.1]], dtype=float32),
 array([0.1, 0.1], dtype=float32),
 array([[0.1],
        [0.1]], dtype=float32),
 array([0.1], dtype=float32)]

In [51]:
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01), loss='binary_crossentropy', metrics=['accuracy'])

In [52]:
model.fit(df.iloc[:, :-1].values, df['placed'].values, epochs = 10)

Epoch 1/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 1s/step - accuracy: 0.5000 - loss: 0.6995
Epoch 2/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 39ms/step - accuracy: 0.5000 - loss: 0.6979
Epoch 3/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 56ms/step - accuracy: 0.5000 - loss: 0.6966
Epoch 4/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 56ms/step - accuracy: 0.5000 - loss: 0.6955
Epoch 5/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step - accuracy: 0.5000 - loss: 0.6947
Epoch 6/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step - accuracy: 0.5000 - loss: 0.6941
Epoch 7/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 57ms/step - accuracy: 0.5000 - loss: 0.6936
Epoch 8/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 43ms/step - accuracy: 0.5000 - loss: 0.6933
Epoch 9/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m

<keras.src.callbacks.history.History at 0x7caad97d1030>

> <div class = "markdown-google-sans">Since the loss values of our custom backpropagation model and the Keras backpropagation model are identical, this verifies that our model implementation is correct."</div>