The dataset poses a binary classification problem: predict whether a student will be placed, based on CGPA and ResumeScore. Implement the backpropagation algorithm for this task using only Python constructs, NumPy, and pandas.

Make the following assumptions:

*   There is only one hidden layer with two nodes
*   The activation function used in the hidden layer is the logistic function
*   Initialize the parameters using small random values in the range 0 to 1
*   The loss function is **binary cross entropy**
*   Assume the number of epochs to be 50
*   Train on the entire dataset
*   Implement the same MLP in Keras, using the same initial parameter values and the configuration shown above, and compare the results
*   Apply your BP algorithm to the students_placement.csv dataset to predict the placement of a student. This requires splitting the dataset into train and test sets

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.DataFrame([[8,8,1],[7,9,1],[6,10,0],[6,5,0]], columns=['cgpa', 'resumescore', 'placed'])#dataframe

In [3]:
df

|   | cgpa | resumescore | placed |
|---|------|-------------|--------|
| 0 | 8    | 8           | 1      |
| 1 | 7    | 9           | 1      |
| 2 | 6    | 10          | 0      |
| 3 | 6    | 5           | 0      |


Defining the sigmoid function

In [37]:
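def sigmoid(x):
    # Logistic (sigmoid) activation: maps any real input into the interval (0, 1)
    return 1 / (1 + np.exp(-x))

def sig_der(a):
    # Derivative of the sigmoid, written in terms of its output a = sigmoid(z)
    return a * (1 - a)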
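
The derivative helper is written in terms of the sigmoid's output, not its input, which is easy to get wrong when calling it. The following is a minimal sketch of a numerical check; the names `z`, `eps`, `numeric`, and `analytic` are illustrative and do not appear in the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sig_der(a):
    # a is assumed to already be a sigmoid output, i.e. a = sigmoid(z)
    return a * (1 - a)

z = np.linspace(-3, 3, 7)   # a few pre-activation values
eps = 1e-6

# Central finite-difference estimate of d(sigmoid)/dz
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

# Analytic derivative: pass the activation, not z itself
analytic = sig_der(sigmoid(z))

print(np.allclose(numeric, analytic, atol=1e-8))
```

Passing `z` directly, as in `sig_der(z)`, would compute `z * (1 - z)`, which is not the derivative of the sigmoid.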

Initializing the parameters using small random values in the range 0 to 1.

In [38]:
def init_param(input_size, hidden_size, output_size):
    np.random.seed(0)
    #weight of hidden layer
    weight1 = np.random.rand(hidden_size, input_size)

    #bias of hidden layer
    bias1 = np.random.rand(hidden_size, 1)

    #weight of output layer
    weight2 = np.random.rand(output_size, hidden_size)

    #bias of output layer
    bias2 = np.random.rand(output_size, 1)

    return weight1, bias1, weight2, bias2

**Forward propagation :** Forward propagation in an MLP is the process of transmitting input data through the neural network. The input values are multiplied by corresponding weights, and the resulting sums, along with biases, are passed through an activation function in each neuron. This sequential flow continues through hidden layers until the output layer is reached, providing the final prediction.

Activation functions introduce non-linearity, enabling the network to learn complex patterns. Here the activation function is the **logistic (sigmoid) function**, which maps each weighted sum to a value in the open interval (0, 1) rather than to a hard 0 or 1. Together, the weighted connections and the activations transform the input features into the network's prediction.

In [39]:
def fwd(X, weight1, bias1, weight2, bias2):
    #1st dot product
    dot1 = np.dot(weight1, X) + bias1

    #output of 1st hidden layer
    out1 = sigmoid(dot1)

    #2nd dot product
    dot2 = np.dot(weight2, out1) + bias2

    #output of output layer
    out2 = sigmoid(dot2)

    return dot1, out1, dot2, out2
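
To make the array shapes in the forward pass concrete, here is a small sketch using the four training examples from the notebook, stored with one column per student. The random generator and its seed are assumptions for illustration, so the parameter values differ from those produced by `init_param`.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# The four training examples, laid out as features x samples
X = np.array([[8, 7, 6, 6],     # cgpa
              [8, 9, 10, 5]],   # resumescore
             dtype=float)       # shape (2, 4)

rng = np.random.default_rng(0)  # illustrative seed, not the notebook's np.random.seed(0)
W1 = rng.random((2, 2))         # hidden weights: (hidden_size, input_size)
b1 = rng.random((2, 1))         # hidden biases:  (hidden_size, 1)
W2 = rng.random((1, 2))         # output weights: (output_size, hidden_size)
b2 = rng.random((1, 1))         # output bias:    (output_size, 1)

Z1 = W1 @ X + b1    # (2, 4); b1 is broadcast across the 4 columns
A1 = sigmoid(Z1)    # hidden activations, one column per student
Z2 = W2 @ A1 + b2   # (1, 4)
A2 = sigmoid(Z2)    # one predicted value in (0, 1) per student

print(Z1.shape, A1.shape, Z2.shape, A2.shape)
```

Keeping samples in columns is why the notebook later transposes the DataFrame values before training.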

**Backward propagation:** Backpropagation is how an MLP learns from its errors during training. The predicted output from forward propagation is compared with the target values to compute the loss. The gradient of the loss is then propagated backward through the network, layer by layer, using the chain rule, which gives the contribution of each weight and bias to the error. Gradient descent then moves each parameter in the direction opposite to its gradient, so that repeated iterations reduce the loss.

In [40]:
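def backwd(X, Y, Z1, A1, Z2, A2, W2):
    m = X.shape[1]  # number of training examples (one per column)
    # Gradient of binary cross entropy with respect to the output activation A2
    dA2 = - (Y / A2 - (1 - Y) / (1 - A2))
    # Chain rule through the output sigmoid; algebraically this equals A2 - Y
    dZ2 = dA2 * sig_der(A2)
    # Output-layer gradients, averaged over the m examples
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    # Propagate the error back through W2 and the hidden sigmoid
    dZ1 = np.dot(W2.T, dZ2) * sig_der(A1)
    # Hidden-layer gradients, averaged over the m examples
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    return dW1, db1, dW2, db2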
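
A common way to gain confidence in hand-written gradients is to compare them with finite-difference estimates. The sketch below is self-contained and uses assumed names (`forward`, `bce`, `backward`, `numeric_grad`) and random data rather than the notebook's variables. It uses the simplified output error `A2 - Y`, which is what the product of the cross-entropy gradient and the sigmoid derivative reduces to.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward(X, W1, b1, W2, b2):
    A1 = sigmoid(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    return A1, A2

def bce(Y, A2):
    # Binary cross entropy averaged over the examples
    return -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

def backward(X, Y, A1, A2, W2):
    m = X.shape[1]
    dZ2 = A2 - Y                               # simplified output-layer error
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)         # chain rule through hidden sigmoid
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Random data and parameters, for illustration only
rng = np.random.default_rng(1)
X = rng.normal(size=(2, 5))
Y = rng.integers(0, 2, size=(1, 5)).astype(float)
W1, b1 = rng.normal(size=(2, 2)), rng.normal(size=(2, 1))
W2, b2 = rng.normal(size=(1, 2)), rng.normal(size=(1, 1))

A1, A2 = forward(X, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(X, Y, A1, A2, W2)

# The two-step form used in the notebook gives the same output error
dA2 = -(Y / A2 - (1 - Y) / (1 - A2))
same_error = np.allclose(dA2 * A2 * (1 - A2), A2 - Y)

def numeric_grad(param, eps=1e-6):
    # Perturb each entry of param in place and measure the change in loss
    grad = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        old = param[idx]
        param[idx] = old + eps
        loss_plus = bce(Y, forward(X, W1, b1, W2, b2)[1])
        param[idx] = old - eps
        loss_minus = bce(Y, forward(X, W1, b1, W2, b2)[1])
        param[idx] = old
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

pairs = [(W1, dW1), (b1, db1), (W2, dW2), (b2, db2)]
max_err = max(np.abs(numeric_grad(p) - g).max() for p, g in pairs)
print(same_error, max_err)
```

A very small `max_err` suggests the analytic gradients are consistent with the loss. The simplified form also avoids dividing by `A2` or `1 - A2`, which can be problematic if the output saturates.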

**Updating parameters:** During training, the weights and biases are adjusted after every forward and backward pass. Backpropagation computes the gradient of the loss with respect to each parameter, which measures how much that parameter contributed to the error. Gradient descent then moves each parameter a small step, scaled by the learning rate, in the direction opposite to its gradient.

In [41]:
def update_param(weight1, bias1, weight2, bias2, dW1, db1, dW2, db2, learning_rate):
    # Gradient descent step for the hidden-layer weights
    weight1 = weight1 - learning_rate * dW1

    # Gradient descent step for the hidden-layer biases
    bias1 = bias1 - learning_rate * db1

    # Gradient descent step for the output-layer weights
    weight2 = weight2 - learning_rate * dW2

    # Gradient descent step for the output-layer bias
    bias2 = bias2 - learning_rate * db2

    return weight1, bias1, weight2, bias2

In [42]:
x=df[["cgpa","resumescore"]].values.T
y=df[["placed"]].values.T

Training the model using the above defined functions

In [43]:
def train(X, Y, hidden_size, num_iterations, learning_rate):
    input_size = X.shape[0]
    output_size = Y.shape[0]
    w1_init, b1_init, w2_init, b2_init = init_param(input_size, hidden_size, output_size)
    W1, b1, W2, b2 = w1_init, b1_init, w2_init, b2_init
    for i in range(num_iterations):
        Z1, A1, Z2, A2 = fwd(X, W1, b1, W2, b2)
        dW1, db1, dW2, db2 = backwd(X, Y, Z1, A1, Z2, A2, W2)
        W1, b1, W2, b2 = update_param(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate)
        if i % 2 == 0:
            loss = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
            print(f"Cost after iteration {i}: {loss}")
    return W1, b1, W2, b2, w1_init, w2_init, b1_init, b2_init


In [44]:
hidden_size = 2 #no of nodes in the hidden layer
num_iterations = 50
learning_rate = 0.01
W1, b1, W2, b2, w1_init, w2_init, b1_init, b2_init = train(x, y, hidden_size, num_iterations, learning_rate)

Cost after iteration 0: 1.2424288489791138
Cost after iteration 2: 1.2324715173009992
Cost after iteration 4: 1.2226150740563688
Cost after iteration 6: 1.2128604801820648
Cost after iteration 8: 1.2032086638840598
Cost after iteration 10: 1.1936605191912535
Cost after iteration 12: 1.1842169045426476
Cost after iteration 14: 1.174878641412396
Cost after iteration 16: 1.1656465129771567
Cost after iteration 18: 1.1565212628301467
Cost after iteration 20: 1.1475035937462075
Cost after iteration 22: 1.1385941665020827
Cost after iteration 24: 1.129793598755998
Cost after iteration 26: 1.121102463990462
Cost after iteration 28: 1.1125212905220576
Cost after iteration 30: 1.1040505605817774
Cost after iteration 32: 1.0956907094692498
Cost after iteration 34: 1.0874421247839656
Cost after iteration 36: 1.0793051457363545
Cost after iteration 38: 1.0712800625412875
Cost after iteration 40: 1.0633671158962914
Cost after iteration 42: 1.055566496546455
Cost after iteration 44: 1.04787834493768
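
The loss alone does not show which students the network would classify as placed. One possible way to turn the trained parameters into class labels is sketched below; `predict`, `accuracy`, and the 0.5 threshold are assumptions for illustration and are not part of the notebook. The demo parameters are hand-picked so the result is easy to follow.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def predict(X, W1, b1, W2, b2, threshold=0.5):
    """Return 0/1 labels of shape (1, m) for inputs X of shape (n_features, m)."""
    A1 = sigmoid(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    return (A2 >= threshold).astype(int)

def accuracy(Y_true, Y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return float(np.mean(Y_true == Y_pred))

# Hand-picked demo parameters (not trained values)
demo_W1 = np.eye(2)
demo_b1 = np.zeros((2, 1))
demo_W2 = np.array([[4.0, 0.0]])
demo_b2 = np.array([[-2.0]])

demo_X = np.array([[5.0, -5.0],
                   [0.0,  0.0]])   # two examples, one per column
demo_Y = np.array([[1, 0]])

demo_pred = predict(demo_X, demo_W1, demo_b1, demo_W2, demo_b2)
print(demo_pred, accuracy(demo_Y, demo_pred))
```

With the notebook's variables, the corresponding call would be `predict(x, W1, b1, W2, b2)`, compared against `y`.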

In [26]:
W1

array([[0.54865698, 0.71505586],
       [0.60225455, 0.54443548]])

In [27]:
W2

array([[0.24750123, 0.70171762]])

Implementing the model using Keras

In [28]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

x = df[["cgpa", "resumescore"]]
y = df[["placed"]]

# Copies of the initial NumPy parameters, so both models start from the same values
init_W1 = np.array(w1_init)
init_b1 = np.array(b1_init)
init_W2 = np.array(w2_init)
init_b2 = np.array(b2_init)

# Defining the artificial neural network
model = Sequential()
model.add(Dense(2, activation='sigmoid', input_shape=(2,)))
model.add(Dense(1, activation='sigmoid'))

# Keras stores a Dense kernel as (inputs, units) and the bias as a 1-D vector,
# so transpose the weights and flatten the biases before assigning them
init_W1 = init_W1.T
init_b1 = init_b1.reshape((2,))
init_W2 = init_W2.T
init_b2 = init_b2.reshape((1,))
model.layers[0].set_weights([init_W1, init_b1])
model.layers[1].set_weights([init_W2, init_b2])
model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
# Note: validation_split=0.2 holds out part of the four rows for validation,
# so this fit does not train on the entire dataset as the NumPy version does
model.fit(x, y, validation_split=0.2, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.src.callbacks.History at 0x7fbde8d75e70>

In [None]:
model.layers[0].get_weights()

[array([[0.49865586, 0.5489598 ],
        [0.6632965 , 0.49076805]], dtype=float32),
 array([0.38360593, 0.59412664], dtype=float32)]

In [None]:
model.layers[1].get_weights()

[array([[0.38792065],
        [0.84210676]], dtype=float32),
 array([0.91399604], dtype=float32)]
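
When comparing these values with the NumPy results, the layouts differ: the NumPy code stores weights as (units, inputs), while a Keras `Dense` kernel is stored as (inputs, units). A small sketch of a layout-aware comparison follows, using the hidden-layer values printed in this notebook; the helper name `max_abs_diff` is an assumption for illustration.

```python
import numpy as np

# Hidden-layer weights copied from the notebook outputs
numpy_W1 = np.array([[0.54865698, 0.71505586],
                     [0.60225455, 0.54443548]])      # shape (hidden, input)
keras_kernel1 = np.array([[0.49865586, 0.5489598],
                          [0.6632965, 0.49076805]])  # shape (input, hidden)

def max_abs_diff(numpy_W, keras_kernel):
    # Transpose the Keras kernel so both arrays use the (units, inputs) layout
    return float(np.abs(numpy_W - keras_kernel.T).max())

diff_W1 = max_abs_diff(numpy_W1, keras_kernel1)
print(diff_W1)
```

The same helper applies to the output layer, and biases can be compared after flattening the NumPy column vectors with `ravel()`.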

In [30]:
W1

array([[0.54865698, 0.71505586],
       [0.60225455, 0.54443548]])

In [31]:
w1_init

array([[0.5488135 , 0.71518937],
       [0.60276338, 0.54488318]])

In [48]:
W2

array([[0.24750123, 0.70171762]])

In [33]:
w2_init

array([[0.43758721, 0.891773  ]])

In [17]:
b1

array([[0.42362865],
       [0.64580898]])

In [45]:
b1_init

array([[0.4236548 ],
       [0.64589411]])

In [18]:
b2

array([[0.77350051]])

In [46]:
b2_init

array([[0.96366276]])
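
For the students_placement.csv part of the task, the data has to be split into train and test sets before training. The following is a minimal sketch using only NumPy and pandas. The helper name `split_features_labels`, the 80/20 split, the seed, and the column names are assumptions, since the columns of that file are not shown here; the demo frame is synthetic stand-in data.

```python
import numpy as np
import pandas as pd

def split_features_labels(df, feature_cols, label_col, test_fraction=0.2, seed=0):
    """Shuffle rows, split into train/test, and return arrays with one column per sample."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(df))
    n_test = int(round(test_fraction * len(df)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    def to_arrays(idx):
        part = df.iloc[idx]
        X = part[feature_cols].to_numpy(dtype=float).T    # (n_features, n_samples)
        Y = part[[label_col]].to_numpy(dtype=float).T     # (1, n_samples)
        return X, Y

    return to_arrays(train_idx), to_arrays(test_idx)

# Synthetic stand-in rows; in practice this would come from
# pd.read_csv("students_placement.csv") with the file's actual column names
demo = pd.DataFrame({
    "cgpa":        [8, 7, 6, 6, 9, 5, 7, 8, 6, 9],
    "resumescore": [8, 9, 10, 5, 7, 4, 6, 9, 3, 8],
    "placed":      [1, 1, 0, 0, 1, 0, 1, 1, 0, 1],
})

(X_train, Y_train), (X_test, Y_test) = split_features_labels(
    demo, ["cgpa", "resumescore"], "placed")
print(X_train.shape, Y_train.shape, X_test.shape, Y_test.shape)
```

The returned arrays use the same features-by-samples layout as `x` and `y` above, so they could be passed to `train` directly. Scaling the features before training may also help, since large inputs push the sigmoid toward saturation.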