<h2 style="color:green" align="center">Implementing a Perceptron with Gradient Descent Using Python</h2>

The network has two input neurons (one per feature, `age` and `affordibility`) feeding a single output neuron with a sigmoid activation. There is no hidden layer. This is equivalent to logistic regression with multiple predictors, and because the output has only two possible outcomes, it is binary logistic regression.

This setup is often loosely called a perceptron, although the classic perceptron uses a step activation rather than a sigmoid.

In [36]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline

In [37]:
df = pd.read_csv("D:\\9.BOOKS MATERIALS\\4.PROGRAMS\\Python_Programs\\Datasets\\insurance_data1.csv")
df.head()

,age,affordibility,bought_insurance
0,22,1,0
1,25,0,0
2,47,1,1
3,52,0,0
4,46,1,1


**Split into training and test sets**

In [38]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df[['age','affordibility']],df.bought_insurance,test_size=0.2, random_state=25)

**Preprocessing: scale the data so that both age and affordibility fall in the same range**

In [39]:
X_train_scaled = X_train.copy()
X_train_scaled['age'] = X_train_scaled['age'] / 100

X_test_scaled = X_test.copy()
X_test_scaled['age'] = X_test_scaled['age'] / 100
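
Dividing by 100 works here because ages fall roughly between 0 and 100. As an alternative that this notebook does not use, min-max scaling fitted on the training ages only could be sketched as below. `min_max_scale` is a hypothetical helper and the ages are illustrative values, not the actual split.

```python
import numpy as np

def min_max_scale(train_values, test_values):
    """Scale both arrays using the min and max of the training values only."""
    train_values = np.asarray(train_values, dtype=float)
    test_values = np.asarray(test_values, dtype=float)
    lo, hi = train_values.min(), train_values.max()
    return (train_values - lo) / (hi - lo), (test_values - lo) / (hi - lo)

train_age = [22, 25, 52, 46]  # illustrative ages (assumption)
test_age = [47, 18]           # illustrative ages (assumption)

train_scaled, test_scaled = min_max_scale(train_age, test_age)
print(train_scaled)
print(test_scaled)  # test values may fall outside [0, 1]
```

Fitting the range on the training data alone keeps information from the test set out of the preprocessing step.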

**Parameter initialization**

In [40]:
# np.random.seed(13)
# number of iterations (epochs)
epoch = 1000
# learning rate
learn_rate = 0.5

**Objective, Derivative, and Cost function**

In [41]:
# Weighted sum (pre-activation) of the inputs; the sigmoid is applied separately
def predict(w1, w2, bias, age, affordability):
    weighted_sum = w1 * age + w2 * affordability + bias
    return weighted_sum

In [42]:
# Activation function
def sigmoid_AF(X):
    return 1 / (1 + np.exp(-X))

sigmoid_AF(np.array([12, 0, 1]))

array([0.99999386, 0.5       , 0.73105858])
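
For very negative inputs, `np.exp(-X)` overflows and NumPy emits a `RuntimeWarning`, even though the result still rounds to 0. A numerically stable variant, sketched here under the hypothetical name `stable_sigmoid` (not part of the original notebook), avoids exponentiating large positive numbers:

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never calls exp() on a large positive argument."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so this form is safe
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, rewrite as exp(x) / (1 + exp(x)); exp(x) is below 1
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print(stable_sigmoid([12, 0, 1]))
print(stable_sigmoid([-1000.0, 1000.0]))
```

On the inputs used above it should agree with `sigmoid_AF`.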

In [43]:
# Loss function
#np.log([0])
#np.log([0.0000000000001])
#np.log([1])
#np.log([1-0.0000000000001])

def log_loss(y_true, y_predicted):
    # Clip predictions away from exactly 0, because log(0) is -infinity
    epsilon = 1e-15
    y_predicted_new = [max(i, epsilon) for i in y_predicted]
    # Clip predictions away from exactly 1, because log(1 - 1) = log(0) is also -infinity
    y_predicted_new = [min(i, 1 - epsilon) for i in y_predicted_new]
    y_predicted_new = np.array(y_predicted_new)
    overall_loss = -np.mean(y_true * np.log(y_predicted_new) + (1 - y_true) * np.log(1 - y_predicted_new))
    return overall_loss
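
The two list comprehensions can also be expressed with a single `np.clip` call. The sketch below uses the hypothetical name `log_loss_clipped` and is meant as an equivalent, vectorized form of the function above rather than a replacement the notebook relies on:

```python
import numpy as np

def log_loss_clipped(y_true, y_predicted, epsilon=1e-15):
    """Binary cross-entropy with predictions clipped to [epsilon, 1 - epsilon]."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_predicted, dtype=float), epsilon, 1 - epsilon)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# An uninformative prediction of 0.5 for every row gives a loss of ln(2)
print(log_loss_clipped([1, 0], [0.5, 0.5]))
# Predictions of exactly 0 or 1 stay finite because of the clipping
print(log_loss_clipped([1, 0], [1.0, 0.0]))
```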

In [44]:
# Partial derivatives 
def gradient(age, affordability, y_predicted, y_true):
    # The inputs are 1-D, so np.dot sums the element-wise products (np.transpose is a no-op on 1-D data)
    n = len(age)
    w1d = (1/n)*np.dot(np.transpose(age),(y_predicted-y_true)) 
    w2d = (1/n)*np.dot(np.transpose(affordability),(y_predicted-y_true)) 
    bias_d = np.mean(y_predicted-y_true)
    return [w1d, w2d, bias_d]
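
One way to gain confidence in these formulas is to compare them with central finite differences of the log loss. The sketch below is self-contained. The helper names are hypothetical, and the five rows are the ones printed by `df.head()` with age divided by 100.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(params, age, affordability, y_true):
    w1, w2, bias = params
    p = sigmoid(w1 * age + w2 * affordability + bias)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def analytic_gradient(params, age, affordability, y_true):
    # Same formulas as gradient(): mean of (prediction - label) times the input
    w1, w2, bias = params
    error = sigmoid(w1 * age + w2 * affordability + bias) - y_true
    return np.array([np.mean(error * age), np.mean(error * affordability), np.mean(error)])

def numeric_gradient(params, age, affordability, y_true, h=1e-6):
    # Central difference: (L(p + h) - L(p - h)) / (2h) for each parameter
    grad = np.zeros(len(params))
    for k in range(len(params)):
        step = np.zeros(len(params))
        step[k] = h
        grad[k] = (loss(params + step, age, affordability, y_true)
                   - loss(params - step, age, affordability, y_true)) / (2 * h)
    return grad

age = np.array([0.22, 0.25, 0.47, 0.52, 0.46])
affordability = np.array([1, 0, 1, 0, 1])
y_true = np.array([0, 0, 1, 0, 1])
params = np.array([1.0, 1.0, 0.0])  # w1, w2, bias at their initial values

analytic = analytic_gradient(params, age, affordability, y_true)
numeric = numeric_gradient(params, age, affordability, y_true)
print(analytic)
print(numeric)
```

If the two printed vectors agree to several decimal places, the analytic derivatives are consistent with the loss.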

<img src="parameters update.jpg" height=600 width=600/>
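
The image above is not reproduced in this text. Assuming it shows the standard gradient descent update for a sigmoid neuron with log loss, the rules implemented by `gradient` and `gradient_descent` can be written as follows. Here $x_{1,i}$ is age, $x_{2,i}$ is affordability, $\eta$ is `learn_rate`, and $n$ is the number of training rows. These symbols are notation chosen here, not taken from the image.

$$
\hat{y}_i = \sigma(w_1 x_{1,i} + w_2 x_{2,i} + b), \qquad
L = -\frac{1}{n}\sum_{i=1}^{n}\Big[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\Big]
$$

$$
\frac{\partial L}{\partial w_1} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)\,x_{1,i}, \qquad
\frac{\partial L}{\partial w_2} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)\,x_{2,i}, \qquad
\frac{\partial L}{\partial b} = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)
$$

$$
w_1 \leftarrow w_1 - \eta\,\frac{\partial L}{\partial w_1}, \qquad
w_2 \leftarrow w_2 - \eta\,\frac{\partial L}{\partial w_2}, \qquad
b \leftarrow b - \eta\,\frac{\partial L}{\partial b}
$$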

**Gradient descent algorithm**

In [45]:
def gradient_descent(age, affordability, y_true, epochs, loss_thresold):
    w1 = w2 = 1
    bias = 0 
    
    for i in range(epochs):
        
        # Calculation in neuron
        weighted_sum = predict(w1, w2, bias, age, affordability)
        y_predicted = sigmoid_AF(weighted_sum)
        
        # Loss calculation
        loss = log_loss(y_true, y_predicted)

        # Gradient calculation
        w1d, w2d, bias_d = gradient(age, affordability, y_predicted, y_true)

        # Step each parameter against its gradient (learn_rate is the global value set earlier)
        w1 = w1 - learn_rate * w1d
        w2 = w2 - learn_rate * w2d
        bias = bias - learn_rate * bias_d

        if i % 50 == 0:
            print (f'Epoch:{i}, w1:{w1}, w2:{w2}, bias:{bias}, loss:{loss}')

        if loss<=loss_thresold:
            print (f'Epoch:{i}, w1:{w1}, w2:{w2}, bias:{bias}, loss:{loss}')
            break

    return w1, w2, bias

**Build NN**

In [46]:
class myNN:
    def __init__(self):
        self.w1 = 1 
        self.w2 = 1
        self.bias = 0
        
    def fit(self, X, y, epoch, loss_thresold):
        self.w1, self.w2, self.bias = gradient_descent(X['age'], X['affordibility'], y, epoch, loss_thresold)
        print(f"Final weights and bias: w1: {self.w1}, w2: {self.w2}, bias: {self.bias}")
        return [[self.w1, self.w2], self.bias]
        
    def predict(self, X_test):
        weighted_sum = self.w1 * X_test['age'] + self.w2 * X_test['affordibility'] + self.bias
        return sigmoid_AF(weighted_sum)

In [47]:
customModel = myNN()
coef, intercept = customModel.fit(X_train_scaled, y_train, epoch, loss_thresold=0.4631)

Epoch:0, w1:0.974907633470177, w2:0.948348125394529, bias:-0.11341867736368583, loss:0.7113403233723417
Epoch:50, w1:1.503319554173139, w2:1.108384790367645, bias:-1.2319047301235464, loss:0.5675865113475955
Epoch:100, w1:2.2007131317600317, w2:1.2941584023238906, bias:-1.6607009122062801, loss:0.5390680417774752
Epoch:150, w1:2.8495727769689077, w2:1.369689549157275, bias:-1.986105845859897, loss:0.5176462164249293
Epoch:200, w1:3.443016970881803, w2:1.4042218624465033, bias:-2.2571369883752723, loss:0.5005011269691375
Epoch:250, w1:3.982450494649576, w2:1.4239127329321233, bias:-2.494377365971801, loss:0.48654089537617085
Epoch:300, w1:4.472179522095915, w2:1.438787986553552, bias:-2.707387811922373, loss:0.4750814640632793
Epoch:350, w1:4.917245868007634, w2:1.4525660781176122, bias:-2.901176333556766, loss:0.46561475306999006
Epoch:366, w1:5.051047623653049, w2:1.4569794548473887, bias:-2.9596534546250037, loss:0.46293944095888917
Final weights and bias: w1: 5.051047623653049, w2: 1.4569794548473887, bias: -2.9596534546250037

In [48]:
coef, intercept

([5.051047623653049, 1.4569794548473887], -2.9596534546250037)

In [49]:
X_test_scaled

,age,affordibility
2,0.47,1
10,0.18,1
21,0.26,0
11,0.28,1
14,0.49,1
9,0.61,1


**Predict using custom model**

In [50]:
customModel.predict(X_test_scaled)

2     0.705020
10    0.355836
21    0.161599
11    0.477919
14    0.725586
9     0.828987
dtype: float64
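
These probabilities can be reproduced by hand from the learned parameters. The sketch below plugs the reported weights and bias into the sigmoid for the six test rows, then converts each probability to a class label. The 0.5 cut-off is an assumption, since the notebook does not state a threshold.

```python
import numpy as np

# Learned parameters reported by the training run above
w1, w2, bias = 5.051047623653049, 1.4569794548473887, -2.9596534546250037

# The six rows of X_test_scaled
age = np.array([0.47, 0.18, 0.26, 0.28, 0.49, 0.61])
affordibility = np.array([1, 1, 0, 1, 1, 1])

# Weighted sum followed by the sigmoid activation
probabilities = 1 / (1 + np.exp(-(w1 * age + w2 * affordibility + bias)))

# Assumed decision rule: predict 1 when the probability is at least 0.5
labels = (probabilities >= 0.5).astype(int)

print(np.round(probabilities, 6))
print(labels)
```

For the first row, the weighted sum is about 0.8713, and its sigmoid is about 0.705, matching the first value in the output above.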