#### Exercise 1 - Using OO design, Implement a classification using Perceptron
##### Task-1: Load and Explore Churn Modeling Dataset
##### Task-2: Prepare Feature Matrix and Target Variable for Churn Modeling Prediction
##### Task-3: Standardize Input Features Using Standard Scaler
##### Task-4: Split the Dataset into Training and Testing Sets
##### Task-5: Initialize Model Parameters for Classification
##### Task-6: Implement Activation Function (step function) for Perceptron Model
##### Task-7: Implement Forward Propagation with Activation Function for Perceptron

In [1]:
#Import all the necessary libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

## Task-1: Load and Explore Churn Modeling Dataset

Load the dataset from the CSV file located at '/shareddata/datasets/ngitcourse4/Churn_Modelling_Data.csv' into a pandas DataFrame called 'churn_df'.

In [2]:
churn_df = pd.read_csv('Churn_Modelling_Data.csv')

## Task-2: Prepare Feature Matrix and Target Variable for Churn Modeling Prediction

Steps:

1) Drop the Exited column from the DataFrame churn_df to isolate the input features.
2) Extract the Exited column from churn_df as the target variable. 

In [3]:
X = churn_df.drop(['Exited'], axis=1).values
y = churn_df['Exited'].values.reshape(-1, 1)

## Task-3: Standardize Input Features Using Standard Scaler

Standardize the input features (X) to have a mean of 0 and a standard deviation of 1, 
using the StandardScaler from sklearn.preprocessing.

Steps:
    
1) Initialize the StandardScaler from the sklearn.preprocessing module
2) Apply the scaler to the input feature matrix X by fitting the scaler to the data and then transforming it to standardized values:

In [4]:
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

## Task-4: Split the Dataset into Training and Testing Sets

Split the standardized input features (X_scaled) and target variable (y) into training and testing sets 
using the train_test_split function from sklearn.model_selection. 

Steps:

1) Use the train_test_split function to divide the standardized input data (X_scaled) and 
the target variable (y) into training and testing sets.
2) Specify test_size=0.2 to allocate 20% of the data for testing and 80% for training.
3) Set random_state=42 to ensure that the split is reproducible, so that running the code multiple times yields the same split.

In [5]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

In [6]:
X_train.shape

(120, 10)

In [7]:
y_train.shape

(120, 1)

In [8]:
class Perceptron:
    
    def __init__(self):
        self.weights = None
        self.bias = None
    
    def initialize_parameters(self, X):
        pass
    
    def activation(self, z):
        pass
    
    def activation_derivative(self, z):
        pass 
    
    def forward_propagation(self, X):
        pass
    
    def compute_bce(self,y, y_pred):
        pass
    
    def bce_derivative(self,y, y_pred):
        pass
    
    def compute_gradients(self, X, y_true, y_pred):
        pass
    
    def train(self, X, num_epochs):
        pass

## Task-5: Initialize Model Parameters for Classification

Implement a function to initialize the weights and bias parameters for a classification model. 
The weights are initialized randomly with small values, and the bias is set to zero.

Steps:

1) Define a function initialize_parameters that takes the input data X as an argument.
Inside the function:
2) Calculate the number of features in X 
3) Initialize the weights as a NumPy array of shape using np.random.randn
4) Set the bias to zero

In [9]:
def initialize_parameters(self, X):
    n_features = X.shape[1]

    # Initializing weights and bias
    self.weights = np.random.randn(n_features, 1) * 0.01
    self.bias = 0
    
Perceptron.initialize_parameters=initialize_parameters

## Task-6: Implement Activation Function (step function) for Perceptron Model

Objective:

Define the step function as the activation function for the Perceptron model, which will be used to determine the binary output (0 or 1) based on the linear combination of inputs and weights.

Steps:

1) Define an activation function that takes self and z as arguments:
z is the input to the activation function, which is the result of the dot product between the input features and the weights plus the bias.
2) Inside the function, apply the step function outputs 1 if z >= 0 and 0 if z < 0. 
This function converts the continuous output of the linear model into a binary class label (0 or 1).

In [10]:
def activation(self, z):
    """Sigmoid activation function."""
    
    return 1 / (1 + np.exp(-z))

Perceptron.activation=activation

In [11]:
def activation_derivative(self, z):
    """Derivative of the sigmoid function."""
    a = self.activation(z)
    
    return a * (1 - a)

Perceptron.activation_derivative=activation_derivative

## Task-7: Implement Forward Propagation with Activation Function for Perceptron

Implement forward propagation for the Perceptron model, where the input features are linearly combined 
with the weights and bias, and then passed through the activation function to generate binary predictions.

Steps:

1) Define the forward_propagation method, which takes X as an argument.
Inside the method:
2) Compute the linear combination of the input feature matrix X with the weights (self.weights) and add the bias (self.bias):
z = np.dot(X, self.weights) + self.bias
This step produces the raw output (z) of the model, representing the input to the activation function.
3) Pass the computed z through the activation function (self.activation(z)) to get the predicted labels (y_pred):
The activation function (such as the step function) converts the continuous output into binary values (0 or 1) for classification.
Return the predicted values y_pred.

In [12]:
def forward_propagation(self, X):
   
    # compute the sum of the product of the input and the weights
    z = np.dot(X, self.weights) + self.bias # Finding the dot product and adding the bias
    y_pred = self.activation(z) # Passing through an activation function

    return y_pred

Perceptron.forward_propagation=forward_propagation

In [13]:
def compute_bce(self, y, y_pred):
    m = y.shape[0]
    bce = (1/m) * np.sum(-(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred)))
    
    return bce

Perceptron.compute_bce=compute_bce

In [14]:
def bce_derivative(self, y, y_pred):
     m = y.shape[0]
   
     return (1/m) * (y_pred - y) / (y_pred * (1 - y_pred))

Perceptron.bce_derivative=bce_derivative

In [15]:
# Function to compute the gradient of BCE with respect to the parameters (w, b)
def compute_gradients(self, X, y_true, y_pred):
 
    d_pred_loss = self.bce_derivative(y_true, y_pred)             # Derivative of loss w.r.t y_pred
    d_activation_loss = self.activation_derivative(y_pred)           # Derivative of sigmoid activation
    dL_dz = d_pred_loss * d_activation_loss                  # Chain rule: derivative w.r.t z
    dL_dw = np.dot(X.T, dL_dz) / X.shape[0]                  # Gradient w.r.t weights
    dL_db = np.mean(dL_dz, axis=0)                           # Gradient w.r.t bias
    
    return dL_dw, dL_db

Perceptron.compute_gradients=compute_gradients

In [16]:
def train(self, X, num_epochs):
    
    learning_rate=0.01 # 0.01, 0.1, 0.5
    loss = 0
    
    for epoch in range(0, num_epochs):
        
        y_pred = self.forward_propagation(X).reshape(-1,1)
        loss = self.compute_bce(y_train, y_pred)
       
        # Compute the gradients
        dw, db = self.compute_gradients(X, y_train, y_pred)

        # update parameters W (weights) and b (bias)

        self.weights = self.weights - dw * learning_rate
        self.bias = self.bias - db * learning_rate

        print(f"Epoch {epoch+1}    loss = {loss}")
    return loss   

Perceptron.train=train

In [17]:
perceptron = Perceptron()
perceptron.initialize_parameters(X_train)
print(perceptron.weights)

[[ 0.00426966]
 [-0.000111  ]
 [-0.00115684]
 [ 0.00741583]
 [-0.01607478]
 [-0.00317835]
 [-0.0156156 ]
 [-0.0201802 ]
 [-0.00717695]
 [-0.02023813]]


In [18]:
y_pred = perceptron.forward_propagation(X_train).reshape(-1,1)
perceptron.compute_bce(y_train, y_pred)

0.6930996918819761

In [19]:
def random_weights_bias(steps):
    for i in range(1, steps):
        perceptron.initialize_parameters(X_train)
        y_pred = perceptron.forward_propagation(X_train).reshape(-1,1)
        bce = perceptron.compute_bce(y_train, y_pred)
        print(f"bce loss = {bce}")

In [20]:
random_weights_bias(10)

bce loss = 0.6922137060594854
bce loss = 0.6960018224322305
bce loss = 0.6941105934058425
bce loss = 0.6917042192788531
bce loss = 0.692950631544614
bce loss = 0.6963590418316041
bce loss = 0.6950704942348961
bce loss = 0.6947810679617461
bce loss = 0.6918802915155332


In [21]:
perceptron.train(X_train, 10)

Epoch 1    loss = 0.6918802915155332
Epoch 2    loss = 0.6918717308381204
Epoch 3    loss = 0.6918631705027233
Epoch 4    loss = 0.6918546105093237
Epoch 5    loss = 0.6918460508579035
Epoch 6    loss = 0.6918374915484446
Epoch 7    loss = 0.6918289325809286
Epoch 8    loss = 0.6918203739553375
Epoch 9    loss = 0.6918118156716528
Epoch 10    loss = 0.6918032577298568


0.6918032577298568