# Logistic Regression

In this notebook, we will learn how to apply logistic regression to predict whether the cooling load (Y2) of a building is high or low as a function of the building parameters (X1 to X8).

The attached dataset is taken from the [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Energy+efficiency).

To run this code, you will need the following Python packages:
* numpy
* pandas
* openpyxl
* scikit-learn

In [308]:
import numpy as np
import pandas as pd

In [309]:
!pip install openpyxl






In [310]:
# First, we load the dataset using pandas
df = pd.read_excel("Energy_Efficiency.xlsx", engine = 'openpyxl')
# Remove any unnamed columns (might occur due to difference in pandas readers)
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
# Remove any row in which every value is NaN
df = df.dropna(how='all')
# Drop Y1 (as we only consider Y2 for classification)
df = df.drop('Y1', axis=1)

In [311]:
# Next, we split the dataframe into training and testing sets with a 70% / 30% ratio
from sklearn.model_selection import train_test_split

df_train, df_test = train_test_split(df, test_size=0.3, random_state=42) # The random state is fixed for reproducibility

In [312]:
df_train

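| index | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | Y2 |
|---|---|---|---|---|---|---|---|---|---|
| 334 | 0.62 | 808.5 | 367.5 | 220.50 | 3.5 | 4 | 0.25 | 1 | 15.77 |
| 139 | 0.64 | 784.0 | 343.0 | 220.50 | 3.5 | 5 | 0.10 | 2 | 19.30 |
| 485 | 0.90 | 563.5 | 318.5 | 122.50 | 7.0 | 3 | 0.25 | 5 | 32.00 |
| 547 | 0.79 | 637.0 | 343.0 | 147.00 | 7.0 | 5 | 0.40 | 1 | 46.94 |
| 18 | 0.79 | 637.0 | 343.0 | 147.00 | 7.0 | 4 | 0.00 | 0 | 30.93 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 71 | 0.76 | 661.5 | 416.5 | 122.50 | 7.0 | 5 | 0.10 | 1 | 33.67 |
| 106 | 0.86 | 588.0 | 294.0 | 147.00 | 7.0 | 4 | 0.10 | 2 | 27.36 |
| 270 | 0.71 | 710.5 | 269.5 | 220.50 | 3.5 | 4 | 0.10 | 5 | 14.26 |
| 435 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 5 | 0.25 | 4 | 30.12 |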


In [313]:
df_train.describe()

| | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | Y2 |
|---|---|---|---|---|---|---|---|---|---|
| count | 537.0 | 537.0 | 537.0 | 537.0 | 537.0 | 537.0 | 537.0 | 537.0 | 537.0 |
| mean | 0.760354 | 674.867784 | 318.636872 | 178.115456 | 5.201117 | 3.500931 | 0.23594 | 2.854749 | 24.287505 |
| std | 0.10479 | 87.758133 | 43.619254 | 44.839207 | 1.750948 | 1.106502 | 0.134118 | 1.544532 | 9.505775 |
| min | 0.62 | 514.5 | 245.0 | 110.25 | 3.5 | 2.0 | 0.0 | 0.0 | 10.94 |
| 25% | 0.66 | 612.5 | 294.0 | 147.0 | 3.5 | 3.0 | 0.1 | 2.0 | 15.5 |
| 50% | 0.74 | 686.0 | 318.5 | 220.5 | 3.5 | 3.0 | 0.25 | 3.0 | 21.16 |
| 75% | 0.82 | 759.5 | 343.0 | 220.5 | 7.0 | 4.0 | 0.4 | 4.0 | 32.92 |
| max | 0.98 | 808.5 | 416.5 | 220.5 | 7.0 | 5.0 | 0.4 | 5.0 | 48.03 |


In [314]:
# Now we will extract the model's inputs and targets from both the training and testing dataframes
def extract_Xy(df):
    df_numpy = df.to_numpy()
    return df_numpy[:, :-1], df_numpy[:, -1]

X_train, y_train = extract_Xy(df_train)
X_test, y_test = extract_Xy(df_test)

y_median = np.median(y_train)
print("Median value of the target:", y_median)

# Since we will treat this as a classification task, we will assume that
# the load is "high" (y = True) if the cooling load (Y2) is higher than the training median
# otherwise, it is assumed to be "low" (y = False)
y_train = y_train > y_median
y_test = y_test > y_median

# Now ~50% of the samples should be considered "high" and the rest are considered "low"
print(f"Percentage of 'high load' samples: {y_train.mean() * 100} %")

# Also, let's standardize the data (using the training statistics only) since it improves the training process
X_mean = X_train.mean(axis=0)
X_std = X_train.std(axis=0)
X_train = (X_train - X_mean)/(1e-8 + X_std)
X_test = (X_test - X_mean)/(1e-8 + X_std)

Median value of the target: 21.16
Percentage of 'high load' samples: 49.906890130353815 %
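The two preprocessing steps above (thresholding the target at the training median, then standardizing with training statistics) can be illustrated on a tiny example. This is only a sketch on made-up numbers; the arrays `X_train_toy`, `X_test_toy` and `y_train_toy` are hypothetical and not taken from the dataset.

```python
import numpy as np

# Toy data (hypothetical values, not taken from the dataset)
X_train_toy = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
X_test_toy = np.array([[2.5, 250.0]])
y_train_toy = np.array([10.0, 20.0, 30.0, 40.0])

# Binarize the target around the training median
median = np.median(y_train_toy)   # 25.0
labels = y_train_toy > median     # [False, False, True, True]

# Standardize with statistics computed on the training split only,
# then reuse the same mean and std for the test split
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)
X_train_std = (X_train_toy - mean) / (1e-8 + std)
X_test_std = (X_test_toy - mean) / (1e-8 + std)

print(labels)
print(X_train_std.mean(axis=0), X_train_std.std(axis=0))
print(X_test_std)
```

Reusing the training mean and standard deviation for the test split keeps information about the test data out of the preprocessing.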


## Logistic Regression via Scikit-Learn

In [315]:
from sklearn.linear_model import LogisticRegression

In [316]:
%%time
# We use %%time to measure the training time of our model
# Note: scikit-learn >= 1.2 expects penalty=None; older releases use penalty="none"
model = LogisticRegression(random_state=0, penalty=None).fit(X_train, y_train)

CPU times: total: 15.6 ms
Wall time: 8.97 ms




In [317]:
from sklearn.metrics import accuracy_score

y_train_predict = model.predict(X_train)
print(f"Training Accuracy: {accuracy_score(y_train, y_train_predict) * 100}%")
y_test_predict = model.predict(X_test)
print(f"Testing Accuracy: {accuracy_score(y_test, y_test_predict) * 100}%")

Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%


## Logistic Regression from Scratch

In [318]:
def sigmoid(x):
    #TODO: Implement sigmoid (hint: use np.exp)
    return  1 / (1 + np.exp(-x))

In [319]:
# Sanity checks
print(f"{sigmoid(-1e2) = }") # This should be almost equal to 0
print(f"{sigmoid(   0) = }") # This should be exactly 0.5
print(f"{sigmoid(+1e2) = }") # This should be almost equal to 1

sigmoid(-1e2) = 3.7200759760208356e-44
sigmoid(   0) = 0.5
sigmoid(+1e2) = 1.0
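The `sigmoid` above calls `np.exp(-x)`, which overflows for large negative NumPy inputs (for example `x = -1000`) and emits a `RuntimeWarning`. One common workaround, sketched here under the hypothetical name `stable_sigmoid`, is to exponentiate only non-positive numbers. This is an alternative formulation, not the notebook's implementation.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never exponentiates a positive number, so np.exp cannot overflow."""
    x = np.asarray(x, dtype=float)
    # exp(-|x|) always lies in [0, 1], so it is safe for any finite x
    z = np.exp(-np.abs(x))
    # For x >= 0: 1 / (1 + exp(-x)); for x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1.0 / (1.0 + z), z / (1.0 + z))

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

Both branches are algebraically equal to `1 / (1 + exp(-x))`, so results for moderate inputs are unchanged.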


In [320]:
def our_accuracy_score(true, predicted):
    #TODO: Implement an accuracy metric so that it can be used instead of Sklearn's accuracy score
    #Note: both true and predicted will be boolean numpy arrays
    acc_score = np.sum(true == predicted) / len(true)
    return acc_score

In [321]:
# Sanity checks
print(f"{our_accuracy_score( np.array([True,  True]), np.array([True,  True]) ) = }") # Should be 1
print(f"{our_accuracy_score( np.array([True, False]), np.array([True,  True]) ) = }") # Should be 0.5
print(f"{our_accuracy_score( np.array([True, False]), np.array([True, False]) ) = }") # Should be 1
print(f"{our_accuracy_score( np.array([False, True]), np.array([True, False]) ) = }") # Should be 0

our_accuracy_score( np.array([True,  True]), np.array([True,  True]) ) = 1.0
our_accuracy_score( np.array([True, False]), np.array([True,  True]) ) = 0.5
our_accuracy_score( np.array([True, False]), np.array([True, False]) ) = 1.0
our_accuracy_score( np.array([False, True]), np.array([True, False]) ) = 0.0


In [322]:
#IMPORTANT: You can only use numpy here. Do not use any premade algorithms (e.g. Scikit-Learn's Logistic Regression)
class OurLogisticRegression:
    def __init__(self, lr: float, epochs: int, probability_threshold: float = 0.5, random_state = None):
        self.lr = lr # The learning rate
        self.epochs = epochs # The number of training epochs
        self.probability_threshold = probability_threshold # If the output of the sigmoid function is > probability_threshold, the prediction is considered to be positive (True)
                                                            # otherwise, the prediction is considered to be negative (False)
        self.random_state = random_state # The random state will be used to set the random seed for the sake of reproducibility
    
    def _prepare_input(self, X):
        # Here, we add a new input with value 1 to each example. It will be multiplied by the bias
        ones = np.ones((X.shape[0], 1), dtype=X.dtype)
        return np.concatenate((ones, X), axis=1)
    
    def _prepare_target(self, y):
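        # Here, we convert True to 1 and False to 0, because the expression used in _gradient,
        # -X.T @ (y - sigmoid(X @ w)), is the cross-entropy gradient for targets in {0, 1}
        #TODO (Optional): You can modify your function if you wish to use other values for the positive and negative classes
        return np.where(y, 1, 0)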

    def _initialize(self, num_weights: int, stdev: float = 0.01):
        # Here, we initialize the weights using a normally distributed random variable with a small standard deviation
        self.w = np.random.randn(num_weights) * stdev

    def _gradient(self, X, y):
        #TODO: Compute and return the gradient of the loss wrt the weights (self.w) given the X and y arrays
        
        pred = sigmoid(X @ self.w)
        error = y - pred
        gradient = -X.T @ error
        return gradient

    def _update(self, X, y):
        #TODO: Implement this function to apply a single iteration on the weights "self.w"
        #Hint: use self._gradient

        # w(t+1) = w(t) - learning_rate * gradient
        self.w = self.w - self.lr * self._gradient(X,y)

    def fit(self, X, y):
        np.random.seed(self.random_state) # First, we set the seed
        X = self._prepare_input(X) # Then we prepare the inputs
        y = self._prepare_target(y) # and prepare the targets too
        self._initialize(X.shape[1]) # and initialize the weights randomly
        for _ in range(self.epochs): # Then we update the weights for a certain number of epochs
            self._update(X, y)
        return self # Return self to match the behavior of Scikit-Learn's LogisticRegression fit()
    
    def predict(self, X):
        X = self._prepare_input(X)
        #TODO: Implement the rest of this function (Note: It should return a boolean array)
        pred = sigmoid(X @ self.w)
        return pred > self.probability_threshold
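The expression used in `_gradient`, `-X.T @ (y - sigmoid(X @ w))`, is the gradient of the summed cross-entropy loss when the targets are encoded as 0 and 1. A quick way to convince yourself of this is a finite-difference check. The sketch below is self-contained and uses hypothetical helper names (`loss`, `analytic_gradient`, `numeric_gradient`) and random toy data; it assumes 0/1 labels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # Cross-entropy loss summed over the samples, for labels y in {0, 1}
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def analytic_gradient(w, X, y):
    # Same expression as in _gradient: -X^T (y - sigmoid(X w))
    return -X.T @ (y - sigmoid(X @ w))

def numeric_gradient(w, X, y, eps=1e-6):
    # Central finite differences, one coordinate at a time
    grad = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (loss(w + step, X, y) - loss(w - step, X, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3) * 0.1

max_diff = np.max(np.abs(analytic_gradient(w, X, y) - numeric_gradient(w, X, y)))
print(f"largest difference between the two gradients: {max_diff:.2e}")
```

If the targets were encoded as -1 and +1 instead, the matching loss and gradient would take a different form, so the same expression would no longer pass this check.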

In [323]:
# We will use this function to tune the hyper parameters
def validate(lr, epochs):
    validation_size = 0.3 #TODO: Choose a size for the validation set as a ratio from the training data
    X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=validation_size, random_state=42)
    # We will fit the model to only a subset of the training data and we will use the rest to evaluate the performance
    our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_tr, y_tr)
    # Then, we evaluate the performance using the validation set
    return our_accuracy_score(y_val, our_model.predict(X_val)) 

In [324]:
lr = 0.01 #TODO: Choose a learning rate to use while testing different values for the number of epochs
epochs_values = [1,10,100,1000] #TODO: Choose a list of values for the number of epochs to test
for epochs in epochs_values:
    accuracy = validate(lr, epochs)
    print(f"In {epochs} epochs, the accuracy reaches {accuracy * 100}% using lr={lr}")

In 1 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 10 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 100 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 1000 epochs, the accuracy reaches 98.76543209876543% using lr=0.01


In [325]:
epochs = 100 #TODO: Choose the number of epochs to use while testing different values for the learning rate
lr_values = [ 0.5, 0.1, 0.05, 0.01] #TODO: Choose a list of values for the learning rate to test
for lr in lr_values:
    accuracy = validate(lr, epochs)
    print(f"Using lr={lr}, the accuracy reaches {accuracy * 100}% in {epochs} epochs")

Using lr=0.5, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.1, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.05, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.01, the accuracy reaches 98.76543209876543% in 100 epochs
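The validation accuracy is identical for every setting tried above, so accuracy alone cannot rank them. One option is to compare a metric that looks at the predicted probabilities rather than the thresholded predictions, such as the mean log loss. The sketch below assumes the probabilities are available (the class above only exposes boolean predictions), and the function name `mean_log_loss` and the two probability vectors are hypothetical.

```python
import numpy as np

def mean_log_loss(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy; y_true is boolean or 0/1, p_pred holds probabilities."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Two hypothetical models with the same accuracy at a 0.5 threshold
y_val_toy = np.array([True, False, True, False])
p_confident = np.array([0.95, 0.05, 0.90, 0.10])
p_hesitant = np.array([0.60, 0.40, 0.55, 0.45])

print(mean_log_loss(y_val_toy, p_confident))
print(mean_log_loss(y_val_toy, p_hesitant))
```

Both toy models classify every sample correctly, yet the first one gets a lower log loss because its probabilities are closer to the true labels.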




In [326]:
%%time
# We use time to compute the training time of our model
#TODO: Select an appropriate learning rate and number of epochs
lr = 0.01
epochs = 100
our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_train, y_train)

CPU times: total: 0 ns
Wall time: 2.99 ms
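A single `%%time` measurement of a few milliseconds is dominated by noise, so it is worth repeating the measurement before comparing implementations. The sketch below uses the standard-library `timeit` module on a stand-in training function. `train_toy` and the random data are hypothetical placeholders, not the notebook's model or dataset.

```python
import timeit
import numpy as np

def train_toy(X, y, lr=0.01, epochs=100):
    # Stand-in for a training call: batch gradient descent on the mean gradient, 0/1 labels
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / len(y)
    return w

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(500, 8))
y_toy = (X_toy[:, 0] > 0).astype(float)

# Repeat the measurement 5 times; each repeat runs the call 20 times
timings = timeit.repeat(lambda: train_toy(X_toy, y_toy), number=20, repeat=5)
per_call_ms = [t / 20 * 1000 for t in timings]
print(f"best: {min(per_call_ms):.3f} ms, worst: {max(per_call_ms):.3f} ms per call")
```

Reporting the best and worst repeat gives a rough idea of how much of a measured gap is just run-to-run variation.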




In [327]:
y_train_predict = our_model.predict(X_train)
print(f"Training Accuracy: {our_accuracy_score(y_train, y_train_predict) * 100}%")
y_test_predict = our_model.predict(X_test)
print(f"Testing Accuracy: {our_accuracy_score(y_test, y_test_predict) * 100}%")

Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%




In [328]:
#TODO: Write your conclusion about your implementation's performance and training time

'''
Our implementation:
-------------------
Performance:
Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%

Training time:
CPU times: total: 0 ns
Wall time: 2.99 ms

Scikit-Learn's implementation:
------------------------------
Performance:
Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%

Training time:
CPU times: total: 15.6 ms
Wall time: 8.97 ms

--> The timings change a little from run to run.
------------------------------------------------------

Training time:
In this run, our implementation (2.99 ms wall time) was faster than
Scikit-Learn's implementation (8.97 ms wall time).
This may be because the dataset is small and we only train for 100 epochs.
Both timings are single measurements of a few milliseconds, so the gap is not conclusive.

Performance:
Both implementations reach the same training and testing accuracy.
'''

# Bonus

As a bonus, you can implement and test the following:
* Stochastic gradient descent
* Termination conditions (e.g. The gradient check)
  
Write your conclusion about any results you calculate for your bonus implementations.

**IMPORTANT**: Do not implement the bonus in the previous cells. You can copy and paste code from the previous cells and continue your implementation below this cell.

## Stochastic gradient descent

In [329]:
#IMPORTANT: You can only use numpy here. Do not use any premade algorithms (e.g. Scikit-Learn's Logistic Regression)
class OurLogisticRegression:
    def __init__(self, lr: float, epochs: int, probability_threshold: float = 0.5, random_state = None):
        self.lr = lr # The learning rate
        self.epochs = epochs # The number of training epochs
        self.probability_threshold = probability_threshold # If the output of the sigmoid function is > probability_threshold, the prediction is considered to be positive (True)
                                                            # otherwise, the prediction is considered to be negative (False)
        self.random_state = random_state # The random state will be used to set the random seed for the sake of reproducibility
    
    def _prepare_input(self, X):
        # Here, we add a new input with value 1 to each example. It will be multiplied by the bias
        ones = np.ones((X.shape[0], 1), dtype=X.dtype)
        return np.concatenate((ones, X), axis=1)
    
    def _prepare_target(self, y):
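        # Here, we convert True to 1 and False to 0, because the expression used in _gradient,
        # -X.T @ (y - sigmoid(X @ w)), is the cross-entropy gradient for targets in {0, 1}
        #TODO (Optional): You can modify your function if you wish to use other values for the positive and negative classes
        return np.where(y, 1, 0)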

    def _initialize(self, num_weights: int, stdev: float = 0.01):
        # Here, we initialize the weights using a normally distributed random variable with a small standard deviation
        self.w = np.random.randn(num_weights) * stdev

    def _gradient(self, X, y):
        #TODO: Compute and return the gradient of the loss wrt the weights (self.w) given the X and y arrays
        
        pred = sigmoid(X @ self.w)
        error = y - pred
        gradient = -X.T @ error
        return gradient

    def _update(self, X, y):
        #TODO: Implement this function to apply a single iteration on the weights "self.w"
        #Hint: use self._gradient

        # w(t+1) = w(t) - learning_rate * gradient, where the gradient is estimated from a single randomly drawn sample
        index = np.random.randint(X.shape[0])
        xi = X[index, :].reshape(1, -1)
        yi = y[index]
        self.w = self.w - self.lr * self._gradient(xi, yi)

    def fit(self, X, y):
        np.random.seed(self.random_state) # First, we set the seed
        X = self._prepare_input(X) # Then we prepare the inputs
        y = self._prepare_target(y) # and prepare the targets too
        self._initialize(X.shape[1]) # and initialize the weights randomly
        for _ in range(self.epochs): # Note: each "epoch" here is a single update on one randomly drawn sample
            self._update(X, y)
        return self # Return self to match the behavior of Scikit-Learn's LogisticRegression fit()
    
    def predict(self, X):
        X = self._prepare_input(X)
        #TODO: Implement the rest of this function (Note: It should return a boolean array)
        pred = sigmoid(X @ self.w)
        return pred > self.probability_threshold
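In the class above, every "epoch" performs one update on one random sample. A more conventional variant treats an epoch as one shuffled pass over all samples, so that every sample is used once per epoch. The sketch below shows that variant as a standalone function. The name `sgd_epochs`, the 0/1 label encoding and the toy data are assumptions made for illustration.

```python
import numpy as np

def sgd_epochs(X, y, lr=0.01, epochs=10, seed=0):
    """SGD where one epoch is one shuffled pass over all samples (labels in {0, 1})."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(X.shape[0]):      # visit every sample once, in random order
            p = 1.0 / (1.0 + np.exp(-(X[i] @ w)))  # predicted probability for sample i
            w -= lr * (p - y[i]) * X[i]            # single-sample gradient step
    return w

# Toy, linearly separable data (hypothetical): a bias column plus two features
rng = np.random.default_rng(1)
features = rng.normal(size=(200, 2))
X_toy = np.hstack([np.ones((200, 1)), features])
y_toy = (features[:, 0] + features[:, 1] > 0).astype(float)

w = sgd_epochs(X_toy, y_toy)
accuracy = np.mean((X_toy @ w > 0) == (y_toy == 1))
print(f"training accuracy on the toy data: {accuracy:.3f}")
```

With this definition, `epochs` has the same meaning as in batch gradient descent, which makes the two easier to compare.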

In [330]:
# We will use this function to tune the hyper parameters
def validate(lr, epochs):
    validation_size = 0.3 #TODO: Choose a size for the validation set as a ratio from the training data
    X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=validation_size, random_state=42)
    # We will fit the model to only a subset of the training data and we will use the rest to evaluate the performance
    our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_tr, y_tr)
    # Then, we evaluate the performance using the validation set
    return our_accuracy_score(y_val, our_model.predict(X_val)) 

In [331]:
lr = 0.01 #TODO: Choose a learning rate to use while testing different values for the number of epochs
epochs_values = [1,10,100,1000] #TODO: Choose a list of values for the number of epochs to test
for epochs in epochs_values:
    accuracy = validate(lr, epochs)
    print(f"In {epochs} epochs, the accuracy reaches {accuracy * 100}% using lr={lr}")

In 1 epochs, the accuracy reaches 85.80246913580247% using lr=0.01
In 10 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 100 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 1000 epochs, the accuracy reaches 98.76543209876543% using lr=0.01


In [332]:
epochs = 100 #TODO: Choose the number of epochs to use while testing different values for the learning rate
lr_values = [ 0.5, 0.1, 0.05, 0.01] #TODO: Choose a list of values for the learning rate to test
for lr in lr_values:
    accuracy = validate(lr, epochs)
    print(f"Using lr={lr}, the accuracy reaches {accuracy * 100}% in {epochs} epochs")

Using lr=0.5, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.1, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.05, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.01, the accuracy reaches 98.76543209876543% in 100 epochs


In [333]:
%%time
# We use time to compute the training time of our model
#TODO: Select an appropriate learning rate and number of epochs
lr = 0.01
epochs = 100
our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_train, y_train)

CPU times: total: 0 ns
Wall time: 3.99 ms


In [334]:
y_train_predict = our_model.predict(X_train)
print(f"Training Accuracy: {our_accuracy_score(y_train, y_train_predict) * 100}%")
y_test_predict = our_model.predict(X_test)
print(f"Testing Accuracy: {our_accuracy_score(y_test, y_test_predict) * 100}%")

Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%


In [335]:
'''
Stochastic gradient descent

CPU times: total: 0 ns
Wall time: 3.99 ms

Its wall time (3.99 ms) is slightly higher than that of batch gradient descent (2.99 ms),
although a difference of about 1 ms between single runs is within measurement noise.

Because each update uses a single noisy sample, it may need more iterations to converge:
with a single update it reaches only 85.80% validation accuracy,
while 10 or more updates reach 98.77%.
'''

## Termination conditions (e.g. The gradient check)

In [336]:
#IMPORTANT: You can only use numpy here. Do not use any premade algorithms (e.g. Scikit-Learn's Logistic Regression)
class OurLogisticRegression:
    def __init__(self, lr: float, epochs: int, probability_threshold: float = 0.5, random_state = None):
        self.lr = lr # The learning rate
        self.epochs = epochs # The number of training epochs
        self.probability_threshold = probability_threshold # If the output of the sigmoid function is > probability_threshold, the prediction is considered to be positive (True)
                                                            # otherwise, the prediction is considered to be negative (False)
        self.random_state = random_state # The random state will be used to set the random seed for the sake of reproducibility
    
    def _prepare_input(self, X):
        # Here, we add a new input with value 1 to each example. It will be multiplied by the bias
        ones = np.ones((X.shape[0], 1), dtype=X.dtype)
        return np.concatenate((ones, X), axis=1)
    
    def _prepare_target(self, y):
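        # Here, we convert True to 1 and False to 0, because the expression used in _gradient,
        # -X.T @ (y - sigmoid(X @ w)), is the cross-entropy gradient for targets in {0, 1}
        #TODO (Optional): You can modify your function if you wish to use other values for the positive and negative classes
        return np.where(y, 1, 0)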

    def _initialize(self, num_weights: int, stdev: float = 0.01):
        # Here, we initialize the weights using a normally distributed random variable with a small standard deviation
        self.w = np.random.randn(num_weights) * stdev

    def _gradient(self, X, y):
        #TODO: Compute and return the gradient of the loss wrt the weights (self.w) given the X and y arrays
        
        pred = sigmoid(X @ self.w)
        error = y - pred
        gradient = -X.T @ error
        return gradient

    def _update(self, X, y):
        #TODO: Implement this function to apply a single iteration on the weights "self.w"
        #Hint: use self._gradient

        # w(t+1) = w(t) - learning_rate * gradient
        self.w = self.w - self.lr * self._gradient(X,y)

    def fit(self, X, y, threshold=1e-5, max_iterations=100):
        # threshold: smallest change in the gradient norm that still counts as progress
        # max_iterations: number of consecutive iterations without progress before stopping early
        np.random.seed(self.random_state) # First, we set the seed
        X = self._prepare_input(X) # Then we prepare the inputs
        y = self._prepare_target(y) # and prepare the targets too
        self._initialize(X.shape[1]) # and initialize the weights randomly

        prev_grad_norm = float('inf')
        iterations_no_change = 0

        for epoch in range(self.epochs): # Then we update the weights for at most self.epochs epochs
            self._update(X, y)

            # We monitor the norm of the gradient (not the loss itself) after each update
            curr_grad_norm = np.linalg.norm(self._gradient(X, y))

            if np.abs(prev_grad_norm - curr_grad_norm) < threshold: # The gradient norm changed by less than the threshold
                iterations_no_change += 1
                if iterations_no_change == max_iterations:
                    break
            else:
                iterations_no_change = 0
                prev_grad_norm = curr_grad_norm

        return self # Return self to match the behavior of Scikit-Learn's LogisticRegression fit()
    
    def predict(self, X):
        X = self._prepare_input(X)
        #TODO: Implement the rest of this function (Note: It should return a boolean array)
        pred = sigmoid(X @ self.w)
        return pred > self.probability_threshold
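A simpler termination rule than the one in `fit` above is to stop as soon as the gradient norm itself falls below a tolerance, reusing the gradient that the update needs anyway. The sketch below is a standalone illustration of that rule. The function name `fit_until_converged`, the use of the mean gradient, the 0/1 labels and the toy data are all assumptions, not the notebook's method.

```python
import numpy as np

def fit_until_converged(X, y, lr=0.5, max_epochs=10_000, tol=1e-4):
    """Batch gradient descent that stops once the mean-gradient norm drops below tol."""
    w = np.zeros(X.shape[1])
    for epoch in range(1, max_epochs + 1):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y) / len(y)      # mean gradient, labels in {0, 1}
        if np.linalg.norm(grad) < tol:     # reuse the gradient we need anyway
            return w, epoch
        w -= lr * grad
    return w, max_epochs

# Toy data with overlapping classes (hypothetical), so that a finite optimum exists
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y_toy = (x + rng.normal(size=300) > 0).astype(float)
X_toy = np.column_stack([np.ones(300), x])

w, epochs_used = fit_until_converged(X_toy, y_toy)
print(f"stopped at epoch {epochs_used}, w = {w}")
```

Because the check reuses the gradient computed for the update, it adds only one norm computation per epoch instead of a second gradient evaluation.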

In [337]:
# We will use this function to tune the hyper parameters
def validate(lr, epochs):
    validation_size = 0.3 #TODO: Choose a size for the validation set as a ratio from the training data
    X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, test_size=validation_size, random_state=42)
    # We will fit the model to only a subset of the training data and we will use the rest to evaluate the performance
    our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_tr, y_tr)
    # Then, we evaluate the performance using the validation set
    return our_accuracy_score(y_val, our_model.predict(X_val)) 

In [338]:
lr = 0.01 #TODO: Choose a learning rate to use while testing different values for the number of epochs
epochs_values = [1,10,100,1000] #TODO: Choose a list of values for the number of epochs to test
for epochs in epochs_values:
    accuracy = validate(lr, epochs)
    print(f"In {epochs} epochs, the accuracy reaches {accuracy * 100}% using lr={lr}")

In 1 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 10 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 100 epochs, the accuracy reaches 98.76543209876543% using lr=0.01
In 1000 epochs, the accuracy reaches 98.76543209876543% using lr=0.01




In [339]:
epochs = 100 #TODO: Choose the number of epochs to use while testing different values for the learning rate
lr_values = [ 0.5, 0.1, 0.05, 0.01] #TODO: Choose a list of values for the learning rate to test
for lr in lr_values:
    accuracy = validate(lr, epochs)
    print(f"Using lr={lr}, the accuracy reaches {accuracy * 100}% in {epochs} epochs")

Using lr=0.5, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.1, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.05, the accuracy reaches 98.76543209876543% in 100 epochs
Using lr=0.01, the accuracy reaches 98.76543209876543% in 100 epochs




In [340]:
%%time
# We use time to compute the training time of our model
#TODO: Select an appropriate learning rate and number of epochs
lr = 0.01
epochs = 100
our_model = OurLogisticRegression(lr=lr, epochs=epochs, random_state=0).fit(X_train, y_train)

CPU times: total: 0 ns
Wall time: 6.98 ms




In [341]:
y_train_predict = our_model.predict(X_train)
print(f"Training Accuracy: {our_accuracy_score(y_train, y_train_predict) * 100}%")
y_test_predict = our_model.predict(X_test)
print(f"Testing Accuracy: {our_accuracy_score(y_test, y_test_predict) * 100}%")

Training Accuracy: 98.32402234636871%
Testing Accuracy: 96.53679653679653%




In [342]:
'''
Termination conditions (e.g. The gradient check)

CPU times: total: 0 ns
Wall time: 6.98 ms

Its wall time (6.98 ms) is higher than that of plain batch gradient descent (2.99 ms).
The check adds overhead because the gradient is computed a second time in every iteration.
With epochs = 100 and max_iterations = 100, the early exit can never trigger in this run,
so the check only adds cost here; it can only pay off when the number of epochs is larger.
'''