## Logistic Regression - An Overview
<br><br>

## Constructing a Logistic Regression Classifier
<br><br>

We'll use the Pima Indians Diabetes dataset to aid in our construction of this classifier. The dataset describes whether an individual's health data, as well as whether or not they developed diabetes within 5 years of the measurements.

In [22]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

data=pd.read_csv("C:/Users/zmurp/github/Machine_Learning_Algorithms/datasets/diabetes.csv")
data.median()
dataset = data.fillna(data.median())
X = dataset.iloc[:500, 0:8].values
ytrain = dataset.iloc[:500, 8].values
scaler = StandardScaler()
Xtrain = scaler.fit_transform(X)

Xtest=scaler.fit_transform(dataset.iloc[500:, 0:8].values)
ytest=dataset.iloc[500:, 8].values

With our data now loaded in, we can get started on constructing the classifier. We'll create a class called <code>LogisticRegression()</code> and add in some attributes that will be suitable for this type of algorithm. Some of these were mentioned above, while others will be explained in more depth later on.

In [23]:
import numpy as np
import types
import math

class LogisticRegression():
    def __init__(self, C=5.0, iterations=6000, initialization_scheme="zeroes", use_intercept=False, learning_rate=0.1, penalization_type='l2'):
        self.C = C
        self.iterations = iterations
        self.learning_rate = learning_rate
        self.initialization_scheme=initialization_scheme
        self.use_intercept=use_intercept
        self.learning_rate=learning_rate
        self.penalization_type = penalization_type
        
logistic_regression = LogisticRegression()

The attributes that we have defined for our logistic regression class are as follows:
        <br><br>
    <strong>C</strong>: Regularization parameter. Increasing values are associated with stronger regularization of the algorithm. 
        <br><br>
    <strong>iterations</strong>: The number of iterations that the training process will go through before completing.
        <br><br>
    <strong>use_intercept</strong>: When true, this adds an extra column of ones to the training dataset. When false, the original unaltered training dataset is used.
        <br><br>
    <strong>learning_rate</strong>: A small constant that is applied to the optimization algorithm in order to improve convergence.
        <br><br>
    <strong>penalization_type</strong>: Whether to use L1-penalization, L2-penalization, or no penalizations in training the logistic regression algorithm.
        <br><br>

In [None]:
def _include_intercept(self, X):
    if self.use_intercept=False
        return X
    if self.use_intercept==True:
        X = np.c_[np.ones([np.shape(X)[0], 1]), X]
        return X

The first order of business is to initialize the weights that will ultimately be optimized throughout the training process. There are a number of ways to initialize the weights, but the 

In [24]:
def _initialize_weights(self, X, initialization_scheme):
    Xtrain = self.include_intercept(X)
    if self.initialization_scheme=="zeroes":
        self.weights = np.zeros(np.shape(Xtrain)[1]) 
    elif self.initialization_scheme=="random":
        self.weights = np.random.uniform(-1 * 0.9, 0.9, np.shape(Xtrain)[1])
    elif self.initialization_scheme=="he":
        n_features = np.shape(Xtrain)[1]
        bound = 1 / math.sqrt(n_features)
        self.weights = np.random.uniform(-1 * bound, bound, (n_features))
    else:
        print("Invalid initialization. Please select from 'zeroes', 'random', or'he'.")

        
logistic_regression._initialize_weights = types.MethodType(_initialize_weights, logistic_regression)

logistic_regression._initialize_weights(Xtrain=Xtrain, initialization_scheme="he")
print(logistic_regression.weights)

[0. 0. 0. 0. 0. 0. 0. 0.]


The next thing we must do is to modify the function above into a method that can apply the sigmoid function to our training data. In doing this, the weights that we generated above are also multiplied with the features in the training dataset via a dot product. The results of this dot product are returned as a list of values.

In [25]:
def _sigmoid(self, X):
        z = np.dot(X, self.weights)
        s = 1.0 / (1 + np.exp(-z))
        return s   
logistic_regression._sigmoid = types.MethodType(_sigmoid, logistic_regression)

The third basic function that we will need will be a function that describes the cost function. A cost function is a mathematical relationship that measures the ability of a machine learning algorithm to determine the relationship between features and their labels in a dataset. A variety of cost functions exist and have specific use-cases for certain algorithms. The cost function that will be used here will be the log-likelihood cost function, which is shown below.
<br><br>
For some algorithms, cost functions have the ability to be <em>regularized.</em> That means that their values can be influenced by the addition of an extra parameter. The existence of this parameter dictates how much wrong classifications should be penalized during the training process. The most common types of regularizations implemented for a logistic regression algorithm are L

In [26]:
def _cost_function(self, X, y, penalization_type):
    m = Xtrain.shape[1]
    if penalization_type== None:
        cost = -1/m * (np.sum((y * np.log(self._sigmoid(X))) + ((1 - y) * np.log(1 - self._sigmoid(X)))))
    elif penalization_type=='l1':
        cost = -1/m * (np.sum((y * np.log(self._sigmoid(X))) + ((1 - y) * np.log(1 - self._sigmoid(X))))) + (1 / (m * self.C)) * np.sum(np.abs(self.weights))
    elif penalization_type=='l2':
        cost = -1/m * (np.sum((y * np.log(self._sigmoid(X))) + ((1 - y) * np.log(1 - self._sigmoid(X))))) + (1 / (2 *m* self.C)) * np.dot(self.weights.T, self.weights)
    else:
        print("Invalid penalization_type. Please enter 'None', 'l1', or 'l2'.")
    return cost

logistic_regression._cost_function = types.MethodType(_cost_function, logistic_regression)
print(logistic_regression._cost_function(Xtrain=Xtrain, ytrain=ytrain, penalization_type='l1'))

43.321698784996585


With these two methods, we can write our training method. The training method will accomplish the following tasks.
<br><br>
1.) It will initialize our weights using the <code>initialize_weights</code> from above.
<br><br>
2.) Our training data will be multiplied by these weights and then transformed by the sigmoid function to make a prediction.
<br><br>
3.) The differences between the sigmoid functions values and the training labels are computed to measure the overall error.

4.) An optimization algorithm called <em>gradient descent</em> will then iteratively adjust the weights based on the resulting value of the cost function in an effort to minimize it on the next iteration. Note that there are slightly different formulations of this depending on what penalization scheme is used.
<br><br>
Steps 1 - 4 then repeat until all iterations are complete.

In [27]:
def train(self, X, y):
    m = Xtrain.shape[1]
    self._initialize_weights(Xtrain, self.initialization_scheme)
    self.costs = []
    for i in range(self.iterations):
            s = self._sigmoid(Xtrain)
            errors = ytrain - s
            if self.penalization_type==None:
                delta_w = self.learning_rate * (1/m)*np.dot(errors, Xtrain)
            elif self.penalization_type=="l1":
                delta_w = self.learning_rate * (self.C * ((1/m)*np.dot(errors, Xtrain)) + np.sum(np.sign(self.weights))) 
            elif self.penalization_type=="l2":                
                delta_w = self.learning_rate * (self.C * ((1/m)*np.dot(errors, Xtrain)) + np.sum(self.weights))  
            self.iterationsPerformed = i
    self.weights += delta_w                                
                #Costs
    self.costs.append(self._cost_function(Xtrain, ytrain, self.penalization_type)) 
    return self                
     

logistic_regression.train = types.MethodType(train, logistic_regression)
logistic_regression.train(Xtrain, ytrain)
print(logistic_regression.weights)

[3.31055321 6.69281647 0.56905497 0.80892749 2.07386126 4.64155386
 2.67219163 3.10528757]


  
  


With our trained weights, we can now make predictions with them. All we have to do is apply the sigmoid function (with the adjusted weights in them) to our test set to get a sense on how it does on unseen data. Recall that the logistic regression algorithm is a probabilistic classifier, and that its outputs are probabilities that it belongs to the positive class. We can apply a quick transformation to the probabilities to come up with binary classification predictions. We say that a positive value is one that has a probability > 0.5, and all others are negative.

In [28]:
def predict(self, Xtest):
    predictions=self._sigmoid(Xtest)
    det_pred = []
    for pred in predictions:
        if pred>0.50:
            p = 1
            det_pred.append(p)
        else:
            p=0
            det_pred.append(0)
    return det_pred
    
logistic_regression.predict = types.MethodType(predict, logistic_regression)

In [58]:
def performanceEval(predictions, y_test):
       
        #Initialize
        TP, TN, FP, FN, P, N = 0, 0, 0, 0, 0, 0
        
        for idx, test_sample in enumerate(y_test):
            
            if predictions[idx] == 1 and test_sample == 1:
                TP += 1       
                P += 1
            elif predictions[idx] == 0 and test_sample == 0:                
                TN += 1
                N += 1
            elif predictions[idx] == 0 and test_sample == 1:
                FN += 1
                P += 1
            elif predictions[idx] == 1 and test_sample == 0:
                FP += 1
                N += 1
            
        accuracy = (TP + TN) / (P + N)                
        sensitivity = TP / P        
        specificity = TN / N        
        PPV = TP / (TP + FP)        
        NPV = TN / (TN + FN)        
        FNR = 1 - sensitivity        
        FPR = 1 - specificity
        
        performance = {'Accuracy': accuracy, 'Sensitivity': sensitivity,
                       'Specificity': specificity, 'Precision': PPV,
                       'NPV': NPV, 'FNR': FNR, 'FPR': FPR}   
        
        conf_matrix1 = {'TP': TP, 'FN': FN}
        conf_matrix2 = {'FP': FP, 'TN': TN}
        
        return conf_matrix1, conf_matrix2, performance

Our complete class is below. This will also be made available in the following Github repository. 

In [72]:
import numpy as np
import types
import math

class LogisticRegression():
    def __init__(self, C=5.0, iterations=6000, initialization_scheme="zeroes", learning_rate=0.1, penalization_type='l2'):
        self.C = C
        self.iterations = iterations
        self.learning_rate = learning_rate
        self.initialization_scheme=initialization_scheme
        self.penalization_type = penalization_type
    
    def _initialize_weights(self, Xtrain, initialization_scheme):
        if self.initialization_scheme=="zeroes":
            self.weights = np.zeros(np.shape(Xtrain)[1]) 
        elif self.initialization_scheme=="random":
            self.weights = np.random.uniform(-1 * 0.9, 0.9, (np.shape(Xtrain)[1]))
        elif self.initialization_scheme=="he":
            n_features = np.shape(Xtrain)[1]
            bound = 1 / math.sqrt(n_features)
            self.weights = np.random.uniform(-1 * bound, bound, (n_features))
        else:
            print("Invalid initialization. Please select from 'zeroes', 'random', or'he'.")
            
    def _sigmoid(self, Xtrain):
        z = np.dot(Xtrain, self.weights)
        s = 1.0 / (1 + np.exp(-z))
        return s  
    
    def _cost_function(self, Xtrain, ytrain, penalization_type):
        m = Xtrain.shape[1]
        if penalization_type== None:
            cost = -1/m * (np.sum((ytrain * np.log(self._sigmoid(Xtrain))) + ((1 - ytrain) * np.log(1 - self._sigmoid(Xtrain)))))
        elif penalization_type=='l1':
            cost = -1/m * (np.sum((ytrain * np.log(self._sigmoid(Xtrain))) + ((1 - ytrain) * np.log(1 - self._sigmoid(Xtrain))))) + (1 / (m * self.C)) * np.sum(np.abs(self.weights))
        elif penalization_type=='l2':
            cost = -1/m * (np.sum((ytrain * np.log(self._sigmoid(Xtrain))) + ((1 - ytrain) * np.log(1 - self._sigmoid(Xtrain))))) + (1 / (2 *m* self.C)) * np.dot(self.weights.T, self.weights)
        else:
            print("Invalid penalization_type. Please enter 'None', 'l1', or 'l2'.")
        return cost
    
    def train(self, Xtrain, ytrain):
        m = Xtrain.shape[1]
        self._initialize_weights(Xtrain, self.initialization_scheme)
        self.costs = []
        for i in range(self.iterations):
                s = self._sigmoid(Xtrain)
                errors = ytrain - s
                if self.penalization_type==None:
                    delta_w = self.learning_rate * (1/m)*np.dot(errors, Xtrain)
                elif self.penalization_type=="l1":
                    delta_w = self.learning_rate * (self.C * ((1/m)*np.dot(errors, Xtrain)) + np.sum(np.sign(self.weights))) 
                elif self.penalization_type=="l2":                
                    delta_w = self.learning_rate * (self.C * ((1/m)*np.dot(errors, Xtrain)) + np.sum(self.weights))  
                self.iterationsPerformed = i
        self.weights += delta_w                                
                #Costs
        self.costs.append(self._cost_function(Xtrain, ytrain, self.penalization_type)) 
        return self        
    
    def predict(self, Xtest):
        predictions=self._sigmoid(Xtest)
        det_pred = []
        for pred in predictions:
            if pred>0.50:
                p = 1
                det_pred.append(p)
            else:
                p=0
                det_pred.append(0)
        return det_pred

In [93]:
clf = LogisticRegression(C=10, iterations=30000, initialization_scheme="zeroes", learning_rate=1, penalization_type='l1')
clf.train(Xtrain, ytrain)
cm1, cm2, per = performanceEval(clf.predict(Xtest), ytest)
print(cm1)
print(cm2)
print(per)

from sklearn.linear_model import LogisticRegression as lr
clf2=lr(penalty='l1',C=10.0, fit_intercept=False, solver='saga', max_iter=30000)
clf2.fit(Xtrain, ytrain)
skpreds = clf2.predict(Xtest)
skcm1, skcm2, skper = performanceEval(skpreds, ytest)

print(skcm1)
print(skcm2)
print(skper)

{'TP': 70, 'FN': 16}
{'FP': 51, 'TN': 131}
{'Accuracy': 0.75, 'Sensitivity': 0.813953488372093, 'Specificity': 0.7197802197802198, 'Precision': 0.5785123966942148, 'NPV': 0.891156462585034, 'FNR': 0.18604651162790697, 'FPR': 0.2802197802197802}
{'TP': 71, 'FN': 15}
{'FP': 49, 'TN': 133}
{'Accuracy': 0.7611940298507462, 'Sensitivity': 0.8255813953488372, 'Specificity': 0.7307692307692307, 'Precision': 0.5916666666666667, 'NPV': 0.8986486486486487, 'FNR': 0.17441860465116277, 'FPR': 0.2692307692307693}


