<h2 style="text-align: center; font-family: comfortaa;">Logistic Regression</h2>
<p>Logistic regression is similar to linear regression,</p>
<p>where each data point is approximated by a linear function:</p>

\begin{align}
f(w, b) = wx + b
\end{align}

<p>Where:</p>
<p style="text-align: center;"><i>w</i> is the weight or slope</p>
<p style="text-align: center;"><i>b</i> is the bias or offset</p>
<p style="text-align: center;"><i>x</i> is the input feature</p>

<p>The difference is that logistic regression adapts the linear model to classification problems: instead of predicting a continuous value, it predicts the probability of a class.</p>

<p>This is done by applying the sigmoid function to the output of the linear model.</p>

<p>As we recall, the sigmoid function is defined by:</p>
\begin{align}
\sigma(x) = \frac {1}{1+e^{-x}}
\end{align}
<p>and returns a value between 0 and 1, which can be interpreted as a probability.</p>
<img src="../images/sigmoid.png">
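
<p>As a minimal sketch of the definition above (the function name <i>sigmoid</i> and the sample inputs are chosen here for illustration), the sigmoid can be evaluated with NumPy:</p>

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: a large negative value, zero, and a large positive value
z = np.array([-5.0, 0.0, 5.0])
print(sigmoid(z))
```

<p>Large negative inputs land close to 0, large positive inputs close to 1, and an input of 0 gives exactly 0.5.</p>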

<p>Applied to the linear model, that looks like this:</p>
\begin{align}
\hat{y} = h_{\theta}(x) = \frac {1}{1+e^{-(wx + b)}}
\end{align}
<p>where h(x) is the predicted probability that x belongs to class 1.</p>
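
<p>To make this concrete, here is a small sketch for a single feature. The values <i>w = 2</i> and <i>b = -1</i> and the helper name <i>predict_proba</i> are hypothetical, picked only for illustration:</p>

```python
import numpy as np

def predict_proba(x, w, b):
    """Sigmoid applied to the linear model w*x + b."""
    z = w * x + b
    return 1.0 / (1.0 + np.exp(-z))

w, b = 2.0, -1.0               # hypothetical parameters, not fitted to any data
x = np.array([0.0, 0.5, 2.0])  # three sample inputs
p = predict_proba(x, w, b)
print(p)
```

<p>With these values the linear model gives -1, 0 and 3, so the probability is below 0.5 for the first input, exactly 0.5 for the second and above 0.5 for the third. The point where <i>wx + b = 0</i> (here <i>x = 0.5</i>) is the decision boundary.</p>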

<h3>Cost function</h3>
<p>To measure how far the predictions are from the true labels, we'll need a cost function.</p>
<p>For logistic regression, we'll define it as:</p>

\begin{align}
J(w, b) = J(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y^{(i)} \log(h_{\theta}(x^{(i)})) + (1- y^{(i)}) \log(1 - h_{\theta}(x^{(i)})) \right]
\end{align}

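<p>Where <i>J(w,b)</i> is the average cross-entropy between the predicted probabilities and the actual labels. It is close to 0 when the predictions agree with the labels and grows as they disagree.</p>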
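
<p>A minimal sketch of this cost, written as the average negative log-likelihood so that lower values are better. The function name <i>cross_entropy</i>, the clipping constant <i>eps</i> and the sample probabilities are assumptions made for this example:</p>

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy between 0/1 labels and predicted probabilities."""
    # Clip so that log(0) never occurs (eps is an illustrative safeguard)
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 0])
good = np.array([0.9, 0.1, 0.8, 0.2])  # confident and correct
bad = np.array([0.1, 0.9, 0.2, 0.8])   # confident and wrong

print(cross_entropy(y_true, good))
print(cross_entropy(y_true, bad))
```

<p>The confident, correct predictions give a much smaller cost than the confident, wrong ones, which is the behaviour we want to exploit when minimizing.</p>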

<p>Of course we want to minimize this, so to find the minimum we take the derivatives with respect to <i>w</i> and <i>b</i>:</p>

\begin{align}
J'(\theta) =
    \begin{bmatrix}
     \frac{dJ}{dw}\\
     \frac{dJ}{db}\\
    \end{bmatrix} = [...] =
    \begin{bmatrix}
     \frac{1}{N} \sum x_i(\hat{y}_i - y_i)\\
     \frac{1}{N} \sum (\hat{y}_i - y_i)\\
    \end{bmatrix}
\end{align}
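
<p>One way to gain confidence in these derivatives is to compare them with a finite-difference approximation of the cross-entropy cost. The sketch below does that on random data. All names, the random seed and the step size <i>h</i> are illustrative assumptions:</p>

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """Average binary cross-entropy of the logistic model."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradients(w, b, X, y):
    """Analytic gradients: mean of x_i * (y_hat_i - y_i) and mean of (y_hat_i - y_i)."""
    err = sigmoid(X @ w + b) - y
    dw = X.T @ err / len(y)
    db = np.mean(err)
    return dw, db

# Small random problem (illustrative data only)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.1

dw, db = gradients(w, b, X, y)

# Central finite differences, one coordinate at a time
h = 1e-6
num_dw = np.array([
    (cost(w + h * e, b, X, y) - cost(w - h * e, b, X, y)) / (2 * h)
    for e in np.eye(3)
])
num_db = (cost(w, b + h, X, y) - cost(w, b - h, X, y)) / (2 * h)

print(np.max(np.abs(dw - num_dw)), abs(db - num_db))
```

<p>Both differences should be tiny, which indicates that the analytic expressions match the slope of the cost.</p>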


<p>To update the variables <i>w</i> and <i>b</i>, we'll use:</p>
\begin{align}
w = w - \alpha \cdot dw
\end{align}
\begin{align}
b = b - \alpha \cdot db
\end{align}

<p>Where alpha is the learning rate.</p>
<p>It is important to choose a sufficiently small alpha, as a large alpha may overshoot the minimum of the cost function and never converge to it.</p>
<img src="../images/gradient.png" >
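
<p>The effect of the learning rate can be seen on a toy problem. The sketch below runs the same update rule on <i>f(w) = w<sup>2</sup></i>, whose minimum is at <i>w = 0</i>. The function name <i>gradient_descent</i> and the two alpha values are chosen for illustration and are not the ones used for logistic regression below:</p>

```python
def gradient_descent(alpha, w0=1.0, n_steps=20):
    """Minimize f(w) = w**2 with the update w = w - alpha * dw."""
    w = w0
    for _ in range(n_steps):
        dw = 2.0 * w        # derivative of w**2
        w = w - alpha * dw
    return w

small = gradient_descent(alpha=0.1)  # each step shrinks w by a factor of 0.8
large = gradient_descent(alpha=1.1)  # each step multiplies w by -1.2

print(small, large)
```

<p>With the small alpha, <i>w</i> moves steadily toward 0. With the large alpha, every step jumps past the minimum and lands farther away than before, so the iterates diverge.</p>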


<h3>Code the algorithm</h3>


In [7]:
import numpy as np

class LogisticRegression:
    
    def __init__(self,lr=0.001,n_iters=1000):
        #initialize and store variables
        #learning rate
        self.lr = lr
        
        #number of iterations
        self.n_iters = n_iters
        
        #initiate weights and bias as None
        self.weights = None
        self.bias = None
        
    
    def fit(self,X,y):
        #X is an np.array of shape (n, m), where n is the number of samples and m is the number of features
        #y is a 1-D np.array of length n holding the 0/1 class label of each sample
        
        #initialize parameters
        n_samples, n_features = X.shape #unpack the first dimension into n_samples and the second into n_features
        self.weights = np.zeros(n_features)
        self.bias = 0
        
        #gradient descent (iterative)
        for _ in range(self.n_iters):
            linear_model = np.dot(X,self.weights) + self.bias
            
            #approximation of y
            y_predicted = self._sigmoid(linear_model)
            
            #calculate derivatives
            dw = (1/n_samples)*np.dot(X.T,(y_predicted - y))
            db = (1/n_samples)*np.sum(y_predicted - y)
            
            #update variables
            self.weights -= self.lr*dw
            self.bias -= self.lr*db
    
    def predict(self,X):
        #X is new test samples
        linear_model = np.dot(X,self.weights) + self.bias
            
        #approximation of y
        y_predicted = self._sigmoid(linear_model)
        #label a sample as class 1 when its predicted probability exceeds 0.5
        y_predicted_cls = np.array([1 if i > 0.5 else 0 for i in y_predicted])
        return y_predicted_cls
    
    def _sigmoid(self,x):
        #helper method that returns the sigmoid
        return 1/(1+np.exp(-x))

<h3>See it in action</h3>

In [10]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

bc = datasets.load_breast_cancer()
X,y = bc.data, bc.target

#split data set into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.2,random_state=1234)

def accuracy(y_true,y_pred):
    #fraction of predictions that match the true labels
    return np.sum(y_true == y_pred)/len(y_true)

regressor = LogisticRegression(lr=0.0001,n_iters=1000)
regressor.fit(X_train,y_train)
predictions = regressor.predict(X_test)

print("accuracy: ",accuracy(y_test, predictions))

accuracy:  0.9298245614035088
