A logistic regression model computes a weighted sum of the input features plus a bias term and passes the result through the sigmoid function to output a probability:

$\Large \hat{p} = h_{\bf{\theta}}(\bf{x}) = \sigma(\bf{x}^T\theta)$

Here $\sigma$ is the sigmoid function, which outputs a number between 0 and 1 and is defined as

$\Large \sigma(t) = \frac{1}{1 + \exp(-t)}$.
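As a quick numerical check of this definition, here is a minimal NumPy sketch. The function name `sigmoid` and the sample inputs are chosen purely for illustration.

```python
import numpy as np

def sigmoid(t):
    """Logistic function 1 / (1 + exp(-t)), applied elementwise."""
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + np.exp(-t))

# Illustrative inputs: a negative score, zero, and a positive score
values = sigmoid([-5.0, 0.0, 5.0])
print(values)  # all outputs lie strictly between 0 and 1; sigmoid(0) is 0.5
```

Note the symmetry $\sigma(-t) = 1 - \sigma(t)$, which is why the first and last outputs sum to 1.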

The model then predicts a class by thresholding this probability at 0.5:

$\Large \hat{y} = 0: \hat{p} < 0.5 \\$ 
$\Large \hat{y} = 1: \hat{p} \geq 0.5 $.

We notice that $\sigma (t) < 0.5$ when $t < 0$ and $\sigma (t) \geq 0.5$ when $t \geq 0$, so the model predicts 1 exactly when $\bf{x}^T\theta \geq 0$ and 0 otherwise.
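This equivalence can be checked directly. In the sketch below, the scores in `t` are hypothetical values standing in for $\bf{x}^T\theta$; thresholding the probability at 0.5 and testing the sign of the score give the same labels.

```python
import numpy as np

# Hypothetical scores x^T theta for five examples
t = np.array([-2.0, -0.1, 0.0, 0.1, 2.0])
p_hat = 1.0 / (1.0 + np.exp(-t))

by_probability = (p_hat >= 0.5).astype(int)  # threshold the probability
by_score = (t >= 0).astype(int)              # check the sign of the score

print(by_probability, by_score)
```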

The cost function we want to minimize is the negative log-likelihood, also known as the cross-entropy or log loss, averaged over the $m$ training examples:

$\Large J(\theta) = - \frac{1}{m} \sum_{i = 1}^{m}\left[y^{(i)}\log\hat{p}^{(i)} + \left(1 - y^{(i)}\right)\log\left(1 - \hat{p}^{(i)}\right)\right]$
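To make the cost concrete, the following sketch evaluates it on a few hand-picked predicted probabilities. The function name `log_loss`, the clipping constant `eps`, and the sample arrays are assumptions made for illustration; the clipping only guards against taking `log(0)`.

```python
import numpy as np

def log_loss(y, p_hat, eps=1e-15):
    """Average cross-entropy between labels y (0/1) and probabilities p_hat."""
    p_hat = np.clip(p_hat, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

y = np.array([1, 0, 1, 0])
confident = np.array([0.9, 0.1, 0.8, 0.2])  # probabilities that agree with y
wrong = np.array([0.1, 0.9, 0.2, 0.8])      # probabilities that contradict y

print(log_loss(y, confident))  # small loss
print(log_loss(y, wrong))      # much larger loss
```

A model that always outputs 0.5 has a loss of $\log 2$ regardless of the labels, which is a useful baseline when reading loss values.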

There is no closed-form solution for the $\theta$ that minimizes this function, but since it is convex, gradient descent can find the global minimum, provided the learning rate is not too large. The partial derivatives are

$ \Large \frac{\partial}{\partial \theta_j}J(\bf{\theta}) = \frac{1}{m}\sum_{i=1}^{m} \left(\sigma\left(\bf{\theta}^T\bf{x}^{(i)}\right) - y^{(i)}\right)x_{j}^{(i)}$
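One way to gain confidence in this formula is to compare it against a finite-difference approximation of the cost. The sketch below assumes a small random dataset (synthetic, not from the text) and uses helper names `cost` and `gradient` chosen for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cost(theta, X, y):
    """Average log loss J(theta)."""
    p = sigmoid(X @ theta)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(theta, X, y):
    """Analytic gradient: (1/m) * X^T (sigmoid(X theta) - y)."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(0)
X = np.c_[np.ones(20), rng.normal(size=(20, 2))]  # first column is the bias input
y = rng.integers(0, 2, size=20)
theta = rng.normal(size=3)

# Central finite differences, one coordinate of theta at a time
h = 1e-6
numeric = np.array([
    (cost(theta + h * e, X, y) - cost(theta - h * e, X, y)) / (2 * h)
    for e in np.eye(3)
])
max_diff = np.max(np.abs(numeric - gradient(theta, X, y)))
print(max_diff)  # should be tiny if the formula is right
```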

We can use this gradient with gradient descent to compute the weight vector.
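
For reference, stacking the partial derivatives gives the gradient in vector form, and each gradient descent step moves $\theta$ against it. Here $\bf{X}$ denotes the matrix whose rows are the training examples (with a leading 1 for the bias), $\bf{y}$ the vector of labels, and $\eta$ the learning rate; these symbols are introduced here for illustration, and $\eta$ corresponds to `lr` in the code.

$\Large \nabla_{\theta}J(\theta) = \frac{1}{m}\bf{X}^T\left(\sigma(\bf{X}\theta) - \bf{y}\right)$

$\Large \theta \leftarrow \theta - \eta\,\nabla_{\theta}J(\theta)$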

In [1]:
import numpy as np

In [59]:
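class LogisticRegression:
    """Binary logistic regression trained with batch gradient descent."""

    def __init__(self, epochs=1000, theta=None, lr=0.1):
        self.epochs = epochs  # number of gradient descent steps
        self.theta = theta    # optional initial weights (bias first); random if None
        self.lr = lr          # learning rate

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y):
        X = np.asarray(X)
        y = np.asarray(y)
        m, n = X.shape
        X_b = np.c_[np.ones((m, 1)), X]  # prepend a column of ones for the bias term
        # Start from the supplied theta if given, otherwise from random weights
        if self.theta is None:
            self.theta = np.random.randn(n + 1)
        else:
            self.theta = np.array(self.theta, dtype=float)
        for _ in range(self.epochs):
            z = X_b.dot(self.theta)
            gradient = X_b.T.dot(self.sigmoid(z) - y) / m
            self.theta -= self.lr * gradient

    def predict(self, X):
        X = np.asarray(X)
        m = X.shape[0]
        X_b = np.c_[np.ones((m, 1)), X]
        proba = self.sigmoid(X_b.dot(self.theta))
        # Predict 1 when p >= 0.5, matching the decision rule above
        return [1 if p >= 0.5 else 0 for p in proba]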

In [60]:
from sklearn.datasets import load_iris

In [61]:
iris = load_iris().data

In [69]:
pw = iris[:, 3:]  # petal width (cm), kept 2-D with shape (150, 1)
target = load_iris().target
target = (target == 2).astype(int)  # 1 for Iris-Virginica, 0 for Iris-Setosa and Iris-Versicolor

In [70]:
log_reg = LogisticRegression()

In [71]:
log_reg.fit(pw, target)

In [78]:
log_reg.predict([[1.5], [2.8]])

[0, 1]
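
With a single feature, the model predicts 1 whenever $\theta_0 + \theta_1 x \geq 0$, which for $\theta_1 > 0$ means $x \geq -\theta_0/\theta_1$. The sketch below illustrates this on hypothetical synthetic 1-D data (two Gaussian clusters, not the iris data), using the same update rule as `fit`; all variable names and constants are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: class 0 centred at 1.0, class 1 centred at 2.0
x = np.concatenate([rng.normal(1.0, 0.3, 50), rng.normal(2.0, 0.3, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])

X_b = np.c_[np.ones(len(x)), x]  # bias column plus the single feature
theta = np.zeros(2)
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X_b @ theta))
    theta -= 0.1 * X_b.T @ (p - y) / len(y)

boundary = -theta[0] / theta[1]        # feature value where p_hat = 0.5
pred = (X_b @ theta >= 0).astype(int)  # same rule as thresholding p_hat at 0.5
accuracy = np.mean(pred == y)
print(boundary, accuracy)
```

The boundary should fall between the two cluster centres, so points well below it are labelled 0 and points well above it are labelled 1.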

So we have successfully constructed a logistic regression binary classifier from scratch.