## Implementation

The code below implements a `LogisticRegression` classifier trained with gradient descent.

## Computing Gradients

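We use the binary cross-entropy loss (also called log loss) $-(y\log{f(x)} + (1 - y)\log{(1 - f(x))})$, where $y$ is the label of a single sample.

Here $f(x) = h(xW + b)$, where $h(z) = \dfrac{1}{1 + e^{-z}}$ is the sigmoid function.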
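As a quick numeric illustration of the loss above, the sketch below evaluates it on two made-up samples, taking $f(x)$ to be the sigmoid of the linear score $xW + b$ (the form the derivative calculation in this section works with). The values of `x`, `W`, `b`, and `y` and the helper names `sigmoid` and `log_loss` are illustrative assumptions, not part of the implementation.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    """Plain sigmoid; adequate for the small inputs used in this sketch."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Per-sample binary cross-entropy -(y log p + (1 - y) log(1 - p))."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


# Hypothetical toy values: two samples with two features each.
x = np.array([[1.0, 2.0], [-1.0, 0.5]])
W = np.array([0.5, -0.25])
b = 0.1
y = np.array([1.0, 0.0])

p = sigmoid(x @ W + b)   # predicted probabilities f(x)
losses = log_loss(y, p)  # one loss value per sample

print("probabilities:", p)
print("per-sample loss:", losses)
print("mean loss:", losses.mean())
```

The loss is small when the predicted probability is close to the label and grows without bound as the prediction approaches the wrong extreme.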

We now calculate the partial derivatives of the loss function with respect to the weights $W$ and the bias $b$.

Let us first calculate the derivative of the loss for a single sample w.r.t. $W$.

Recall that $h' = \Big(\dfrac{1}{1 + e^{-x}}\Big)' = h(1 - h)$.
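The sigmoid identity can be sanity-checked numerically. The sketch below compares a central finite difference against $h(1 - h)$ at a few points; the sample points `z`, the step `eps`, and the helper name `sigmoid` are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z: np.ndarray) -> np.ndarray:
    """Plain sigmoid; adequate for the moderate inputs used in this sketch."""
    return 1.0 / (1.0 + np.exp(-z))


z = np.linspace(-4.0, 4.0, 9)  # hypothetical sample points -4, -3, ..., 4
eps = 1e-6                     # finite-difference step (assumed)

# A central finite difference approximates h'(z).
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

# The closed form from the identity h' = h(1 - h).
analytic = sigmoid(z) * (1 - sigmoid(z))

max_abs_error = np.max(np.abs(numeric - analytic))
print("max |numeric - analytic|:", max_abs_error)
```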

Since $y$ is fixed, we can differentiate the loss directly, using the chain rule for $h = h(xW + b)$:

\begin{align}
    (-(y\log{h} + (1 - y)\log{(1 - h)}))' &= -\frac{y}{h} \cdot h' - \frac{1 - y}{1 - h} \cdot (-h')\\
                                          &= -h' \cdot \frac{y(1 - h) - (1 - y)h}{h(1 - h)}\\
                                          &= -(xW + b)' \cdot h(1 - h) \cdot \frac{y(1 - h) - (1 - y)h}{h(1 - h)}\\
                                          &= -(xW + b)' \cdot (y - h)\\
                                          &= (xW + b)' \cdot (h - y)
\end{align}

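Since $(xW + b)'$ equals $x$ w.r.t. $W$ and $1$ w.r.t. $b$, averaging over the $n$ training samples gives the derivative w.r.t. $W$: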

\begin{equation}
    \frac{1}{n} \sum_{i=1}^{n} x_i^\top (f(x_i) - y_i)
\end{equation}

and the derivative w.r.t. $b$ is

\begin{equation}
    \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)
\end{equation}
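One way to gain confidence in these two formulas is a gradient check: compare them with central finite differences of the mean log loss. The sketch below does this on small synthetic data; the data, the seed, the step `eps`, and the helper names `mean_log_loss` and `analytic_gradients` are assumptions made for illustration and are separate from the class defined in this section.

```python
import numpy as np


def sigmoid(z):
    """Plain sigmoid; adequate for the moderate logits in this sketch."""
    return 1.0 / (1.0 + np.exp(-z))


def mean_log_loss(W, b, x, y):
    """Mean binary cross-entropy over all samples."""
    p = sigmoid(x @ W + b)
    return np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p)))


def analytic_gradients(W, b, x, y):
    """Gradients from the formulas above: mean of x_i^T (f(x_i) - y_i) and of (f(x_i) - y_i)."""
    diff = sigmoid(x @ W + b) - y
    return x.T @ diff / len(y), diff.mean()


# Hypothetical synthetic problem: 20 samples, 3 features, random binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
W = rng.normal(size=3)
b = 0.3
eps = 1e-6

d_W, d_b = analytic_gradients(W, b, x, y)

# Central finite differences, perturbing one parameter at a time.
num_d_W = np.zeros_like(W)
for j in range(W.size):
    step = np.zeros_like(W)
    step[j] = eps
    loss_plus = mean_log_loss(W + step, b, x, y)
    loss_minus = mean_log_loss(W - step, b, x, y)
    num_d_W[j] = (loss_plus - loss_minus) / (2 * eps)

num_d_b = (mean_log_loss(W, b + eps, x, y) - mean_log_loss(W, b - eps, x, y)) / (2 * eps)

print("max weight gradient error:", np.max(np.abs(num_d_W - d_W)))
print("bias gradient error:", abs(num_d_b - d_b))
```

If the analytic gradients had the wrong sign or a missing factor, the two errors would be on the order of the gradients themselves rather than near zero.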

In [1]:
import numpy as np

from dataclasses import dataclass


@dataclass
class LogisticRegression:
    features: np.ndarray
    labels: np.ndarray
    learning_rate: float
    epochs: int
    threshold: float
    logging: bool

    def __post_init__(self) -> None:
        """Initializes additional variables for the Logistic Regression model."""

        self.num_samples, num_features = self.features.shape
        self.weights, self.bias = np.zeros(num_features), 0

    def sigmoid(self, logits: np.ndarray) -> np.ndarray:
        """A numerically stable implementation of the Sigmoid activation function."""

        # exp(-|logits|) never overflows, so neither np.where branch produces inf or nan.
        exp_term = np.exp(-np.abs(logits))
        return np.where(logits < 0, exp_term / (1 + exp_term), 1 / (1 + exp_term))

    def mean_log_loss(self, predictions: np.ndarray) -> np.float64:
        """Computes a mean Cross Entropy Loss (in binary classification, also called Log Loss)."""

        # Clip predictions away from 0 and 1 so that np.log never receives 0.
        predictions = np.clip(predictions, 1e-15, 1 - 1e-15)
        return np.mean(
            -self.labels * np.log(predictions) - (1 - self.labels) * np.log(1 - predictions)
        )

    def fit(self) -> None:
        """Fits a Logistic Regression model."""

        for _ in range(self.epochs):
            prediction = self.sigmoid(self.features.dot(self.weights) + self.bias)
            difference = prediction - self.labels

            d_weights = difference.dot(self.features) / self.num_samples
            d_bias = difference.sum() / self.num_samples

            self.weights -= self.learning_rate * d_weights
            self.bias -= self.learning_rate * d_bias

            if self.logging:
                print(f"Log Loss: {self.mean_log_loss(prediction):.3f}")

    def predict(self, features: np.ndarray) -> np.ndarray:
        """Performs inference using the given features."""

        return np.where(
            self.sigmoid(np.dot(features, self.weights) + self.bias) < self.threshold, 0, 1
        )

## Load Data

We load the breast cancer dataset from scikit-learn, split it into training and test sets, and fit the model on the training set.

In [2]:
import matplotlib.pyplot as plt

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split


plt.style.use("bmh")

# Prepare the data
data = load_breast_cancer()
features = data.data
labels = data.target

# Train/test split
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.33, random_state=79
)

lr = LogisticRegression(
    train_features, train_labels, learning_rate=1e-5, epochs=30_000, threshold=0.5, logging=False
)
lr.fit()

## Report Model Statistics

$\text{Accuracy} = \dfrac{\text{Correct Predictions}}{\text{Total Predictions}}$

$\text{Precision} = \dfrac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}$

$\text{Recall} = \dfrac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$

*Precision* drops as false positives (**Type I** errors) increase, while *Recall* drops as false negatives (**Type II** errors) increase.

In [3]:
predictions = lr.predict(test_features)
labels = test_labels

tp = ((predictions == 1) & (labels == 1)).sum()
fp = ((predictions == 1) & (labels == 0)).sum()
fn = ((predictions == 0) & (labels == 1)).sum()

accuracy = (predictions == labels).mean()
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"Accuracy:  {100 * accuracy:0.3f}%")
print(f"Precision: {precision:0.3f}")
print(f"Recall:    {recall:0.3f}")

Accuracy:  94.681%
Precision: 0.946
Recall:    0.976
