# Logistic Regression, Logistic Loss


In [None]:
%pip install ipympl

## Libraries

In [None]:
import numpy as np
%matplotlib widget
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

## Logistic Regression

Logistic regression is a statistical method for modeling an outcome that depends on one or more independent variables. The outcome is dichotomous: it takes only two possible values (1 / 0, Yes / No, True / False), and the model predicts it from the independent variables. The two outcome values are encoded as a dummy variable (0 or 1).

- Sigmoid Function: Logistic regression is built on the sigmoid function (also called the logistic function). It maps any real number to a value between 0 and 1, which makes it suitable for estimating probabilities.
- Decision Boundary: Logistic regression produces a decision boundary that separates the classes. For a 2D dataset, this boundary is a straight line.
- Estimation: Logistic regression estimates the probability that a given instance belongs to a particular category.
- Multiclass Classification: Though inherently binary, logistic regression can be adapted for multiclass classification using techniques like "One vs All" (OvA) or "Softmax Regression" for mutually exclusive classes.
- Cost Function: The cost function used in logistic regression is log loss.
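
The sigmoid, decision threshold, and softmax ideas from the list above can be sketched in a few lines of NumPy. This is a minimal illustration, not how scikit-learn implements them: the helper names `sigmoid` and `softmax` and the values of `w`, `b`, and `x` are assumptions chosen only for the example.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(scores):
    """Turn a vector of class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical weights and bias for a two-feature model (illustrative values only)
w = np.array([2.0, -1.0])
b = -0.5

x = np.array([1.0, 0.5])
z = w @ x + b          # linear score: 2*1.0 - 1*0.5 - 0.5 = 1.0
p = sigmoid(z)         # estimated probability of class 1
label = int(p >= 0.5)  # p >= 0.5 is the same as z >= 0, i.e. one side of the boundary

print(f"score={z:.1f}, probability={p:.3f}, predicted class={label}")

# Softmax generalizes the sigmoid to several mutually exclusive classes
class_probs = softmax(np.array([2.0, 1.0, 0.1]))
print("softmax probabilities:", class_probs.round(3))
```

A score of exactly 0 gives a probability of 0.5, which is why the decision boundary is the set of points where the linear score vanishes.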

Now, let's illustrate Logistic Regression with a Python example using the Iris dataset:

In [None]:
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # We take only the first two features for simplicity
y = (iris.target != 0) * 1  # Convert the target to binary

# Instantiate and fit a Logistic Regression model
logreg = LogisticRegression(C=1e5, solver='lbfgs')
logreg.fit(X, y)

# Plotting the decision boundary
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
h = .02  # Step size in the mesh
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])

Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(8, 6))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot the training points
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('Logistic Regression on Iris Dataset')
plt.show()
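
To connect the plot back to the ideas above, the sketch below refits the same kind of model and reads the estimated probabilities and the boundary line directly from it. It assumes the same two sepal features and binary target as the cell above; `max_iter=1000` and the sepal length `5.5` are choices made for this example only.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris.data[:, :2]                 # sepal length, sepal width
y = (iris.target != 0).astype(int)   # setosa -> 0, other species -> 1

# max_iter is raised here as a precaution against convergence warnings
model = LogisticRegression(C=1e5, solver="lbfgs", max_iter=1000)
model.fit(X, y)

# Estimated probabilities for the first three samples; columns are P(y=0), P(y=1)
proba = model.predict_proba(X[:3])
print(proba.round(3))

# The decision boundary is the line w1*x1 + w2*x2 + b = 0
w1, w2 = model.coef_[0]
b = model.intercept_[0]

# Solve for the sepal width that lies on the boundary at a chosen sepal length
x1 = 5.5
x2 = -(w1 * x1 + b) / w2
p_on_boundary = model.predict_proba([[x1, x2]])[0, 1]
print(f"P(y=1) on the boundary: {p_on_boundary:.3f}")
```

Any point on that line gets a predicted probability of 0.5, so `predict` switches class exactly where the colored regions meet in the plot.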


## Logistic Loss Function (Log Loss)

Logistic loss, also known as log loss or cross-entropy loss, is commonly used in classification problems. It quantifies how far the predicted probabilities are from the actual class labels: the lower the probability assigned to the true class, the larger the loss.

For a binary classification:

- **Predicted Probability**: Let \( \hat{y} \) be the predicted probability of an instance belonging to class 1.
  
- **Actual Label**: Let \( y \) be the actual label of the instance, where \( y \in \{0,1\} \).

The logistic loss for that instance is given by:

\[ \mathcal{L}(\hat{y}, y) = - y \log(\hat{y}) - (1-y) \log(1-\hat{y}) \]

### Intuition:

- **When \( y = 1 \)**: The loss is \( -\log(\hat{y}) \). As \( \hat{y} \) approaches 1, the loss goes to 0. Conversely, as \( \hat{y} \) approaches 0, the loss goes to infinity.
  
- **When \( y = 0 \)**: The loss is \( -\log(1-\hat{y}) \). As \( \hat{y} \) approaches 0, the loss goes to 0. Conversely, as \( \hat{y} \) approaches 1, the loss goes to infinity.

In other words, the more confident an incorrect prediction is, the higher its loss.
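
As a quick worked example (using the natural logarithm, with illustrative probabilities of 0.9 and 0.1), take an instance whose actual label is \( y = 1 \). A confident correct prediction costs almost nothing, while a confident incorrect one is penalized heavily:

\[ \mathcal{L}(0.9, 1) = -\log(0.9) \approx 0.105 \qquad \mathcal{L}(0.1, 1) = -\log(0.1) \approx 2.303 \]

By symmetry, for \( y = 0 \) the values swap: \( \mathcal{L}(0.1, 0) \approx 0.105 \) and \( \mathcal{L}(0.9, 0) \approx 2.303 \).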

In [None]:
def logistic_loss(y_true, y_pred):
    """
    Computes the logistic loss.
    
    Args:
        y_true (numpy array): Array of true binary labels.
        y_pred (numpy array): Array of predicted probabilities.
    
    Returns:
        numpy array: Array of logistic loss values for each prediction.
    """
    # Ensure no value is exactly 0 or 1 for stability
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    
    loss = - y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred)
    return loss

# Example usage:
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.9, 0.7, 0.3])

print("Logistic Loss:", logistic_loss(y_true, y_pred))

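# Visualizing the logistic loss for varying predicted probabilities
# Start a new figure so the curves are not drawn onto the decision-boundary plot
plt.figure(2, figsize=(8, 6))
# Stay just inside (0, 1): at the exact endpoints the clipped loss (about 34.5) would dwarf the rest of the curve
y_pred_range = np.linspace(0.001, 0.999, 200)
plt.plot(y_pred_range, logistic_loss(np.ones_like(y_pred_range), y_pred_range), label="Actual: 1")
plt.plot(y_pred_range, logistic_loss(np.zeros_like(y_pred_range), y_pred_range), label="Actual: 0")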
plt.title("Logistic Loss vs. Predicted Probability")
plt.xlabel("Predicted Probability")
plt.ylabel("Loss")
plt.legend()
plt.show()
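
When log loss serves as a cost function, the per-instance losses are usually averaged into a single number. The sketch below shows this for the four example predictions used earlier. The formula is recomputed inline so the block runs on its own, and the names `per_sample` and `cost` are chosen only for this illustration.

```python
import numpy as np

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.9, 0.7, 0.3])

# Same clipping idea as in logistic_loss: keep probabilities away from 0 and 1
eps = 1e-15
p = np.clip(y_pred, eps, 1 - eps)

# Per-instance logistic loss, then the mean over all instances
per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
cost = per_sample.mean()

print("Per-instance loss:", per_sample.round(3))  # [0.105 2.303 0.357 1.204]
print("Mean log loss:", round(float(cost), 3))    # 0.992
```

The second instance (actual label 0, predicted probability 0.9) contributes more than half of the total, which shows how a single confident mistake can dominate the average.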