### Multiclass Classification

This section shows how logistic regression, which is a binary classifier, can be extended to multiclass classification problems. It covers two common decomposition strategies, One-vs-All (OvA) and One-vs-One (OvO), and gives a Python implementation of each.

#### One-vs-All (OvA) Strategy

The One-vs-All (OvA) approach, also known as One-vs-Rest, decomposes a multiclass classification problem into multiple binary classification problems. For a problem with $R$ classes, we train $R$ binary classifiers, each of which distinguishes one class from the remaining $R-1$ classes.

**Training:**

1. For each class $r$ (where $r \in \{1, 2, \ldots, R\}$):
   - Create a binary label vector $y^{(r)}$ where:
     $$
     y^{(r)}_i = \begin{cases} 
     +1 & \text{if } y_i = r \\
     -1 & \text{otherwise}
     \end{cases}
     $$
   - Train a binary classifier $f_r(\mathbf{x})$ using the feature vectors $\mathbf{x}_i$ and the binary labels $y^{(r)}_i$.
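
As a small illustration of the relabeling step, the sketch below builds the $\pm 1$ label vectors $y^{(r)}$ for a six-sample label array with $R = 3$. The helper name `make_ova_labels` is hypothetical and introduced only for this illustration.

```python
import numpy as np

def make_ova_labels(y, classes):
    """Return one +1/-1 label vector per class (hypothetical helper)."""
    return {r: np.where(y == r, 1, -1) for r in classes}

y = np.array([1, 2, 2, 3, 3, 1])
ova_labels = make_ova_labels(y, classes=[1, 2, 3])

# Each sample is labelled +1 in exactly one of the R vectors.
for r, y_r in ova_labels.items():
    print(r, y_r)
```

Every sample is positive for exactly one classifier and negative for the other $R-1$, so each binary problem is imbalanced whenever $R > 2$.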

**Prediction:**

1. For a new feature vector $\mathbf{x}$:
   - Compute the real-valued output of each binary classifier $f_r(\mathbf{x})$ (a score or probability, not a thresholded $\pm 1$ label).
   - Assign $\mathbf{x}$ to the class whose classifier gives the highest output:
     $$
     \hat{y} = \arg \max_r f_r(\mathbf{x})
     $$
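
The decision rule can be sketched with made-up scores. The numbers in `scores` are purely illustrative and were not produced by any trained model; row $r-1$ holds $f_r(\mathbf{x})$ for two query points.

```python
import numpy as np

# Hypothetical classifier outputs: rows are classes 1..3, columns are query points.
scores = np.array([
    [-1.2,  0.3],   # f_1(x)
    [ 0.8, -0.5],   # f_2(x)
    [-0.4,  1.1],   # f_3(x)
])

# argmax over the class axis; +1 converts 0-based row indices to class labels 1..R.
y_hat = np.argmax(scores, axis=0) + 1
print(y_hat)  # [2 3]
```

Note that the rule still returns a class when every score is negative, that is, when no individual classifier claims the point.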

**Example:**

Consider a dataset with three classes ($R = 3$). The OvA strategy trains three binary classifiers, one per class, and predicts from their `decision_function` scores.

In [1]:
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data
X_train = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [2.0, 1.0], [2.5, 1.5], [3.0, 0.5]])
y_train = np.array([1, 2, 2, 3, 3, 1])

# Create binary labels for each class
y_train_ova = [(y_train == i).astype(int) * 2 - 1 for i in range(1, 4)]

# Train a binary classifier for each class
classifiers = []
for y in y_train_ova:
    clf = LogisticRegression()
    clf.fit(X_train, y)
    classifiers.append(clf)

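# Prediction function for OvA
def predict_ova(X, classifiers):
    # Row r holds the scores of the classifier for class r + 1, one column per sample
    predictions = np.array([clf.decision_function(X) for clf in classifiers])
    # argmax gives a 0-based classifier index; + 1 maps it back to labels 1..R
    return np.argmax(predictions, axis=0) + 1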

# Prediction for new data points
X_new = np.array([[1.2, 0.8], [2.2, 1.3]])
predictions = predict_ova(X_new, classifiers)
print(f'Predictions for the points {X_new}: {predictions}')


Predictions for the points [[1.2 0.8]
 [2.2 1.3]]: [2 3]


#### One-vs-One (OvO) Strategy

The One-vs-One (OvO) approach decomposes a multiclass classification problem into binary classification problems that each distinguish between a single pair of classes. For a problem with $R$ classes, we train $\frac{R(R-1)}{2}$ binary classifiers, one per unordered pair of classes.
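
To make the count concrete, the following sketch enumerates the unordered class pairs with `itertools.combinations` and compares the number of classifiers each strategy needs. The function name `n_classifiers` is an illustrative assumption.

```python
from itertools import combinations

def n_classifiers(R):
    """Number of binary classifiers needed by OvA and OvO for R classes."""
    pairs = list(combinations(range(1, R + 1), 2))  # unordered pairs (r, s) with r < s
    return {"ova": R, "ovo": len(pairs)}

for R in (3, 4, 10):
    print(R, n_classifiers(R))
```

For $R = 3$ both strategies need three classifiers, while for $R = 10$ OvO needs 45 against 10 for OvA. Each OvO classifier, however, is trained only on the samples of its two classes.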

**Training:**

1. For each unordered pair of classes $(r, s)$ (where $r, s \in \{1, 2, \ldots, R\}$ and $r < s$):
   - Create a subset of the training data containing only the samples from classes $r$ and $s$.
   - Train a binary classifier $f_{r,s}(\mathbf{x})$ on this subset.

**Prediction:**

1. For a new feature vector $\mathbf{x}$:
   - Compute the output of each binary classifier $f_{r,s}(\mathbf{x})$.
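   - Use a voting scheme where each classifier $f_{r,s}$ casts one vote, for either $r$ or $s$.
   - Assign $\mathbf{x}$ to the class with the most votes. Ties are possible (with $R = 3$, each of the three classifiers can vote for a different class), so a tie-breaking rule is needed.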
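
The voting step can be sketched as follows, using hand-written pairwise decisions rather than trained classifiers. The dictionary `pairwise_winner` and the tie-breaking rule (lowest class label wins) are assumptions made for this illustration.

```python
import numpy as np

R = 3
n_points = 2

# Hypothetical winners of each pairwise contest (r, s) for two query points.
pairwise_winner = {
    (1, 2): np.array([2, 1]),
    (1, 3): np.array([1, 3]),
    (2, 3): np.array([2, 3]),
}

# votes[i, c - 1] counts the votes that point i received for class c.
votes = np.zeros((n_points, R), dtype=int)
for winners in pairwise_winner.values():
    votes[np.arange(n_points), winners - 1] += 1

# argmax returns the first maximum, so ties go to the lowest class label.
y_hat = np.argmax(votes, axis=1) + 1
print(votes)
print(y_hat)  # [2 3]
```

Counting votes per class in a fixed-width table makes the tie-breaking rule explicit, which matters because a three-way tie is possible as soon as $R = 3$.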

**Example:**

Consider a dataset with three classes ($R = 3$). The OvO strategy trains $\frac{3 \cdot 2}{2} = 3$ binary classifiers, one per pair of classes.

In [2]:
from itertools import combinations
from collections import Counter

# Training data
X_train = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [2.0, 1.0], [2.5, 1.5], [3.0, 0.5]])
y_train = np.array([1, 2, 2, 3, 3, 1])

# Train a binary classifier for each pair of classes
classifiers = {}
pairs = list(combinations(np.unique(y_train), 2))
for (r, s) in pairs:
    # Create subset for classes r and s
    mask = np.logical_or(y_train == r, y_train == s)
    X_pair = X_train[mask]
    y_pair = y_train[mask]
    y_pair = (y_pair == r).astype(int) * 2 - 1
    clf = LogisticRegression()
    clf.fit(X_pair, y_pair)
    classifiers[(r, s)] = clf

# Prediction function for OvO
def predict_ovo(X, classifiers):
    # One column of votes per pairwise classifier; integer dtype keeps class labels as ints
    votes = np.zeros((X.shape[0], len(classifiers)), dtype=int)
    for i, ((r, s), clf) in enumerate(classifiers.items()):
        pred = clf.predict(X)
        # Each classifier was trained with +1 for class r and -1 for class s
        votes[:, i] = np.where(pred == 1, r, s)
    # Majority vote per sample; on a tie, Counter returns the label encountered first
    final_predictions = [Counter(vote_row).most_common(1)[0][0] for vote_row in votes]
    return np.array(final_predictions)

# Prediction for new data points
X_new = np.array([[1.2, 0.8], [2.2, 1.3]])
predictions = predict_ovo(X_new, classifiers)
print(f'Predictions for the points {X_new}: {predictions}')

Predictions for the points [[1.2 0.8]
 [2.2 1.3]]: [2 3]
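
As a cross-check, scikit-learn ships both strategies as meta-estimators in `sklearn.multiclass`. The sketch below wraps `LogisticRegression` in `OneVsRestClassifier` and `OneVsOneClassifier` on the same toy data. It is an alternative to the hand-written loops above, not a reproduction of them, so its predictions are not guaranteed to match in every case (for example, ties may be resolved differently).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X_train = np.array([[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [2.0, 1.0], [2.5, 1.5], [3.0, 0.5]])
y_train = np.array([1, 2, 2, 3, 3, 1])
X_new = np.array([[1.2, 0.8], [2.2, 1.3]])

ova = OneVsRestClassifier(LogisticRegression()).fit(X_train, y_train)
ovo = OneVsOneClassifier(LogisticRegression()).fit(X_train, y_train)

# One fitted binary estimator per class (OvA) or per class pair (OvO).
print(len(ova.estimators_), len(ovo.estimators_))  # 3 3
print(ova.predict(X_new), ovo.predict(X_new))
```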
