- Not all classification predictive models support multi-class classification
- Algorithms such as the Perceptron, Logistic Regression, and Support Vector Machines were designed for binary classification and do not natively support classification tasks with more than two classes
- One approach for using binary classification algorithms on multi-class classification problems is to split the multi-class dataset into multiple binary classification datasets and fit a binary classification model on each
- Two different examples of this approach are the One-vs-Rest and One-vs-One strategies
- The One-vs-Rest strategy splits a multi-class classification problem into one binary classification problem per class
- The One-vs-One strategy splits a multi-class classification problem into one binary classification problem for each pair of classes
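
To make the difference between the two splits concrete, the sketch below enumerates the binary problems each strategy would create. The four class names and the variable names `ovr_problems` and `ovo_problems` are illustrative assumptions, not part of any library.

```python
from itertools import combinations

# Hypothetical class labels, chosen only for illustration
classes = ["red", "blue", "green", "yellow"]

# One-vs-Rest: one problem per class (that class vs. all the others)
ovr_problems = [(c, [o for o in classes if o != c]) for c in classes]

# One-vs-One: one problem per unordered pair of classes
ovo_problems = list(combinations(classes, 2))

print(len(ovr_problems), "one-vs-rest problems")
print(len(ovo_problems), "one-vs-one problems")
```

With four classes this gives four one-vs-rest problems and six one-vs-one problems, and the gap widens as the number of classes grows.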

#### One-Vs-Rest for Multi-Class Classification

- One-vs-rest is a heuristic method for using binary classification algorithms for multi-class classification
- It involves splitting the multi-class dataset into multiple binary classification problems
- A binary classifier is then trained on each binary classification problem and predictions are made using the model that is the most confident
- For example, a multi-class classification problem with examples for each of the classes ‘red’, ‘blue’, and ‘green’ could be divided into three binary classification datasets as follows:
    - Binary Classification Problem 1: red vs [blue, green]
    - Binary Classification Problem 2: blue vs [red, green]
    - Binary Classification Problem 3: green vs [red, blue]
- A possible downside of this approach is that it requires one model to be created for each class. For example, three classes require three models. This could be an issue for large datasets (e.g. millions of rows), slow models (e.g. neural networks), or very large numbers of classes (e.g. hundreds of classes)
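
The relabel, train, and pick-the-most-confident procedure described above can be sketched by hand. This is a minimal illustration, assuming Logistic Regression as the binary model and its predicted probability as the confidence score. The names `models` and `scores` are hypothetical, and the sketch is not meant to reproduce scikit-learn's internals exactly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=5, n_classes=3, random_state=1)
classes = np.unique(y)

# Fit one binary model per class: label 1 for "this class", 0 for "the rest"
models = [LogisticRegression().fit(X, (y == c).astype(int)) for c in classes]

# Column k holds the probability that model k assigns to its own class
scores = np.column_stack([m.predict_proba(X)[:, 1] for m in models])

# Predict the class whose binary model is the most confident
yhat = classes[np.argmax(scores, axis=1)]
print(yhat[:10])
```

Each row of `scores` contains one confidence value per class, so the final prediction is simply the column with the largest value.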

In [3]:
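from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic 3-class dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# Ask LogisticRegression to use the one-vs-rest strategy directly.
# Note: the multi_class parameter is deprecated in recent scikit-learn releases (1.5+),
# where wrapping the model in OneVsRestClassifier is the suggested alternative.
model = LogisticRegression(multi_class='ovr')
model.fit(X, y)
# Accuracy is measured on the training data itself
yhat = model.predict(X)
print(accuracy_score(y, yhat))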

0.696


In [4]:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.multiclass import OneVsRestClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# Wrap a binary model so that one copy is fitted per class
model = LogisticRegression()
ovr = OneVsRestClassifier(model)
ovr.fit(X, y)
yhat = ovr.predict(X)
print(accuracy_score(y, yhat))

0.696


#### One-Vs-One for Multi-Class Classification

- Like one-vs-rest, one-vs-one splits a multi-class classification dataset into binary classification problems. Unlike one-vs-rest, which creates one binary dataset per class, one-vs-one creates one binary dataset for each pair of classes
- For example, a multi-class classification problem with the four classes ‘red’, ‘blue’, ‘green’, and ‘yellow’ could be divided into six binary classification datasets as follows:
    - Binary Classification Problem 1: red vs. blue
    - Binary Classification Problem 2: red vs. green
    - Binary Classification Problem 3: red vs. yellow
    - Binary Classification Problem 4: blue vs. green
    - Binary Classification Problem 5: blue vs. yellow
    - Binary Classification Problem 6: green vs. yellow
- This requires significantly more datasets, and in turn more models, than the one-vs-rest strategy: n classes produce n(n-1)/2 pairs, so four classes need six models rather than four
- Each binary classification model predicts one class label, and the class that receives the most predictions (votes) across all models is the one-vs-one prediction
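
The pairwise training and voting described above can be sketched by hand as follows. This is a simplified illustration, assuming an SVC as the binary model. The `votes` matrix and the lowest-index tie-break are choices made for this sketch, and library implementations may break ties differently.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=5, n_classes=3, random_state=1)
classes = np.unique(y)

# votes[i, k] counts how many pairwise models chose class k for row i
votes = np.zeros((len(X), len(classes)), dtype=int)

pairs = list(combinations(range(len(classes)), 2))
for a, b in pairs:
    # Train only on the rows that belong to one of the two classes
    mask = (y == classes[a]) | (y == classes[b])
    model = SVC().fit(X[mask], y[mask])
    # Every row receives exactly one vote from this pairwise model
    pred = model.predict(X)
    votes[pred == classes[a], a] += 1
    votes[pred == classes[b], b] += 1

# The class with the most votes wins; argmax resolves ties by lowest index
yhat = classes[np.argmax(votes, axis=1)]
print(len(pairs), "pairwise models")
```

Note that each pairwise model is trained on a subset of the rows, which can partly offset the cost of fitting more models.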

In [5]:
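from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# SVC trains pairwise (one-vs-one) models internally for multi-class data;
# decision_function_shape='ovo' keeps the pairwise layout of decision_function
model = SVC(decision_function_shape='ovo')
model.fit(X, y)
yhat = model.predict(X)
print(accuracy_score(y, yhat))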

0.89


In [None]:
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.multiclass import OneVsOneClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=5, n_classes=3, random_state=1)
# Wrap a binary model so that one copy is fitted per pair of classes
model = SVC()
ovo = OneVsOneClassifier(model)
ovo.fit(X, y)
yhat = ovo.predict(X)
print(accuracy_score(y, yhat))