In [None]:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)

In [None]:
import numpy as np

# Convert features and labels to NumPy arrays and cast the labels to integers
X = np.array(mnist.data)
y = np.array(mnist.target).astype(np.uint8)

In [None]:
from sklearn.preprocessing import normalize

# Scale each sample (row) to unit L2 norm
X = normalize(X)

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# Split the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a validation set from the training data
X_train_1, X_valid, y_train_1, y_valid = train_test_split(X_train, y_train, test_size=0.2, random_state=42)

# Search for the C value with the highest validation accuracy
C_value_max = 0.0
accuracy_max = 0.0
C_value = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
for C in C_value:
  svm_cls = LinearSVC(C=C, multi_class='ovr')
  # Fit the model on the training subset
  svm_cls.fit(X_train_1, y_train_1)

  # Evaluate on the held-out validation set
  y_pred = svm_cls.predict(X_valid)
  accuracy = accuracy_score(y_valid, y_pred)

  # Record and report the result only when it improves on the best so far
  if accuracy > accuracy_max:
    accuracy_max = accuracy
    C_value_max = C
    print(f'C value: {C_value_max}')
    print(f'OvR SVM Accuracy: {accuracy:.4f}')

print(f'Best C value: {C_value_max}')

C value: 0.1
OvR SVM Accuracy: 0.9066
C value: 0.2
OvR SVM Accuracy: 0.9097
C value: 0.3
OvR SVM Accuracy: 0.9113
C value: 0.4
OvR SVM Accuracy: 0.9118
C value: 0.5
OvR SVM Accuracy: 0.9122
C value: 0.6
OvR SVM Accuracy: 0.9125
C value: 0.7
OvR SVM Accuracy: 0.9127
C value: 0.8
OvR SVM Accuracy: 0.9129
C value: 0.9
OvR SVM Accuracy: 0.9130
C value: 1.0
OvR SVM Accuracy: 0.9135
Best C value: 1.0


In [None]:
# Predict on the test data with the best regularization parameter
svm_cls = LinearSVC(C=C_value_max, multi_class='ovr')

# Fit on the training subset (the validation split is not added back in)
svm_cls.fit(X_train_1, y_train_1)
y_pred = svm_cls.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f'OvR SVM Accuracy: {accuracy:.4f}')

OvR SVM Accuracy: 0.9148


**SVM Classification**

In the One-vs-Rest (OvR) setting, a multi-class problem is broken down into multiple binary classification tasks. For a dataset with n classes, OvR constructs n binary classifiers, each responsible for distinguishing one class from all the others combined. For MNIST, n = 10, one classifier per digit.

With SVMs, each of these binary classifiers learns a hyperplane that maximizes the margin between its target class and the rest of the data points (the "rest" being all other classes grouped together). The regularization parameter C, searched over above, controls how strongly margin violations are penalized. After training, the model holds one decision boundary per class.
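The training step described above can be sketched by hand: relabel the data once per class and fit one binary `LinearSVC` each time. This is only an illustration under assumptions, not how the notebook cells above were run. It uses sklearn's small bundled digits dataset (`load_digits`) instead of MNIST to stay fast, and the names `X_small`, `binary_classifiers`, and `manual_pred` are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

# Assumption: the bundled 8x8 digits dataset stands in for MNIST to keep this quick
X_small, y_small = load_digits(return_X_y=True)
X_small = normalize(X_small)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_small, y_small, test_size=0.2, random_state=42
)

classes = np.unique(y_tr)
binary_classifiers = []
for k in classes:
    # Relabel: 1 for the target class, 0 for "the rest"
    clf = LinearSVC(C=1.0, max_iter=5000, random_state=42)
    clf.fit(X_tr, (y_tr == k).astype(int))
    binary_classifiers.append(clf)

# One column of decision scores per class, then pick the highest-scoring class
scores = np.column_stack(
    [clf.decision_function(X_te) for clf in binary_classifiers]
)
manual_pred = classes[np.argmax(scores, axis=1)]
manual_accuracy = np.mean(manual_pred == y_te)
print(f'{len(binary_classifiers)} binary classifiers, accuracy: {manual_accuracy:.4f}')
```

In practice there is no need to write this loop, since `LinearSVC` handles the multi-class case itself; the sketch only makes the "one classifier per class" structure visible.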

During inference (or prediction), each of the trained binary classifiers calculates a decision score for the new sample. The score is the sample's signed distance from that classifier's hyperplane (up to scaling), so a larger value indicates stronger evidence for the target class; it is not a probability. The class whose classifier gives the highest score is assigned to the sample.

To summarize the key steps of OvR SVM classification:

1. Train n binary classifiers for n classes.
2. For a given test sample, pass the sample through all n classifiers.
3. Each classifier produces a score indicating its confidence that the sample belongs to the corresponding class.
4. The class with the highest score is chosen as the predicted class for the sample.
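The scoring and selection steps come down to one matrix product and an argmax. The sketch below uses made-up weights for three classes and two features (`W`, `b`, and `samples` are hypothetical values, not taken from the trained model) to show the arithmetic.

```python
import numpy as np

# Hypothetical parameters: one weight vector (row) and one bias per class
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.0, 0.5])

# Three hypothetical samples with two features each
samples = np.array([[2.0, 1.0],
                    [0.5, 3.0],
                    [-2.0, -1.0]])

# Each column holds the decision scores of one binary classifier
scores = samples @ W.T + b

# The predicted class is the index of the highest score in each row
predicted = np.argmax(scores, axis=1)
print(scores)
print(predicted)  # [0 1 2]
```

For the first sample the scores are 2.0, 1.0 and -2.5, so class 0 wins.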

With `multi_class='ovr'`, sklearn's LinearSVC fits a separate linear decision boundary for each class, and the prediction for a test sample is the class whose classifier outputs the highest decision score.
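One way to check this is to inspect a fitted model: `coef_` should have one row per class, and taking the argmax of `decision_function` should reproduce `predict`. The sketch below assumes the small bundled digits dataset rather than MNIST, and relies on `'ovr'` being the default for `multi_class`; `model` and `argmax_pred` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import LinearSVC

# Assumption: bundled 8x8 digits data, pixel values scaled to [0, 1]
X_digits, y_digits = load_digits(return_X_y=True)
X_digits = X_digits / 16.0

# multi_class defaults to 'ovr', so it is not passed explicitly here
model = LinearSVC(C=1.0, max_iter=5000, random_state=42)
model.fit(X_digits, y_digits)

# One weight vector and one intercept per class
print(model.coef_.shape)
print(model.intercept_.shape)

# Picking the highest decision score by hand should match predict()
scores = model.decision_function(X_digits)
argmax_pred = model.classes_[np.argmax(scores, axis=1)]
matches_predict = np.array_equal(argmax_pred, model.predict(X_digits))
print(matches_predict)
```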

This method is efficient for multi-class classification and works well when the classes are close to linearly separable; on MNIST, the linear model above reaches about 91.5% test accuracy. It can struggle when the decision boundaries are more complex and non-linear, where other strategies like kernel SVMs may be more suitable.
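The limitation is easiest to see on a toy problem. The sketch below is an illustration on synthetic data, not a result for MNIST: it assumes sklearn's `make_circles` dataset (two concentric rings, which no straight line can separate) and compares a `LinearSVC` with an RBF-kernel `SVC`. The variable names are made up for this example.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

# Assumption: a synthetic two-class dataset of concentric circles
X_c, y_c = make_circles(n_samples=500, factor=0.5, noise=0.05, random_state=42)
X_c_tr, X_c_te, y_c_tr, y_c_te = train_test_split(
    X_c, y_c, test_size=0.2, random_state=42
)

# A linear boundary cannot separate the inner ring from the outer ring
linear_clf = LinearSVC(C=1.0, max_iter=5000, random_state=42)
linear_acc = linear_clf.fit(X_c_tr, y_c_tr).score(X_c_te, y_c_te)

# An RBF kernel lets the SVM learn a curved boundary
rbf_clf = SVC(kernel='rbf', C=1.0)
rbf_acc = rbf_clf.fit(X_c_tr, y_c_tr).score(X_c_te, y_c_te)

print(f'Linear SVM accuracy: {linear_acc:.4f}')
print(f'RBF SVM accuracy: {rbf_acc:.4f}')
```

Kernel SVMs scale less favorably with the number of training samples than `LinearSVC`, so on a dataset the size of MNIST the extra accuracy comes at a noticeable training cost.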