# MNIST Dataset Using SVMs

## 0. [Importing needed libraries](#0)
## 1. [Getting the data ready](#1)
## 2. [Using SVM Model](#2)

## 0. Importing needed libraries <a name="0"></a>

In [243]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
from sklearn.preprocessing import StandardScaler, Normalizer
from sklearn.datasets import fetch_openml
from cvxopt import matrix, solvers
from sklearn.metrics import accuracy_score, classification_report


## 1. Getting the data ready <a name="1"></a>
* ### [importing the data](#1.1)
* ### [Data Preprocessing](#1.2)
* ### [splitting the data](#1.3)

### importing the data <a name="1.1"></a>

In [244]:
mnist = fetch_openml('mnist_784',version=1)
X,y = mnist.data,mnist.target
y = y.astype(int)
y.head()

Unnamed: 0,class
0,5
1,0
2,4
3,1
4,9


In [245]:
X.shape,y.shape

((70000, 784), (70000,))

### Data Preprocessing <a name="1.2"></a>

In [246]:
# Drop pixel columns that are zero in every image; they carry no information.
# Building a new DataFrame avoids the SettingWithCopyWarning that pandas raises
# when columns are dropped in place, one by one, inside a loop.
X = X.loc[:, (X != 0).any(axis=0)]

In [247]:
X.shape,y.shape

((70000, 719), (70000,))

In [248]:
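# Normalizer rescales each sample (row) to unit L2 norm.
# Despite the variable names, this is not per-feature standardization (StandardScaler).
scaler = Normalizer()
scaler.fit(X)
X_standard = scaler.transform(X)
X_standard.shape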

(70000, 719)
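
Because the variable is named `X_standard` while the transformer is a `Normalizer`, it may help to see the two operations side by side. The following is a minimal NumPy-only sketch on a made-up 2x3 array (the array and the names `X_rows` and `X_cols` are illustrative, not from the notebook): row-wise L2 normalization, which is what `Normalizer` does with its default settings, versus column-wise standardization, which is what `StandardScaler` does.

```python
import numpy as np

# Toy data, chosen only for illustration
X_toy = np.array([[3.0, 4.0, 0.0],
                  [1.0, 0.0, 0.0]])

# Row-wise L2 normalization: every sample ends up with unit length
row_norms = np.linalg.norm(X_toy, axis=1, keepdims=True)
X_rows = X_toy / row_norms

# Column-wise standardization: every feature gets zero mean and unit variance
std = X_toy.std(axis=0)
std[std == 0] = 1.0                      # leave constant columns unscaled
X_cols = (X_toy - X_toy.mean(axis=0)) / std

print(X_rows)   # rows [0.6, 0.8, 0.0] and [1.0, 0.0, 0.0]
print(X_cols)   # columns 0 and 1 become [1, -1]; column 2 stays 0
```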

### splitting the data <a name="1.3"></a>

In [249]:
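# Keep 2% of the data for training and hold out the remaining 98%
X_train,xx,y_train,yy = train_test_split(X_standard,y,test_size=.98,random_state=0)
# Split the held-out part: 10% becomes the test set, 90% the validation set
X_test,X_val,y_test,y_val = train_test_split(xx,yy,test_size=.9,random_state=0)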

In [250]:
X_train.shape,X_test.shape,X_val.shape,y_train.shape,y_test.shape,y_val.shape

((1400, 719), (6860, 719), (61740, 719), (1400,), (6860,), (61740,))
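
The shapes above follow from the two `test_size` fractions. As a quick arithmetic check (a sketch only; the variable names are illustrative), the sizes can be reproduced without touching the data. Note that the training set is just 2% of MNIST: the dense QP matrices built later grow quadratically with the number of training rows, which is presumably why it is kept this small.

```python
n_total = 70000

n_rest = round(n_total * 0.98)   # first split: test_size=.98 is held out
n_train = n_total - n_rest

n_val = round(n_rest * 0.9)      # second split: test_size=.9 of the held-out part
n_test = n_rest - n_val

print(n_train, n_test, n_val)    # 1400 6860 61740
```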

## 2. Using SVM Model <a name="2"></a>
* ### [Making Support Vector Training function](#2.1)
  * `SVT` passes the dual quadratic program (the objective in alpha and its parameters) together with the SVM constraints to the cvxopt quadratic programming solver
* ### [SVM training function for every category](#2.3)
  * calculate alpha, w and b for every category, to be used for prediction
* ### [formulating the classifier function](#2.4)
* ### [training data with different C values](#2.5)
  * train on the training set with different C values and evaluate each on the validation set
  * then choose the C with the highest validation accuracy and evaluate it on the test set to estimate the model's actual accuracy
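
For reference, the quadratic program that `SVT` hands to `solvers.qp` can be written out as follows. This is a reconstruction from the code in the next section (the standard soft-margin dual with a linear kernel), using the same names `P`, `q`, `G`, `h`, `A`, `b` as the code; here $m$ is the number of training samples and $y_i \in \{-1, +1\}$.

```latex
\begin{aligned}
\min_{\alpha \in \mathbb{R}^m}\quad & \tfrac{1}{2}\,\alpha^\top P\,\alpha + q^\top \alpha
  && P_{ij} = y_i\, y_j\, x_i^\top x_j,\quad q = -\mathbf{1} \\
\text{s.t.}\quad & G\alpha \preceq h
  && G = \begin{bmatrix} -I \\ I \end{bmatrix},\quad
     h = \begin{bmatrix} \mathbf{0} \\ C\,\mathbf{1} \end{bmatrix}
     \quad (0 \le \alpha_i \le C) \\
 & A\alpha = b
  && A = y^\top,\quad b = 0
     \quad \Big(\textstyle\sum_i \alpha_i y_i = 0\Big) \\[4pt]
w &= \sum_{i=1}^{m} \alpha_i\, y_i\, x_i,
  && b_{\text{bias}} = \operatorname{mean}_{\,0 < \alpha_i < C}\big(y_i - w^\top x_i\big)
\end{aligned}
```

The code reuses the name `b` for both the equality-constraint right-hand side and the bias; the bias is written $b_{\text{bias}}$ here only to keep the two apart.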


### Making Support Vector Training function <a name="2.1"></a>

In [251]:
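def SVT(X_train, y_train, P, C=1):
    """Solve the soft-margin SVM dual for labels y_train in {-1, +1}, shape (m, 1).

    P is the precomputed Gram matrix X_train @ X_train.T. Returns (alpha, w, b).
    """
    # cvxopt solver tolerances and iteration cap
    solvers.options['abstol'] = 1e-9
    solvers.options['reltol'] = 1e-6
    solvers.options['feastol'] = 1e-8
    solvers.options['maxiters'] = 1000
    solvers.options['show_progress'] = False

    m, n = X_train.shape
    # Objective: minimize 1/2 a^T P a + q^T a, with P_ij = y_i y_j <x_i, x_j>
    P = matrix((y_train @ y_train.T) * P, tc='d')
    q = matrix(-np.ones((m, 1)))
    # Equality constraint: sum_i alpha_i y_i = 0
    A = matrix(y_train.reshape(1, -1), tc='d')
    b = matrix(np.zeros(1))
    # Box constraint 0 <= alpha_i <= C, written as G a <= h
    G = matrix(np.vstack((-np.eye(m), np.eye(m))))
    h = matrix(np.vstack((np.zeros((m, 1)), np.ones((m, 1)) * C)))

    alpha = solvers.qp(P, q, G, h, A, b)['x']
    alpha = np.array(alpha).reshape(-1, 1)

    # Primal weights: w = sum_i alpha_i y_i x_i
    w = np.sum(alpha * y_train * X_train, axis=0)
    # Bias: average of y_i - w.x_i over the points with 0 < alpha_i < C.
    # Labels are flattened so the subtraction is element-wise, not broadcast to (k, k).
    support_mask = ((alpha > 0) & (alpha < C)).flatten()
    b = np.mean(y_train[support_mask].flatten() - X_train[support_mask] @ w)
    return alpha.flatten(), w, b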
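
One caveat about the support-vector mask: numerical QP solvers typically return multipliers that are tiny but not exactly zero, so a strict `alpha > 0` test may admit almost every point. A common remedy is to compare against a small tolerance. The sketch below shows that variant on a two-point toy problem whose dual solution is known in closed form (`alpha = [0.5, 0.5]`); the helper name `recover_w_b` and the value `tol=1e-6` are assumptions made for illustration, not part of the notebook.

```python
import numpy as np

def recover_w_b(alpha, X, y, C, tol=1e-6):
    """Recover (w, b) from dual variables, using a tolerance for the margin test.

    `tol` is an illustrative choice, not a value taken from the notebook.
    """
    alpha = np.asarray(alpha, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    w = (alpha * y) @ X                            # w = sum_i alpha_i y_i x_i
    on_margin = (alpha > tol) & (alpha < C - tol)  # strictly inside the box
    if not on_margin.any():                        # fall back to all support vectors
        on_margin = alpha > tol
    b = np.mean(y[on_margin] - X[on_margin] @ w)
    return w, b

# Toy problem: two points on the x-axis, one per class
X_toy = np.array([[1.0, 0.0], [-1.0, 0.0]])
y_toy = np.array([1, -1])
alpha_toy = np.array([0.5, 0.5])                   # satisfies sum(alpha * y) == 0

w_toy, b_toy = recover_w_b(alpha_toy, X_toy, y_toy, C=1)
print(w_toy, b_toy)                                # w = [1, 0], b = 0
```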


### SVM training function for every category <a name="2.3"></a>

In [252]:
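def train_SVM(X_train, y_train, C=1):
    """Train one binary SVM per digit (one-vs-rest) and stack the results column-wise."""
    m, n = X_train.shape
    W = np.zeros((n,10))
    alphas = np.zeros((m,10))
    bs = np.zeros(10)
    # Linear-kernel Gram matrix, computed once and shared by all ten problems
    P = X_train @ X_train.T
    for i in range(10):
        # Relabel: +1 for digit i, -1 for every other digit
        yi = np.where(y_train == i,1,-1).reshape(-1,1)

        alpha, w, b = SVT(X_train, yi, P, C)
        alphas[:,i] = alpha
        W[:,i] = w
        bs[i] = b
        print("operation",i,"done")

    return alphas, W, bs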
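
The relabeling step inside the loop is the core of the one-vs-rest scheme. As a small illustration (a sketch; `Y_ovr` is a name introduced here), all ten binary target vectors can be built at once with broadcasting, using the first five MNIST labels shown earlier:

```python
import numpy as np

y_first = np.array([5, 0, 4, 1, 9])      # first five labels from y.head()
digits = np.arange(10)

# Column i holds the +1/-1 targets for the "digit i vs. rest" problem
Y_ovr = np.where(y_first[:, None] == digits[None, :], 1, -1)

print(Y_ovr.shape)     # (5, 10)
print(Y_ovr[:, 5])     # [ 1 -1 -1 -1 -1]
```

Each row contains exactly one `+1`, in the column of the sample's true digit.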

### formulating the classifier function <a name="2.4"></a>

In [253]:
def classifier(X, W, bs):
    return np.argmax(X @ W + bs,axis=1)
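
The classifier scores each sample against all ten hyperplanes and picks the class with the largest score. A tiny worked example may make the shapes clearer; the weights, biases and inputs below are hypothetical values for a 2-feature, 3-class problem, chosen only for illustration.

```python
import numpy as np

# Hypothetical weights (one column per class) and biases
W_toy = np.array([[1.0, -1.0, 0.0],
                  [0.0,  0.0, 1.0]])
bs_toy = np.array([0.0, 0.0, -0.5])

X_toy = np.array([[ 2.0, 0.0],
                  [-2.0, 0.0],
                  [ 0.0, 3.0]])

scores = X_toy @ W_toy + bs_toy      # shape (n_samples, n_classes)
pred = np.argmax(scores, axis=1)     # index of the highest-scoring class per row

print(scores)
print(pred)                          # [0 1 2]
```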

### training data with different C values <a name="2.5"></a>

In [254]:
C = [2, 3, 5, 7, 8, 9]
acc = []
for c in C:
    alphas, W, bs = train_SVM(X_train, y_train, c)
    y_predict = classifier(X_val,W,bs)
    acc.append(accuracy_score(y_val, y_predict))
    print(f"c: {c} operation done")

operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
c: 2 operation done
operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
c: 3 operation done
operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
c: 5 operation done
operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
c: 7 operation done
operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
c: 8 operation done
operation 0 done
operation 1 done
operation 2 done

In [255]:
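# Pick the C with the highest validation accuracy (printed as "max. C"),
# retrain with it, and report the results on the held-out test set.
c = C[np.argmax(acc)]
print(f"max. C = {c}")
alphas, W, bs = train_SVM(X_train, y_train, c)
y_predict = classifier(X_test,W,bs)
print(classification_report(y_test, y_predict))
print(accuracy_score(y_test, y_predict))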

max. C = 5
operation 0 done
operation 1 done
operation 2 done
operation 3 done
operation 4 done
operation 5 done
operation 6 done
operation 7 done
operation 8 done
operation 9 done
              precision    recall  f1-score   support

           0       0.89      0.97      0.92       716
           1       0.87      0.98      0.92       736
           2       0.89      0.81      0.85       711
           3       0.84      0.85      0.84       697
           4       0.80      0.88      0.84       644
           5       0.89      0.75      0.81       605
           6       0.86      0.96      0.91       687
           7       0.93      0.85      0.89       649
           8       0.89      0.74      0.81       689
           9       0.82      0.84      0.83       726

    accuracy                           0.86      6860
   macro avg       0.87      0.86      0.86      6860
weighted avg       0.87      0.86      0.86      6860

0.8644314868804664


In [256]:
for c_value, accuracy in zip(C, acc):
    print(f"c: {c_value},accuracy: {accuracy}")

c: 2,accuracy: 0.8610463232912212
c: 3,accuracy: 0.8636702299967606
c: 5,accuracy: 0.8650469711694202
c: 7,accuracy: 0.8633462908973113
c: 8,accuracy: 0.8630061548428896
c: 9,accuracy: 0.8623096857790735
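
To put these numbers in perspective, the sketch below repeats the selection step on the validation accuracies printed above and also measures how far apart they are (the names `val_acc`, `best_c` and `spread` are introduced here for illustration). The spread is about 0.4 percentage points, so with a single validation split the preference for C = 5 over its neighbors is slight and may not be significant.

```python
import numpy as np

C_values = [2, 3, 5, 7, 8, 9]
# Validation accuracies copied from the output above
val_acc = [0.8610463232912212, 0.8636702299967606, 0.8650469711694202,
           0.8633462908973113, 0.8630061548428896, 0.8623096857790735]

best_c = C_values[int(np.argmax(val_acc))]
spread = max(val_acc) - min(val_acc)

print(best_c)               # 5
print(round(spread, 4))     # 0.004
```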
