# Analysis on SVMs


## Importing required libraries

In [2]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.svm import SVC 
from sklearn.ensemble import RandomForestClassifier 

import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

## Reading Data

In [3]:
df = pd.read_csv('mnist.csv')
df.head()

Unnamed: 0,label,pixel0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel774,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783
0,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,4,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [4]:
Y = df['label']
# Only the label is dropped; the 'Unnamed: 0' index column from the CSV stays in X as a feature
X = df.drop('label', axis='columns')

## Normalizing data

In [5]:
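# Standardize every feature to zero mean and unit variance (fit on the full dataset, before the split)
Xpro = preprocessing.StandardScaler().fit_transform(X.astype(float))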

In [6]:
X_train, X_test, Y_train, Y_test = train_test_split(Xpro, Y, test_size=0.5, random_state=0)
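
The scaler above is fit on the full dataset before the split, so statistics from the test half influence the transformation. A minimal sketch of the alternative, fitting the scaler on the training half only, might look like the following. The synthetic array is only a stand-in (an assumption) for the MNIST pixels, and all variable names here are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the pixel data (assumption: 200 samples, 20 features)
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(200, 20)).astype(float)
y_demo = rng.integers(0, 10, size=200)

# Split first, so the test half never influences the scaler
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.5, random_state=0
)

# Fit on the training half only, then apply the same transform to both halves
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

print(X_tr_scaled.shape, X_te_scaled.shape)
```

With this ordering the training features have exactly zero mean and unit variance, while the test features are only approximately standardized, which is what a model would see on unseen data.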

In [10]:
print(len(Y_train))

21000


## Making SVM as Function

In [8]:
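def SVM(c, kernl, deg, gama):
    """Fit an SVC on the training split and print its accuracy on the test split.

    Passing 0 for c, deg, or gama selects the scikit-learn default value.
    """
    if c == 0:
        c = 1
    if deg == 0:
        deg = 3
    if gama == 0:
        gama = 'scale'

    classifier = SVC(C=c, kernel=kernl, degree=deg, gamma=gama).fit(X_train, Y_train)
    pred = classifier.predict(X_test)

    # accuracy_score expects (y_true, y_pred)
    accu = accuracy_score(Y_test, pred)

    print("accuracy score =", accu)
    print("kernel used = ", kernl)
    print("C = ", c)
    print(" degree = ", deg)
    print("gamma  = ", gama)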


C is the regularization parameter. The strength of the regularization is inversely proportional to C, and C must be strictly positive.

Kernel specifies the kernel type to be used in the algorithm. If none is given, ‘rbf’ will be used. If a callable is given, it is used to pre-compute the kernel matrix from the data matrices.

Degree is the degree of the polynomial kernel function (‘poly’). It is ignored by all other kernels, so the degree of 3 printed for the linear and rbf runs has no effect on them.

Gamma is the kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’.
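
As an illustration of how these four parameters map onto `SVC`, here is a minimal sketch that uses keyword defaults instead of `0` as a placeholder and returns the accuracy rather than printing it. The helper name `run_svm` is hypothetical, and the small `load_digits` dataset is used only as a stand-in (an assumption) so the example runs in seconds; it is not the data used in this notebook.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_svm(X_tr, X_te, y_tr, y_te, C=1.0, kernel="rbf", degree=3, gamma="scale"):
    """Hypothetical helper: fit an SVC and return its test accuracy."""
    clf = SVC(C=C, kernel=kernel, degree=degree, gamma=gamma).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))


# Stand-in data: 8x8 digit images bundled with scikit-learn
digits = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, test_size=0.5, random_state=0
)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

acc_default = run_svm(X_tr, X_te, y_tr, y_te)                        # rbf, C=1, gamma='scale'
acc_linear = run_svm(X_tr, X_te, y_tr, y_te, C=0.1, kernel="linear")  # degree and gamma unused
print("rbf defaults:", acc_default)
print("linear, C=0.1:", acc_linear)
```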



## SVM classifier (Linearly Non-Separable) 

#### 1) C =0.1

In [7]:
SVM(0.1 , 'linear' , 0 ,0)

accuracy score = 0.9217619047619048
kernel used =  linear
C =  0.1
 degree =  3
gamma  =  scale


Computational time = 1 min 3 sec 

#### 2) C =1

In [8]:
SVM(1 , 'linear' , 0 ,0)

accuracy score = 0.9113333333333333
kernel used =  linear
C =  1
 degree =  3
gamma  =  scale


Computational time = 1 min

In [9]:
SVM(1 , 'linear' , 0 ,0)

accuracy score = 0.9985714285714286
kernel used =  linear
C =  1
 degree =  3
gamma  =  scale


#### 3) C =10

In [9]:
SVM(10 , 'linear' , 0 ,0)

accuracy score = 0.9077142857142857
kernel used =  linear
C =  10
 degree =  3
gamma  =  scale


Computational time = 57 sec

As C increases, the regularization gets weaker. The computation time also dropped slightly, from 1 min 3 sec at C = 0.1 to 57 sec at C = 10.

Accuracy also decreases slightly as C increases, from 0.9218 at C = 0.1 to 0.9077 at C = 10.
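
A sweep like this can be written as a loop that records accuracy and wall-clock time for each value of C. The sketch below is one possible way to do it, not the procedure used for the timings above; it assumes the small `load_digits` dataset as a stand-in so it finishes quickly, and the numbers it prints will differ from the MNIST results.

```python
import time

from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the digits dataset bundled with scikit-learn
digits = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, test_size=0.5, random_state=0
)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

results = []
for C in (0.1, 1, 10):
    start = time.perf_counter()
    clf = SVC(C=C, kernel="linear").fit(X_tr, y_tr)
    acc = accuracy_score(y_te, clf.predict(X_te))
    elapsed = time.perf_counter() - start  # fit + predict time in seconds
    results.append((C, acc, elapsed))
    print(f"C={C:<4} accuracy={acc:.4f} time={elapsed:.3f}s")
```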

## Gaussian Kernel (RBF)

#### 1) Using default rbf kernel 

In [10]:
SVM(0, 'rbf', 0, 0)

accuracy score = 0.9515714285714286
kernel used =  rbf
C =  1
 degree =  3
gamma  =  scale


Computational time = 2 mins 38 sec

#### 2) rbf kernel with C =0.1, gamma  =1

In [11]:
SVM(0.1 , 'rbf' , 0 ,1)

accuracy score = 0.11214285714285714
kernel used =  rbf
C =  0.1
 degree =  3
gamma  =  1


Computational time = 11 min 5 sec

#### 3) rbf kernel with C =1, gamma  =0.1

In [12]:
SVM(1 , 'rbf' , 0 ,0.1)

accuracy score = 0.17566666666666667
kernel used =  rbf
C =  1
 degree =  3
gamma  =  0.1


Computational time = 10 min 41 sec

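The rbf kernel with default parameters (C = 1, gamma = 'scale') was both the most accurate and the fastest of the three runs. Raising gamma to 0.1 or 1 (in the latter case together with C = 0.1) made accuracy fall drastically, to about 0.18 and 0.11, and roughly quadrupled the computation time.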
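
One way to see why a gamma of 0.1 or 1 is so far from the default: with `gamma='scale'`, scikit-learn uses `1 / (n_features * X.var())`, which is on the order of 0.001 for several hundred standardized features. The sketch below illustrates this on synthetic standard-normal data with 785 columns (an assumption standing in for the scaled MNIST features) and compares the rbf kernel value between two distinct points under both settings.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Synthetic stand-in (assumption): 1000 samples, 785 standardized features
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1000, 785))

# Value that gamma='scale' resolves to: 1 / (n_features * X.var())
gamma_scale = 1.0 / (X_demo.shape[1] * X_demo.var())

# rbf kernel exp(-gamma * ||x - x'||^2) between two different samples
k_scale = rbf_kernel(X_demo[:1], X_demo[1:2], gamma=gamma_scale)[0, 0]
k_large = rbf_kernel(X_demo[:1], X_demo[1:2], gamma=1.0)[0, 0]

print("gamma for 'scale':", gamma_scale)
print("kernel value with gamma='scale':", k_scale)
print("kernel value with gamma=1:", k_large)
```

With a large gamma the kernel value between different points collapses to essentially zero, so every sample looks similar only to itself and the classifier cannot generalize. This is a plausible explanation for the accuracy drop, not something measured in this notebook.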

## Polynomial Kernel

#### 1) Degree =2

In [13]:
SVM(1 , 'poly' , 2 ,0)

accuracy score = 0.9598095238095238
kernel used =  poly
C =  1
 degree =  2
gamma  =  scale


Computational time = 5 min 6 sec

#### 2) Degree = 4

In [14]:
SVM(1 , 'poly' , 4 ,0)

accuracy score = 0.8203333333333334
kernel used =  poly
C =  1
 degree =  4
gamma  =  scale


Computational time = 5 min

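For this data, the polynomial kernel reached its highest accuracy at degree 2 (0.9598); at degree 4 it fell to 0.8203 with a similar computation time.
The best degree depends on the dataset, so it should be tuned for each problem.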
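
For reference, the polynomial kernel in scikit-learn has the form K(x, x') = (gamma * <x, x'> + coef0)^degree, and the runs above leave `coef0` at the `SVC` default of 0. The short check below compares that formula against `sklearn.metrics.pairwise.polynomial_kernel` on small random matrices; the matrices and the gamma value are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))  # 3 samples, 5 features
B = rng.standard_normal((4, 5))  # 4 samples, 5 features

gamma = 0.2   # arbitrary value for illustration
coef0 = 0.0   # matches the SVC default

matches = {}
for degree in (2, 4):
    manual = (gamma * (A @ B.T) + coef0) ** degree
    library = polynomial_kernel(A, B, degree=degree, gamma=gamma, coef0=coef0)
    matches[degree] = np.allclose(manual, library)
    print(f"degree={degree}: formula matches library = {matches[degree]}")
```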

## Random Forest

In [6]:
# Random forest with 100 trees, using entropy (information gain) as the split criterion
RFclassifier10 = RandomForestClassifier(n_estimators=100, criterion="entropy")
RFclassifier10.fit(X_train, Y_train)
Y_pred = RFclassifier10.predict(X_test)
RFacc10 = accuracy_score(Y_test, Y_pred)

print("accuracy score= ", round(RFacc10*100,2))

accuracy score=  95.61


Computational time = 19 sec

## Conclusion

I would like to declare Random Forest the better classifier than SVM for this task. Its accuracy (95.61%) is nearly as high as that of the best SVM (95.98%), and it was computed in just 19 seconds, compared with about 5 minutes for that SVM.

Among the SVMs, the polynomial kernel with degree 2 gave the maximum accuracy for this dataset. The second best was the rbf kernel with default parameters.

With the linear kernel, the more we regularized (by decreasing C), the higher the accuracy, although the computation time increased slightly.
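
A comparison like this is easier to read when accuracy and time are collected side by side. The following is a minimal sketch of one way to do that, assuming the small `load_digits` dataset as a stand-in; the model labels and the `summary` table are illustrative, and the numbers it prints are not the MNIST results reported above.

```python
import time

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the digits dataset bundled with scikit-learn
digits = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, test_size=0.5, random_state=0
)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

models = {
    "SVC poly (degree 2)": SVC(kernel="poly", degree=2),
    "SVC rbf (defaults)": SVC(kernel="rbf"),
    "Random forest (100 trees)": RandomForestClassifier(
        n_estimators=100, criterion="entropy", random_state=0
    ),
}

rows = []
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    rows.append({"model": name, "accuracy": acc, "seconds": time.perf_counter() - start})

summary = pd.DataFrame(rows).sort_values("accuracy", ascending=False)
print(summary.to_string(index=False))
```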