---

title: "SVM by Hand"

description: "A gritty battle in linear algebra to implement kernel and multiclass functionality for an SVM by hand."

author:
  - name: Sam Herold
  
    url: https://samhero16.github.io/
    
    orcid: 0000-0002-5300-3075


priority: 2

image: 1_Lsun5-t67owndP0iTV9DNQ.png

draft: false 

---

# Introduction

Warning: This notebook is extremely dense, difficult to follow, and arranged poorly, but I am proud of it, so I decided to post it. I will be rearranging it for better readability soon.

Out of the box, an SVM can only learn a linear decision boundary and only supports binary classification. Kernels lift the first restriction, and the OneVsAll strategy lifts the second. I implemented both by hand on top of raw SVM code, then compared my implementation to the sklearn equivalents.
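
As a rough sketch of the two ideas in symbols (the notation is shorthand introduced here, not something taken from the code below): OneVsAll trains one binary scorer per class $c$ and predicts the class whose scorer is most confident, and in this notebook a kernel $k$ is applied by swapping each sample for its vector of kernel values against the $n$ training samples $x_1, \dots, x_n$.

$$
\hat{y}(x) = \arg\max_{c}\,\bigl(w_c^\top x + b_c\bigr),
\qquad
x \;\mapsto\; \bigl(k(x, x_1),\, \dots,\, k(x, x_n)\bigr)
$$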

# Dependencies and Data Cleaning

In [1]:
from sklearn.datasets import fetch_openml
import time
import numpy as np
from numpy import linalg as LA
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics.pairwise import polynomial_kernel, linear_kernel

In [2]:
data = pd.read_fwf('ecoli/ecoli.data', header = None, ) 
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 336 entries, 0 to 335
Data columns (total 9 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   0       336 non-null    object 
 1   1       336 non-null    float64
 2   2       336 non-null    float64
 3   3       336 non-null    float64
 4   4       336 non-null    float64
 5   5       336 non-null    float64
 6   6       336 non-null    float64
 7   7       336 non-null    float64
 8   8       336 non-null    object 
dtypes: float64(7), object(2)
memory usage: 23.8+ KB


In [3]:
#break into features and labels
x = data.iloc[:,1:-1].to_numpy()
y = data.iloc[:,-1].to_numpy()

In [4]:
#encode all of the 'cp', 'imL', etc. stuff into numbers for classification
le = LabelEncoder()
y = le.fit_transform(y)

In [5]:
#mappings and value counts
print(np.unique(y, return_counts=True))
dic = dict(zip( le.transform(le.classes_), le.classes_))
dic

(array([0, 1, 2, 3, 4, 5, 6, 7]), array([  2,   2, 143,  77,   5,  35,  20,  52]))


{0: 'L', 1: 'S', 2: 'cp', 3: 'im', 4: 'mL', 5: 'mU', 6: 'om', 7: 'pp'}

In [6]:
print(np.unique(y))

[0 1 2 3 4 5 6 7]


In [7]:
x.shape, y.shape

((336, 7), (336,))

# Functions
The x_train and x_test datasets change later (one version per kernel), so the functions below take them as parameters.

In [8]:
#relabel the dataset for OneVsAll: the 'One' class becomes 1, 'All' the other classes become -1
def breakDataSet(y_train, oneClass):
    y_train_copy = y_train.copy()
    
    y_train_copy[y_train != oneClass] = -1
    y_train_copy[y_train == oneClass] = 1
    
    return y_train_copy
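
A minimal usage sketch of the relabeling, on made-up toy labels rather than the ecoli data. The function body is repeated so the block runs on its own; `toy_y` and `relabeled` are illustrative names.

```python
import numpy as np

def breakDataSet(y_train, oneClass):
    # Same relabeling as above: the chosen class becomes +1, every other class becomes -1
    y_train_copy = y_train.copy()
    y_train_copy[y_train != oneClass] = -1
    y_train_copy[y_train == oneClass] = 1
    return y_train_copy

# Toy labels (hypothetical, for illustration only)
toy_y = np.array([0, 2, 1, 2, 7])
relabeled = breakDataSet(toy_y, oneClass=2)

print(relabeled)  # [-1  1 -1  1 -1]
print(toy_y)      # the input is left untouched: [0 2 1 2 7]
```

Both masks are computed from the original `y_train`, not from the copy being modified, so the order of the two assignments does not matter.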

In [9]:
#function to get the svm weights and biases
def getSVMs(x_train, y_train): 
    
    
    #make a flexible dataframe to store the weights and biases depending on the input shape: one row per 
    #class, one weight column per feature, and a last column for the bias
    n_columns = x_train.shape[1] + 1
    column_names = [f'Weight_{i}' for i in range(n_columns)]
    #initialize with 0.0 so the columns are float and can hold the weights without a dtype change
    df = pd.DataFrame(0.0, index=range(8), columns=column_names)
    df = df.rename(columns={column_names[-1]: 'Bias'})


    #print(df)
    
    #loop through all 8 classes (one row of df each) and make a OneVsAll classifier for each

    for c in range(df.shape[0]):
        

        ydual_train  = breakDataSet(y_train.copy(), oneClass = c)
        xdual_train = x_train.copy()

        N = len(ydual_train)
        XPY = xdual_train.copy()
        for i in range(N):
            if ydual_train[i]==-1:
                XPY[i,:] =-1 * xdual_train[i,:]  
                
           
        A = np.matmul(XPY,XPY.transpose())

        AT = A.copy().transpose()

        YM = np.outer(ydual_train[1:],ydual_train[1:])
        AY = np.outer(A[0,1:],ydual_train[1:])
        YA = np.outer(ydual_train[1:],A[0,1:])


        Y0S = ydual_train[0]**2
        M = AT[1:,1:] + A[0,0]*YM/Y0S - AY/ydual_train[0] - YA/ydual_train[0]


        b = np.zeros(N-1)
        b = 1 - ydual_train[1:]/ydual_train[0]


        aw = np.zeros(N)
        for i in range(2,N):
            aw[i] = (1-ydual_train[i]/ydual_train[0])/(A[i,i] + A[0,0]*ydual_train[i]**2/ydual_train[0]**2 
                                                   - 2*A[0,i]*ydual_train[i]/ydual_train[0])

        aw[0] = -sum(ydual_train[1:]*aw[1:])/ydual_train[0]



        YA   = ydual_train*aw

        wght = sum(xdual_train * YA[:,None])
        b =sum(ydual_train - np.matmul(xdual_train,wght))/N
        

       
        df.iloc[c,0:-1] = wght
        df.iloc[c,-1] = b
            
    return df
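
The sign-flipping loop that builds `XPY` can also be written with broadcasting. The sketch below uses hypothetical toy data (`x_toy`, `y_toy`) to check that both versions agree, and that `A = XPY @ XPY.T` has entries `y_i * y_j * (x_i . x_j)`. It covers only that one step, not the rest of `getSVMs`.

```python
import numpy as np

rng = np.random.default_rng(0)
x_toy = rng.normal(size=(5, 3))        # hypothetical toy features
y_toy = np.array([1, -1, -1, 1, -1])   # hypothetical +/-1 labels

# Loop version, mirroring the code in getSVMs
XPY_loop = x_toy.copy()
for i in range(len(y_toy)):
    if y_toy[i] == -1:
        XPY_loop[i, :] = -1 * x_toy[i, :]

# Broadcasting version: multiply every row by its label
XPY_vec = x_toy * y_toy[:, None]

A = XPY_vec @ XPY_vec.T
expected = np.outer(y_toy, y_toy) * (x_toy @ x_toy.T)

print(np.allclose(XPY_loop, XPY_vec))  # True
print(np.allclose(A, expected))        # True
```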



In [10]:
# run through each of the 8 classifiers and pick the class of the classifier that 
# is the most confident in the 'One' not the 'All'
def getAccuracy(x_test, y_test, SVM):
    testSetAcc = []
    for j in range(x_test.shape[0]):
        results = []
        #loop through all 8 available SVM's
        for i in range(8):
            #W*features + bias
            results.append((SVM.iloc[i,:-1].values@x_test[j]) + SVM.iloc[i,-1])
        #pick most confident result, store true if it is the same as the actual output
        #argmax is the most confident result. A positive number means pick the one, negative means pick the rest. 
        #so the highest positive number is the most confident 'one' result. 
        
        testSetAcc.append(np.argmax(results) == y_test[j])
        
    #print(testSetAcc)
    if len(testSetAcc) == 0:
        return 0.0
        
    return sum(testSetAcc)/len(testSetAcc)
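
The same argmax rule can be computed for all test rows at once with a single matrix product. This is a sketch under assumptions: `predict_one_vs_all` is a hypothetical helper, and the toy weights and samples are made up.

```python
import numpy as np

def predict_one_vs_all(x, weights, biases):
    # scores[j, c] = weights[c] @ x[j] + biases[c], one column per classifier
    scores = x @ weights.T + biases
    # the most confident 'One' wins
    return np.argmax(scores, axis=1)

# Toy setup: 3 classifiers over 2 features (hypothetical numbers)
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([0.0, 0.0, 0.5])
X = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, -1.0]])
y_true = np.array([0, 1, 2])

pred = predict_one_vs_all(X, W, b)
accuracy = np.mean(pred == y_true)
print(pred, accuracy)

# With the dataframe returned by getSVMs, the inputs would be
# SVM.iloc[:, :-1].to_numpy() for the weights and SVM.iloc[:, -1].to_numpy() for the biases.
```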

In [11]:
#see accuracy for each specific class
#should correspond to number of training examples
def getAccuracyPerNumber(x_test, y_test, SVM):
    for i in range(8):
        #split data into only x and y of one class
        yonlyI  = y_test[y_test == i]
        xonlyI  = x_test[y_test == i]
        acc = getAccuracy(xonlyI, yonlyI, SVM)
        print("Accuracy for " , i , " or " , dic[i] , " : " , acc)

# Multiclass SVMs

In [12]:
#get the train test split
#we will be changing x_train and x_test for the kernels
x_train, x_test, y_train, y_test = train_test_split(x,y,train_size = .8,shuffle = True, random_state = 4)
x.shape

(336, 7)

### Sklearn implementations

In [13]:
from sklearn.svm import LinearSVC
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

In [14]:
#Linear SVM
linearSVC = LinearSVC(dual="auto")
linearSVC.fit(x_train,y_train)
y_pred = linearSVC.predict(x_test)
"sklearn Linear SVM Accuracy: " , accuracy_score(y_pred,y_test)

('sklearn Linear SVM Accuracy: ', 0.9411764705882353)

In [15]:
#Linear Kernel SVM
linearKernelSVC = SVC(kernel = "linear")
linearKernelSVC.fit(x_train,y_train)
y_pred = linearKernelSVC.predict(x_test)
"sklearn Linear Kernel SVM Accuracy: " , accuracy_score(y_pred,y_test)

('sklearn Linear Kernel SVM Accuracy: ', 0.8382352941176471)

In [16]:
#Polynomial Kernel SVM
polySVC = SVC(kernel = "poly")
polySVC.fit(x_train,y_train)
y_pred = polySVC.predict(x_test)
"sklearn Polynomial SVM Accuracy: " , accuracy_score(y_pred,y_test)

('sklearn Polynomial SVM Accuracy: ', 0.8823529411764706)

In [17]:
#RBF Kernel SVM
rbfSVC = SVC(kernel = "rbf")
rbfSVC.fit(x_train,y_train)
y_pred = rbfSVC.predict(x_test)
"sklearn RBF SVM Accuracy: " , accuracy_score(y_pred,y_test)

('sklearn RBF SVM Accuracy: ', 0.8970588235294118)

### Linear (No Kernel) OneVsAll SVM
No kernel transformation here: x_train and x_test are used as-is.

In [18]:
linearSVM = getSVMs(x_train = x_train, y_train = y_train)
linearSVM 

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Bias |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.652312 | -1.943609 | 3.244773 | 1.531159 | -0.087257 | 0.987371 | 0.218332 | -3.284968 |
| 1 | 3.037805 | -5.788647 | -8.881784e-16 | -8.881784e-16 | 1.392839 | 2.789961 | -0.208726 | -1.625561 |
| 2 | -189.074352 | -305.870475 | -2.056133e-13 | -3.610445e-13 | -12.150815 | -132.073554 | -109.956294 | 379.137984 |
| 3 | -47.842855 | -116.962305 | 2.079584 | 2.797762e-14 | 37.575416 | 107.329767 | 87.19746 | -35.081053 |
| 4 | 1.535861 | -5.719501 | 10.89051 | 8.881784e-16 | 1.6587 | 1.569109 | -6.278016 | -2.75434 |
| 5 | 23.890656 | -68.690367 | 2.51999 | -2.176037e-14 | 16.202489 | 56.151497 | 45.912457 | -39.212604 |
| 6 | 8.893138 | -11.033733 | 1.24345e-14 | -3.552714e-15 | 53.116386 | -9.926145 | -52.43613 | 4.703137 |
| 7 | 198.907436 | 516.008637 | -18.73485 | -1.531159 | -97.707757 | -26.828006 | 35.550916 | -307.882595 |


In [19]:
acc = getAccuracy(x_test, y_test, SVM = linearSVM)
"Overall accuracy: " , acc

('Overall accuracy: ', 0.7647058823529411)

In [20]:
getAccuracyPerNumber(x_test, y_test, SVM = linearSVM)

Accuracy for  0  or  L  :  0.0
Accuracy for  1  or  S  :  0.0
Accuracy for  2  or  cp  :  1.0
Accuracy for  3  or  im  :  0.6111111111111112
Accuracy for  4  or  mL  :  0.0
Accuracy for  5  or  mU  :  0.0
Accuracy for  6  or  om  :  0.0
Accuracy for  7  or  pp  :  1.0


In [21]:
print("Total Value Counts: " , np.unique(y, return_counts=True))
print("Train Value Counts: " , np.unique(y_train, return_counts=True))
print("Test Value Counts: " , np.unique(y_test, return_counts=True))

Total Value Counts:  (array([0, 1, 2, 3, 4, 5, 6, 7]), array([  2,   2, 143,  77,   5,  35,  20,  52]))
Train Value Counts:  (array([0, 1, 2, 3, 4, 5, 6, 7]), array([  2,   2, 109,  59,   5,  29,  17,  45]))
Test Value Counts:  (array([2, 3, 5, 6, 7]), array([34, 18,  6,  3,  7]))
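
The test split contains no samples of classes 0, 1, or 4, so the 0.0 reported for those classes comes from the empty-set guard in `getAccuracy`, not from misclassified samples. One way to make that visible is to report NaN for absent classes. In the sketch below, `per_class_accuracy` is a hypothetical helper and the labels are made up.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, n_classes):
    # NaN marks classes that never appear in y_true, so they are not confused with 0% accuracy
    acc = np.full(n_classes, np.nan)
    for c in range(n_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc

# Toy labels and predictions (hypothetical)
y_true = np.array([2, 2, 3, 3, 7])
y_pred = np.array([2, 2, 3, 2, 7])

acc = per_class_accuracy(y_true, y_pred, n_classes=8)
for c, a in enumerate(acc):
    print(c, "no test samples" if np.isnan(a) else a)
```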


### Linear Kernel OneVsAll SVM

Here is a sanity check: computing the kernel with nested for loops and np.dot, with '@' (matmul), and with sklearn's linear_kernel all give the same matrix, as the three cells below show. Going forward I will use the '@' method.

In [22]:
m1,_ = x_train.shape
m2,_ = x_train.shape
K = np.zeros((m1, m2))
for i in range(m1):
    for j in range(m2):
        K[i,j] = (np.dot(x_train[i,:], x_train[j,:]))
K

array([[2.2258, 2.1549, 1.8836, ..., 2.224 , 2.3089, 1.4967],
       [2.1549, 2.1061, 1.7873, ..., 2.1159, 2.1993, 1.4239],
       [1.8836, 1.7873, 1.7736, ..., 2.0285, 2.1597, 1.3861],
       ...,
       [2.224 , 2.1159, 2.0285, ..., 2.3671, 2.4829, 1.6044],
       [2.3089, 2.1993, 2.1597, ..., 2.4829, 2.6862, 1.632 ],
       [1.4967, 1.4239, 1.3861, ..., 1.6044, 1.632 , 1.1844]])

In [23]:
#sklearn implementation
linear_kernel(x_train, x_train)

array([[2.2258, 2.1549, 1.8836, ..., 2.224 , 2.3089, 1.4967],
       [2.1549, 2.1061, 1.7873, ..., 2.1159, 2.1993, 1.4239],
       [1.8836, 1.7873, 1.7736, ..., 2.0285, 2.1597, 1.3861],
       ...,
       [2.224 , 2.1159, 2.0285, ..., 2.3671, 2.4829, 1.6044],
       [2.3089, 2.1993, 2.1597, ..., 2.4829, 2.6862, 1.632 ],
       [1.4967, 1.4239, 1.3861, ..., 1.6044, 1.632 , 1.1844]])

In [24]:
#this is the easiest
x_train_lk = (x_train@x_train.T)
x_train_lk

array([[2.2258, 2.1549, 1.8836, ..., 2.224 , 2.3089, 1.4967],
       [2.1549, 2.1061, 1.7873, ..., 2.1159, 2.1993, 1.4239],
       [1.8836, 1.7873, 1.7736, ..., 2.0285, 2.1597, 1.3861],
       ...,
       [2.224 , 2.1159, 2.0285, ..., 2.3671, 2.4829, 1.6044],
       [2.3089, 2.1993, 2.1597, ..., 2.4829, 2.6862, 1.632 ],
       [1.4967, 1.4239, 1.3861, ..., 1.6044, 1.632 , 1.1844]])
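
Rather than comparing the printed matrices by eye, the equality can be checked programmatically. A minimal sketch on random toy data (`x_toy` is hypothetical), using only NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
x_toy = rng.random((6, 4))   # hypothetical stand-in for x_train

# Nested-loop Gram matrix
K_loop = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        K_loop[i, j] = np.dot(x_toy[i, :], x_toy[j, :])

# Matmul Gram matrix
K_matmul = x_toy @ x_toy.T

print(np.allclose(K_loop, K_matmul))      # True
print(np.allclose(K_matmul, K_matmul.T))  # True: a Gram matrix is symmetric
```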

In [25]:
x_test_lk = (x_test@x_train.T)
x_test_lk.shape

(68, 268)

In [26]:
linearKernelSVM = getSVMs(x_train_lk, y_train)
linearKernelSVM 

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_259 | Weight_260 | Weight_261 | Weight_262 | Weight_263 | Weight_264 | Weight_265 | Weight_266 | Weight_267 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.029816 | 0.031341 | 0.040736 | 0.033559 | 0.030333 | 0.038119 | 0.074498 | 0.032969 | 0.033898 | 0.045377 | ... | 0.035622 | 0.043069 | 0.044931 | 0.041545 | 0.032067 | 0.039507 | 0.034011 | 0.042333 | 0.035511 | -19.493451 |
| 1 | -0.128826 | -0.09566 | -0.020059 | -0.031667 | -0.140254 | -0.074763 | 0.01864 | -0.112745 | -0.084766 | 0.083664 | ... | -0.110066 | 0.054653 | 0.071271 | -0.024115 | -0.06629 | -0.027101 | -0.075397 | 0.044607 | -0.097745 | 19.601696 |
| 2 | -2.325591 | -2.298186 | -1.547406 | -2.027124 | -1.505314 | -1.079832 | -2.110807 | -1.858849 | -2.388795 | -2.091542 | ... | -1.363291 | -1.101895 | -2.233211 | -1.326405 | -2.104018 | -1.569929 | -2.004983 | -2.02071 | -1.116802 | 881.333004 |
| 3 | -0.852687 | -1.61487 | 2.031921 | -1.340171 | 0.350807 | 0.815637 | 2.861968 | 0.588308 | -1.113975 | 2.036954 | ... | -0.194748 | -0.12152 | 1.673075 | 0.370304 | -1.189052 | -0.095279 | 1.509874 | 2.306719 | 0.90861 | -339.834706 |
| 4 | -0.02903 | 0.045244 | 0.157004 | 0.203149 | 0.095817 | 0.210922 | 0.623288 | 0.039999 | 0.010446 | 0.15864 | ... | 0.133733 | 0.381164 | 0.123264 | 0.211067 | 0.112517 | 0.188817 | 0.042357 | 0.134176 | 0.191198 | -72.285431 |
| 5 | 0.112292 | -0.088166 | 1.268413 | -0.034959 | 0.064405 | 0.415518 | 2.050423 | 0.497145 | 0.269832 | 1.922934 | ... | 0.088798 | 0.481926 | 1.841427 | 0.634497 | 0.018704 | 0.501776 | 0.978662 | 1.787581 | 0.313826 | -373.791961 |
| 6 | -0.424177 | -0.123569 | -0.169247 | 1.169128 | 0.478533 | 0.035312 | -1.023782 | -0.288359 | -0.846068 | -0.182589 | ... | -0.472083 | 1.25446 | -0.555692 | -0.392064 | 0.62922 | -0.144211 | 0.091844 | 0.077451 | 0.355122 | 45.004395 |
| 7 | 3.618203 | 4.143866 | -1.761363 | 2.028084 | 0.625673 | -0.360913 | -2.494228 | 1.101533 | 4.119428 | -1.973437 | ... | 1.882034 | -0.991856 | -0.965066 | 0.485172 | 2.566852 | 1.106421 | -0.576369 | -2.372157 | -0.589721 | -146.533546 |


In [27]:
acc = getAccuracy(x_test_lk, y_test, SVM = linearKernelSVM )
"Overall accuracy: " , acc

('Overall accuracy: ', 0.75)

In [28]:
getAccuracyPerNumber(x_test_lk, y_test, SVM = linearKernelSVM )

Accuracy for  0  or  L  :  0.0
Accuracy for  1  or  S  :  0.0
Accuracy for  2  or  cp  :  1.0
Accuracy for  3  or  im  :  0.6666666666666666
Accuracy for  4  or  mL  :  0.0
Accuracy for  5  or  mU  :  0.0
Accuracy for  6  or  om  :  0.0
Accuracy for  7  or  pp  :  0.7142857142857143


### Polynomial Kernel OneVsAll SVM

In [29]:
x_train_pk = (x_train@x_train.T + 1) ** 2
x_train_pk

array([[10.40578564,  9.95339401,  8.31514896, ..., 10.394176  ,
        10.94881921,  6.23351089],
       [ 9.95339401,  9.64785721,  7.76904129, ...,  9.70883281,
        10.23552049,  5.87529121],
       [ 8.31514896,  7.76904129,  7.69285696, ...,  9.17181225,
         9.98370409,  5.69347321],
       ...,
       [10.394176  ,  9.70883281,  9.17181225, ..., 11.33736241,
        12.13059241,  6.78289936],
       [10.94881921, 10.23552049,  9.98370409, ..., 12.13059241,
        13.58807044,  6.927424  ],
       [ 6.23351089,  5.87529121,  5.69347321, ...,  6.78289936,
         6.927424  ,  4.77160336]])

In [30]:
#sanity check
polynomial_kernel(x_train, x_train, coef0 = 1,degree = 2, gamma = 1)

array([[10.40578564,  9.95339401,  8.31514896, ..., 10.394176  ,
        10.94881921,  6.23351089],
       [ 9.95339401,  9.64785721,  7.76904129, ...,  9.70883281,
        10.23552049,  5.87529121],
       [ 8.31514896,  7.76904129,  7.69285696, ...,  9.17181225,
         9.98370409,  5.69347321],
       ...,
       [10.394176  ,  9.70883281,  9.17181225, ..., 11.33736241,
        12.13059241,  6.78289936],
       [10.94881921, 10.23552049,  9.98370409, ..., 12.13059241,
        13.58807044,  6.927424  ],
       [ 6.23351089,  5.87529121,  5.69347321, ...,  6.78289936,
         6.927424  ,  4.77160336]])
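
To illustrate what the degree-2 polynomial kernel computes, the sketch below checks that `(a @ b + 1) ** 2` equals an ordinary dot product after an explicit feature expansion. `phi_degree2` is a hypothetical helper written for this illustration and the vectors are random toy data. Nothing in the notebook builds this expansion explicitly.

```python
import numpy as np
from itertools import combinations

def phi_degree2(v):
    # Explicit feature map for (u.v + 1)^2:
    # [1, sqrt(2)*v_i, v_i^2, sqrt(2)*v_i*v_j for i < j]
    cross = [np.sqrt(2) * v[i] * v[j] for i, j in combinations(range(len(v)), 2)]
    return np.concatenate(([1.0], np.sqrt(2) * v, v ** 2, cross))

rng = np.random.default_rng(1)
a, b = rng.random(3), rng.random(3)   # hypothetical toy vectors

k_direct = (a @ b + 1) ** 2                    # kernel evaluated directly
k_explicit = phi_degree2(a) @ phi_degree2(b)   # dot product in the expanded space

print(np.isclose(k_direct, k_explicit))  # True
print(phi_degree2(a).shape)              # (10,) for 3 input features
```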

In [31]:
x_test_pk = (x_test@x_train.T + 1) ** 2
x_test_pk.shape

(68, 268)

In [32]:
polyKernelSVM = getSVMs(x_train_pk, y_train)
polyKernelSVM 

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_259 | Weight_260 | Weight_261 | Weight_262 | Weight_263 | Weight_264 | Weight_265 | Weight_266 | Weight_267 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.004834 | 0.005039 | 0.005986 | 0.005297 | 0.004272 | 0.004829 | 0.013938 | 0.004978 | 0.005503 | 0.007628 | ... | 0.004672 | 0.00567 | 0.007641 | 0.005535 | 0.005094 | 0.005575 | 0.005472 | 0.007058 | 0.004584 | -13.824352 |
| 1 | -0.023621 | -0.017087 | -0.004547 | -0.005689 | -0.02192 | -0.010647 | 0.001381 | -0.019562 | -0.015748 | 0.013672 | ... | -0.01572 | 0.007285 | 0.011544 | -0.004101 | -0.011885 | -0.004748 | -0.01488 | 0.006353 | -0.014056 | 15.978626 |
| 2 | -0.41695 | -0.402582 | -0.253623 | -0.348199 | -0.236895 | -0.15119 | -0.424119 | -0.313249 | -0.4237 | -0.389532 | ... | -0.195676 | -0.157768 | -0.41947 | -0.193139 | -0.366234 | -0.242053 | -0.363583 | -0.376329 | -0.160401 | 688.493775 |
| 3 | -0.159119 | -0.271945 | 0.281023 | -0.221829 | 0.037008 | 0.090878 | 0.490914 | 0.070131 | -0.200211 | 0.314265 | ... | -0.035489 | -0.026978 | 0.254283 | 0.033847 | -0.203196 | -0.02899 | 0.223788 | 0.361466 | 0.105949 | -224.619765 |
| 4 | -0.004263 | 0.00771 | 0.024554 | 0.033053 | 0.014552 | 0.027467 | 0.120441 | 0.00694 | 0.002037 | 0.028506 | ... | 0.017913 | 0.051007 | 0.022459 | 0.028652 | 0.018709 | 0.027283 | 0.008252 | 0.02439 | 0.025677 | -51.248862 |
| 5 | 0.013673 | -0.015428 | 0.171149 | -0.007165 | 0.006301 | 0.045628 | 0.346046 | 0.066663 | 0.037082 | 0.302391 | ... | 0.008669 | 0.054846 | 0.292425 | 0.074199 | 0.000226 | 0.061801 | 0.145269 | 0.279373 | 0.034917 | -253.385481 |
| 6 | -0.070384 | -0.019263 | -0.02485 | 0.191273 | 0.069667 | 0.004393 | -0.187206 | -0.044716 | -0.139 | -0.029472 | ... | -0.061981 | 0.166707 | -0.094902 | -0.05184 | 0.104079 | -0.02011 | 0.016249 | 0.015183 | 0.046496 | 42.392377 |
| 7 | 0.65583 | 0.713557 | -0.199691 | 0.353259 | 0.127015 | -0.011356 | -0.361395 | 0.228815 | 0.734036 | -0.247458 | ... | 0.277612 | -0.100768 | -0.07398 | 0.106847 | 0.453206 | 0.201242 | -0.020567 | -0.317494 | -0.043166 | -209.786318 |


In [33]:
acc = getAccuracy(x_test_pk, y_test, SVM = polyKernelSVM)
"Overall accuracy: " , acc

('Overall accuracy: ', 0.75)

In [34]:
getAccuracyPerNumber(x_test_pk, y_test, SVM = polyKernelSVM)

Accuracy for  0  or  L  :  0.0
Accuracy for  1  or  S  :  0.0
Accuracy for  2  or  cp  :  1.0
Accuracy for  3  or  im  :  0.6666666666666666
Accuracy for  4  or  mL  :  0.0
Accuracy for  5  or  mU  :  0.0
Accuracy for  6  or  om  :  0.0
Accuracy for  7  or  pp  :  0.7142857142857143


### RBF Kernel OneVsAll SVM

In [35]:
#change x_TRAIN into the kernel form
gamma = 5
m1 = x_train.shape[0]
m2 = x_train.shape[0]
K = np.zeros((m1, m2))  
for i in range(m1):
    for j in range(m2):
        K[i,j] = np.exp( - gamma * np.linalg.norm(x_train[i,:] - x_train[j,:])**2 )
x_train_g = K 
x_train_g.shape

(268, 268)

In [36]:
#change x_TEST into the kernel form
m1 = x_test.shape[0]
m2 = x_train.shape[0]
K = np.zeros((m1, m2))  
for i in range(m1):
    for j in range(m2):
        K[i,j] = np.exp( -gamma * np.linalg.norm(x_test[i,:] - x_train[j,:])**2 )
x_test_g = K 
x_test_g.shape

(68, 268)
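
The two nested loops above can be replaced by one vectorized computation using the identity `||u - v||^2 = ||u||^2 + ||v||^2 - 2 u.v`. This is a sketch: `rbf_kernel_matrix` is a hypothetical helper name, and the toy arrays stand in for `x_test` and `x_train`.

```python
import numpy as np

def rbf_kernel_matrix(a, b, gamma):
    # Squared distances between every row of a and every row of b
    sq_dists = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2 * (a @ b.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
toy_train = rng.random((6, 3))   # hypothetical stand-in for x_train
toy_test = rng.random((4, 3))    # hypothetical stand-in for x_test
gamma = 5

# Loop version, mirroring the cells above
K_loop = np.zeros((4, 6))
for i in range(4):
    for j in range(6):
        K_loop[i, j] = np.exp(-gamma * np.linalg.norm(toy_test[i, :] - toy_train[j, :]) ** 2)

K_vec = rbf_kernel_matrix(toy_test, toy_train, gamma)
print(K_vec.shape)                 # (4, 6): test rows by training columns
print(np.allclose(K_loop, K_vec))  # True
```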

In [37]:
gaussSVM = getSVMs(x_train_g, y_train)
gaussSVM

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_259 | Weight_260 | Weight_261 | Weight_262 | Weight_263 | Weight_264 | Weight_265 | Weight_266 | Weight_267 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.103223 | -0.09235 | -0.027322 | -0.057581 | -0.029807 | -0.012262 | 0.051829 | -0.069209 | -0.081645 | -0.009872 | ... | -0.023576 | -0.010042 | -0.011961 | -0.022334 | -0.079101 | -0.042691 | -0.045777 | -0.013807 | -0.012573 | 1.827474 |
| 1 | -0.227459 | -0.15285 | -0.00831 | -0.01401 | -0.069576 | -0.018224 | 0.008834 | -0.142911 | -0.130679 | 0.073483 | ... | -0.046448 | 0.030229 | 0.066294 | -0.008841 | -0.090179 | -0.017816 | -0.067931 | 0.047155 | -0.023147 | 1.008084 |
| 2 | -4.60617 | -3.815017 | 1.224969 | -1.362157 | 1.652092 | 3.377806 | -0.13922 | -0.917722 | -3.781187 | -0.63332 | ... | 2.667629 | 2.945758 | -0.885892 | 2.841779 | -2.454242 | 1.825746 | -1.390003 | -0.633337 | 3.059048 | -45.821309 |
| 3 | -2.704025 | -2.720102 | 0.544612 | -1.627132 | -0.434572 | -0.050359 | 0.385037 | -1.031915 | -2.237631 | 0.771766 | ... | -0.463443 | -0.196975 | 0.58465 | -0.349531 | -2.186835 | -0.924014 | 0.187455 | 0.972225 | -0.024639 | 28.8604 |
| 4 | -0.299173 | -0.257952 | -0.075215 | -0.133496 | -0.077349 | -0.027998 | 0.027091 | -0.194469 | -0.236544 | -0.047653 | ... | -0.061333 | -0.011248 | -0.056205 | -0.053314 | -0.207097 | -0.104974 | -0.134204 | -0.055853 | -0.029051 | 6.59643 |
| 5 | -1.333802 | -1.309359 | 0.138252 | -0.823313 | -0.405615 | -0.131105 | 0.29304 | -0.710659 | -0.958108 | 0.956391 | ... | -0.301839 | -0.110349 | 0.919359 | -0.182641 | -1.095606 | -0.445006 | -0.053855 | 0.834657 | -0.149837 | 14.922934 |
| 6 | -1.749565 | -1.191424 | -0.351199 | 0.847945 | 0.045767 | -0.029922 | -0.097058 | -1.011908 | -1.710756 | -0.277898 | ... | -0.291936 | 0.378823 | -0.414643 | -0.265084 | -0.006951 | -0.436253 | -0.507447 | -0.230471 | 0.044358 | 22.060349 |
| 7 | 11.023417 | 9.539053 | -1.445785 | 3.169743 | -0.68094 | -3.107938 | -0.529552 | 4.078794 | 9.13655 | -0.832898 | ... | -1.479055 | -3.026196 | -0.2016 | -1.960034 | 6.120009 | 0.145008 | 2.011762 | -0.920568 | -2.864159 | -35.454363 |


In [38]:
acc = getAccuracy(x_test_g, y_test, SVM = gaussSVM)
"Overall accuracy: " , acc

('Overall accuracy: ', 0.7941176470588235)

In [39]:
getAccuracyPerNumber(x_test_g, y_test, SVM = gaussSVM)

Accuracy for  0  or  L  :  0.0
Accuracy for  1  or  S  :  0.0
Accuracy for  2  or  cp  :  1.0
Accuracy for  3  or  im  :  0.7222222222222222
Accuracy for  4  or  mL  :  0.0
Accuracy for  5  or  mU  :  0.0
Accuracy for  6  or  om  :  0.0
Accuracy for  7  or  pp  :  1.0


# Binary SVMs, classifying 2 ('cp') and 3 ('im')

### Helper Functions

In [40]:
def breakDataSetBinary(x,y, classNeg, classPos):
    y_copy = y[(y == classNeg) | (y == classPos)]
    x_copy = x[(y == classNeg) | (y == classPos)]
    
    y_copy[y_copy == classNeg] = -1
    y_copy[y_copy ==classPos] = 1
    
    return x_copy,y_copy
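
A minimal usage sketch of the binary split on made-up toy arrays (`toy_x`, `toy_y` are hypothetical). The function body is repeated so the block runs on its own. Boolean-mask indexing returns copies, so the original arrays are not modified by the relabeling.

```python
import numpy as np

def breakDataSetBinary(x, y, classNeg, classPos):
    # Keep only the two requested classes, then relabel them as -1 / +1
    y_copy = y[(y == classNeg) | (y == classPos)]
    x_copy = x[(y == classNeg) | (y == classPos)]
    y_copy[y_copy == classNeg] = -1
    y_copy[y_copy == classPos] = 1
    return x_copy, y_copy

toy_x = np.arange(10, dtype=float).reshape(5, 2)   # 5 samples, 2 features
toy_y = np.array([2, 3, 7, 2, 5])

bx, by = breakDataSetBinary(toy_x, toy_y, classNeg=2, classPos=3)
print(by)        # [-1  1 -1]
print(bx.shape)  # (3, 2): the rows labeled 7 and 5 are dropped
print(toy_y)     # unchanged: [2 3 7 2 5]
```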
    

In [41]:
#function to get the svm weights and biases
def getSVMBinary(x_train, y_train): 
    
    n_columns = x_train.shape[1] + 1
    column_names = [f'Weight_{i}' for i in range(n_columns)]
    #initialize with 0.0 so the columns are float and can hold the weights without a dtype change
    df = pd.DataFrame(0.0, index=range(1), columns=column_names)
    df = df.rename(columns={column_names[-1]: 'Bias'})
    
    ydual_train  = y_train.copy()
    xdual_train = x_train.copy()

    N = len(ydual_train)
    XPY = xdual_train.copy()
    for i in range(N):
        if ydual_train[i]==-1:
            XPY[i,:] =-1 * xdual_train[i,:]    
    A = np.matmul(XPY,XPY.transpose())
    AT = A.copy().transpose()
    YM = np.outer(ydual_train[1:],ydual_train[1:])
    AY = np.outer(A[0,1:],ydual_train[1:])
    YA = np.outer(ydual_train[1:],A[0,1:])
    Y0S = ydual_train[0]**2
    M = AT[1:,1:] + A[0,0]*YM/Y0S - AY/ydual_train[0] - YA/ydual_train[0]
    b = np.zeros(N-1)
    b = 1 - ydual_train[1:]/ydual_train[0]
    aw = np.zeros(N)
    for i in range(2,N):
        aw[i] = (1-ydual_train[i]/ydual_train[0])/(A[i,i] + A[0,0]*ydual_train[i]**2/ydual_train[0]**2 
                                                   - 2*A[0,i]*ydual_train[i]/ydual_train[0])
    aw[0] = -sum(ydual_train[1:]*aw[1:])/ydual_train[0]
    YA   = ydual_train*aw

    wght = sum(xdual_train * YA[:,None])
    b =sum(ydual_train - np.matmul(xdual_train,wght))/N
    
    df.iloc[0,0:-1] = wght
    df.iloc[0,-1] = b
        
    return df

In [42]:
def getBinaryAccuracy(x_test, y_test, SVM):
    testSetAcc = []
    for j in range(x_test.shape[0]):
        pred = (SVM.iloc[0,:-1].values@x_test[j]) + SVM.iloc[0,-1] 
        
        testSetAcc.append(np.sign([pred]) == y_test[j])
    
    return sum(testSetAcc)/len(testSetAcc)
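
Because `np.sign([pred])` wraps each prediction in a one-element array, `getBinaryAccuracy` returns a one-element array, which is why the cells below index the result with `[0]`. A vectorized sketch that returns a plain scalar instead is shown here. `binary_accuracy` is a hypothetical helper and the numbers are toy values.

```python
import numpy as np

def binary_accuracy(x, y, w, b):
    # Sign of the decision value w.x + b gives the predicted label (+1 or -1)
    pred = np.sign(x @ w + b)
    return np.mean(pred == y)

# Toy classifier and samples (hypothetical)
w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 1.0], [0.0, 3.0], [1.0, 0.0], [0.0, 1.0]])
y = np.array([1, -1, -1, -1])

acc = binary_accuracy(X, y, w, b)
print(acc)  # 0.75: the third sample scores +1 but is labeled -1
```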

In [43]:
#make new x and y for whatever 2-class dataset we want
binX, binY = breakDataSetBinary(x,y, 2, 3)
binX.shape, binY.shape

((220, 7), (220,))

In [44]:
x_train_bin, x_test_bin, y_train_bin, y_test_bin = train_test_split(binX, binY, train_size = .8,shuffle = True, random_state = 4)

### Sklearn implementations

In [45]:
#Linear SVM
linearSVC = LinearSVC(dual="auto")
linearSVC.fit(x_train_bin,y_train_bin)
y_pred = linearSVC.predict(x_test_bin)
"sklearn Linear SVM Accuracy: " , accuracy_score(y_pred,y_test_bin)

('sklearn Linear SVM Accuracy: ', 0.9772727272727273)

In [46]:
#Linear Kernel SVM
linearKernelSVC = SVC(kernel = "linear")
linearKernelSVC.fit(x_train_bin,y_train_bin)
y_pred = linearKernelSVC.predict(x_test_bin)
"sklearn Linear Kernel SVM Accuracy: " , accuracy_score(y_pred,y_test_bin)

('sklearn Linear Kernel SVM Accuracy: ', 0.9772727272727273)

In [47]:
#Polynomial Kernel SVM
polySVC = SVC(kernel = "poly")
polySVC.fit(x_train_bin,y_train_bin)
y_pred = polySVC.predict(x_test_bin)
"sklearn Polynomial SVM Accuracy: " , accuracy_score(y_pred,y_test_bin)

('sklearn Polynomial SVM Accuracy: ', 0.9772727272727273)

In [48]:
#RBF Kernel SVM
rbfSVC = SVC(kernel = "rbf")
rbfSVC.fit(x_train_bin,y_train_bin)
y_pred = rbfSVC.predict(x_test_bin)
"sklearn RBF SVM Accuracy: " , accuracy_score(y_pred,y_test_bin)

('sklearn RBF SVM Accuracy: ', 0.9772727272727273)

### Linear (No Kernel) Binary SVM

In [49]:
binarySVM = getSVMBinary(x_train_bin, y_train_bin)
binarySVM

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Bias |
|---|---|---|---|---|---|---|---|---|
| 0 | 114.44046 | 8.256893 | -2.020606e-14 | -4.507505e-14 | 30.061676 | 174.304006 | 152.11576 | -226.904196 |


In [50]:
acc = getBinaryAccuracy(x_test_bin, y_test_bin, binarySVM)[0]
"Accuracy: " , acc

('Accuracy: ', 0.9772727272727273)

### Linear Kernel Binary SVM

In [51]:
x_train_binLin = (x_train_bin@x_train_bin.T +1)

In [52]:
x_test_binLin = (x_test_bin@x_train_bin.T +1)

In [53]:
binaryLinSVM = getSVMBinary(x_train_binLin, y_train_bin)
binaryLinSVM

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_167 | Weight_168 | Weight_169 | Weight_170 | Weight_171 | Weight_172 | Weight_173 | Weight_174 | Weight_175 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.776165 | 1.568457 | 1.127405 | 1.184949 | 1.25222 | 1.156369 | 1.512408 | 1.243416 | 2.637813 | 0.976 | ... | 1.310995 | 2.269961 | 1.40919 | 1.086981 | 1.39031 | 2.036696 | 1.216645 | 2.397667 | 1.052194 | -748.535817 |


In [54]:
acc = getBinaryAccuracy(x_test_binLin, y_test_bin, binaryLinSVM)[0]
"Accuracy: " , acc

('Accuracy: ', 0.9545454545454546)

### Polynomial Kernel Binary SVM

In [55]:
x_train_binPoly = (x_train_bin@x_train_bin.T +1 )**2

In [56]:
x_test_binPoly = (x_test_bin@x_train_bin.T +1 )** 2

In [57]:
binaryPolySVM = getSVMBinary(x_train_binPoly, y_train_bin)
binaryPolySVM

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_167 | Weight_168 | Weight_169 | Weight_170 | Weight_171 | Weight_172 | Weight_173 | Weight_174 | Weight_175 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.52576 | 0.234171 | 0.152427 | 0.159653 | 0.177088 | 0.159331 | 0.224659 | 0.170257 | 0.494367 | 0.12192 | ... | 0.186919 | 0.400485 | 0.202948 | 0.14208 | 0.199616 | 0.3514 | 0.166262 | 0.433306 | 0.140494 | -333.53882 |


In [58]:
acc = getBinaryAccuracy(x_test_binPoly, y_test_bin, binaryPolySVM)[0]
"Accuracy: " , acc

('Accuracy: ', 1.0)

### RBF Kernel Binary SVM

In [59]:
#change x_TRAIN into the kernel form
gamma = 1

m1 = x_train_bin.shape[0]
m2 = x_train_bin.shape[0]
K = np.zeros((m1, m2))  
for i in range(m1):
    for j in range(m2):
        K[i,j] = np.exp( -gamma * np.linalg.norm(x_train_bin[i,:] - x_train_bin[j,:])**2 )
        
x_train_binRBF = K 
x_train_binRBF.shape

(176, 176)

In [60]:
#change x_TEST into the kernel form
gamma = 1

m1 = x_test_bin.shape[0]
m2 = x_train_bin.shape[0]
K = np.zeros((m1, m2))  
for i in range(m1):
    for j in range(m2):
        K[i,j] = np.exp( -gamma * np.linalg.norm(x_test_bin[i,:] - x_train_bin[j,:])**2 )
        
x_test_binRBF = K 
x_test_binRBF.shape

(44, 176)

In [61]:
binaryRBFSVM = getSVMBinary(x_train_binRBF, y_train_bin)
binaryRBFSVM

|   | Weight_0 | Weight_1 | Weight_2 | Weight_3 | Weight_4 | Weight_5 | Weight_6 | Weight_7 | Weight_8 | Weight_9 | ... | Weight_167 | Weight_168 | Weight_169 | Weight_170 | Weight_171 | Weight_172 | Weight_173 | Weight_174 | Weight_175 | Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.205387 | -2.243984 | -3.838294 | -3.877483 | -3.657845 | -3.933399 | -2.453572 | -3.638815 | 3.42117 | -4.347771 | ... | -3.369824 | 1.782538 | -2.968907 | -4.137613 | -3.104279 | 0.423154 | -3.524469 | 2.413333 | -4.1625 | 253.240474 |


In [62]:
acc = getBinaryAccuracy(x_test_binRBF, y_test_bin, binaryRBFSVM)[0]
"Accuracy: " , acc

('Accuracy: ', 0.9772727272727273)