## ___8E: Implementing Decision Function of SVM RBF Kernel___

<font face=' Comic Sans MS' size=3>After we train a kernel SVM model, we get the support vectors and their corresponding coefficients $\alpha_{i}$.

Check the documentation for a better understanding of these attributes:

https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
<img src='https://i.imgur.com/K11msU4.png' width=500>

The task is to implement the ```decision_function()``` of a kernel SVM. The model classifies a data point as either positive or negative based on the value returned by ```decision_function()```.

Ex 1: In Logistic Regression, after training the model we get the optimal weights $w$ and compute the value $\frac{1}{1+\exp(-(wx+b))}$. If this value is < 0.5 we mark the point as the negative class; otherwise it is the positive class.
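
As a minimal sketch of the thresholding rule in Ex 1, the snippet below applies the sigmoid to $wx+b$ and cuts at 0.5. The weights, bias, and data points are hypothetical values chosen only for illustration, and `logistic_predict` is an illustrative helper, not part of any library.

```python
import numpy as np

def logistic_predict(X, w, b, threshold=0.5):
    """Return class labels (0/1) and sigmoid outputs for the rows of X."""
    z = X @ w + b                      # linear score w.x + b for each row
    prob = 1.0 / (1.0 + np.exp(-z))    # sigmoid of the score
    return (prob >= threshold).astype(int), prob

# Hypothetical weights and points, for illustration only
w = np.array([2.0, -1.0])
b = 0.5
X = np.array([[1.0, 1.0],     # score =  1.5, sigmoid > 0.5
              [-1.0, 2.0]])   # score = -3.5, sigmoid < 0.5

labels, prob = logistic_predict(X, w, b)
print(labels)  # [1 0]
```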

Ex 2: In Linear SVM, after training the model we get the optimal weights $w$ and compute the value of $sign(wx+b)$. If this value is negative we mark the point as the negative class; otherwise it is the positive class.
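
The rule in Ex 2 can be checked against scikit-learn with a short sketch. It assumes a small synthetic dataset (`X_demo`, `y_demo`) and an `SVC` with `kernel="linear"`, which exposes the learned weights as `coef_`; the names and settings here are illustrative assumptions, not the ones used in this assignment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic problem, used only for this illustration
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)
lin_clf = SVC(kernel="linear", C=1.0).fit(X_demo, y_demo)

w = lin_clf.coef_[0]          # weight vector (available for the linear kernel)
b = lin_clf.intercept_[0]     # bias term

scores = X_demo @ w + b                   # w.x + b for every point
labels = np.where(scores < 0, 0, 1)       # negative score -> class 0, else class 1

# Compare the manual scores and labels with scikit-learn's own outputs
print(np.allclose(scores, lin_clf.decision_function(X_demo)))
print(np.mean(labels == lin_clf.predict(X_demo)))
```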

Similarly, in Kernel SVM, after training the model we get the coefficients $\alpha_{i}$ and compute the value of
$sign(\sum_{i=1}^{n}(y_{i}\alpha_{i}K(x_{i},x_{q})) + intercept)$, where the sum runs over the $n$ support vectors $x_{i}$ and $K(x_{i},x_{q})$ is the RBF kernel. If this value is negative we mark $x_{q}$ as the negative class; otherwise it is the positive class.

The RBF kernel is defined as: $K(x_{i},x_{q}) = \exp(-\gamma ||x_{i} - x_{q}||^2)$
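
A minimal sketch of this kernel for a single pair of points is shown below. The helper name `rbf` and the two example vectors are assumptions made for illustration; the result is compared with `sklearn.metrics.pairwise.rbf_kernel` as a sanity check.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf(xi, xq, gamma):
    """RBF kernel value exp(-gamma * ||xi - xq||^2) for two 1-D vectors."""
    diff = xi - xq
    return np.exp(-gamma * np.dot(diff, diff))

# Hypothetical points: squared distance is 1 + 4 + 0 = 5
xi = np.array([1.0, 2.0, 3.0])
xq = np.array([2.0, 0.0, 3.0])
gamma = 0.001

manual = rbf(xi, xq, gamma)   # exp(-0.005)
reference = rbf_kernel(xi.reshape(1, -1), xq.reshape(1, -1), gamma=gamma)[0, 0]
print(np.isclose(manual, reference))  # True
```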

For a better understanding, check this link: https://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation
</font>

> 1. Split the data into $X_{train}$(60), $X_{cv}$(20), $X_{test}$(20)

> 2. Train $SVC(gamma=0.001, C=100.)$ on the ($X_{train}$, $y_{train}$)

> 3. Get the decision boundary values $f_{cv}$ on the $X_{cv}$ data, i.e. $f_{cv}$ ```= decision_function(```$X_{cv}$```)```
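
One way to obtain the 60/20/20 proportions from step 1 is to call `train_test_split` twice, as sketched below. The variable names (`X_demo`, `X_tr`, `X_cv_demo`, `X_te`) are illustrative assumptions. Note that the notebook's own split cell uses `test_size = 0.2` followed by `test_size = 0.25`, which gives 4000/750/250 rows rather than 60/20/20.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data of the same size as in this notebook (illustrative copy)
X_demo, y_demo = make_classification(n_samples=5000, n_features=5,
                                     n_redundant=2, random_state=15)

# First hold out 40% of the data, then split that half-and-half into CV and test
X_tr, X_rest, y_tr, y_rest = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=0)
X_cv_demo, X_te, y_cv_demo, y_te = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

print(X_tr.shape, X_cv_demo.shape, X_te.shape)  # (3000, 5) (1000, 5) (1000, 5)
```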

In [1]:
import warnings
warnings.filterwarnings("ignore")

%matplotlib inline
import matplotlib.pyplot as plt

import numpy as np
import pandas as pd
import math

from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

In [2]:
X, y = make_classification(n_samples = 5000, n_features = 5, n_redundant = 2, n_classes = 2, 
                           weights = [0.7], class_sep = 0.7, random_state = 15)

In [3]:
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size = 0.2, random_state = 0)
Xcv, Xtest, ycv, ytest = train_test_split(Xtest, ytest, test_size = 0.25, random_state = 0)

print(Xtrain.shape, ytrain.shape, Xtest.shape, ytest.shape)
print(Xtest.shape, ytest.shape, Xcv.shape, ycv.shape)

(4000, 5) (4000,) (250, 5) (250,)
(250, 5) (250,) (750, 5) (750,)


In [4]:
clf = SVC(kernel = "rbf", random_state = 0, decision_function_shape = 'ovo', C = 100, gamma = 0.001)

clf = clf.fit(Xtrain, ytrain)  # train the model with the chosen C and gamma

In [5]:
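sup_vectors = clf.support_vectors_  # support vectors x_i, shape (n_SV, n_features)
dual_coefs = clf.dual_coef_         # y_i * alpha_i per support vector, shape (1, n_SV)
intercept = clf.intercept_          # intercept term, shape (1,)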
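
To see what these three attributes hold, the sketch below fits a separate small model (`demo_clf`, on an illustrative synthetic dataset) and prints their shapes. For a binary problem, `dual_coef_` stores the products $y_{i}\alpha_{i}$, so its absolute values should not exceed `C`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small illustrative dataset; not the data used in the cells above
X_demo, y_demo = make_classification(n_samples=500, n_features=5,
                                     n_redundant=2, random_state=15)
C = 100.0
demo_clf = SVC(kernel="rbf", C=C, gamma=0.001).fit(X_demo, y_demo)

n_sv = demo_clf.support_vectors_.shape[0]
print(demo_clf.support_vectors_.shape)  # (n_sv, n_features)
print(demo_clf.dual_coef_.shape)        # (1, n_sv) for a binary problem
print(demo_clf.intercept_.shape)        # (1,)

# Each entry of dual_coef_ is y_i * alpha_i with 0 < alpha_i <= C
print(np.abs(demo_clf.dual_coef_).max() <= C + 1e-8)
```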

#### Pseudocode

clf = SVC(gamma=0.001, C=100.)<br>
clf.fit(Xtrain, ytrain)

<font color='green'>def</font> <font color='blue'>decision_function</font>(Xcv, ...): #use appropriate parameters <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color='green'>for</font> a data point $x_q$ <font color='green'>in</font> Xcv: <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color='grey'>#write code to implement $(\sum_{i=1}^{\text{all the support vectors}}(y_{i}\alpha_{i}K(x_{i},x_{q})) + intercept)$, here the values $y_i$, $\alpha_{i}$, and $intercept$ can be obtained from the trained model</font><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color='green'>return</font> <font color='grey'><i># the decision_function output for all the data points in the Xcv</i></font>
    
fcv = decision_function(Xcv, ...)  <i># based on your requirement you can pass any other parameters </i>

<b>Note</b>: Make sure the values you get as fcv are equal to the outputs of clf.decision_function(Xcv).


In [6]:
##https://stackoverflow.com/questions/28503932/calculating-decision-function-of-svm-manually

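def decision_function(xcv, gamma):
    predict = []   # predicted class labels (0 or 1)
    decision = []  # raw decision values
    for xq in xcv:
        dec_func = 0
        for j in range(len(sup_vectors)):
            norm2 = np.linalg.norm(sup_vectors[j, :] - xq) ** 2  # squared distance ||x_j - x_q||^2
            # dual_coefs[0, j] already holds y_j * alpha_j; np.exp(-gamma * norm2) is K(x_j, x_q)
            dec_func = dec_func + dual_coefs[0, j] * np.exp(-gamma * norm2)
        dec_func += intercept  # intercept has shape (1,), so dec_func becomes a length-1 array
        decision.append(dec_func)
        if (dec_func) < 0:  # negative decision value -> class 0, otherwise class 1
            predict.append(0)
        else:
            predict.append(1)
    return np.array(predict), decision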

In [7]:
gamma = 0.001
# The function returns (predicted labels, decision values); fcv holds the decision values
ypred_cv, fcv = decision_function(Xcv, gamma)

In [8]:
print(np.array(fcv).T)

[[-4.09306149 -3.10741695 -0.15431067 -3.44147384  1.86469272 -1.88997036
  -1.42578578 -4.05530658 -2.84238853  1.58879661 -2.63347353  0.7584309
  -3.37137513 -3.10451597 -1.76293891 -2.63286738  2.08622943  1.57270623
  -4.72587379 -2.039124   -2.65331918 -3.16248379 -2.26994232 -0.67145105
  -2.86212103 -1.95766344 -2.40206476 -0.52337199 -1.71394028  0.66718224
   0.44739901 -2.52668203 -1.83627576 -2.54908278 -0.27646713 -2.52298837
   2.53077636 -2.00177175 -2.7519223  -1.32526862 -2.12564668 -1.68092689
  -2.39127684 -2.03974776 -2.61426086 -1.74981859 -2.57301876  1.31656057
  -1.82116347 -2.72951822 -3.47134478  1.09675949 -1.95918044 -2.58705688
  -3.20570939  2.82394275 -0.77192207  1.80971227 -2.45156235 -2.86515727
   1.89308139 -2.4951646   1.56220677 -2.78604892  1.79079978 -2.16248874
   1.6163942  -0.44311935 -3.37915061  1.60924309 -2.28744027 -2.86129996
   1.33534717  1.74089224 -1.29141409 -0.33729421 -2.41377324 -3.92789179
  -3.66555007 -2.14994175  0.98464596  

In [9]:
clf.decision_function(Xcv)

array([-4.09306149, -3.10741695, -0.15431067, -3.44147384,  1.86469272,
       -1.88997036, -1.42578578, -4.05530658, -2.84238853,  1.58879661,
       -2.63347353,  0.7584309 , -3.37137513, -3.10451597, -1.76293891,
       -2.63286738,  2.08622943,  1.57270623, -4.72587379, -2.039124  ,
       -2.65331918, -3.16248379, -2.26994232, -0.67145105, -2.86212103,
       -1.95766344, -2.40206476, -0.52337199, -1.71394028,  0.66718224,
        0.44739901, -2.52668203, -1.83627576, -2.54908278, -0.27646713,
       -2.52298837,  2.53077636, -2.00177175, -2.7519223 , -1.32526862,
       -2.12564668, -1.68092689, -2.39127684, -2.03974776, -2.61426086,
       -1.74981859, -2.57301876,  1.31656057, -1.82116347, -2.72951822,
       -3.47134478,  1.09675949, -1.95918044, -2.58705688, -3.20570939,
        2.82394275, -0.77192207,  1.80971227, -2.45156235, -2.86515727,
        1.89308139, -2.4951646 ,  1.56220677, -2.78604892,  1.79079978,
       -2.16248874,  1.6163942 , -0.44311935, -3.37915061,  1.60

___The manually computed decision values are the same as the outputs of clf.decision_function(Xcv).___
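
Comparing printed arrays by eye is error-prone, so a programmatic check with `np.allclose` may be preferable. The sketch below is one possible vectorized variant of the same formula, not the implementation used above; `rbf_decision_function`, `demo_clf`, and the small synthetic dataset are assumptions made so the snippet runs on its own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def rbf_decision_function(X_query, sup_vectors, dual_coefs, intercept, gamma):
    """Compute sum_i (y_i * alpha_i * K(x_i, x_q)) + intercept for every row of X_query."""
    # Squared distances between each query point and each support vector: (n_query, n_sv)
    sq_dists = ((X_query[:, None, :] - sup_vectors[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-gamma * sq_dists)            # RBF kernel matrix
    return K @ dual_coefs[0] + intercept[0]  # 1-D array of decision values

# Illustrative data and model, separate from the notebook's own variables
X_demo, y_demo = make_classification(n_samples=500, n_features=5,
                                     n_redundant=2, random_state=15)
gamma = 0.001
demo_clf = SVC(kernel="rbf", C=100.0, gamma=gamma).fit(X_demo[:400], y_demo[:400])

X_query = X_demo[400:]
f_manual = rbf_decision_function(X_query, demo_clf.support_vectors_,
                                 demo_clf.dual_coef_, demo_clf.intercept_, gamma)

# Numerical comparison with scikit-learn's own decision values
print(np.allclose(f_manual, demo_clf.decision_function(X_query)))
```

Broadcasting builds an array of shape (n_query, n_SV, n_features), so for very large inputs the explicit loop or a chunked version may use less memory.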