Q1. What is the mathematical formula for a linear SVM?

The mathematical formula for a linear Support Vector Machine (SVM) is:


f(x) = sign(w^T x + b)

where x is the input data point, w is the weight vector, b is the bias term, and sign is the sign function. The goal of the linear SVM is to find the optimal w and b such that the decision boundary separates the positive and negative examples with the maximum margin.
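
The decision rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not a trained model: the values of `w` and `b` are hypothetical and chosen by hand, and the helper name `linear_svm_predict` is an assumption made for this example.

```python
import numpy as np

def linear_svm_predict(X, w, b):
    """Return +1 or -1 for each row of X using f(x) = sign(w^T x + b)."""
    scores = X @ w + b
    # A score of exactly 0 lies on the boundary; assign it to +1 by convention.
    return np.where(scores >= 0, 1, -1)

# Hypothetical parameters, picked by hand for illustration (not learned).
w = np.array([1.0, -1.0])
b = -0.5

X = np.array([[2.0, 0.0],   # score = 2 - 0 - 0.5 =  1.5
              [0.0, 1.0],   # score = 0 - 1 - 0.5 = -1.5
              [1.0, 1.0]])  # score = 1 - 1 - 0.5 = -0.5

preds = linear_svm_predict(X, w, b)
print(preds)  # [ 1 -1 -1]
```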

Q2. What is the objective function of a linear SVM? Explain with example.

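A linear SVM seeks the decision boundary with the largest margin, that is, the largest distance to the closest data points from each class. The margin width equals 2/||w||, so maximizing it is equivalent to minimizing ||w||. For linearly separable data (the hard-margin case), the objective function is given by: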


minimize 1/2 ||w||^2 subject to y_i(w^T x_i + b) >= 1 for all i

where y_i is the label of the ith data point (+1 for positive examples and -1 for negative examples), x_i is the feature vector of the ith data point, w is the weight vector, b is the bias term, and ||w|| is the Euclidean norm of w.
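
As a rough numerical check of this formulation, the sketch below fits scikit-learn's `SVC` with a linear kernel on a tiny, hand-made separable dataset and inspects the constraint values y_i(w^T x_i + b). The dataset is hypothetical, and using a very large `C` to approximate the hard-margin problem is an assumption of this example.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable classes in 2-D.
X = np.array([[0.0, 0.0], [-1.0, 1.0], [-1.0, -1.0],
              [2.0, 0.0], [3.0, 1.0], [3.0, -1.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# which approximates the hard-margin objective.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w = clf.coef_[0]
b = clf.intercept_[0]

# Each training point should satisfy y_i (w^T x_i + b) >= 1 (up to solver tolerance).
functional_margins = y * (X @ w + b)
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("y_i (w^T x_i + b):", functional_margins)
print("margin width 2/||w||:", margin_width)
```

On this data the closest points of the two classes are (0, 0) and (2, 0), so the margin width should come out close to 2.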

Q3. What is the kernel trick in SVM?

The kernel trick is a method used to extend the linear SVM to handle nonlinear decision boundaries. Conceptually, the original input data is mapped to a higher-dimensional feature space by a nonlinear feature map, and a linear SVM in that space corresponds to a nonlinear decision boundary in the original space. A kernel function K(x, z) returns the inner product between the mapped feature vectors directly from the original inputs, without computing the mapped vectors themselves. Because SVM training and prediction only need these inner products, the decision boundary can be represented in the higher-dimensional space without explicitly computing the mapping. The most commonly used kernels are the polynomial kernel, the radial basis function (RBF) kernel, and the sigmoid kernel.
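
A small worked example may make this concrete. For 2-D inputs, the degree-2 polynomial kernel K(x, z) = (x . z)^2 equals the inner product of the explicit feature vectors phi(x) = (x1^2, sqrt(2) x1 x2, x2^2). The function names `phi` and `poly2_kernel` are hypothetical helpers introduced for this sketch.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the homogeneous degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly2_kernel(x, z):
    """Kernel value computed in the original 2-D space, with no explicit mapping."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = np.dot(phi(x), phi(z))  # inner product in the 3-D feature space
implicit = poly2_kernel(x, z)      # same quantity via the kernel: (1*3 + 2*4)^2

print(explicit, implicit)  # both are 121 up to floating-point rounding
```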

Q4. What is the role of support vectors in SVM? Explain with example.

In SVM, support vectors are the training points that lie closest to the decision boundary, on or inside the margin. They determine the position and orientation of the decision boundary: the learned weight vector is a combination of the support vectors only, so removing any other training point and retraining leaves the boundary unchanged. The support vectors also define the margin, which is the distance between the decision boundary and the closest data points from each class. If a support vector is removed, the position of the decision boundary may change.

For example, suppose we have a dataset with two classes that are not linearly separable in two dimensions. By applying a kernel function, we can map the data points to a higher-dimensional space where they are linearly separable. In this new space, the support vectors are the data points that lie closest to the decision boundary. Figure 1 shows an example of a nonlinear decision boundary obtained using a polynomial kernel. The support vectors are shown as red and blue circles.

Figure 1: Example of a nonlinear decision boundary obtained using a polynomial kernel. The support vectors are shown as red and blue circles.
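
The role of the support vectors can also be checked numerically. The sketch below uses a linear kernel rather than the polynomial kernel of Figure 1, on a hypothetical toy dataset, with a large `C` assumed to approximate a hard margin. It lists the support vectors found by scikit-learn and then refits after dropping one point that is not a support vector.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable classes in 2-D.
X = np.array([[0.0, 0.0], [-1.0, 1.0], [-1.0, -1.0],
              [2.0, 0.0], [3.0, 1.0], [3.0, -1.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
print("support vector indices:", clf.support_)
print("support vectors:\n", clf.support_vectors_)

# Drop one training point that is NOT a support vector and refit.
non_sv = np.setdiff1d(np.arange(len(X)), clf.support_)
keep = np.delete(np.arange(len(X)), non_sv[0])
clf_reduced = SVC(kernel="linear", C=1e6).fit(X[keep], y[keep])

# The boundary parameters should agree up to solver tolerance.
print("full data:    w =", clf.coef_[0], "b =", clf.intercept_[0])
print("reduced data: w =", clf_reduced.coef_[0], "b =", clf_reduced.intercept_[0])
```

Dropping one of the support vectors instead would generally move the boundary, which is the asymmetry described in the answer above.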

Q6. SVM Implementation through Iris dataset.


- Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.

- Train a linear SVM classifier on the training set and predict the labels for the testing set.

- Compute the accuracy of the model on the testing set.

- Plot the decision boundaries of the trained model using two of the features.

- Try different values of the regularisation parameter C and see how it affects the performance of the model.

In [1]:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
iris = sns.load_dataset('iris')

In [3]:
iris.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa


In [4]:
# Encode the three species names as integer labels 0, 1, 2
species_names = iris['species'].unique()
label_map = {name: code for code, name in enumerate(species_names)}
iris['species'] = iris['species'].map(label_map)

In [5]:
iris['species'].value_counts()

0    50
1    50
2    50
Name: species, dtype: int64

In [6]:
X = iris.iloc[:,:-1]
y = iris.iloc[:,-1]

In [7]:
X,y

(     sepal_length  sepal_width  petal_length  petal_width
 0             5.1          3.5           1.4          0.2
 1             4.9          3.0           1.4          0.2
 2             4.7          3.2           1.3          0.2
 3             4.6          3.1           1.5          0.2
 4             5.0          3.6           1.4          0.2
 ..            ...          ...           ...          ...
 145           6.7          3.0           5.2          2.3
 146           6.3          2.5           5.0          1.9
 147           6.5          3.0           5.2          2.0
 148           6.2          3.4           5.4          2.3
 149           5.9          3.0           5.1          1.8
 
 [150 rows x 4 columns],
 0      0
 1      0
 2      0
 3      0
 4      0
       ..
 145    2
 146    2
 147    2
 148    2
 149    2
 Name: species, Length: 150, dtype: int64)

In [8]:
## Split data into training and test set
from sklearn.model_selection import train_test_split

In [9]:
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=42,test_size=0.25)

In [10]:
# Train a linear SVM classifier on the training set and predict the labels for the testing set
from sklearn.svm import SVC

In [11]:
# Note: SVC() defaults to kernel='rbf'; pass kernel='linear' for a strictly linear SVM
svc = SVC()

In [12]:
svc.fit(X_train,y_train)

In [13]:
y_pred = svc.predict(X_test)

In [14]:
# Compute the accuracy of the model on the testing set

In [15]:
from sklearn.metrics import classification_report,confusion_matrix,accuracy_score

In [16]:
accuracy_score(y_test, y_pred)  # scikit-learn metrics take (y_true, y_pred)

1.0

In [17]:
confusion_matrix(y_test, y_pred)  # rows = true labels, columns = predicted labels


array([[15,  0,  0],
       [ 0, 11,  0],
       [ 0,  0, 12]])

In [18]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       1.00      1.00      1.00        11
           2       1.00      1.00      1.00        12

    accuracy                           1.00        38
   macro avg       1.00      1.00      1.00        38
weighted avg       1.00      1.00      1.00        38
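
The task list asks for a plot of the decision boundaries using two of the features. One possible way to do this is sketched below. It assumes scikit-learn's bundled `load_iris` copy of the data and the two petal features, which are illustrative choices rather than something taken from the cells above, and the names `X2`, `svc_2d`, `xx`, `yy` and `Z` are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: use scikit-learn's bundled iris data and keep only two features.
iris_data = load_iris()
X2 = iris_data.data[:, 2:4]  # petal length, petal width
y2 = iris_data.target

X2_train, X2_test, y2_train, y2_test = train_test_split(
    X2, y2, random_state=42, test_size=0.25
)

# Fit a linear-kernel SVM on the two selected features only.
svc_2d = SVC(kernel="linear").fit(X2_train, y2_train)

# Evaluate the classifier on a dense grid covering the feature range.
x_min, x_max = X2[:, 0].min() - 0.5, X2[:, 0].max() + 0.5
y_min, y_max = X2[:, 1].min() - 0.5, X2[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                     np.linspace(y_min, y_max, 200))
Z = svc_2d.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Shade the predicted class regions and overlay the training points.
fig, ax = plt.subplots(figsize=(6, 4))
ax.contourf(xx, yy, Z, alpha=0.3, cmap="viridis")
ax.scatter(X2_train[:, 0], X2_train[:, 1], c=y2_train,
           cmap="viridis", edgecolors="k")
ax.set_xlabel("petal_length")
ax.set_ylabel("petal_width")
ax.set_title("Linear SVM decision regions (two features)")
```

In a notebook the figure renders inline; in a script, `plt.show()` would be called at the end. Since the model here sees only two features, its accuracy can differ from the four-feature model above.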



In [19]:
# Try different values of the regularisation parameter C and see how it affects the performance of the model.

In [20]:
parameters = {
    'C': [0.1, 1, 10, 100, 1000],
    'gamma': [1, 0.1, 0.01, 0.001, 0.0001],  # used by 'rbf'; ignored by 'linear'
    'kernel': ['rbf', 'linear'],
}

In [21]:
from sklearn.model_selection import GridSearchCV

In [22]:
grid = GridSearchCV(SVC(), param_grid=parameters, cv=5, verbose=3, scoring='accuracy')

In [23]:
grid.fit(X_train,y_train)

Fitting 5 folds for each of 50 candidates, totalling 250 fits
[CV 1/5] END ........C=0.1, gamma=1, kernel=rbf;, score=1.000 total time=   0.0s
[CV 2/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.957 total time=   0.0s
[CV 3/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.818 total time=   0.0s
[CV 4/5] END ........C=0.1, gamma=1, kernel=rbf;, score=1.000 total time=   0.0s
[CV 5/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.955 total time=   0.0s
[CV 1/5] END .....C=0.1, gamma=1, kernel=linear;, score=1.000 total time=   0.0s
[CV 2/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.957 total time=   0.0s
[CV 3/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.818 total time=   0.0s
[CV 4/5] END .....C=0.1, gamma=1, kernel=linear;, score=1.000 total time=   0.0s
[CV 5/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.955 total time=   0.0s
[CV 1/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=1.000 total time=   0.0s
[CV 2/5] END ......C=0.1, gamma=0.1, kernel=rbf

In [24]:
grid.best_params_

{'C': 100, 'gamma': 0.01, 'kernel': 'rbf'}

In [25]:
grid.best_estimator_

In [26]:
y_pred = grid.predict(X_test)

In [27]:
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

[[15  0  0]
 [ 0 11  0]
 [ 0  0 12]]
1.0
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       1.00      1.00      1.00        11
           2       1.00      1.00      1.00        12

    accuracy                           1.00        38
   macro avg       1.00      1.00      1.00        38
weighted avg       1.00      1.00      1.00        38
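
The grid search varies `C`, `gamma` and the kernel together, so the effect of `C` alone is hard to read off. A minimal sketch that isolates it for a linear kernel follows. It assumes scikit-learn's bundled `load_iris` data with the same split settings as above, and the names `X_all`, `Xtr`, `Xte` and `results` are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled iris data, same split settings as above.
X_all, y_all = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(
    X_all, y_all, random_state=42, test_size=0.25
)

results = []
for C in [0.01, 0.1, 1, 10, 100]:
    model = SVC(kernel="linear", C=C).fit(Xtr, ytr)
    results.append({
        "C": C,
        "train_acc": model.score(Xtr, ytr),          # accuracy on the training split
        "test_acc": model.score(Xte, yte),           # accuracy on the held-out split
        "n_support": int(model.n_support_.sum()),    # total number of support vectors
    })

for row in results:
    print(f"C={row['C']:<6} train_acc={row['train_acc']:.3f} "
          f"test_acc={row['test_acc']:.3f} n_support={row['n_support']}")
```

A smaller `C` tolerates more margin violations (stronger regularisation), which typically shows up as more support vectors; a larger `C` penalises violations more heavily and fits the training set more tightly.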

