Q1. What is the mathematical formula for a linear SVM?

A linear SVM finds a hyperplane that separates the data into two classes. The mathematical formula for a linear SVM is:

f(x) = sign(w · x + b)

where f(x) is the predicted class for a given input x, w is the weight vector, b is the bias term, and · denotes the dot product.
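The decision rule above can be sketched in a few lines of plain Python. The weights, bias, and test points below are hypothetical values chosen only for illustration, and assigning a score of exactly 0 to the positive class is an assumption of this sketch.

```python
def predict(w, b, x):
    """Return +1 or -1 according to sign(w . x + b)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Tie-breaking assumption: a point exactly on the hyperplane gets +1.
    return 1 if score >= 0 else -1

# Hypothetical 2-D weight vector and bias.
w = [2.0, -1.0]
b = -1.0

print(predict(w, b, [3.0, 1.0]))  # score = 6 - 1 - 1 = 4   -> 1
print(predict(w, b, [0.0, 2.0]))  # score = 0 - 2 - 1 = -3  -> -1
```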

Q2. What is the objective function of a linear SVM?

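The objective of a linear SVM is to find the hyperplane w · x + b = 0 in an N-dimensional feature space (N is the number of features) that separates the two classes with the largest possible margin. Many hyperplanes may separate the data; the SVM picks the one that maximizes the margin 2/||w||, which is equivalent to minimizing (1/2)||w||^2 subject to y_i (w · x_i + b) >= 1 for every training point (x_i, y_i), with labels y_i in {-1, +1}.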
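As a rough numerical illustration of choosing among several separating hyperplanes, the sketch below uses the standard hard-margin criterion: among hyperplanes that satisfy y_i (w · x_i + b) >= 1 for every point, prefer the one with the smallest ||w||, that is, the widest margin 2/||w||. The toy points and the two candidate (w, b) pairs are made up for the example.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def is_feasible(w, b, X, y, tol=1e-9):
    """True if every point satisfies y_i * (w . x_i + b) >= 1."""
    return all(yi * (dot(w, xi) + b) >= 1 - tol for xi, yi in zip(X, y))

def margin_width(w):
    """Distance between the two margin boundaries, 2 / ||w||."""
    return 2.0 / math.sqrt(dot(w, w))

# Hypothetical, linearly separable toy data.
X = [(2, 2), (3, 3), (0, 0), (-1, 0)]
y = [1, 1, -1, -1]

# Two hypothetical candidate hyperplanes; both separate the data.
candidates = {"A": ((1.0, 1.0), -2.0), "B": ((0.5, 0.5), -1.0)}
for name, (w, b) in candidates.items():
    print(name, is_feasible(w, b, X, y), round(margin_width(w), 3))
# A True 1.414
# B True 2.828
```

Both candidates classify every point correctly, but B has the smaller ||w|| and therefore the wider margin, so it is the one the SVM objective prefers.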

Q3. What is the kernel trick in SVM?

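The kernel trick lets an SVM work in a high-dimensional feature space without ever computing the mapping φ(x) explicitly. The SVM only needs inner products φ(x) · φ(z) between mapped points, so it is enough to choose a kernel function K(x, z) that returns this inner product directly from the original data points.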
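A small check makes this concrete. The sketch below uses the degree-2 homogeneous polynomial kernel K(x, z) = (x · z)^2 for 2-D inputs and compares it with the inner product of an explicit feature map; the feature map `phi` and the two sample points are illustrative choices, not something taken from the answer above.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D point."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)

# Inner product computed in the 3-D mapped space.
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))
# Same quantity computed directly from the 2-D inputs.
implicit = poly_kernel(x, z)

print(implicit, math.isclose(explicit, implicit))  # 121.0 True
```

The kernel gives the same value as the mapped inner product while only ever touching the original 2-D points.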


Q5. Illustrate with examples and graphs the hyperplane, marginal plane, soft margin and hard margin in SVM.

In SVM, a hyperplane is a decision boundary that separates the data into different classes. The objective of SVM is to find the hyperplane that maximizes the margin between the two classes. The margin is the distance between the hyperplane and the closest data points from both classes.

A hard margin SVM is a classification technique that aims to find a hyperplane that separates the data points into two classes without any misclassification. However, in most real-world scenarios, data is not linearly separable, and a hard margin SVM may not be feasible.

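A soft margin SVM relaxes the hard margin SVM by allowing some data points to fall inside the margin or to be misclassified. The regularisation parameter C controls the trade-off: a large C penalizes violations heavily, while a small C tolerates more violations in exchange for a wider margin.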
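One common way to write the soft margin idea is with slack values ξ_i = max(0, 1 - y_i (w · x_i + b)) and the objective (1/2)||w||^2 + C Σ ξ_i. The sketch below evaluates both for a hypothetical hyperplane and four made-up points; it only computes the quantities and does not train anything.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def slacks(w, b, X, y):
    """Slack per point: 0 if outside the margin, > 0 if inside or misclassified."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]

def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of slacks."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, y))

# Hypothetical hyperplane and data.
w, b = (1.0, 0.0), 0.0
X = [(2.0, 0.0), (0.5, 0.0), (0.5, 1.0), (-2.0, 0.0)]
y = [1, 1, -1, -1]

print(slacks(w, b, X, y))                       # [0.0, 0.5, 1.5, 0.0]
print(soft_margin_objective(w, b, X, y, C=1))   # 2.5
print(soft_margin_objective(w, b, X, y, C=10))  # 20.5
```

The second point is correctly classified but sits inside the margin (slack 0.5), and the third is misclassified (slack above 1). Raising C makes the same violations far more costly.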

https://i.imgur.com/NDHnig

In the above image, we can see that the hard margin SVM is unable to separate the two classes since the data points are not linearly separable. On the other hand, the soft margin SVM allows for some misclassification and finds a hyperplane that achieves the maximum margin possible.

The marginal planes are the two hyperplanes parallel to the decision boundary, w · x + b = +1 and w · x + b = -1, that pass through the closest points of each class. Those closest points are the support vectors; they alone determine the position of the decision boundary and therefore play a crucial role in the SVM algorithm.
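The relationship between support vectors and marginal planes can be checked with scikit-learn on a tiny example. This is a sketch under assumptions: the four points are made up, and a very large C (here 1e6) is used to approximate a hard margin, since `SVC` itself is a soft margin classifier.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data.
X = np.array([[2, 2], [3, 3], [0, 0], [-1, 0]], dtype=float)
y = np.array([1, 1, -1, -1])

# A very large C approximates a hard margin on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
# Value of w . x + b at each support vector.
sv_scores = clf.decision_function(clf.support_vectors_)

print("support vectors:\n", clf.support_vectors_)
print("w . x + b at the support vectors:", np.round(sv_scores, 3))
print("margin width 2/||w||:", 2 / np.linalg.norm(w))
```

The decision values at the support vectors should come out close to -1 and +1, meaning those points lie on the two marginal planes, while the remaining points lie farther from the boundary.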

In summary, the hyperplane is the decision boundary that separates the data points into two classes, the margin is the distance between the hyperplane and the closest data points, the support vectors are those closest data points, and the marginal planes are the hyperplanes parallel to the decision boundary that pass through the support vectors. The hard margin SVM aims to find a hyperplane that separates the data points without any misclassification, while the soft margin SVM allows for some misclassification in exchange for a wider margin.

Q6. SVM Implementation through Iris dataset.

Bonus task: Implement a linear SVM classifier from scratch using Python and compare its performance with the scikit-learn implementation.

- Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
- Train a linear SVM classifier on the training set and predict the labels for the testing set.
- Compute the accuracy of the model on the testing set.
- Plot the decision boundaries of the trained model using two of the features.
- Try different values of the regularisation parameter C and see how it affects the performance of the model.

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [2]:
from sklearn.datasets import load_iris

In [3]:
dataset=load_iris()

In [4]:
type(dataset)

sklearn.utils._bunch.Bunch

In [5]:
# Load iris as a DataFrame (seaborn fetches it from its online dataset repository)
df = sns.load_dataset('iris')

In [6]:
df.head()

   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa


In [11]:
df['species'].unique()

array(['setosa', 'versicolor', 'virginica'], dtype=object)

In [12]:
#Independent and dependent features
X=df.iloc[:,:-1]
y=df['species']

In [13]:
### train test split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
     X, y, test_size=0.33, random_state=42)

In [14]:
from sklearn.svm import SVC

In [15]:
svc=SVC(kernel='linear')

In [16]:
svc.fit(X_train,y_train)

In [17]:
svc.coef_

array([[-0.04631136,  0.52105578, -1.0030165 , -0.46411816],
       [-0.00641373,  0.17867392, -0.5389119 , -0.29158729],
       [ 0.54628096,  1.19553697, -1.92187359, -1.86235093]])

In [18]:
## Prediction
y_pred=svc.predict(X_test)

In [19]:
y_pred

array(['versicolor', 'setosa', 'virginica', 'versicolor', 'versicolor',
       'setosa', 'versicolor', 'virginica', 'versicolor', 'versicolor',
       'virginica', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor',
       'virginica', 'versicolor', 'versicolor', 'virginica', 'setosa',
       'virginica', 'setosa', 'virginica', 'virginica', 'virginica',
       'virginica', 'virginica', 'setosa', 'setosa', 'setosa', 'setosa',
       'versicolor', 'setosa', 'setosa', 'virginica', 'versicolor',
       'setosa', 'setosa', 'setosa', 'virginica', 'versicolor',
       'versicolor', 'setosa', 'setosa', 'versicolor', 'virginica',
       'virginica', 'versicolor', 'virginica'], dtype=object)

In [20]:
from sklearn.metrics import classification_report,confusion_matrix,accuracy_score

In [21]:
print(classification_report(y_test,y_pred))
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        15
   virginica       1.00      1.00      1.00        16

    accuracy                           1.00        50
   macro avg       1.00      1.00      1.00        50
weighted avg       1.00      1.00      1.00        50

[[19  0  0]
 [ 0 15  0]
 [ 0  0 16]]
1.0
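For the bonus task, a from-scratch linear SVM could look roughly like the sketch below. It is one possible approach, not the method scikit-learn uses: full-batch sub-gradient descent on the soft margin objective (1/2)||w||^2 + C Σ max(0, 1 - y_i (w · x_i + b)), for binary labels in {-1, +1}. The function names, learning rate, epoch count, and toy data are all assumptions made for illustration.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=1000):
    """Sub-gradient descent on 0.5*||w||^2 + C * sum(hinge losses)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violating = margins < 1  # points inside the margin or misclassified
        # Sub-gradient: regulariser term plus hinge term of violating points.
        grad_w = w - C * (X[violating].T @ y[violating])
        grad_b = -C * y[violating].sum()
        w = w - lr * grad_w
        b = b - lr * grad_b
    return w, b

def predict(X, w, b):
    """Label each row +1 or -1 by the sign of w . x + b."""
    return np.where(X @ w + b >= 0, 1, -1)

# Hypothetical, linearly separable toy data with labels in {-1, +1}.
X_toy = np.array([[2, 2], [3, 3], [3, 2], [0, 0], [-1, 0], [0, -1]], dtype=float)
y_toy = np.array([1, 1, 1, -1, -1, -1])

w, b = train_linear_svm(X_toy, y_toy)
accuracy = np.mean(predict(X_toy, w, b) == y_toy)
print("w =", w, "b =", b, "training accuracy =", accuracy)
```

This sketch handles only two classes. To compare it with `SVC(kernel='linear')` on iris, the three species would first have to be reduced to a binary problem (for example one species against the rest, or a one-vs-rest wrapper around `train_linear_svm`), with the labels mapped to -1 and +1.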


In [22]:
from sklearn.model_selection import GridSearchCV
 
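# defining parameter range
# note: gamma is ignored by the linear kernel, so the 5 gamma values
# only repeat the same fit for each C (25 candidates, 5 distinct models)
param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel':['linear']
              }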

In [23]:
grid=GridSearchCV(SVC(),param_grid=param_grid,refit=True,cv=5,verbose=3)

In [26]:
grid.fit(X_train,y_train)

Fitting 5 folds for each of 25 candidates, totalling 125 fits
[CV 1/5] END .....C=0.1, gamma=1, kernel=linear;, score=1.000 total time=   0.0s
[CV 2/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.900 total time=   0.0s
[CV 3/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.900 total time=   0.0s
[CV 4/5] END .....C=0.1, gamma=1, kernel=linear;, score=1.000 total time=   0.0s
[CV 5/5] END .....C=0.1, gamma=1, kernel=linear;, score=0.900 total time=   0.0s
[CV 1/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=1.000 total time=   0.0s
[CV 2/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.900 total time=   0.0s
[CV 3/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.900 total time=   0.0s
[CV 4/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=1.000 total time=   0.0s
[CV 5/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.900 total time=   0.0s
[CV 1/5] END ..C=0.1, gamma=0.01, kernel=linear;, score=1.000 total time=   0.0s
[CV 2/5] END ..C=0.1, gamma=0.01, kernel=linear

In [25]:
grid.best_params_

{'C': 1, 'gamma': 1, 'kernel': 'linear'}

In [27]:
## Prediction
y_pred4=grid.predict(X_test)
print(classification_report(y_test,y_pred4))
print(confusion_matrix(y_test,y_pred4))
print(accuracy_score(y_test,y_pred4))

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        15
   virginica       1.00      1.00      1.00        16

    accuracy                           1.00        50
   macro avg       1.00      1.00      1.00        50
weighted avg       1.00      1.00      1.00        50

[[19  0  0]
 [ 0 15  0]
 [ 0  0 16]]
1.0
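To see the effect of the regularisation parameter C more directly than through the grid search, a simple sweep can record the number of support vectors and the test accuracy for each value. The sketch below is self-contained and makes a few assumptions: it loads iris with `load_iris(return_X_y=True)` instead of the seaborn DataFrame used above, reuses the same `test_size` and `random_state`, and the list of C values is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_all, y_all = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.33, random_state=42)

results = {}
for C in [0.01, 0.1, 1, 10, 100]:
    model = SVC(kernel="linear", C=C).fit(X_tr, y_tr)
    n_sv = int(model.n_support_.sum())  # support vectors summed over classes
    acc = model.score(X_te, y_te)       # accuracy on the held-out set
    results[C] = (n_sv, acc)
    print(f"C={C:<6} support vectors={n_sv:<4} test accuracy={acc:.3f}")
```

Smaller values of C widen the margin, so more training points end up on or inside it and become support vectors; larger values fit the training data more tightly with fewer support vectors.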
