#### Q1. What is the mathematical formula for a linear SVM?

$$f(x) = \operatorname{sign}(w \cdot x + b)$$

Here, $w$ is the weight vector perpendicular to the hyperplane, $x$ is the input feature vector, and $b$ is the bias term. The sign function ensures that the output is either -1 or 1, corresponding to the two classes in a binary classification problem. The parameters $w$ and $b$ are determined during training to find the optimal hyperplane, the one that separates the two classes with the largest margin.
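The decision rule can be sketched in a few lines of NumPy. The values of `w` and `b` below are hypothetical and chosen only for illustration; in practice they come from training. The helper name `predict` is also just an illustrative choice.

```python
import numpy as np

def predict(X, w, b):
    """Return +1 or -1 for each row of X using f(x) = sign(w . x + b)."""
    scores = X @ w + b
    # np.sign would return 0 for a point exactly on the hyperplane,
    # so ties are assigned to the positive class here (an arbitrary choice).
    return np.where(scores >= 0, 1, -1)

# Hypothetical parameters, not learned from any data
w = np.array([2.0, -3.0])
b = 5.0

X = np.array([[1.0, 1.0],    # score: 2 - 3 + 5 =  4
              [0.0, 3.0]])   # score: 0 - 9 + 5 = -4
labels = predict(X, w, b)
print(labels)
```

Points with a positive score fall on one side of the hyperplane and points with a negative score on the other.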

#### Q2. What is the objective function of a linear SVM?

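The objective of a linear Support Vector Machine (SVM) is to find the parameters $w$ and $b$ that define the separating hyperplane while maximizing the margin between the two classes. Because the margin width equals $2/\lVert w \rVert$, maximizing the margin is equivalent to minimizing $\lVert w \rVert^2$. For the hard-margin case, the problem can be written as:

$$\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^2 \quad \text{subject to} \quad y_i\,(w \cdot x_i + b) \ge 1 \;\; \text{for all } i$$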
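As a rough illustration of why a smaller norm of w means a wider margin, the sketch below evaluates the objective for two candidate weight vectors on a tiny hand-made dataset. The data, the candidate vectors, and the helper names `objective` and `is_feasible` are all assumptions made for this example.

```python
import numpy as np

def objective(w):
    """Hard-margin SVM objective: (1/2) * ||w||^2."""
    return 0.5 * np.dot(w, w)

def is_feasible(X, y, w, b):
    """Check the margin constraints y_i * (w . x_i + b) >= 1 for every point."""
    return bool(np.all(y * (X @ w + b) >= 1 - 1e-12))

# Toy data: two points per class, labels in {-1, +1}
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1, 1, -1, -1])
b = 0.0

w_loose = np.array([0.5, 0.5])    # satisfies the constraints, larger norm
w_tight = np.array([0.25, 0.25])  # satisfies the constraints, smaller norm

for name, w in [("w_loose", w_loose), ("w_tight", w_tight)]:
    margin_width = 2 / np.linalg.norm(w)
    print(name, is_feasible(X, y, w, b), objective(w), round(margin_width, 3))
```

Both candidates satisfy the constraints, but the one with the smaller objective value gives the wider margin, which is what the minimization is after.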

#### Q3. What is the kernel trick in SVM?

The kernel trick is a technique used in Support Vector Machines (SVMs) to implicitly transform input data into a higher-dimensional space without explicitly computing the transformation. In traditional SVMs, a linear kernel is often used, which works well for linearly separable data. However, when the data is not linearly separable in the original feature space, the kernel trick allows SVMs to learn complex, non-linear decision boundaries by implicitly mapping the input data to a higher-dimensional space.

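The general idea behind the kernel trick is to introduce a kernel function $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ that returns the inner product of two points in the higher-dimensional space directly from the original inputs, so the mapping $\phi$ never has to be computed explicitly. Common choices include the polynomial kernel and the radial basis function (RBF) kernel.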
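A small numeric check can make this concrete. The sketch below assumes a degree-2 homogeneous polynomial kernel in two dimensions, for which an explicit feature map is known, and compares the kernel value with the inner product in the mapped space. The function names `poly_kernel` and `phi` are illustrative.

```python
import numpy as np

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2

def phi(x):
    """Explicit feature map matching poly_kernel for 2D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

k_implicit = poly_kernel(x, z)            # computed in the original 2D space
k_explicit = np.dot(phi(x), phi(z))       # computed in the mapped 3D space
print(k_implicit, k_explicit)
```

The two numbers agree, yet the kernel version never builds the 3D vectors. For kernels such as RBF the mapped space is infinite-dimensional, so the implicit route is the only practical one.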

#### Q4. What is the role of support vectors in SVM? Explain with an example.

Support vectors play a crucial role in Support Vector Machines (SVMs). In SVM, the objective is to find a hyperplane that maximally separates different classes in the feature space. Support vectors are the data points that are most important in determining the position and orientation of the hyperplane.

Example:

Class -1:

- Point A: (1, 2)
- Point B: (2, 3)

Class 1:

- Point C: (−1, −1)
- Point D: (−2, −3)

The linear SVM aims to find the hyperplane (a line in this case) that separates the two classes with the largest margin.

The closest pair of points from opposite classes is A and C, so the maximum-margin hyperplane is the perpendicular bisector of the segment joining them: $2x_1 + 3x_2 - 1.5 = 0$.

Support vectors are the points that lie on the margin:

- Point A: (1, 2)
- Point C: (−1, −1)

These two support vectors alone define the hyperplane and the margin. Points B and D lie farther from the boundary, so removing them would not change the solution.
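One way to check which of the four points end up as support vectors is to fit scikit-learn's `SVC` on them. This is a minimal sketch: it assumes a very large `C` is an acceptable stand-in for a hard margin, and the `names` array is only a labeling convenience.

```python
import numpy as np
from sklearn.svm import SVC

# The four points from the example, in the order A, B, C, D
X = np.array([[1, 2], [2, 3], [-1, -1], [-2, -3]], dtype=float)
y = np.array([-1, -1, 1, 1])
names = np.array(["A", "B", "C", "D"])

# A very large C approximates a hard-margin SVM (the data are separable)
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]          # weight vector of the fitted hyperplane
b = clf.intercept_[0]     # bias term
support_names = sorted(names[clf.support_].tolist())
margin_width = 2 / np.linalg.norm(w)

print("support vectors:", support_names)
print("w =", w, "b =", b)
print("margin width =", margin_width)
```

`clf.support_` holds the indices of the support vectors in the training data, and `clf.support_vectors_` holds their coordinates.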






#### Q5. Illustrate with examples and graphs: hyperplane, marginal plane, soft margin and hard margin in SVM.

1. Hyperplane:
A hyperplane is a decision boundary that separates different classes in a linear SVM. In a 2D space, a hyperplane is a line, and in 3D, it's a plane. The equation for a hyperplane in a 2D space is $w_1 x_1 + w_2 x_2 + b = 0$.

Example:

Consider a 2D space with two classes (blue and red). The hyperplane $2x_1 - 3x_2 + 5 = 0$ is the decision boundary.

2. Marginal Plane:
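The marginal planes are the two planes parallel to the hyperplane, one on each side, that pass through the nearest data points of each class. The margin is the distance between the hyperplane and the nearest data point from either class. With the usual scaling, the marginal planes are $w \cdot x + b = \pm 1$ and the total width between them is $2/\lVert w \rVert$.

Example:

Using the same example, and assuming its equation is given in that scaling, the marginal planes are $2x_1 - 3x_2 + 5 = 1$ and $2x_1 - 3x_2 + 5 = -1$. The support vectors are the training points that lie on these two planes.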


3. Soft Margin:
In real-world scenarios, data may not be perfectly separable. The soft margin allows for some misclassification to achieve a balance between maximizing the margin and minimizing errors. This is useful when dealing with noisy or overlapping data.

Example:

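Consider a case with noisy data. The soft margin lets a few points fall inside the margin or on the wrong side of the hyperplane. Each such point adds a penalty to the objective, and the regularization parameter C controls how heavily these violations are penalized.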


4. Hard Margin:
In contrast, a hard margin SVM aims to perfectly separate classes with no misclassifications. This is suitable when the data is well-behaved and noise-free.

Example:

Assuming perfectly separable data, the hard margin SVM seeks to find a hyperplane with no misclassifications.
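The difference between the two regimes can be sketched by fitting the same data with a large and a small `C`. The synthetic data from `make_blobs`, the two `C` values, and the helper `fit_and_describe` are assumptions for illustration; a large `C` only approximates a hard margin.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data (hypothetical, for illustration only)
X, y = make_blobs(n_samples=60, centers=[[-2, -2], [2, 2]],
                  cluster_std=1.0, random_state=0)

def fit_and_describe(C):
    """Fit a linear SVM and report its margin width and support vector count."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w = clf.coef_[0]
    return {
        "C": C,
        "margin_width": 2 / np.linalg.norm(w),   # distance between marginal planes
        "n_support": int(clf.n_support_.sum()),  # points on or inside the margin
    }

hard_like = fit_and_describe(1e3)   # large C: violations are heavily penalized
soft = fit_and_describe(1e-3)       # small C: violations are cheap, margin widens

print(hard_like)
print(soft)
```

Lowering `C` trades training errors for a wider margin, so the soft setting never produces a narrower margin than the hard-like one on the same data.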

#### Q6. SVM Implementation through Iris dataset.
#### ~ Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
#### ~ Train a linear SVM classifier on the training set and predict the labels for the testing set.
#### ~ Compute the accuracy of the model on the testing set.
#### ~ Plot the decision boundaries of the trained model using two of the features.
#### ~ Try different values of the regularisation parameter C and see how it affects the performance of the model.

In [1]:
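from sklearn.datasets import load_iris

# Load the iris dataset (150 samples, 4 features, 3 classes)
iris = load_iris()

In [4]:
# Feature matrix and class labels
X = iris.data
y = iris.target

In [5]:
from sklearn.model_selection import train_test_split

# Hold out 25% of the samples for testing; random_state fixes the split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

In [6]:
from sklearn.svm import SVC

# Linear-kernel SVM with the default regularization (C=1.0)
svm_classifier = SVC(kernel='linear')

In [7]:
svm_classifier.fit(X_train, y_train)

In [8]:
y_pred = svm_classifier.predict(X_test)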

In [9]:
y_pred

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0])

In [10]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test,y_pred)

1.0
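The task list also asks for a plot of the decision boundaries using two of the features. The sketch below is one possible way to do it. It assumes the first two features (sepal length and sepal width) and refits a separate classifier on just those two columns, since the four-feature model cannot be drawn in 2D. The helpers `decision_grid` and `plot_decision_regions` are illustrative names, and the plotting function assumes matplotlib is installed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X2 = iris.data[:, :2]    # first two features only (an arbitrary choice)
y = iris.target
X2_train, X2_test, y_train, y_test = train_test_split(
    X2, y, test_size=0.25, random_state=42)

clf2 = SVC(kernel="linear").fit(X2_train, y_train)

def decision_grid(clf, X, step=0.02, pad=0.5):
    """Predict the class on a dense grid covering the range of X."""
    x_min, x_max = X[:, 0].min() - pad, X[:, 0].max() + pad
    y_min, y_max = X[:, 1].min() - pad, X[:, 1].max() + pad
    xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                         np.arange(y_min, y_max, step))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    return xx, yy, Z

def plot_decision_regions(clf, X, y):
    """Draw the predicted regions and overlay the data points."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing
    xx, yy, Z = decision_grid(clf, X)
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[1])
    plt.title("Linear SVM decision regions (two features)")
    plt.show()

xx, yy, Z = decision_grid(clf2, X2)
```

Calling `plot_decision_regions(clf2, X2_train, y_train)` in a notebook cell draws the three class regions with straight boundaries between them. Accuracy with two features is expected to be lower than with all four.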

In [12]:
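# Try different values of the regularisation parameter C and see how it affects the performance of the model.
from sklearn.model_selection import GridSearchCV

# Candidate C values spanning several orders of magnitude
param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
grid = GridSearchCV(svm_classifier, param_grid, cv=5, scoring='accuracy')

In [13]:
# 5-fold cross-validation on the training set; predict() uses the best model refit on all training data
grid.fit(X_train, y_train)
y_pred = grid.predict(X_test)
y_pred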

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0])

In [14]:
grid.best_params_

{'C': 1}

In [17]:
accuracy_score(y_test,y_pred)

1.0
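The best parameter alone does not show how performance changes across the grid. A possible follow-up, sketched below, reads the mean cross-validated accuracy for each value of C from `cv_results_`. It repeats the same split and grid so that it runs on its own; the `results` list is just an illustrative name.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

param_grid = {"C": [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
grid = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

# Pair each candidate C with its mean accuracy over the 5 validation folds
results = [
    (params["C"], score)
    for params, score in zip(grid.cv_results_["params"],
                             grid.cv_results_["mean_test_score"])
]
for C, score in results:
    print(f"C={C:<7} mean CV accuracy={score:.3f}")
```

Very small values of C regularize heavily and tend to underfit, while very large values approach a hard margin. The printed table shows where accuracy levels off on this dataset.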