Q1. What is the mathematical formula for a linear SVM?

The mathematical formula for a linear Support Vector Machine (SVM) can be represented as:

f(x)=sign(w⋅x+b)
where:
f(x) is the predicted class label (+1 or -1), obtained from the sign of the decision value w⋅x+b,
w is the weight vector,
x is the input feature vector,
b is the bias term.
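
The formula above can be evaluated directly with NumPy. The following is a minimal sketch; the weights, bias, and input points are made-up values for illustration, not parameters learned from any dataset, and points with a score of exactly 0 are assigned to the positive class by convention here.

```python
import numpy as np

def linear_svm_predict(w, b, X):
    """Return +1/-1 labels from sign(w.x + b) for each row of X."""
    scores = X @ w + b                    # decision values w.x + b
    return np.where(scores >= 0, 1, -1)   # map the sign to a class label

# Hypothetical parameters and inputs, chosen only for illustration
w = np.array([2.0, -1.0])
b = -1.0
X = np.array([[2.0, 1.0],
              [0.0, 1.0],
              [1.0, 3.0]])

labels = linear_svm_predict(w, b, X)
print(labels)
```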

Q2. What is the objective function of a linear SVM?

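The objective of a linear SVM is to maximize the margin, that is, the distance between the decision boundary and the closest training points (the support vectors), while penalizing training points that fall inside the margin or on the wrong side of the boundary. Equivalently, it minimizes the squared norm of the weight vector w plus a penalty on the total margin violation, with a regularization parameter C controlling the trade-off between the two.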
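
Written out, the standard soft-margin formulation looks as follows. The symbols are assumptions layered on the notation of Q1: y_i ∈ {-1, +1} is the label of training point x_i, ξ_i is its slack (margin violation), n is the number of training points, and C > 0 is the regularization parameter. Forcing every ξ_i to 0 gives the hard-margin case.

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\qquad\text{subject to}\qquad
y_i\,(w\cdot x_i + b)\ \ge\ 1-\xi_i,\quad \xi_i \ge 0,\quad i=1,\dots,n
```

Because the margin width is 2/‖w‖, minimizing ‖w‖² is the same as maximizing the margin.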

Q3. What is the kernel trick in SVM?

The kernel trick is a method used in SVM to handle non-linearly separable data by implicitly mapping the input vectors into a higher-dimensional space. This allows the SVM to find a linear decision boundary in the higher-dimensional space, which corresponds to a non-linear decision boundary in the original input space, without explicitly computing the transformation. Popular kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid kernels.
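
A small numerical check can make the "without explicitly computing the transformation" part concrete. This sketch assumes a degree-2 polynomial kernel k(x, z) = (x⋅z)² on 2D inputs, for which one explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²); the helper names `phi` and `poly_kernel` and the sample vectors are illustrative only.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2D vector (illustrative helper)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel, computed in the input space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(z))   # inner product in the 3D feature space
implicit = poly_kernel(x, z)        # same value without ever building phi

print(explicit, implicit)
```

Both numbers agree, so an SVM that only needs inner products can work in the higher-dimensional space while evaluating the kernel on the original vectors.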

Q4. What is the role of support vectors in SVM? Explain with an example

Support vectors are the data points that lie closest to the decision boundary, and they play a crucial role in defining the decision boundary of the SVM. These points are the ones that are most difficult to classify and therefore have the most influence on the position and orientation of the decision boundary.

For example, consider a binary classification problem with two classes. The support vectors are the data points from each class that lie closest to the decision boundary; when the classes are not linearly separable, they also include any points that fall inside the margin or on the wrong side of the boundary.

[Figure not available: two classes with the support vectors highlighted next to the decision boundary]

The decision boundary is determined entirely by the positions of the support vectors. Removing any other training point leaves the boundary unchanged.
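
The role of the support vectors can also be checked in code. The sketch below uses a tiny hand-made 2D dataset (the six points and the large value of C are assumptions chosen for illustration) and scikit-learn's `SVC`, then refits the model on the support vectors alone to see whether the boundary stays the same.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for this illustration)
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 4.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C makes the soft-margin SVM behave almost like a hard-margin one
clf = SVC(kernel='linear', C=1000.0).fit(X, y)

support_vectors = clf.support_vectors_                   # training points that define the boundary
sv_scores = clf.decision_function(support_vectors)      # w.x + b at those points

print("Support vectors:\n", support_vectors)
print("Decision values at the support vectors:", sv_scores)

# Refit using only the support vectors: the hyperplane should be (nearly) unchanged
clf_sv = SVC(kernel='linear', C=1000.0).fit(X[clf.support_], y[clf.support_])
print("w (all points):          ", clf.coef_[0], "b:", clf.intercept_[0])
print("w (support vectors only):", clf_sv.coef_[0], "b:", clf_sv.intercept_[0])
```

The decision values at the support vectors should be close to ±1, which is where the margin boundaries sit in the standard scaling.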

Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin, and Hard margin in SVM?

Each concept is explained below:

Hyperplane: A hyperplane is a linear decision boundary that separates classes in an SVM. In 2D, it's a line; in 3D, it's a plane, and in higher dimensions, it's a hyperplane. For example, in a binary classification problem in 2D, the hyperplane is the line that separates the two classes.

Marginal Plane: The marginal planes are the two boundaries parallel to the hyperplane that pass through the support vectors, one on each side (w⋅x+b=+1 and w⋅x+b=-1 in the standard scaling). In 2D, they are two parallel lines, and the distance between them is the margin.

Soft Margin: In a soft-margin SVM, some training points are allowed to fall inside the margin or be misclassified. Each violation is measured by a slack variable, and the regularization parameter C controls the trade-off between a wide margin and few violations. This is useful when the data is not perfectly separable.

Hard Margin: In a hard-margin SVM, there is no allowance for misclassifications. It requires the data to be linearly separable, and it aims to find the maximum margin separating hyperplane without any misclassifications.
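
The difference between a soft and a (nearly) hard margin can be seen by varying C in scikit-learn's `SVC`. This is a rough sketch on synthetic data: the two Gaussian clusters, the random seed, and the two C values are assumptions chosen for illustration. Since `SVC` has no true hard-margin mode, a large C is used here only as an approximation of hard-margin behaviour.

```python
import numpy as np
from sklearn.svm import SVC

# Two overlapping synthetic clusters (illustrative data, not from the notebook)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=[0.0, 0.0], scale=1.0, size=(30, 2)),
               rng.normal(loc=[2.5, 2.5], scale=1.0, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

margin_width = {}
n_support = {}
for C in (0.01, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    w = clf.coef_[0]
    margin_width[C] = 2.0 / np.linalg.norm(w)   # distance between the two marginal planes
    n_support[C] = int(clf.n_support_.sum())     # total number of support vectors
    print(f"C={C}: margin width={margin_width[C]:.3f}, support vectors={n_support[C]}")
```

A small C tolerates more violations and yields a wider margin; a large C penalizes violations heavily and narrows the margin.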

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns

In [2]:
from sklearn.datasets import load_iris

In [3]:
data = load_iris()

In [9]:
df = pd.DataFrame(data.data,columns=data.feature_names)
df['target']=data.target
df.head()

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0                5.1               3.5                1.4               0.2       0
1                4.9               3.0                1.4               0.2       0
2                4.7               3.2                1.3               0.2       0
3                4.6               3.1                1.5               0.2       0
4                5.0               3.6                1.4               0.2       0


In [11]:
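# Creating independent (x) and dependent (y) features
x = df.iloc[:, :-1]
# Select the target as a 1D Series; a one-column DataFrame makes fit() emit a DataConversionWarning
y = df['target']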

In [17]:
# train test split
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.25,random_state=0)

In [18]:
from sklearn.svm import SVC

In [19]:
svm_classifier = SVC(kernel='linear')
svm_classifier.fit(x_train,y_train)


In [20]:
y_pred = svm_classifier.predict(x_test)

In [21]:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy: ",accuracy)

Accuracy:  0.9736842105263158


In [25]:

# Plotting decision boundaries of a linear SVM on the first two features.
# The classifier above was trained on all four features, so it cannot be
# evaluated on a 2D grid; a separate two-feature model is fitted for the plot.
import matplotlib.pyplot as plt

# DataFrames need .iloc for positional indexing; convert to NumPy for plotting
x2_train = x_train.iloc[:, :2].to_numpy()
x2_test = x_test.iloc[:, :2].to_numpy()
svm_2d = SVC(kernel='linear')
svm_2d.fit(x2_train, np.ravel(y_train))

# Creating a mesh grid over sepal length and sepal width
x_min, x_max = x.iloc[:, 0].min() - 1, x.iloc[:, 0].max() + 1
y_min, y_max = x.iloc[:, 1].min() - 1, x.iloc[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                     np.arange(y_min, y_max, 0.02))

# Predicting the class at every grid point and plotting the regions
plt.figure(figsize=(10, 6))
Z = svm_2d.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)

# Plotting the training points
plt.scatter(x2_train[:, 0], x2_train[:, 1], c=np.ravel(y_train), cmap=plt.cm.coolwarm, edgecolors='k', label='Train')
# Plotting the testing points
plt.scatter(x2_test[:, 0], x2_test[:, 1], c=np.ravel(y_test), cmap=plt.cm.coolwarm, marker='x', s=80, label='Test')

plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('Decision boundaries of the linear SVM classifier (first two features)')
plt.legend()
plt.show()