# Q1. What is the mathematical formula for a linear SVM?

Ans: A linear SVM (Support Vector Machine) is a type of machine learning algorithm used for classification tasks. It finds a linear boundary (also called hyperplane) that separates data points of different classes in a high-dimensional space. The mathematical formula for a linear SVM can be expressed as follows:

Given a training set of input vectors x_i and corresponding binary labels y_i (either -1 or +1), a linear SVM finds a weight vector w and bias term b that define the decision function:

f(x) = sign(w^T*x + b)

where sign() is the sign function, which returns -1 if its argument is negative and +1 if it is positive. The decision boundary itself is the hyperplane w^T*x + b = 0.

The weight vector w and bias term b are determined by solving the following optimization problem:

minimize 1/2 * ||w||^2 subject to y_i(w^T*x_i + b) >= 1 for all i

where ||w|| is the L2 norm of the weight vector and i indexes the training examples. This is the hard-margin formulation: it seeks the maximum-margin hyperplane that separates the two classes, subject to the constraint that every training point is correctly classified and lies on or outside the margin. It has a solution only when the data are linearly separable.
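As a small illustration of the decision function and the constraints, the sketch below evaluates them with NumPy. The toy data and the values of `w` and `b` are hypothetical and hand-picked, not learned by an SVM solver.

```python
import numpy as np

# Hypothetical toy data: two points per class in 2D
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Hand-picked weight vector and bias (assumed for illustration, not learned)
w = np.array([0.5, 0.5])
b = -1.0

# Raw scores w^T x + b for every point
scores = X @ w + b

# Decision function: sign of the score (a score of exactly 0 is mapped to +1 here)
predictions = np.where(scores >= 0, 1, -1)

# Hard-margin constraints: y_i (w^T x_i + b) >= 1 for all i
constraints_ok = bool(np.all(y * scores >= 1))

# Width of the margin between the two marginal planes: 2 / ||w||
margin_width = 2 / np.linalg.norm(w)

print("scores:", scores)
print("predictions:", predictions)
print("all constraints satisfied:", constraints_ok)
print("margin width:", margin_width)
```

For this data every constraint holds, and the points (2, 2) and (0, 0) satisfy it with equality, so they lie exactly on the margin.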

# Q2. What is the objective function of a linear SVM?

Ans: The objective of a linear Support Vector Machine (SVM) is to find the hyperplane that separates the training data into two classes with the largest possible margin.

In other words, the objective of a linear SVM is to find the optimal hyperplane that maximizes the margin, which is the distance between the hyperplane and the closest points from each class (called support vectors). This hyperplane will correctly classify the training data, and it is expected to generalize well to unseen data.

Mathematically, the objective function of a linear SVM can be expressed as:

minimize 1/2 * ||w||^2

subject to y_i(w^T*x_i+b) >= 1 for all i=1,...,n,

where w is the weight vector, b is the bias term, x_i is the i-th training example, y_i is its corresponding label (either -1 or 1), and n is the number of training examples.

The objective, 1/2 * ||w||^2, acts as a regularization term that penalizes large weight values. The inequalities are the margin constraints, which require every training example to lie on the correct side of the hyperplane with a functional margin of at least 1. The optimization problem minimizes the regularization term while satisfying the margin constraints for all training examples.
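Why minimizing ||w||^2 is the same as maximizing the margin can be sketched in a few lines, using the same symbols as above. The symbol d(x_i) for the distance of a point to the hyperplane is introduced here for illustration only.

```latex
\begin{align*}
d(x_i) &= \frac{|w^T x_i + b|}{\|w\|}
  && \text{distance from } x_i \text{ to the hyperplane } w^T x + b = 0 \\
y_i (w^T x_i + b) &= 1
  && \text{holds for the closest points (support vectors)} \\
\Rightarrow\quad d(x_i) &= \frac{1}{\|w\|}
  && \text{so the full margin width is } \frac{2}{\|w\|} \\
\max_{w,b} \frac{2}{\|w\|}
  \;&\Longleftrightarrow\; \min_{w,b} \frac{1}{2}\|w\|^2
  && \text{subject to } y_i (w^T x_i + b) \ge 1 \text{ for all } i
\end{align*}
```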

# Q3. What is the kernel trick in SVM?

Ans: The kernel trick is a technique used in Support Vector Machines (SVMs) to transform the input data into a higher-dimensional space without explicitly computing the transformed features. The idea behind this technique is to use a kernel function that computes the inner product of the transformed data points without actually computing the transformed data points themselves.

In other words, the kernel function takes the original data points as inputs and returns the inner product of their transformed versions, which can be used as a measure of similarity between the data points in the higher-dimensional space.

The use of the kernel function makes it possible to apply SVMs to non-linearly separable data by mapping the data points to a higher-dimensional space, where they are more likely to be linearly separable.

The most commonly used kernel functions in SVMs are the linear kernel, polynomial kernel, and radial basis function (RBF) kernel. The choice of kernel function depends on the nature of the data and the problem at hand.
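For reference, the three kernels can be written as follows. The symbols γ, r and d are hyperparameters not named in the text above; in scikit-learn's `SVC` they correspond to the `gamma`, `coef0` and `degree` parameters.

```latex
\begin{align*}
K_{\text{linear}}(x, z) &= x^T z \\
K_{\text{poly}}(x, z)   &= (\gamma \, x^T z + r)^d \\
K_{\text{RBF}}(x, z)    &= \exp\!\left(-\gamma \, \|x - z\|^2\right)
\end{align*}
```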

The use of the kernel trick in SVMs has the advantage of allowing the algorithm to learn complex decision boundaries in high-dimensional spaces, without explicitly computing the transformed features. This can result in better performance and more efficient computation, compared to explicitly computing the transformed features.
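The idea can be checked numerically on a small case. The sketch below assumes a homogeneous polynomial kernel of degree 2 on 2D inputs, for which the explicit feature map is known in closed form. The function names `phi` and `poly_kernel` are made up for this illustration.

```python
import numpy as np


def phi(x):
    """Explicit degree-2 feature map for a 2D input (maps to 3D)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2


x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

# Inner product computed in the transformed 3D space
explicit = np.dot(phi(x), phi(z))

# Same value computed directly from the original 2D points
implicit = poly_kernel(x, z)

print("explicit feature map:", explicit)
print("kernel function:     ", implicit)
```

Both values agree, but the kernel version never builds the 3D vectors. For kernels such as RBF the corresponding feature space is infinite-dimensional, so only the kernel form is computable.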

# Q4. What is the role of support vectors in SVM? Explain with an example.

Ans: In SVM, support vectors play a critical role in determining the position of the decision boundary, which separates the different classes of data points.

Support vectors are the data points that lie closest to the decision boundary (on the margin or, with a soft margin, inside it) and have a non-zero coefficient (Lagrange multiplier) in the trained SVM model. They are the only data points that influence the position and orientation of the decision boundary.

For example, consider a binary classification problem where the goal is to separate two classes of data points, represented by red and blue dots in a two-dimensional space. The SVM algorithm finds the line that separates the two classes with the largest margin. In the figure for this example, the support vectors are the data points that lie closest to the decision boundary, marked with black circles. Only these points determine the position and orientation of the boundary: moving or removing any other point, as long as it stays outside the margin, leaves the boundary unchanged, whereas moving a support vector can shift it.

The SVM algorithm aims to maximize the margin, which is the distance between the decision boundary and the closest support vectors. By maximizing the margin, the algorithm tries to ensure that the decision boundary is as far away from the closest data points as possible, which can help improve the generalization performance of the SVM model.

The support vectors are also used to classify new data points. A new point x is assigned the class given by the sign of the decision function w^T*x + b. Because w can be written as a weighted sum of the support vectors (w = sum_i alpha_i * y_i * x_i, with alpha_i non-zero only for support vectors), the prediction depends on the training data only through the support vectors.

Overall, the support vectors are essential in SVM as they determine the position of the decision boundary, play a critical role in the classification of new data points, and help improve the generalization performance of the model.
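The role of the support vectors can be illustrated with scikit-learn. The sketch below uses hypothetical, linearly separable toy data and a very large `C` to approximate a hard margin. It then removes one point that is not a support vector and refits, to check that the boundary stays essentially the same.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: three points per class, linearly separable
X = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5],
              [3.0, 3.0], [4.0, 4.0], [5.0, 3.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard-margin SVM
clf = SVC(kernel='linear', C=1e6).fit(X, y)

# Training points with non-zero dual coefficients
print("support vectors:\n", clf.support_vectors_)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])

# Drop one training point that is NOT a support vector and refit
non_sv_index = np.setdiff1d(np.arange(len(X)), clf.support_)[0]
keep = np.ones(len(X), dtype=bool)
keep[non_sv_index] = False
clf_reduced = SVC(kernel='linear', C=1e6).fit(X[keep], y[keep])

print("w after removal =", clf_reduced.coef_[0],
      "b after removal =", clf_reduced.intercept_[0])
```

In this data the closest pair of opposite-class points is (1, 1) and (3, 3), so these two points fix the boundary and the remaining four can be removed without changing it (up to solver tolerance).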

# Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin and Hard margin in SVM?

Ans: Each of these concepts is explained below.

1. Hyperplane:

The hyperplane is the decision boundary that separates the two classes in SVM. In a binary classification problem, the hyperplane is a line in 2D, a plane in 3D, and a hyperplane in higher dimensions. The goal of SVM is to find the hyperplane that maximizes the margin between the two classes.

2. Marginal plane:

The marginal plane is the plane that runs parallel to the hyperplane and touches the support vectors. The distance between the hyperplane and the marginal plane is called the margin. In SVM, the goal is to maximize the margin between the hyperplane and the marginal plane.

3. Hard margin:

In hard margin SVM, the algorithm tries to find a hyperplane that perfectly separates the two classes of data points without any errors. This works only if the data points are linearly separable.

4. Soft margin:

In soft margin SVM, the algorithm allows some margin violations and misclassifications by introducing a slack variable for each training point, which relaxes the margin constraints. This helps in cases where the data points are not linearly separable. The objective of soft margin SVM is to find a hyperplane that balances a wide margin against the total slack, with the regularization parameter C controlling the trade-off.
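To make the slack variables concrete, the sketch below computes them for a fixed hyperplane using the standard soft-margin objective 1/2 * ||w||^2 + C * sum of slacks. The data, `w`, `b` and `C` are hypothetical values chosen by hand for illustration.

```python
import numpy as np

# Hypothetical toy data: one positive point sits inside the margin,
# one negative point is on the wrong side of the hyperplane
X = np.array([[2.0, 2.0], [3.0, 3.0], [1.2, 1.0],
              [0.0, 0.0], [-1.0, 0.0], [1.5, 1.5]])
y = np.array([1, 1, 1, -1, -1, -1])

# Hand-picked hyperplane and regularization parameter (assumed, not learned)
w = np.array([0.5, 0.5])
b = -1.0
C = 1.0

# Functional margin of each point: y_i (w^T x_i + b)
margins = y * (X @ w + b)

# Slack xi_i = max(0, 1 - margin_i):
#   0          -> on or outside the margin (hard-margin constraint holds)
#   0 < xi < 1 -> inside the margin but still correctly classified
#   xi > 1     -> misclassified
slack = np.maximum(0.0, 1.0 - margins)

# Soft-margin objective: 1/2 ||w||^2 + C * sum(xi_i)
objective = 0.5 * (w @ w) + C * slack.sum()

print("slack per point:", slack)
print("soft-margin objective:", objective)
```

A hard-margin SVM would reject this hyperplane outright because two constraints are violated. The soft-margin version accepts it and charges a penalty proportional to C.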

Overall, these concepts are fundamental to understanding SVM and its various types.

# Q6. SVM Implementation through Iris dataset.

# Bonus task: Implement a linear SVM classifier from scratch using Python and compare its performance with the scikit-learn implementation.

1. Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
2. Train a linear SVM classifier on the training set and predict the labels for the testing set.
3. Compute the accuracy of the model on the testing set.
4. Plot the decision boundaries of the trained model using two of the features.
5. Try different values of the regularisation parameter C and see how it affects the performance of the model.

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

In [4]:
iris=load_iris()
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(iris.data,iris.target,test_size=0.3,random_state=42)

In [5]:
from sklearn.svm import SVC
svc=SVC(kernel='linear')
svc.fit(X_train,y_train)
y_pred=svc.predict(X_test)

In [6]:
from sklearn.metrics import classification_report,confusion_matrix,accuracy_score
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(accuracy_score(y_test,y_pred))

[[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        13
           2       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45

1.0


In [7]:
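import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

# Train on two features only (petal length, petal width) so the regions can be drawn in 2D
X_train_2feat = X_train[:, [2, 3]]
X_test_2feat = X_test[:, [2, 3]]
svm_2feat = SVC(kernel='linear', C=1)
svm_2feat.fit(X_train_2feat, y_train)

# Predict the class at every point of a dense grid covering the training data
x_min, x_max = X_train_2feat[:, 0].min() - 0.5, X_train_2feat[:, 0].max() + 0.5
y_min, y_max = X_train_2feat[:, 1].min() - 0.5, X_train_2feat[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300), np.linspace(y_min, y_max, 300))
Z = svm_2feat.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Colour the grid by predicted class and overlay the training points
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X_train_2feat[:, 0], X_train_2feat[:, 1], c=y_train, edgecolor='k')
plt.xlabel('petal length [cm]')
plt.ylabel('petal width [cm]')
plt.title('SVM Decision Region Boundary')
plt.show()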

In [8]:
# Try different values of the regularization parameter C
C_values = [0.1, 1, 10]
for C in C_values:
    svm = SVC(kernel='linear', C=C)
    svm.fit(X_train, y_train)
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("C = ", C, "Accuracy:", accuracy)

C =  0.1 Accuracy: 1.0
C =  1 Accuracy: 1.0
C =  10 Accuracy: 0.9777777777777777
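
The bonus task is not worked out in the cells above. One possible sketch follows, under several assumptions that are not from the original notebook: the problem is reduced to a binary one (setosa versus the other two species), the features are standardised, and the soft-margin objective is minimised with plain full-batch subgradient descent. The function names `fit_linear_svm` and `predict_linear_svm`, the learning rate and the number of epochs are all illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def fit_linear_svm(X, y, C=1.0, lr=0.001, epochs=2000):
    """Subgradient descent on 1/2 ||w||^2 + C * sum(max(0, 1 - y_i (w.x_i + b))).

    Labels y must be -1 or +1. lr and epochs are illustrative defaults.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        violating = margins < 1  # points with non-zero hinge loss
        # Subgradient of the objective with respect to w and b
        grad_w = w - C * (X[violating].T @ y[violating])
        grad_b = -C * y[violating].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_linear_svm(X, w, b):
    """Return +1 or -1 according to the sign of w.x + b."""
    return np.where(X @ w + b >= 0, 1, -1)


iris = load_iris()

# Assumed simplification: setosa (+1) versus the other two species (-1)
y_binary = np.where(iris.target == 0, 1, -1)
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, y_binary, test_size=0.3, random_state=42)

# Standardise using training-set statistics only
mean, std = X_tr.mean(axis=0), X_tr.std(axis=0)
X_tr_s = (X_tr - mean) / std
X_te_s = (X_te - mean) / std

# From-scratch model
w, b = fit_linear_svm(X_tr_s, y_tr, C=1.0)
scratch_acc = np.mean(predict_linear_svm(X_te_s, w, b) == y_te)

# scikit-learn model on the same data for comparison
sk_clf = SVC(kernel='linear', C=1.0).fit(X_tr_s, y_tr)
sklearn_acc = sk_clf.score(X_te_s, y_te)

print("from-scratch accuracy:", scratch_acc)
print("scikit-learn accuracy:", sklearn_acc)
```

Extending this to all three iris classes would need a one-vs-rest or one-vs-one wrapper around the binary classifier. scikit-learn's `SVC` handles multi-class problems internally with a one-vs-one scheme.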
