# Q1. What is the mathematical formula for a linear SVM?

Given a training dataset of input vectors x_i with corresponding class labels y_i, where each x_i is an n-dimensional feature vector and each y_i is either -1 or +1, the goal of a linear SVM is to find the hyperplane that separates the two classes with the largest possible margin.

The hyperplane is defined by the equation:

w^T * x + b = 0,

where w is the weight vector perpendicular to the hyperplane, x is the input vector, and b is the bias term. The weight vector w determines the orientation of the hyperplane, and the bias term b shifts the hyperplane along the w direction.

The decision function of the SVM is given by:

f(x) = sign(w^T * x + b),

where sign(z) is the sign function that returns -1 if z < 0, +1 if z > 0, and 0 if z = 0 (the point lies exactly on the hyperplane).
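As a minimal sketch of the decision function above, the snippet below evaluates sign(w^T * x + b) with NumPy. The weight vector, bias, and sample points are made-up values chosen for illustration, not parameters learned from data.

```python
import numpy as np

def decision_function(w, b, X):
    """Return the raw score w^T x + b for each row x of X."""
    return X @ w + b

def predict(w, b, X):
    """Return sign(w^T x + b): -1, +1, or 0 for points exactly on the hyperplane."""
    return np.sign(decision_function(w, b, X)).astype(int)

# Hypothetical hyperplane x1 + x2 - 3 = 0 in 2D (illustrative values only)
w = np.array([1.0, 1.0])
b = -3.0

# Three sample points: below the hyperplane, above it, and exactly on it
X = np.array([[0.0, 0.0], [4.0, 4.0], [1.0, 2.0]])

scores = decision_function(w, b, X)
labels = predict(w, b, X)
print("scores:", scores)
print("labels:", labels)
```

The raw score is kept separate from the sign because its magnitude, divided by ||w||, is the distance of the point from the hyperplane.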

# Q2. What is the objective function of a linear SVM?

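- The goal of the SVM algorithm is to find a hyperplane that separates the data points of one class from those of the other class as well as possible.
- This decision boundary divides the n-dimensional feature space into one region per class, so that a new data point can be assigned to a class according to the side of the boundary it falls on.
- The objective function of a linear SVM maximizes the margin between the two classes. If the hyperplane is scaled so that the closest points satisfy |w^T * x + b| = 1, the distance from the hyperplane to those points is 1 / ||w||, so maximizing the margin is equivalent to minimizing (1/2) * ||w||^2 subject to y_i * (w^T * x_i + b) >= 1 for every training point (x_i, y_i).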
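A small numeric sketch of margin maximization, assuming the standard formulation in which the closest points satisfy |w^T * x + b| = 1, so the margin is 1 / ||w|| and the quantity being minimized is 0.5 * ||w||^2. The soft-margin variant with hinge penalties weighted by C is included as an assumption for comparison. The function names and toy data are hypothetical.

```python
import numpy as np

def margin(w):
    """Distance from the hyperplane to the closest points when they satisfy |w^T x + b| = 1."""
    return 1.0 / np.linalg.norm(w)

def satisfies_hard_margin(w, b, X, y):
    """True if every point meets the constraint y_i * (w^T x_i + b) >= 1."""
    return bool(np.all(y * (X @ w + b) >= 1.0))

def soft_margin_objective(w, b, X, y, C=1.0):
    """0.5 * ||w||^2 plus C times the total hinge loss (soft-margin variant)."""
    hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Hypothetical separable toy data and a candidate hyperplane
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])
w = np.array([0.5, 0.5])
b = -1.0

feasible = satisfies_hard_margin(w, b, X, y)
objective = soft_margin_objective(w, b, X, y, C=1.0)
print("constraints satisfied:", feasible)
print("margin:", margin(w))
print("objective value:", objective)
```

Because every constraint holds for this candidate, the hinge term is zero and the objective reduces to 0.5 * ||w||^2.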

# Q3. What is the kernel trick in SVM?

- The kernel trick is a powerful technique that enables SVMs to solve non-linear classification problems by implicitly mapping the input data to a higher-dimensional feature space. By doing so, it allows us to find a hyperplane that separates the different classes of data.
- The kernel trick is used when the data is not linearly separable in its original space: after the mapping to a higher-dimensional feature space, the data may become linearly separable.
- Internally, a kernelized SVM never computes the transformation explicitly. It only needs a kernel function that returns the similarity (inner product) between pairs of points in the higher-dimensional feature space, so the transformed feature representation stays implicit.
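To make the idea of an implicit mapping concrete, here is a minimal sketch using the degree-2 polynomial kernel K(x, z) = (x^T z)^2 on 2D inputs. For this kernel an explicit feature map is known, phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2), so the two computations can be compared directly. The function names and sample points are illustrative assumptions.

```python
import numpy as np

def phi(x):
    """Explicit feature map into 3D for the kernel (x^T z)^2 on 2D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed entirely in the original 2D space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = np.dot(phi(x), phi(z))   # inner product after mapping to 3D
implicit = poly_kernel(x, z)        # same value without ever building phi

print("explicit:", explicit)
print("implicit:", implicit)
```

Both routes give the same inner product, but the kernel never constructs the higher-dimensional vectors, which is what keeps kernelized SVMs tractable when the feature space is very large.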

# Q4. What is the role of support vectors in SVM? Explain with an example.

- Support vectors are the data points closest to the hyperplane; they determine its position and orientation. Using these support vectors, we maximize the margin of the classifier.
- The goal of SVM is to find the hyperplane that maximizes the margin between the two classes. The margin is defined as the distance between the separating hyperplane (decision boundary) and the closest points of each class.
- The hyperplane with maximum margin is called the optimal hyperplane. The points closest to the hyperplane are called support vectors.
- For example, consider a dataset with two linearly separable classes. Many different hyperplanes can separate them, but we want the one that maximizes the margin. This is where support vectors come into play: the points that lie closest to the decision boundary define the margin and therefore fix the position and orientation of the hyperplane. Removing a point that is not a support vector leaves the hyperplane unchanged, whereas moving or removing a support vector can change it.
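The role of support vectors can be sketched with scikit-learn's `SVC`, which exposes the fitted support vectors through `support_` and `support_vectors_`. The toy data below is hypothetical, and a large `C` is used as an assumption to approximate a hard margin. Refitting on the support vectors alone should reproduce essentially the same hyperplane.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable toy data (not from the original text)
X = np.array([
    [0.0, 0.0], [1.0, 1.0], [0.0, 1.0],   # class -1
    [3.0, 3.0], [4.0, 4.0], [4.0, 3.0],   # class +1
])
y = np.array([-1, -1, -1, 1, 1, 1])

# A large C approximates a hard-margin SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)
print("support vector indices:", clf.support_)
print("support vectors:")
print(clf.support_vectors_)

# Refit using only the support vectors and compare the hyperplanes
sv_only = SVC(kernel="linear", C=1e6).fit(X[clf.support_], y[clf.support_])
print("w, b (all points):     ", clf.coef_[0], clf.intercept_[0])
print("w, b (support vectors):", sv_only.coef_[0], sv_only.intercept_[0])
```

The points that are not support vectors satisfy the margin constraint with room to spare, so dropping them does not move the boundary.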

# Q5. Illustrate, with examples and graphs, the Hyperplane, Marginal plane, Soft margin and Hard margin in SVM.

1. Hyperplane:
- In SVM, a hyperplane is the decision boundary that separates data points of different classes. For binary classification, it is a line in a 2D space, a plane in a 3D space, and a hyperplane in higher dimensions. The SVM chooses the hyperplane that maximizes the margin, that is, the distance to the closest points of each class.
Example:
Consider a simple 2D dataset with two classes, represented by red and blue points. The hyperplane is represented by the black line, which effectively separates the two classes.

2. Marginal Plane:
- The marginal planes are the two planes parallel to the hyperplane, one on each side, that pass through the support vectors (the data points closest to the hyperplane). With the usual scaling they satisfy w^T * x + b = +1 and w^T * x + b = -1. The distance between the hyperplane and a marginal plane is called the margin.
Example:
Let's consider the same dataset as before. The marginal planes are depicted by the dashed lines. They touch the support vectors (represented by the filled circles) and define the width of the margin.

3. Soft Margin:
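- In some cases, the data may not be linearly separable, or there may be outliers that make it difficult to find a hyperplane with a wide margin. In such cases, we can introduce a soft margin that allows some points to fall inside the margin or be misclassified. The regularization parameter C controls the trade-off: a smaller C tolerates more violations in exchange for a wider margin.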
Example:
Suppose we have a dataset with overlapping classes and outliers. By introducing a soft margin, SVM allows for some misclassification, indicated by the circled points. The dashed lines represent the marginal planes.

4. Hard Margin:
- On the other hand, a hard margin SVM requires every training point to be correctly classified and to lie on or outside the marginal planes. It finds the hyperplane that separates the classes with the maximum margin without allowing any misclassification, so it is only feasible when the data is linearly separable and it is sensitive to outliers.
Example:
If we have a linearly separable dataset with no outliers, a hard margin SVM can find a hyperplane (represented by the black line) that perfectly separates the two classes.
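The four concepts above can be sketched numerically with scikit-learn. The snippet fits a linear `SVC` twice on a hypothetical 2D dataset that contains one outlier: once with a very large `C` (an approximation of a hard margin) and once with a small `C` (a soft margin). It reports the margin 1 / ||w||, the number of points that fall inside the marginal planes w^T * x + b = +1 and -1, and the number of misclassified points. The data, the helper name `fit_and_describe`, and the tolerance are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2D data: two clusters plus one class -1 outlier close to class +1
X = np.array([
    [1.0, 1.0], [1.5, 2.0], [2.0, 1.0],   # class -1
    [5.0, 5.0], [5.5, 6.0], [6.0, 5.0],   # class +1
    [4.2, 4.2],                           # class -1 outlier
])
y = np.array([-1, -1, -1, 1, 1, 1, -1])

def fit_and_describe(C, tol=1e-2):
    """Fit a linear SVM and summarize its hyperplane and margin."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w, b = clf.coef_[0], clf.intercept_[0]
    scores = X @ w + b                      # signed score of each training point
    return {
        "C": C,
        "w": w,
        "b": b,
        "margin": 1.0 / np.linalg.norm(w),  # hyperplane to marginal plane
        "inside_margin": int(np.sum(y * scores < 1.0 - tol)),
        "misclassified": int(np.sum(np.sign(scores) != y)),
    }

hard_like = fit_and_describe(C=1e6)   # large C: approximately hard margin
soft = fit_and_describe(C=0.01)       # small C: soft margin

for result in (hard_like, soft):
    print(result)
```

With the large `C`, the outlier forces a narrow margin because no violation is tolerated. With the small `C`, the margin widens and some points are allowed to sit inside it.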

# Q6. SVM implementation on the Iris dataset.

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score 

In [3]:
dataset = load_iris()

In [6]:
x = dataset.data
y = dataset.target

In [8]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

In [9]:
svc = SVC(kernel='linear')

svc.fit(x_train, y_train)

In [10]:
y_pred = svc.predict(x_test)

In [12]:
print(accuracy_score(y_test, y_pred))

1.0


In [19]:
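# Compare test accuracy for several values of the regularization parameter C
C_values = [0.1, 1, 10, 100]
for C in C_values:
    svm = SVC(kernel='linear', C=C)  # smaller C gives a softer, wider margin
    svm.fit(x_train, y_train)
    y_pred = svm.predict(x_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("C =", C, "Accuracy:", accuracy)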

C = 0.1 Accuracy: 1.0
C = 1 Accuracy: 1.0
C = 10 Accuracy: 0.9666666666666667
C = 100 Accuracy: 1.0
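The test split above holds only 30 samples (20% of the 150 Iris samples), so a single misclassified sample changes the accuracy by 1/30, which is the size of the drop seen at C = 10. As a hedged sketch of a more stable comparison, the snippet below uses 5-fold cross-validation via `cross_val_score`. The choice of 5 folds and the `cv_results` dictionary are assumptions, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Mean accuracy over 5 folds for each value of C
cv_results = {}
for C in [0.1, 1, 10, 100]:
    scores = cross_val_score(SVC(kernel="linear", C=C), X, y, cv=5)
    cv_results[C] = scores
    print(f"C = {C}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Averaging over folds uses every sample for testing exactly once, so differences between values of C are less likely to come from one particular split.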
