Q1. What is the mathematical formula for a linear SVM?

Q2. What is the objective function of a linear SVM?

Q3. What is the kernel trick in SVM?

Q4. What is the role of support vectors in SVM? Explain with an example.

Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin and Hard margin in SVM?

Q6. SVM Implementation through Iris dataset.
- Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
- Train a linear SVM classifier on the training set and predict the labels for the testing set.
- Compute the accuracy of the model on the testing set.
- Plot the decision boundaries of the trained model using two of the features.
- Try different values of the regularisation parameter C and see how it affects the performance of the model.

# Q1. What is the mathematical formula for a linear SVM?
A Support Vector Machine (SVM) aims to find a hyperplane that best separates the data into two classes. For a linear SVM, the hyperplane can be represented by the equation:

$$w \cdot x + b = 0$$

where:

- $w$ is the weight vector (perpendicular to the hyperplane),
- $x$ is the input feature vector (a data point),
- $b$ is the bias term (offset from the origin).

The SVM aims to maximize the margin, which is the distance between the hyperplane and the closest data points from both classes (called support vectors).
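
As a minimal sketch of how the hyperplane equation is used, the snippet below scores a few points with $w \cdot x + b$, classifies them by the sign of that score, and computes their distance to the hyperplane. The values of `w` and `b` and the helper names are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical hyperplane parameters, chosen for illustration only.
w = np.array([2.0, -1.0])   # weight vector, normal to the hyperplane
b = -1.0                    # bias term


def decision_value(x):
    """Return w . x + b, the signed score of point x."""
    return float(np.dot(w, x) + b)


def predict(x):
    """Classify x as +1 or -1 according to which side of the hyperplane it is on."""
    return 1 if decision_value(x) >= 0 else -1


def distance_to_hyperplane(x):
    """Geometric distance from x to the hyperplane w . x + b = 0."""
    return abs(decision_value(x)) / np.linalg.norm(w)


for p in [np.array([2.0, 1.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]:
    print(p, decision_value(p), predict(p), round(distance_to_hyperplane(p), 3))
```

A point whose score is exactly 0 lies on the hyperplane; this sketch arbitrarily assigns it to the positive class.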

# Q2. What is the objective function of a linear SVM?
The objective of a linear SVM is to maximize the margin while ensuring that all the data points are correctly classified (if possible). The optimization problem can be formulated as:

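For the hard-margin case, minimize the following objective function:

$$\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^2$$

subject to

$$y_i (w \cdot x_i + b) \ge 1 \quad \text{for all } i,$$

where $y_i \in \{-1, +1\}$ is the label of training point $x_i$. Minimizing $\lVert w \rVert$ maximizes the margin width $2 / \lVert w \rVert$, and the constraint ensures that each point is on the correct side of the margin, with a functional margin of at least 1.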
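
The sketch below evaluates this optimization problem on toy data, assuming the standard hard-margin form: minimize $\frac{1}{2}\lVert w \rVert^2$ subject to $y_i (w \cdot x_i + b) \ge 1$. The data and the three candidate hyperplanes are hypothetical; the point is only to show that, among feasible candidates, a smaller objective corresponds to a wider margin.

```python
import numpy as np

# Toy, linearly separable data (hypothetical values).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])


def objective(w):
    """Hard-margin objective: 0.5 * ||w||^2."""
    return 0.5 * float(np.dot(w, w))


def functional_margins(w, b):
    """y_i * (w . x_i + b) for every training point."""
    return y * (X @ w + b)


def is_feasible(w, b):
    """True if every point satisfies y_i * (w . x_i + b) >= 1."""
    return bool(np.all(functional_margins(w, b) >= 1 - 1e-12))


candidates = {
    "a": (np.array([0.5, 0.5]), -1.0),
    "b": (np.array([1.0, 1.0]), -2.0),
    "c": (np.array([0.25, 0.25]), -0.5),
}

for name, (w, b) in candidates.items():
    width = 2.0 / np.linalg.norm(w)  # geometric margin width
    print(name, "feasible:", is_feasible(w, b),
          "objective:", objective(w), "margin width:", round(width, 3))
```

Candidate `c` has the smallest objective but violates the constraints, so it is not an admissible solution; of the two feasible candidates, `a` has the smaller objective and the wider margin.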

# Q3. What is the kernel trick in SVM?

The kernel trick allows SVM to operate in higher-dimensional spaces without explicitly computing the coordinates in those spaces. It maps the data from its original feature space into a higher-dimensional space where a linear separation is possible, even if the data is not linearly separable in the original space.

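The kernel function computes the dot product of the data points in the higher-dimensional space without explicitly transforming the data. Common kernel functions are:

- Linear: $K(x, z) = x \cdot z$
- Polynomial: $K(x, z) = (\gamma \, x \cdot z + r)^d$
- Radial basis function (RBF): $K(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$
- Sigmoid: $K(x, z) = \tanh(\gamma \, x \cdot z + r)$

The kernel trick lets an SVM handle complex, non-linearly separable problems efficiently.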
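
To make the idea concrete, here is a small sketch that checks the kernel trick numerically for one simple case: the degree-2 polynomial kernel $K(x, z) = (x \cdot z)^2$ on 2D inputs. The explicit feature map `phi` and the sample points are assumptions made for this illustration; other kernels correspond to other (sometimes infinite-dimensional) feature maps.

```python
import numpy as np


def phi(x):
    """Explicit feature map for the kernel (x . z)^2 on 2D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])


def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2D space."""
    return float(np.dot(x, z)) ** 2


x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = float(np.dot(phi(x), phi(z)))  # dot product in the 3D feature space
implicit = poly_kernel(x, z)              # same value without computing phi

print(explicit, implicit)
```

Both numbers agree up to floating-point rounding, even though `poly_kernel` never builds the 3D vectors.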

# Q4. What is the role of support vectors in SVM? Explain with an example.
Support vectors are the data points that are closest to the hyperplane and are critical in defining the optimal separating hyperplane. These points essentially "support" the hyperplane by lying on the margins, and their positions directly influence the optimal placement of the hyperplane.

Example: Consider a simple 2D classification problem where two classes (positive and negative) are represented by circles and squares. The support vectors are the points that lie closest to the decision boundary (hyperplane), and they determine its orientation and position. If we move a support vector, the hyperplane changes; if we move or remove a point that lies well outside the margin, without bringing it into the margin, the hyperplane stays the same.
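
The example can be reproduced with a short, hedged sketch using scikit-learn's `SVC`. The six toy points are hypothetical, and a large `C` is used here to approximate a hard margin. The code fits the model, removes a point that is not a support vector, and then moves a point that is expected to be one.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: three points per class.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])


def fit_hyperplane(X, y):
    """Fit a linear SVM (large C approximates a hard margin); return model, w, b."""
    clf = SVC(kernel='linear', C=1e3).fit(X, y)
    return clf, clf.coef_[0].copy(), float(clf.intercept_[0])


clf, w, b = fit_hyperplane(X, y)
print('support vector indices:', clf.support_)
print('w =', w, 'b =', b)

# Remove the first point that is NOT a support vector and refit.
non_sv = [i for i in range(len(X)) if i not in clf.support_][0]
keep = np.ones(len(X), dtype=bool)
keep[non_sv] = False
_, w_removed, b_removed = fit_hyperplane(X[keep], y[keep])
print('after removing a non-support vector:', w_removed, b_removed)

# Move the point (4, 4), which is expected to be a support vector, and refit.
X_moved = X.copy()
X_moved[3] = [3.0, 3.0]
_, w_moved, b_moved = fit_hyperplane(X_moved, y)
print('after moving a support vector:', w_moved, b_moved)
```

Up to solver tolerance, the hyperplane should be unchanged after the removal and different after the move.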

# Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin, and Hard margin in SVM?
Hyperplane:
The hyperplane is the decision boundary that separates the two classes. In a 2D space it is a line, in 3D it is a plane, and in higher dimensions it is a hyperplane with one dimension fewer than the feature space.

Marginal Plane:
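The marginal planes are the two hyperplanes parallel to the decision boundary, $w \cdot x + b = +1$ and $w \cdot x + b = -1$, that pass through the support vectors of each class. The decision boundary lies midway between them, and the distance between the two marginal planes is the margin that the SVM maximizes.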

Hard Margin:
A hard margin SVM is used when the data is perfectly linearly separable. It requires every training point to lie on or outside the marginal planes, so no training point is misclassified.

Soft Margin:
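A soft margin SVM introduces a slack variable $\xi_i$ for each training point, which allows some points to fall inside the margin or be misclassified in exchange for better generalization. The regularization parameter C controls how heavily these violations are penalized. This is useful when the data is not perfectly linearly separable.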

Graphical Illustration:
Hard Margin SVM: The margin is the widest possible gap between the two classes, and no points lie within the margin.
Soft Margin SVM: The margin is still kept wide, but some points may lie inside the margin or be misclassified.
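
In place of a drawn figure, the difference can be illustrated numerically with a hedged sketch. It fits scikit-learn's `SVC` on hypothetical toy data containing one negative point close to the positive class, and reports the margin width $2 / \lVert w \rVert$ and the number of points inside the margin for several values of C. A very large C is used here as an approximation of a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data; (3.5, 3.5) is a negative point close to the positives.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [3.5, 3.5],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

margin_width = {}
inside_margin = {}
for C in [0.01, 1.0, 1000.0]:
    clf = SVC(kernel='linear', C=C).fit(X, y)
    w = clf.coef_[0]
    margin_width[C] = 2.0 / np.linalg.norm(w)
    # Points with y * f(x) < 1 lie inside the margin or are misclassified.
    functional_margin = y * clf.decision_function(X)
    inside_margin[C] = int(np.sum(functional_margin < 1 - 1e-2))
    print(f'C={C}: margin width={margin_width[C]:.3f}, '
          f'points inside margin={inside_margin[C]}')
```

Small C tolerates violations and yields a wide margin (soft margin); large C shrinks the margin until no training point lies inside it, which approaches the hard-margin solution on this separable data.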

In [None]:
# Q6. SVM Implementation through Iris dataset
# Now, we’ll implement an SVM classifier using the Iris dataset, and visualize how different values of the regularization parameter C affect the model’s performance.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # Use only the first two features (sepal length, sepal width) for 2D visualization
y = iris.target

# Split the dataset into a training set and testing set (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the linear SVM classifier with default C value (1.0)
svm = SVC(kernel='linear', C=1.0)
svm.fit(X_train, y_train)

# Predict the labels for the test set
y_pred = svm.predict(X_test)

# Compute the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Plot decision boundary for the trained model
h = .02  # Step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1

xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = svm.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Create the contour plot
plt.contourf(xx, yy, Z, alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o', s=80)
plt.title('SVM Decision Boundary (C=1.0)')
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.show()

# Try different values of C and see how it affects the model
C_values = [0.1, 1.0, 10.0]
for C in C_values:
    svm = SVC(kernel='linear', C=C)
    svm.fit(X_train, y_train)
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f'Accuracy with C={C}: {accuracy:.2f}')

    # Plot decision boundary for different C values
    Z = svm.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.8)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o', s=80)
    plt.title(f'SVM Decision Boundary (C={C})')
    plt.xlabel('Sepal length (cm)')
    plt.ylabel('Sepal width (cm)')
    plt.show()
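
A single 80/20 split leaves only 30 test samples, so accuracy differences between values of C can be largely noise. As a hedged complement to the loop above, the sketch below compares values of C with 5-fold cross-validation on the same two features. The grid of C values and the names `X2` and `cv_scores` are arbitrary choices for this illustration.

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

iris = datasets.load_iris()
X2 = iris.data[:, :2]   # same two features as above (sepal length, sepal width)
y = iris.target

cv_scores = {}
for C in [0.01, 0.1, 1.0, 10.0, 100.0]:
    model = SVC(kernel='linear', C=C)
    # Mean accuracy over 5 stratified folds.
    scores = cross_val_score(model, X2, y, cv=5)
    cv_scores[C] = scores.mean()
    print(f'C={C}: mean CV accuracy = {cv_scores[C]:.3f} (+/- {scores.std():.3f})')
```

If the mean accuracies differ by less than the fold-to-fold spread, the choice of C has little practical effect on this feature pair.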
