Q1. What is the mathematical formula for a linear SVM?

In [None]:
Ans 1:-
The mathematical formula for a linear Support Vector Machine (SVM) can be expressed as follows:

Given a dataset with features denoted as xᵢ and corresponding binary class labels yᵢ, where i = 1, 2, ..., n (n is the number of data points), and each feature vector
xᵢ has d dimensions (xᵢ ∈ ℝᵈ), the linear SVM aims to find a hyperplane represented by the equation:

wᵀx + b = 0

Where:

w is the weight vector (normal to the hyperplane).
x is the feature vector.
b is the bias or intercept.
The decision function for classifying a new data point, x_new, is determined by the sign of the function wᵀx_new + b:

If wᵀx_new + b > 0, then x_new is classified as one class (typically the positive class).
If wᵀx_new + b < 0, then x_new is classified as the other class (typically the negative class).
The goal of linear SVM is to find the optimal values of w and b to maximize the margin between the classes while minimizing classification errors. 
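
The decision rule above can be sketched in a few lines of NumPy. The values of w and b below are hypothetical and chosen only for illustration; in practice they come from training.

```python
import numpy as np

# Hypothetical hyperplane parameters (illustrative values, not learned here)
w = np.array([-1.0, 0.0])
b = 3.0

def predict(X_new):
    """Classify each row of X_new by the sign of w^T x + b."""
    scores = X_new @ w + b
    # Points exactly on the hyperplane (score == 0) are assigned to -1 here
    # by convention; the rule above does not define that case.
    return np.where(scores > 0, 1, -1)

X_new = np.array([[2.0, 2.0], [4.0, 2.0]])
print(predict(X_new))
```

The first point has a positive score and the second a negative one, so they land in different classes.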

In [None]:
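For linearly separable data, the hard-margin linear SVM solves:

    minimize    (1/2)‖w‖²  over w and b
    subject to  yᵢ(wᵀxᵢ + b) ≥ 1  for i = 1, 2, ..., n

In this formulation, yᵢ represents the class labels (+1 or -1), and the constraint requires every data point to lie on the correct side of the margin, that is, on or beyond the hyperplane wᵀx + b = ±1 that belongs to its class.
The margin is defined as the distance between the two parallel hyperplanes wᵀx + b = 1 and wᵀx + b = -1, which equals 2/‖w‖.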

Q2. What is the objective function of a linear SVM?

In [None]:
Ans 2:-The objective function of a linear Support Vector Machine (SVM) is to find the optimal hyperplane (represented by the weight vector w and the bias or intercept
    term b) that maximizes the margin between the classes while minimizing classification errors. 
    This is typically expressed as a convex optimization problem with constraints.

In [None]:
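The hard-margin problem is:

    minimize    (1/2)‖w‖²  over w and b
    subject to  yᵢ(wᵀxᵢ + b) ≥ 1  for i = 1, 2, ..., n

When the classes overlap, the soft-margin version adds slack variables ξᵢ and a penalty weight C > 0:

    minimize    (1/2)‖w‖² + C Σᵢ ξᵢ  over w, b and ξ
    subject to  yᵢ(wᵀxᵢ + b) ≥ 1 - ξᵢ  and  ξᵢ ≥ 0  for i = 1, 2, ..., n

In this formulation:

w represents the weight vector.
b represents the bias or intercept term.
xᵢ represents the feature vector of the i-th data point.
yᵢ represents the class labels (+1 or -1).
ξᵢ measures how far the i-th point violates the margin, and C controls how heavily violations are penalized.
The objective minimizes the squared L2-norm (Euclidean norm) of the weight vector, ‖w‖², which corresponds to maximizing the margin between the classes.
The margin is defined as the distance between the two parallel hyperplanes wᵀx + b = 1 and wᵀx + b = -1, and its width is 2/‖w‖.
A smaller ‖w‖ therefore means a wider margin, so minimizing ‖w‖² pushes the hyperplane as far from the nearest data points as the constraints allow.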
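
To make the link between ‖w‖ and the margin concrete, the sketch below evaluates two candidate hyperplanes on a small toy dataset. It assumes the standard hard-margin form (minimize (1/2)‖w‖² subject to yᵢ(wᵀxᵢ + b) ≥ 1). The helper name `evaluate` and both candidates are hypothetical, picked by hand rather than produced by a solver.

```python
import numpy as np

# Toy data: three points per class (labels are +1 / -1)
X = np.array([[1, 3], [2, 2], [2, 4], [4, 2], [5, 3], [5, 1]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

def evaluate(w, b):
    """Return (objective, feasible, margin_width) for a candidate hyperplane."""
    w = np.asarray(w, dtype=float)
    objective = 0.5 * (w @ w)               # (1/2)||w||^2
    functional_margins = y * (X @ w + b)    # y_i (w^T x_i + b)
    feasible = bool(np.all(functional_margins >= 1))
    width = 2 / np.linalg.norm(w)           # distance between w^T x + b = +1 and -1
    return objective, feasible, width

# Two hand-picked candidates that describe the same boundary (first feature = 3)
print(evaluate([-1, 0], 3))   # smaller ||w||
print(evaluate([-2, 0], 6))   # larger ||w||
```

Both candidates satisfy the constraints, but the one with the smaller norm has the lower objective value and the wider margin, which is why the optimizer prefers it.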

Q3. What is the kernel trick in SVM?

In [None]:
Ans 3:-The kernel trick in Support Vector Machines (SVM) is a mathematical technique that lets an SVM behave as if the data had been mapped from its original feature space
into a higher-dimensional feature space, without ever carrying out that mapping.
It allows SVMs to find complex, nonlinear decision boundaries in the original feature space.
The key idea is that the SVM only needs dot products (inner products) between data points, and a kernel function K(x, z) returns the dot product in the higher-dimensional
space directly from the original vectors x and z.

In [None]:
Original Feature Space: 
    In the original feature space, data points may not be linearly separable, meaning you can't draw a single straight line or hyperplane to 
    separate the classes.

Mapping to a Higher-Dimensional Space: 
    To address this, the kernel trick uses a function, called a kernel function (e.g., polynomial kernel, radial basis function (RBF) kernel), that implicitly maps 
    the data to a higher-dimensional space where it is more likely to be linearly separable.
    
Decision Boundary in the Higher-Dimensional Space: 
    In the higher-dimensional space, a linear decision boundary is constructed to separate the classes, and this boundary can be a complex, nonlinear boundary in the
    original feature space.
    This is achieved by solving a linear SVM problem in the higher-dimensional space.

Predictions in the Original Feature Space:
    After training, a new data point x in the original feature space is classified by the sign of Σᵢ αᵢ yᵢ K(xᵢ, x) + b, where the sum runs over the support vectors
    xᵢ and the αᵢ are the weights learned during training.
    Only kernel evaluations between x and the support vectors are needed, so the new point is never explicitly mapped into the higher-dimensional space.
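
A small numerical check can make the "implicit dot product" idea concrete. The sketch below uses the homogeneous degree-2 polynomial kernel K(x, z) = (xᵀz)² on 2-D inputs, whose explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²). The function names `phi` and `poly_kernel` and the sample vectors are chosen for illustration only.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x^T z)^2."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(z))   # dot product computed in the 3-D space
implicit = poly_kernel(x, z)        # same quantity, computed from the 2-D vectors
print(explicit, implicit)
```

The two numbers agree up to floating-point rounding. For kernels such as RBF the corresponding feature space is infinite-dimensional, so only the implicit route is practical.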

Q4. What is the role of support vectors in SVM? Explain with an example.

In [None]:
Ans 4:-In Support Vector Machines (SVM), support vectors are the data points that are closest to the decision boundary (hyperplane) that separates the classes. 
They are the critical elements of SVM that determine the position and orientation of the decision boundary. 

In [None]:
Determining the Margin: 
    Support vectors are the data points that are closest to the decision boundary; in the hard-margin case they lie exactly on the hyperplanes wᵀx + b = ±1.
    The distance from the decision boundary to the nearest support vector is 1/‖w‖, so the full margin between the two classes is 2/‖w‖.
    Training chooses the boundary that maximizes this margin.
    The support vectors define this margin, and a larger margin generally helps the SVM generalize to unseen data.

Influencing the Decision Boundary:
    The decision boundary is determined entirely by the support vectors.
    Non-support vectors do not affect it: removing them and retraining gives the same boundary.
    Because usually only a fraction of the training points end up as support vectors, the trained model is compact and insensitive to points that lie far from the boundary.

Handling Non-Linear Separability:
    In cases where the data is not linearly separable in the original feature space, the support vectors play a crucial role when kernel functions are used.
    Kernel functions allow SVMs to map the data into a higher-dimensional space where it might become linearly separable. 
    In the higher-dimensional space, support vectors still play their central role in defining the decision boundary.
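
As an example, the sketch below fits a linear SVM on six toy points and then refits it using only the points that scikit-learn reports as support vectors. The data and the choice C=1.0 are illustrative assumptions; `support_` and `support_vectors_` are attributes of a fitted `SVC`.

```python
import numpy as np
from sklearn import svm

X = np.array([[1, 3], [2, 2], [2, 4], [4, 2], [5, 3], [5, 1]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1])

# Fit on all six points
full = svm.SVC(kernel='linear', C=1.0).fit(X, y)
print("Support vectors:\n", full.support_vectors_)

# Refit using only the support vectors (full.support_ holds their row indices)
sv_only = svm.SVC(kernel='linear', C=1.0).fit(X[full.support_], y[full.support_])

print("w, b (all points):     ", full.coef_[0], full.intercept_[0])
print("w, b (support vectors):", sv_only.coef_[0], sv_only.intercept_[0])
```

Both fits should report the same w and b up to solver tolerance, which illustrates that the points away from the margin play no part in placing the boundary.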

Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin and Hard margin in
SVM?

In [None]:
Ans 5:-
Let's visualize the concepts of hyperplane, marginal plane, soft margin, and hard margin in Support Vector Machines (SVM) using some simple examples and graphs.
We'll use a 2D feature space for simplicity.

In [None]:
Scenario 1: Hyperplane and Hard Margin (C=∞)
In this scenario, we have a "hard margin" SVM, which means the SVM does not tolerate any data points within the margin. 
It tries to find the largest possible margin that separates the two classes.
Let's see the graph:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Data
X = np.array([[1, 3], [2, 2], [2, 4], [4, 2], [5, 3], [5, 1]])
y = [1, 1, 1, -1, -1, -1]

# Create an SVM classifier with a linear kernel (Use a large value for C for a hard margin)
clf = svm.SVC(kernel='linear', C=1e10)
clf.fit(X, y)

# Plot the data points
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm')

# Evaluate the decision function on a grid. Its contours at -1, 0 and +1 are
# the two marginal planes and the separating hyperplane. Unlike solving for a
# slope -w[0] / w[1], this also works when the boundary is vertical
# (w[1] close to 0), which is the case for this data (boundary near x = 3).
xx, yy = np.meshgrid(np.linspace(0, 6, 200), np.linspace(0, 5, 200))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contour(xx, yy, Z, levels=[-1, 0, 1], colors='k', linestyles=['--', '-', '--'])

# Plot support vectors
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100, facecolors='none', edgecolors='k')
plt.title('Hard Margin SVM')
plt.show()
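
Scenario 2: Soft Margin (finite C)
A soft-margin SVM accepts some points inside the margin in exchange for a wider margin. The sketch below is one possible illustration: it reuses the six points from Scenario 1 and adds a hypothetical seventh point at (3.5, 2), labelled +1, that sits close to the other class. The values C=1e10 and C=1 are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Six points from Scenario 1 plus one hypothetical class +1 point at (3.5, 2)
X = np.array([[1, 3], [2, 2], [2, 4], [3.5, 2], [4, 2], [5, 3], [5, 1]])
y = np.array([1, 1, 1, 1, -1, -1, -1])

xx, yy = np.meshgrid(np.linspace(0, 6, 200), np.linspace(0, 5, 200))
grid = np.c_[xx.ravel(), yy.ravel()]

settings = [("Hard margin (C=1e10)", 1e10), ("Soft margin (C=1)", 1.0)]
widths = []
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, (label, C) in zip(axes, settings):
    clf = svm.SVC(kernel='linear', C=C).fit(X, y)
    widths.append(2 / np.linalg.norm(clf.coef_[0]))   # margin width 2/||w||
    Z = clf.decision_function(grid).reshape(xx.shape)
    ax.scatter(X[:, 0], X[:, 1], c=y, cmap='coolwarm')
    # Solid line: hyperplane; dashed lines: marginal planes
    ax.contour(xx, yy, Z, levels=[-1, 0, 1], colors='k', linestyles=['--', '-', '--'])
    ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
               s=100, facecolors='none', edgecolors='k')
    ax.set_title(f"{label}, width = {widths[-1]:.2f}")
plt.show()
```

With the very large C the margin has to squeeze between (3.5, 2) and (4, 2), while the smaller C lets the added point fall inside the margin and yields a wider one.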


Q6. SVM Implementation through Iris dataset.
~ Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
~ Train a linear SVM classifier on the training set and predict the labels for the testing set.
~ Compute the accuracy of the model on the testing set.
~ Plot the decision boundaries of the trained model using two of the features.
~ Try different values of the regularisation parameter C and see how it affects the performance of
the model.

In [None]:
# Ans 6:-
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the data into a training set and a testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a linear SVM classifier
clf = svm.SVC(kernel='linear', C=1)  # You can experiment with different values of C
clf.fit(X_train[:, :2], y_train)  # Using only two features for visualization

# Predict labels for the testing set
y_pred = clf.predict(X_test[:, :2])

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Create a mesh to plot decision boundaries
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02), np.arange(y_min, y_max, 0.02))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Plot decision boundaries
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.title('SVM Decision Boundaries (C=1)')
plt.show()
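
For the last part of the task, one way to compare values of C is to repeat the fit in a loop. This is a minimal sketch that assumes the same two features and the same split settings as above; the list of C values is an arbitrary choice.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()
X2 = iris.data[:, :2]   # sepal length and sepal width only
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X2, y, test_size=0.3, random_state=42)

results = {}
for C in [0.01, 0.1, 1, 10, 100]:
    clf = svm.SVC(kernel='linear', C=C).fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    n_sv = int(clf.n_support_.sum())   # total support vectors across classes
    results[C] = (acc, n_sv)
    print(f"C={C:<6} accuracy={acc:.3f} support vectors={n_sv}")
```

A small C penalizes margin violations lightly (softer margin, usually more support vectors), while a large C fits the training data more tightly. The printed accuracies show how much that trade-off matters on this split.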
