Q1. What is the mathematical formula for a linear SVM?
Q2. What is the objective function of a linear SVM?
Q3. What is the kernel trick in SVM?
Q4. What is the role of support vectors in SVM? Explain with an example.
Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin and Hard margin in SVM?
Q6. SVM Implementation through Iris dataset.
- Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
- Train a linear SVM classifier on the training set and predict the labels for the testing set.
- Compute the accuracy of the model on the testing set.
- Plot the decision boundaries of the trained model using two of the features.
- Try different values of the regularisation parameter C and see how it affects the performance of the model.
Bonus task: Implement a linear SVM classifier from scratch using Python and compare its performance with the scikit-learn implementation.




**Q1. What is the mathematical formula for a linear SVM?**

The mathematical formula for a linear Support Vector Machine (SVM) is given by:

\[ f(x) = w^T x + b \]

where:
- \( f(x) \) is the decision function; its sign assigns input \( x \) to one of the two classes (e.g., -1 or +1).
- \( w \) is the weight vector.
- \( x \) is the input feature vector.
- \( b \) is the bias term.

In a binary classification problem, the decision function output \( f(x) \) determines the class label based on whether it is greater than or less than zero.
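The decision rule above can be sketched in a few lines of NumPy. The weight vector, bias, and sample points below are hypothetical values chosen for illustration; they are not learned from any dataset.

```python
import numpy as np

# Hypothetical parameters chosen for illustration, not learned from data
w = np.array([2.0, -1.0])
b = -0.5


def decision_function(x):
    """Signed score f(x) = w^T x + b for one sample or a batch of samples."""
    return x @ w + b


def predict(x):
    """Class label in {-1, +1}, taken from the sign of the score."""
    return np.where(decision_function(x) >= 0, 1, -1)


X_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
scores = decision_function(X_demo)
labels = predict(X_demo)
print(scores)  # [ 1.5 -1.5]
print(labels)  # [ 1 -1]
```

Points with a score of exactly zero lie on the hyperplane; this sketch assigns them to the +1 class by convention.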

**Q2. What is the objective function of a linear SVM?**

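The objective of a (hard-margin) linear SVM is to maximize the margin between the two classes, whose width is \( 2 / \|w\| \), while classifying every training sample correctly. Maximizing the margin is equivalent to minimizing \( \|w\|^2 \), so the problem can be expressed as: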

\[ \min_{w,b} \frac{1}{2} \|w\|^2 \]

subject to the constraints:

\[ y_i(w^T x_i + b) \geq 1 \text{ for } i = 1, 2, ..., n \]

where:
- \( w \) is the weight vector.
- \( b \) is the bias term.
- \( x_i \) is the \( i \)th feature vector.
- \( y_i \) is the class label of the \( i \)th sample (either -1 or +1).
- \( n \) is the number of samples.
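As a small numerical check of the formulation above, the sketch below evaluates the constraints, the objective value, and the margin width \( 2 / \|w\| \) for a candidate \( (w, b) \). The toy data and the candidate solution are made up for illustration.

```python
import numpy as np

# Toy, linearly separable data (made up for illustration)
X_toy = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y_toy = np.array([1, 1, -1, -1])

# Hypothetical candidate solution
w = np.array([0.5, 0.5])
b = -1.0

# Constraint check: y_i (w^T x_i + b) >= 1 for every sample
functional_margins = y_toy * (X_toy @ w + b)
feasible = bool(np.all(functional_margins >= 1))

# Objective value (1/2)||w||^2 and the geometric margin width 2/||w||
objective = 0.5 * np.dot(w, w)
margin_width = 2.0 / np.linalg.norm(w)

print("y_i (w^T x_i + b):", functional_margins)
print("Feasible:", feasible)
print("Objective:", objective)
print("Margin width:", margin_width)
```

The samples whose value of \( y_i(w^T x_i + b) \) equals exactly 1 sit on the margin; here these are the points (2, 2) and (0, 0).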

**Q3. What is the kernel trick in SVM?**

The kernel trick in SVM allows the algorithm to implicitly map input data into a higher-dimensional feature space without computing the transformation explicitly. This is achieved by defining a kernel function \( K(x, x') = \phi(x)^T \phi(x') \) that returns the dot product of the mapped feature vectors directly from the original inputs. Because the SVM optimization and decision function depend on the data only through such dot products, the algorithm can operate in the higher-dimensional space without ever forming \( \phi(x) \).

Commonly used kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid kernels.
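To make the idea concrete, the sketch below compares a homogeneous degree-2 polynomial kernel with the dot product of an explicit feature map for 2-D inputs. The function names `poly2_kernel` and `phi` are illustrative, not part of any library.

```python
import numpy as np


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel K(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])


x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly2_kernel(x, z)        # computed in the original 2-D space
k_explicit = np.dot(phi(x), phi(z))    # computed in the mapped 3-D space

print(k_implicit, k_explicit)
```

Both values agree up to floating-point rounding, but the kernel version never builds the 3-D vectors. For an RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.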

**Q4. What is the role of support vectors in SVM? Explain with an example**

Support vectors are the data points that lie closest to the decision boundary (hyperplane) between the two classes. They play a crucial role because the position and orientation of the hyperplane are determined entirely by these points.

For example, consider a simple binary classification problem with two classes, represented by blue and red dots in a two-dimensional space. The support vectors are the blue and red points nearest the boundary, and they define the maximum margin between the two classes. Moving a support vector changes the decision boundary, whereas removing or moving any other point (as long as it stays outside the margin) leaves the boundary unchanged.
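A minimal sketch of this behaviour with scikit-learn follows. The six toy points are made up for illustration, and the very large `C` is an assumption used to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration)
X_toy = np.array([[0.0, 0.0], [-2.0, -1.0], [-1.0, -3.0],
                  [2.0, 2.0], [4.0, 5.0], [5.0, 3.0]])
y_toy = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard margin on separable data
full_model = SVC(kernel='linear', C=1e6).fit(X_toy, y_toy)
support_idx = full_model.support_

# Refit using only the support vectors and compare the hyperplanes
sv_only_model = SVC(kernel='linear', C=1e6).fit(X_toy[support_idx], y_toy[support_idx])

print("Support vectors:\n", full_model.support_vectors_)
print("w, b (all points):", full_model.coef_[0], full_model.intercept_[0])
print("w, b (support vectors only):", sv_only_model.coef_[0], sv_only_model.intercept_[0])
```

The two fits should report approximately the same \( w \) and \( b \), up to the solver's tolerance, because the discarded points lie outside the margin and do not influence the solution.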

**Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin, and Hard margin in SVM?**

- **Hyperplane:** In SVM, a hyperplane is the decision boundary \( w^T x + b = 0 \) that separates the classes. In a two-dimensional space it is a straight line, in three dimensions a plane, and in general a flat affine subspace of one dimension less than the feature space.
- **Marginal planes and margin:** The marginal planes are the two planes \( w^T x + b = \pm 1 \), parallel to the hyperplane, that pass through the closest data points of each class (the support vectors). The margin is the distance between the hyperplane and these closest points.
- **Soft margin:** In soft-margin SVM, some points are allowed to violate the margin (they may fall inside the margin or be misclassified) to improve generalization on noisy datasets. A penalty term for these violations, weighted by the regularization parameter C, is added to the objective function.
- **Hard margin:** In hard-margin SVM, no data points are allowed to violate the margin, so every training point must be classified correctly. This requires linearly separable data (otherwise no solution exists) and may lead to overfitting if the data is noisy.
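For reference, the standard soft-margin formulation can be written as follows. The slack variables \( \xi_i \) are notation introduced here; they measure how far sample \( i \) violates the margin, and C sets the weight of the penalty:

\[ \min_{w,b,\xi} \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \]

subject to the constraints:

\[ y_i(w^T x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0 \text{ for } i = 1, 2, ..., n \]

Setting every \( \xi_i = 0 \) recovers the hard-margin problem, which is the limit of very large C on separable data.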

Below are the graphical representations:

Hard Margin SVM:
![Hard Margin SVM](https://miro.medium.com/max/1200/1*XOkho2_eZqrCmW1h_2jEXg.png)

Soft Margin SVM:
![Soft Margin SVM](https://miro.medium.com/max/700/1*9nS3rkNFV9ORHw4Nxe3ABw.png)

**Q6. SVM Implementation through the Iris dataset:**

```python
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
import matplotlib.pyplot as plt
import numpy as np

# Load the Iris dataset
iris = load_iris()
X = iris.data[:, :2]  # Use only first two features for visualization
y = iris.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a linear SVM classifier on the training set
svm_classifier = SVC(kernel='linear', C=1)
svm_classifier.fit(X_train, y_train)

# Predict the labels for the testing set
y_pred = svm_classifier.predict(X_test)

# Compute the accuracy of the model on the testing set
accuracy = np.mean(y_pred == y_test)
print("Accuracy:", accuracy)

# Plot the decision boundaries of the trained model using two of the features
plt.figure(figsize=(10, 6))
ax = plt.gca()

# Create a grid covering the feature range, with a small padding
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
XX, YY = np.meshgrid(np.linspace(x_min, x_max, 300), np.linspace(y_min, y_max, 300))
grid = np.c_[XX.ravel(), YY.ravel()]

# Iris has three classes, so decision_function returns one column per class
# and cannot be reshaped to the grid; colour the grid by predicted class instead
Z = svm_classifier.predict(grid).reshape(XX.shape)
ax.contourf(XX, YY, Z, alpha=0.3, cmap='viridis')

# Plot the data points and circle the support vectors
ax.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis', edgecolors='k', s=80, label='Data points')
ax.scatter(svm_classifier.support_vectors_[:, 0], svm_classifier.support_vectors_[:, 1], s=100,
           linewidth=1, facecolors='none', edgecolors='k', label='Support Vectors')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Decision Boundaries of Linear SVM on Iris Dataset')
plt.legend()
plt.show()
```
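The last item of Q6 asks for different values of C. One possible way to do this is sketched below; the list of C values and the `results` dictionary are choices made for this illustration, and the block reloads the data so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Same data preparation as in the main example
iris = load_iris()
X = iris.data[:, :2]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Refit the classifier for several C values (values chosen for illustration)
results = {}
for C in [0.01, 0.1, 1, 10, 100]:
    model = SVC(kernel='linear', C=C).fit(X_train, y_train)
    results[C] = {
        "train_accuracy": model.score(X_train, y_train),
        "test_accuracy": model.score(X_test, y_test),
        "n_support_vectors": int(model.n_support_.sum()),
    }
    print(f"C={C}: {results[C]}")
```

A small C usually gives a wider margin with many support vectors, while a large C fits the training set more tightly with fewer support vectors. Comparing training and test accuracy across the sweep shows where that trade-off works best for this split.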

In this implementation, we load the Iris dataset, split it into training and testing sets, train a linear SVM classifier with C = 1, predict the labels for the testing set, compute the accuracy of the model, and plot the decision boundaries using the first two features of the dataset. The regularization parameter C controls the trade-off between margin width and training errors: a smaller C tolerates more margin violations, while a larger C penalizes them more heavily.
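For the bonus task, here is a minimal from-scratch sketch, not a definitive implementation. It makes several assumptions: the problem is reduced to binary (setosa versus the other two species), the features are standardized, and the soft-margin objective \( \frac{1}{2}\|w\|^2 + C \sum_i \max(0, 1 - y_i(w^T x_i + b)) \) is minimized by plain batch subgradient descent. The function name `fit_linear_svm`, the learning rate, and the epoch count are all illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def fit_linear_svm(X, y, C=1.0, lr=0.001, epochs=1000):
    """Batch subgradient descent on 0.5*||w||^2 + C*sum(max(0, 1 - y*(Xw + b)))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Samples inside the margin or misclassified contribute to the hinge loss
        violating = y * (X @ w + b) < 1
        grad_w = w - C * (y[violating][:, None] * X[violating]).sum(axis=0)
        grad_b = -C * y[violating].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Binary simplification (an assumption of this sketch): setosa = +1, others = -1
iris = load_iris()
X = iris.data[:, :2]
y = np.where(iris.target == 0, 1, -1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize with training-set statistics so one learning rate suits both features
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train_s = (X_train - mean) / std
X_test_s = (X_test - mean) / std

# From-scratch model
w, b = fit_linear_svm(X_train_s, y_train, C=1.0)
scratch_pred = np.where(X_test_s @ w + b >= 0, 1, -1)
scratch_accuracy = np.mean(scratch_pred == y_test)

# scikit-learn reference on the same data
sk_model = SVC(kernel='linear', C=1.0).fit(X_train_s, y_train)
sklearn_accuracy = sk_model.score(X_test_s, y_test)

print("From-scratch accuracy:", scratch_accuracy)
print("scikit-learn accuracy:", sklearn_accuracy)
print("From-scratch w, b:", w, b)
print("scikit-learn w, b:", sk_model.coef_[0], sk_model.intercept_[0])
```

Both models optimize the same objective, so their weights and accuracies should be close, though subgradient descent with a fixed step size only approximates the optimum. Extending the sketch to all three species would need a one-vs-rest or one-vs-one scheme on top of this binary classifier.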

