
### Q1. What is the mathematical formula for a linear SVM?

A linear Support Vector Machine (SVM) classifier scores an input with the linear decision function:

\[ f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b \]

where:
- \(\mathbf{x}\) is the input feature vector.
- \(\mathbf{w}\) is the weight vector.
- \(b\) is the bias term.

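A sample is assigned to the positive class when \(f(\mathbf{x}) \ge 0\) and to the negative class otherwise, so the decision boundary (hyperplane) is the set of points where \(f(\mathbf{x}) = 0\):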

\[ \mathbf{w} \cdot \mathbf{x} + b = 0 \]
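
As a minimal sketch of the formula above, the snippet below evaluates \(f(\mathbf{x})\) and the resulting class for three points. The values of `w` and `b` are hypothetical and chosen only for illustration; they do not come from a trained model.

```python
import numpy as np

# Hypothetical parameters of a linear SVM (illustrative values, not trained)
w = np.array([2.0, -1.0])   # weight vector
b = -0.5                    # bias term

def decision_function(x):
    """Return f(x) = w . x + b."""
    return np.dot(w, x) + b

def predict(x):
    """Assign +1 when f(x) >= 0, otherwise -1."""
    return 1 if decision_function(x) >= 0 else -1

x_pos = np.array([1.0, 0.0])    # f = 2.0 - 0.5 = 1.5
x_neg = np.array([0.0, 1.0])    # f = -1.0 - 0.5 = -1.5
x_on = np.array([0.25, 0.0])    # f = 0.5 - 0.5 = 0.0, lies on the hyperplane

for x in (x_pos, x_neg, x_on):
    print(x, decision_function(x), predict(x))
```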

### Q2. What is the objective function of a linear SVM?

The objective function of a linear SVM aims to find the optimal hyperplane that maximizes the margin between the two classes while minimizing classification errors. It consists of two parts: the hinge loss function and the regularization term.

For a linear SVM, the objective function \(J(\mathbf{w}, b)\), which is minimized over \(\mathbf{w}\) and \(b\), is:

\[ J(\mathbf{w}, b) = \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i=1}^N \max(0, 1 - y_i (\mathbf{w} \cdot \mathbf{x}_i + b)) \]

where:
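- \(\frac{1}{2}\|\mathbf{w}\|^2\) is the regularization term. The margin width is \(2 / \|\mathbf{w}\|\), so keeping the weights small widens the margin and helps prevent overfitting.
- \(C > 0\) is the regularization parameter that controls the trade-off between maximizing the margin and minimizing the classification error. A large \(C\) penalizes margin violations heavily; a small \(C\) tolerates more violations in exchange for a wider margin.
- \(N\) is the number of training samples.
- \(y_i\) is the true label of the \(i\)-th sample (\(y_i \in \{-1, 1\}\)).
- \(\mathbf{x}_i\) is the feature vector of the \(i\)-th sample.
- \(\max(0, 1 - y_i (\mathbf{w} \cdot \mathbf{x}_i + b))\) is the hinge loss of the \(i\)-th sample. It is zero when the sample is on the correct side with \(y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1\) and grows linearly otherwise.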
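
The objective can be evaluated directly with NumPy. The sketch below is only an illustration of the formula, not a training routine; the function name `svm_objective` and the three toy samples are assumptions made for this example.

```python
import numpy as np

def svm_objective(w, b, X, y, C=1.0):
    """Evaluate J(w, b) = 0.5 * ||w||^2 + C * sum of hinge losses."""
    margins = y * (X @ w + b)               # y_i * (w . x_i + b) for every sample
    hinge = np.maximum(0.0, 1.0 - margins)  # hinge loss per sample
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Toy data (hypothetical): two positive samples and one negative sample
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1])
w = np.array([1.0, 0.0])
b = 0.0

# Margins are 2.0, 0.5 and 1.0, so only the second sample has a non-zero
# hinge loss (0.5). With C = 1: J = 0.5 * 1 + 1 * 0.5 = 1.0
print(svm_objective(w, b, X, y, C=1.0))
# With C = 10 the same violation costs ten times more: J = 0.5 + 10 * 0.5 = 5.5
print(svm_objective(w, b, X, y, C=10.0))
```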

### Q3. What is the kernel trick in SVM?

The kernel trick lets an SVM learn a non-linear decision boundary by implicitly mapping the data into a higher-dimensional feature space where a linear separation may be possible. Instead of computing the mapping \(\phi\) explicitly, the SVM evaluates a kernel function \(K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i) \cdot \phi(\mathbf{x}_j)\), which returns the dot product of the transformed feature vectors directly from the original inputs. Because training and prediction depend on the data only through such dot products, the SVM can handle non-linearly separable data without ever building the high-dimensional features.

Common kernel functions include:
- **Linear kernel**: \(K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j\)
- **Polynomial kernel**: \(K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + 1)^d\), where \(d\) is the polynomial degree
- **Radial Basis Function (RBF) kernel**: \(K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)\), where \(\gamma > 0\) controls how quickly similarity decays with distance
- **Sigmoid kernel**: \(K(\mathbf{x}_i, \mathbf{x}_j) = \tanh(\alpha \mathbf{x}_i \cdot \mathbf{x}_j + c)\), where \(\alpha\) is a scale and \(c\) an offset
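
To make the trick concrete, the sketch below checks numerically that the polynomial kernel with \(d = 2\) on 2D inputs equals an ordinary dot product in a 6-dimensional feature space. The explicit map `phi` is one standard choice written out for this example; the function names are illustrative.

```python
import numpy as np

def poly_kernel(x, z, d=2):
    """Polynomial kernel K(x, z) = (x . z + 1)^d, computed in the input space."""
    return (np.dot(x, z) + 1.0) ** d

def phi(x):
    """Explicit feature map for d = 2 on 2D inputs (6 dimensions)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, z)        # x . z = 1, so (1 + 1)^2 = 4
k_explicit = np.dot(phi(x), phi(z))   # same value via the explicit mapping

print(k_implicit, k_explicit)
```

Both routes give the same number, but the kernel needs only one 2D dot product, while the explicit route has to build the larger vectors first.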

### Q4. What is the role of support vectors in SVM? Explain with example.

Support vectors are the training points that lie closest to the decision boundary (hyperplane): those on the margin and, in a soft-margin SVM, any points inside the margin or on the wrong side of the boundary. They alone define the position and orientation of the hyperplane. The SVM chooses the hyperplane that maximizes the margin, and removing any training point that is not a support vector leaves that hyperplane unchanged.

Example:
Consider a binary classification problem with two classes (red and blue) in a two-dimensional feature space. The support vectors are the closest red and blue points to the decision boundary. These points are crucial as they determine the maximum margin hyperplane.

In the diagram below, the solid vertical line (`|`) is the decision boundary, the dotted lines (`:`) are the margins, and the points in parentheses, which lie exactly on the margins, are the support vectors:

```
 class -1         w.x+b=-1     w.x+b=0     w.x+b=+1      class +1
  o     o             :           |           :           x     x
     o     o         (o)          |          (x)       x     x
  o     o             :           |           :           x     x
     o               (o)          |           :        x     x
```
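
The role of the support vectors can also be checked in code. The sketch below fits scikit-learn's `SVC` on six hypothetical 2D points, lists the support vectors it found, and then refits on the support vectors alone to show that the hyperplane does not change. The toy data and the variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data: three points per class
X = np.array([[-3.0, 0.0], [-2.0, 1.0], [-1.0, 0.0],
              [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

model = SVC(kernel="linear", C=1.0)
model.fit(X, y)

# Training points that define the margin
print("support vectors:\n", model.support_vectors_)
# Their decision values should be close to -1 and +1 (they sit on the margins)
print("f(support vectors):", model.decision_function(model.support_vectors_))

# Refit using only the support vectors; the other points are not needed
mask = np.zeros(len(X), dtype=bool)
mask[model.support_] = True
reduced = SVC(kernel="linear", C=1.0).fit(X[mask], y[mask])

print("full fit:    w =", model.coef_, " b =", model.intercept_)
print("reduced fit: w =", reduced.coef_, " b =", reduced.intercept_)
```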

### Q5. Illustrate with examples and graphs of Hyperplane, Marginal plane, Soft margin, and Hard margin in SVM.

#### Hyperplane
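The hyperplane is the decision boundary \(\mathbf{w} \cdot \mathbf{x} + b = 0\) that separates the classes. For a linear SVM, it is a line in 2D, a plane in 3D, and an \((n-1)\)-dimensional flat surface in \(n\) dimensions.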

#### Marginal Plane
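The marginal planes are the two boundaries \(\mathbf{w} \cdot \mathbf{x} + b = +1\) and \(\mathbf{w} \cdot \mathbf{x} + b = -1\). They are parallel to the hyperplane and pass through the support vectors. The distance between them, \(2 / \|\mathbf{w}\|\), is the margin.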
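
The geometry can be worked through numerically. The sketch below assumes the usual scaling in which the marginal planes satisfy \(\mathbf{w} \cdot \mathbf{x} + b = \pm 1\), and uses a hypothetical `w` and `b` (with \(\|\mathbf{w}\| = 5\)) so the numbers are easy to follow.

```python
import numpy as np

# Hypothetical hyperplane parameters, chosen so that ||w|| = 5
w = np.array([3.0, 4.0])
b = -5.0

def signed_distance(x):
    """Signed distance from x to the hyperplane w . x + b = 0."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

# Distance between the two marginal planes
margin_width = 2.0 / np.linalg.norm(w)    # 2 / 5 = 0.4

x_on_plus = np.array([2.0, 0.0])    # w . x + b = 6 - 5 = +1
x_on_minus = np.array([0.0, 1.0])   # w . x + b = 4 - 5 = -1

print(margin_width)
print(signed_distance(x_on_plus), signed_distance(x_on_minus))
```

Each marginal plane sits half a margin width away from the hyperplane, one on either side.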

#### Hard Margin
A hard margin SVM requires every data point to be classified correctly and to lie on or outside the margin. This is feasible only when the data is linearly separable.

Graphically:
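```
  Class -1
  o   o   o   o   o   o
    o   o   o   o   o
  . . . (o) . . . . (o) . . .   margin      (w.x + b = -1)

  ---------------------------   hyperplane  (w.x + b = 0)

  . . . . . . (x) . . . . . .   margin      (w.x + b = +1)
    x   x   x   x   x
  x   x   x   x   x   x
  Class +1

  Points in parentheses are support vectors; no point lies between the margins.
```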

#### Soft Margin
A soft margin SVM allows some points to lie inside the margin or even to be misclassified. This is useful for noisy data that is not perfectly separable.

Graphically:
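```
  Class -1
  o   o   o   o   o   o
    o   o   o   o   o
  . . . (o) . . . . (o) . . .   margin      (w.x + b = -1)
              (o)               inside the margin, still on the correct side
  ---------------------------   hyperplane  (w.x + b = 0)
        (o)                     on the wrong side: misclassified
  . . . . . . (x) . . . . . .   margin      (w.x + b = +1)
    x   x   x   x   x
  x   x   x   x   x   x
  Class +1

  Points in parentheses are support vectors, including the margin violators.
```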

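In the soft margin, some points are allowed to be on the wrong side of the margin or even of the hyperplane. The parameter \(C\) controls how strongly such violations are penalized: a large \(C\) approaches hard-margin behavior, while a small \(C\) accepts more violations in exchange for a wider margin.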
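
The effect of \(C\) can be seen by fitting the same data twice. The sketch below uses two overlapping synthetic clusters (generated here purely for illustration) and compares the margin width \(2 / \|\mathbf{w}\|\) and the number of support vectors for a small and a large \(C\). The specific values 0.01 and 100 are arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, overlapping 2D clusters (illustrative data, not from the notebook)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(50, 2)),
               rng.normal(1.0, 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [1] * 50)

results = {}
for C in (0.01, 100.0):
    model = SVC(kernel="linear", C=C).fit(X, y)
    margin_width = 2.0 / np.linalg.norm(model.coef_)   # distance between marginal planes
    n_support = int(model.n_support_.sum())            # support vectors over both classes
    results[C] = (margin_width, n_support)
    print(f"C={C}: margin width={margin_width:.3f}, support vectors={n_support}")
```

The smaller \(C\) should produce the wider margin, and typically more support vectors, because more points are allowed to fall inside it.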

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


In [2]:
from sklearn.datasets import load_iris
dataset = load_iris()

In [3]:
dataset.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

In [5]:
dataset.feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

In [6]:
df = pd.DataFrame(dataset.data, columns=dataset.feature_names)
df.head()

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [7]:
X = df              # feature matrix: one row per flower, four measurements each
y = dataset.target  # class labels encoded as 0, 1, 2

In [9]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [8]:
from sklearn.model_selection import train_test_split
# Hold out 33% of the samples for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

In [10]:
from sklearn.svm import SVC
# Linear-kernel SVM with the default C=1.0; SVC handles the three classes one-vs-one
model = SVC(kernel='linear')
model.fit(X_train, y_train)

In [11]:
y_pred = model.predict(X_test)

In [12]:
from sklearn.metrics import accuracy_score
# Fraction of test samples whose predicted label matches the true label
accuracy_score(y_test, y_pred)

1.0
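
The score above comes from a single train/test split. As a hedged cross-check, the sketch below evaluates the same kind of model with 5-fold cross-validation using scikit-learn's `cross_val_score`. The choice of `cv=5` is an assumption for this example, and the resulting scores are not part of the original run.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Reload the data so the snippet runs on its own
X, y = load_iris(return_X_y=True)

# Accuracy on each of 5 folds for a linear-kernel SVM with default settings
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)

print(scores)
print(scores.mean())
```

Averaging over several folds gives a less split-dependent estimate than one held-out set.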