Q1--
Answer--
Linear SVM Formula (hard-margin primal problem):

minimize (1/2) ||w||^2
subject to: y_i(w • x_i + b) ≥ 1 for all i

In this formula:
- w represents the weight vector,
- b is the bias term,
- x_i are the input vectors,
- y_i are the corresponding class labels, encoded as -1 or +1.
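
A trained linear SVM classifies a point by the sign of w • x + b. The snippet below is a minimal sketch of that rule with NumPy; the values of `w`, `b`, and the sample points are made up for illustration and do not come from any dataset.

```python
import numpy as np

# Hypothetical parameters of an already-trained linear SVM (illustrative values)
w = np.array([1.0, 0.0])   # weight vector
b = -1.0                   # bias term

# Three made-up input vectors x_i, one per row
X = np.array([[0.0, 0.0],
              [2.0, 1.0],
              [3.0, -1.0]])

# Decision values w • x_i + b, one per row of X
scores = X @ w + b

# Predicted class labels in {-1, +1}
labels = np.where(scores >= 0, 1, -1)
print(scores, labels)
```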


Q2--
Answer--

The objective of a linear Support Vector Machine (SVM) is to find the hyperplane that separates the classes of a linearly separable dataset with the largest possible margin. The margin width is 2/||w||, so maximizing it is equivalent to minimizing ||w||^2, which gives the optimization problem:
minimize (1/2) ||w||^2
subject to: y_i(w • x_i + b) ≥ 1 for all i
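
As a rough numerical check of this formulation, the sketch below evaluates the constraints, the objective, and the margin width 2/||w|| for a toy dataset. Both the data and the candidate hyperplane (`w`, `b`) are chosen by hand for illustration; nothing here is fitted by an optimizer.

```python
import numpy as np

# Toy, linearly separable data (made up for illustration)
X = np.array([[0.0, 0.0], [-1.0, 1.0], [2.0, 0.0], [3.0, 1.0]])
y = np.array([-1, -1, 1, 1])

# Candidate hyperplane chosen by hand for this toy set (an assumption, not a fitted model)
w = np.array([1.0, 0.0])
b = -1.0

# Constraint values y_i (w • x_i + b); feasibility requires every value >= 1
constraint_values = y * (X @ w + b)
feasible = bool(np.all(constraint_values >= 1))

# Objective value (1/2) ||w||^2 and the corresponding margin width 2 / ||w||
objective = 0.5 * np.dot(w, w)
margin_width = 2.0 / np.linalg.norm(w)
print(constraint_values, feasible, objective, margin_width)
```

Points whose constraint value equals exactly 1 lie on the margin; the others lie farther from the hyperplane.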


Q3--
Answer--
### Kernel Trick in SVM

The kernel trick in Support Vector Machines (SVM) allows the algorithm to operate in high-dimensional spaces without explicitly computing the coordinates of the data in those spaces. Instead, it uses kernel functions to calculate the dot products between the images of all pairs of data points in the feature space.

Mathematically, the SVM dual problem depends on the data only through dot products x_i • x_j, so each dot product can be replaced by a kernel function K(x_i, x_j) that equals a dot product in the feature space:

K(x_i, x_j) = φ(x_i) • φ(x_j)

where:
- K(x_i, x_j) is the kernel function,
- φ(x) is the feature mapping function.

Common kernel functions include:
- **Linear kernel**: K(x_i, x_j) = x_i • x_j
- **Polynomial kernel**: K(x_i, x_j) = (x_i • x_j + c)^d
- **Radial basis function (RBF) or Gaussian kernel**: K(x_i, x_j) = exp(-γ ||x_i - x_j||^2)
- **Sigmoid kernel**: K(x_i, x_j) = tanh(α x_i • x_j + c)

The kernel trick enables the SVM to find a separating hyperplane in a higher-dimensional space without the computational burden of working directly in that space.
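
To make the identity K(x_i, x_j) = φ(x_i) • φ(x_j) concrete, the sketch below compares the polynomial kernel with c = 1 and d = 2 against an explicit feature map for 2-D inputs. The helper names `poly_kernel` and `phi` and the two sample vectors are illustrative choices, not part of any library.

```python
import numpy as np

def poly_kernel(x, z, c=1.0, d=2):
    """Polynomial kernel K(x, z) = (x • z + c)^d."""
    return (np.dot(x, z) + c) ** d

def phi(x):
    """Explicit feature map matching the 2-D polynomial kernel with c = 1, d = 2."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, z)        # computed in the original 2-D space
k_explicit = np.dot(phi(x), phi(z))   # computed in the 6-D feature space
print(k_implicit, k_explicit)
```

Both values agree, but the kernel needs only one 2-D dot product, while the explicit route has to build two 6-D vectors first. For the RBF kernel the feature space is infinite-dimensional, so only the kernel route is practical.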


Q4--
Answer--
### Role of Support Vectors in SVM

Support vectors are the data points that lie closest to the decision boundary (hyperplane), and they alone determine its position and orientation. The SVM algorithm looks for the hyperplane that maximizes the margin, which is the distance between the hyperplane and the nearest data points from each class; those nearest points are the support vectors. Points farther from the boundary do not affect the solution: removing them and retraining yields the same hyperplane.
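
One way to see this is to train a linear SVM, keep only the points it reports as support vectors, and train again. The sketch below does that on a small made-up dataset with scikit-learn's `SVC`; the data and the very large `C` (used to approximate a hard margin) are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration)
X = np.array([[0.0, 0.0], [-1.0, 1.0], [-1.0, -1.0],
              [2.0, 0.0], [3.0, 1.0], [3.0, -1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard margin
full = SVC(kernel="linear", C=1e6).fit(X, y)

# Refit using only the points that the first model marked as support vectors
sv_idx = full.support_
reduced = SVC(kernel="linear", C=1e6).fit(X[sv_idx], y[sv_idx])

print("support vectors:\n", full.support_vectors_)
print("full    w, b:", full.coef_[0], full.intercept_[0])
print("reduced w, b:", reduced.coef_[0], reduced.intercept_[0])
```

The two models should report the same weights and bias up to solver tolerance.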

Q5--
Answer--
# SVM Concepts: Hyperplane, Marginal Plane, Soft Margin, and Hard Margin

## Generating Example Data

Let's use Python with `matplotlib` and `scikit-learn` to illustrate these concepts. We'll generate some sample data and create the respective SVM models.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.datasets import make_blobs

# Generate sample data
X, y = make_blobs(n_samples=100, centers=2, random_state=6)

# Approximate a hard margin SVM with a very large C (SVC has no exact hard-margin mode)
clf_hard = svm.SVC(kernel='linear', C=1e5)
clf_hard.fit(X, y)

# Create a soft margin SVM (smaller C value)
clf_soft = svm.SVC(kernel='linear', C=1.0)
clf_soft.fit(X, y)

# Function to plot decision boundary and margins
def plot_svm(clf, X, y, title):
    plt.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=plt.cm.Paired)
    # Plot support vectors
    plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100, facecolors='none', edgecolors='k')

    ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()

    xx = np.linspace(xlim[0], xlim[1], 30)
    yy = np.linspace(ylim[0], ylim[1], 30)
    YY, XX = np.meshgrid(yy, xx)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    Z = clf.decision_function(xy).reshape(XX.shape)

    ax.contour(XX, YY, Z, colors='k', levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])

    plt.title(title)
    plt.show()

# Plotting the hard margin SVM
plot_svm(clf_hard, X, y, "Hard Margin SVM")

# Plotting the soft margin SVM
plot_svm(clf_soft, X, y, "Soft Margin SVM")
```

- **Hyperplane**: The decision boundary separating the classes, w • x + b = 0.
- **Marginal planes**: The two planes parallel to the hyperplane, w • x + b = ±1, that pass through the nearest points of each class.
- **Hard margin**: Requires perfect separation of the classes, with no points inside the margin.
- **Soft margin**: Allows some points to lie inside the margin or be misclassified, which generalizes better on data that is noisy or not linearly separable.
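
The soft margin is usually formalized with slack values ξ_i = max(0, 1 - y_i(w • x_i + b)), which measure how far each point violates the margin. The sketch below computes them for a hand-picked hyperplane and a few made-up points; `w`, `b`, and the data are illustrative assumptions, not the fitted models from the plot above.

```python
import numpy as np

# Hypothetical hyperplane w • x + b = 0 (illustrative values, not a fitted model)
w = np.array([1.0, 0.0])
b = -1.0

# Made-up points and labels in {-1, +1}
X = np.array([[0.0, 0.0], [0.5, 0.0], [1.5, 0.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1, -1, -1, 1, 1])

# Slack xi_i = max(0, 1 - y_i (w • x_i + b)); zero means the margin is respected
slack = np.maximum(0.0, 1.0 - y * (X @ w + b))

inside_margin = (slack > 0) & (slack <= 1)   # violates the margin, not on the wrong side
misclassified = slack > 1                    # on the wrong side of the hyperplane
print(slack, inside_margin, misclassified)
```

A hard margin requires every slack value to be zero. A soft margin tolerates nonzero slack and uses C to set how heavily it is penalized.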


Q6--
Answer--
# SVM Implementation through Iris Dataset

## Steps:
1. Load the iris dataset from the scikit-learn library and split it into a training set and a testing set.
2. Train a linear SVM classifier on the training set and predict the labels for the testing set.
3. Compute the accuracy of the model on the testing set.
4. Plot the decision boundaries of the trained model using two of the features.
5. Try different values of the regularization parameter C and see how it affects the performance of the model.

### 1. Load the Iris Dataset and Split it into Training and Testing Sets

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # We will use only the first two features for visualization
y = iris.target

# Split the dataset into a training set and a testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
```

### 2. Train a Linear SVM Classifier

```python
# Train a linear SVM classifier
clf = SVC(kernel='linear', C=1.0)
clf.fit(X_train, y_train)

# Predict the labels for the testing set
y_pred = clf.predict(X_test)
```

### 3. Compute the Accuracy of the Model

```python
# Compute the accuracy of the model on the testing set
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')
```

### 4. Plot the Decision Boundaries of the Trained Model

```python
# Function to plot decision boundaries
def plot_decision_boundaries(X, y, model, title):
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))
    
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o')
    plt.title(title)
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[1])
    plt.show()

# Plot the decision boundaries
plot_decision_boundaries(X_test, y_test, clf, 'SVM Decision Boundaries (C=1.0)')
```

### 5. Try Different Values of the Regularization Parameter C

```python
# Train and evaluate the model with different values of C
C_values = [0.01, 0.1, 1, 10, 100]
for C in C_values:
    clf = SVC(kernel='linear', C=C)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f'C={C}: Accuracy: {accuracy * 100:.2f}%')
    plot_decision_boundaries(X_test, y_test, clf, f'SVM Decision Boundaries (C={C})')
```
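
Comparing values of C by their test accuracy, as in step 5, uses the test set for model selection. A common alternative, sketched here as an assumption rather than part of the original steps, is to choose C by cross-validation on the training set with scikit-learn's `GridSearchCV` and evaluate on the test set once at the end. The 5-fold setting and the grid of C values are illustrative.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[:, :2]  # same two features as in the steps above
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# 5-fold cross-validation over C, using the training set only
param_grid = {"C": [0.01, 0.1, 1, 10, 100]}
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5)
search.fit(X_train, y_train)

print("Best C:", search.best_params_["C"])
print(f"Mean CV accuracy: {search.best_score_:.3f}")
print(f"Test accuracy: {search.score(X_test, y_test):.3f}")
```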


bonus task--
# Implementing a Linear SVM Classifier from Scratch

## Steps:
1. Load the Iris dataset and preprocess it.
2. Implement the linear SVM classifier from scratch.
3. Train the custom SVM classifier on the training set.
4. Evaluate its performance on the testing set.
5. Compare the performance with the scikit-learn implementation.

### 1. Load the Iris Dataset and Preprocess it

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

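# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
# The custom SVM below is a binary classifier, so reduce the task to
# setosa (-1) versus the other two species (+1)
y = np.where(iris.target == 0, -1, 1)

# Split the dataset into a training set and a testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize the features, fitting the scaler on the training set only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)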

```

### 2. Implement the Linear SVM Classifier from Scratch

```python
class LinearSVM:
    def __init__(self, learning_rate=0.001, lambda_param=0.01, num_iterations=1000):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.num_iter = num_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        y_ = np.where(y <= 0, -1, 1)
        self.weights = np.zeros(n_features)
        self.bias = 0

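        for _ in range(self.num_iter):
            for idx, x_i in enumerate(X):
                # Margin constraint y_i (w • x_i - b) >= 1 for the current sample
                condition = y_[idx] * (np.dot(x_i, self.weights) - self.bias) >= 1
                if condition:
                    # Constraint satisfied: only the regularization term contributes
                    self.weights -= self.lr * (2 * self.lambda_param * self.weights)
                else:
                    # Constraint violated: add the hinge-loss gradient for w and b
                    self.weights -= self.lr * (2 * self.lambda_param * self.weights - y_[idx] * x_i)
                    self.bias -= self.lr * y_[idx]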

    def predict(self, X):
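        # Decision value uses the same w • x - b convention as fit
        approx = np.dot(X, self.weights) - self.bias
        # Map to labels in {-1, +1}; points exactly on the hyperplane go to +1
        return np.where(approx >= 0, 1, -1)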
```

### 3. Train the Custom SVM Classifier on the Training Set

```python
# Instantiate and train the custom SVM classifier
custom_svm = LinearSVM()
custom_svm.fit(X_train, y_train)
```

### 4. Evaluate its Performance on the Testing Set

```python
# Predict using the custom SVM classifier
y_pred_custom = custom_svm.predict(X_test)

# Calculate accuracy
accuracy_custom = np.mean(y_pred_custom == y_test)
print(f'Custom SVM Accuracy: {accuracy_custom * 100:.2f}%')
```

### 5. Compare Performance with the scikit-learn Implementation

```python
from sklearn.svm import SVC

# Instantiate and train the scikit-learn SVM classifier
sklearn_svm = SVC(kernel='linear')
sklearn_svm.fit(X_train, y_train)

# Predict using the scikit-learn SVM classifier
y_pred_sklearn = sklearn_svm.predict(X_test)

# Calculate accuracy
accuracy_sklearn = np.mean(y_pred_sklearn == y_test)
print(f'Scikit-learn SVM Accuracy: {accuracy_sklearn * 100:.2f}%')
```
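
The update rule inside `LinearSVM.fit` above can be read as per-sample subgradient descent on a regularized hinge loss. The derivation below is a sketch under that reading: the objective is inferred from the code rather than stated in the original, λ stands for `lambda_param`, and η stands for `learning_rate`. Note that the class uses the convention w • x - b, whereas the formulas in Q1 and Q2 use w • x + b.

```latex
% Per-sample objective assumed to underlie LinearSVM.fit
J_i(w, b) = \lambda \lVert w \rVert^2 + \max\bigl(0,\; 1 - y_i (w \cdot x_i - b)\bigr)

% Subgradients, split on whether the margin constraint holds
\nabla_w J_i =
\begin{cases}
  2 \lambda w             & \text{if } y_i (w \cdot x_i - b) \ge 1 \\
  2 \lambda w - y_i x_i   & \text{otherwise}
\end{cases}
\qquad
\frac{\partial J_i}{\partial b} =
\begin{cases}
  0     & \text{if } y_i (w \cdot x_i - b) \ge 1 \\
  y_i   & \text{otherwise}
\end{cases}

% Update applied for every sample in every iteration
w \leftarrow w - \eta \, \nabla_w J_i ,
\qquad
b \leftarrow b - \eta \, \frac{\partial J_i}{\partial b}
```

A larger λ puts more weight on a wide margin, so it acts roughly like a smaller C in scikit-learn's `SVC`.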

