## Ques 1:

### Ans: A linear SVM is built around the decision function:
### f(x) = w * x + b
### where w is the weight vector, x is the input feature vector, b is the bias term, and w * x denotes the dot product of w and x. The decision boundary is the hyperplane where w * x + b = 0; a point is assigned to the positive class when f(x) > 0 and to the negative class when f(x) < 0. Training finds the values of w and b for which this boundary separates the positive and negative instances with the largest margin while keeping classification errors to a minimum.
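
As a minimal sketch of how the decision function is used at prediction time, the snippet below scores three points with hand-picked parameters. The values of `w` and `b` and the helper names `decision_function` and `predict` are illustrative assumptions, not the output of any training run.

```python
import numpy as np

# Hypothetical parameters of an already-trained linear SVM (illustrative values only).
w = np.array([2.0, -1.0])
b = -1.0

def decision_function(X):
    """Return the signed score w . x + b for each row of X."""
    return X @ w + b

def predict(X):
    """Assign +1 when the score is non-negative, -1 otherwise."""
    return np.where(decision_function(X) >= 0, 1, -1)

X = np.array([[2.0, 1.0],   # score  2.0 -> positive side
              [0.0, 1.0],   # score -2.0 -> negative side
              [1.0, 1.0]])  # score  0.0 -> exactly on the boundary

scores = decision_function(X)
labels = predict(X)
print(scores, labels)
```

Treating a score of exactly 0 as positive is a convention chosen for this sketch; the sign of the score is what determines the class everywhere else.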

## Ques 2:

### Ans: The objective of a (hard-margin) linear SVM is to maximize the margin between the decision boundary and the closest data points, subject to the constraint that all data points are correctly classified.
### Mathematically, the objective function of a linear SVM can be expressed as:
### minimize: (1/2) * ||w||^2
### subject to:
### y_i * (w * x_i + b) >= 1
### for all i = 1, ..., n
### where w is the weight vector, b is the bias term, x_i is the i-th input feature vector, y_i is the corresponding class label (+1 or -1), and n is the number of training instances.
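### The width of the margin is 2 / ||w||, so minimizing (1/2) * ||w||^2 is equivalent to maximizing the margin. The constraints require every instance to lie on the correct side of the boundary with a functional margin of at least 1. Keeping ||w|| small also acts as regularization and helps to prevent overfitting. This is the hard-margin formulation, which assumes the data are linearly separable; the soft-margin version adds slack variables, weighted by a parameter C, so that some violations are tolerated.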
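
To make the constraint and the objective concrete, here is a small sketch that checks a hand-chosen hyperplane against a toy data set. The data, `w`, and `b` are made up for illustration; nothing here is solved by an optimizer.

```python
import numpy as np

# Toy, linearly separable data (made up for illustration).
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Candidate hyperplane chosen by hand, not produced by a solver.
w = np.array([0.5, 0.5])
b = -1.0

# Constraint: y_i * (w . x_i + b) >= 1 for every training instance.
functional_margins = y * (X @ w + b)
feasible = bool(np.all(functional_margins >= 1))

# Objective value and the geometric margin width 2 / ||w|| it corresponds to.
objective = 0.5 * np.dot(w, w)
margin_width = 2 / np.linalg.norm(w)

print(functional_margins, feasible, objective, margin_width)
```

For this toy set the margin width comes out as 2 * sqrt(2), which is the distance between the closest opposite-class points (2, 2) and (0, 0), so no feasible hyperplane can have a wider margin.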

## Ques 3:

### Ans: The kernel trick is a technique used in SVMs to transform the input features of a dataset into a higher-dimensional space without explicitly computing the transformation. This is done by defining a kernel function that computes the dot product between pairs of transformed input feature vectors in the higher-dimensional space, without actually computing the transformation itself.
### A valid kernel function satisfies Mercer's condition, which guarantees that the values it returns correspond to inner products in some (possibly infinite-dimensional) feature space. The most commonly used kernels are the linear kernel, the polynomial kernel, and the Gaussian radial basis function (RBF) kernel.
### By using the kernel trick, the SVM learns a decision boundary that is linear in the transformed feature space but nonlinear in the original input space, while all computations are carried out on the original input vectors. This keeps training feasible even when the feature space is very high-dimensional, because the cost depends on evaluating the kernel for pairs of training points rather than on the dimension of the transformed space.
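
The equivalence behind the kernel trick can be checked numerically. The sketch below uses the homogeneous degree-2 polynomial kernel on 2D inputs, for which the explicit feature map is known in closed form. The function names `phi` and `poly2_kernel` are chosen for this example only.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel for a 2D input."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def poly2_kernel(x, z):
    """Kernel value (x . z)^2, computed without leaving the input space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = np.dot(phi(x), phi(z))  # inner product in the 3D feature space
implicit = poly2_kernel(x, z)      # same quantity via the kernel

print(explicit, implicit)
```

Both routes give 121 here (up to floating-point rounding), but the kernel route never builds the 3D vectors. For kernels such as RBF the explicit map is infinite-dimensional, so the kernel route is the only practical one.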

## Ques 4:

### Ans: Support vectors are the training instances that lie closest to the decision boundary in an SVM: those exactly on the margin and, in the soft-margin case, those inside the margin or on the wrong side of the boundary. These instances define the decision boundary and determine the margin of the classifier. In fact, the decision boundary of an SVM is determined entirely by the support vectors.
### During training, the algorithm finds the hyperplane that maximizes the margin between the support vectors of the different classes. Instances that are not support vectors do not affect the location or orientation of the decision boundary, so they could be removed from the training set without changing the final model.
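
One way to see that only the support vectors matter is to refit on them alone and compare the resulting hyperplane. This is a sketch on made-up, linearly separable data; the large `C` is an assumption used to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration).
X = np.array([[0.0, 0.0], [-1.0, 0.0], [0.0, -1.0], [-2.0, -2.0],
              [2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [4.0, 4.0]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

# Fit on all points; a large C approximates the hard-margin SVM.
full = SVC(kernel='linear', C=1e3).fit(X, y)
sv_idx = full.support_  # indices of the support vectors in X

# Refit using only the support vectors.
reduced = SVC(kernel='linear', C=1e3).fit(X[sv_idx], y[sv_idx])

print("support vector indices:", sv_idx)
print("full    w, b:", full.coef_, full.intercept_)
print("reduced w, b:", reduced.coef_, reduced.intercept_)
```

The two fits should agree up to the solver's numerical tolerance, even though the second one sees only a fraction of the data.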

## Ques 5:

### Ans: SVM stands for Support Vector Machine, a supervised machine learning algorithm used for classification and regression. SVMs are based on the idea of finding the hyperplane that best separates the classes in a dataset. A hyperplane is a flat surface with one dimension fewer than the space it lives in: a line in two dimensions, a plane in three dimensions, and so on. It divides the space into two regions. The goal of an SVM is to find the hyperplane that maximizes the margin between the two classes, where the margin is the distance between the hyperplane and the nearest data points from each class.

## Ques 6:

### Ans: 

In [40]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

In [2]:
df = sns.load_dataset('iris')

In [3]:
df.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


In [4]:
df['species'].value_counts()

setosa        50
versicolor    50
virginica     50
Name: species, dtype: int64

In [8]:
## Encoding the three species names as integer class labels
df['species'] = df['species'].replace({'setosa': -1, 'versicolor': 0, 'virginica': 1})

In [29]:
## Splitting the data into independent and dependent variables
X = df.iloc[:,:-1]
y = df.species

In [28]:
X.head()

|   | sepal_length | sepal_width | petal_length | petal_width |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [30]:
## Splitting the data into train and test
from sklearn.model_selection import train_test_split

In [31]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.25,random_state=0)

In [32]:
X_train.head()

|   | sepal_length | sepal_width | petal_length | petal_width |
|---|---|---|---|---|
| 61 | 5.9 | 3.0 | 4.2 | 1.5 |
| 92 | 5.8 | 2.6 | 4.0 | 1.2 |
| 112 | 6.8 | 3.0 | 5.5 | 2.1 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 141 | 6.9 | 3.1 | 5.1 | 2.3 |


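#### Here, there are three classes to classify, so I will use the one-vs-rest setting. Note that scikit-learn's SVC always trains one-vs-one classifiers internally; decision_function_shape='ovr' only changes the shape of the decision function it returns.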

In [26]:
from sklearn.svm import SVC

In [34]:
classifier = SVC(kernel='linear', C=1, decision_function_shape='ovr')

In [35]:
classifier.fit(X_train, y_train)

In [36]:
y_pred = classifier.predict(X_test)

In [37]:
y_pred

array([ 1,  0, -1,  1, -1,  1, -1,  0,  0,  0,  1,  0,  0,  0,  0, -1,  0,
        0, -1, -1,  1,  0, -1, -1,  1, -1, -1,  0,  0, -1,  1,  0, -1,  1,
        1,  0, -1,  1])

In [38]:
from sklearn.metrics import accuracy_score, classification_report

In [39]:
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))

0.9736842105263158
              precision    recall  f1-score   support

          -1       1.00      1.00      1.00        13
           0       1.00      0.94      0.97        16
           1       0.90      1.00      0.95         9

    accuracy                           0.97        38
   macro avg       0.97      0.98      0.97        38
weighted avg       0.98      0.97      0.97        38
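
Because scikit-learn's `SVC` trains its multiclass model one-vs-one internally regardless of `decision_function_shape`, a true one-vs-rest scheme needs an explicit wrapper. The sketch below uses `OneVsRestClassifier` for that. It loads iris through `sklearn.datasets.load_iris` (labels 0, 1, 2 rather than the -1, 0, 1 encoding used above) and uses separate variable names, both of which are choices made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Same split settings as above, but on scikit-learn's bundled copy of iris.
X_iris, y_iris = load_iris(return_X_y=True)
Xi_train, Xi_test, yi_train, yi_test = train_test_split(
    X_iris, y_iris, test_size=0.25, random_state=0)

# One binary linear SVM per class, each trained as "this class vs. the rest".
ovr = OneVsRestClassifier(SVC(kernel='linear', C=1))
ovr.fit(Xi_train, yi_train)

n_binary = len(ovr.estimators_)          # one fitted estimator per class
ovr_accuracy = ovr.score(Xi_test, yi_test)
print(n_binary, ovr_accuracy)
```

With three classes both schemes train three binary classifiers, so the difference lies in which samples each classifier sees, not in how many classifiers there are.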



### Trying different values of the regularisation parameter C to see how it affects the performance of the model.

In [43]:
for c in [0.1, 1, 10]:
    classifier = SVC(kernel='linear',C=c, decision_function_shape='ovr')
    classifier.fit(X_train, y_train)
    y_pred = classifier.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy for C={c}: {accuracy}")

Accuracy for C=0.1: 0.9736842105263158
Accuracy for C=1: 0.9736842105263158
Accuracy for C=10: 0.9736842105263158
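
The identical accuracies above come from a single 38-sample test split, which may be too small to show any effect of C. A rough way to probe further is cross-validation over a wider range of C, together with the number of support vectors each model keeps. This is a sketch under assumptions: the C grid, the 5-fold setting, and the use of `load_iris` are choices made here, not part of the experiment above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

results = {}
for c in [0.01, 0.1, 1, 10, 100]:
    model = SVC(kernel='linear', C=c)
    # Mean accuracy over 5 folds (cross_val_score fits clones of the model).
    scores = cross_val_score(model, X_iris, y_iris, cv=5)
    # Fit once on all data to count the support vectors kept at this C.
    n_sv = int(model.fit(X_iris, y_iris).n_support_.sum())
    results[c] = (scores.mean(), n_sv)
    print(f"C={c}: mean CV accuracy={scores.mean():.3f}, support vectors={n_sv}")
```

In general a smaller C tolerates more margin violations, which gives a wider margin and typically more support vectors, while a larger C fits the training data more tightly.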
