# Q1.
### What is the mathematical formula for a linear SVM?

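The decision boundary of a linear SVM is the hyperplane $w^T x + b = 0$, where $w$ is the weight vector and $b$ is the bias. A point $x$ is classified by the sign of $w^T x + b$.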
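As a minimal sketch of how the formula is used, the snippet below evaluates $w^T x + b$ for a few points. The values of `w` and `b` and the helper name `linear_decision` are hypothetical, chosen only for illustration; they are not fitted to any data.

```python
import numpy as np

def linear_decision(w, b, x):
    """Return the signed score w^T x + b for a single point x."""
    return float(np.dot(w, x) + b)

# Hypothetical parameters, picked by hand for illustration only.
w = np.array([2.0, -1.0])
b = -1.0

score_pos = linear_decision(w, b, np.array([2.0, 1.0]))  # positive side
score_neg = linear_decision(w, b, np.array([0.0, 1.0]))  # negative side
score_on = linear_decision(w, b, np.array([1.0, 1.0]))   # on the hyperplane

print(score_pos, score_neg, score_on)
```

A positive score puts the point on one side of the hyperplane, a negative score on the other, and a score of zero means the point lies on the hyperplane itself.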

# Q2.
### What is the objective function of a linear SVM?

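The objective is to minimise $\frac{1}{2} \lVert w \rVert^2$ over $w$ and $b$, subject to the margin constraint $y_i (w^T x_i + b) \ge 1$ for every training point $(x_i, y_i)$.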
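The margin condition is often folded into one quantity to minimise. A common soft-margin form adds a hinge penalty for every point that breaks the condition: $\frac{1}{2} \lVert w \rVert^2 + C \sum_i \max(0, 1 - y_i (w^T x_i + b))$. The sketch below evaluates that expression on a tiny hand-made dataset. The function name `svm_primal_objective`, the data, and the choice `C=1.0` are assumptions made for illustration.

```python
import numpy as np

def svm_primal_objective(w, b, X, y, C=1.0):
    """Soft-margin primal objective: 0.5*||w||^2 + C * sum of hinge losses."""
    margins = y * (X @ w + b)               # y_i * (w^T x_i + b) for each point
    hinge = np.maximum(0.0, 1.0 - margins)  # zero when the constraint holds
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Hand-made data: the third point sits inside the margin (y * score = 0.5).
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, 1.0])
w = np.array([1.0, 0.0])
b = 0.0

value = svm_primal_objective(w, b, X, y, C=1.0)
print(value)
```

Points that satisfy $y_i (w^T x_i + b) \ge 1$ contribute nothing to the penalty, so for separable data and a suitable $w$, $b$ the expression reduces to $\frac{1}{2} \lVert w \rVert^2$.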

# Q3.
### What is the kernel trick in SVM?

- The kernel trick in SVM is a method to implicitly map the input data into a higher-dimensional space without explicitly computing the transformation. This is achieved by using kernel functions, which compute the dot product between data points in the higher-dimensional space without actually transforming the data. This allows SVMs to efficiently handle non-linear decision boundaries.
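A small numerical check can make this concrete. The sketch below uses the degree-2 homogeneous polynomial kernel $K(x, z) = (x^T z)^2$ on 2D inputs, whose explicit feature map is $\phi(x) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)$. The helper names `phi` and `poly2_kernel` are hypothetical.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2D input."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(x, z):
    """Same inner product, computed without leaving the 2D input space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = np.dot(phi(x), phi(z))  # dot product in the 3D feature space
implicit = poly2_kernel(x, z)      # kernel evaluated on the raw inputs

print(explicit, implicit)
```

Both routes give the same number, but the kernel never builds the higher-dimensional vectors. For kernels such as RBF, where the feature space is infinite-dimensional, the kernel route is the only practical one.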

# Q4.
### What is the role of support vectors in SVM? Explain with an example.

-  The support vectors in SVM are the data points that lie closest to the decision boundary. They are the critical points that determine the position and orientation of the decision boundary. These points have non-zero coefficients in the solution of the SVM optimization problem and play a crucial role in defining the margin. For example, in a binary classification problem with two classes, the support vectors are the data points from each class that are closest to the decision boundary.
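As a minimal sketch of the example, the snippet below fits a linear `SVC` to six hand-made 2D points and reads back which of them became support vectors. The data and the large `C` (used to approximate a hard margin) are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data: three points per class.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],   # class 0
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

support_indices = sorted(clf.support_.tolist())  # row indices into X
support_vectors = clf.support_vectors_           # the points themselves

print(support_indices)
print(support_vectors)
```

Only the points nearest the other class are kept as support vectors. Removing any of the remaining points and refitting would leave the decision boundary unchanged.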

# Q5.
### Illustrate, with examples and graphs, the hyperplane, marginal planes, soft margin and hard margin in SVM.

- Hyperplane: In a binary classification problem, the hyperplane is the line (in 2D) or plane (in 3D) that separates the two classes. It is represented by the equation $w^T x + b = 0$.

- Marginal plane: The marginal planes are the two planes parallel to the hyperplane that pass through the support vectors, $w^T x + b = +1$ and $w^T x + b = -1$. The gap between them, of width $2 / \lVert w \rVert$, is the margin of the SVM classifier.

- Soft margin: In a soft-margin SVM, some data points may be allowed to violate the margin if it leads to a better overall fit of the model. This allows for a more flexible decision boundary that can handle noisy or overlapping data.

- Hard margin: In a hard-margin SVM, no data points are allowed to violate the margin, so the training data must be linearly separable. The decision boundary is then determined entirely by the support vectors.
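The difference between the two regimes can be sketched numerically. In scikit-learn's `SVC`, the parameter `C` controls how heavily margin violations are penalised: a very large `C` behaves like a hard margin on separable data, while a small `C` gives a softer, wider margin. The toy data, the two `C` values, and the helper `margin_width` below are assumptions for illustration. The width is computed as $2 / \lVert w \rVert$.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data: three points per class.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],   # class 0
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

def margin_width(C):
    """Fit a linear SVC with the given C and return the margin width 2/||w||."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    return 2.0 / np.linalg.norm(clf.coef_[0])

hard_like_width = margin_width(1e6)  # approximates a hard margin
soft_width = margin_width(0.01)      # violations are cheap, margin widens

print(hard_like_width, soft_width)
```

With the small `C`, the marginal planes move outward past some training points, which is exactly the violation a soft margin tolerates. The same two fits could be drawn with a plotting library by contouring `clf.decision_function` at the levels -1, 0 and +1.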

# Q6.

In [1]:
from sklearn.datasets import load_iris
data = load_iris()

In [2]:
print(data.DESCR)

.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:

                    Min  Max   Mean    SD   Class Correlation
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :

In [7]:
import seaborn as sns
df = sns.load_dataset('iris')
df.head()

|   | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


In [8]:
X = df.iloc[:,:-1]
y = df.iloc[:,-1]

In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=10)

In [10]:
from sklearn.svm import SVC

In [11]:
svc = SVC(kernel = 'linear')

In [12]:
svc.fit(X_train,y_train)

In [13]:
y_pred = svc.predict(X_test)
y_pred

array(['versicolor', 'virginica', 'setosa', 'versicolor', 'setosa',
       'versicolor', 'versicolor', 'versicolor', 'setosa', 'versicolor',
       'versicolor', 'virginica', 'versicolor', 'setosa', 'setosa',
       'virginica', 'versicolor', 'setosa', 'setosa', 'setosa',
       'virginica', 'virginica', 'virginica', 'setosa', 'versicolor',
       'setosa', 'versicolor', 'versicolor', 'versicolor', 'virginica',
       'versicolor', 'versicolor', 'virginica', 'virginica', 'virginica',
       'setosa', 'virginica', 'virginica'], dtype=object)

In [14]:
from sklearn.metrics import accuracy_score

In [15]:
print(accuracy_score(y_test,y_pred))

1.0


In [16]:
from sklearn.model_selection import GridSearchCV

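# Note: gamma is ignored when kernel='linear', so the five gamma values
# below only repeat the same model for each value of C.
param = {'C' : [0.1,1,10,100,1000],
         'gamma' : [1,0.1,0.01,0.001,0.0001],
         'kernel':['linear']
        }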

In [19]:
grid = GridSearchCV(SVC(), param_grid=param, refit=True, cv=5, verbose=2)

In [20]:
grid.fit(X_train,y_train)

Fitting 5 folds for each of 25 candidates, totalling 125 fits
[CV] END ......................C=0.1, gamma=1, kernel=linear; total time=   0.0s
[CV] END ......................C=0.1, gamma=1, kernel=linear; total time=   0.0s
[CV] END ......................C=0.1, gamma=1, kernel=linear; total time=   0.0s
[CV] END ......................C=0.1, gamma=1, kernel=linear; total time=   0.0s
[CV] END ......................C=0.1, gamma=1, kernel=linear; total time=   0.0s
[CV] END ....................C=0.1, gamma=0.1, kernel=linear; total time=   0.0s
[CV] END ....................C=0.1, gamma=0.1, kernel=linear; total time=   0.0s
[CV] END ....................C=0.1, gamma=0.1, kernel=linear; total time=   0.0s
[CV] END ....................C=0.1, gamma=0.1, kernel=linear; total time=   0.0s
[CV] END ....................C=0.1, gamma=0.1, kernel=linear; total time=   0.0s
[CV] END ...................C=0.1, gamma=0.01, kernel=linear; total time=   0.0s
[CV] END ...................C=0.1, gamma=0.01, 

In [21]:
grid.best_params_

{'C': 1, 'gamma': 1, 'kernel': 'linear'}

In [22]:
y_pred = grid.predict(X_test)
print(accuracy_score(y_test,y_pred))

1.0
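The grid in this search pairs every `C` with five `gamma` values, but in scikit-learn `gamma` is the kernel coefficient for the `'rbf'`, `'poly'` and `'sigmoid'` kernels only. A minimal sketch of a check, assuming the same split settings as above but loading the data through `load_iris(return_X_y=True)` rather than seaborn:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=10
)

# Fit two linear SVMs that differ only in gamma and compare their weights.
coefs = []
for gamma in [1, 0.0001]:
    model = SVC(kernel="linear", C=1, gamma=gamma).fit(X_train, y_train)
    coefs.append(model.coef_)

same_model = np.allclose(coefs[0], coefs[1])
print(same_model)
```

If the two fits agree, the 25 candidates in the search reduce to five distinct models, one per value of `C`. Tuning `gamma` only becomes meaningful once a non-linear kernel such as `'rbf'` is added to the `kernel` list.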
