## Question - 1
ans  - 

1. Polynomial Functions:

* Definition: Polynomial functions are mathematical functions involving variables raised to whole number powers. In the context of machine learning, polynomial functions can be used to model non-linear relationships between input features.

* Example: 
f(x)= ax^2 + bx + c
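
As a minimal sketch of the quadratic example above, the snippet below evaluates f(x) = ax^2 + bx + c and then recovers the coefficients with `np.polyfit`. The coefficient values (a = 2, b = -3, c = 1) and the sample grid are assumptions chosen purely for illustration.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration
a, b, c = 2.0, -3.0, 1.0

# Evaluate the quadratic f(x) = a*x^2 + b*x + c on a grid of points
x_vals = np.linspace(-5, 5, 50)
y_vals = a * x_vals**2 + b * x_vals + c

# Fit a degree-2 polynomial; coefficients come back highest power first
coeffs = np.polyfit(x_vals, y_vals, deg=2)
print(coeffs)
```

Because the data here is noise-free, the fitted coefficients should match the ones used to generate it up to floating-point error.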


2. Kernel Functions:

* Definition: Kernel functions compute a similarity between pairs of data points that equals their inner product in a (typically higher-dimensional) feature space. They are used in algorithms like Support Vector Machines (SVMs) to implicitly map data into that higher-dimensional space without explicitly computing the transformation.

* Purpose: Kernels allow linear models to capture complex, non-linear relationships in the data by transforming the input space.

* Example (Polynomial Kernel): 
K(x,y)=(x^T . y + c)^d
 
3. Polynomial Kernel in SVMs:

* Usage: SVMs with a polynomial kernel use polynomial functions to implicitly map the input data into a higher-dimensional space.

* Equation: The polynomial kernel is defined as K(x,y)=(x^T . y + c)^d, where x and y are data points, c is a constant, and d is the degree of the polynomial.

* Effect: The polynomial kernel introduces non-linearity into the decision boundary, allowing SVMs to capture complex patterns.

4. Relationship:

* Polynomial functions are the basis of the polynomial kernel functions used in SVMs.

* A polynomial kernel computes the similarity (dot product) between data points as if they had been mapped into a higher-dimensional space by a polynomial transformation.

* This lets SVMs handle non-linear relationships in the data by working implicitly in that higher-dimensional space, without ever performing the transformation explicitly.

5. Degree of Polynomial:

* The degree is a critical parameter in both polynomial functions and polynomial kernels. It determines the complexity of the non-linear relationships that can be captured: higher degrees capture more complex patterns but may also lead to overfitting.
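
The relationship described above can be checked numerically. The sketch below assumes 2-dimensional inputs, degree d = 2 and constant c = 1, and uses a hand-written feature map `phi` (a hypothetical helper, not part of any library) to show that the kernel value equals a plain dot product in the 6-dimensional polynomial feature space. The two sample points are arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def phi(v, c=1.0):
    """Explicit degree-2 polynomial feature map for a 2-D point (illustrative helper)."""
    x1, x2 = v
    return np.array([
        x1**2,
        x2**2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])

# Arbitrary sample points, chosen only for illustration
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
c, d = 1.0, 2

# Kernel computed directly in the original 2-D space
kernel_value = (x @ y + c) ** d

# Same quantity as a dot product in the explicit 6-D feature space
explicit_value = phi(x, c) @ phi(y, c)

# scikit-learn computes (gamma * <x, y> + coef0) ** degree, so gamma=1 matches the formula
sk_value = polynomial_kernel(x.reshape(1, -1), y.reshape(1, -1),
                             degree=d, gamma=1.0, coef0=c)[0, 0]

print(kernel_value, explicit_value, sk_value)
```

All three values agree, while only the last two lines of the feature map ever touch the higher-dimensional space; the kernel itself never leaves two dimensions.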

## Question - 2
ans  -

In [11]:
from sklearn.datasets import make_classification

x,y = make_classification(n_samples = 1000 , n_features=2 , n_classes=2, n_redundant=0 , n_informative = 2, random_state=42)

In [12]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


In [13]:
from sklearn.model_selection import train_test_split

x_train ,x_test , y_train , y_test = train_test_split(x,y,test_size=0.25,random_state=42)

In [14]:
from sklearn.svm import SVC

# Polynomial-kernel SVM; scikit-learn defaults are degree=3, gamma='scale', coef0=0.0
svc = SVC(kernel = 'poly')

svc.fit(x_train , y_train)


In [15]:
y_pred = svc.predict(x_test)

In [16]:
from sklearn.metrics import accuracy_score

# accuracy_score expects (y_true, y_pred)
print(accuracy_score(y_test , y_pred))

0.88


## Question - 3
ans - 

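Increasing the value of epsilon in SVR widens the ε-insensitive tube around the regression function. Training points whose residuals fall inside the tube incur no loss and do not become support vectors, so a larger epsilon tolerates larger errors and tends to reduce the number of support vectors. The choice of epsilon should be based on the characteristics of the data (for example, its noise level) and the desired balance between model complexity and generalization.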
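
The effect can be observed with a small experiment. This is a minimal sketch on synthetic data (a noisy sine curve, which is an assumption made here for illustration and not part of the original question); the epsilon values and `C=1.0` are likewise arbitrary choices.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic data (assumed for illustration): a sine curve with Gaussian noise
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(200, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(200)

# Fit the same SVR with increasingly wide epsilon tubes and count support vectors
support_counts = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    support_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {support_counts[eps]} support vectors out of {len(X)}")
```

With a tube much narrower than the noise level, almost every point lies outside it and becomes a support vector; with a wide tube, most points fall inside and are ignored by the fit.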

## Question - 4
ans - 

1. Kernel Function:

* Purpose: The kernel function determines the type of mapping applied to the input features to transform them into a higher-dimensional space. Common choices include linear, polynomial, and radial basis function (RBF or Gaussian) kernels.

* Impact:
  * A linear kernel is suitable for linear relationships.
  * Polynomial kernels introduce non-linearity and are effective for capturing complex patterns.
  * RBF kernels are versatile and can capture intricate relationships.

2. C Parameter:

* Purpose: The C parameter controls the trade-off between a smooth decision boundary and accurately fitting the training data.

* Impact:
  * Smaller C values lead to a softer margin, allowing more errors but promoting a smoother decision boundary.
  * Larger C values result in a harder margin, minimizing errors but potentially overfitting to noise in the training data.

* Example:
  * Increase C when the training data is expected to have minimal noise.
  * Decrease C when the model should be more tolerant of errors.

3. Epsilon Parameter (ε):

* Purpose: The epsilon parameter defines the width of the ε-insensitive tube around the predicted values. It determines the range within which errors are not penalized.

* Impact:
  * Smaller ε values lead to a narrower tube, making the model more sensitive to small errors and usually increasing the number of support vectors.
  * Larger ε values result in a wider tube, allowing for larger errors without penalty and usually giving a simpler model with fewer support vectors.

* Example:
  * Increase ε when the targets are noisy and small deviations do not need to be fit.
  * Decrease ε when a more precise fit to the training data is desired.

4. Gamma Parameter (γ):

* Purpose: The gamma parameter defines how far the influence of a single training example reaches, affecting the shape of the decision boundary. It applies to the RBF, polynomial, and sigmoid kernels.

* Impact:
  * Smaller γ values give each training example a far-reaching influence, so distant points affect the prediction and the decision boundary is smoother.
  * Larger γ values confine each example's influence to nearby points, producing a more intricate decision boundary that can overfit.

* Example:
  * Increase γ when the classes are separated by fine, local structure that a smooth boundary misses.
  * Decrease γ when the model should consider a wider range of data points.


5. Overall Considerations:


* The choice of hyperparameters often involves experimentation, grid search, and cross-validation.
* Performance may vary depending on the characteristics of the data, such as its size, noise level, and complexity.
* Regularization parameters (C, ε) balance the fit to the training data with the goal of generalization to new data.
* Kernel parameters (kernel, γ) influence the model's ability to capture non-linear relationships.
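
To make the effect of γ concrete, here is a minimal sketch that trains an RBF-kernel `SVC` at three γ values and records training accuracy, test accuracy, and the number of support vectors. The dataset (`make_moons` with noise 0.3), the γ grid, and `C=1.0` are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data (assumed for illustration)
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train one model per gamma value and compare train vs. test accuracy
results = {}
for gamma in [0.01, 1, 100]:
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
    results[gamma] = {
        "train": clf.score(X_train, y_train),
        "test": clf.score(X_test, y_test),
        "n_sv": int(clf.n_support_.sum()),
    }
    print(gamma, results[gamma])
```

A large gap between training and test accuracy at the largest γ would be a sign of overfitting, while low accuracy on both at the smallest γ would suggest underfitting. The same loop can be repeated over C to study the margin trade-off.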

## Question - 5
ans

In [17]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

from warnings import filterwarnings 
filterwarnings('ignore')

In [18]:
from sklearn.datasets import load_iris

iris = load_iris()

In [20]:
iris.feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

In [21]:
x = pd.DataFrame(iris.data , columns = iris.feature_names)

In [22]:
x.head()

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2


In [23]:
y = iris.target

In [24]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [25]:
from sklearn.model_selection import train_test_split

x_train , x_test , y_train , y_test = train_test_split(x,y , test_size=0.33 , random_state=42)

In [26]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

In [27]:
from sklearn.svm import SVC

svc_classifier = SVC()

svc_classifier.fit(x_train_scaled , y_train)


In [30]:
y_pred = svc_classifier.predict(x_test_scaled)

y_pred

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
       0, 1, 1, 2, 1, 2])

In [29]:
from sklearn.metrics import classification_report , accuracy_score

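# accuracy_score expects (y_true, y_pred); accuracy is symmetric, so the value is unchanged
print(accuracy_score(y_test , y_pred))
# Note: classification_report also expects (y_true, y_pred). The report shown below was
# produced with the arguments in (y_pred, y_test) order, so its precision and recall
# columns are swapped and its support column counts predicted labels.
print(classification_report(y_pred , y_test))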

0.98
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      0.94      0.97        16
           2       0.94      1.00      0.97        15

    accuracy                           0.98        50
   macro avg       0.98      0.98      0.98        50
weighted avg       0.98      0.98      0.98        50



In [31]:
parameters = {'C':[1 , 2 , 10 , 20 , 12],
             'gamma':[0.1 , 0.001 , 0.0001 , 10 , 0.3],
             'kernel':['linear' , 'poly' , 'rbf' , 'sigmoid'],
             }

from sklearn.model_selection import GridSearchCV

grid_clf = GridSearchCV(SVC() , param_grid= parameters , cv = 5 , refit=True , verbose=3)

grid_clf.fit(x_train_scaled , y_train)

Fitting 5 folds for each of 100 candidates, totalling 500 fits
[CV 1/5] END .....C=1, gamma=0.1, kernel=linear;, score=1.000 total time=   0.0s
[CV 2/5] END .....C=1, gamma=0.1, kernel=linear;, score=0.800 total time=   0.0s
[CV 3/5] END .....C=1, gamma=0.1, kernel=linear;, score=0.900 total time=   0.0s
[CV 4/5] END .....C=1, gamma=0.1, kernel=linear;, score=1.000 total time=   0.0s
[CV 5/5] END .....C=1, gamma=0.1, kernel=linear;, score=0.950 total time=   0.0s
[CV 1/5] END .......C=1, gamma=0.1, kernel=poly;, score=0.800 total time=   0.0s
[CV 2/5] END .......C=1, gamma=0.1, kernel=poly;, score=0.750 total time=   0.0s
[CV 3/5] END .......C=1, gamma=0.1, kernel=poly;, score=0.750 total time=   0.0s
[CV 4/5] END .......C=1, gamma=0.1, kernel=poly;, score=0.850 total time=   0.0s
[CV 5/5] END .......C=1, gamma=0.1, kernel=poly;, score=0.750 total time=   0.0s
[CV 1/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.950 total time=   0.0s
[CV 2/5] END ........C=1, gamma=0.1, kernel=rb

In [34]:
y_pred1 = grid_clf.predict(x_test_scaled)
y_pred1

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
       0, 1, 1, 2, 1, 2])

In [35]:
print(accuracy_score(y_test , y_pred1))

0.98


In [36]:
grid_clf.best_params_

{'C': 1, 'gamma': 0.1, 'kernel': 'sigmoid'}

In [43]:
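# Same settings as grid_clf.best_params_; because refit=True, grid_clf.best_estimator_
# was already refit with them on the full training set, so predictions should match
tuned_clf = SVC(C = 1 , gamma = 0.1 , kernel = 'sigmoid')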

tuned_clf.fit(x_train_scaled,y_train)


In [44]:
y_pred2 = tuned_clf.predict(x_test_scaled)
y_pred2

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
       0, 1, 1, 2, 1, 2])

In [45]:
print(accuracy_score(y_test , y_pred2))

0.98
