## Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

Polynomial functions and kernel functions are both mathematical tools used in machine learning, particularly in the context of support vector machines (SVMs) and kernel methods. While they serve different purposes, there is a connection between them.

1. **Polynomial Functions**:
   - A polynomial function is a mathematical function of the form f(x) = a_n * x^n + a_(n-1) * x^(n-1) + ... + a_1 * x + a_0, where x is the input variable, and the a_i coefficients are constants.
   - Polynomial functions are often used to model relationships between variables in various machine learning tasks. For example, polynomial regression uses polynomial functions to fit curves to data points.

2. **Kernel Functions**:
   - Kernel functions, in the context of SVMs and kernel methods, compute the similarity (inner product) of two data points as if they had been mapped from the original feature space into a higher-dimensional space, without constructing that mapping explicitly. Working in this implicit space makes it easier to separate data points into different classes in cases where a linear boundary in the original space doesn't work well.
   - Common kernel functions include the linear kernel, polynomial kernel, radial basis function (RBF) kernel, and others.
   - The choice of kernel function affects the SVM's ability to find a decision boundary that best separates data.

**Relationship between Polynomial Functions and Kernel Functions**:
- The polynomial kernel is a specific kernel function whose value is a polynomial in the inner product of its two inputs.
- The polynomial kernel K(x, y) between two data points x and y is defined as K(x, y) = (a * x^T * y + c)^d, where 'a' is a scaling factor, 'c' is a constant term, and 'd' is the degree of the polynomial. In scikit-learn these correspond to the `gamma`, `coef0`, and `degree` parameters.
- The value K(x, y) equals the inner product of the two points after mapping them into a higher-dimensional space of polynomial feature combinations, but it is computed without ever constructing that mapping (the "kernel trick"). This is what lets SVMs create nonlinear decision boundaries in the original feature space.

So, the relationship between polynomial functions and kernel functions in machine learning is that polynomial functions are used within specific kernel functions (polynomial kernels) to enable SVMs to model nonlinear relationships in data. Kernel functions, in general, provide a way to implicitly map data into higher-dimensional spaces, and polynomial kernels are a specific instance of this transformation technique.
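
The claim that a polynomial kernel is an inner product in a higher-dimensional space can be checked numerically. The sketch below assumes 2-D inputs, degree d = 2, and scaling factor a = 1; the helper names `poly_kernel` and `phi` are hypothetical and chosen only for illustration.

```python
import math

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel (x . y + c)^d with scaling factor a = 1."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + c) ** d

def phi(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = x
    return [
        x1 * x1,
        x2 * x2,
        math.sqrt(2) * x1 * x2,
        math.sqrt(2 * c) * x1,
        math.sqrt(2 * c) * x2,
        c,
    ]

x = (1.0, 2.0)
y = (3.0, -1.0)

# Kernel evaluated directly in the original 2-D space
kernel_value = poly_kernel(x, y)

# Same quantity computed as an inner product in the 6-D feature space
explicit_value = sum(a * b for a, b in zip(phi(x), phi(y)))

print(kernel_value, explicit_value)  # the two values agree up to rounding
```

The kernel needs one 2-D dot product, while the explicit route builds six features per point. The gap widens quickly as the input dimension and the degree grow, which is why the kernel form is preferred.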

## Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

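The `kernel` parameter of `SVC` selects the kernel. The polynomial kernel is named `'poly'`; `degree` sets the polynomial degree and `coef0` sets the constant term.

In [1]:
from sklearn.svm import SVC

# 'poly' is the scikit-learn name for the polynomial kernel;
# degree=3 and coef0=0.0 are the library defaults, written out for clarity
model = SVC(kernel='poly', degree=3, coef0=0.0)
model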

## Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In Support Vector Regression (SVR), epsilon (ε) is a hyperparameter that controls the width of the epsilon-insensitive tube around the regression function (a line, or a hyperplane in higher dimensions). Errors inside the tube are not penalized, so data points that fall inside it do not contribute to the loss function. Only points outside the tube incur a loss, and the support vectors are the points that lie on the tube boundary or outside it.

The relationship between the value of epsilon and the number of support vectors in SVR can be summarized as follows:

1. **Small Epsilon (Narrow Tube)**:
   - A small value of epsilon (ε) creates a narrow epsilon-insensitive tube around the regression function. Few training points fit inside it, so most points lie on or outside the tube boundary.
   - Because every point on or outside the boundary is a support vector, a small ε yields many support vectors.

2. **Large Epsilon (Wide Tube)**:
   - A large value of epsilon (ε) creates a wide epsilon-insensitive tube around the regression function. This allows data points to be farther from the regression function while still not contributing to the loss function.
   - More data points fall strictly inside the tube, and those points are not support vectors. A large ε therefore yields fewer support vectors.

The choice of epsilon in SVR directly affects the number of support vectors. Increasing ε generally decreases the number of support vectors and gives a sparser model, because more points are tolerated inside the tube. Decreasing ε increases the number of support vectors, because the tolerance for error is narrower.

It's essential to strike a balance when choosing the epsilon value. A very small epsilon makes the model try to fit almost every point, noise included (many support vectors, risk of overfitting), while a very large epsilon ignores most deviations and can produce an overly flat model that underfits (few support vectors) and has less predictive power. The optimal epsilon value depends on the specific dataset and problem you are working on and often requires experimentation or cross-validation to find the best value.
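
The effect of ε on the tube can be illustrated with a small sketch. It assumes a fixed regression function and made-up residuals (both hypothetical); in a real SVR the fitted function itself shifts when ε changes, so this only shows the direction of the effect.

```python
def eps_insensitive_loss(residual, eps):
    """Loss is zero inside the tube and grows linearly outside it."""
    return max(0.0, abs(residual) - eps)

def count_on_or_outside(residuals, eps):
    """Points with |residual| >= eps are the candidates for support vectors."""
    return sum(1 for r in residuals if abs(r) >= eps)

# Made-up residuals (y_true - y_pred) for a fixed regression function
residuals = [0.05, -0.12, 0.30, -0.45, 0.08, 0.60, -0.02, 0.25]

# Widening the tube leaves fewer points on or outside its boundary
counts = {eps: count_on_or_outside(residuals, eps) for eps in (0.1, 0.2, 0.5)}
print(counts)  # {0.1: 5, 0.2: 4, 0.5: 1}
```

With a fitted scikit-learn `SVR`, the actual number of support vectors can be read from the length of its `support_` attribute.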

## Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?

The performance of an SVR model depends on several hyperparameters, including the choice of kernel function, C parameter, epsilon parameter (ε), and gamma parameter (γ). 

1. **Choice of Kernel Function**:
   - **Linear Kernel**: Suitable for linear relationships between input features. It performs well when the relationship between input and output is approximately linear.
   - **Polynomial Kernel**: Useful for capturing moderately non-linear relationships. You can control the degree of the polynomial with the `degree` parameter.
   - **Radial Basis Function (RBF) Kernel**: A versatile choice for capturing complex, non-linear relationships. It is the default kernel in scikit-learn's `SVR` and often performs well in practice.

   **When to Choose**:
   - Use the linear kernel when the relationship is linear or nearly linear.
   - Use the polynomial or RBF kernel when the relationship is non-linear. The choice between them may require experimentation.

2. **C Parameter**:
   - The C parameter controls the trade-off between keeping the regression function flat (simple) and penalizing training points that fall outside the epsilon-insensitive tube. The width of the tube itself is set by ε, not by C.
   - Smaller C values mean stronger regularization: the model tolerates more points outside the tube and tends to be smoother. Larger C values penalize those deviations heavily, which reduces training error but may lead to overfitting.

   **When to Increase/Decrease**:
   - Increase C when you suspect the model is underfitting and you want to reduce training errors.
   - Decrease C when you observe overfitting or want a smoother, more strongly regularized fit.

3. **Epsilon Parameter (ε)**:
   - Epsilon determines the width of the epsilon-insensitive tube around the regression line. Data points within this tube do not contribute to the loss function.
   - Smaller ε values create a narrow tube, making the model more sensitive to deviations from the target.
   - Larger ε values create a wider tube, allowing for larger errors without penalty.

   **When to Increase/Decrease**:
   - Increase ε if you want to allow for more tolerance to errors in your predictions.
   - Decrease ε if you need the model to be more sensitive to small errors and have a tighter fit around the target values.

4. **Gamma Parameter (γ)**:
   - Gamma determines the shape of the RBF kernel (it is also used by the polynomial and sigmoid kernels). A smaller gamma makes the kernel more spread out, so each training point influences a larger region, while a larger gamma makes it more peaked.
   - Higher gamma values result in a more complex, wiggly regression function and can lead to overfitting.

   **When to Increase/Decrease**:
   - Increase γ when you want the model to capture more complex, localized structure in the data.
   - Decrease γ when you want a smoother, more global fit.
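
To make the role of gamma concrete, the sketch below evaluates the RBF kernel exp(-γ‖x − y‖²) for one pair of points at several gamma values. The helper name `rbf_kernel` and the sample points are illustrative assumptions, not part of the original notebook.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

x = (0.0, 0.0)
y = (1.0, 1.0)  # squared distance from x is 2

gammas = (0.01, 0.1, 1.0, 10.0)
values = [rbf_kernel(x, y, g) for g in gammas]

# Similarity between the same two points shrinks as gamma grows
for g, v in zip(gammas, values):
    print(f"gamma={g}: K(x, y)={v:.6f}")
```

With a small gamma the two points still look similar, so each training point influences a wide neighbourhood. With a large gamma the similarity is close to zero, so the model reacts only to very nearby points, which is what makes overfitting more likely.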



## Q5. Assignment:

## Import the necessary libraries and load the dataset

## Split the dataset into training and testing set

## Preprocess the data using any technique of your choice (e.g. scaling, normalization)

## Create an instance of the SVC classifier and train it on the training data

## Use the trained classifier to predict the labels of the testing data

## Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)

## Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance

## Train the tuned classifier on the entire dataset

## Save the trained classifier to a file for future use.

In [21]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [22]:
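from sklearn.datasets import load_breast_cancer

# Load the Wisconsin breast cancer dataset (569 samples, 30 features)
cancer = load_breast_cancer()
cancer.keys()

# Put the features in a DataFrame and append the binary target column
df = pd.DataFrame(cancer.data, columns=cancer.feature_names)
df['target'] = cancer.target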

df.head()

Unnamed: 0,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension,target
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,0
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,0
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,0
3,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,0
4,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,0


In [27]:
from sklearn.model_selection import train_test_split #train and test split

x=df.drop('target',axis=1)
y=df.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)
x_train.shape,x_test.shape

((398, 30), (171, 30))
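
The shapes above follow from simple arithmetic on the 569 rows of the dataset. The sketch below assumes the test count is rounded up and the training set receives the remainder, which is consistent with the output shown.

```python
import math

n_samples = 569   # rows in the breast cancer dataset
test_size = 0.3   # fraction passed to train_test_split

# 0.3 * 569 = 170.7, rounded up to a whole number of test rows
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 398 171
```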

In [28]:
from sklearn.preprocessing import StandardScaler #scaling

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
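
The scaling step fits the mean and standard deviation on the training data only and reuses them for the test data. A minimal sketch of that idea on a single made-up feature (the values are hypothetical), using the population standard deviation as `StandardScaler` does:

```python
from statistics import mean, pstdev

# One made-up feature, already split into train and test values
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# Statistics are estimated from the training data only
mu = mean(train)
sigma = pstdev(train)  # population standard deviation (ddof = 0)

# The same mu and sigma are applied to both splits
train_scaled = [(v - mu) / sigma for v in train]
test_scaled = [(v - mu) / sigma for v in test]

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics on the test set keeps information about the test data from leaking into preprocessing, which is why the notebook calls `fit_transform` on `x_train` but only `transform` on `x_test`.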

In [29]:
from sklearn.svm import SVC #model training

svc=SVC()
svc.fit(x_train,y_train)

In [30]:
y_pred=svc.predict(x_test) #model prediction
y_pred

array([1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1,
       0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
       0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0,
       1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1,
       0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0,
       1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1,
       1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1])

In [38]:
from sklearn.metrics import accuracy_score, confusion_matrix,precision_score, recall_score, f1_score  #performance metrics

print("accuracy_score: ",accuracy_score(y_test,y_pred))
print("precision: ",precision_score(y_test,y_pred))
print("recall: ",recall_score(y_test,y_pred))
print("f1_score: ",f1_score(y_test,y_pred))
print("confusion_matrix: \n",confusion_matrix(y_test,y_pred))

accuracy_score:  0.9766081871345029
precision:  0.9814814814814815
recall:  0.9814814814814815
f1_score:  0.9814814814814815
confusion_matrix: 
 [[ 61   2]
 [  2 106]]
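
The reported scores can be reproduced by hand from the confusion matrix. The sketch below assumes scikit-learn's layout for binary labels, where rows are true classes and columns are predicted classes, so the matrix reads [[TN, FP], [FN, TP]].

```python
# Entries copied from the confusion matrix printed above
tn, fp = 61, 2
fn, tp = 2, 106

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 167 / 171
precision = tp / (tp + fp)                   # 106 / 108
recall = tp / (tp + fn)                      # 106 / 108
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

Precision and recall coincide here only because the number of false positives happens to equal the number of false negatives.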


In [50]:
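from sklearn.model_selection import GridSearchCV  #hyperparameter tuning

# Caution: range(5) starts at 0, but SVC requires C > 0, so every candidate
# with C=0 fails to fit and is scored as nan.
# 'degree' only affects the 'poly' kernel, and 'gamma' is ignored by 'linear'.
parameter={
    'C':list(range(5)),
    'kernel':['linear', 'poly', 'rbf'],
    'degree':list(range(5)),
    'gamma': ('scale', 'auto')
          }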

cv=GridSearchCV(SVC(),param_grid=parameter, cv=5, scoring='accuracy', verbose=3)
cv.fit(x_train,y_train)

Fitting 5 folds for each of 150 candidates, totalling 750 fits
[CV 1/5] END C=0, degree=0, gamma=scale, kernel=linear;, score=nan total time=   0.0s
[CV 2/5] END C=0, degree=0, gamma=scale, kernel=linear;, score=nan total time=   0.0s
[CV 3/5] END C=0, degree=0, gamma=scale, kernel=linear;, score=nan total time=   0.0s
[CV 4/5] END C=0, degree=0, gamma=scale, kernel=linear;, score=nan total time=   0.0s
[CV 5/5] END C=0, degree=0, gamma=scale, kernel=linear;, score=nan total time=   0.0s
[CV 1/5] END C=0, degree=0, gamma=scale, kernel=poly;, score=nan total time=   0.0s
[CV 2/5] END C=0, degree=0, gamma=scale, kernel=poly;, score=nan total time=   0.0s
[CV 3/5] END C=0, degree=0, gamma=scale, kernel=poly;, score=nan total time=   0.0s
[CV 4/5] END C=0, degree=0, gamma=scale, kernel=poly;, score=nan total time=   0.0s
[CV 5/5] END C=0, degree=0, gamma=scale, kernel=poly;, score=nan total time=   0.0s
[CV 1/5] END C=0, degree=0, gamma=scale, kernel=rbf;, score=nan total time=   0.0s
[CV 

[CV 2/5] END C=1, degree=0, gamma=scale, kernel=poly;, score=0.625 total time=   0.0s
[CV 3/5] END C=1, degree=0, gamma=scale, kernel=poly;, score=0.625 total time=   0.0s
[CV 4/5] END C=1, degree=0, gamma=scale, kernel=poly;, score=0.633 total time=   0.0s
[CV 5/5] END C=1, degree=0, gamma=scale, kernel=poly;, score=0.620 total time=   0.0s
[CV 1/5] END C=1, degree=0, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 2/5] END C=1, degree=0, gamma=scale, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END C=1, degree=0, gamma=scale, kernel=rbf;, score=0.988 total time=   0.0s
[CV 4/5] END C=1, degree=0, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=1, degree=0, gamma=scale, kernel=rbf;, score=0.937 total time=   0.0s
[CV 1/5] END C=1, degree=0, gamma=auto, kernel=linear;, score=0.975 total time=   0.0s
[CV 2/5] END C=1, degree=0, gamma=auto, kernel=linear;, score=0.975 total time=   0.0s
[CV 3/5] END C=1, degree=0, gamma=auto, kernel=linear;, s

[CV 4/5] END C=1, degree=3, gamma=auto, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END C=1, degree=3, gamma=auto, kernel=linear;, score=0.949 total time=   0.0s
[CV 1/5] END C=1, degree=3, gamma=auto, kernel=poly;, score=0.875 total time=   0.0s
[CV 2/5] END C=1, degree=3, gamma=auto, kernel=poly;, score=0.900 total time=   0.0s
[CV 3/5] END C=1, degree=3, gamma=auto, kernel=poly;, score=0.912 total time=   0.0s
[CV 4/5] END C=1, degree=3, gamma=auto, kernel=poly;, score=0.937 total time=   0.0s
[CV 5/5] END C=1, degree=3, gamma=auto, kernel=poly;, score=0.861 total time=   0.0s
[CV 1/5] END C=1, degree=3, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 2/5] END C=1, degree=3, gamma=auto, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END C=1, degree=3, gamma=auto, kernel=rbf;, score=0.988 total time=   0.0s
[CV 4/5] END C=1, degree=3, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=1, degree=3, gamma=auto, kernel=rbf;, score=0.937 

[CV 4/5] END C=2, degree=1, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=2, degree=1, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 1/5] END C=2, degree=2, gamma=scale, kernel=linear;, score=0.975 total time=   0.0s
[CV 2/5] END C=2, degree=2, gamma=scale, kernel=linear;, score=0.963 total time=   0.0s
[CV 3/5] END C=2, degree=2, gamma=scale, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END C=2, degree=2, gamma=scale, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END C=2, degree=2, gamma=scale, kernel=linear;, score=0.937 total time=   0.0s
[CV 1/5] END C=2, degree=2, gamma=scale, kernel=poly;, score=0.825 total time=   0.0s
[CV 2/5] END C=2, degree=2, gamma=scale, kernel=poly;, score=0.850 total time=   0.0s
[CV 3/5] END C=2, degree=2, gamma=scale, kernel=poly;, score=0.787 total time=   0.0s
[CV 4/5] END C=2, degree=2, gamma=scale, kernel=poly;, score=0.810 total time=   0.0s
[CV 5/5] END C=2, degree=2, gamma=scale, kernel=

[CV 5/5] END C=3, degree=0, gamma=scale, kernel=poly;, score=0.620 total time=   0.0s
[CV 1/5] END C=3, degree=0, gamma=scale, kernel=rbf;, score=0.988 total time=   0.0s
[CV 2/5] END C=3, degree=0, gamma=scale, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END C=3, degree=0, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 4/5] END C=3, degree=0, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=3, degree=0, gamma=scale, kernel=rbf;, score=0.962 total time=   0.0s
[CV 1/5] END C=3, degree=0, gamma=auto, kernel=linear;, score=0.988 total time=   0.0s
[CV 2/5] END C=3, degree=0, gamma=auto, kernel=linear;, score=0.950 total time=   0.0s
[CV 3/5] END C=3, degree=0, gamma=auto, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END C=3, degree=0, gamma=auto, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END C=3, degree=0, gamma=auto, kernel=linear;, score=0.949 total time=   0.0s
[CV 1/5] END C=3, degree=0, gamma=auto, kernel=poly;, 

[CV 5/5] END C=3, degree=3, gamma=auto, kernel=poly;, score=0.873 total time=   0.0s
[CV 1/5] END C=3, degree=3, gamma=auto, kernel=rbf;, score=0.988 total time=   0.0s
[CV 2/5] END C=3, degree=3, gamma=auto, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END C=3, degree=3, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 4/5] END C=3, degree=3, gamma=auto, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=3, degree=3, gamma=auto, kernel=rbf;, score=0.962 total time=   0.0s
[CV 1/5] END C=3, degree=4, gamma=scale, kernel=linear;, score=0.988 total time=   0.0s
[CV 2/5] END C=3, degree=4, gamma=scale, kernel=linear;, score=0.950 total time=   0.0s
[CV 3/5] END C=3, degree=4, gamma=scale, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END C=3, degree=4, gamma=scale, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END C=3, degree=4, gamma=scale, kernel=linear;, score=0.949 total time=   0.0s
[CV 1/5] END C=3, degree=4, gamma=scale, kernel=poly;, 

[CV 3/5] END C=4, degree=2, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 4/5] END C=4, degree=2, gamma=scale, kernel=rbf;, score=0.975 total time=   0.0s
[CV 5/5] END C=4, degree=2, gamma=scale, kernel=rbf;, score=0.962 total time=   0.0s
[CV 1/5] END C=4, degree=2, gamma=auto, kernel=linear;, score=0.963 total time=   0.0s
[CV 2/5] END C=4, degree=2, gamma=auto, kernel=linear;, score=0.950 total time=   0.0s
[CV 3/5] END C=4, degree=2, gamma=auto, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END C=4, degree=2, gamma=auto, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END C=4, degree=2, gamma=auto, kernel=linear;, score=0.949 total time=   0.0s
[CV 1/5] END C=4, degree=2, gamma=auto, kernel=poly;, score=0.850 total time=   0.0s
[CV 2/5] END C=4, degree=2, gamma=auto, kernel=poly;, score=0.850 total time=   0.0s
[CV 3/5] END C=4, degree=2, gamma=auto, kernel=poly;, score=0.825 total time=   0.0s
[CV 4/5] END C=4, degree=2, gamma=auto, kernel=poly;, s

150 fits failed out of a total of 750.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
150 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\tanji\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\model_selection\_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\tanji\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\svm\_base.py", line 180, in fit
    self._validate_params()
  File "C:\Users\tanji\AppData\Local\Programs\Python\Python310\lib\site-packages\sklearn\base.py", line 600, in _validate_params
    validate_parameter_constraints(
  File "C:\Users\tanji\AppData\Local\Programs\Python\Python310\lib\si
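
The failure count in the log above matches the number of grid points with C=0, which is not a valid value for `SVC`. The sketch below enumerates the same grid with `itertools.product` to confirm the arithmetic, then outlines one possible alternative grid. The values in `suggested_param_grid` are illustrative assumptions, not tuned recommendations.

```python
from itertools import product

# The grid used in the search above
grid = {
    'C': list(range(5)),
    'kernel': ['linear', 'poly', 'rbf'],
    'degree': list(range(5)),
    'gamma': ['scale', 'auto'],
}
n_folds = 5

candidates = [dict(zip(grid, combo)) for combo in product(*grid.values())]
invalid = [c for c in candidates if c['C'] <= 0]  # C must be strictly positive

print(len(candidates), len(candidates) * n_folds)  # 150 candidates, 750 fits
print(len(invalid) * n_folds)                      # 150 failed fits

# One possible alternative (illustrative values): positive C only, and
# 'degree' / 'gamma' searched only for the kernels that actually use them
suggested_param_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['rbf'], 'C': [0.1, 1, 10], 'gamma': ['scale', 'auto']},
    {'kernel': ['poly'], 'C': [0.1, 1, 10], 'degree': [2, 3, 4],
     'gamma': ['scale', 'auto']},
]
n_suggested = sum(len(list(product(*g.values()))) for g in suggested_param_grid)
print(n_suggested)  # 27 candidates instead of 150
```

`GridSearchCV` accepts a list of dictionaries as `param_grid`, so a structure like `suggested_param_grid` could be passed in directly. Splitting the grid by kernel avoids refitting identical models that differ only in a parameter the kernel ignores.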

In [51]:
cv.best_params_

{'C': 2, 'degree': 0, 'gamma': 'scale', 'kernel': 'rbf'}

In [52]:
cv.best_score_

0.9748734177215189

In [53]:
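# Refit the scaler on the full dataset, since the final model uses every sample
final_xtrain = scaler.fit_transform(df.drop('target', axis=1))
final_ytrain = df.target

# Best parameters from the grid search; 'degree' is ignored by the rbf kernel
final_svc = SVC(C=2, degree=0, gamma='scale', kernel='rbf')
final_svc.fit(final_xtrain, final_ytrain)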

In [55]:
import pickle  #saving model

# Persist the classifier trained on the breast cancer data
with open('breast_cancer_svc.pkl', 'wb') as f:
    pickle.dump(final_svc, f)
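
Because the classifier was trained on scaled features, new data must pass through the same scaler before prediction. One option is to pickle the scaler and the model together. The sketch below shows the save and load round trip with plain dictionaries as stand-ins for the fitted objects; the file name `svc_bundle.pkl` and the bundle layout are hypothetical, and real code would store `scaler` and `final_svc` instead.

```python
import os
import pickle
import tempfile

# Stand-ins for the fitted objects (real code would use `scaler` and `final_svc`)
bundle = {
    'scaler': {'mean': 5.0, 'scale': 2.0},
    'model': {'kernel': 'rbf', 'C': 2},
}

# Write to a temporary directory so the example leaves no files behind
path = os.path.join(tempfile.mkdtemp(), 'svc_bundle.pkl')

with open(path, 'wb') as f:
    pickle.dump(bundle, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == bundle)  # True: the round trip preserves the contents
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.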