## Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?


Polynomial functions and kernel functions are related rather than competing tools: the polynomial kernel is a kernel function built from a polynomial of the dot product of two data points.

A kernel function K(x, z) takes two data points as input and returns a similarity score equal to the dot product of their images under some feature map φ, that is, K(x, z) = φ(x) · φ(z). The map φ usually sends the data into a higher-dimensional space, but the kernel returns the result without ever constructing φ(x) explicitly. This shortcut is known as the kernel trick.

The polynomial kernel has the form K(x, z) = (γ x · z + r)^d, where d is the degree, γ is a scaling factor, and r is a constant offset (degree, gamma, and coef0 in scikit-learn). Its implicit feature space consists of the monomials of the input features up to degree d. For example, with d = 2 and r > 0 the feature space contains the original features, their squares, and their pairwise products.
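The link between the polynomial kernel and an explicit polynomial feature map can be checked numerically. The sketch below assumes 2-D inputs, degree 2, γ = 1, and r = 1; the helper names `poly_kernel` and `phi_degree2` are illustrative and do not come from any library.

```python
import numpy as np

def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree (gamma fixed to 1)."""
    return (np.dot(x, z) + coef0) ** degree

def phi_degree2(x):
    """Explicit feature map for 2-D input whose dot product equals (x . z + 1) ** 2."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Computed directly in the original 2-D space
kernel_value = poly_kernel(x, z)
# Computed as an ordinary dot product in the explicit 6-D feature space
explicit_value = phi_degree2(x) @ phi_degree2(z)

print(kernel_value, explicit_value)
```

Both numbers agree, which is the point of the kernel trick: the kernel gives the feature-space dot product at the cost of one dot product in the input space.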

In machine learning, kernel functions are most often used with support vector machines (SVMs). SVMs are supervised learning algorithms that can be used for classification and regression. For classification, an SVM finds the hyperplane that separates the two classes with the largest margin, that is, the largest distance between the hyperplane and the nearest training points of each class. The SVM optimization problem depends on the data only through dot products, so replacing each dot product with a kernel evaluation lets the SVM find a separating hyperplane in the implicit feature space. That hyperplane corresponds to a non-linear decision boundary in the original input space.

An alternative to the polynomial kernel is to expand the inputs into polynomial features explicitly and train a linear SVM on them. Both approaches describe the same family of decision boundaries, but the number of explicit features grows rapidly with the degree and the input dimension, whereas the cost of the kernel approach depends on the number of training samples rather than on the size of the feature space.

In general, the polynomial kernel is a good fit when interactions between features up to a known degree matter. A linear model on the raw features is simpler and faster to compute, which can make it the better choice when the relationship is close to linear or the dataset is very large.

## Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


In [2]:
from sklearn.datasets import make_classification

X,y= make_classification(n_features=2,n_classes=2,n_samples=1000,n_clusters_per_class=2,n_redundant=0)


In [3]:
X

array([[-0.47074449,  0.57605846],
       [-0.98993858,  0.70153487],
       [ 1.50641819,  0.78883871],
       ...,
       [ 0.28969074, -2.74002773],
       [-1.40101579, -1.21848891],
       [ 0.00293995,  1.5113179 ]])

In [4]:
y

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1,
       1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0,
       1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0,
       0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0,
       1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0,
       0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0,
       0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1,
       1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1,
       1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0,
       0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0,
       1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0,

In [7]:
from sklearn.svm import SVC

# Polynomial-kernel SVM; scikit-learn's default degree for kernel='poly' is 3
svm = SVC(kernel='poly')

In [8]:
from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test = train_test_split(X,y,test_size=0.30,random_state=43)

In [10]:
x_train.shape,x_test.shape

((700, 2), (300, 2))

In [11]:
svm.fit(x_train,y_train)

In [14]:
y_pred=svm.predict(x_test)

In [16]:
from sklearn.metrics import accuracy_score

# scikit-learn metrics take (y_true, y_pred); accuracy is the same in either order
score = accuracy_score(y_test, y_pred)

In [17]:
score

0.97
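The cells above can be condensed into one reproducible script. This is a minimal sketch, not the notebook's exact run: the fixed `random_state=0`, the `StandardScaler` step, the `stratify` argument, and the explicit `degree=3, coef0=1` settings are assumptions added here, so the accuracy it prints will differ from the 0.97 reported above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 2-D, two-class data; random_state makes the run repeatable
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=2, n_classes=2,
    n_clusters_per_class=2, n_redundant=0, random_state=0,
)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=43, stratify=y_demo
)

# Scale the features, then fit a degree-3 polynomial-kernel SVM
poly_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="poly", degree=3, coef0=1, C=1.0),
)
poly_svm.fit(X_tr, y_tr)

# Metrics take (y_true, y_pred)
demo_accuracy = accuracy_score(y_te, poly_svm.predict(X_te))
print(f"test accuracy: {demo_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling statistics are learned from the training split only.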

## Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?


Increasing the value of epsilon in SVR generally decreases the number of support vectors. Epsilon sets the half-width of the epsilon-insensitive tube around the regression function: a training point whose prediction error is smaller than epsilon incurs no loss and does not become a support vector. A larger epsilon widens the tube, so more points fall strictly inside it and fewer remain as support vectors.

To understand this in more detail, recall how a support vector is defined in SVR. A support vector is a training point that lies on the boundary of the epsilon-tube or outside it, that is, a point whose absolute error is at least epsilon. Only these points have non-zero coefficients in the fitted model, so the prediction function is a weighted sum of kernel evaluations at the support vectors. The number of support vectors is therefore directly related to the complexity of the model. A model with more support vectors can fit the data more closely, but it is also more likely to overfit.

Increasing the value of epsilon lets the SVR model ignore larger deviations. The fitted function can then be simpler and rely on fewer support vectors, which reduces the risk of fitting noise and can improve generalization.

However, it is important to note that increasing epsilon too much can also lead to underfitting. This is because the model will not be able to fit the data closely enough, which can lead to poor performance on the training set and the test set.

It is important to choose a value of epsilon that strikes a balance between underfitting and overfitting. A good way to do this is to use cross-validation to evaluate the performance of the model with different values of epsilon.
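The effect of epsilon on the support-vector count can be observed directly. The sketch below uses a synthetic noisy sine curve; the data, the noise level, and the three epsilon values are assumptions chosen for illustration, and the exact counts depend on them.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: y = sin(x) + Gaussian noise
rng = np.random.default_rng(0)
X_sin = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y_sin = np.sin(X_sin).ravel() + rng.normal(scale=0.1, size=200)

sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X_sin, y_sin)
    # support_ holds the indices of the training points kept as support vectors
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors out of {len(X_sin)}")
```

With a very narrow tube almost every noisy point lies on or outside it and becomes a support vector; with a wide tube most points fall inside and drop out of the model.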

## Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?

The choice of kernel function, C parameter, epsilon parameter, and gamma parameter can all affect the performance of Support Vector Regression (SVR). Here is a brief explanation of each parameter and how it affects the model:

Kernel function: The kernel determines the shape of function the model can fit by measuring the similarity between two data points in an implicit feature space. The most common kernels for SVR are the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. The linear kernel is the simplest and fastest, but it only fits targets that are approximately linear in the features. The polynomial kernel captures feature interactions up to a chosen degree, at the cost of extra hyperparameters (degree and coef0 in scikit-learn) and numerical sensitivity at high degrees. The RBF kernel can fit smooth non-linear relationships and is a common default.

C parameter: The C parameter controls the trade-off between keeping the regression function simple and penalizing training points that fall outside the epsilon-tube. A larger C value penalizes those deviations more heavily, so the model fits the training data more closely but is more likely to overfit. A smaller C value gives stronger regularization, which produces a smoother model that is less likely to overfit but may underfit.

Epsilon parameter: The epsilon parameter sets the half-width of the tube around the regression function inside which errors are not penalized. A larger epsilon value results in a model with fewer support vectors that ignores small deviations, which helps with noisy targets but underfits if epsilon is too large. A smaller epsilon value results in a model with more support vectors that tracks the training targets more closely, which can overfit noise.

Gamma parameter: The gamma parameter is used by the RBF, polynomial, and sigmoid kernels. For the RBF kernel it controls how far the influence of a single training point reaches. A larger gamma value makes each point influence only its immediate neighbourhood, so the model is more sensitive to individual data points and can overfit. A smaller gamma value spreads each point's influence widely, so the model is smoother and less sensitive to individual data points.
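To make the role of gamma concrete, the following sketch evaluates the RBF kernel, exp(-gamma * ||a - b||^2), for one fixed pair of points and several gamma values. The two points and the gamma grid are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 1.0]])  # squared Euclidean distance to a is 2

# Similarity between the same two points under different gamma values
similarities = {
    g: rbf_kernel(a, b, gamma=g)[0, 0] for g in (0.01, 0.1, 1.0, 10.0)
}
for g, k in similarities.items():
    print(f"gamma={g}: k(a, b) = {k:.6f}")
```

As gamma grows, the similarity between these two points falls towards zero, which is why a large gamma makes each training point matter only for predictions very close to it.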

Here are some examples of when you might want to increase or decrease the value of each parameter:

Kernel function: If the target is approximately a linear function of the features, you can use the linear kernel. If the relationship is non-linear, you can use the polynomial kernel or the RBF kernel.

C parameter: If you are concerned about overfitting, for example because the targets are noisy, you can use a smaller C value. If the model underfits and you want it to follow the training data more closely, you can use a larger C value.

Epsilon parameter: If the targets are noisy or you want a sparser model with fewer support vectors, you can use a larger epsilon value. If small errors matter and you need the predictions to track the targets closely, you can use a smaller epsilon value.

Gamma parameter: If the model is too smooth and misses local structure, you can use a larger gamma value to make it more sensitive to individual data points. If the model is overfitting, you can use a smaller gamma value to make it less sensitive to individual data points.

It is important to experiment with different kernels and different values of C, epsilon, and gamma to find the best combination for your data. You can use cross-validation to evaluate the performance of the model with different values of the parameters.
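One way to run such an experiment is a cross-validated grid search over all four settings. The sketch below is illustrative only: the synthetic sine data, the candidate values in `param_grid`, and the mean-squared-error scoring are assumptions, not recommendations for any particular dataset.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear regression data
rng = np.random.default_rng(0)
X_reg = rng.uniform(-3, 3, size=(150, 1))
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.1, size=150)

# Scaling lives inside the pipeline so it is refit on each training fold
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR())])

param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": ["scale", 0.1, 1.0],  # ignored by the linear kernel
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_reg, y_reg)

print(search.best_params_)
print(f"best CV MSE: {-search.best_score_:.4f}")
```

The `svr__` prefix routes each parameter to the `SVR` step of the pipeline; `best_score_` is negated because scikit-learn maximizes scores, so errors are reported as negative values.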

## Q5. Assignment:

In [18]:
#Import the necessary libraries and load the dataset

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


In [33]:
df=pd.read_csv('diabetes.csv')

In [34]:
df.head()

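| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |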


In [35]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Pregnancies               768 non-null    int64  
 1   Glucose                   768 non-null    int64  
 2   BloodPressure             768 non-null    int64  
 3   SkinThickness             768 non-null    int64  
 4   Insulin                   768 non-null    int64  
 5   BMI                       768 non-null    float64
 6   DiabetesPedigreeFunction  768 non-null    float64
 7   Age                       768 non-null    int64  
 8   Outcome                   768 non-null    int64  
dtypes: float64(2), int64(7)
memory usage: 54.1 KB


In [38]:
# Separate the features from the target before splitting into training and testing sets
X=df.drop('Outcome',axis=1)
y=df['Outcome']

In [39]:
X

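| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 763 | 10 | 101 | 76 | 48 | 180 | 32.9 | 0.171 | 63 |
| 764 | 2 | 122 | 70 | 27 | 0 | 36.8 | 0.340 | 27 |
| 765 | 5 | 121 | 72 | 23 | 112 | 26.2 | 0.245 | 30 |
| 766 | 1 | 126 | 60 | 0 | 0 | 30.1 | 0.349 | 47 |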


In [40]:
y

0      1
1      0
2      1
3      0
4      1
      ..
763    0
764    0
765    0
766    1
767    0
Name: Outcome, Length: 768, dtype: int64

In [44]:
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test= train_test_split(X,y,test_size=0.3,random_state=43)

In [45]:
X_train.shape, X_test.shape

((537, 8), (231, 8))

In [50]:
# Preprocess the data using any technique of your choice (e.g. scaling, normalization)

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

X_trainscaled=scaler.fit_transform(X_train)
X_testscaled =scaler.transform(X_test)

In [61]:
X_trainscaled.shape

(537, 8)

In [70]:
X_testscaled.shape

(231, 8)

In [56]:
#Create an instance of the SVC classifier and train it on the training data.

from sklearn.svm import SVC

svc= SVC()
# Use Y_train from the split above; y_train belongs to the Q2 dataset
svc.fit(X_trainscaled, Y_train)

In [65]:
#use the trained classifier to predict the labels of the testing data.
Y_pred= svc.predict(X_testscaled)

In [67]:
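# Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score).

from sklearn.metrics import accuracy_score, precision_score

# scikit-learn metrics take (y_true, y_pred); accuracy is the same in either order
acc_score = accuracy_score(Y_test, Y_pred)
# Caution: with the arguments in (y_pred, y_true) order, the value computed here is
# TP / (actual positives), i.e. the recall of class 1; use precision_score(Y_test, Y_pred) for precision
precision = precision_score(Y_pred, Y_test)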

In [68]:
acc_score

0.7662337662337663

In [69]:
precision

0.5063291139240507
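Argument order matters for asymmetric metrics such as precision and recall, because scikit-learn's metric functions expect `(y_true, y_pred)`. The sketch below uses ten hand-made labels, not the diabetes data, to show what happens when the two arguments are swapped.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hand-made example: 2 true positives, 2 false negatives, 1 false positive, 5 true negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

acc = accuracy_score(y_true, y_hat)            # symmetric in its arguments
prec = precision_score(y_true, y_hat)          # TP / (TP + FP)
rec = recall_score(y_true, y_hat)              # TP / (TP + FN)
swapped_prec = precision_score(y_hat, y_true)  # arguments in the wrong order

print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
print(f"precision with swapped arguments={swapped_prec:.3f}")
```

Calling `precision_score` with the arguments swapped returns the recall, so a value computed that way should be read as recall of the positive class.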

In [71]:
# Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance.

In [78]:
parameters = {'kernel':['linear', 'poly', 'rbf'],
             'C':[0.1,1,1.5,2,3]}

In [79]:
from sklearn.model_selection import GridSearchCV
grs= GridSearchCV(SVC(),param_grid=parameters,cv=5,refit=True,verbose=3)

In [80]:
grs.fit(X_trainscaled,Y_train)

Fitting 5 folds for each of 15 candidates, totalling 75 fits
[CV 1/5] END ..............C=0.1, kernel=linear;, score=0.759 total time=   0.0s
[CV 2/5] END ..............C=0.1, kernel=linear;, score=0.843 total time=   0.0s
[CV 3/5] END ..............C=0.1, kernel=linear;, score=0.710 total time=   0.0s
[CV 4/5] END ..............C=0.1, kernel=linear;, score=0.748 total time=   0.0s
[CV 5/5] END ..............C=0.1, kernel=linear;, score=0.738 total time=   0.0s
[CV 1/5] END ................C=0.1, kernel=poly;, score=0.722 total time=   0.0s
[CV 2/5] END ................C=0.1, kernel=poly;, score=0.731 total time=   0.0s
[CV 3/5] END ................C=0.1, kernel=poly;, score=0.720 total time=   0.0s
[CV 4/5] END ................C=0.1, kernel=poly;, score=0.701 total time=   0.0s
[CV 5/5] END ................C=0.1, kernel=poly;, score=0.673 total time=   0.0s
[CV 1/5] END .................C=0.1, kernel=rbf;, score=0.676 total time=   0.0s
[CV 2/5] END .................C=0.1, kernel=rbf;

In [81]:
grs.best_params_

{'C': 2, 'kernel': 'rbf'}
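A variant of this search places the scaler inside a `Pipeline`, so that each cross-validation fold is scaled using only its own training portion, and then ranks the candidates from `cv_results_`. This is a sketch on synthetic data from `make_classification`, standing in for `diabetes.csv`, so its best parameters will not necessarily match the ones above.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 8-feature diabetes data
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)

pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
grid = {
    "svc__kernel": ["linear", "poly", "rbf"],
    "svc__C": [0.1, 1, 1.5, 2, 3],
}

search_clf = GridSearchCV(pipe, param_grid=grid, cv=5)
search_clf.fit(X_syn, y_syn)

# One row per candidate; rank_test_score == 1 marks the best mean CV score
results = pd.DataFrame(search_clf.cv_results_)
top = results.sort_values("rank_test_score")[
    ["param_svc__kernel", "param_svc__C", "mean_test_score"]
].head(3)
print(top)
```

Looking at the top few rows, rather than only `best_params_`, shows whether the winner is clearly better or whether several settings perform about the same.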

In [82]:
# With refit=True, GridSearchCV has already retrained the best estimator on the full training set; predict on the test set
y_pred2 = grs.predict(X_testscaled)

In [84]:
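# Metrics take (y_true, y_pred); accuracy is the same in either order
acc_score = accuracy_score(Y_test, y_pred2)
# Caution: with the arguments in (y_pred, y_true) order, this value is the recall of class 1
precision = precision_score(y_pred2, Y_test)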

In [85]:
acc_score,precision

(0.7619047619047619, 0.5063291139240507)

In [86]:
# Save the trained classifier to a file for future use.
scaler

In [87]:
grs

In [88]:
import pickle

In [89]:
with open('scalersvc.pkl', 'wb') as f:
    pickle.dump(scaler, f)
with open('gridsearch.pkl', 'wb') as f:
    pickle.dump(grs, f)
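To use saved objects later, load both files and apply the scaler before predicting. The sketch below shows the round trip on a small synthetic model written to a temporary directory; the file names `scaler_demo.pkl` and `model_demo.pkl` and the synthetic data are placeholders, not the files created above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small synthetic model standing in for the fitted scaler and grid search
X_syn, y_syn = make_classification(n_samples=200, n_features=8, random_state=0)
scaler_demo = StandardScaler().fit(X_syn)
model_demo = SVC().fit(scaler_demo.transform(X_syn), y_syn)

with tempfile.TemporaryDirectory() as tmp:
    scaler_path = os.path.join(tmp, "scaler_demo.pkl")
    model_path = os.path.join(tmp, "model_demo.pkl")

    # Save both objects; the context manager closes each file
    with open(scaler_path, "wb") as f:
        pickle.dump(scaler_demo, f)
    with open(model_path, "wb") as f:
        pickle.dump(model_demo, f)

    # Load them back, as a later session would
    with open(scaler_path, "rb") as f:
        loaded_scaler = pickle.load(f)
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

# New rows must go through the same scaler before prediction
new_rows = X_syn[:5]
original_pred = model_demo.predict(scaler_demo.transform(new_rows))
reloaded_pred = loaded_model.predict(loaded_scaler.transform(new_rows))
print(np.array_equal(original_pred, reloaded_pred))
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that created them.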