Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?


polynomial functions are a mathematical foundation for modeling complex functions, while polynomial kernels in machine learning leverage the concept of polynomial functions to perform implicit transformations of data for non-linear classification and regression tasks using kernel-based algorithms. This theoretical relationship demonstrates how mathematical concepts can be applied to machine learning to handle non-linear relationships in data.

Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?


In [6]:
from sklearn.svm import SVC,SVR

svc=SVC(kernel="poly")
svr=SVR(kernel="poly")

Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?


Here's how increasing the value of epsilon affects the number of support vectors in SVR:

1. Smaller Epsilon (ε):
   - When epsilon is set to a smaller value, the epsilon-insensitive tube becomes narrower.
   - A narrower tube means that the model is less tolerant of errors, and it will try to fit the training data more closely.
   - This can result in a larger number of support vectors because the model needs to consider more data points to fit the training data within the narrow margin.

2. Larger Epsilon (ε):
   - Conversely, when epsilon is set to a larger value, the epsilon-insensitive tube becomes wider.
   - A wider tube means that the model is more tolerant of errors, allowing some data points to fall outside the tube without contributing to the loss.
   - This can result in a smaller number of support vectors because the model is more lenient and does not need to consider as many data points to achieve a good fit.



Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?


In [None]:
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVR
import pandas as pd

data = pd.read_csv("winequality-red.csv")
X = data.iloc[:, :-1]
y = data["quality"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
svm = SVR()

parameter = {
    "kernel": ['poly', 'rbf'],  
    "degree": [1, 2],  
    "gamma": ["auto", "scale"],
    "C": [1.0, 2.0]
}

GRID = GridSearchCV(svm, param_grid=parameter,refit=True,cv=2,verbose=3)
GRID.fit(X_train, y_train)

Fitting 2 folds for each of 16 candidates, totalling 32 fits
[CV 1/2] END C=1.0, degree=1, gamma=auto, kernel=poly;, score=0.338 total time=   0.2s
[CV 2/2] END C=1.0, degree=1, gamma=auto, kernel=poly;, score=0.275 total time=   0.2s
[CV 1/2] END C=1.0, degree=1, gamma=auto, kernel=rbf;, score=0.144 total time=   0.1s
[CV 2/2] END C=1.0, degree=1, gamma=auto, kernel=rbf;, score=0.152 total time=   0.1s
[CV 1/2] END C=1.0, degree=1, gamma=scale, kernel=poly;, score=0.084 total time=   0.0s
[CV 2/2] END C=1.0, degree=1, gamma=scale, kernel=poly;, score=0.043 total time=   0.0s
[CV 1/2] END C=1.0, degree=1, gamma=scale, kernel=rbf;, score=0.123 total time=   0.0s
[CV 2/2] END C=1.0, degree=1, gamma=scale, kernel=rbf;, score=0.102 total time=   0.1s


In [None]:
GRID.best_params_

Q5. Assignment:


In [1]:
#Import the necessary libraries and load the dataset

import pandas as pd 
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt 

In [5]:
#Split the dataset into training and testing setZ
from sklearn.model_selection import  train_test_split
data = pd.read_csv("winequality-red.csv")
X = data.iloc[:, :-1]
y = data["quality"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

In [29]:
#Preprocess the data using any technique of your choice (e.g. scaling, normaliMation)

from sklearn.preprocessing import StandardScaler

std=StandardScaler()
std_train=std.fit_transform(X_train)
std_test=std.fit_transform(X_test)

In [30]:
# Create an instance of the SVC classifier and train it on the training data

from sklearn.svm import SVC

svc=SVC(kernel="linear")

svc.fit(std_train,y_train)

In [31]:
#use the trained classifier to predict the labels of the testing data

y=svc.predict(X_test)



In [32]:
#Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy,precision, recall, F1-score)

from sklearn.metrics import accuracy_score
print(accuracy_score(y,y_test))

0.5166666666666667


In [33]:
#Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomiMedSearchCV to improve its performance
from sklearn.model_selection import GridSearchCV

parameter = {
    "kernel": ['linear', 'poly', 'rbf', 'sigmoid'],  
    "degree": [1, 2],  
    "gamma": ["auto", "scale"]
}


In [34]:
GRID = GridSearchCV(svc, param_grid=parameter,refit=True,cv=2,verbose=3)
GRID.fit(std_train, y_train)

Fitting 2 folds for each of 16 candidates, totalling 32 fits
[CV 1/2] END degree=1, gamma=auto, kernel=linear;, score=0.588 total time=   0.0s
[CV 2/2] END degree=1, gamma=auto, kernel=linear;, score=0.597 total time=   0.0s
[CV 1/2] END .degree=1, gamma=auto, kernel=poly;, score=0.584 total time=   0.0s
[CV 2/2] END .degree=1, gamma=auto, kernel=poly;, score=0.590 total time=   0.0s
[CV 1/2] END ..degree=1, gamma=auto, kernel=rbf;, score=0.607 total time=   0.1s
[CV 2/2] END ..degree=1, gamma=auto, kernel=rbf;, score=0.610 total time=   0.1s
[CV 1/2] END degree=1, gamma=auto, kernel=sigmoid;, score=0.541 total time=   0.1s
[CV 2/2] END degree=1, gamma=auto, kernel=sigmoid;, score=0.569 total time=   0.1s
[CV 1/2] END degree=1, gamma=scale, kernel=linear;, score=0.588 total time=   0.0s
[CV 2/2] END degree=1, gamma=scale, kernel=linear;, score=0.597 total time=   0.0s
[CV 1/2] END degree=1, gamma=scale, kernel=poly;, score=0.586 total time=   0.0s
[CV 2/2] END degree=1, gamma=scale, ke

In [36]:
y=GRID.predict(std_test)

accucary after hyperparameter

In [37]:
accuracy_score(y,y_test)

0.6166666666666667