Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

In machine learning algorithms, particularly Support Vector Machines (SVMs), kernel functions play a vital role in handling non-linear data. While linear SVMs can only create linear decision boundaries, kernel functions effectively transform the data into a higher-dimensional space where a linear separation becomes possible. Polynomial functions are one type of kernel function employed for this transformation.

Here's the key connection:

Polynomial functions: A polynomial feature map sends each data point to a higher-dimensional vector made of polynomial terms of its original features. For example, a degree-2 map of (x1, x2) produces features such as x1^2, x2^2, and x1*x2, which lets the SVM learn curved decision boundaries.
Kernel functions: Act as a bridge between the original data space and the transformed space, without explicitly performing the mapping. They compute the inner product of two data points in the high-dimensional space directly from the original features, so the SVM can learn a linear boundary in that space even when the classes are not linearly separable in the original one. The polynomial kernel does exactly this for a polynomial feature map.

In essence, the polynomial kernel is one specific way to achieve the general goal of kernel functions: working in a space where linear separation is possible, so that SVMs can learn from data that is not linearly separable.
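
To make the "inner product without the mapping" idea concrete, here is a minimal sketch for two 2-D points. The feature map `phi` and the sample points are illustrative choices, not part of the original notebook. It assumes the homogeneous degree-2 kernel K(x, z) = (x . z)^2, which corresponds to `gamma=1.0` and `coef0=0.0` in scikit-learn's `polynomial_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(v):
    """Explicit degree-2 feature map for a 2-D point (illustrative helper)."""
    x1, x2 = v
    # The sqrt(2) factor makes phi(x) . phi(z) equal (x . z)^2 exactly.
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])


x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

# Inner product computed in the 3-D feature space.
explicit = phi(x) @ phi(z)

# Same value computed from the original 2-D points only.
implicit = (x @ z) ** 2

# scikit-learn's kernel: (gamma * <x, z> + coef0) ** degree
sk_value = polynomial_kernel(
    x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=0.0
)[0, 0]

print(explicit, implicit, sk_value)
```

All three values agree, which is why the SVM never needs to build the higher-dimensional vectors itself.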

Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [1]:
from sklearn.svm import SVC

# Sample data
X = [[1, 1], [2, 2], [3, 3], [4, 4]]
y = [1, 4, 9, 16]

# Define SVC with polynomial kernel (SVC is a classifier, so the values in y are treated as class labels)
model = SVC(kernel='poly', C=100, degree=2)

# Train the model
model.fit(X, y)

# Make predictions
y_pred = model.predict([[5, 5]])
print(y_pred)


[16]
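
A slightly larger sketch may show the effect of the polynomial kernel more clearly. It assumes synthetic data from `make_circles` (two concentric rings, which no straight line can separate) and compares a linear kernel with a degree-2 polynomial kernel. The dataset, the `coef0=1` setting, and the random seeds are assumptions made for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Linear kernel: can only draw a straight decision boundary.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)

# Degree-2 polynomial kernel: can represent a circular boundary.
poly_acc = (
    SVC(kernel="poly", degree=2, coef0=1).fit(X_train, y_train).score(X_test, y_test)
)

print("linear kernel accuracy:", linear_acc)
print("poly kernel accuracy:  ", poly_acc)
```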


Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In Support Vector Regression (SVR), epsilon sets the half-width of the epsilon-insensitive tube around the regression function: training points whose error is smaller than epsilon incur no loss. Only points that lie on or outside the tube become support vectors. Increasing epsilon widens the tube, so more points fall inside it. This typically leads to:

Fewer support vectors: As the model tolerates larger errors, fewer data points are needed to define the regression function, resulting in a sparser model.
Higher bias: A wide tube ignores small deviations, so the model fits the training data less precisely and may underfit.

Example:

Low epsilon: Enforces a tighter fit, typically using more support vectors and risking overfitting to noise.
High epsilon: Tolerates larger errors, typically reducing the number of support vectors but introducing higher bias.
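
The trend can be checked with a small experiment. This is a sketch under assumptions: the noisy sine data, the RBF kernel, `C=1.0`, and the three epsilon values are all chosen for illustration and do not come from the original notebook.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

# Fit the same model with increasingly wide epsilon tubes.
counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors.
    counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {counts[eps]} support vectors")
```

On data like this, the count should drop as epsilon grows, because more points end up inside the tube and stop contributing to the solution.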


Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?

Kernel function:

Choice: Determines how the data is transformed and what kind of non-linear relationships the model can capture.
Linear kernel: For simple linear relationships.
Polynomial kernel: For moderate non-linearity (choose appropriate degree).
RBF kernel: For highly non-linear data, but can be computationally expensive.
Example: Use RBF for complex, noisy data; polynomial for moderate non-linearity with interpretability; linear for simple, well-separated data.
C parameter:

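Controls regularization: C sets the penalty on training errors (in SVR, on points outside the epsilon tube). A higher C penalizes errors more heavily, which weakens regularization and fits the training data more closely, risking overfitting. A lower C gives a smoother, more strongly regularized model with higher bias.
Example: Decrease C for noisy data or when overfitting is a concern; increase C when the model underfits and the training data is reliable.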
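Epsilon parameter (SVR only):

Defines error tolerance: Errors smaller than epsilon are ignored. Higher epsilon gives a wider tube and usually fewer support vectors, but risks higher bias.
Example: Increase epsilon for noisy data or when capturing every data point perfectly is not crucial; decrease it when small errors matter.
Gamma parameter (RBF, polynomial, and sigmoid kernels):

Controls kernel width: With the RBF kernel, a higher gamma makes each training point's influence more local, so the model can follow fine detail but may overfit. A lower gamma gives smoother functions but may underfit.
Example: Decrease gamma when the model overfits or the data is noisy; increase gamma when the fit is too smooth to capture the pattern.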
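
In practice these parameters interact, so they are usually tuned together rather than one at a time. The following is a minimal sketch, assuming synthetic noisy sine data and a small hand-picked grid; the grid values, the scoring metric, and the pipeline step names (generated by `make_pipeline`) are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (assumption: not the dataset used in this notebook).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)

# Scale features first, then fit an RBF-kernel SVR.
pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))

# make_pipeline names the SVR step "svr", hence the "svr__" prefix.
param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.01, 0.1, 1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print(search.best_params_)
print("best CV MSE:", -search.best_score_)
```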

Q5. Assignment:
- Import the necessary libraries and load the dataset
- Split the dataset into training and testing sets
- Preprocess the data using any technique of your choice (e.g. scaling, normalization)
- Create an instance of the SVC classifier and train it on the training data
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy,
precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to
improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use.

In [2]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification


In [4]:
x,y=make_classification(n_samples=1000,n_features=2,n_classes=2,n_clusters_per_class=2,n_redundant=0)

In [5]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(x,y,random_state=42,test_size=0.25)
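
The assignment also asks for a preprocessing step. One possible way to do it is standard scaling, sketched below as a self-contained example. It regenerates data with the same `make_classification` settings as above, but adds `random_state=0` so the sketch is reproducible; that seed is an assumption, so the numbers will not match the outputs shown in this notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Same settings as the notebook, plus a fixed seed (assumption).
x, y = make_classification(
    n_samples=1000, n_features=2, n_classes=2,
    n_clusters_per_class=2, n_redundant=0, random_state=0,
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=42, test_size=0.25
)

# Fit the scaler on the training split only, to avoid leaking test statistics.
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)

# Reuse the training mean and standard deviation for the test split.
x_test_scaled = scaler.transform(x_test)

print(x_train_scaled.mean(axis=0), x_train_scaled.std(axis=0))
```

SVMs are sensitive to feature scale because the kernel depends on distances or inner products, so scaling usually matters more for the RBF and polynomial kernels than it might first appear.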

In [6]:
from sklearn.svm import SVC
svc=SVC(kernel='linear')

In [8]:
svc.fit(x_train,y_train)

In [10]:
y_pred=svc.predict(x_test)

In [14]:
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report,recall_score,precision_score

In [16]:
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(recall_score(y_test,y_pred))
print(precision_score(y_test,y_pred))

[[114  10]
 [ 10 116]]
0.92
              precision    recall  f1-score   support

           0       0.92      0.92      0.92       124
           1       0.92      0.92      0.92       126

    accuracy                           0.92       250
   macro avg       0.92      0.92      0.92       250
weighted avg       0.92      0.92      0.92       250

0.9206349206349206
0.9206349206349206


In [17]:
from sklearn.model_selection import GridSearchCV
parameters={'kernel':('linear','poly','rbf','sigmoid'),'C':[1,2,3,4,5]}
gt=GridSearchCV(svc,param_grid=parameters,cv=5,refit=True)

In [18]:
gt.fit(x_train,y_train)

In [19]:
gt.best_params_

{'C': 5, 'kernel': 'rbf'}

In [20]:
y_pred=gt.predict(x_test)

In [22]:
gt.best_score_

0.9426666666666668

In [21]:
print(confusion_matrix(y_test,y_pred))
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(recall_score(y_test,y_pred))
print(precision_score(y_test,y_pred))

[[118   6]
 [  5 121]]
0.956
              precision    recall  f1-score   support

           0       0.96      0.95      0.96       124
           1       0.95      0.96      0.96       126

    accuracy                           0.96       250
   macro avg       0.96      0.96      0.96       250
weighted avg       0.96      0.96      0.96       250

0.9603174603174603
0.952755905511811
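
The assignment mentions RandomizedSearchCV as an alternative to GridSearchCV. A minimal sketch of that option follows; the log-uniform ranges for `C` and `gamma`, the restriction to two kernels, `n_iter=10`, and the seeds are all assumptions made for illustration, and the data is regenerated so the block runs on its own.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

# Regenerated data with a fixed seed (assumption), so results differ from above.
x, y = make_classification(
    n_samples=1000, n_features=2, n_classes=2,
    n_clusters_per_class=2, n_redundant=0, random_state=0,
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, random_state=42, test_size=0.25
)

# Sample C and gamma from continuous log-uniform ranges instead of a fixed grid.
param_distributions = {
    "kernel": ["linear", "rbf"],
    "C": loguniform(1e-2, 1e2),
    "gamma": loguniform(1e-3, 1e1),
}

search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=10, cv=5, random_state=0
)
search.fit(x_train, y_train)

print(search.best_params_)
print("CV accuracy:  ", search.best_score_)
print("test accuracy:", search.score(x_test, y_test))
```

Random search tries a fixed number of sampled combinations, which can be cheaper than a full grid when several parameters are tuned at once.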


In [23]:
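import pickle
# Refit the tuned estimator on the entire dataset, then save it (not the untuned svc)
best_svc=gt.best_estimator_.fit(x,y)
with open("svcmodel.pkl",'wb') as file:
    pickle.dump(best_svc,file)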
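
To confirm that a saved classifier can be used later, it helps to load it back and compare predictions. The sketch below is self-contained and works under assumptions: it trains a small stand-in model on regenerated data and writes to a temporary directory, so it does not touch the `svcmodel.pkl` file created above.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small stand-in model (assumption: not the tuned classifier from this notebook).
x, y = make_classification(
    n_samples=200, n_features=2, n_redundant=0, random_state=0
)
model = SVC(kernel="rbf", C=5).fit(x, y)

# Save to a temporary location.
path = os.path.join(tempfile.mkdtemp(), "svcmodel.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load it back and check that predictions are unchanged.
with open(path, "rb") as f:
    restored = pickle.load(f)

same = bool((restored.predict(x) == model.predict(x)).all())
print("predictions identical after reload:", same)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that created them.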