Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?

Polynomial Functions:

Polynomial functions are sums of terms in which each variable is raised to a non-negative integer power and multiplied by a coefficient.
Used as feature transformations, they can model non-linear relationships between input features.
Kernel Functions:

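Kernel functions let an algorithm work in a high-dimensional feature space without explicitly transforming the data: the kernel returns the inner product of two points as if they had already been mapped (the kernel trick).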
The polynomial kernel is a specific type of kernel function that computes the dot product of two vectors in a polynomial feature space.
Relationship:
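The polynomial kernel can be expressed as: \[ K(x_i, x_j) = (x_i^T x_j + c)^d \] where \( d \) is the degree of the polynomial and \( c \ge 0 \) is a constant that weights lower-order terms against higher-order ones.
This allows an SVM to learn non-linear decision boundaries by implicitly mapping input features into a higher-dimensional space.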
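
The kernel trick described above can be checked numerically. The following is a minimal sketch, assuming 2-dimensional inputs, degree d = 2, and c = 1; the helper names `poly_kernel` and `explicit_features` are illustrative and not part of any library.

```python
import numpy as np

def poly_kernel(x, z, c=1.0, d=2):
    """Polynomial kernel K(x, z) = (x^T z + c)^d."""
    return (np.dot(x, z) + c) ** d

def explicit_features(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = x
    return np.array([
        x1 ** 2,
        x2 ** 2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Kernel value computed directly in the 2-D input space
kernel_value = poly_kernel(x, z)

# Same quantity computed as a dot product in the 6-D feature space
explicit_value = np.dot(explicit_features(x), explicit_features(z))

print(kernel_value, explicit_value)
```

Both numbers agree, but the kernel never builds the 6-dimensional vectors. In scikit-learn the polynomial kernel has the form (gamma * x^T z + coef0)^degree, so the formula above corresponds to gamma=1 and coef0=c.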


Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [1]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an instance of the SVC classifier with a polynomial kernel
model = SVC(kernel='poly', degree=3, C=1.0)

# Train the classifier on the training data
model.fit(X_train, y_train)

# Predict the labels of the testing data
y_pred = model.predict(X_test)

# Evaluate the performance of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')

Accuracy: 1.0


Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

Epsilon in SVR:
Epsilon (\( \epsilon \)) defines the half-width of the tube around the regression function inside which errors receive no penalty.
Effect of Increasing Epsilon:
Increasing the value of epsilon generally leads to fewer support vectors: a wider tube contains more data points, and only points on or outside the tube boundary become support vectors.
This can result in a simpler model that may underfit the data if epsilon is too large.
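
The effect can be observed on a small example. This is a rough sketch on synthetic data (a noisy sine curve, which is an assumption chosen for illustration and not part of the assignment); it fits an SVR for several epsilon values and counts the support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine curve (illustrative data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

support_counts = {}
for epsilon in [0.01, 0.1, 0.5]:
    model = SVR(kernel='rbf', C=1.0, epsilon=epsilon)
    model.fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    support_counts[epsilon] = len(model.support_)
    print(f'epsilon={epsilon}: {support_counts[epsilon]} support vectors')
```

On data like this the count should fall as epsilon grows, since more points end up strictly inside the tube.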


Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?

Kernel Function:

Determines the shape of the regression function that can be fitted.
Example: Use a linear kernel when the target depends roughly linearly on the features; use an RBF kernel for non-linear relationships.
C Parameter:

Controls the trade-off between keeping the model flat (strong regularization) and penalizing training errors larger than epsilon.
Example: Increase C to reduce training error (risk of overfitting); decrease C to tolerate larger errors when the data are noisy (risk of underfitting).
Epsilon Parameter:

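Defines the margin of tolerance around the prediction within which errors receive no penalty.
Example: Increase epsilon when the targets are noisy, so that small errors are ignored and fewer support vectors are needed; decrease epsilon when predictions must track the targets closely.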
Gamma Parameter:

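Defines how far the influence of a single training example reaches; it applies to the RBF, polynomial, and sigmoid kernels.
Example: Increase gamma to make the model more sensitive to individual training points (risk of overfitting); decrease gamma to obtain a smoother, more general model (risk of underfitting if too small).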
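
In practice these four settings interact, so they are usually tuned together. The following is one possible sketch using a cross-validated grid search on synthetic data; the data, the grid values, and the scoring choice are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear regression problem (illustrative data)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Scale features inside the pipeline so each CV fold is scaled on its own training part
pipeline = make_pipeline(StandardScaler(), SVR())

# Candidate values; gamma is ignored by the linear kernel
param_grid = {
    'svr__kernel': ['linear', 'rbf'],
    'svr__C': [0.1, 1.0, 10.0],
    'svr__epsilon': [0.01, 0.1, 0.5],
    'svr__gamma': ['scale', 0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=5,
                      scoring='neg_mean_squared_error')
search.fit(X, y)

print(search.best_params_)
print(f'Best CV MSE: {-search.best_score_:.4f}')
```

Because the target here is a sine curve, the search would be expected to prefer the RBF kernel over the linear one; on real data the best combination has to be found empirically.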


Q5: Assignment Steps

In [2]:
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
import joblib

# Load the dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [3]:
# Split the dataset into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [4]:
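# Standardize features: fit the scaler on the training set only,
# then apply the same transformation to the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)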

In [5]:
# Train an SVC with default hyperparameters (RBF kernel, C=1.0)
model = SVC()
model.fit(X_train, y_train)

In [6]:
# Predict the labels of the testing data
y_pred = model.predict(X_test)