Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?
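In machine learning, particularly in Support Vector Machines (SVMs), a kernel function computes the inner product of two data points in a higher-dimensional feature space without constructing that space explicitly. Working in that space often makes it easier to find a linear separating hyperplane. The polynomial kernel, K(x, z) = (γ x·z + r)^d, is one such kernel function: its feature space consists of polynomial terms of the original features (monomials of degree at most d), so a linear hyperplane in that space corresponds to a polynomial decision boundary in the original feature space.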
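
The equivalence between a polynomial kernel and an explicit polynomial feature map can be checked numerically. The sketch below assumes 2-D inputs, degree 2, γ = 1, and r = 1; the helper `phi` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(v):
    """Explicit degree-2 feature map for a 2-D point (illustrative helper)."""
    x1, x2 = v
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, s * x1 * x2, x2 ** 2])


x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Inner product computed explicitly in the 6-dimensional feature space
explicit = phi(x) @ phi(z)

# Same quantity from the kernel, without building the feature space:
# K(x, z) = (gamma * x.z + coef0) ** degree
implicit = polynomial_kernel(
    x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(f'explicit feature map: {explicit:.6f}')
print(f'polynomial kernel:    {implicit:.6f}')
```

Both values agree up to floating-point error, which is why an SVM can work with the kernel alone and never materialize the polynomial features.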


Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [None]:
# Import necessary libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an instance of the SVC classifier with a polynomial kernel
svm_clf = SVC(kernel='poly', degree=3, C=1.0)

# Train the classifier on the training data
svm_clf.fit(X_train, y_train)

# Predict the labels for the testing set
y_pred = svm_clf.predict(X_test)

# Compute the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
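
A single 30-sample test split gives a noisy accuracy estimate, so a cross-validated comparison can be a useful complement. The following is a minimal sketch, not part of the original solution: the scaling step, `coef0=1.0`, and the degrees tried are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

mean_scores = {}
for degree in (2, 3, 4):
    # Scaling lives inside the pipeline so each CV fold is scaled
    # using statistics from its own training part only
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel='poly', degree=degree, coef0=1.0, C=1.0),
    )
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[degree] = scores.mean()
    print(f'degree={degree}: mean CV accuracy = {mean_scores[degree]:.3f}')
```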



Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?


In Support Vector Regression (SVR), the ϵ parameter defines a tube of tolerance around the fitted function: errors smaller than ϵ incur no penalty. Increasing ϵ widens this tube, so more data points fall inside it and contribute nothing to the loss function. Only points on or outside the tube become support vectors, so the number of support vectors generally decreases as ϵ increases.
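
This effect can be observed directly by counting support vectors for several values of ϵ. The sketch below uses synthetic noisy sine data; the data, the noise level, and the ϵ values are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine wave (illustrative data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_support = {}
for eps in (0.01, 0.1, 0.5):
    svr = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support[eps] = len(svr.support_)
    print(f'epsilon={eps}: {n_support[eps]} support vectors out of {len(X)}')
```

With a very small ϵ almost every noisy point lies outside the tube, while a wide tube leaves only a few points on or beyond its edge.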

Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)?


Kernel Function: The kernel function determines the shape of the regression function that SVR can fit. Common kernels include linear, polynomial, and RBF (Radial Basis Function). The choice of kernel affects the model's ability to capture complex patterns. For example, the RBF kernel can handle non-linear relationships, while the linear kernel is suitable for linear relationships.
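
As a rough illustration of the kernel choice, the sketch below compares a linear and an RBF kernel on a target that is non-linear by construction. The synthetic sine data and the hyperparameter values are assumptions, not results from a real dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Non-linear target: noisy sine wave (illustrative data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

r2 = {}
for kernel in ('linear', 'rbf'):
    scores = cross_val_score(
        SVR(kernel=kernel, C=1.0, epsilon=0.1), X, y, cv=5, scoring='r2'
    )
    r2[kernel] = scores.mean()
    print(f'{kernel}: mean cross-validated R^2 = {r2[kernel]:.3f}')
```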

C Parameter: The C parameter controls the trade-off between achieving a low error on the training data and minimizing the norm of the weights. A small C value tolerates more deviations beyond the ϵ tube, giving a flatter, simpler model that may underfit. A large C value penalizes such deviations heavily, so the model fits the training data more closely but can overfit.
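
For reference, the role of C is visible in the standard ϵ-insensitive SVR primal problem, written here in its usual textbook form. The symbols w, b, φ, and the slack variables ξ_i, ξ_i* are notation introduced for this sketch and do not appear elsewhere in this notebook.

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*} \quad & \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right) \\
\text{subject to} \quad & y_i - w^\top \phi(x_i) - b \le \epsilon + \xi_i, \\
& w^\top \phi(x_i) + b - y_i \le \epsilon + \xi_i^*, \\
& \xi_i \ge 0, \quad \xi_i^* \ge 0, \qquad i = 1, \dots, n.
\end{aligned}
$$

The first term keeps the function flat, and C scales the penalty on deviations that exceed ϵ.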

Epsilon Parameter: The ϵ parameter defines the margin of tolerance where no penalty is given to errors. A larger ϵ value allows for more tolerance, reducing the number of support vectors and potentially leading to a simpler model with higher bias. A smaller ϵ value results in more support vectors and can capture more details but may overfit the data.
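
The "no penalty inside the margin" behaviour corresponds to the ϵ-insensitive loss, max(0, |y − f(x)| − ϵ). A minimal NumPy sketch follows; the function name and the sample values are chosen only for illustration.

```python
import numpy as np


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - epsilon)


y_true = np.array([1.0, 1.0, 1.0])
y_pred = np.array([1.05, 0.8, 1.5])  # absolute errors: 0.05, 0.2, 0.5

# A narrow tube penalizes two of the three predictions
losses_small = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
# A wider tube penalizes only the largest error
losses_large = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.3)

print('epsilon=0.1:', np.round(losses_small, 3))
print('epsilon=0.3:', np.round(losses_large, 3))
```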

Gamma Parameter: The γ parameter is the kernel coefficient for the RBF, polynomial, and sigmoid kernels. For the RBF kernel, it defines how far the influence of a single training example reaches. A small γ value means a far-reaching influence (smooth fitted function), while a large γ value means a short-range influence (a more complex fitted function that can overfit). Tuning γ is essential for balancing model complexity and generalization.
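
For the RBF kernel, K(a, b) = exp(−γ‖a − b‖²), so the "reach" of a training example can be read off from how quickly the kernel value decays with distance. The sketch below evaluates the kernel for two points at distance 1; the points and the γ values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[0.0, 0.0]])
b = np.array([[1.0, 0.0]])  # Euclidean distance from a is 1

similarity = {}
for gamma in (0.1, 1.0, 10.0):
    # With squared distance 1, the kernel value equals exp(-gamma)
    similarity[gamma] = rbf_kernel(a, b, gamma=gamma)[0, 0]
    print(f'gamma={gamma}: K(a, b) = {similarity[gamma]:.5f}')
```

A small γ keeps the two points strongly similar, while a large γ makes the kernel value vanish, so each training example only affects predictions in its immediate neighbourhood.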

Q5. Assignment

In [None]:
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report
import joblib


# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


# Scale the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


# Create an instance of the SVC classifier
svc = SVC(kernel='linear', C=1.0)

# Train the classifier on the training data
svc.fit(X_train, y_train)


# Predict the labels for the testing set
y_pred = svc.predict(X_test)


# Evaluate the performance of the classifier
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f'Accuracy: {accuracy:.2f}')
print(f'Precision: {precision:.2f}')
print(f'Recall: {recall:.2f}')
print(f'F1 Score: {f1:.2f}')

# Print classification report
print('\nClassification Report:')
print(classification_report(y_test, y_pred))


# Define the parameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'gamma': ['scale', 'auto']
}

# Perform GridSearchCV
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Best parameters found by GridSearchCV
print(f'Best Parameters: {grid_search.best_params_}')


# GridSearchCV refits the best estimator on the full training set by default
# (refit=True), so best_estimator_ is already trained and needs no extra fit
best_svc = grid_search.best_estimator_

# Predict the labels for the testing set
y_pred_best = best_svc.predict(X_test)

# Evaluate the performance of the tuned classifier
accuracy_best = accuracy_score(y_test, y_pred_best)
print(f'Accuracy of Tuned Classifier: {accuracy_best:.2f}')


# Save the trained classifier
joblib.dump(best_svc, 'tuned_svc_model.pkl')

# Load the saved model (example of how to load the saved model)
loaded_model = joblib.load('tuned_svc_model.pkl')
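
The saved classifier expects scaled inputs, so the fitted scaler has to be available wherever the model is loaded. One possible approach, sketched here as an assumption rather than as part of the assignment, is to wrap the scaler and the classifier in a single pipeline and persist that object. The file name `svc_pipeline.pkl`, the temporary directory, and the RBF settings are illustrative.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline learns the scaling on the training data and applies it
# automatically at prediction time
pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0))
pipeline.fit(X_train, y_train)

# Persist scaler and classifier together (illustrative path)
model_path = os.path.join(tempfile.mkdtemp(), 'svc_pipeline.pkl')
joblib.dump(pipeline, model_path)

# The reloaded pipeline accepts raw, unscaled features
loaded_pipeline = joblib.load(model_path)
same = np.array_equal(loaded_pipeline.predict(X_test), pipeline.predict(X_test))
print(f'Loaded pipeline reproduces predictions: {same}')
```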


