Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?
In machine learning algorithms, particularly Support Vector Machines (SVMs), a kernel function computes the inner product of two inputs
in a higher-dimensional feature space without constructing that space explicitly. Data that cannot be separated linearly in the original
space may become linearly separable there. The polynomial function is one such kernel.

Polynomial Kernel
The polynomial kernel is defined as:
$K(x_i, x_j) = (x_i \cdot x_j + c)^d$
where:
x_i and x_j are input feature vectors.
c is a free parameter trading off the influence of higher-order versus lower-order terms.
d is the degree of the polynomial.

By using a polynomial kernel, the algorithm can fit more complex, non-linear decision boundaries.
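
As a minimal sketch of the formula above, the kernel value for two small vectors can be computed by hand and compared with scikit-learn's `polynomial_kernel`. The helper name `poly_kernel` and the example vectors are illustrative assumptions. Note that scikit-learn's version multiplies the dot product by an extra factor `gamma` and calls the constant `coef0`; setting `gamma=1.0` recovers the formula as written here.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def poly_kernel(xi, xj, c=1.0, d=3):
    """Polynomial kernel (xi . xj + c)^d for two 1-D feature vectors."""
    return (np.dot(xi, xj) + c) ** d


# Illustrative inputs (not taken from any dataset)
xi = np.array([1.0, 2.0])
xj = np.array([3.0, 4.0])

# (1*3 + 2*4 + 1)^3 = 12^3
manual = poly_kernel(xi, xj, c=1.0, d=3)

# scikit-learn expects 2-D arrays and returns a kernel matrix;
# gamma=1.0 and coef0=1.0 match the hand-written formula
library = polynomial_kernel(
    xi.reshape(1, -1), xj.reshape(1, -1), degree=3, gamma=1.0, coef0=1.0
)[0, 0]

print(manual, library)
```

Both values should agree, which shows that the kernel is an ordinary function of the dot product and needs no explicit higher-dimensional mapping.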


Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

Below is an example of implementing an SVM with a polynomial kernel using Scikit-learn:


from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create an SVM with a polynomial kernel
# (C is the regularization strength; the kernel constant c is set via coef0, default 0.0)
svc_poly = SVC(kernel='poly', degree=3, C=1.0)

# Train the model
svc_poly.fit(X_train, y_train)

# Predict on the test set
y_pred = svc_poly.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')


Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?


In Support Vector Regression (SVR), the parameter epsilon (ϵ) sets the half-width of a tube around the regression function inside which errors incur no penalty.
Only points that lie on or outside this tube become support vectors, so a larger ϵ leaves more points strictly inside the tube and out of the solution.
Consequently, the number of support vectors generally decreases as ϵ increases, at the cost of a coarser fit.
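
The effect can be checked with a small experiment. The sketch below assumes a synthetic noisy sine dataset and three arbitrary ϵ values, none of which come from the original question; it fits an RBF SVR for each ϵ and counts the support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data (an assumption for illustration)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

n_support = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support[eps] = len(model.support_)
    print(f"epsilon={eps}: {n_support[eps]} support vectors")
```

The exact counts depend on the data and on the other hyperparameters, but the smallest ϵ should keep far more support vectors than the largest.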


Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of 
Support Vector Regression (SVR)?

Kernel Function: Determines the feature space in which the regression function is linear, and therefore the shapes it can take in the input space.
Different kernel functions can model different types of relationships (e.g., linear, polynomial, RBF).

C Parameter: Controls the trade-off between a flat (simple) regression function and a low training error, which in turn affects generalization.
A smaller C value tolerates more points outside the ϵ-tube and gives a smoother fit, while a larger C value penalizes those deviations heavily and fits the training data more closely.

Epsilon (ϵ) Parameter: Defines the width of the tube within which errors are not penalized. Larger ϵ values create a wider tube, which typically results in fewer support vectors
and a simpler model.

Gamma Parameter: Only relevant for the RBF, polynomial, and sigmoid kernels. For the RBF kernel it defines how far the influence of a single training example reaches.
A low value means ‘far’ influence, and a high value means ‘close’ influence. Higher gamma values can lead to overfitting.
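
One way to see these effects is to vary a single parameter and compare training and test scores. The sketch below does this for gamma only, on an assumed synthetic sine dataset with arbitrarily chosen values; the kernel, C, and ϵ can be varied in the same way.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic 1-D regression data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

scores = {}
for gamma in [0.01, 1.0, 100.0]:
    model = SVR(kernel='rbf', C=10.0, epsilon=0.1, gamma=gamma)
    model.fit(X_train, y_train)
    # score() returns the coefficient of determination R^2
    scores[gamma] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"gamma={gamma}: train R^2={scores[gamma][0]:.3f}, "
          f"test R^2={scores[gamma][1]:.3f}")
```

A growing gap between the training and test scores as gamma increases would indicate overfitting; how pronounced it is depends on the noise level and on how densely the data covers the input range.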



Q5. Assignment
Steps:
Import the necessary libraries and load the dataset
Split the dataset into training and testing sets
Preprocess the data using any technique of your choice (e.g., scaling, normalization)
Create an instance of the SVC classifier and train it on the training data
Use the trained classifier to predict the labels of the testing data
Evaluate the performance of the classifier using any metric of your choice (e.g., accuracy, precision, recall, F1-score)
Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
Train the tuned classifier on the entire dataset
Save the trained classifier to a file for future use

Implementation:

# Import necessary libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score
import joblib

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Preprocess the data (scaling)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create an instance of the SVC classifier and train it on the training data
svc = SVC(kernel='linear', random_state=42)
svc.fit(X_train, y_train)

# Use the trained classifier to predict the labels of the testing data
y_pred = svc.predict(X_test)

# Evaluate the performance of the classifier
print("Initial model performance:")
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

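# Tune the hyperparameters of the SVC classifier using GridSearchCV
# (one sub-grid per kernel, so parameters that a kernel ignores are not searched)
param_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10, 100]},
    {'kernel': ['rbf'], 'C': [0.1, 1, 10, 100], 'gamma': ['scale', 'auto']},
    {'kernel': ['poly'], 'C': [0.1, 1, 10, 100], 'gamma': ['scale', 'auto'], 'degree': [2, 3, 4]},
]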

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

print("Best parameters found by GridSearchCV:")
print(grid_search.best_params_)

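# Train the tuned classifier on the full training set
# (GridSearchCV with the default refit=True has already done this, so the explicit fit
# is redundant; the test set is kept out so the evaluation below stays unbiased)
best_svc = grid_search.best_estimator_
best_svc.fit(X_train, y_train)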

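# Save the trained classifier to a file for future use
# (the fitted scaler is not stored in this file; new inputs must be scaled the same way before predicting)
joblib.dump(best_svc, 'best_svc_model.pkl')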

# Use the tuned classifier to predict the labels of the testing data
y_pred_tuned = best_svc.predict(X_test)

# Evaluate the performance of the tuned classifier
print("Tuned model performance:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_tuned):.2f}")
print(classification_report(y_test, y_pred_tuned))

# Plot decision boundaries (optional, for two features)
import matplotlib.pyplot as plt
from sklearn.base import clone

def plot_decision_boundary(clf, X, y):
    # We only take the first two features for easy visualization
    X = X[:, :2]
    # clf was trained on all four features and cannot predict on a 2-D mesh,
    # so fit an unfitted copy with the same hyperparameters on the two features
    clf_2d = clone(clf).fit(X, y)
    h = .02  # step size in the mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict a class for every mesh point, then reshape back to the grid
    Z = clf_2d.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.8)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o')
    plt.xlabel('Feature 1')
    plt.ylabel('Feature 2')
    plt.title('Decision Boundary')
    plt.show()

# Plot decision boundaries for the initial and tuned model settings (refit on two features)
plot_decision_boundary(svc, X_train, y_train)
plot_decision_boundary(best_svc, X_train, y_train)


In the above code:

We load the Iris dataset and split it into training and testing sets.
We preprocess the data using StandardScaler.
We create an SVC classifier instance, train it on the training data, and evaluate its initial performance.
We tune the hyperparameters using GridSearchCV to find the best model.
We train the tuned model on the entire training dataset and save it using joblib.
We evaluate the performance of the tuned model and plot the decision boundaries for visualization.
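
A saved classifier is only useful if new data is preprocessed the same way as the training data. One possible approach, sketched below under the assumption that a fixed linear SVC is sufficient for illustration, is to bundle the scaler and the classifier into a single scikit-learn pipeline before saving. The file name `svc_pipeline.pkl` is hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import joblib
from sklearn import datasets
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# Bundle scaling and classification so the saved file carries both steps
model = make_pipeline(StandardScaler(), SVC(kernel='linear', C=1.0))
model.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'svc_pipeline.pkl')  # hypothetical file name
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored pipeline accepts raw, unscaled features
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(f"Restored model reproduces predictions: {same_predictions}")
```

With this design the reloaded object can be called directly on raw feature values, because the scaling step travels with the classifier.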

