Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?

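Ans: A polynomial function can itself serve as a kernel function. A kernel computes the similarity (inner product) of two vectors in a higher-dimensional feature space without constructing that space explicitly, which helps classify data that is not linearly separable in the original space.
Polynomial kernel
A kernel of the form K(x, y) = (gamma * <x, y> + r)^d, where d is the polynomial degree, gamma is a scaling factor, and r is a constant offset (coef0 in Scikit-learn). When r > 0 it corresponds to an implicit mapping of the inputs onto all monomials up to degree d, which allows decision boundaries that are curved in the original space.
Relationship
The polynomial kernel is one member of the family of kernel functions. Because its value equals an inner product in a feature space spanned by polynomial terms of the inputs, a linear algorithm such as an SVM can learn a non-linear model.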
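
To make the "inner product in a feature space" idea concrete, here is a small sketch for 2-D inputs and a degree-2 kernel with gamma = 1 and r = 1. The helper phi and the two sample vectors are illustrative assumptions, not part of the original answer; polynomial_kernel is Scikit-learn's pairwise helper.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (gamma=1, coef0=1)."""
    x1, x2 = v
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])


# Two arbitrary example vectors (hypothetical values)
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Kernel value computed directly in the original 2-D space: (<x, y> + 1)^2
k_direct = (x @ y + 1.0) ** 2

# The same value as an ordinary inner product in the 6-D feature space
k_feature_space = phi(x) @ phi(y)

# Scikit-learn's helper evaluates the same kernel for whole matrices
k_sklearn = polynomial_kernel(
    x.reshape(1, -1), y.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(k_direct, k_feature_space, k_sklearn)
```

All three values should agree up to floating-point rounding, which is why an SVM never needs to build the higher-dimensional vectors explicitly.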





Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

Ans: Steps to implement an SVM with a polynomial kernel in Python using Scikit-learn:

1: Import the necessary libraries:
   svm from sklearn to access the SVM classifier.
   train_test_split from sklearn.model_selection to split your dataset into training and testing sets.
   accuracy_score from sklearn.metrics to evaluate the classifier's performance.
2: Load your dataset:
   Load the feature matrix X and the labels y with whatever loader fits your data.
3: Split the data:
   Use train_test_split to split the data into training and testing sets.
4: Create the SVM classifier:
   Use svm.SVC() and set kernel='poly' to specify the polynomial kernel.
   Adjust the degree parameter to control the polynomial degree (the default is 3).
5: Train the classifier:
   Use clf.fit() to train the classifier on the training data.
6: Make predictions:
   Use clf.predict() to make predictions on the test data.
7: Evaluate accuracy:
   Use accuracy_score to compare the predictions with the true test labels.
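
The steps above can be sketched as follows. The Iris dataset is used here only as a stand-in for "your dataset", and the split ratio, degree, and random_state are illustrative assumptions rather than recommended values.

```python
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Step 2: load a dataset (Iris is a placeholder for your own data)
X, y = load_iris(return_X_y=True)

# Step 3: hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Step 4: SVM classifier with a polynomial kernel of degree 3
clf = svm.SVC(kernel="poly", degree=3)

# Step 5: train on the training split
clf.fit(X_train, y_train)

# Step 6: predict labels for the test split
y_pred = clf.predict(X_test)

# Step 7: fraction of test samples classified correctly
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```

For features on very different scales, scaling the inputs (for example with StandardScaler) before fitting usually helps a polynomial kernel.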



Q3. How does increasing the value of epsilon affect the number of support vectors in SVM?
Ans:
Increasing the value of epsilon in support vector regression (SVR) generally reduces the number of support vectors. Epsilon sets the half-width of a tube around the regression function inside which errors are ignored. Only training points that lie on or outside the tube become support vectors, so a wider tube leaves fewer of them and gives a sparser, smoother model, while a very small epsilon turns almost every training point into a support vector.

Here's some more information about epsilon in SVM:
Epsilon-insensitive loss function
This loss function is used in support vector regression (SVR). Deviations smaller than epsilon cost nothing; larger deviations are penalized linearly by the amount they exceed epsilon (the "slack").

Epsilon-wide insensitivity tube
SVM regression looks for a function that is as flat as possible while keeping as many training points as possible within the epsilon-wide insensitivity tube.

Box constraint:
The constant C is the box constraint, which controls the penalty imposed on observations that lie outside the epsilon margin.
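
A quick way to see the effect of epsilon is to fit SVR on the same data with several tube widths and count the support vectors. The noisy sine data, the epsilon values, and C = 1.0 below are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine curve (illustrative data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

epsilons = [0.01, 0.1, 0.5]
n_support = []
for eps in epsilons:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support.append(len(model.support_))
    print(f"epsilon={eps}: {n_support[-1]} support vectors")
```

On data like this the count should drop as epsilon grows, because more points fall strictly inside the tube and stop contributing to the solution.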

Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?

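Ans: The performance of Support Vector Regression (SVR) is highly influenced by the choice of the kernel function and the values of the C, epsilon (ε), and gamma (γ) parameters. These parameters control the complexity of the model, how well it fits the data, and its generalization to unseen data.

Kernel function: determines the shape of the function that can be fitted. A linear kernel suits roughly linear relationships; polynomial and RBF kernels capture non-linear ones.
C: the penalty on points that fall outside the epsilon tube. Increase C when the model underfits; decrease it when the model overfits or the targets are noisy.
Epsilon (ε): the half-width of the tube within which errors are ignored. Increase it for noisy targets or when a sparser model with fewer support vectors is acceptable; decrease it when small deviations matter.
Gamma (γ): for the RBF, polynomial, and sigmoid kernels, controls how far the influence of a single training point reaches. A larger gamma gives a more local, wigglier fit that can overfit; a smaller gamma gives a smoother fit that can underfit. Gamma has no effect with a linear kernel.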
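
As a rough illustration of these effects, the sketch below fits SVR with a few hand-picked settings and reports train and test R² together with the number of support vectors. The synthetic data and every parameter value are assumptions chosen to exaggerate each effect, not tuned recommendations.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic non-linear regression problem: noisy sine curve (illustrative data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Each entry changes one aspect relative to the baseline
settings = {
    "baseline": dict(kernel="rbf", C=10.0, epsilon=0.1, gamma=1.0),
    "very small C": dict(kernel="rbf", C=0.001, epsilon=0.1, gamma=1.0),
    "large epsilon": dict(kernel="rbf", C=10.0, epsilon=0.5, gamma=1.0),
    "very large gamma": dict(kernel="rbf", C=10.0, epsilon=0.1, gamma=1000.0),
    "linear kernel": dict(kernel="linear", C=10.0, epsilon=0.1),
}

results = {}
for name, params in settings.items():
    model = SVR(**params).fit(X_train, y_train)
    results[name] = (
        model.score(X_train, y_train),  # R^2 on the training split
        model.score(X_test, y_test),    # R^2 on the held-out split
        len(model.support_),            # number of support vectors
    )
    train_r2, test_r2, n_sv = results[name]
    print(f"{name:>16}: train R2={train_r2:.3f}  test R2={test_r2:.3f}  SVs={n_sv}")
```

One would expect the very small C and the linear kernel to underfit the sine curve, and the large epsilon to use far fewer support vectors than the baseline. In practice these values are usually chosen by cross-validated search rather than by hand.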


Q5. Assignment:
- Import the necessary libraries and load the dataset
- Split the dataset into training and testing sets
- Preprocess the data using any technique of your choice (e.g. scaling, normalization)
- Create an instance of the SVC classifier and train it on the training data
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use.


In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, accuracy_score
import joblib


In [2]:
from sklearn.datasets import load_iris

# Load the dataset
data = load_iris()
X = data.data
y = data.target


In [3]:
# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)



In [4]:
scaler = StandardScaler()

# Fit the scaler on the training data
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


In [6]:
# Create and train the classifier
svc = SVC(kernel='rbf', random_state=42)
svc.fit(X_train, y_train)


In [8]:

# Predict the labels of the test set
y_pred = svc.predict(X_test)



In [9]:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Generate a classification report
print("Classification Report:")
print(classification_report(y_test, y_pred))


Accuracy: 1.00
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        13
           2       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45



In [10]:
# Define the hyperparameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'kernel': ['rbf', 'linear', 'poly']
}

# Create a GridSearchCV instance
grid_search = GridSearchCV(SVC(random_state=42), param_grid, cv=5, scoring='accuracy', verbose=1)

# Fit the GridSearchCV on the training data
grid_search.fit(X_train, y_train)

# Get the best parameters
best_params = grid_search.best_params_
print("Best Parameters:", best_params)


Fitting 5 folds for each of 48 candidates, totalling 240 fits
Best Parameters: {'C': 0.1, 'gamma': 1, 'kernel': 'poly'}


In [11]:
# Train the tuned classifier with the best parameters (fit here on the training split only)
best_svc = SVC(**best_params, random_state=42)
best_svc.fit(X_train, y_train)
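
The last two assignment steps (training on the entire dataset and saving the model) could be sketched as below. This is one possible approach, not necessarily the author's: it wraps the scaler and the classifier in a pipeline so the scaling is refit on all samples and saved together with the model. The parameter values are copied from the grid-search output above; the file name and the use of the temporary directory are hypothetical choices.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Best parameters as reported by the grid search above
best_params = {"C": 0.1, "gamma": 1, "kernel": "poly"}

# Scaler and classifier in one object, fitted on the entire dataset
final_model = make_pipeline(StandardScaler(), SVC(**best_params, random_state=42))
final_model.fit(X, y)

# Save the fitted pipeline (hypothetical file name and location)
model_path = os.path.join(tempfile.gettempdir(), "svc_iris_pipeline.joblib")
joblib.dump(final_model, model_path)

# Reload and confirm the saved model behaves like the original
reloaded = joblib.load(model_path)
same = np.array_equal(reloaded.predict(X), final_model.predict(X))
print("Reloaded model reproduces predictions:", same)
```

Saving the whole pipeline means new, unscaled samples can be passed straight to reloaded.predict() without repeating the preprocessing by hand.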
