**Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?**

Polynomial functions can be used as kernel functions in Support Vector Machines (SVM) to enable the algorithm to learn non-linear decision boundaries. The kernel trick allows SVM to operate in a high-dimensional space without explicitly transforming the data. The polynomial kernel computes the dot product of the input features raised to a certain degree, effectively allowing the SVM to fit a polynomial decision boundary in the original feature space.

**Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?**

In [1]:
# To implement an SVM using Scikit-learn, you can follow these steps:
# 1.Import Libraries:
import numpy as np  
import pandas as pd  
from sklearn import datasets  
from sklearn.model_selection import train_test_split  
from sklearn.svm import SVC

In [2]:
# 2. Load Dataset:
iris = datasets.load_iris()  
X = iris.data  
y = iris.target

In [3]:
# 3.Split the Dataset:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [4]:
# 4.Train the SVM Classifier:
model = SVC(kernel='linear')  # You can change the kernel to 'poly' for polynomial kernel  
model.fit(X_train, y_train)

In [6]:
# 5.Make Predictions:
y_pred = model.predict(X_test)

**Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?**

In Support Vector Regression (SVR), the epsilon parameter defines a margin of tolerance where no penalty is given to errors. Increasing epsilon results in a wider margin, which may lead to fewer support vectors and a simpler model. Conversely, a smaller epsilon allows more data points to be within the margin, potentially increasing the number of support vectors and making the model more complex.

**Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?**

1. Kernel Choice: The choice of kernel (e.g., linear, polynomial, RBF) affects the model's ability to capture the underlying data patterns. For example, the RBF kernel is effective for non-linear data.

2. Hyperparameter Tuning: Hyperparameters like C (regularization) and epsilon can be tuned using techniques like GridSearchCV or RandomizedSearchCV to find the optimal values that improve model performance.

**Q5. Assignment:**

**Implementation Steps**
1. Import Necessary Libraries and Load Dataset:

In [7]:
import pandas as pd  
from sklearn import datasets  
from sklearn.model_selection import train_test_split  
from sklearn.svm import SVR  
from sklearn.metrics import mean_squared_error  
from sklearn.model_selection import GridSearchCV

In [None]:
# 2.Preprocess the Data:
# Load a dataset (e.g., Boston housing dataset)  
boston = datasets.load_boston()  
X = boston.data  
y = boston.target  

# Split the dataset  
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [9]:
# 3.Train the SVR Model:
svr = SVR()  
svr.fit(X_train, y_train)


In [10]:
# 4.Evaluate the Classifier:

y_pred = svr.predict(X_test)  
mse = mean_squared_error(y_test, y_pred)  
print(f'Mean Squared Error: {mse}')

Mean Squared Error: 0.030618079775968716


In [11]:
# 5.Hyperparameter Tuning:
param_grid = {  
    'kernel': ['linear', 'poly', 'rbf'],  
    'C': [0.1, 1, 10],  
    'epsilon': [0.1, 0.2, 0.5]  
}  
grid_search = GridSearchCV(SVR(), param_grid, cv=5)  
grid_search.fit(X_train, y_train)  
print(f'Best parameters: {grid_search.best_params_}')

Best parameters: {'C': 10, 'epsilon': 0.1, 'kernel': 'rbf'}


In [12]:
# 6.Save the Model for Future Use
import joblib  
joblib.dump(grid_search.best_estimator_, 'svr_model.pkl')

['svr_model.pkl']