# Q1. A polynomial kernel is a kernel function built from a polynomial of the inner product of two inputs: K(x, y) = (gamma * <x, y> + coef0)^degree. Its value equals the inner product of the two inputs after a polynomial feature mapping into a higher-dimensional space, but the mapping itself is never computed (the kernel trick). In SVMs, this lets the algorithm learn a nonlinear decision boundary in the original input space by finding a separating hyperplane in the implicitly transformed space.
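#
# A minimal sketch of the Q1 relationship. The helper name `phi_degree2` and the
# sample vectors are illustrative assumptions, not part of the assignment.
# For 2-D inputs, the homogeneous degree-2 kernel K(a, b) = (a . b)^2 equals the
# ordinary inner product of the explicit feature maps
# phi(v) = [v1^2, sqrt(2)*v1*v2, v2^2], so the kernel never has to build phi.
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi_degree2(v):
    """Explicit feature map matching the kernel (a . b)^2 on 2-D inputs."""
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])


a_demo = np.array([1.0, 2.0])
b_demo = np.array([3.0, -1.0])

# Inner product computed in the explicit 3-D feature space
explicit_inner = phi_degree2(a_demo) @ phi_degree2(b_demo)

# Same quantity computed directly from the 2-D inputs via the kernel
kernel_value = polynomial_kernel(
    a_demo.reshape(1, -1), b_demo.reshape(1, -1), degree=2, gamma=1.0, coef0=0.0
)[0, 0]

print("Explicit inner product:", explicit_inner)
print("Kernel value:", kernel_value)
# The two printed values should agree up to floating-point rounding.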

# Q2. We can implement an SVM with a polynomial kernel in Python using Scikit-learn by setting the kernel parameter of the SVC class to 'poly'. The degree parameter sets the degree of the polynomial (default 3); gamma and coef0 control the scaling and the constant term of the kernel.

from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create an instance of the SVC classifier with a polynomial kernel
svm_poly = SVC(kernel='poly', degree=3)  # degree=3 is the default degree for polynomial kernel
svm_poly.fit(X_train, y_train)

# Use the trained classifier to predict the labels of the testing data
y_pred = svm_poly.predict(X_test)

# Evaluate the performance of the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Q3. Increasing the value of epsilon in Support Vector Regression (SVR) widens the epsilon-insensitive tube around the predicted function. Points that fall inside the tube incur no loss and are not support vectors, so a larger epsilon typically leaves fewer support vectors. Conversely, decreasing epsilon narrows the tube, so more points lie on or outside it and more support vectors are needed to define the predicted function.
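#
# A small illustration of the Q3 claim on synthetic data. The noisy sine data,
# the epsilon values, and all `*_eps` names are assumptions chosen for this sketch.
import numpy as np
from sklearn.svm import SVR

rng_eps = np.random.RandomState(0)
X_eps = np.sort(rng_eps.uniform(0, 5, size=(100, 1)), axis=0)
y_eps = np.sin(X_eps).ravel() + rng_eps.normal(scale=0.1, size=100)

# Fit the same SVR with increasing epsilon and count the support vectors
sv_counts = {}
for eps in [0.01, 0.1, 0.5]:
    svr_eps = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X_eps, y_eps)
    sv_counts[eps] = len(svr_eps.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
# On data like this, the count is expected to shrink as epsilon grows.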

# Q4. The performance of Support Vector Regression (SVR) is influenced by several parameters:
#    - Kernel function: Different kernel functions, such as linear, polynomial, and radial basis function (RBF), can capture different types of relationships in the data. The choice of kernel function determines the flexibility of the SVR model.
#    - C parameter: The regularization parameter C controls the trade-off between keeping the model smooth and minimizing the training error. Higher values of C penalize points outside the epsilon-tube more heavily, potentially leading to overfitting, while lower values of C tolerate larger deviations and may result in underfitting.
#    - Epsilon parameter: Epsilon (ε) sets the width of the tube around the predicted function inside which errors are not penalized. Larger values of epsilon give a wider tube that contains more data points and therefore usually fewer support vectors. Smaller values of epsilon give a narrower tube and usually more support vectors.
#    - Gamma parameter: For kernel functions like RBF, gamma (γ) defines how far the influence of a single training sample reaches. Higher values of gamma give a more complex regression function, potentially leading to overfitting, while lower values of gamma give a smoother function.
#
#    Example scenarios:
#    - Increase C: When the training data is clean and the model underfits, increasing C lets the model follow the targets more closely.
#    - Decrease C: When the dataset is large and noisy, reducing C can prevent overfitting by tolerating larger deviations from individual points.
#    - Increase epsilon: When the prediction task allows for some deviation from the target values, increasing epsilon ignores small errors and can lead to a sparser, more robust model.
#    - Adjust gamma: Higher gamma values may be suitable for datasets with rapidly varying relationships, while lower gamma values may be preferable when a smoother function is expected.
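#
# A hedged sketch of how the Q4 parameters can be compared empirically. The
# synthetic data, the C and gamma values, and the `*_q4` names are assumptions
# made for illustration only; real choices should come from cross-validation.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng_q4 = np.random.RandomState(0)
X_q4 = rng_q4.uniform(0, 5, size=(200, 1))
y_q4 = np.sin(X_q4).ravel() + rng_q4.normal(scale=0.2, size=200)
X_q4_tr, X_q4_te, y_q4_tr, y_q4_te = train_test_split(
    X_q4, y_q4, test_size=0.3, random_state=0
)

# Record (train R^2, test R^2) for each combination of C and gamma
q4_scores = {}
for C_val in [0.1, 100]:
    for gamma_val in [0.1, 100]:
        model_q4 = SVR(kernel='rbf', C=C_val, gamma=gamma_val, epsilon=0.1)
        model_q4.fit(X_q4_tr, y_q4_tr)
        q4_scores[(C_val, gamma_val)] = (
            model_q4.score(X_q4_tr, y_q4_tr),
            model_q4.score(X_q4_te, y_q4_te),
        )
        train_r2, test_r2 = q4_scores[(C_val, gamma_val)]
        print(f"C={C_val}, gamma={gamma_val}: train R^2={train_r2:.3f}, test R^2={test_r2:.3f}")
# A large gap between train and test R^2 suggests overfitting;
# low values on both suggest underfitting.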

# Q5. Assignment:

# Import necessary libraries
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import joblib

# Load the dataset
digits = load_digits()
X = digits.data
y = digits.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Preprocess the data (scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create an instance of the SVC classifier
svm = SVC()

# Tune hyperparameters using GridSearchCV
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
    'kernel': ['rbf', 'poly', 'sigmoid'],
}
grid_search = GridSearchCV(estimator=svm, param_grid=param_grid, cv=5)
grid_search.fit(X_train_scaled, y_train)

# Get the best parameters
best_params = grid_search.best_params_
print("Best parameters:", best_params)

# Retrain the tuned classifier on the full (scaled) training set.
# With the default refit=True, grid_search.best_estimator_ is an equivalent model.
best_svm = SVC(**best_params)
best_svm.fit(X_train_scaled, y_train)

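# Save the trained classifier to a file.
# Note: the fitted scaler is not saved, so new data must be scaled the same way before prediction.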
joblib.dump(best_svm, 'svm_classifier.pkl')

# Load the saved classifier
loaded_svm = joblib.load('svm_classifier.pkl')

# Evaluate the performance of the classifier
y_pred_test = loaded_svm.predict(X_test_scaled)
accuracy = accuracy_score(y_test, y_pred_test)
print("Accuracy:", accuracy)
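
# One possible refinement, sketched under assumptions: bundling the scaler and
# the classifier in a single Pipeline so that the saved file carries the
# preprocessing with it. The SVC settings below are placeholders (not the
# grid-search result), and the file name and `*_pipe` names are hypothetical.
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_pipe, y_pipe = load_digits(return_X_y=True)
X_pipe_tr, X_pipe_te, y_pipe_tr, y_pipe_te = train_test_split(
    X_pipe, y_pipe, test_size=0.3, random_state=42
)

# The pipeline scales the data and then fits the SVC in one step
svm_pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=10, gamma='scale'))
svm_pipeline.fit(X_pipe_tr, y_pipe_tr)

# Save and reload in a temporary directory so no existing file is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    pipeline_path = os.path.join(tmp_dir, 'svm_pipeline.pkl')
    joblib.dump(svm_pipeline, pipeline_path)
    loaded_pipeline = joblib.load(pipeline_path)

# The loaded pipeline accepts raw (unscaled) features directly
pipeline_accuracy = loaded_pipeline.score(X_pipe_te, y_pipe_te)
print("Pipeline accuracy:", pipeline_accuracy)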
