### Question 1

Polynomial functions can serve as kernel functions in machine learning algorithms such as Support Vector Machines (SVMs). A kernel function computes the similarity between two data points as their inner product in a (usually higher-dimensional) feature space, without constructing that space explicitly. The polynomial kernel, K(x, z) = (gamma * <x, z> + r)^d, corresponds to a feature space made up of monomials of the input features up to degree d, so a model that is linear in that space can capture nonlinear relationships in the original data.
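As a small numerical check of the inner-product view, the sketch below compares a degree-2 polynomial kernel (with gamma = 1 and r = 1) against an explicit dot product in the matching 6-dimensional feature space. The helper names `poly_kernel` and `phi` and the sample vectors are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel (x . z + coef0) ** degree, computed in the input space."""
    return (np.dot(x, z) + coef0) ** degree

def phi(x):
    """Explicit feature map for the degree-2 kernel with coef0 = 1 on 2-D inputs."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, z)        # kernel trick: no feature map needed
k_explicit = np.dot(phi(x), phi(z))   # same quantity via the 6-D feature space
print(k_implicit, k_explicit)
```

Both values agree, which is the point of the kernel trick: the kernel evaluates the feature-space inner product while only touching the original 2-D vectors.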

### Question 2

By passing "poly" as the kernel parameter in the SVC initialization (scikit-learn's name for the polynomial kernel is "poly", not "polynomial"):

SVC(kernel="poly")
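A minimal runnable sketch, assuming scikit-learn is installed. In scikit-learn the polynomial kernel is selected with the string `"poly"`, and `degree`, `gamma`, and `coef0` parameterize it. The toy dataset and the parameter values here are illustrative assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Toy, non-linearly separable data (illustrative only)
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# degree, gamma and coef0 parameterize the kernel (gamma * <x, x'> + coef0) ** degree
clf = SVC(kernel="poly", degree=3, gamma="scale", coef0=1.0)
clf.fit(X, y)
print(f"Training accuracy: {clf.score(X, y):.2f}")
```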

### Question 3

As you increase the value of epsilon, the number of support vectors generally decreases: more data points fall inside the wider ε-insensitive tube, where they incur no penalty and therefore do not become support vectors. In contrast, smaller values of epsilon make the tube narrower, so more points lie on or outside its boundary and the number of support vectors grows.
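The effect of epsilon on the support vector count can be checked empirically. The sketch below fits scikit-learn's `SVR` on synthetic noisy sine data with three epsilon values and prints the number of support vectors for each; the dataset, the epsilon grid, and the other settings are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data (illustrative only)
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Fit SVR with increasing epsilon and record the number of support vectors
counts = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {counts[eps]} support vectors")
```

Points strictly inside the tube have zero dual coefficients, so a wider tube leaves fewer points to act as support vectors.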

### Question 4

1. **Kernel Function**: The choice of kernel function (linear, poly, rbf, etc.) determines the type of nonlinearity that SVR can capture. Different kernels are suited to different types of data distributions. For example, the poly kernel is suitable for polynomial relationships, while the rbf kernel captures more complex, nonlinear patterns.

2. **C Parameter**: The C parameter controls the trade-off between the flatness (simplicity) of the regression function and how strongly deviations larger than epsilon are penalized. Smaller C values give stronger regularization and tolerate more training errors, while larger C values prioritize fitting the training data and may lead to overfitting. Adjusting C therefore shifts the balance between bias and variance.

3. **Epsilon Parameter**: The epsilon parameter defines the width of the ε-insensitive tube. Increasing epsilon allows more training points to lie within the tube without contributing to the loss function, which yields fewer support vectors and a simpler, coarser fit. Smaller epsilon values result in a narrower tube, more support vectors, and a fit that tracks the training data more closely.

4. **Gamma Parameter**: The gamma parameter sets how far the influence of a single training point reaches for the rbf, poly, and sigmoid kernels. Higher gamma values make the fitted function more sensitive to individual data points, potentially leading to overfitting. Lower gamma values result in smoother fitted functions.
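Because these four hyperparameters interact, they are usually tuned together rather than one at a time. The following is a minimal sketch, assuming scikit-learn's `SVR` and `GridSearchCV`; the synthetic data, the candidate values, and the scoring metric are illustrative assumptions rather than recommended settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (illustrative only)
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Scale inside the pipeline so each CV fold is standardized on its own training split
pipe = make_pipeline(StandardScaler(), SVR())

# make_pipeline names the SVR step "svr", hence the "svr__" prefixes
param_grid = {
    "svr__kernel": ["rbf", "poly"],
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": ["scale", 0.1, 1],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
```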

### Question 5

# Import necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
import joblib

# Load the dataset (e.g., the Iris dataset)
data = datasets.load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocess the data (scaling)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create an instance of the SVC classifier
classifier = SVC()

# Train the classifier on the training data
classifier.fit(X_train, y_train)

# Use the trained classifier to predict the labels of the testing data
y_pred = classifier.predict(X_test)

# Evaluate the performance of the classifier (accuracy)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Tune hyperparameters using GridSearchCV (example with C and gamma)
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Get the best hyperparameters
best_params = grid_search.best_params_
print(f"Best Hyperparameters: {best_params}")

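# Retrain the tuned classifier on the entire dataset. The hyperparameters were
# tuned on standardized features, so wrap the scaler and the SVC in one pipeline
# instead of fitting on the raw, unscaled X.
from sklearn.pipeline import make_pipeline
tuned_classifier = make_pipeline(StandardScaler(), SVC(**best_params))
tuned_classifier.fit(X, y)

# Save the fitted pipeline (scaler + classifier) to a file (e.g., 'svm_classifier.pkl') for future use
joblib.dump(tuned_classifier, 'svm_classifier.pkl')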


Accuracy: 1.00
Best Hyperparameters: {'C': 1, 'gamma': 0.1}


['svm_classifier.pkl']