# 7 April (Support Vector Machines)

Question 1 : What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

Answer 1
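Polynomial functions and kernel functions are related because a polynomial can itself serve as a kernel function. A kernel function computes the inner product of two data points in a (usually higher-dimensional) feature space without constructing that space explicitly; this is known as the kernel trick. The polynomial kernel, K(x, z) = (gamma * <x, z> + coef0)^degree in Scikit-learn's parameter names, corresponds to a feature space made of polynomial combinations of the original features up to the given degree, so data that are not linearly separable in the input space may become separable there. It is one of the common kernel functions used in Support Vector Machine (SVM) algorithms.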
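
To make the relationship concrete, the sketch below checks numerically that a degree-2 polynomial kernel on 2-D points gives the same value as an ordinary dot product after an explicit polynomial feature expansion. The helper `phi` and the sample points `x` and `z` are hypothetical names chosen for illustration; the kernel settings (`degree=2`, `gamma=1.0`, `coef0=1.0`) are assumptions, not values from the text above.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(point):
    """Explicit degree-2 feature map for a 2-D point (hypothetical helper)."""
    x1, x2 = point
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])


# Two arbitrary sample points (illustrative values)
x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Dot product in the expanded 6-dimensional feature space
explicit = phi(x) @ phi(z)

# Kernel evaluated directly in the original 2-D space
kernel_value = (x @ z + 1.0) ** 2

# The same kernel computed by scikit-learn
sklearn_value = polynomial_kernel(
    x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(explicit, kernel_value, sklearn_value)
```

All three values agree, which is why an SVM can work in the polynomial feature space while only ever evaluating the kernel on the original inputs.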

Question 2 : How can we implement an SVM with a polynomial kernel in Python using the Scikit-learn library?

Answer 2
We can implement an SVM with a polynomial kernel in Python using the Scikit-learn library by following these steps:

Import the necessary packages: from sklearn import datasets and from sklearn.svm import SVC

Load the data using the datasets.load_iris() method

Define the SVM model with the polynomial kernel using SVC(kernel='poly', degree=3, C=1.0)

Train the model with the data using the .fit() method

Predict the outcomes of the model using the .predict() method

Here is an example code snippet for implementing SVM with a polynomial kernel:

In [2]:
from sklearn import datasets
from sklearn.svm import SVC

# Load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Define the SVM model with the polynomial kernel
svm_poly = SVC(kernel='poly', degree=3, C=1.0)

# Train the model
svm_poly.fit(X, y)

# Predict the outcomes (on the training data, so this does not measure generalization)
y_pred = svm_poly.predict(X)
print(y_pred)

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1
 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]


Question 3 : How does increasing the value of epsilon affect the number of support vectors in SVR?

Answer 3
Epsilon is a hyperparameter of Support Vector Regression (SVR) that sets the half-width of the epsilon-insensitive tube around the predicted function: training points whose error is smaller than epsilon incur no penalty. Only points lying on or outside the tube become support vectors, so increasing epsilon widens the tube, leaves more points inside it, and generally decreases the number of support vectors. However, a very large epsilon can ignore most of the training signal and underfit, while a very small epsilon makes almost every point a support vector and can overfit noise. It is therefore important to choose epsilon based on the data and the problem at hand, for example relative to the noise level of the target.
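
The effect can be observed directly by counting support vectors for several epsilon values. The following is a minimal sketch on synthetic data: the noisy sine curve, the random seed, and the epsilon grid are all assumptions made for illustration, and exact counts will differ on other data.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise (illustrative data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

# Fit one SVR per epsilon value and record how many support vectors it keeps
support_counts = {}
for eps in [0.01, 0.1, 0.5, 1.0]:
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    support_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {support_counts[eps]} support vectors")
```

With the smallest epsilon nearly every point lies outside the tube and becomes a support vector, while the largest epsilon keeps only a few.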

Question 4 : How does the choice of kernel function, C parameter, and epsilon parameter affect the performance of SVR? Can you explain how each parameter works and provide examples of when you might increase or decrease the value?

Answer 4
Support Vector Regression (SVR) is a supervised learning algorithm used for regression tasks. The performance of SVR is heavily dependent on the choice of kernel function, C parameter, epsilon parameter, and gamma parameter. Here is a brief explanation of each parameter and how it affects the performance of SVR:

Kernel Function: The kernel function implicitly maps the data from the input space to a higher-dimensional feature space, in which a linear regression function can capture non-linear relationships in the original inputs. The choice of kernel determines the shape of the function that SVR can fit. Commonly used kernels in SVR are linear, polynomial, and radial basis function (RBF). A linear kernel suits roughly linear relationships, while RBF is a common choice when the relationship is non-linear and its form is unknown.

C Parameter: The C parameter controls the trade-off between keeping the regression function flat (simple) and penalizing training errors larger than epsilon. A smaller C regularizes more strongly and tolerates more deviations, giving a smoother function; decrease it when the data are noisy or the model overfits. A larger C penalizes deviations heavily, so the model follows the training data more closely; increase it when the model underfits and the data are reliable.

Epsilon Parameter: The epsilon parameter determines the width of the epsilon-insensitive zone around the predicted function. Any training point whose prediction error falls within this zone incurs no penalty. Increase epsilon when small errors are acceptable or the targets are noisy, which yields a sparser model with fewer support vectors; decrease it when high precision is required.

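Gamma Parameter: The gamma parameter is the kernel coefficient for the RBF, polynomial, and sigmoid kernels. For the RBF kernel it controls how far the influence of a single training point reaches. A small gamma gives each point broad influence and leads to a smoother function, while a large gamma makes the influence local and leads to a more complex function that can overfit. Decrease gamma when the model overfits and increase it when the model is too smooth to capture the pattern.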
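
In practice these parameters interact, so they are usually tuned together rather than one at a time. The sketch below is one way to do that with a cross-validated grid search; the synthetic sine data, the grid values, and the R² scoring choice are assumptions for illustration, not recommendations for any particular dataset.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear regression data (illustrative only)
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Scaling and SVR in one pipeline so scaling is re-fit inside each CV fold
pipeline = make_pipeline(StandardScaler(), SVR())

# Candidate values for kernel, C, epsilon, and gamma (gamma is ignored by the linear kernel)
param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": ["scale", 0.1, 1],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated R^2:", round(search.best_score_, 3))
```

Because the target here is a sine curve, the search should favour the RBF kernel over the linear one, which illustrates how the kernel choice limits what the other parameters can achieve.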

Question 5 : Assignment
Import necessary libraries and load the dataset

Split the dataset into training and testing set

Preprocess the data with any technique of your choice (e.g. scaling, normalization)

Create an instance of the SVC classifier and train it on the training data

Use trained Classifier to predict labels on testing data

Evaluate the performance of classifier using any metric of your choice

Tune the hyperparameters of the SVC using GridSearchCV or RandomizedSearchCV to improve the performance

Train the tuned classifier on the entire dataset

Save the trained classifier to a file for future use

Answer 5
Here is the implementation:

In [None]:
# Step 1: Import necessary libraries and load the dataset
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
import joblib

df = pd.read_csv('dataset.csv')

# Step 2: Split the dataset into training and testing sets
X = df.drop('target_variable', axis=1)
y = df['target_variable']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Preprocess the data using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Step 4: Create an instance of the SVC classifier and train it on the training data
svc = SVC()
svc.fit(X_train, y_train)

# Step 5: Use the trained classifier to predict the labels of the testing data
y_pred = svc.predict(X_test)

# Step 6: Evaluate the performance of the classifier using accuracy score
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy score: {accuracy}")

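# Step 7: Tune the hyperparameters of the SVC classifier using GridSearchCV
# (gamma is ignored by the linear kernel, so those combinations are redundant)
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)
grid.fit(X_train, y_train)
print(f"Best parameters: {grid.best_params_}")
print(f"Tuned accuracy score: {accuracy_score(y_test, grid.predict(X_test))}")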

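# Step 8: Train the tuned classifier on the entire dataset
# The classifier was tuned on scaled features, so the full dataset is scaled too
X_scaled = scaler.fit_transform(X)
svc_tuned = grid.best_estimator_
svc_tuned.fit(X_scaled, y)

# Step 9: Save the trained classifier to a file for future use
# The scaler is saved as well because new data must be scaled the same way
# ('scaler.pkl' is an illustrative file name)
joblib.dump(svc_tuned, 'svc_tuned.pkl')
joblib.dump(scaler, 'scaler.pkl')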
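
As a possible variation on the steps above, the scaler and the classifier can be bundled into a single Pipeline, so that tuning, refitting, and saving all handle the scaling consistently and only one file is needed. The sketch below is written under several assumptions: the iris dataset stands in for `dataset.csv`, the file name `svc_pipeline.joblib` is hypothetical, and the file is written to the system temporary directory.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Iris stands in for dataset.csv here (an assumption for illustration only)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling and classification in one estimator
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Same grid as in the assignment, addressed through the pipeline step name
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.1, 1, 10],
    "svc__kernel": ["linear", "rbf"],
}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

# Evaluate the tuned pipeline on the held-out test set
test_accuracy = grid.score(X_test, y_test)
print(f"Tuned test accuracy: {test_accuracy}")

# Refit the best pipeline (scaler included) on the entire dataset
final_model = grid.best_estimator_.fit(X, y)

# Save to a temporary location (hypothetical file name), then load it back
model_path = os.path.join(tempfile.gettempdir(), "svc_pipeline.joblib")
joblib.dump(final_model, model_path)
loaded_model = joblib.load(model_path)

# The reloaded pipeline accepts raw, unscaled features
same_predictions = bool((loaded_model.predict(X) == final_model.predict(X)).all())
print(f"Reloaded model matches: {same_predictions}")
```

The design choice here is that the reloaded object scales new data itself, which removes the risk of passing unscaled features to a classifier trained on scaled ones.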