# Support Vector Machines-2
Assignment Questions

Polynomial functions and kernel functions are closely related in machine learning algorithms, specifically in kernel-based methods like Support Vector Machines (SVMs).

A polynomial function can serve as the kernel function in an SVM. A kernel function takes a pair of data points and returns their inner product in a higher-dimensional feature space, without computing the mapping into that space explicitly; in that space the decision boundary between classes may be more easily defined. One such kernel is the polynomial kernel, which adds a constant to the dot product of two vectors and raises the result to a chosen power. The polynomial kernel can be written as:

K(x, y) = (x . y + c)^d

where x and y are input vectors, d is the degree of the polynomial, c is a constant, and "." represents the dot product.
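The link between the kernel formula and a higher-dimensional space can be checked numerically. The following is a minimal sketch in plain Python, assuming 2-D inputs, d = 2 and c = 1; the helper names `dot`, `poly_kernel` and `phi_degree2` are illustrative and not part of any library. It compares the kernel value with the dot product of explicitly mapped feature vectors.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    return (dot(x, y) + c) ** d

def phi_degree2(x, c=1.0):
    """Explicit feature map for d = 2 and 2-D inputs (illustrative helper)."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    rc = math.sqrt(2.0 * c)
    return [x1 * x1, x2 * x2, r2 * x1 * x2, rc * x1, rc * x2, c]

x = [1.0, 2.0]
y = [3.0, -1.0]

# Kernel evaluated directly in the original 2-D space
k_direct = poly_kernel(x, y, c=1.0, d=2)

# Same quantity as an ordinary dot product in the 6-D feature space
k_mapped = dot(phi_degree2(x), phi_degree2(y))

print(k_direct, k_mapped)
```

Both values agree up to floating-point rounding, which is why an SVM can work with the kernel alone and never needs to build the mapped vectors.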

The polynomial kernel implicitly transforms the input data into a higher-dimensional space, where a linear decision boundary may be more easily defined. The degree of the polynomial determines the complexity of the decision boundary: higher degrees give more flexible boundaries that can fit more complex data patterns, but they are also more prone to overfitting.

In summary, polynomial functions can be used as a kernel function in SVMs to transform the input data into a higher-dimensional space, where a linear decision boundary may be more easily defined. The degree of the polynomial determines the complexity of the decision boundary, and can be adjusted to fit the complexity of the data.

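To implement an SVM with a polynomial kernel in Python using Scikit-learn, we can use the SVC (Support Vector Classifier) class and specify the kernel parameter as 'poly'. We also need to specify the degree of the polynomial kernel using the degree parameter. Note that Scikit-learn computes the polynomial kernel as (gamma * x . y + coef0)^degree, so the constant c in the formula above corresponds to the coef0 parameter (default 0), and the dot product is additionally scaled by gamma.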

In [1]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# train an SVM with a polynomial kernel
poly_svm = SVC(kernel='poly', degree=3)
poly_svm.fit(X_train, y_train)

# predict the labels of the test set
y_pred = poly_svm.predict(X_test)

# calculate the accuracy of the model
accuracy = poly_svm.score(X_test, y_test)

print(f"Accuracy: {accuracy}")

Accuracy: 1.0


In this example, we load the Iris dataset, split it into training and testing sets, train an SVM with a polynomial kernel of degree 3, and predict the labels of the test set. Finally, we calculate the accuracy of the model on the test set and print it.

In Support Vector Regression (SVR), epsilon (ε) sets the half-width of the tube around the predicted values inside which errors are not penalized. Increasing the value of epsilon generally decreases the number of support vectors: training points that fall strictly inside the tube contribute nothing to the loss and are not support vectors, and a wider tube contains more of them.

To be more specific, when the value of epsilon is small, the tube is narrow, most points lie on or outside its boundary, and the model keeps many support vectors. Conversely, when the value of epsilon is large, the tube is wide, most points lie inside it, and only the few points on or beyond the boundary remain as support vectors.

However, a very large epsilon can make the model too coarse and lead to underfitting, because deviations smaller than epsilon are ignored. A very small epsilon makes the model track noise and retain nearly every training point as a support vector, which increases model complexity and prediction cost. Therefore, it is important to choose an appropriate value of epsilon based on the specific dataset and modeling goals.
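The effect of epsilon on the number of support vectors can be checked empirically. The sketch below is only an illustration under stated assumptions: the noisy sine data is synthetic, and the choices of an RBF kernel, C=1.0 and the three epsilon values are arbitrary. It fits Scikit-learn's SVR once per epsilon and counts the support vectors through the fitted model's support_ attribute.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed for illustration): noisy sine wave
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Fit one SVR per epsilon and record how many training points are support vectors
sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors out of {len(X)}")
```

Points strictly inside the epsilon-tube never become support vectors, so the count for the narrowest tube should be the largest of the three.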

The choice of kernel function, C parameter, epsilon parameter, and gamma parameter can have a significant impact on the performance of Support Vector Regression (SVR).

1. Kernel function: The kernel function determines how the input data is implicitly mapped into a feature space in which a linear model is then fitted. Some commonly used kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.

- Linear kernel function: This kernel function is suited to targets that depend approximately linearly on the features. It is the plain dot product of the inputs and does not map the data to a higher-dimensional space.

- Polynomial kernel function: This kernel function is used for non-linear relationships. It corresponds to a feature space made of polynomial terms of the inputs. The degree of the polynomial can be specified using the degree parameter.

- RBF kernel function: This kernel function is used for non-linear relationships. It measures the similarity of two points with a Gaussian function of the distance between them. The width of the Gaussian function can be specified using the gamma parameter.

- Sigmoid kernel function: This kernel function is used for non-linear relationships. It applies a tanh function to the scaled dot product of the inputs. The gamma and coef0 parameters control the slope and offset of the sigmoid function.

2. C parameter: The C parameter controls the trade-off between keeping the model flat (strong regularization) and penalizing training points that fall outside the epsilon-tube. A small C value tolerates more such deviations, which gives a smoother model that may generalize better to unseen data. Conversely, a large C value penalizes deviations heavily, which may achieve a lower training error but may overfit to the training data and not generalize well to new data.

3. Epsilon parameter: The epsilon parameter determines the width of the epsilon-tube around the predicted regression function within which errors incur no penalty. A larger epsilon value ignores larger deviations, leading to a coarser fit with fewer support vectors, while a smaller epsilon value leads to a tighter fit with more support vectors.

4. Gamma parameter: The gamma parameter determines the width of the RBF kernel function. A small gamma value creates a wide bell-shaped curve, meaning that each instance's range of influence is broad, leading to a smooth and more generalized regression function. In contrast, a large gamma value creates a narrower bell-shaped curve, meaning that each instance's range of influence is more localized, leading to a more complex function that is prone to overfitting.
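How gamma changes an instance's range of influence can be seen from the RBF kernel value itself. The following is a minimal plain-Python sketch; the function name `rbf_kernel`, the two points and the gamma values are assumptions chosen for illustration. It uses the same definition as Scikit-learn, K(x, y) = exp(-gamma * ||x - y||^2).

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two fixed points whose squared Euclidean distance is 2
x = [0.0, 0.0]
y = [1.0, 1.0]

# Similarity of the same pair of points under increasing gamma
similarities = {g: rbf_kernel(x, y, g) for g in (0.01, 0.1, 1.0, 10.0)}
for g, k in similarities.items():
    print(f"gamma={g}: K(x, y) = {k:.6f}")
```

With a small gamma the two points still look similar to each other, so each training point influences a broad region; with a large gamma the similarity drops to almost zero at the same distance, so each point only influences its immediate neighborhood.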

Examples:

- If we have a large dataset with little noise, we can afford a larger C value, because penalizing deviations more heavily is less likely to cause overfitting; note that a larger C usually also increases training time.

- If the dataset has high dimensionality, we may want to use a polynomial kernel function with a low degree value, as high degrees could lead to overfitting.

- If we have a relatively small dataset with noise, we may want to use a larger epsilon value to allow more errors in the training data and prevent overfitting.

- If we suspect that our data is non-linearly separable, we may want to try the RBF kernel function with different values of gamma to see which one leads to better performance.
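In practice these parameters are usually tuned together rather than one at a time. The sketch below shows one possible way to do that for SVR with Scikit-learn, assuming synthetic noisy sine data and an arbitrary small grid; the value ranges are illustrative, not recommendations. The scaler is placed inside a pipeline so that it is refit on each cross-validation training fold.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (assumed for illustration)
rng = np.random.default_rng(42)
X = rng.uniform(-3.0, 3.0, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# make_pipeline names the SVR step "svr", hence the "svr__" parameter prefix
pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.1, 1, 10],
}

# 5-fold cross-validated search, scored by (negated) mean squared error
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated MSE:", -search.best_score_)
```

The selected combination depends on the data, so the printed values are only meaningful for this synthetic example.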

In [5]:
# Import the necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
import joblib

# Load the dataset (scikit-learn's built-in copy of the breast cancer data)
data = load_breast_cancer()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Preprocess the data using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create an instance of the SVC classifier and train it on the training data
clf = SVC()
clf.fit(X_train, y_train)

# Use the trained classifier to predict the labels of the testing data
y_pred = clf.predict(X_test)

# Evaluate the performance of the classifier using accuracy
acc = accuracy_score(y_test, y_pred)
print('Accuracy:', acc)

# Tune the hyperparameters of the SVC classifier using GridSearchCV
param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': [0.1, 1, 10, 100],
              'kernel': ['linear', 'rbf', 'poly']}
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)
grid.fit(X_train, y_train)
print('Best parameters:', grid.best_params_)
print('Best score:', grid.best_score_)

# Retrain with the best hyperparameters on the entire dataset.
# The scaler is part of the pipeline so that the saved model applies the same
# preprocessing that was used during tuning.
from sklearn.pipeline import make_pipeline
clf_tuned = make_pipeline(StandardScaler(), SVC(**grid.best_params_))
clf_tuned.fit(X, y)

# Save the trained pipeline to a file for future use
joblib.dump(clf_tuned, 'svm_classifier.joblib')

Accuracy: 0.9766081871345029
Fitting 5 folds for each of 48 candidates, totalling 240 fits
[CV 1/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.975 total time=   0.0s
[CV 2/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.975 total time=   0.0s
[CV 3/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.988 total time=   0.0s
[CV 4/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.962 total time=   0.0s
[CV 1/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.938 total time=   0.0s
[CV 2/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.925 total time=   0.0s
[CV 4/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.937 total time=   0.0s
[CV 5/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.937 total time=   0.0s
[CV 1/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.912 total time=   0.0s
[CV 2/5] END .....

[CV 1/5] END .......C=1, gamma=100, kernel=poly;, score=0.963 total time=   0.0s
[CV 2/5] END .......C=1, gamma=100, kernel=poly;, score=0.963 total time=   0.0s
[CV 3/5] END .......C=1, gamma=100, kernel=poly;, score=0.950 total time=   0.0s
[CV 4/5] END .......C=1, gamma=100, kernel=poly;, score=0.949 total time=   0.0s
[CV 5/5] END .......C=1, gamma=100, kernel=poly;, score=0.886 total time=   0.0s
[CV 1/5] END ....C=10, gamma=0.1, kernel=linear;, score=0.975 total time=   0.0s
[CV 2/5] END ....C=10, gamma=0.1, kernel=linear;, score=0.950 total time=   0.0s
[CV 3/5] END ....C=10, gamma=0.1, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END ....C=10, gamma=0.1, kernel=linear;, score=0.975 total time=   0.0s
[CV 5/5] END ....C=10, gamma=0.1, kernel=linear;, score=0.937 total time=   0.0s
[CV 1/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.975 total time=   0.0s
[CV 2/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.963 total time=   0.0s
[CV 3/5] END .......C=10, ga

[CV 2/5] END ...C=100, gamma=100, kernel=linear;, score=0.912 total time=   0.0s
[CV 3/5] END ...C=100, gamma=100, kernel=linear;, score=1.000 total time=   0.0s
[CV 4/5] END ...C=100, gamma=100, kernel=linear;, score=0.962 total time=   0.0s
[CV 5/5] END ...C=100, gamma=100, kernel=linear;, score=0.949 total time=   0.0s
[CV 1/5] END ......C=100, gamma=100, kernel=rbf;, score=0.625 total time=   0.0s
[CV 2/5] END ......C=100, gamma=100, kernel=rbf;, score=0.625 total time=   0.0s
[CV 3/5] END ......C=100, gamma=100, kernel=rbf;, score=0.625 total time=   0.0s
[CV 4/5] END ......C=100, gamma=100, kernel=rbf;, score=0.633 total time=   0.0s
[CV 5/5] END ......C=100, gamma=100, kernel=rbf;, score=0.620 total time=   0.0s
[CV 1/5] END .....C=100, gamma=100, kernel=poly;, score=0.963 total time=   0.0s
[CV 2/5] END .....C=100, gamma=100, kernel=poly;, score=0.963 total time=   0.0s
[CV 3/5] END .....C=100, gamma=100, kernel=poly;, score=0.950 total time=   0.0s
[CV 4/5] END .....C=100, gam

['svm_classifier.joblib']