Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

Polynomial functions and kernel functions play different roles in machine learning. A polynomial function is a model family that can be fit to data directly (as in polynomial regression), while a kernel function measures the similarity between two data points. Kernels let algorithms such as support vector machines (SVMs) work in a transformed feature space without computing that space explicitly.

In SVMs, the choice of kernel determines the implicit higher-dimensional feature space in which a linear decision boundary separates the classes. The polynomial kernel is where the two ideas meet. In Scikit-learn's parameterization it computes K(x, y) = (gamma * <x, y> + coef0)^degree, which equals an inner product in a feature space spanned by the monomials of the input features up to the chosen degree (exactly that degree when coef0 = 0).
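
To make the link between the polynomial kernel and its feature space concrete, here is a minimal sketch for 2-D inputs with degree 2. The helper names `poly_kernel` and `phi` and the sample vectors are illustrative assumptions, not part of any library. It checks that the kernel value computed in the original space matches the inner product of explicitly constructed degree-2 features.

```python
import numpy as np

def poly_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * np.dot(x, y) + coef0) ** degree

def phi(x):
    """Explicit feature map for 2-D input matching degree=2, gamma=1, coef0=1."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

# Two arbitrary sample points (hypothetical values, chosen only for illustration)
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, y)        # computed in the original 2-D space
k_explicit = np.dot(phi(x), phi(y))   # computed in the explicit 6-D feature space
print(k_implicit, k_explicit)
```

Both values equal 4 up to floating-point rounding, which is the kernel trick in miniature: the SVM only ever needs `k_implicit`, never `phi`.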



Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

To implement an SVM with a polynomial kernel in Python using Scikit-learn, use the SVC class with kernel='poly'. The degree parameter (default 3) sets the polynomial degree. For example:

In [4]:
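from sklearn import svm

# Two training points, one per class
X = [[0, 0], [1, 1]]
y = [0, 1]

# Polynomial-kernel SVM (degree defaults to 3)
clf = svm.SVC(kernel='poly')
clf.fit(X, y)

# Predict labels for two unseen points
X_test = [[0, 1], [1, 0]]
clf.predict(X_test)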

array([0, 0])
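
The two-point example is too small to show what the polynomial kernel buys you. As a hedged sketch on a slightly larger problem, the following uses the synthetic `make_circles` dataset (an assumption for illustration, not part of the original answer), where the classes are separated by a circle rather than a line. The values of `degree`, `coef0`, and `C` are illustrative choices, not tuned settings.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data: an inner circle surrounded by an outer ring
X, y = make_circles(n_samples=300, factor=0.5, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# degree=2 with coef0=1 gives the kernel squared, linear, and constant terms
clf = SVC(kernel='poly', degree=2, coef0=1.0, C=1.0)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
print(f"support vectors per class: {clf.n_support_}")
```

A degree-2 kernel suits this data because a circular boundary is a quadratic function of the two input features.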

Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

The value of epsilon in support vector regression (SVR) sets the half-width of the tube around the regression function within which errors are ignored. Only training points that lie on or outside this tube become support vectors. As epsilon increases, the tube widens, more points fall inside it, and the number of support vectors typically decreases, giving a sparser but coarser model. The exact effect depends on the data and on the other SVR parameters, such as C and the kernel.
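
A small experiment makes the effect of epsilon concrete. This sketch fits an RBF-kernel SVR to noisy sine data (synthetic data and parameter values are assumptions chosen for illustration) and counts the support vectors for several epsilon values. On data like this the count typically falls as epsilon grows, because points inside the tube do not become support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Noisy samples of sin(x) on [0, 5]
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

epsilons = [0.01, 0.1, 0.5, 1.0]
sv_counts = []
for eps in epsilons:
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    sv_counts.append(len(model.support_))
    print(f"epsilon={eps}: {sv_counts[-1]} support vectors")
```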

Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?

The kernel function, C, epsilon, and gamma all affect the performance of support vector regression (SVR). The kernel determines the type of nonlinearity the regression function can express. C controls the trade-off between keeping the function flat (small C, strong regularization) and penalizing training points that fall outside the epsilon tube (large C, weak regularization). Epsilon sets the width of the tube around the regression function within which errors are ignored. Gamma, which is used by the RBF, polynomial, and sigmoid kernels, sets how far the influence of a single training point reaches.

If the relationship is strongly nonlinear, a flexible kernel such as the radial basis function (RBF) kernel is usually a better choice than a linear one. Increasing C lets the model fit the training data more closely, which helps when it underfits but raises the risk of overfitting; decreasing C helps when the data is noisy. Increasing epsilon tolerates larger errors and yields fewer support vectors, which suits noisy targets, but too large a value gives a coarse fit; decreasing it helps when small errors matter. Increasing gamma makes the RBF kernel more peaked, so the model can follow fine local structure but becomes more sensitive to noise; decreasing it gives a smoother function. The optimal values depend on the problem and data set and are usually found through cross-validated search.
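
As one way to carry out that tuning, the sketch below runs a small cross-validated grid search over C, epsilon, and gamma for an RBF-kernel SVR. The synthetic sine data and the grid values are assumptions chosen only to keep the example fast. On a real problem the grid would be chosen to match the scale of the features and the target.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Noisy samples of sin(x) on [0, 5]
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Illustrative grid: 3 x 3 x 3 = 27 candidates, each scored with 5-fold CV
param_grid = {
    'C': [0.1, 1, 10],
    'epsilon': [0.01, 0.1, 0.5],
    'gamma': [0.1, 1, 10],
}
search = GridSearchCV(SVR(kernel='rbf'), param_grid=param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(f"best cross-validated R^2: {search.best_score_:.3f}")
```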

Q5. Assignment:

- Import the necessary libraries and load the dataset
- Split the dataset into training and testing sets
- Preprocess the data using any technique of your choice (e.g. scaling, normalization)
- Create an instance of the SVC classifier and train it on the training data
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use.

You can use any dataset of your choice for this assignment, but make sure it is suitable for classification and has a sufficient number of features and samples.

In [7]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

In [8]:
from sklearn.datasets import load_breast_cancer

In [9]:
data = load_breast_cancer()

In [10]:
print(data.DESCR)

.. _breast_cancer_dataset:

Breast cancer wisconsin (diagnostic) dataset
--------------------------------------------

**Data Set Characteristics:**

    :Number of Instances: 569

    :Number of Attributes: 30 numeric, predictive attributes and the class

    :Attribute Information:
        - radius (mean of distances from center to points on the perimeter)
        - texture (standard deviation of gray-scale values)
        - perimeter
        - area
        - smoothness (local variation in radius lengths)
        - compactness (perimeter^2 / area - 1.0)
        - concavity (severity of concave portions of the contour)
        - concave points (number of concave portions of the contour)
        - symmetry
        - fractal dimension ("coastline approximation" - 1)

        The mean, standard error, and "worst" or largest (mean of the three
        worst/largest values) of these features were computed for each image,
        resulting in 30 features.  For instance, field 0 is Mean Radi

In [12]:
X = pd.DataFrame(data.data,columns=data.feature_names)
y = data.target
X.head()

Unnamed: 0,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


In [13]:
from sklearn.preprocessing import MinMaxScaler
scale = MinMaxScaler()

In [14]:
X = scale.fit_transform(X)

In [15]:
X = pd.DataFrame(X,columns=data.feature_names)

In [16]:
X.head()

Unnamed: 0,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,0.521037,0.022658,0.545989,0.363733,0.593753,0.792037,0.70314,0.731113,0.686364,0.605518,...,0.620776,0.141525,0.66831,0.450698,0.601136,0.619292,0.56861,0.912027,0.598462,0.418864
1,0.643144,0.272574,0.615783,0.501591,0.28988,0.181768,0.203608,0.348757,0.379798,0.141323,...,0.606901,0.303571,0.539818,0.435214,0.347553,0.154563,0.192971,0.639175,0.23359,0.222878
2,0.601496,0.39026,0.595743,0.449417,0.514309,0.431017,0.462512,0.635686,0.509596,0.211247,...,0.556386,0.360075,0.508442,0.374508,0.48359,0.385375,0.359744,0.835052,0.403706,0.213433
3,0.21009,0.360839,0.233501,0.102906,0.811321,0.811361,0.565604,0.522863,0.776263,1.0,...,0.24831,0.385928,0.241347,0.094008,0.915472,0.814012,0.548642,0.88488,1.0,0.773711
4,0.629893,0.156578,0.630986,0.48929,0.430351,0.347893,0.463918,0.51839,0.378283,0.186816,...,0.519744,0.123934,0.506948,0.341575,0.437364,0.172415,0.319489,0.558419,0.1575,0.142595


In [17]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0,
       1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0,
       1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,
       1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
       0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
       1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,
       0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0,
       1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
       1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0,

In [18]:
## Split into training and test set
from sklearn.model_selection import train_test_split

In [19]:
X_train, X_test,y_train,y_test = train_test_split(X,y,random_state=42,test_size=0.33)
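
Note that the scaler in this notebook was fit on the full dataset before the split, so the test rows influenced the scaling ranges. A leakage-free alternative, sketched here as an assumption about how one might restructure the steps rather than as the notebook's method, splits the raw data first and wraps the scaler and classifier in a Pipeline. The variable names (`X_raw`, `pipe`, and so on) are hypothetical and chosen so they do not overwrite the notebook's variables.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Split the raw, unscaled data first
X_raw, y_raw = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, random_state=42, test_size=0.33
)

# The pipeline fits MinMaxScaler on the training rows only,
# then applies the same scaling to the test rows at predict time
pipe = make_pipeline(MinMaxScaler(), SVC())
pipe.fit(X_tr, y_tr)

pipe_accuracy = pipe.score(X_te, y_te)
print(f"test accuracy: {pipe_accuracy:.3f}")
```

A pipeline like this can also be passed directly to GridSearchCV, in which case the scaler is refit inside every cross-validation fold.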

In [20]:
## SVC classifier
from sklearn.svm import SVC

In [21]:
clf = SVC()

In [22]:
clf.fit(X_train,y_train)

In [23]:
y_pred = clf.predict(X_test)

In [24]:
y_pred

array([1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1,
       0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
       0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0,
       1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1,
       0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0,
       1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1,
       1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
       0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1])

In [25]:
# Evaluate the performance of the classifier using any metric of your choice
# (e.g. accuracy, precision, recall, F1-score)
from sklearn.metrics import accuracy_score,confusion_matrix,classification_report

In [27]:
print(accuracy_score(y_test,y_pred))
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

0.9787234042553191
[[ 65   2]
 [  2 119]]
              precision    recall  f1-score   support

           0       0.97      0.97      0.97        67
           1       0.98      0.98      0.98       121

    accuracy                           0.98       188
   macro avg       0.98      0.98      0.98       188
weighted avg       0.98      0.98      0.98       188
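
To see where the numbers in the report come from, the following sketch recomputes them by hand from the confusion matrix printed above. It assumes Scikit-learn's convention that rows are true labels and columns are predicted labels, and treats class 1 as the positive class.

```python
import numpy as np

# Confusion matrix from the output above: rows = true labels, columns = predictions
cm = np.array([[65, 2],
               [2, 119]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()      # (119 + 65) / 188
precision_1 = tp / (tp + fp)         # 119 / 121
recall_1 = tp / (tp + fn)            # 119 / 121
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={accuracy:.4f} precision={precision_1:.2f} "
      f"recall={recall_1:.2f} f1={f1_1:.2f}")
```

The results agree with the printed accuracy of 0.9787 and the class-1 row of the classification report (0.98 for precision, recall, and F1).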



In [28]:
# Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to
# improve its performance
from sklearn.model_selection import GridSearchCV

In [29]:
parameters = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.1, 1, 10, 100],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid']
}

In [30]:
gridcv = GridSearchCV(SVC(),param_grid=parameters,cv=5,verbose=3)

In [31]:
# Run the grid search on the training set; with the default refit=True,
# the best parameter combination is then refit on all of X_train
gridcv.fit(X_train,y_train)

Fitting 5 folds for each of 64 candidates, totalling 320 fits
[CV 1/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.961 total time=   0.0s
[CV 2/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.961 total time=   0.0s
[CV 3/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.961 total time=   0.0s
[CV 4/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.934 total time=   0.0s
[CV 5/5] END ...C=0.1, gamma=0.1, kernel=linear;, score=0.934 total time=   0.0s
[CV 1/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.649 total time=   0.0s
[CV 2/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.684 total time=   0.0s
[CV 3/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.632 total time=   0.0s
[CV 4/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.632 total time=   0.0s
[CV 5/5] END .....C=0.1, gamma=0.1, kernel=poly;, score=0.671 total time=   0.0s
[CV 1/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.909 total time=   0.0s
[CV 2/5] END ......C=0.1, gamma=0.1, kernel=rbf

In [32]:
gridcv.best_params_

{'C': 1, 'gamma': 0.1, 'kernel': 'linear'}
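
SVC uses gamma only for the 'rbf', 'poly', and 'sigmoid' kernels, so the gamma=0.1 reported alongside the linear kernel has no effect on the selected model, and the grid above fits every linear candidate four times over. One possible refinement, shown here as a sketch rather than as the notebook's method, is to pass a list of dictionaries so that gamma is varied only where it matters. The name `split_grid` is a hypothetical choice for illustration.

```python
from sklearn.model_selection import ParameterGrid

# The single-dictionary grid used in the notebook
full_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.1, 1, 10, 100],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
}

# Same search space, but gamma is only varied for kernels that use it
split_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10, 100]},
    {'kernel': ['poly', 'rbf', 'sigmoid'],
     'C': [0.1, 1, 10, 100],
     'gamma': [0.1, 1, 10, 100]},
]

n_full = len(ParameterGrid(full_grid))
n_split = len(ParameterGrid(split_grid))
print(n_full, n_split)
```

This prints 64 and 52: the split grid drops the 12 redundant linear candidates. GridSearchCV accepts the list form directly through its param_grid argument.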

In [33]:
gridcv.best_score_

0.9737183868762816

In [34]:
gridcv.best_estimator_

In [35]:
y_pred = gridcv.predict(X_test)

In [36]:
print(accuracy_score(y_test,y_pred))
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

0.9840425531914894
[[ 65   2]
 [  1 120]]
              precision    recall  f1-score   support

           0       0.98      0.97      0.98        67
           1       0.98      0.99      0.99       121

    accuracy                           0.98       188
   macro avg       0.98      0.98      0.98       188
weighted avg       0.98      0.98      0.98       188



In [37]:
# Save the trained classifier to a file for future use.
import pickle

# Pickle the fitted grid search (it predicts with its best estimator)
with open('svm_model.pkl', 'wb') as f:
    pickle.dump(gridcv, f)

# Pickle the fitted min-max scaler so new data can be scaled the same way
with open('Scaler.pkl', 'wb') as f:
    pickle.dump(scale, f)
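
For completeness, here is a hedged sketch of the reverse step: loading both pickles and predicting on new data. To stay runnable on its own it trains small stand-in objects on random data and writes to a temporary directory. In the notebook, the stand-ins `model` and `scaler` would correspond to `gridcv` and `scale`. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Stand-in objects so the sketch runs on its own (hypothetical data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)
scaler = MinMaxScaler().fit(X_demo)
model = SVC().fit(scaler.transform(X_demo), y_demo)

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, 'svm_model.pkl')
    scaler_path = os.path.join(tmp, 'Scaler.pkl')

    # Save both fitted objects
    with open(model_path, 'wb') as f:
        pickle.dump(model, f)
    with open(scaler_path, 'wb') as f:
        pickle.dump(scaler, f)

    # Later, for example in another session: load them back
    with open(model_path, 'rb') as f:
        loaded_model = pickle.load(f)
    with open(scaler_path, 'rb') as f:
        loaded_scaler = pickle.load(f)

# New raw data must go through the saved scaler before prediction
X_new = rng.normal(size=(5, 3))
preds_loaded = loaded_model.predict(loaded_scaler.transform(X_new))
preds_original = model.predict(scaler.transform(X_new))
print(preds_loaded)
```

Saving the scaler next to the model matters because the classifier was trained on scaled features, so raw inputs would be on the wrong scale.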