In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

<h3>Kernels</h3>

In N dimensions, an (N-1)-dimensional hyperplane cannot always separate data that is not linearly separable. The linear kernel used previously works when the classes are linearly separable. For non-linear boundaries, a kernel implicitly maps the data into a higher-dimensional feature space, where a linear boundary between the classes may exist.
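
To make the idea of an implicit mapping concrete, here is a minimal sketch for a degree-2 polynomial kernel on 2-D inputs. The names `phi` and `poly2_kernel` and the two sample points are illustrative assumptions, not part of the notebook. The kernel value computed in the original space should match the dot product taken after an explicit map into 3-D.

```python
import numpy as np

def phi(v):
    """Explicit (hypothetical) feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(a, b):
    """Kernel value computed directly in the original 2-D space: (a . b)^2."""
    return np.dot(a, b) ** 2

# Two arbitrary sample points, chosen only for illustration
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = np.dot(phi(a), phi(b))  # dot product in the 3-D feature space
implicit = poly2_kernel(a, b)      # same quantity without ever building phi

print(explicit, implicit)
print(np.isclose(explicit, implicit))
```

The SVM only ever needs the kernel values, so the higher-dimensional vectors never have to be constructed.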

<h3>Importing the Dataset</h3>

In [2]:
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

colnames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

irisdata = pd.read_csv(url, names=colnames)

In [3]:
X = irisdata.drop('Class', axis=1)
y = irisdata['Class']

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

In [8]:
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='linear', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

In [9]:
y_pred = svclassifier.predict(X_test)

In [10]:
print(confusion_matrix(y_test, y_pred))

[[ 7  0  0]
 [ 0 12  1]
 [ 0  0 10]]


<h3>Polynomial Kernel</h3>

We can try a polynomial kernel as well.
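
For reference, scikit-learn's polynomial kernel has the form below, where $\gamma$ is `gamma`, $r$ is `coef0` and $d$ is `degree`. With `gamma='auto'`, $\gamma = 1/n_{\text{features}}$, which is $1/4$ for the four iris features. The second line is only this general form evaluated for the settings used in the next cell ($d = 4$, $r = 0$).

```latex
K(x, x') = \left( \gamma \, \langle x, x' \rangle + r \right)^{d}
\qquad\Longrightarrow\qquad
K(x, x') = \left( \tfrac{1}{4} \, \langle x, x' \rangle \right)^{4}
```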

In [46]:
svclassifier = SVC(kernel='poly', degree=4, gamma='auto')
svclassifier.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=4, gamma='auto', kernel='poly',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

In [47]:
y_pred = svclassifier.predict(X_test)

In [48]:
print(confusion_matrix(y_test, y_pred))

[[ 7  0  0]
 [ 0 12  1]
 [ 0  0 10]]


Using degree d = 4, the confusion matrix is identical to the one from the linear kernel: there is still 1 misclassification out of 30 test points.
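
The confusion matrix printed above can be reduced to a single accuracy figure by dividing the diagonal (correct predictions) by the total count. This is a small sketch that re-enters the printed matrix by hand; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true class, columns: predicted)
cm = np.array([[7, 0, 0],
               [0, 12, 1],
               [0, 0, 10]])

correct = np.trace(cm)  # sum of the diagonal = correctly classified points
total = cm.sum()        # all test points
accuracy = correct / total

print(f"{correct} of {total} correct, accuracy = {accuracy:.3f}")
```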

<h3>Sigmoid Kernel</h3>

In [50]:
svclassifier = SVC(kernel='sigmoid', gamma='auto')
svclassifier.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto', kernel='sigmoid',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

In [51]:
y_pred = svclassifier.predict(X_test)

In [52]:
print(confusion_matrix(y_test, y_pred))

[[ 7  0  0]
 [13  0  0]
 [10  0  0]]


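The sigmoid kernel performs poorly: every test point is predicted as the first class, so 23 of the 30 points are misclassified. The kernel is not restricted to binary classification, since SVC handles multiple classes with a one-vs-one scheme regardless of the kernel. A more likely explanation is that the sigmoid kernel is sensitive to feature scale and to its gamma and coef0 settings: with unscaled features, the tanh saturates and the kernel values for most pairs of points become nearly identical.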
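
One thing worth checking is whether feature scaling changes this result, since the sigmoid kernel applies a tanh to scaled dot products of the inputs. The sketch below is an assumption-laden variant of the cell above: it loads scikit-learn's bundled copy of iris via `load_iris` instead of the URL, fixes `random_state=0` for repeatability, and standardizes the features with `StandardScaler` in a pipeline. No particular outcome is claimed; the confusion matrix should be inspected after running it.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Bundled iris data (assumption: used here in place of the UCI download)
X_iris, y_iris = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.20, random_state=0
)

# Standardize each feature to zero mean and unit variance before the SVM
scaled_sigmoid = make_pipeline(
    StandardScaler(),
    SVC(kernel='sigmoid', gamma='auto'),
)
scaled_sigmoid.fit(X_tr, y_tr)

# labels=[0, 1, 2] keeps the matrix 3x3 even if a class is never predicted
cm_scaled = confusion_matrix(y_te, scaled_sigmoid.predict(X_te), labels=[0, 1, 2])
print(cm_scaled)
```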

<h3>Gaussian Kernel</h3>

In [70]:
svclassifier = SVC(kernel='rbf', gamma='auto')
svclassifier.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

In [71]:
y_pred = svclassifier.predict(X_test)

In [72]:
print(confusion_matrix(y_test, y_pred))

[[ 7  0  0]
 [ 0 12  1]
 [ 0  0 10]]


In this problem, the Gaussian (RBF) kernel performs just as well as the linear and polynomial kernels. There is no fixed rule for choosing a kernel; it is usually chosen by testing which one works best on the data.
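
Comparing kernels on a single 30-point test split is noisy, so one possible way to test them is cross-validation. The following is a minimal sketch, assuming scikit-learn's bundled iris data (`load_iris`) and 5-fold cross-validation; the kernel list and the `mean_scores` dictionary are illustrative choices, and the other hyperparameters are left at their defaults rather than tuned.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Bundled iris data (assumption: used here in place of the UCI download)
X_iris, y_iris = load_iris(return_X_y=True)

kernels = ['linear', 'poly', 'rbf', 'sigmoid']
mean_scores = {}

for kernel in kernels:
    clf = SVC(kernel=kernel, gamma='auto')
    # 5-fold cross-validated accuracy for this kernel
    scores = cross_val_score(clf, X_iris, y_iris, cv=5)
    mean_scores[kernel] = scores.mean()
    print(f"{kernel:8s} mean accuracy: {scores.mean():.3f}")

# Kernel with the highest mean cross-validated accuracy
best_kernel = max(mean_scores, key=mean_scores.get)
print("best kernel:", best_kernel)
```

A fuller search would also vary `C`, `gamma` and `degree`, for example with `GridSearchCV`.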