# Mark-7

### Support Vector Machines <br>
A support vector machine (SVM) is a supervised machine learning algorithm used mainly for classification. SVMs were first introduced in the 1960s and were refined in the 1990s. They have become very popular more recently because they often achieve high accuracy. Unlike many other classifiers, an SVM chooses the decision boundary that maximizes the margin between the classes.


#### Simple SVM Using Scikit Learn
(Linearly separable dataset)

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

###### About the dataset
Source:

Owner of database: Volker Lohweg (University of Applied Sciences, Ostwestfalen-Lippe, volker.lohweg '@' hs-owl.de) <br>
Donor of database: Helene Dörksen (University of Applied Sciences, Ostwestfalen-Lippe, helene.doerksen '@' hs-owl.de) <br>
Date received: August, 2012 <br>


Data Set Information: <br>

Data were extracted from images taken of genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400 x 400 pixels. Due to the object lens and the distance to the investigated object, gray-scale pictures with a resolution of about 660 dpi were obtained. A Wavelet Transform tool was used to extract features from the images. <br>


Attribute Information: <br>

1. variance of Wavelet Transformed image (continuous) <br>
2. skewness of Wavelet Transformed image (continuous) <br>
3. curtosis of Wavelet Transformed image (continuous)<br>
4. entropy of image (continuous)<br>
5. class (integer)<br>



In [7]:
bankdata = pd.read_csv("C:/Machine Learning/Datasets/bill_authentication.csv")  # On Windows, write the path with forward slashes (/) instead of backslashes (\)
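
One way to avoid editing the path by hand is to let `pathlib` do the conversion. This is a minimal sketch; the variable names are illustrative and not part of the original notebook. A raw string (`r"..."`) passed directly to `pd.read_csv` should also work on Windows.

```python
from pathlib import PureWindowsPath

# A Windows path as copied from a file explorer
# (the raw string keeps the backslashes literal)
windows_path = r"C:\Machine Learning\Datasets\bill_authentication.csv"

# Convert to forward slashes, the form used in the read_csv call
posix_style = PureWindowsPath(windows_path).as_posix()
print(posix_style)
```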

In [11]:
#Exploratory Data Analysis
bankdata.shape #size of the data


(1372, 5)

In [13]:
bankdata.head() #first few rows of the data

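|   | Variance | Skewness | Curtosis | Entropy | Class |
|---|----------|----------|----------|---------|-------|
| 0 | 3.6216   | 8.6661   | -2.8073  | -0.44699 | 0 |
| 1 | 4.5459   | 8.1674   | -2.4586  | -1.4621  | 0 |
| 2 | 3.866    | -2.6383  | 1.9242   | 0.10645  | 0 |
| 3 | 3.4566   | 9.5228   | -4.0112  | -3.5944  | 0 |
| 4 | 0.32924  | -4.4552  | 4.5718   | -0.9888  | 0 |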


In [15]:
# Data Preprocessing
X = bankdata.drop('Class', axis=1) #dropping the class column
y = bankdata['Class']  #taking the class column only

In [17]:
#train test split using scikit learn
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)
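
The split above is random, so the scores change from run to run. A possible variation, sketched here on synthetic stand-in data (`X_demo` and `y_demo` are assumptions, not the banknote data), fixes the shuffle with `random_state` and keeps the class ratio in both parts with `stratify`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 2 features, classes split 60/40
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([0] * 60 + [1] * 40)

# random_state fixes the shuffle; stratify keeps the class ratio in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)   # sizes of the training and test parts
print(np.bincount(y_te))        # samples per class in the test part
```

Calling the function again with the same `random_state` returns the same split, which makes results such as the confusion matrix reproducible.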

In [20]:
#training Algorithm
from sklearn.svm import SVC
svclassifier = SVC(C=2,kernel = 'linear')
svclassifier.fit(X_train,y_train)

SVC(C=2, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
  kernel='linear', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
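
After fitting, a linear `SVC` exposes the learned hyperplane and the support vectors that define it. The sketch below uses synthetic blobs from `make_blobs` as a stand-in for the banknote features (an assumption, since the CSV is local to the author's machine) and checks that the decision function is the linear form w . x + b.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data standing in for the four banknote features
X_demo, y_demo = make_blobs(n_samples=200, centers=2, n_features=4, random_state=0)

clf = SVC(C=2, kernel="linear")
clf.fit(X_demo, y_demo)

print("support vectors per class:", clf.n_support_)
print("weights w:", clf.coef_)
print("bias b:", clf.intercept_)

# For a linear kernel the decision function is w . x + b
manual_scores = X_demo @ clf.coef_.ravel() + clf.intercept_[0]
print(np.allclose(manual_scores, clf.decision_function(X_demo)))
```

Only the support vectors influence the boundary, so `n_support_` gives a rough sense of how many samples the fitted model depends on.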

In [22]:
y_pred = svclassifier.predict(X_test) #prediction

###### Evaluating the algorithm using a confusion matrix
By definition, a confusion matrix $C$ is such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ and predicted to be in group $j$. <br>

Thus in binary classification, the count of true negatives is $C_{0,0}$, false negatives is $C_{1,0}$, true positives is $C_{1,1}$ and false positives is $C_{0,1}$.

In [24]:
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))

[[145   3]
 [  1 126]]
              precision    recall  f1-score   support

           0       0.99      0.98      0.99       148
           1       0.98      0.99      0.98       127

   micro avg       0.99      0.99      0.99       275
   macro avg       0.98      0.99      0.99       275
weighted avg       0.99      0.99      0.99       275
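
The class 1 row of the report can be recomputed by hand from the confusion matrix printed above. This is a small worked example using those four counts; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the output above:
# rows are true classes, columns are predicted classes
cm = np.array([[145, 3],
               [1, 126]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision_1 = tp / (tp + fp)   # of the samples predicted as class 1, how many are class 1
recall_1 = tp / (tp + fn)      # of the true class 1 samples, how many were found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={accuracy:.2f} precision={precision_1:.2f} "
      f"recall={recall_1:.2f} f1={f1_1:.2f}")
# prints: accuracy=0.99 precision=0.98 recall=0.99 f1=0.98
```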



### Kernel SVM (Non-Linear Decision Boundary)

In [26]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [28]:
#Importing Datasets
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to the dataset
colnames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe
irisdata = pd.read_csv(url, names=colnames)
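
If the URL is unreachable, scikit-learn ships a bundled copy of the Iris data. The sketch below builds a DataFrame with the same column layout; `irisdata_local` is a hypothetical name, the `Iris-` prefix is added only to mimic the labels in the UCI file, and the bundled values may differ slightly from that file.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
colnames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Build a DataFrame with the same column layout as the CSV read above
irisdata_local = pd.DataFrame(iris.data, columns=colnames[:4])
# Prefix the bundled names so the labels read like those in the UCI file
irisdata_local['Class'] = ['Iris-' + iris.target_names[t] for t in iris.target]

print(irisdata_local.shape)
print(irisdata_local['Class'].value_counts())
```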

In [30]:
#Preprocessing
X = irisdata.drop('Class', axis=1)
y = irisdata['Class']

In [32]:
#Train-Test Split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)

In [34]:
#Polynomial Kernel
from sklearn.svm import SVC
svclassifier = SVC(kernel='poly', degree=8)
svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

[[ 8  0  0]
 [ 0  9  0]
 [ 0  2 11]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00         8
Iris-versicolor       0.82      1.00      0.90         9
 Iris-virginica       1.00      0.85      0.92        13

      micro avg       0.93      0.93      0.93        30
      macro avg       0.94      0.95      0.94        30
   weighted avg       0.95      0.93      0.93        30





In [36]:
#Gaussian Kernel
from sklearn.svm import SVC
svclassifier = SVC(kernel='rbf')
svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

[[ 8  0  0]
 [ 0  9  0]
 [ 0  0 13]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00         8
Iris-versicolor       1.00      1.00      1.00         9
 Iris-virginica       1.00      1.00      1.00        13

      micro avg       1.00      1.00      1.00        30
      macro avg       1.00      1.00      1.00        30
   weighted avg       1.00      1.00      1.00        30





In [38]:
#Sigmoid Kernel
from sklearn.svm import SVC
svclassifier = SVC(kernel='sigmoid')
svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

[[ 8  0  0]
 [ 9  0  0]
 [13  0  0]]
                 precision    recall  f1-score   support

    Iris-setosa       0.27      1.00      0.42         8
Iris-versicolor       0.00      0.00      0.00         9
 Iris-virginica       0.00      0.00      0.00        13

      micro avg       0.27      0.27      0.27        30
      macro avg       0.09      0.33      0.14        30
   weighted avg       0.07      0.27      0.11        30





##### Comparison of Kernel Performance
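Comparing the kernels, the sigmoid kernel clearly performs worst: it assigned all 30 test samples to Iris-setosa. The sigmoid kernel is sometimes described as better suited to binary problems because of its link to the 0/1 sigmoid function, but `SVC` handles multi-class problems with every kernel through a one-vs-one scheme, so having three classes does not by itself explain the result. The kernel computes tanh(gamma * ⟨x, x'⟩ + coef0), and a more likely cause is that, with unscaled features and default parameters, its values saturate close to 1 for every pair of samples, which leaves the classifier little information to separate the classes.<br>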
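
One way to look into the sigmoid kernel's behaviour is to inspect the kernel values directly. The sketch below uses scikit-learn's bundled Iris copy and assumes `gamma = 1 / n_features` and `coef0 = 0.0`, which is how the defaults shown in the `SVC` output earlier (`gamma='auto_deprecated'`, `coef0=0.0`) are understood to behave; treat that as an assumption.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import sigmoid_kernel
from sklearn.preprocessing import StandardScaler

X_raw = load_iris().data
gamma = 1.0 / X_raw.shape[1]   # 0.25, assumed to match the old 'auto' default

# sigmoid_kernel computes tanh(gamma * <x, x'> + coef0)
K_raw = sigmoid_kernel(X_raw, gamma=gamma, coef0=0.0)
print(np.allclose(K_raw, np.tanh(gamma * X_raw @ X_raw.T)))

# On raw features every entry is close to 1, so all pairs of samples look alike
print("raw:   ", K_raw.min(), K_raw.max())

# After standardization the entries spread over the range of tanh
X_scaled = StandardScaler().fit_transform(X_raw)
K_scaled = sigmoid_kernel(X_scaled, gamma=gamma, coef0=0.0)
print("scaled:", K_scaled.min(), K_scaled.max())
```

A nearly constant kernel matrix carries almost no information about which samples are similar, which is consistent with a classifier that predicts a single class.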

Between the Gaussian and polynomial kernels, the Gaussian kernel classified all 30 test samples correctly, while the polynomial kernel misclassified two (both Iris-virginica samples predicted as Iris-versicolor). The Gaussian kernel therefore performed slightly better here, although with only 30 test samples the difference is small. There is no hard and fast rule as to which kernel performs best in every scenario. It comes down to trying the candidate kernels and selecting the one with the best results on your test dataset.
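
A single 30-sample test split gives a noisy comparison. As a hedged alternative, the sketch below compares kernels with 5-fold cross-validation on scikit-learn's bundled Iris copy and standardizes the features inside a pipeline. The pipeline, the fold count and the default `SVC` parameters are assumptions, not the setup used in the cells above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

results = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    # The scaler is fitted inside each fold, so nothing leaks from the held-out part
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(model, X_iris, y_iris, cv=5)
    results[kernel] = scores.mean()

for kernel, mean_acc in results.items():
    print(f"{kernel:8s} mean accuracy = {mean_acc:.3f}")
```

Averaging over folds reduces the chance that one lucky or unlucky split decides which kernel looks best.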