# Image pattern recognition using SVM:
**Objective of the exercise:** We will study different types of kernels that can be used to implement a kernel ***SVM*** with Python’s *Scikit-Learn* library. 
*Scikit-Learn* is a free Python library for machine learning. It includes implementations of random forests, logistic regression, other classification algorithms, and support vector machines.

**Dataset:** We will use the iris dataset to predict the category to which a plant belongs based on 4 attributes: sepal-width, sepal-length, petal-width, and petal-length. There are 3 output classes: Setosa, Versicolor, and Virginica.




In [0]:
################# 1.Importing Libraries ###########################
import numpy as np  
import matplotlib.pyplot as plt  
import pandas as pd 

## 2) Importing the iris Dataset

Import the iris dataset from this URL:
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
Use read_csv() from pandas to load the dataset into a DataFrame.
The file has no header row, so do not forget to assign the following 5 column names to the dataset: ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class'].

In [0]:
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
column_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
irisdata = pd.read_csv(url, names=column_names) 
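If the UCI URL is not reachable, a similar DataFrame can be built from the copy of iris bundled with Scikit-Learn. This is only a sketch under assumptions: `load_iris` stands in for the downloaded file, the name `irisdata_local` is hypothetical, and the bundled copy is reported to differ from the UCI file in a couple of measurements, so results may not match exactly.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data replaces the UCI download.
iris = load_iris()
column_names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# load_iris orders the features as sepal length, sepal width, petal length, petal width.
irisdata_local = pd.DataFrame(iris.data, columns=column_names[:4])

# Map the integer targets (0, 1, 2) to the label strings used in the UCI file.
irisdata_local['Class'] = ['Iris-' + str(iris.target_names[t]) for t in iris.target]

print(irisdata_local.shape)
print(irisdata_local['Class'].value_counts())
```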

## 3) Dividing data into features and labels


In [0]:
x = irisdata.drop('Class', axis=1)  
y = irisdata['Class']  

## Comments:
In the first line, why did we use the **drop()** method?

All the columns of the iris dataset are stored in the $x$ variable except the "Class" column, which is the label column. The **drop()** method removes this column. 

Which of the variables $x$ and $y$ contains the attributes?

In the second line, only the "Class" column is stored in the $y$ variable. At this point, $x$ contains the attributes and $y$ contains the corresponding labels.


## 4) Dividing data into training and testing sets.
Divide the data into training and test sets.


In [0]:
from sklearn.model_selection import train_test_split  
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.20)
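The split above is random, so the confusion matrices change from run to run. A minimal sketch of a reproducible, class-balanced split follows. The `random_state` value, the use of `stratify`, the bundled `load_iris` data, and the names `X_demo`, `X_tr`, and so on are assumptions for illustration, not part of the exercise.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris data replaces the DataFrame used in the notebook.
X_demo, y_demo = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class proportions in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_te))  # number of test samples per class
```

With 150 balanced samples and a 20% test size, stratification puts 10 samples of each class in the test set.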



SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='sigmoid', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

## Training the Algorithm
Scikit-Learn contains the svm module, which provides built-in classes for different SVM algorithms. Since we are going to perform a classification task, we will use the support vector classifier class, which is written as SVC in Scikit-Learn's svm module. Among its parameters, the one we set here is the kernel type. 
The fit() method of the SVC class is called to train the algorithm on the training data, which is passed to it as the attributes and the corresponding labels. 
Train the algorithm with a sigmoid kernel.


## 5) Making Predictions

In [0]:
from sklearn.svm import SVC  
svclassifier = SVC(kernel='sigmoid')  
svclassifier.fit(x_train, y_train) 

y_pred = svclassifier.predict(x_test)
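One way to see what the sigmoid kernel computes on these attributes is to inspect the kernel matrix directly. The sketch below assumes the bundled `load_iris` data, `gamma = 1 / n_features` (the value the `'auto'` setting corresponds to), and `coef0 = 0.0`. The Scikit-Learn version used in the notebook may resolve `gamma` differently.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import sigmoid_kernel

# Assumption: the bundled iris data replaces the DataFrame used in the notebook.
X_demo, _ = load_iris(return_X_y=True)

# Assumption: gamma = 1 / n_features = 0.25 and coef0 = 0.0.
gamma = 1.0 / X_demo.shape[1]

# K[i, j] = tanh(gamma * <x_i, x_j> + coef0)
K = sigmoid_kernel(X_demo, X_demo, gamma=gamma, coef0=0.0)

print("smallest kernel value:", K.min())
print("largest kernel value: ", K.max())
```

All attributes are positive and unscaled, so every inner product is large and the tanh saturates: each entry of `K` is very close to 1, which leaves the classifier with little information to separate the classes.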

## 6) Evaluating the Algorithm
Use the confusion_matrix and classification_report functions from Scikit-Learn's metrics module to obtain the confusion matrix and the precision, recall, and f1-score measures, which are among the most commonly used metrics for classification tasks.


In [0]:
from sklearn.metrics import  confusion_matrix, classification_report 
print(confusion_matrix(y_test, y_pred))  
print(classification_report(y_test, y_pred))    

[[ 0  0 13]
 [ 0  0 10]
 [ 0  0  7]]
                 precision    recall  f1-score   support

    Iris-setosa       0.00      0.00      0.00        13
Iris-versicolor       0.00      0.00      0.00        10
 Iris-virginica       0.23      1.00      0.38         7

       accuracy                           0.23        30
      macro avg       0.08      0.33      0.13        30
   weighted avg       0.05      0.23      0.09        30
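The figures in the report can be recomputed by hand from the confusion matrix, where rows are true classes and columns are predicted classes. The following is a small sketch using the matrix printed above. The variable names are illustrative, and returning 0.0 for an undefined precision is a convention assumed here.

```python
import numpy as np

# Confusion matrix from the sigmoid run: rows = true class, columns = predicted class.
cm = np.array([[0, 0, 13],
               [0, 0, 10],
               [0, 0, 7]])

tp = np.diag(cm).astype(float)   # correctly classified samples per class
predicted = cm.sum(axis=0)       # how often each class was predicted
actual = cm.sum(axis=1)          # how many samples each class really has (support)

# Precision is undefined for a class that is never predicted; 0.0 is used here.
precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)

denom = precision + recall
f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

accuracy = tp.sum() / cm.sum()

print("precision:", np.round(precision, 2))
print("recall:   ", np.round(recall, 2))
print("f1-score: ", np.round(f1, 2))
print("accuracy: ", round(accuracy, 2))
```

For Iris-virginica this gives a precision of 7/30 and a recall of 7/7, in agreement with the report.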





## 7) Polynomial Kernel
Repeat all the previous steps for a Polynomial kernel.
Do not forget to add the degree of the polynomial.

In [0]:
from sklearn.svm import SVC  
svclassifier = SVC(kernel='poly', degree=8)  
svclassifier.fit(x_train, y_train) 
#Prediction & Evaluation
y_pred = svclassifier.predict(x_test)
from sklearn.metrics import  confusion_matrix, classification_report   
print(confusion_matrix(y_test, y_pred))  
print(classification_report(y_test, y_pred))



[[13  0  0]
 [ 0  9  1]
 [ 0  1  6]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        13
Iris-versicolor       0.90      0.90      0.90        10
 Iris-virginica       0.86      0.86      0.86         7

       accuracy                           0.93        30
      macro avg       0.92      0.92      0.92        30
   weighted avg       0.93      0.93      0.93        30
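For reference, the polynomial kernel has the form $(\gamma \langle x, x' \rangle + r)^d$, where $d$ is the degree. The sketch below computes it by hand and compares it with Scikit-Learn's `polynomial_kernel`. The two sample rows, `gamma = 0.25`, and `coef0 = 0.0` are assumptions chosen for illustration, and `poly_kernel` is a hypothetical helper.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def poly_kernel(A, B, degree=8, gamma=0.25, coef0=0.0):
    """Polynomial kernel (gamma * <a, b> + coef0) ** degree for all row pairs."""
    return (gamma * (A @ B.T) + coef0) ** degree

# Two illustrative rows with four attributes each (assumed values).
A = np.array([[5.1, 3.5, 1.4, 0.2],
              [6.3, 3.3, 6.0, 2.5]])

K_manual = poly_kernel(A, A)
K_sklearn = polynomial_kernel(A, A, degree=8, gamma=0.25, coef0=0.0)

print(K_manual)
print("matches scikit-learn:", np.allclose(K_manual, K_sklearn))
```

With a degree as high as 8 the kernel values on unscaled attributes become very large, which is one reason the degree is usually treated as a parameter to tune.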



## 8) Gaussian Kernel
Repeat all the previous steps for a Gaussian kernel.


In [0]:
from sklearn.svm import SVC  
svclassifier = SVC(kernel='rbf')  
svclassifier.fit(x_train, y_train)  
#Prediction & Evaluation
y_pred = svclassifier.predict(x_test)
from sklearn.metrics import  confusion_matrix, classification_report  
print(confusion_matrix(y_test, y_pred))  
print(classification_report(y_test, y_pred)) 

[[13  0  0]
 [ 0 10  0]
 [ 0  0  7]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        13
Iris-versicolor       1.00      1.00      1.00        10
 Iris-virginica       1.00      1.00      1.00         7

       accuracy                           1.00        30
      macro avg       1.00      1.00      1.00        30
   weighted avg       1.00      1.00      1.00        30
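The Gaussian (RBF) kernel is $\exp(-\gamma \lVert x - x' \rVert^2)$, so it depends only on the distance between two samples. The sketch below computes it by hand and checks it against Scikit-Learn's `rbf_kernel`. The sample rows and `gamma = 0.25` are assumptions, and `rbf_kernel_manual` is a hypothetical helper.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf_kernel_manual(A, B, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2) for all row pairs."""
    # Broadcasting builds every pairwise difference, then sums the squares.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

# Three illustrative rows with four attributes each (assumed values).
A = np.array([[5.1, 3.5, 1.4, 0.2],
              [7.0, 3.2, 4.7, 1.4],
              [6.3, 3.3, 6.0, 2.5]])

K_manual = rbf_kernel_manual(A, A, gamma=0.25)
K_sklearn = rbf_kernel(A, A, gamma=0.25)

print(np.round(K_manual, 4))
print("matches scikit-learn:", np.allclose(K_manual, K_sklearn))
```

Every sample has kernel value 1 with itself, and the value decays toward 0 as two samples move apart.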





## 9) Comparison of Kernel Performance
Now you have implemented SVM with 3 types of kernel: sigmoid, polynomial, and Gaussian. Compare and interpret the results of these three types of kernel.


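The sigmoid kernel performs worst: it assigns every test sample to Iris-virginica, for an accuracy of 23%. The number of classes is not the cause, since SVC handles the 3 output classes with a one-vs-one scheme whatever the kernel. The sigmoid kernel is $\tanh(\gamma \langle x, x' \rangle + r)$, and on the unscaled, all-positive iris attributes the inner products are large enough that $\tanh$ saturates close to 1 for nearly every pair of samples, so the kernel barely distinguishes the plants.
The Gaussian kernel is slightly better than the polynomial kernel. With the Gaussian kernel, all 30 test samples are classified correctly, while the polynomial kernel misclassifies two instances (one Versicolor and one Virginica).
Finally, there is no best kernel for all scenarios. The right choice depends on the dataset, and a single random split with 30 test samples is too small to rank kernels reliably.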
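A comparison that depends less on one particular split can be sketched with cross-validation. This is an illustration under assumptions, not the method used in the exercise: it uses the bundled `load_iris` data, 5 stratified folds with a fixed `random_state`, and the default `gamma` of the installed Scikit-Learn version, so the scores need not match the tables above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Assumption: the bundled iris data replaces the DataFrame used in the notebook.
X_demo, y_demo = load_iris(return_X_y=True)

# Assumption: 5 shuffled, stratified folds with a fixed seed for repeatability.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# The same three kernels as in the exercise.
classifiers = {
    'sigmoid': SVC(kernel='sigmoid'),
    'poly': SVC(kernel='poly', degree=8),
    'rbf': SVC(kernel='rbf'),
}

scores = {}
for name, clf in classifiers.items():
    # One accuracy value per fold.
    scores[name] = cross_val_score(clf, X_demo, y_demo, cv=cv)
    print(f"{name:8s} mean accuracy = {scores[name].mean():.3f} "
          f"(std = {scores[name].std():.3f})")
```

Reporting the mean and spread over several folds gives a fairer basis for comparing kernels than a single confusion matrix.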
