# Support Vector Machine

### Information about the Dataset
**3 Classes:** 3 different types of wheat seed

| Class | Seed Type | Count |
| ----- | --------- | ----- |
| 1  | Kama | 66 |
| 2  | Rosa | 68 |
| 3  | Canadian | 65 |

**7 Features :**
1. area A
2. perimeter P
3. compactness C = 4*pi*A/P^2
4. length of kernel
5. width of kernel
6. asymmetry coefficient
7. length of kernel groove
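
As a quick sanity check of the compactness formula, the sketch below recomputes C from the area and perimeter of row 0 of the dataset (A = 15.26, P = 14.84). The helper name `compactness` is hypothetical and only used for illustration.

```python
import math

def compactness(area: float, perimeter: float) -> float:
    """Compactness C = 4*pi*A / P^2 (equals 1.0 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2

# Area and perimeter taken from row 0 of the dataset
c = compactness(15.26, 14.84)
print(round(c, 4))
```

The result agrees with the stored Compactness value of 0.8710 to about three decimal places; the small remaining difference presumably comes from the rounding of A and P in the file.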

**Problem:** Based on 7 features, predict 3 different classes.

In [1]:
# importing all the necessary libraries

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
import utils

In [2]:
seed = 1

In [3]:
# reading the csv file into pandas dataframe called seeds

seeds = pd.read_csv("seeds.csv")
seeds

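| Unnamed: 0 | Area | Perimeter | Compactness | Kernel.Length | Kernel.Width | Asymmetry.Coeff | Kernel.Groove | Type |
| ---------- | ---- | --------- | ----------- | ------------- | ------------ | --------------- | ------------- | ---- |
| 0 | 15.26 | 14.84 | 0.8710 | 5.763 | 3.312 | 2.221 | 5.220 | 1 |
| 1 | 14.88 | 14.57 | 0.8811 | 5.554 | 3.333 | 1.018 | 4.956 | 1 |
| 2 | 14.29 | 14.09 | 0.9050 | 5.291 | 3.337 | 2.699 | 4.825 | 1 |
| 3 | 13.84 | 13.94 | 0.8955 | 5.324 | 3.379 | 2.259 | 4.805 | 1 |
| 4 | 16.14 | 14.99 | 0.9034 | 5.658 | 3.562 | 1.355 | 5.175 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 194 | 12.19 | 13.20 | 0.8783 | 5.137 | 2.981 | 3.631 | 4.870 | 3 |
| 195 | 11.23 | 12.88 | 0.8511 | 5.140 | 2.795 | 4.325 | 5.003 | 3 |
| 196 | 13.20 | 13.66 | 0.8883 | 5.236 | 3.232 | 8.315 | 5.056 | 3 |
| 197 | 11.84 | 13.21 | 0.8521 | 5.175 | 2.836 | 3.598 | 5.044 | 3 |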


In [4]:
# declaring features and labels

features = seeds.drop("Type", axis = 1)
labels = seeds["Type"]

I tried several split proportions:
1. 80% train, 10% test, 10% val
2. 70% train, 20% test, 10% val
3. 70% train, 15% test, 15% val
4. 60% train, 20% test, 20% val

Out of these, I chose option 3 with:

**Training Set = 70%**,
**Testing Set = 15%**,
**Validation Set = 15%**

This proportion leaves enough data for testing and validation, and the training, validation and testing accuracies are all adequately high in relation to each other.

In [5]:
# splitting data into training, testing and validation datasets

# training and testing 
x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size = 0.15, random_state = seed)

# training and validation
# note: test_size = 0.15 here takes 15% of the remaining 85%, i.e. about 12.75% of all rows
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size = 0.15, random_state = seed)
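
The sketch below checks the set sizes that two nested 15% splits produce, and shows one possible way to get validation and test sets that each hold 15% of all rows by passing absolute counts as `test_size`. The arrays `X` and `y` are stand-in data (an assumption) with 199 rows, matching the class counts in the table at the top; `n_holdout`, `nested_sizes` and `exact_sizes` are hypothetical names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

seed = 1

# Stand-in data (assumption): 199 rows, 7 features, class counts 66 / 68 / 65
X = np.arange(199 * 7).reshape(199, 7)
y = np.repeat([1, 2, 3], [66, 68, 65])

# Two nested 15% splits, as in the cell above
x_rest, x_test, y_rest, y_test = train_test_split(X, y, test_size=0.15, random_state=seed)
x_train, x_val, y_train, y_val = train_test_split(x_rest, y_rest, test_size=0.15, random_state=seed)
nested_sizes = (len(x_train), len(x_val), len(x_test))

# Alternative: absolute counts, so validation and test each hold 15% of all rows
n_holdout = round(0.15 * len(X))
x_rest, x_test, y_rest, y_test = train_test_split(X, y, test_size=n_holdout, random_state=seed)
x_train, x_val, y_train, y_val = train_test_split(x_rest, y_rest, test_size=n_holdout, random_state=seed)
exact_sizes = (len(x_train), len(x_val), len(x_test))

print("nested 15% splits (train, val, test):", nested_sizes)
print("absolute counts   (train, val, test):", exact_sizes)
```

With 199 rows, the nested splits give 143 / 26 / 30 samples, which is consistent with the accuracy fractions reported in this notebook (for example 24/26 = 0.923 on validation and 28/30 = 0.933 on testing).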

### Explanation of OVO and OVR methods of multiclass classification:

**One vs. one (OVO) method of classification:**
_This method trains one binary model for each pair of classes. For a prediction, each model votes for one of its two classes, and the class with the majority of votes wins.
For k classes, this method requires k(k-1)/2 SVMs._

**One vs. rest (OVR) method of classification:**
_This method needs one model for each class. Each model solves a binary classification problem: that class versus all the other classes. Each model gives a score, and the class with the highest score wins.
For k classes, this method requires k SVMs._
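
The model counts above can be checked with a small sketch. It uses scikit-learn's `OneVsOneClassifier` and `OneVsRestClassifier` wrappers on synthetic data with k = 4 classes (an assumption chosen so that the two counts differ; with k = 3 both are 3). It also shows that, for `SVC`, `decision_function_shape` changes the shape of the decision function output; according to the scikit-learn documentation, `SVC` always trains one-vs-one internally.

```python
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

k = 4  # number of classes in the synthetic stand-in data (assumption)
X, y = make_blobs(n_samples=200, centers=k, random_state=1)

# Explicit wrappers: one binary SVM per pair of classes vs. one per class
ovo = OneVsOneClassifier(SVC()).fit(X, y)
ovr = OneVsRestClassifier(SVC()).fit(X, y)
n_ovo = len(ovo.estimators_)
n_ovr = len(ovr.estimators_)
print("OVO models:", n_ovo, "expected k(k-1)/2 =", k * (k - 1) // 2)
print("OVR models:", n_ovr, "expected k =", k)

# Plain SVC: the option only changes the shape of decision_function
ovo_shape = SVC(decision_function_shape="ovo").fit(X, y).decision_function(X).shape
ovr_shape = SVC(decision_function_shape="ovr").fit(X, y).decision_function(X).shape
print("decision_function shapes:", ovo_shape, ovr_shape)
```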

### Creating SVM using OVO method

**Experimenting with 4 different values of C parameter.**

In [6]:
# C = 0.01

'''svm_c_001 = SVC(C = 0.01, decision_function_shape = 'ovo', random_state = seed)
svm_c_001.fit(x_train, y_train)
print("C = 0.01")
print("Training Accuracy:", svm_c_001.score(x_train, y_train))
print("Validation Accuracy:", svm_c_001.score(x_val, y_val))
print(" ")'''

# C = 0.1

'''svm_c_01 = SVC(C = 0.1, decision_function_shape = 'ovo', random_state = seed)
svm_c_01.fit(x_train, y_train)
print("C = 0.1")
print("Training Accuracy:", svm_c_01.score(x_train, y_train))
print("Validation Accuracy:", svm_c_01.score(x_val, y_val))
print(" ")'''

# C = 1

'''svm_c_1 = SVC( C = 1, decision_function_shape = 'ovo', random_state = seed)
svm_c_1.fit(x_train, y_train)
print("C = 1")
print("Training Accuracy:", svm_c_1.score(x_train, y_train)) 
print("Validation Accuracy:", svm_c_1.score(x_val, y_val))
print(" ")'''

# C = 10

svm_c_10 = SVC(C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_c_10.fit(x_train, y_train)
print("C = 10")
print("Training Accuracy:", svm_c_10.score(x_train, y_train))
print("Validation Accuracy:", svm_c_10.score(x_val, y_val))
print(" ")

C = 10
Training Accuracy: 0.9230769230769231
Validation Accuracy: 0.9230769230769231
 


Comparing the training and validation accuracies across the above values of the C parameter, I choose the SVM with **C = 10** as the best of all. Hence, I will continue with this value of C.
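
Instead of commenting experiments in and out, the four values of C could also be compared in one loop. The sketch below is one possible way to do this; it runs on synthetic stand-in data from `make_classification` (an assumption, since `seeds.csv` is not bundled here), so its numbers are not the ones reported in this notebook. The names `results` and `best_C` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

seed = 1

# Synthetic stand-in for the seeds data: 7 features, 3 classes (assumption)
X, y = make_classification(n_samples=199, n_features=7, n_informative=4,
                           n_redundant=0, n_classes=3, random_state=seed)
x_train, x_val, y_train, y_val = train_test_split(X, y, test_size=0.15, random_state=seed)

results = {}
for C in (0.01, 0.1, 1, 10):
    model = SVC(C=C, decision_function_shape="ovo", random_state=seed)
    model.fit(x_train, y_train)
    # store (training accuracy, validation accuracy) for this value of C
    results[C] = (model.score(x_train, y_train), model.score(x_val, y_val))
    print(f"C = {C}: training {results[C][0]:.3f}, validation {results[C][1]:.3f}")

# pick the value of C with the highest validation accuracy
best_C = max(results, key=lambda C: results[C][1])
print("Best C by validation accuracy:", best_C)
```

Keeping all results in one dictionary makes the comparison visible in a single output, while the test set stays untouched until the final choice is made.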

In [7]:
# SVM model with the chosen value of C parameter as 10

print("C = 10")
print("Training Accuracy:", svm_c_10.score(x_train, y_train))
print("Validation Accuracy:", svm_c_10.score(x_val, y_val))
print("Testing Accuracy:", svm_c_10.score(x_test, y_test))
print(" ")

C = 10
Training Accuracy: 0.9230769230769231
Validation Accuracy: 0.9230769230769231
Testing Accuracy: 0.8666666666666667
 


### C Parameter:

Total error = C * (Classification Error) + (Distance Error)

**C parameter** _controls how much weight the classification error gets relative to the distance (margin) error. If C has a large value, a smaller margin is accepted and the model is better at classifying all the training points correctly. A smaller value of C encourages a larger margin, but with lower training accuracy._
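
The effect of C on the margin can be illustrated with a linear SVM, where the margin width is 2 / ||w||. The sketch below is a minimal illustration on two overlapping synthetic blobs (the data, the chosen centers and the name `margins` are assumptions, not part of the seeds experiment).

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping 2-D classes as stand-in data (assumption)
X, y = make_blobs(n_samples=100, centers=[[0, 0], [3, 3]],
                  cluster_std=1.5, random_state=1)

margins = {}
for C in (0.01, 10):
    model = SVC(kernel="linear", C=C).fit(X, y)
    # for a linear SVM the margin width is 2 / ||w||
    margins[C] = 2.0 / np.linalg.norm(model.coef_)
    print(f"C = {C}: margin width {margins[C]:.3f}, "
          f"training accuracy {model.score(X, y):.3f}")
```

The small value of C should produce the wider margin, in line with the description above.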

### Trying Linear Kernel

In [8]:
svm_linear = SVC(kernel = 'linear', C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_linear.fit(x_train, y_train)
print("Linear kernel")
print("Training Accuracy:", svm_linear.score(x_train, y_train))
print("Validation Accuracy:", svm_linear.score(x_val, y_val))
print("Testing Accuracy:", svm_linear.score(x_test, y_test))
print(" ")

Linear kernel
Training Accuracy: 0.972027972027972
Validation Accuracy: 0.9230769230769231
Testing Accuracy: 0.9333333333333333
 


### Trying Polynomial kernel with different degrees

In [9]:
# default degree = 3

'''svm_poly = SVC(kernel = 'poly', C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_poly.fit(x_train, y_train)
print("Polynomial kernel with default degree value")
print("Training Accuracy:", svm_poly.score(x_train, y_train))
print("Validation Accuracy:", svm_poly.score(x_val, y_val))
print(" ")'''

# Degree = 2

'''svm_degree_2 = SVC(kernel = 'poly', degree = 2, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_degree_2.fit(x_train, y_train)
print("Polynomial kernel of degree = 2")
print("Training Accuracy:", svm_degree_2.score(x_train, y_train))
print("Validation Accuracy:", svm_degree_2.score(x_val, y_val))
print(" ")'''

# Degree = 4

'''svm_degree_4 = SVC(kernel = 'poly', degree = 4, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_degree_4.fit(x_train, y_train)
print("Polynomial kernel of degree = 4")
print("Training Accuracy:", svm_degree_4.score(x_train, y_train))
print("Validation Accuracy:", svm_degree_4.score(x_val, y_val))
print(" ")'''

# Degree = 6

svm_degree_6 = SVC(kernel = 'poly', degree = 6, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_degree_6.fit(x_train, y_train)
print("Polynomial kernel of degree = 6")
print("Training Accuracy:", svm_degree_6.score(x_train, y_train))
print("Validation Accuracy:", svm_degree_6.score(x_val, y_val))
print(" ")

Polynomial kernel of degree = 6
Training Accuracy: 0.986013986013986
Validation Accuracy: 0.9230769230769231
 


Among the Polynomial kernels with different degrees, I choose **degree = 6** as the best: it has a high validation accuracy and, at the same time, the highest training accuracy compared to the other Polynomial kernels.

In [10]:
# Degree = 6

print("Polynomial kernel of degree = 6")
print("Training Accuracy:", svm_degree_6.score(x_train, y_train))
print("Validation Accuracy:", svm_degree_6.score(x_val, y_val))
print("Testing Accuracy:", svm_degree_6.score(x_test, y_test))
print(" ")

Polynomial kernel of degree = 6
Training Accuracy: 0.986013986013986
Validation Accuracy: 0.9230769230769231
Testing Accuracy: 0.9666666666666667
 


### Trying RBF Kernel with different values of gamma

In [11]:
# gamma = 0.1

'''svm_gamma_01 = SVC(kernel = 'rbf', gamma = 0.1, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_gamma_01.fit(x_train, y_train)
print("Gamma = 0.1")
print("Training Accuracy:", svm_gamma_01.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_01.score(x_val, y_val))
print(" ")'''

# gamma = 1

svm_gamma_1 = SVC(kernel = 'rbf', gamma = 1, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_gamma_1.fit(x_train, y_train)
print("Gamma = 1")
print("Training Accuracy:", svm_gamma_1.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_1.score(x_val, y_val))
print(" ")

# gamma = 10

'''svm_gamma_10 = SVC(kernel = 'rbf', gamma = 10, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_gamma_10.fit(x_train, y_train)
print("Gamma = 10")
print("Training Accuracy:", svm_gamma_10.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_10.score(x_val, y_val))
print(" ")'''

# gamma = 100

'''svm_gamma_100 = SVC(kernel = 'rbf', gamma = 100, C = 10, decision_function_shape = 'ovo', random_state = seed)
svm_gamma_100.fit(x_train, y_train)
print("Gamma = 100")
print("Training Accuracy:", svm_gamma_100.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_100.score(x_val, y_val))
print(" ")'''

Gamma = 1
Training Accuracy: 1.0
Validation Accuracy: 0.9615384615384616
 


Among the RBF kernels with different values of gamma, I choose the RBF kernel SVM with **gamma = 1** as the best: it has the highest validation accuracy and, at the same time, a high training accuracy compared to the other RBF kernels.
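
For context on what gamma controls: the RBF kernel is K(x, x') = exp(-gamma * ||x - x'||^2), so a larger gamma makes the similarity between two points fall off faster with distance. The sketch below evaluates the kernel for rows 0 and 1 of the dataset (seven feature values each, copied from the dataframe shown earlier) at the four gamma values tried above, and compares a manual computation with scikit-learn's `rbf_kernel`. The names `manual` and `library` are hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Seven feature values of rows 0 and 1 of the dataset
x0 = np.array([15.26, 14.84, 0.8710, 5.763, 3.312, 2.221, 5.220])
x1 = np.array([14.88, 14.57, 0.8811, 5.554, 3.333, 1.018, 4.956])

sq_dist = np.sum((x0 - x1) ** 2)  # squared Euclidean distance

manual = {}
library = {}
for gamma in (0.1, 1, 10, 100):
    manual[gamma] = np.exp(-gamma * sq_dist)
    library[gamma] = rbf_kernel(x0.reshape(1, -1), x1.reshape(1, -1), gamma=gamma)[0, 0]
    print(f"gamma = {gamma}: K(x0, x1) = {manual[gamma]:.3e}")
```

With a very large gamma, even two seeds of the same type look almost completely dissimilar to the kernel, which is one reason large gamma values tend to fit the training set closely.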

In [12]:
# RBF kernel SVM with chosen gamma value of 1

print("Gamma = 1")
print("Training Accuracy:", svm_gamma_1.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_1.score(x_val, y_val))
print("Testing Accuracy:", svm_gamma_1.score(x_test, y_test))
print(" ")

Gamma = 1
Training Accuracy: 1.0
Validation Accuracy: 0.9615384615384616
Testing Accuracy: 0.9333333333333333
 


### Final choice of Hyperparameters is:

**C = 10, degree = 6 for the Polynomial kernel and gamma = 1 for the RBF kernel**.

Out of the Linear, Polynomial and RBF kernel models, **I choose the RBF model to be the best for my data**.

The RBF model has the highest validation accuracy of the three (0.96, against 0.92 for both the Linear and Polynomial kernels), and its training accuracy (1.0) and testing accuracy (0.93) are high as well. Since all three accuracies are high in relation to each other, this suggests that the **RBF model is neither overfitting nor underfitting**. It does well at classifying the different types of seeds.

In [13]:
# predicting classes within the test data using the trained SVM

svm_gamma_1.predict(x_test)

array([3, 1, 2, 2, 3, 3, 2, 1, 1, 3, 1, 1, 1, 3, 3, 2, 2, 2, 2, 1, 1, 1,
       1, 2, 1, 3, 1, 1, 3, 1])

In [14]:
# displaying mean accuracy for these predictions

print("Testing accuracy: ", svm_gamma_1.score(x_test, y_test))

Testing accuracy:  0.9333333333333333


# Quantitative Analysis of the Support Vector Machine Algorithm

### My final SVM model is svm_gamma_1, which is an RBF kernel with gamma = 1

In [15]:
# accuracy for the training, validation and testing sets for svm_gamma_1

print("Gamma = 1")
print("Training Accuracy:", svm_gamma_1.score(x_train, y_train))
print("Validation Accuracy:", svm_gamma_1.score(x_val, y_val))
print("Testing Accuracy:", svm_gamma_1.score(x_test, y_test))
print(" ")

Gamma = 1
Training Accuracy: 1.0
Validation Accuracy: 0.9615384615384616
Testing Accuracy: 0.9333333333333333
 


In [16]:
# confusion matrix for the testing set (rows: true classes, columns: predicted classes)

confusion_matrix(y_test, svm_gamma_1.predict(x_test))

array([[14,  2,  0],
       [ 0,  6,  0],
       [ 0,  0,  8]])

**From the above confusion matrix, it can be observed that Class 1 (Type 1 seed) is the only class that is not always predicted correctly: two of its 16 test samples were falsely classified as Type 2 seed. The remaining 28 of 30 test samples are classified correctly, which gives the testing accuracy of 93%. So, this is the worst error in my case.**
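
The numbers in this confusion matrix can be turned into accuracy, per-class recall and per-class precision with a few lines of NumPy. The sketch below copies the matrix from the output above and assumes scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix copied from the output above
cm = np.array([[14, 2, 0],
               [0, 6, 0],
               [0, 0, 8]])

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions
recall = np.diag(cm) / cm.sum(axis=1)       # per true class (row-wise)
precision = np.diag(cm) / cm.sum(axis=0)    # per predicted class (column-wise)

print("accuracy:", accuracy)
for label, r, p in zip((1, 2, 3), recall, precision):
    print(f"class {label}: recall {r:.3f}, precision {p:.3f}")
```

The two misclassified Type 1 seeds lower the recall of Class 1 and the precision of Class 2, while Class 3 is unaffected.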

## Comparing the two notebooks

| Model | Training Accuracy | Validation Accuracy | Testing Accuracy |
| ----- | ----------------- | ------------------- | ---------------- |
| Decision Tree | 1.0 | 0.96 | 0.97 |
| Support Vector Machine | 1.0 | 0.96 | 0.93 |

I would recommend the **Decision Tree algorithm** to perform classification for my case. Both models give the same training and validation accuracies; however, the testing accuracy of the Decision Tree (0.97) is higher than that of the SVM (0.93).