# Simple support vector machine

## Introduction

The goal is to predict whether a banknote is authentic from four attributes of an image of the note: the variance, skewness, and kurtosis (spelled `Curtosis` in the dataset) of the wavelet-transformed image, and the entropy of the image.
This is a binary classification problem.

To do so, we will use a support vector machine (SVM).


## Import libraries

In [29]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns

## Load data

In [30]:
# Get path of the dataset
file_path = 'data/bill_authentication.csv'

# Read the csv
bill_data = pd.read_csv(file_path)

# Show first rows
bill_data.head()

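|   | Variance | Skewness | Curtosis | Entropy | Class |
|---|----------|----------|----------|---------|-------|
| 0 | 3.6216 | 8.6661 | -2.8073 | -0.44699 | 0 |
| 1 | 4.5459 | 8.1674 | -2.4586 | -1.4621 | 0 |
| 2 | 3.866 | -2.6383 | 1.9242 | 0.10645 | 0 |
| 3 | 3.4566 | 9.5228 | -4.0112 | -3.5944 | 0 |
| 4 | 0.32924 | -4.4552 | 4.5718 | -0.9888 | 0 |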
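
Before modelling, it can help to confirm that the table has no missing values and to look at how the classes are distributed. The sketch below is only an illustration: it retypes the five rows displayed above into a small frame (`sample`, a hypothetical name) so that it runs without the CSV file. On the real data, the same calls would be applied to `bill_data`.

```python
import pandas as pd

# The five rows shown above, retyped by hand so the example runs without the CSV file
sample = pd.DataFrame(
    {
        "Variance": [3.6216, 4.5459, 3.866, 3.4566, 0.32924],
        "Skewness": [8.6661, 8.1674, -2.6383, 9.5228, -4.4552],
        "Curtosis": [-2.8073, -2.4586, 1.9242, -4.0112, 4.5718],
        "Entropy": [-0.44699, -1.4621, 0.10645, -3.5944, -0.9888],
        "Class": [0, 0, 0, 0, 0],
    }
)

# Number of missing values in each column
missing_per_column = sample.isna().sum()

# Number of rows for each class label
class_counts = sample["Class"].value_counts()

print(missing_per_column)
print(class_counts)
```

These five rows all belong to class 0, so the class counts only become informative once the full dataset is used.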


## Visualising the data

In [44]:
import plotly.offline as py
import plotly.graph_objects as go

trace = go.Scatter3d(
    x = bill_data['Variance'],
    y = bill_data['Curtosis'],
    z = bill_data['Entropy'],
    mode = 'markers',
    marker = dict( 
        color = bill_data['Class'],
        opacity = 1),
)
layout2 = go.Layout(scene = {'aspectmode':'cube'})
layout2['scene'].update(xaxis = {'title':'Variance'}, yaxis = {'title':'Curtosis'}, zaxis = {'title':'Entropy'})
layout2['title'] = 'Bill authentication'

camera = dict( up=dict(x=1, y=0, z=0),           # direction that points up on the page (here x, i.e. Variance)
               center=dict(x=0, y=0, z=0),       # (0,0,0) is always the center of the domain, whatever the data values
               eye=dict(x=0.25, y=1.7, z=-1.2)   # camera position: doubling the values zooms out by 2; values below 1 place the camera inside the domain
             )

layout2['scene_camera'] = camera

fig = go.Figure(data=[trace], layout=layout2)
py.iplot(fig, show_link=False)

## Data preprocessing

In [45]:
# Take only the features of the dataset
X = bill_data.drop('Class', axis=1)

# The values to predict
y = bill_data['Class']

In [46]:
from sklearn.model_selection import train_test_split

# Split the dataset to get training values and testing values
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.33)
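
The split above is random, so the metrics reported later change slightly from run to run. One possible variation, not used in this notebook, is to fix `random_state` and pass `stratify` so that both subsets keep the same class proportions. The sketch below assumes a synthetic stand-in dataset from `make_classification` so that it runs without the CSV file; the names `X_demo`, `y_demo`, and `split` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the bill data: 1372 rows, 4 features, 2 classes
X_demo, y_demo = make_classification(
    n_samples=1372, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

def split(seed):
    # Fixed seed: same split on every run. stratify: same class ratio in both parts.
    return train_test_split(
        X_demo, y_demo, test_size=0.33, random_state=seed, stratify=y_demo
    )

X_tr, X_va, y_tr, y_va = split(seed=42)
X_tr2, X_va2, y_tr2, y_va2 = split(seed=42)

print(len(X_tr), len(X_va))        # sizes of the training and validation parts
print((X_va == X_va2).all())       # whether both calls produced the same validation rows
print(y_tr.mean(), y_va.mean())    # share of class 1 in each part
```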

## Model training

In [47]:
from sklearn.svm import SVC

# Specify SVM model
sv_classifier = SVC(kernel='linear')

# Train the model
sv_classifier.fit(X_train, y_train)
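
With `kernel='linear'`, the fitted model is a hyperplane described by one weight per feature and a bias term, and only the support vectors determine where it lies. The sketch below shows how these can be inspected in scikit-learn. It assumes synthetic data from `make_blobs` as a stand-in for the bill features; `clf`, `X_demo`, and `y_demo` are hypothetical names.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data with 4 features (stand-in for the bill features)
X_demo, y_demo = make_blobs(n_samples=200, centers=2, n_features=4, random_state=0)

clf = SVC(kernel="linear")
clf.fit(X_demo, y_demo)

# One weight per feature and a single bias term define the separating hyperplane
weights = clf.coef_        # shape (1, 4); only available for the linear kernel
bias = clf.intercept_      # shape (1,)

print("weights:", weights)
print("bias:", bias)
print("support vectors per class:", clf.n_support_)

# The decision score is w.x + b, which is proportional to the signed distance to the hyperplane
manual_scores = X_demo @ weights.ravel() + bias[0]
print(np.allclose(manual_scores, clf.decision_function(X_demo)))
```

In the notebook itself, the same attributes would be read from `sv_classifier` after the call to `fit`.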

## Make predictions

In [48]:
y_pred = sv_classifier.predict(X_valid)
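
For a two-class `SVC`, `predict` amounts to checking on which side of the hyperplane each sample falls. A minimal sketch of that relationship, again assuming synthetic `make_blobs` data and hypothetical names (`clf`, `scores`, `labels_from_scores`):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data with 4 features
X_demo, y_demo = make_blobs(n_samples=200, centers=2, n_features=4, random_state=1)
clf = SVC(kernel="linear").fit(X_demo, y_demo)

scores = clf.decision_function(X_demo)   # one signed score per sample
labels = clf.predict(X_demo)             # predicted class per sample

# A positive score maps to clf.classes_[1], any other score to clf.classes_[0]
labels_from_scores = clf.classes_[(scores > 0).astype(int)]
print(np.array_equal(labels, labels_from_scores))
```

The magnitude of the score can be read as a rough indication of how far a sample lies from the decision boundary.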

## Model validation

In [49]:
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_valid,y_pred))

print(classification_report(y_valid,y_pred))

[[250   3]
 [  0 200]]
              precision    recall  f1-score   support

           0       1.00      0.99      0.99       253
           1       0.99      1.00      0.99       200

    accuracy                           0.99       453
   macro avg       0.99      0.99      0.99       453
weighted avg       0.99      0.99      0.99       453
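
The figures in the classification report can be recomputed by hand from the confusion matrix, whose rows are the true classes and whose columns are the predicted classes. The sketch below uses the matrix printed above; `per_class_metrics` is a hypothetical helper, not a scikit-learn function.

```python
# Confusion matrix printed above: rows are true classes, columns are predicted classes
cm = [[250, 3],
      [0, 200]]

def per_class_metrics(cm, k):
    """Return (precision, recall, f1) for class k of a square confusion matrix."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)   # column sum: samples predicted as k
    actual_k = sum(cm[k])                     # row sum: samples that really are k
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = sum(sum(row) for row in cm)
accuracy = sum(cm[k][k] for k in range(len(cm))) / total

for k in range(len(cm)):
    p, r, f = per_class_metrics(cm, k)
    print(f"class {k}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The three misclassified notes are all genuine class-0 notes predicted as class 1, which is why class 0 loses recall and class 1 loses precision.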



In [53]:
# Keep the validation rows and attach the predictions (copy so that bill_data is left untouched)
df = bill_data.loc[y_valid.index].copy()
df['Prediction'] = y_pred
df

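|      | Variance | Skewness | Curtosis | Entropy | Class | Prediction |
|------|----------|----------|----------|---------|-------|------------|
| 8 | 3.20320 | 5.75880 | -0.75345 | -0.61251 | 0 | 0 |
| 510 | 3.57700 | 2.40040 | 1.89080 | 0.73231 | 0 | 0 |
| 192 | 1.45780 | -0.08485 | 4.17850 | 0.59136 | 0 | 0 |
| 1243 | -5.06760 | -5.18770 | 10.42660 | -0.86725 | 1 | 1 |
| 721 | -0.45062 | -1.36780 | 7.08580 | -0.40303 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... |
| 1282 | -1.99830 | -6.60720 | 4.82540 | -0.41984 | 1 | 1 |
| 335 | 3.46670 | -4.07240 | 4.28820 | 1.54180 | 0 | 0 |
| 334 | 3.92940 | 1.41120 | 1.80760 | 0.89782 | 0 | 0 |
| 1209 | -0.69078 | -0.50077 | -0.35417 | 0.47498 | 1 | 1 |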


## Conclusion

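In this notebook, we used a support vector machine to perform binary classification and predict whether a banknote is authentic.

The main thing I take away is that SVMs tend to work well on small to medium-sized datasets, including ones whose classes are hard to separate. Here, a linear kernel was already enough to reach about 99% accuracy on the validation set.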
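
To illustrate what a "complex" dataset can mean for an SVM, the sketch below compares a linear and an RBF kernel on a small synthetic dataset whose classes form two concentric rings. This is an assumption-based illustration built with `make_circles`, not an experiment on the bill data; `linear_acc` and `rbf_acc` are hypothetical names.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate the classes
X_demo, y_demo = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# Mean accuracy over 5 cross-validation folds for each kernel
linear_acc = cross_val_score(SVC(kernel="linear"), X_demo, y_demo, cv=5).mean()
rbf_acc = cross_val_score(SVC(kernel="rbf"), X_demo, y_demo, cv=5).mean()

print(f"linear kernel: {linear_acc:.2f}")
print(f"rbf kernel:    {rbf_acc:.2f}")
```

On data like this, the kernel choice matters far more than it did for the bill data, where a linear boundary was already sufficient.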