<a href="https://colab.research.google.com/github/M-H-Amini/MachineLearning-AUT/blob/master/MLe_Lec4_SVM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# In The Name Of ALLAH
# Machine Learning *elementary* Course
## Amirkabir University of Technology
### Mohammad Hossein Amini (mhamini@aut.ac.ir)
# Lecture 4 - Support Vector Machines

<img src="https://drive.google.com/uc?id=144SDpgv7EEy6Og1ZFNIv_nBaugKGiSCE" width="400">



# Introduction

The theory has been covered in the video lectures. Here we put some of it into practice.

First of all, we should import some modules.

In [0]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import svm

Let's first see how an SVM works on the **Heart Disease** dataset.

# Heart Disease Dataset

In [0]:
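# Load the data and shuffle the rows so that the split below is random.
ds = pd.read_csv('heart.csv')
ds = ds.sample(frac=1).reset_index(drop=True)
# One-hot encode the categorical columns, then drop the originals.
a = pd.get_dummies(ds['cp'], prefix = "cp")
b = pd.get_dummies(ds['thal'], prefix = "thal")
c = pd.get_dummies(ds['slope'], prefix = "slope")
frames = [ds, a, b, c]
ds = pd.concat(frames, axis = 1)
ds = ds.drop(columns=['cp', 'thal', 'slope'])
# Separate the target (as a column vector) from the input features.
y = np.array(ds['target'])
y = y[:, np.newaxis]
ds = ds.drop(columns=['target'])
# Cast to float so that boolean dummy columns (newer pandas) become 0/1.
x = np.array(ds, dtype=float)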

split = 0.8
no_of_trains = int(ds.shape[0]*split)
X_train = x[:no_of_trains, :]
Y_train = y[:no_of_trains, :]
X_test = x[no_of_trains:, :]
Y_test = y[no_of_trains:, :]
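
The preprocessing above can also be tried on a tiny example. The sketch below uses a made-up three-column frame (the values are illustrative only, not rows from `heart.csv`) and lets `pd.get_dummies` encode and drop the categorical column in one call through its `columns` argument, which is an alternative to the explicit `concat`/`drop` steps.

```python
import pandas as pd

# Hypothetical toy frame standing in for heart.csv; 'cp' is categorical.
toy = pd.DataFrame({
    'age': [63, 37, 41, 56, 57],
    'cp': [3, 2, 1, 1, 0],
    'target': [1, 1, 1, 0, 0],
})

# With `columns=`, get_dummies encodes 'cp' and drops the original column.
encoded = pd.get_dummies(toy, columns=['cp'], prefix='cp', dtype=float)

# Separate the features from the target, then split 80/20 by position.
y = encoded['target'].to_numpy()
x = encoded.drop(columns=['target']).to_numpy(dtype=float)
no_of_trains = int(len(encoded) * 0.8)
x_train, x_test = x[:no_of_trains], x[no_of_trains:]
y_train, y_test = y[:no_of_trains], y[no_of_trains:]

print(list(encoded.columns))
print(x_train.shape, x_test.shape)
```

Each distinct value of `cp` becomes its own 0/1 column, so the model does not treat the category codes as ordered numbers.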

In [0]:
ds.head()

In [0]:
# Polynomial kernel of degree 5; a large C means weak regularization.
model = svm.SVC(degree=5 , C = 1000, kernel='poly')
model.fit(X_train, Y_train[:,0])
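
Kernel SVMs are usually sensitive to the scale of the input features, and the heart data mixes columns with very different ranges. As a hedged sketch of one common remedy (not part of the original notebook), the block below standardizes the features inside a scikit-learn pipeline. The data is synthetic, and `degree=3` and `C=1.0` are illustrative values rather than tuned ones.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data standing in for the heart features (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# The scaler is fitted on the training rows only and reused at prediction time.
scaled_model = make_pipeline(StandardScaler(), SVC(kernel='poly', degree=3, C=1.0))
scaled_model.fit(X_tr, y_tr)

test_accuracy = scaled_model.score(X_te, y_te)
print(f'Test accuracy: {test_accuracy:.3f}')
```

Putting the scaler in the pipeline keeps the test rows out of the statistics used for scaling.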

In [0]:
predicted = model.predict(X_test)

In [0]:
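def evaluate(x, y, show=True):
  # Compare the global `model`'s predictions on x with the targets y.
  # Returns (accuracy, error rate).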
  wrongs = 0
  corrects = 0
  predicted = model.predict(x)
  for i in range(y.shape[0]):
    if show:
      print(f'No {i+1}, Target: {y[i, 0]}, Predicted: {predicted[i]}')
    if y[i, 0] == predicted[i]:
      corrects += 1
    else:
      wrongs += 1
  print(f'Corrects: {corrects}, Wrongs: {wrongs}')
  return corrects/y.shape[0], wrongs/y.shape[0]

In [0]:
evaluate(X_test, Y_test)
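
The same numbers can be obtained from `sklearn.metrics`. The sketch below uses small made-up label arrays (hypothetical values, not results from the heart model) to show `accuracy_score` and `confusion_matrix`, which also tells us which class the mistakes fall in.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical targets and predictions, for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

accuracy = accuracy_score(y_true, y_pred)  # fraction of correct predictions
error_rate = 1 - accuracy                  # fraction of wrong predictions
cm = confusion_matrix(y_true, y_pred)      # rows: true class, columns: predicted class

print(accuracy, error_rate)
print(cm)
```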

<img src="https://drive.google.com/uc?id=1J3RkCDoaa87BfIfhuKUSjoJuqVe5wzUz" width="350">



# MNIST Handwritten Digits

A famous dataset in machine learning and deep learning is *MNIST Handwritten Digits*. It is a collection of 70,000 labeled images of handwritten digits (60,000 for training and 10,000 for testing), each 28×28 pixels. A common way to compare learning algorithms is to evaluate their performance on MNIST.

<img src="https://drive.google.com/uc?id=1WXwOzPYIbfVjKA2ArA2yJz-lLp9DaJ2n" width="300">


First, let's load the small version of this dataset that ships with Colab.

In [0]:
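# The Colab sample CSVs have no header row, so don't read the first row as column names.
mnist_train = pd.read_csv('sample_data/mnist_train_small.csv', header=None)
mnist_test = pd.read_csv('sample_data/mnist_test.csv', header=None)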

In [0]:
mnist_train.head()

The first column of the dataset is the target, so we take it as our label and use the remaining columns as the input.

In [0]:
inputs_train = mnist_train.drop(columns=mnist_train.columns[0])
targets_train = mnist_train[mnist_train.columns[0]]
inputs_test = mnist_test.drop(columns=mnist_test.columns[0])
targets_test = mnist_test[mnist_test.columns[0]]

In [0]:
inputs_train_arr = np.array(inputs_train)
targets_train_arr = np.array(targets_train)
inputs_test_arr = np.array(inputs_test)
targets_test_arr = np.array(targets_test)
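
As a small illustration of this label/pixel split, the sketch below does the same thing by position with `iloc` on a made-up frame (two hypothetical "images" of three pixels each) and converts the result to NumPy arrays in the same step.

```python
import pandas as pd

# Hypothetical 3-pixel "images": column 0 is the label, the rest are pixels.
toy = pd.DataFrame([[7, 0, 128, 255],
                    [2, 10, 0, 64]])

targets = toy.iloc[:, 0].to_numpy()   # first column -> labels
inputs = toy.iloc[:, 1:].to_numpy()   # remaining columns -> pixel values

print(targets)
print(inputs.shape)
```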

It's a good idea to have a function for showing each digit.

In [0]:
def showDigit(image, target, predicted=None):
  image = np.reshape(image, (28, 28))
  plt.figure()
  if predicted is not None:
    plt.title(f'Target: {target}, Predicted: {predicted}')
  else:
    plt.title(f'Target: {target}')
  plt.imshow(image)
  plt.show()

index = 0
showDigit(inputs_train_arr[index], targets_train_arr[index])

Time to create our *SVM* model and train it. We use a Gaussian (RBF) kernel.

In [0]:
model = svm.SVC(kernel = 'rbf', C=100)
model.fit(inputs_train_arr, targets_train_arr)
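
For reference, the Gaussian (RBF) kernel measures the similarity of two inputs as $K(a, b) = \exp(-\gamma \lVert a - b \rVert^2)$. The sketch below computes it by hand for two made-up vectors and compares the result with scikit-learn's `rbf_kernel`. The value `gamma = 0.5` is illustrative only; `SVC` picks its own value through `gamma='scale'` unless told otherwise.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def gaussian_kernel(a, b, gamma):
    """K(a, b) = exp(-gamma * ||a - b||^2) for two 1-D vectors."""
    diff = a - b
    return np.exp(-gamma * np.dot(diff, diff))

# Two hypothetical feature vectors and an illustrative gamma.
a = np.array([0.0, 1.0, 2.0])
b = np.array([1.0, 1.0, 0.0])
gamma = 0.5

manual = gaussian_kernel(a, b, gamma)
reference = rbf_kernel(a.reshape(1, -1), b.reshape(1, -1), gamma=gamma)[0, 0]
print(manual, reference)
```

The kernel equals 1 for identical inputs and decays toward 0 as they move apart, with `gamma` controlling how fast.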

In [0]:
index = 15
predicted = model.predict(inputs_test_arr[index:index+1, :])  # predict expects a 2-D array
showDigit(inputs_test_arr[index], targets_test_arr[index], predicted[0])  # unpack the single prediction


Finally, let's measure the model's accuracy on the whole test set.

In [0]:
predicted = model.predict(inputs_test_arr)
corrects = 0
wrongs = 0
for i in range(inputs_test_arr.shape[0]):
  if predicted[i] == targets_test_arr[i]:
    corrects += 1
  else:
    wrongs += 1

print(f'Corrects: {corrects}, Wrongs: {wrongs}')
print(f'Accuracy: {corrects/(corrects+wrongs)}')
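
The counting loop can also be written with NumPy array operations. The sketch below uses short made-up arrays (hypothetical labels, not the MNIST results) to compute the overall accuracy and, as an extra, the accuracy for each class, which shows which digits the model struggles with.

```python
import numpy as np

# Hypothetical labels and predictions standing in for the MNIST test results.
targets = np.array([0, 1, 1, 2, 2, 2, 1, 0])
predicted = np.array([0, 1, 2, 2, 2, 1, 1, 0])

is_correct = predicted == targets
accuracy = is_correct.mean()

# Per-class accuracy: correct predictions per class divided by class frequency.
n_classes = targets.max() + 1
per_class = (np.bincount(targets[is_correct], minlength=n_classes)
             / np.bincount(targets, minlength=n_classes))

print(accuracy)
print(per_class)
```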