<a href="https://colab.research.google.com/github/solmvz/MLActivities/blob/main/SupportVectorMachine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Support Vector Machine Implementation

In this session I implemented SVM using the cost function hinge loss and stochastic gradient descent. In SVM, the objective is to find a hyperplane that separates +ve and -ve examples with the largest margin while keeping the misclassification as low as possible. The dataset that I'll be using is a breast cancer diagnostic dataset available on [kaggle](https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data). The features in the dataset are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe the characteristics of the cell nuclei present in the image. Based on these features we will train our SVM model to detect if the mass is benign B (generally harmless) or malignant M (cancerous).

In [1]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score, precision_score
from sklearn.utils import shuffle

In [15]:
class SVM():
  def __init__(self, X, y, reg=10000, lr=0.000001):
    self.X = X
    self.y = y
    self.reg = reg
    self.lr = lr

  def compute_cost(self, X, y, W):
    # Calculate Hinge Loss
    N = X.shape[0]
    distances = 1 - y*(np.dot(X, W))
    distances[distances < 0] = 0
    cost = 1/2*(np.dot(W, W)) + self.reg*(np.sum(distances)/N)
    return cost

  def compute_gradient(self, W, X_batch, y_batch):

    y_batch = np.array([y_batch])
    X_batch = np.array([X_batch])

    distance = 1 - (y_batch*np.dot(X_batch,W))
    dw = np.zeros(len(W))

    for i, d in enumerate(distance):
      if max(0, d) == 0:
        di = W
      else:
        di = W - (self.reg*y_batch[i]*X_batch[i])
      dw += di

    dw = dw/len(Y)
    return dw

  def gradient_descent(self):
    max_epochs = 5000
    weights = np.zeros(self.X.shape[1])
    for epoch in range(1, max_epochs):
      for i, x in enumerate(self.X):
        dw = self.compute_gradient(weights, x, self.y[i])
        weights = weights - (self.lr*dw)
    return weights


In [10]:
data = pd.read_csv('/content/data.csv')
diagnosis_map = {'M':1, 'B':-1}
data['diagnosis'] = data['diagnosis'].map(diagnosis_map)
data.drop(data.columns[[-1, 0]], axis=1, inplace=True)

Y = data.loc[:, 'diagnosis']
X = data.iloc[:, 1:]
# Normalise Data
X_norm = MinMaxScaler().fit_transform(X.values)
X = pd.DataFrame(X_norm)

# Insert 1 in each row for intercept b
X.insert(loc=len(X.columns), column='intercept', value=1)
# Split train and test 
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

In [18]:
model = SVM(X_train.to_numpy(), y_train.to_numpy())
print("training started...")
W = model.gradient_descent()
print("training finished.")
print("weights are: {}".format(W))

training started...
training finished.
weights are: [ 0.42474349  0.22987926  0.47739123  0.60566123 -0.19579285  0.46226968
  1.01518923  1.28085585 -0.14721568 -0.59429363  0.47925372 -0.30330219
  0.39870367  0.40360628 -0.30615331 -0.21558165 -0.17821916 -0.20511514
 -0.26319069 -0.31715063  0.80118522  0.44786878  0.78480886  0.77019176
  0.24604272  0.57781986  0.75480969  1.19301864  0.35981256  0.12602673
 -2.76282531]


In [21]:
y_pred = np.array([])
for i in range(X_test.shape[0]):
  y_p = np.sign(np.dot(W, X_test.to_numpy()[i]))
  y_pred = np.append(y_pred, y_p)

print("accuracy on test dataset: {}".format(accuracy_score(y_test.to_numpy(), y_pred)))
print("recall on test dataset: {}".format(recall_score(y_test.to_numpy(), y_pred)))
print("precision on test dataset: {}".format(recall_score(y_test.to_numpy(), y_pred)))

accuracy on test dataset: 0.956140350877193
recall on test dataset: 0.9069767441860465
precision on test dataset: 0.9069767441860465
