# Introduction to SVM for Beginners
- This notebook shows how to implement support vector machines (SVMs) with scikit-learn.
- It walks through the different kernel types.
- It then looks at the hyperparameters to tune.

### Import Packages

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

### Read the data
- Get the data. The data used for this tutorial can be downloaded from https://www.kaggle.com/c/intercampusai2019.
- Separate the class label from the original data.
- Scale the features, then split the data into train and test sets.

In [0]:
# read the data
data = pd.read_csv('train.csv')

# select only numeric datatypes
data = data.select_dtypes(['float64', 'int64'])

# separate the class label from the feature columns
class_label = data['Promoted_or_Not']
train_data = data.drop(columns=['Promoted_or_Not'])

# standardize the features (zero mean, unit variance)
X = StandardScaler().fit_transform(train_data)

# split the scaled data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, class_label, test_size=0.3, random_state=42)
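
The cell above scales the full dataset before splitting it, so the test rows contribute to the scaler's mean and variance. A minimal sketch of one alternative, assuming a synthetic dataset from `make_classification` as a stand-in for `train.csv` (all `*_demo` and `Xd_*` names are illustrative and not part of the tutorial), is to split first and let a pipeline fit the scaler on the training rows only:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for train.csv (assumption: the real data is not used here)
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=42)

# Split first, so the test rows never influence the scaler
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

# The pipeline fits the scaler on the training rows only and reuses
# those statistics whenever it transforms data for predict/score
scaled_svm = make_pipeline(StandardScaler(), SVC(kernel='linear'))
scaled_svm.fit(Xd_train, yd_train)

demo_test_score = scaled_svm.score(Xd_test, yd_test)
print('Test score is {}'.format(demo_test_score))
```

On a large dataset the difference in scores is often small, but the pipeline keeps the test set untouched until evaluation.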

### Linear SVM
- The first model we will build is the linear SVM. A linear kernel is appropriate when the classes are (approximately) linearly separable, so that a single hyperplane in feature space can separate them.

- The model is built by passing *linear* as the value of the kernel parameter, as seen in the code block below.

- Fit the model and predict on both the train and test datasets.

In [0]:
# instantiate the model
linear_svm = svm.SVC(kernel='linear')

# fit the model on the training data
linear_svm.fit(X_train, y_train)

# make predictions on train data
train_pred = linear_svm.predict(X_train)

# make prediction on test data
test_pred = linear_svm.predict(X_test)
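
With a linear kernel, the fitted hyperplane can be read directly from the model as `w . x + b = 0`. The sketch below assumes two synthetic clusters from `make_blobs` rather than the tutorial data (`X_blobs`, `blob_svm` and `manual_scores` are illustrative names) and checks that `decision_function` matches the manual computation from `coef_` and `intercept_`:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two synthetic clusters (assumption: stand-in for the tutorial data)
X_blobs, y_blobs = make_blobs(n_samples=100, centers=2, cluster_std=1.0, random_state=0)

blob_svm = SVC(kernel='linear')
blob_svm.fit(X_blobs, y_blobs)

# For a binary linear SVM the hyperplane is w . x + b = 0
w = blob_svm.coef_[0]
b = blob_svm.intercept_[0]

# The signed score of each point should equal w . x + b
manual_scores = X_blobs @ w + b
print(np.allclose(manual_scores, blob_svm.decision_function(X_blobs)))
```

Note that `coef_` is only available for the linear kernel; the other kernels do not expose an explicit weight vector.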

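#### Accuracy Score
- Let's see how the model performs on the data.
- The model scores about 91.5% accuracy on both the training and test data. The two scores are nearly identical, so there is no sign of overfitting. Keep in mind that accuracy alone can be misleading if one class is much more common than the other.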

In [7]:
from sklearn.metrics import accuracy_score

# get the train score (accuracy_score expects y_true first, then y_pred)
train_score = accuracy_score(y_train, train_pred)
print('Train score is {}'.format(train_score))

# get the test score
test_score = accuracy_score(y_test, test_pred)
print('Test score is {}'.format(test_score))

Train score is 0.9155045118949959
Test score is 0.9151731338089438
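
A high accuracy score does not always mean the model has learned something useful. The sketch below uses hypothetical labels (90 negatives and 10 positives, not taken from the tutorial dataset) and a "model" that always predicts the majority class. It is meant only to illustrate why a confusion matrix or balanced accuracy is worth checking alongside plain accuracy:

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, confusion_matrix

# Hypothetical labels: 90 negatives, 10 positives (not from the tutorial dataset)
y_true_demo = np.array([0] * 90 + [1] * 10)

# A "model" that always predicts the majority class
y_pred_demo = np.zeros(100, dtype=int)

print(accuracy_score(y_true_demo, y_pred_demo))           # 0.9
print(balanced_accuracy_score(y_true_demo, y_pred_demo))  # 0.5

# Rows are true classes, columns are predicted classes
print(confusion_matrix(y_true_demo, y_pred_demo))
```

The same three calls can be applied to `y_test` and `test_pred` from the cell above to see whether the score there is driven by one class.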


### RBF Kernel
- The rbf kernel is used for non-linear datasets. Use it when a linear model cannot find a hyperplane that separates your data accurately.
- Implementing the rbf kernel only requires assigning the *rbf* value to the kernel parameter.
- The implementation can be seen below.

In [0]:
# instantiate the model
rbf_svm = svm.SVC(kernel='rbf')

# fit the model on the training data
rbf_svm.fit(X_train, y_train)

# make predictions on train data
rbf_train_pred = rbf_svm.predict(X_train)

# make prediction on test data
rbf_test_pred = rbf_svm.predict(X_test)
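
To make the rbf kernel less of a black box: it scores the similarity of two points as `exp(-gamma * ||x - x'||^2)`, so nearby points score close to 1 and distant points close to 0. The sketch below computes this by hand on a few random toy points (`rbf_points` and `gamma_rbf` are illustrative values, not tutorial settings) and compares it with scikit-learn's `rbf_kernel` helper:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
rbf_points = rng.normal(size=(4, 3))  # hypothetical toy points
gamma_rbf = 0.5                       # illustrative value

# Squared Euclidean distance between every pair of points
sq_dists = ((rbf_points[:, None, :] - rbf_points[None, :, :]) ** 2).sum(axis=-1)

# K(x, x') = exp(-gamma * ||x - x'||^2)
manual_rbf = np.exp(-gamma_rbf * sq_dists)

print(np.allclose(manual_rbf, rbf_kernel(rbf_points, gamma=gamma_rbf)))
```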

#### Accuracy Score
- When evaluated on the train and test sets, the rbf kernel performed slightly better than the linear kernel, with an accuracy score of approximately 92% on both.

In [9]:
# get the train score (y_true first, then y_pred)
rbf_train_score = accuracy_score(y_train, rbf_train_pred)
print('Train score is {}'.format(rbf_train_score))

# get the test score
rbf_test_score = accuracy_score(y_test, rbf_test_pred)
print('Test score is {}'.format(rbf_test_score))

Train score is 0.9214333656499366
Test score is 0.9196102314250914


### Poly Kernel
- The poly kernel is used for highly complex non-linear data.
- The implementation is similar to the examples above.

In [0]:
# instantiate the model
poly_svm = svm.SVC(kernel='poly')

# fit the model on the training data
poly_svm.fit(X_train, y_train)

# make predictions on train data
poly_train_pred = poly_svm.predict(X_train)

# make prediction on test data
poly_test_pred = poly_svm.predict(X_test)
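
For reference, the polynomial kernel scores two points as `(gamma * <x, x'> + coef0) ** degree`. The sketch below checks this formula against scikit-learn's `polynomial_kernel` helper on random toy points (`poly_points` and the `*_poly` values are illustrative, not tutorial settings), and also shows that with `degree=1`, `gamma=1` and `coef0=0` it reduces to the plain dot product used by the linear kernel:

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel

rng = np.random.default_rng(0)
poly_points = rng.normal(size=(4, 3))  # hypothetical toy points

gamma_poly, coef0_poly, degree_poly = 0.5, 1.0, 3  # illustrative values

# K(x, x') = (gamma * <x, x'> + coef0) ** degree
manual_poly = (gamma_poly * (poly_points @ poly_points.T) + coef0_poly) ** degree_poly
sk_poly = polynomial_kernel(
    poly_points, degree=degree_poly, gamma=gamma_poly, coef0=coef0_poly
)
print(np.allclose(manual_poly, sk_poly))

# degree=1, gamma=1, coef0=0 gives the plain dot product (linear kernel)
degree_one = polynomial_kernel(poly_points, degree=1, gamma=1.0, coef0=0.0)
print(np.allclose(degree_one, linear_kernel(poly_points)))
```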

#### Accuracy Score
- We get approximately 92% on both the train and test sets, about the same as the rbf kernel.

In [11]:
# get the train score (y_true first, then y_pred)
poly_train_score = accuracy_score(y_train, poly_train_pred)
print('Train score is {}'.format(poly_train_score))

# get the test score
poly_test_score = accuracy_score(y_test, poly_test_pred)
print('Test score is {}'.format(poly_test_score))

Train score is 0.9210604817659781
Test score is 0.9190882199408387


### Sigmoid Kernel
- Implementing this is similar to everything we have been doing. The code is shown below.


In [0]:
# instantiate the model
sig_svm = svm.SVC(kernel='sigmoid')

# fit the model on the training data
sig_svm.fit(X_train, y_train)

# make predictions on train data
sig_train_pred = sig_svm.predict(X_train)

# make prediction on test data
sig_test_pred = sig_svm.predict(X_test)
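
For reference, the sigmoid kernel scores two points as `tanh(gamma * <x, x'> + coef0)`. The sketch below checks this against scikit-learn's `sigmoid_kernel` helper on random toy points (`sig_points`, `gamma_sig` and `coef0_sig` are illustrative values, not tutorial settings):

```python
import numpy as np
from sklearn.metrics.pairwise import sigmoid_kernel

rng = np.random.default_rng(0)
sig_points = rng.normal(size=(4, 3))  # hypothetical toy points

gamma_sig, coef0_sig = 0.5, 0.0  # illustrative values

# K(x, x') = tanh(gamma * <x, x'> + coef0)
manual_sig = np.tanh(gamma_sig * (sig_points @ sig_points.T) + coef0_sig)
sk_sig = sigmoid_kernel(sig_points, gamma=gamma_sig, coef0=coef0_sig)

print(np.allclose(manual_sig, sk_sig))
```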

#### Accuracy Score
- This performed worse than all the other models.
- It reached roughly 85% to 86% accuracy on both train and test.

In [13]:
# get the train score (y_true first, then y_pred)
sig_train_score = accuracy_score(y_train, sig_train_pred)
print('Train score is {}'.format(sig_train_score))

# get the test score
sig_test_score = accuracy_score(y_test, sig_test_pred)
print('Test score is {}'.format(sig_test_score))

Train score is 0.8583041240957566
Test score is 0.8547938054637202


### Hyperparameter Tuning
- We will now learn about the different parameters for SVM.


#### Gamma
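- Gamma is a hyperparameter of the non-linear kernels (rbf, poly and sigmoid). For the rbf kernel it controls how far the influence of a single training example reaches: a low value means far, a high value means close. It is advisable to start with a low gamma value and work your way up, keeping in mind that a high gamma value can cause your model to overfit.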

In [0]:
gammas = [0.1, 1, 10, 100]
for gamma in gammas:
    # rbf kernel with the current gamma value
    tuned_rbf_gamma = svm.SVC(kernel='rbf', gamma=gamma)
    tuned_rbf_gamma.fit(X_train, y_train)
    pred = tuned_rbf_gamma.predict(X_test)
    print('gamma={}: test score is {}'.format(gamma, accuracy_score(y_test, pred)))
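
Overfitting from a high gamma is easier to spot when the train and test scores are printed side by side. The sketch below assumes a small noisy dataset from `make_moons` as a stand-in for the tutorial data (`X_moons`, `gamma_scores` and the gamma values are illustrative). A train score that climbs while the test score stalls or drops would be the sign to watch for:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy synthetic data (assumption: stand-in for the tutorial dataset)
X_moons, y_moons = make_moons(n_samples=300, noise=0.3, random_state=0)
Xm_train, Xm_test, ym_train, ym_test = train_test_split(
    X_moons, y_moons, test_size=0.3, random_state=0
)

gamma_scores = {}
for g in [0.01, 1, 100]:
    moon_svm = SVC(kernel='rbf', gamma=g).fit(Xm_train, ym_train)
    # Store (train accuracy, test accuracy) for this gamma
    gamma_scores[g] = (
        moon_svm.score(Xm_train, ym_train),
        moon_svm.score(Xm_test, ym_test),
    )
    print('gamma={}: train={:.3f}, test={:.3f}'.format(g, *gamma_scores[g]))
```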

#### C
- C is the regularization parameter that penalizes misclassified training points. It controls the trade-off between a smooth, wide-margin decision boundary (low C) and classifying every training point correctly (high C).


In [0]:
cs = [0.1, 1, 10, 100, 1000]
for c in cs:
    # rbf kernel with the current C value
    tuned_rbf_c = svm.SVC(kernel='rbf', C=c)
    tuned_rbf_c.fit(X_train, y_train)
    pred = tuned_rbf_c.predict(X_test)
    print('C={}: test score is {}'.format(c, accuracy_score(y_test, pred)))
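
One way to see what C does to the margin is to count the support vectors. A small C tolerates many points inside the margin, which typically means more support vectors, while a large C narrows the margin. The sketch below assumes synthetic data with some label noise from `make_classification` (`X_c`, `sv_counts` and the C values are illustrative, not tutorial settings):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data with 10% label noise (assumption: stand-in for the tutorial data)
X_c, y_c = make_classification(n_samples=300, n_features=5, flip_y=0.1, random_state=0)

sv_counts = {}
for c in [0.01, 1, 100]:
    model_c = SVC(kernel='rbf', C=c).fit(X_c, y_c)
    # n_support_ holds the number of support vectors per class
    sv_counts[c] = int(model_c.n_support_.sum())
    print('C={}: {} support vectors'.format(c, sv_counts[c]))
```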

#### Degree
- This parameter applies only to the poly kernel: it is the degree of the polynomial used to find the decision boundary that splits the data. Using degree=1 is essentially the same as using a linear kernel, and the higher the degree, the longer the model takes to train. A degree of 0 makes the kernel constant, so it is only useful as a baseline.

In [0]:
degrees = [0, 1, 2, 3, 4, 5, 6]
for degree in degrees:
    # poly kernel with the current degree
    tuned_poly_d = svm.SVC(kernel='poly', degree=degree)
    tuned_poly_d.fit(X_train, y_train)
    pred = tuned_poly_d.predict(X_test)
    print('degree={}: test score is {}'.format(degree, accuracy_score(y_test, pred)))
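
The loops above tune one parameter at a time and score each setting on the test set, which means the test set ends up influencing the choice. A common alternative is a cross-validated grid search on the training data only. The sketch below is one possible setup, assuming synthetic data from `make_classification` and a deliberately small grid (`X_gs`, `param_grid` and the grid values are illustrative, not recommendations for the tutorial dataset):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the tutorial data (assumption)
X_gs, y_gs = make_classification(n_samples=200, n_features=5, random_state=0)
Xg_train, Xg_test, yg_train, yg_test = train_test_split(
    X_gs, y_gs, test_size=0.3, random_state=0
)

# One grid per kernel, so each kernel is only paired with parameters it uses
param_grid = [
    {'svc__kernel': ['rbf'], 'svc__C': [0.1, 1, 10], 'svc__gamma': [0.01, 0.1, 1]},
    {'svc__kernel': ['poly'], 'svc__C': [0.1, 1, 10], 'svc__degree': [2, 3]},
]

# 3-fold cross-validation on the training set selects the best combination
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=3)
search.fit(Xg_train, yg_train)

print(search.best_params_)
print('CV score: {:.3f}'.format(search.best_score_))

# The test set is used once, after the hyperparameters are chosen
print('Test score: {:.3f}'.format(search.score(Xg_test, yg_test)))
```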