# Support Vector Machines (SVMs)
The dataset describes users of a social networking site, and we use that information as features for our ML model. The model predicts whether a user who clicks on an ad on the site goes on to buy a particular product. This is a binary classification problem: the output is either yes (the user buys the product) or no (the user does not).

**Importing all the libraries**

In [15]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

**Importing the dataset**

In [16]:
dataset = pd.read_csv('Social_Network_Ads.csv')
dataset.head()

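|   | User ID | Gender | Age | EstimatedSalary | Purchased |
|---|---------|--------|-----|-----------------|-----------|
| 0 | 15624510 | Male | 19 | 19000 | 0 |
| 1 | 15810944 | Male | 35 | 20000 | 0 |
| 2 | 15668575 | Female | 26 | 43000 | 0 |
| 3 | 15603246 | Female | 27 | 57000 | 0 |
| 4 | 15804002 | Male | 19 | 76000 | 0 |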


**Filtering out columns to retain age and salary columns**

In [17]:
# We use age and estimated salary to predict whether a person purchased.
# User ID and gender are left out of the feature set.
x = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
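
The position-based selection above depends on the column order of the CSV. As a small sketch, the five rows shown by `dataset.head()` are rebuilt by hand below (the `sample` frame is only an illustration, not the full dataset) to check that positions 2 and 3 are `Age` and `EstimatedSalary`, and that selecting by name gives the same arrays.

```python
import pandas as pd

# The five rows shown by dataset.head(), rebuilt by hand for illustration.
sample = pd.DataFrame({
    "User ID": [15624510, 15810944, 15668575, 15603246, 15804002],
    "Gender": ["Male", "Male", "Female", "Female", "Male"],
    "Age": [19, 35, 26, 27, 19],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 76000],
    "Purchased": [0, 0, 0, 0, 0],
})

# Position-based selection, as in the cell above.
x_by_position = sample.iloc[:, [2, 3]].values
y_by_position = sample.iloc[:, 4].values

# Name-based selection picks the same columns and survives column reordering.
x_by_name = sample[["Age", "EstimatedSalary"]].values
y_by_name = sample["Purchased"].values

print(x_by_name.shape)  # (5, 2)
```

Selecting by name is a design choice rather than a requirement; the notebook's `iloc` call works as long as the file keeps this column order.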

**Data split for training and testing** 

In [18]:
from sklearn.model_selection import train_test_split
# Hold out 25% of the data for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
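
To see what `test_size = 0.25` does to the row counts, here is a minimal sketch on placeholder data. The 400-row size is an assumption, chosen to be consistent with the 100 test samples reported further down; the random features and labels are not the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 400 rows, two features, binary labels (not the real dataset).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(400, 2))
y_demo = rng.integers(0, 2, size=400)

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.25, random_state=0
)
print(x_tr.shape, x_te.shape)  # (300, 2) (100, 2)

# The same random_state gives the same split on every run.
x_tr2, x_te2, _, _ = train_test_split(
    x_demo, y_demo, test_size=0.25, random_state=0
)
print(np.array_equal(x_te, x_te2))  # True
```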

**Feature scaling with StandardScaler**

In [19]:
from sklearn.preprocessing import StandardScaler
# StandardScaler standardizes features by removing the mean and scaling to unit variance.
# It is fit on the training data only; the test data is transformed with the same mean and variance.
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test) 
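
As a rough check of what the scaler computes, the sketch below compares `StandardScaler` with the formula z = (x - mean) / std applied by hand. It uses the five rows shown by `dataset.head()`, split arbitrarily into four "training" rows and one "test" row for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Age and EstimatedSalary from the five head() rows; the 4/1 split is arbitrary.
train = np.array([[19, 19000], [35, 20000], [26, 43000], [27, 57000]], dtype=float)
test = np.array([[19, 76000]], dtype=float)

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # reuses the training statistics

# Manual equivalent, using statistics from the training rows only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_train = (train - mean) / std
manual_test = (test - mean) / std

print(np.allclose(train_scaled, manual_train))  # True
print(np.allclose(test_scaled, manual_test))    # True
```

Scaling matters for an RBF-kernel SVM because the kernel depends on distances between points: without it, salary (tens of thousands) would dominate age (tens).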

**Building the Model using RBF Kernel**

In [20]:
from sklearn.svm import SVC
classifier_rbf = SVC(kernel='rbf', random_state=0)  # defining the SVM classifier
classifier_rbf.fit(x_train, y_train) # training the classifier on the training data 
y_pred_rbf = classifier_rbf.predict(x_test) # predicting the y values of x_test 
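
To make the role of the RBF kernel more concrete, here is a minimal sketch that rebuilds a fitted classifier's decision scores from its support vectors. It assumes synthetic `make_moons` data in place of the scaled age and salary features, and a fixed `gamma = 0.5` (the cell above uses the default `gamma`) so the kernel can be recomputed by hand.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic two-feature data standing in for the scaled features (assumption).
x_demo, y_demo = make_moons(n_samples=200, noise=0.2, random_state=0)

gamma = 0.5  # assumed value, fixed so the kernel can be recomputed manually
clf = SVC(kernel="rbf", gamma=gamma, random_state=0).fit(x_demo, y_demo)

# RBF kernel between every sample x and every support vector s:
# K(x, s) = exp(-gamma * ||x - s||^2)
diff = x_demo[:, None, :] - clf.support_vectors_[None, :, :]
k_manual = np.exp(-gamma * (diff ** 2).sum(axis=2))
assert np.allclose(k_manual, rbf_kernel(x_demo, clf.support_vectors_, gamma=gamma))

# Decision score: kernel values weighted by the dual coefficients, plus intercept.
scores = k_manual @ clf.dual_coef_.ravel() + clf.intercept_[0]

# A positive score maps to class 1, a negative score to class 0.
manual_pred = (scores > 0).astype(int)
print(np.allclose(scores, clf.decision_function(x_demo)))
print((manual_pred == clf.predict(x_demo)).all())
```

Only the support vectors enter the sum, which is why the fitted model does not need to keep the rest of the training set around.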

**Printing the confusion matrix**

In [21]:
from sklearn.metrics import confusion_matrix
# confusion matrix ==> ([TN, FP], [FN, TP])
cm_rbf = confusion_matrix(y_test, y_pred_rbf)
print(cm_rbf)
tn, fp, fn, tp = cm_rbf.ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
print("Accuracy", accuracy) # accuracy = (TP + TN) / (TP + TN + FP + FN)

[[64  4]
 [ 3 29]]
Accuracy 0.93


**Classification Report**

In [22]:
from sklearn.metrics import classification_report
class_report_rbf = classification_report(y_test, y_pred_rbf)
print(class_report_rbf)

              precision    recall  f1-score   support

           0       0.96      0.94      0.95        68
           1       0.88      0.91      0.89        32

    accuracy                           0.93       100
   macro avg       0.92      0.92      0.92       100
weighted avg       0.93      0.93      0.93       100
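
The per-class numbers in the report follow directly from the confusion matrix printed earlier. As a worked check, the sketch below recomputes them from the four counts; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the output above: [[TN, FP], [FN, TP]]
cm = np.array([[64, 4],
               [3, 29]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()

# Class 1 (purchased)
precision_1 = tp / (tp + fp)   # 29 / 33
recall_1 = tp / (tp + fn)      # 29 / 32
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Class 0 (not purchased): the roles of the counts are mirrored
precision_0 = tn / (tn + fn)   # 64 / 67
recall_0 = tn / (tn + fp)      # 64 / 68
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

print(f"accuracy {accuracy:.2f}")                              # accuracy 0.93
print(f"class 0: {precision_0:.2f} {recall_0:.2f} {f1_0:.2f}")  # class 0: 0.96 0.94 0.95
print(f"class 1: {precision_1:.2f} {recall_1:.2f} {f1_1:.2f}")  # class 1: 0.88 0.91 0.89
```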

