# Support Vector Machines (SVM)

Support Vector Machines (SVMs) look for the best decision boundary in a dataset. That boundary is the **line** that **separates the two classes** with the maximum margin: it is equidistant from the closest points of each class, the ones that are hardest to tell apart.

These closest points are the support vectors. They alone determine where the boundary lies, so they support the whole algorithm.

![image.png](https://www.researchgate.net/profile/Tanay-Kothari/publication/275974276/figure/fig1/AS:487696138280961@1493287228453/An-example-of-a-maximum-margin-classifier.png)

The margin is bounded by two parallel hyperplanes that pass through the support vectors: one is the negative hyperplane and the other is the positive hyperplane.

**The basic idea is that we separate two classes using a divider, and that divider must have the maximum margin.**
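To make the margin concrete, here is a minimal sketch on a hypothetical four-point toy dataset (not the dataset used later in this notebook). It assumes scikit-learn's `SVC` with a very large `C` to approximate a hard margin; the names `X_toy`, `y_toy`, and `toy_model` are made up for illustration. For a linear SVM the margin width is 2 / ||w||, where w is the learned weight vector.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two points per class, chosen for illustration only.
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
toy_model = SVC(kernel="linear", C=1e6)
toy_model.fit(X_toy, y_toy)

w = toy_model.coef_[0]                 # normal vector of the separating line
margin_width = 2 / np.linalg.norm(w)   # distance between the two margin hyperplanes

print("Support vectors:\n", toy_model.support_vectors_)
print("Margin width:", margin_width)
```

On this toy data the support vectors should be the two innermost points, (1, 1) and (3, 3), and the margin width should be close to the distance between them.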

#### Why are they different from other algorithms?
Let's assume we have to differentiate between an apple and an orange. Most other models would look at the most typical apples and the most typical oranges and learn from those.

**An SVM instead learns from the apples that look most like oranges in color or appearance, and from the oranges that look most like apples. These borderline cases act as the support vectors and lie on the margin boundaries.**
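The apple and orange intuition can be sketched in one dimension. The "redness" scores below are invented for illustration, and the helper `boundary` is hypothetical. The sketch fits a linear `SVC` on all the fruit, then refits on only the support vectors, to check that the typical examples do not move the boundary.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical "redness" scores, made up for illustration (1 = apple, 0 = orange).
redness = np.array([[0.9], [0.8], [0.6], [0.4], [0.2], [0.1]])
labels = np.array([1, 1, 1, 0, 0, 0])

# Large C approximates a hard margin on this separable toy data.
full_model = SVC(kernel="linear", C=1e6).fit(redness, labels)

# Refit using only the support vectors found by the first model.
sv_idx = full_model.support_
reduced_model = SVC(kernel="linear", C=1e6).fit(redness[sv_idx], labels[sv_idx])

def boundary(model):
    # In one dimension the decision boundary is the point where w*x + b = 0.
    return -model.intercept_[0] / model.coef_[0, 0]

print("Support vectors:", full_model.support_vectors_.ravel())
print("Boundary (all fruit):", boundary(full_model))
print("Boundary (support vectors only):", boundary(reduced_model))
```

Both boundaries should land near 0.5, midway between the least red apple (0.6) and the most red orange (0.4). The clearly red apples and clearly pale oranges play no part in placing it.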

---

# Code

In [1]:
import pandas as pd
import numpy as np

In [2]:
dataset = pd.read_csv("Social_Network_Ads.csv")

In [3]:
dataset.head()

| Unnamed: 0 | Age | EstimatedSalary | Purchased |
|---|---|---|---|
| 0 | 19 | 19000 | 0 |
| 1 | 35 | 20000 | 0 |
| 2 | 26 | 43000 | 0 |
| 3 | 27 | 57000 | 0 |
| 4 | 19 | 76000 | 0 |


In [4]:
# Features: every column except the last. Target: the last column (Purchased).
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

In [5]:
from sklearn.model_selection import train_test_split
# Hold out 30% of the rows for testing; random_state makes the split reproducible.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=0)

In [8]:
from sklearn.svm import SVC
# Linear kernel: the decision boundary is a straight line (hyperplane) in feature space.
model = SVC(kernel="linear", random_state=0)
model.fit(x_train, y_train)

SVC(kernel='linear', random_state=0)

In [9]:
y_pred = model.predict(x_test)
print("Y Predicted vs Y Test Set")
# Show predictions and true labels side by side, one row per test sample.
print(np.column_stack((y_pred, y_test)))

Y Predicted vs Y Test Set
[[0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [1 1]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 1]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [1 1]
 [1 1]
 [0 0]
 [0 0]
 [1 0]
 [1 1]
 [0 1]
 [0 0]
 [0 0]
 [0 1]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 1]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 1]
 [0 0]
 [0 0]
 [1 0]
 [0 0]
 [1 1]
 [1 1]
 [1 1]
 [1 0]
 [0 0]
 [0 0]
 [1 1]
 [1 1]
 [0 0]
 [1 1]
 [0 1]
 [0 0]
 [0 0]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 1]
 [0 0]
 [0 1]
 [1 1]
 [1 1]
 [0 0]
 [1 1]
 [0 0]
 [1 1]
 [0 1]
 [1 1]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 0]
 [0 1]
 [0 1]
 [1 1]
 [1 0]
 [1 1]
 [0 0]
 [1 0]
 [0 1]]


In [10]:
from sklearn.metrics import accuracy_score,classification_report
print("Accuracy {0:.2f}%".format(accuracy_score(y_test,y_pred)*100))

Accuracy 85.00%


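The accuracy is limited because a linear kernel can only draw a straight decision boundary. If the two classes are not linearly separable, some points inevitably fall on the wrong side of the line and are misclassified.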
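As a rough illustration of that limitation, the sketch below compares a linear kernel with an RBF kernel on synthetic data from scikit-learn's `make_circles`, where no straight line can separate the classes. This is not the Social_Network_Ads data, and it does not show that a different kernel would improve the result above; the variable names are made up for the example.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data: one class forms a ring around the other (illustration only).
X_demo, y_demo = make_circles(n_samples=400, noise=0.1, factor=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=0)

# Fit the same classifier with two kernels and record test accuracy for each.
scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, random_state=0).fit(X_tr, y_tr)
    scores[kernel] = clf.score(X_te, y_te)

print(scores)
```

On data like this the RBF kernel should score clearly higher, because it can bend the boundary around the inner class while the linear kernel cannot.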

In [11]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)

[[74  5]
 [13 28]]
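The confusion matrix can be unpacked into the usual summary metrics. The sketch below hard-codes the matrix printed above and relies on scikit-learn's convention that rows are true labels and columns are predicted labels, so `ravel()` yields true negatives, false positives, false negatives, and true positives in that order.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted).
cm = np.array([[74, 5],
               [13, 28]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of all predictions that are correct
precision = tp / (tp + fp)        # of predicted purchasers, how many really purchased
recall = tp / (tp + fn)           # of real purchasers, how many the model found

print(f"Accuracy:  {accuracy:.2%}")
print(f"Precision: {precision:.2%}")
print(f"Recall:    {recall:.2%}")
```

This gives 85.00% accuracy (matching `accuracy_score` above), 84.85% precision, and 68.29% recall: the model misses 13 of the 41 actual purchasers, which is where most of its errors come from.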
