# Support Vector Machine - SVM
SVM can be used for both regression and classification tasks. But, it is widely used in classification objectives.\
The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space(N — the number of features) that distinctly classifies the data points.
<img src="pics/svm.pic1.png" width="700">
<img src="pics/svm.pic2.png" width="800">
## Math conditions
<img src="pics\svm.pic4.png" width="400">

${\displaystyle \mathbf {w} ^{T}\mathbf {x} _{i}-b\geq 1}$ if ${\displaystyle y_{i}=1}$ \
and \
${\displaystyle \mathbf {w} ^{T}\mathbf {x} _{i}-b\leq -1}$ if ${\displaystyle y_{i}=-1}$ \
and combined to \
$$y.f(x)={\displaystyle y_{i}}({\displaystyle \mathbf {w} ^{T}\mathbf {x} _{i}-b)\geq 1}$$

## Cost/Loss functions
how far is the point from the expected lines/boundaries
<img src="pics/svm.pic3.png" width="500">
## Hinge Loss with Regularization (margin: width of the lane): 
We want to minimize average loss but also want to maximized margin
$$J =\lambda \|\mathbf {w} \|^{2} + {\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}\max \left(0|1-y_{i}(\mathbf {w} ^{T}\mathbf {x} -b)\right)}$$
## Gradient Descent
Gradient of Loss function
<img src="pics/svm.pic8.png" width="300">
## Update rules
<img src="pics/svm.pic9.png" width="300">

In [13]:
import numpy as np

class SVM:
    
    def __init__(self, lr = 0.001, lambda_= 0.01, epoch = 1000):
        self.lr = lr
        self.lambda_ = lambda_
        self.epoch = epoch
        self.W = None
        self.b = None
        
    def fit(self, X, y):
        y = np.where(y <= 0, -1, 1)
        n_sample, n_feature =  X.shape
        
        self.W = np.zeros(n_feature)
        self.b = 0
        
        for _ in range(self.epoch):
            for idx, x in enumerate(X):
                fx = np.dot(self.W,x) - self.b
                if (fx*y[idx] >=1):
                    dW = 2*self.lambda_*self.W
                    db = 0
                else:
                    dW = 2*self.lambda_*self.W - y[idx]*x
                    db = y[idx]
                    
                self.W -= self.lr*dW
                self.b -= self.lr*db
    
    def loss(self):
        Fx = 1- (np.dot(X, self.W) + self.b)
        cost_lst = [0 if fx < 0 else fx for fx in Fx]
        cost = np.mean(cost_lst) + self.lambda_*np.linalg.norm(self.W)
        return f'{cost:.2f}'                
                
    
    def predict(self, X):
        Fx = [np.dot(self.W, x) - self.b for x in X]
        #Fx = np.dot(X, self.W) + self.b
        y_hat = np.array([1 if fx >0 else -1 for fx in Fx])
        #y_hat = np.sign(Fx)
        y_label = np.where(y_hat <= 0, 0, 1)
        return np.array(y_label)
    
    def accuracy(self, X, y):
        y_label = self.predict(X)
        return f'{sum(y_label==y)/len(y):.3f}'
    

In [14]:
from sklearn import datasets
from sklearn.model_selection import train_test_split


X,y = datasets.make_blobs(n_samples=50,
                                  n_features=2,
                                  centers=2,
                                  cluster_std=1.05,
                                  random_state=40)
model = SVM()
model.fit(X, y)
print(f'Weights: {model.W}')
print(f'Bias: {model.b:.3f}')
print(f'Loss: {model.loss()}')


Weights: [0.58977016 0.17946483]
Bias: -0.152
Loss: 1.94


In [15]:
X,y = datasets.make_blobs(n_samples=100,
                                  n_features=2,
                                  centers=2,
                                  cluster_std=1.05,
                                  random_state=40)

X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3,
                                                   random_state=123)

model = SVM()
model.fit(X_train, y_train)
print(f'Weights: {model.W}')
print(f'Bias: {model.b:.3f}')
print(f'Loss: {model.loss()}')
print(f'y_test: \n\t{y_test}')
print(f'y_hat: \n\t{model.predict(X_test)}')
print(f'Accuracy Score: {model.accuracy(X_test, y_test)}')

Weights: [0.43202299 0.15667361]
Bias: -0.059
Loss: 1.65
y_test: 
	[1 0 1 1 0 1 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0]
y_hat: 
	[1 0 1 1 0 1 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0]
Accuracy Score: 1.000
