# Formula

The loss function of the SVM is as follows:
\begin{align}
L(w, b) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\max\left(0,\ 1-y_i(w^Tx_i-b)\right)
\end{align}
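As a quick check of the formula, the loss can be evaluated directly with NumPy. This is a minimal sketch rather than part of the original notebook code: the function name `svm_loss` and the two-sample data are assumptions chosen for illustration, and it uses the $-b$ convention from the formula.

```python
import numpy as np

def svm_loss(w, b, X, y, C=1.0):
    """L(w, b) = 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i (w^T x_i - b))."""
    margins = y * (X @ w - b)               # y_i (w^T x_i - b) for every sample
    hinge = np.maximum(0.0, 1.0 - margins)  # hinge term per sample
    return 0.5 * np.dot(w, w) + C * hinge.sum()

# Hypothetical toy data: one positive and one negative sample
X = np.array([[1.0, 2.0], [4.0, 5.0]])
y = np.array([1.0, -1.0])

# With w = 0 and b = 0 every hinge term equals 1, so the loss is C * N = 2.0
print(svm_loss(np.zeros(2), 0.0, X, y))
```

With $w = 0$ the regularization term vanishes and only the hinge terms contribute, which makes the value easy to verify by hand.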

The more common convention is $w^Tx_i + b$, but the code given for this notebook uses $-b$, so the formulas here keep it. As a result, the gradient with respect to the bias has the opposite sign compared with the $+b$ form.

The derivative of the max component:

* if $1-y_i(w^Tx_i-b) > 0$, the max function equals $1-y_i(w^Tx_i-b)$; its derivative is $-y_ix_i$ with respect to $w$ and $y_i$ with respect to $b$
* if $1-y_i(w^Tx_i-b) \le 0$, the max function equals 0 and the derivative equals 0
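The two cases above can be checked numerically against finite differences. The sketch below is an illustration only: the helper names (`hinge`, `hinge_subgrad`, `numeric_grad`) and the sample values are assumptions, and the derivative with respect to $b$ ($+y_i$ in the active case) follows from the $-b$ convention used in the loss.

```python
import numpy as np

def hinge(w, b, x, y):
    """Hinge term max(0, 1 - y (w.x - b)) for a single sample."""
    return max(0.0, 1.0 - y * (np.dot(w, x) - b))

def hinge_subgrad(w, b, x, y):
    """Sub-gradient of the hinge term with respect to (w, b)."""
    if 1.0 - y * (np.dot(w, x) - b) > 0:
        return -y * x, y            # active case
    return np.zeros_like(x), 0.0    # inactive case

def numeric_grad(w, b, x, y, eps=1e-6):
    """Central finite differences; only meaningful away from the kink."""
    gw = np.zeros_like(w)
    for j in range(w.size):
        step = np.zeros_like(w)
        step[j] = eps
        gw[j] = (hinge(w + step, b, x, y) - hinge(w - step, b, x, y)) / (2 * eps)
    gb = (hinge(w, b + eps, x, y) - hinge(w, b - eps, x, y)) / (2 * eps)
    return gw, gb

# Hypothetical values for which the hinge term is active (1 - y(w.x - b) = 1.6)
w, b = np.array([0.1, -0.2]), 0.3
x, y = np.array([1.0, 2.0]), 1.0

gw, gb = hinge_subgrad(w, b, x, y)
num_gw, num_gb = numeric_grad(w, b, x, y)
print(gw, gb)          # analytic sub-gradient
print(num_gw, num_gb)  # finite-difference estimate, should agree closely
```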



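The gradient descent update of the weights, with learning rate $lr$ and $\delta_i = 1$ if $1-y_i(w^Tx_i-b) > 0$ and $\delta_i = 0$ otherwise, is as follows:
\begin{align}
w = w - lr\left(w-C\sum_{i=1}^{N}\delta_i y_i x_i\right)
\end{align}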
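The full-batch update can be written in vectorized form. This is a minimal sketch under stated assumptions: the function name `batch_step` is hypothetical, $\delta_i$ is taken to be the indicator that sample $i$ has a positive hinge term, and the bias update $b \leftarrow b - lr \cdot C\sum_i \delta_i y_i$ is derived here from the $-b$ convention rather than stated in the text.

```python
import numpy as np

def batch_step(w, b, X, y, lr=1e-3, C=1.0):
    """One full-batch sub-gradient step on the SVM loss (with the -b convention)."""
    margins = y * (X @ w - b)
    delta = (margins < 1).astype(float)  # delta_i = 1 where the hinge term is active
    grad_w = w - C * (delta * y) @ X     # w - C * sum_i delta_i y_i x_i
    grad_b = C * np.sum(delta * y)       # sign follows from using -b in the loss
    return w - lr * grad_w, b - lr * grad_b

# Same toy data as in the cells of this notebook
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7]], dtype=float)
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)

w1, b1 = batch_step(np.zeros(2), 0.0, X, y)
print(w1, b1)  # w moves to about [-0.009, -0.009]; b stays 0 because the labels sum to 0
```

Starting from $w = 0$, every sample violates the margin, so the step direction is simply $\sum_i y_i x_i = (-9, -9)$ scaled by the learning rate.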

In particular, for a single sample $i$ with $1-y_i(w^Tx_i-b) > 0$ and $C = 1$, the gradients are:

$$
\frac{\partial L}{\partial \mathbf{w}} = \mathbf{w} - y_i \mathbf{x}_i
$$

$$
\frac{\partial L}{\partial b} = y_i
$$

→ update (with learning rate $\eta$):

$$
\mathbf{w} \leftarrow \mathbf{w} - \eta (\mathbf{w} - y_i \mathbf{x}_i)
$$

$$
b \leftarrow b - \eta y_i
$$
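A single update step can be traced by hand. The sketch below is illustrative: the function name `sgd_step` and the starting values are assumptions, and the branch for a satisfied margin (shrink $w$ only, leave $b$ unchanged) mirrors the per-sample loop used in the code cells of this notebook.

```python
import numpy as np

def sgd_step(w, b, x, y, lr):
    """One per-sample update with C = 1 and the -b convention."""
    if y * (np.dot(w, x) - b) >= 1:
        # Margin satisfied: only the regularization term contributes
        return w - lr * w, b
    # Margin violated: gradient is (w - y x) for w and y for b
    return w - lr * (w - y * x), b - lr * y

# Hypothetical starting point: w = 0, b = 0, sample x = (1, 2) with label +1
x, y = np.array([1.0, 2.0]), 1.0
w, b = sgd_step(np.zeros(2), 0.0, x, y, lr=0.1)
print(w, b)                # w = [0.1, 0.2], b = -0.1
print(np.dot(w, x) - b)    # the score w.x - b rises from 0 to about 0.6
```

The step moves $w$ toward $y_i x_i$ and lowers $b$, both of which increase $y_i(w^Tx_i - b)$ for the violating sample.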

In [16]:
import numpy as np

In [17]:
# Hard Margin SVM
# Note: the updates below are per-sample sub-gradient steps on the hinge loss
# from the formula above with C = 1, so margin violations are penalized
# rather than strictly forbidden.

class HardMarginSVM:
    def __init__(self, lr=1e-3, n_iterations=1000):
        self.lr = lr                      # learning rate
        self.n_iterations = n_iterations  # number of passes over the data
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # y is expected to contain labels in {-1, +1}
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iterations):
            for i in range(n_samples):
                # For 1-D vectors, np.dot and the @ operator give the same result
                condition = y[i] * (np.dot(self.weights, X[i]) - self.bias) >= 1
                if condition:
                    # Margin satisfied: only the regularization term contributes,
                    # which shrinks the weights and so widens the margin
                    self.weights -= self.lr * self.weights
                else:
                    # Margin violated: gradient is (w - y_i x_i) for w and y_i for b
                    self.weights -= self.lr * (self.weights - y[i] * X[i])
                    self.bias -= self.lr * y[i]

    def predict(self, X):
        # Linear decision function w^T x - b
        linear_output = np.dot(X, self.weights) - self.bias
        # np.sign returns +1 for positive values, -1 for negative values, 0 for exactly 0
        return np.sign(linear_output)

In [18]:
# Create example data
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7]])
y = np.array([1, 1, 1, -1, -1, -1])

# Training with hard margin SVM
svm = HardMarginSVM()
svm.fit(X, y)

# Predictions
predictions = svm.predict(X)
print("Predict:", predictions)

Predict: [ 1.  1. -1. -1. -1. -1.]


In [19]:
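# Soft Margin SVM
# Note: in this implementation C scales the regularization term (the gradient
# 2 * C * w corresponds to C * ||w||^2), while the hinge term has weight 1.
# This differs from the loss formula above, where C multiplies the hinge sum.
class SoftMarginSVM:
    def __init__(self, lr=1e-3, C=1.0, n_iterations=1000):
        self.lr = lr                      # learning rate
        self.C = C                        # regularization strength in this code
        self.n_iterations = n_iterations  # number of passes over the data
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # y is expected to contain labels in {-1, +1}
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iterations):
            for i in range(n_samples):
                condition = y[i] * (np.dot(self.weights, X[i]) - self.bias) >= 1
                if condition:
                    # Margin satisfied: regularization gradient only
                    self.weights -= self.lr * (2 * self.C * self.weights)
                else:
                    # Margin violated: regularization gradient plus hinge gradient
                    self.weights -= self.lr * (2 * self.C * self.weights - y[i] * X[i])
                    self.bias -= self.lr * y[i]

    def predict(self, X):
        # Linear decision function w^T x - b, mapped to a label with np.sign
        linear_output = np.dot(X, self.weights) - self.bias
        return np.sign(linear_output)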


In [20]:
X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7]])
y = np.array([1, 1, 1, -1, -1, -1])

svm = SoftMarginSVM(C=0.1)
svm.fit(X, y)

predictions = svm.predict(X)
print("Predict:", predictions)

Predict: [ 1.  1. -1. -1. -1. -1.]
