# Optimization Methods Project Work: SMO and DCD-Linear for training SVM

This notebook implements the SMO and DCD-Linear algorithms for training SVMs and reports the results.

## Testing the environment

The following cell simply verifies that all the libraries needed to run the code are correctly installed.

In [60]:
import numpy as np
import sklearn
import matplotlib.pyplot as plt
import math

print("Everything is good!")

Everything is good!


## Defining the Dual Problem

### The Formulation with Bias (for SMO)

As we know, the dual problem for training an SVM is:
$$
\begin{gather*}
\min_{\alpha} \frac{1}{2}\alpha^TQ\alpha - e^T\alpha \\
0 \leq \alpha_i \leq C \quad \forall i = 1, \dots, n, \qquad \sum_{i=1}^n \alpha_i y_i = 0
\end{gather*}
$$
where $Q_{ij} = y_i y_j x_i^T x_j$ and $e$ is the vector of all ones.
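
As a sanity check on the formulation, the sketch below builds $Q$ with $Q_{ij} = y_i y_j x_i^T x_j$ and evaluates the objective and its gradient $Q\alpha - e$ at a given $\alpha$. The helper names (`build_q`, `dual_objective`, `dual_gradient`) and the toy data are illustrative assumptions, not part of the project code.

```python
import numpy as np

def build_q(X, y):
    """Return Q with Q[i, j] = y[i] * y[j] * <X[i], X[j]>."""
    Z = X * y[:, None]  # scale each sample (row) by its label
    return Z @ Z.T

def dual_objective(Q, alpha):
    """Value of 0.5 * alpha^T Q alpha - e^T alpha."""
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

def dual_gradient(Q, alpha):
    """Gradient of the dual objective: Q alpha - e."""
    return Q @ alpha - np.ones_like(alpha)

# Toy data (hypothetical): two samples with opposite labels
X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
y = np.array([1.0, -1.0])
Q = build_q(X, y)

alpha = np.zeros(2)
print(dual_objective(Q, alpha))  # objective at alpha = 0
print(dual_gradient(Q, alpha))   # gradient at alpha = 0
```

At $\alpha = 0$ the gradient reduces to $-e$, which is why the class below can initialise its stored derivative without any extra work.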

As stated in the Project Goals, we will use the Most Violating Pair rule to select the variables to update. To implement this algorithm efficiently, we need the following components:
* A function that computes the derivatives (the gradient of the objective)
* A function that adjusts the derivatives after $\alpha$ changes
* A function that extracts the most violating pair.
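
For reference, the Most Violating Pair rule can be written as follows, assuming the usual definitions from the SMO literature (the set names $R(\alpha)$ and $S(\alpha)$ and the symbol $f$ for the dual objective are notation introduced here, not taken from the Project Goals):
$$
\begin{gather*}
R(\alpha) = \{i : \alpha_i < C,\ y_i = 1\} \cup \{i : \alpha_i > 0,\ y_i = -1\} \\
S(\alpha) = \{i : \alpha_i < C,\ y_i = -1\} \cup \{i : \alpha_i > 0,\ y_i = 1\} \\
i^\star \in \arg\max_{i \in R(\alpha)} \left\{ -\frac{\nabla_i f(\alpha)}{y_i} \right\}, \qquad
j^\star \in \arg\min_{j \in S(\alpha)} \left\{ -\frac{\nabla_j f(\alpha)}{y_j} \right\}
\end{gather*}
$$
with $\nabla f(\alpha) = Q\alpha - e$. Under these definitions, $\alpha$ is optimal when the maximum over $R(\alpha)$ does not exceed the minimum over $S(\alpha)$.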

To bundle this functionality, we implement it in a class. The constructor receives the samples Xs, the labels Ys, and the constant C, and derives Q from them.

In [63]:
class DualSVMProblem:

    def __init__(self, Xs, Ys, C, epsilon = 10e-4):
        self.X = Xs
        self.Y = Ys
        self.C = C
        self.a = np.zeros(Ys.shape[0])
        self.e = np.ones_like(self.a)
        self.epsilon = epsilon

        mat = np.stack([Ys.T] * Xs.shape[1], axis = 1)
        
        Z = Xs * mat
        self.Q = Z @ Z.T

        self.d = self.Q @ self.a - self.e

    def getA(self):
        return self.a
        
    def getDerivative(self):
        return self.d    

    def getMostViolatingPair(self):
        # The rule compares -d_i / y_i: it is minimized over the first index set
        # and maximized over the second one
        self.directions = -self.d / self.Y
        min_idx = -1
        min_value = np.inf
        max_idx = -1
        max_value = -np.inf

        for i in range(len(self.a)):
            # Candidates for the minimum: (a_i > 0, y_i = 1) or (a_i < C, y_i = -1)
            if (self.a[i] > self.epsilon and self.Y[i] == 1 and self.directions[i] < min_value):
                min_value = self.directions[i]
                min_idx = i
            if (self.a[i] < self.C - self.epsilon and self.Y[i] == -1 and self.directions[i] < min_value):
                min_value = self.directions[i]
                min_idx = i

            # Candidates for the maximum: (a_i > 0, y_i = -1) or (a_i < C, y_i = 1)
            if (self.a[i] > self.epsilon and self.Y[i] == -1 and self.directions[i] > max_value):
                max_value = self.directions[i]
                max_idx = i
            if (self.a[i] < self.C - self.epsilon and self.Y[i] == 1 and self.directions[i] > max_value):
                max_value = self.directions[i]
                max_idx = i

        # One of the two index sets is empty: no pair can be selected
        if min_idx == -1 or max_idx == -1:
            return None

        return (min_idx, max_idx)

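    def updateA(self, idx1, a1, idx2, a2):
        # Incremental update of the derivative: only two rows of the symmetric Q are needed
        self.d = self.Q[idx1] * (a1 - self.a[idx1]) + self.Q[idx2] * (a2 - self.a[idx2]) + self.d
        # Store the new values, otherwise later calls would work with stale alphas
        self.a[idx1] = a1
        self.a[idx2] = a2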
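
The loop in `getMostViolatingPair` can also be expressed with boolean masks. The following is a minimal standalone sketch, not the class method itself: the function name `most_violating_pair`, its signature, and the choice to return `None` when the optimality gap is within `tol` are assumptions made for illustration.

```python
import numpy as np

def most_violating_pair(grad, y, alpha, C, tol=1e-3):
    """Return (i, j) chosen by the Most Violating Pair rule, or None.

    i maximizes -grad / y over the indices in R, j minimizes it over S.
    None is returned when a set is empty or no pair violates optimality.
    """
    scores = -grad / y
    can_increase = alpha < C - tol
    can_decrease = alpha > tol
    in_R = (can_increase & (y == 1)) | (can_decrease & (y == -1))
    in_S = (can_increase & (y == -1)) | (can_decrease & (y == 1))
    if not in_R.any() or not in_S.any():
        return None
    # Mask out ineligible indices with -inf / +inf before taking argmax / argmin
    i = int(np.argmax(np.where(in_R, scores, -np.inf)))
    j = int(np.argmin(np.where(in_S, scores, np.inf)))
    # Assumed stopping rule: the gap must exceed tol to count as a violation
    if scores[i] - scores[j] <= tol:
        return None
    return i, j

# At alpha = 0 the gradient is -e, so samples with opposite labels form a violating pair
y = np.array([1.0, -1.0, 1.0])
alpha = np.zeros(3)
grad = -np.ones(3)
print(most_violating_pair(grad, y, alpha, C=1.0))
```

Note that this sketch returns the maximizing index first, whereas the class method returns the minimizing index first.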


In [66]:
# Let's test the code with some artificial data
X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.array([1, -1])
mat = np.stack([Y.T] * X.shape[1], axis = 1)
C = 1

# The correct Q matrix
Q = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        Q[i][j] = Y[i] * Y[j] * X[i] @ X[j]

problem = DualSVMProblem(X, Y, C)

ok = True
if not np.allclose(Q, problem.Q):
    ok = False
    print("The matrix calculation is wrong")

if not np.allclose(np.array([-1, -1]), problem.getDerivative()):
    ok = False
    print("The derivative function is wrong")

problem.updateA(0, 1, 1, 1)
if not np.allclose(Q @ np.ones(2) - np.ones(2), problem.getDerivative()):
    ok = False
    print("Updating the variables doesn't update the derivative correctly")

# TODO test the method about the most violating pairs

if ok:
    print("Everything is working as expected!")

Everything is working as expected!
