# TP2: Support Vector Machines

*By Daniel Deutsch and Kevin Khul*

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Exercise 1

In [4]:
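def load_breastcancer(filename):
    """Load the data file and return standardised features X and labels y in {-1, +1}."""
    data = np.loadtxt(filename, delimiter=',')

    # Column 0 is not needed here; column 1 holds the 0/1 label, mapped to {-1, +1}
    y = data[:, 1]*2 - 1
    X = data[:, 2:]

    # Standardise each feature (column) to zero mean and unit variance
    X = X - np.mean(X, axis=0)
    X = X / np.std(X, axis=0)

    return X, y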

In [5]:
X, y = load_breastcancer("./datasets/wdbc_M1_B0.data")

# Exercise 2.1

# Exercise 2.2

# Exercise 2.3

# Exercise 2.4

# Exercise 2.5

# Exercise 3.1

Since the index $I$ is drawn uniformly from $\{1, \dots, n\}$, we have $P(I = i) = \frac{1}{n}$, so:

<br>

\begin{aligned}
    E[f_I (v, a)] \quad & = \quad \sum_{i = 1}^n P(I = i) \ f_i (v, a) \\
    & = \quad \frac{1}{n} \sum_{i = 1}^n f_i (v, a) \\
    & = \quad \frac{1}{n} \sum_{i = 1}^n \left(c \ n \ \max(0, \ 1-y_i(x_i^T v + a)) + \frac{1}{2} \sum_{j = 1}^m v_j^2 \right) \\
    & = \quad \frac{1}{n} \sum_{i = 1}^n \left(c \ n \ \max(0, \ 1-y_i(x_i^T v + a))\right) + \frac{1}{n} \sum_{i = 1}^n \frac{1}{2} \sum_{j = 1}^m v_j^2 \\
    & = \quad c \sum_{i = 1}^n \max(0, \ 1-y_i(x_i^T v + a)) + \frac{1}{2} \sum_{j = 1}^m v_j^2 \\
    & = \quad f(v, a)
\end{aligned}
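
The identity above can be checked numerically. The following is a minimal sketch on synthetic data, not part of the original solution: the helper names `f_i` and `f`, the random data, and the values of `c`, `v` and `a` are all assumptions chosen for illustration.

```python
import numpy as np

def f_i(v, a, x_i, y_i, c, n):
    """Per-sample objective: c * n * hinge loss of sample i + 0.5 * ||v||^2."""
    return c * n * max(0.0, 1.0 - y_i * (x_i @ v + a)) + 0.5 * np.dot(v, v)

def f(v, a, X, y, c):
    """Full objective: c * sum of hinge losses + 0.5 * ||v||^2."""
    margins = 1.0 - y * (X @ v + a)
    return c * np.sum(np.maximum(0.0, margins)) + 0.5 * np.dot(v, v)

# Synthetic data (assumption: any X, y in {-1, +1}, v, a would do)
rng = np.random.default_rng(0)
n, m, c = 50, 4, 0.7
X_demo = rng.normal(size=(n, m))
y_demo = rng.choice([-1.0, 1.0], size=n)
v_demo = rng.normal(size=m)
a_demo = 0.3

# E[f_I] with I uniform on {1, ..., n} is the plain average of the f_i
expectation = np.mean([f_i(v_demo, a_demo, X_demo[i], y_demo[i], c, n) for i in range(n)])
full = f(v_demo, a_demo, X_demo, y_demo, c)

print(np.isclose(expectation, full))
```

The two quantities agree up to floating-point rounding, which is what makes $f_I$ an unbiased estimate of $f$.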

# Exercise 3.2

# Exercise 3.3

The Lagrangian function $L()$ associated with Problem (1) is:

# Exercise 4.2

We have $g(x, \phi) = - \frac{1}{2\rho} \phi^2 + \frac{\rho}{2} (\max(0, \ x + \rho^{-1} \phi))^2$ with $\rho > 0$. The map $u \mapsto (\max(0, u))^2$ is differentiable with derivative $2 \max(0, u)$, so by the chain rule $\nabla_x g(x, \phi)$ is:

<br>

\begin{aligned}
    \nabla_x g(x, \phi) \quad = \quad \frac{\partial g(x, \phi)}{\partial x} \quad & = \quad \frac{\partial \left(-\frac{1}{2\rho} \phi^2\right)}{\partial x} + \frac{\partial \left(\frac{\rho}{2} (\max(0, \ x + \rho^{-1} \phi))^2\right)}{\partial x} \\
    & = \quad 0 + \frac{2 \rho}{2} \max(0, \ x + \rho^{-1} \phi) \\
    & = \quad \rho \ \max(0, \ x + \rho^{-1} \phi)
\end{aligned}

<br>

Now, to find $\nabla_\phi g(x, \phi)$:

<br>

\begin{aligned}
    \nabla_\phi g(x, \phi) \quad = \quad \frac{\partial g(x, \phi)}{\partial \phi} \quad & = \quad \frac{\partial \left(-\frac{1}{2\rho} \phi^2\right)}{\partial \phi} + \frac{\partial \left(\frac{\rho}{2} (\max(0, \ x + \rho^{-1} \phi))^2\right)}{\partial \phi} \\
    & = \quad -\frac{2}{2\rho} \phi + \frac{2 \rho}{2 \rho} \max(0, \ x +\rho^{-1} \phi) \\
    & = \quad -\frac{\phi}{\rho} + \max(0, \ x + \rho^{-1} \phi) \\
    & = \quad \max(-\rho^{-1} \phi, \ x)
\end{aligned}
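
Both closed-form gradients can be compared against central finite differences. This is only a sanity-check sketch: the function names `g`, `grad_x_g`, `grad_phi_g` and `max_gradient_error`, the value of `rho`, the step `h` and the random test points are assumptions made for illustration.

```python
import numpy as np

def g(x, phi, rho):
    """g(x, phi) = -phi^2 / (2 rho) + (rho / 2) * max(0, x + phi / rho)^2."""
    return -phi**2 / (2.0 * rho) + 0.5 * rho * max(0.0, x + phi / rho) ** 2

def grad_x_g(x, phi, rho):
    """Closed-form partial derivative of g with respect to x."""
    return rho * max(0.0, x + phi / rho)

def grad_phi_g(x, phi, rho):
    """Closed-form partial derivative of g with respect to phi."""
    return max(-phi / rho, x)

def max_gradient_error(rho=2.0, h=1e-6, seed=0, trials=200):
    """Largest gap between closed-form and central-difference derivatives."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        x, phi = rng.normal(size=2)
        fd_x = (g(x + h, phi, rho) - g(x - h, phi, rho)) / (2.0 * h)
        fd_phi = (g(x, phi + h, rho) - g(x, phi - h, rho)) / (2.0 * h)
        worst = max(worst,
                    abs(fd_x - grad_x_g(x, phi, rho)),
                    abs(fd_phi - grad_phi_g(x, phi, rho)))
    return worst

print(max_gradient_error())
```

The reported error should be tiny (on the order of the step size or smaller), since $g$ is piecewise quadratic and continuously differentiable.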


# Exercise 4.3

From *Exercise 4.2* we know that $\frac{\partial g(x, \phi)}{\partial x} = \rho \ \max(0, \ x + \rho^{-1} \phi)$ and that $\frac{\partial g(x, \phi)}{\partial \phi} = \max(-\rho^{-1} \phi, \ x)$.

To classify the function $x \rightarrow g(x, \phi)$ as concave or convex, we check the sign of $\frac{\partial^2 g(x, \phi)}{\partial x^2}$:

<br>

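\begin{aligned}
    \frac{\partial^2 g(x, \phi)}{\partial x^2} \quad & = \quad \frac{\partial (\rho \ \max(0, \ x + \rho^{-1} \phi))}{\partial x} \\
    & = \quad \begin{cases} \rho & \text{if } x + \rho^{-1} \phi > 0 \\ 0 & \text{if } x + \rho^{-1} \phi < 0 \end{cases}
\end{aligned}

<br>

Since by definition we have that $\rho > 0$, the second derivative is $\geq 0$ wherever it exists. It is not defined at the kink $x = -\rho^{-1} \phi$, but the first derivative $\rho \ \max(0, \ x + \rho^{-1} \phi)$ is continuous and non-decreasing in $x$, which confirms that the function $x \rightarrow g(x, \phi)$ is convex.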

Similarly, for the function $\phi \rightarrow g(x, \phi)$, we analyse the sign of $\frac{\partial^2 g(x, \phi)}{\partial \phi^2}$:

<br>

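\begin{aligned}
    \frac{\partial^2 g(x, \phi)}{\partial \phi^2} \quad & = \quad \frac{\partial (\max(-\rho^{-1} \phi, \ x))}{\partial \phi} \\
    & = \quad \begin{cases} -\rho^{-1} & \text{if } x + \rho^{-1} \phi < 0 \\ 0 & \text{if } x + \rho^{-1} \phi > 0 \end{cases}
\end{aligned}

<br>

Since $\rho > 0$, the second derivative of $\phi \rightarrow g(x, \phi)$ is $\leq 0$ wherever it exists, and the first derivative $\max(-\rho^{-1} \phi, \ x)$ is continuous and non-increasing in $\phi$, so the function is concave.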
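
The convexity in $x$ and concavity in $\phi$ can also be illustrated with discrete second differences on a grid, which avoid the point where the second derivative is undefined. This is a sketch under assumed values: `rho`, the grid, the fixed points `0.8` and `-0.5`, and the helper `second_differences` are all chosen for illustration.

```python
import numpy as np

def g(x, phi, rho):
    """Vectorised g(x, phi) = -phi^2 / (2 rho) + (rho / 2) * max(0, x + phi / rho)^2."""
    return -phi**2 / (2.0 * rho) + 0.5 * rho * np.maximum(0.0, x + phi / rho) ** 2

def second_differences(values):
    """Discrete second differences v[k+1] - 2 v[k] + v[k-1] on a uniform grid."""
    return values[2:] - 2.0 * values[1:-1] + values[:-2]

rho = 1.5                               # assumed value, any rho > 0 works
grid = np.linspace(-3.0, 3.0, 601)      # uniform grid crossing the kink

d2_x = second_differences(g(grid, 0.8, rho))     # vary x, phi fixed at 0.8
d2_phi = second_differences(g(-0.5, grid, rho))  # vary phi, x fixed at -0.5

# A small tolerance absorbs floating-point rounding on the flat/linear parts
convex_in_x = bool(np.all(d2_x >= -1e-12))
concave_in_phi = bool(np.all(d2_phi <= 1e-12))

print(convex_in_x, concave_in_phi)
```

Non-negative second differences along $x$ and non-positive ones along $\phi$ are consistent with the saddle structure that the augmented Lagrangian relies on.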

# Exercise 4.4

In [None]:
def line_search_gradient():
    pass

# Exercise 4.5

In [None]:
def lagrangian_gradient():
    pass

# Exercise 4.6

In [None]:
def augmented_lagrangian():
    pass

# Exercise 5.1