# Support Vector Machines
## Course recap
This lab consists of implementing the **Support Vector Machines** (SVM) algorithm.

Consider a training set $ D = \left\{ \left(x^{(i)}, y^{(i)}\right), x^{(i)} \in \mathcal{X}, y^{(i)} \in \mathcal{Y}, i \in \{1, \dots, n \}  \right\}$, where $\mathcal{Y} = \{ 1, \dots, k\}$. Recall from lecture 7 that SVM aims to minimize the following cost function $J$, where $\theta_j$ is the parameter vector of class $j$ and $\Delta > 0$ is the margin:
$$
\begin{split}
J(\theta_1, \theta_2, \dots, \theta_k) 
	&= \sum_{i = 1}^n L_i \\
	&= \sum_{i = 1}^n \sum_{j \neq y^{(i)}} \max(0, \theta_j^Tx^{(i)} - \theta_{y^{(i)}}^T x^{(i)} + \Delta)
\end{split}
$$
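
To make the formula concrete, here is a minimal sketch that evaluates $L_i$ for one sample directly from its class scores $\theta_j^T x^{(i)}$. The helper name `hinge_loss_from_scores` and the score values are assumptions chosen for illustration, and classes are indexed from 0 (rather than from 1 as in $\mathcal{Y}$) to match Python lists.

```python
def hinge_loss_from_scores(scores, y, delta=1.0):
    """Multiclass hinge loss of one sample, given its k class scores."""
    correct = scores[y]  # score of the true class
    # Sum the margin violations of every class except the true one.
    return sum(max(0.0, s - correct + delta)
               for j, s in enumerate(scores) if j != y)

# Hypothetical scores for k = 3 classes; the true class is index 0.
scores = [3.2, 5.1, -1.7]
loss = hinge_loss_from_scores(scores, y=0, delta=1.0)
print(loss)  # roughly 2.9: only class 1 violates the margin
```

Class 2 scores far below the true class, so its term is clipped to 0 by the `max`.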

## Defining the training set
**Exercise 1**: Define variables `X` and `Y` that will contain the features $\mathcal{X}$ and labels $\mathcal{Y}$ of the training set.

**Hint**: Do not forget the intercept!

In [1]:
X = [[1., 50.], [1., 76.], [1., 26.], [1., 102.]]
Y = [30., 48., 12., 90.]

In this simple example, the dimensionality is $d = 1$ (which means 2 features: don't forget the intercept!) and the number of samples is $n = 4$.
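
As a sketch of the hint, the intercept can be added by prepending a constant `1.` to each raw feature vector. The helper `add_intercept` and the variable `raw_features` are hypothetical names, not part of the lab.

```python
def add_intercept(rows):
    """Return a copy of rows where each feature vector starts with 1."""
    return [[1.] + list(row) for row in rows]

# Raw one-dimensional features (d = 1), one inner list per sample.
raw_features = [[50.], [76.], [26.], [102.]]
X_with_intercept = add_intercept(raw_features)
print(X_with_intercept)
```

This reproduces the `X` defined above from its raw values.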

## Prediction function
**Exercise**: Define a function `predict` that takes as parameters *the feature vector* $x$ and *a model* $\theta$, and returns the score:
$$ h(x) = \theta^T x = \sum_{j = 0}^d \theta_j x_j$$

In [2]:
def predict(x, theta):
    d = len(x) # number of entries in x, intercept included
    thetaTx = 0 # accumulate the dot product theta^T x
    for idx in range(d):
        thetaTx += x[idx] * theta[idx]
    return thetaTx
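
The same dot product can be written more compactly with `zip`. This is only an alternative sketch: the name `predict_zip` and the model values are assumptions for illustration.

```python
def predict_zip(x, theta):
    """Score theta^T x, pairing each feature with its coefficient."""
    return sum(x_j * theta_j for x_j, theta_j in zip(x, theta))

x = [1., 50.]      # intercept followed by one feature
theta = [2., 0.5]  # hypothetical model
score = predict_zip(x, theta)
print(score)  # 1 * 2 + 50 * 0.5 = 27.0
```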

## Defining the cost function
### Cost function on a single sample
**Exercise**: Define a function `cost_function` that takes as parameters *the feature vector* $x$ and *the actual label* $y$ of a single sample, *the list of per-class models* `thetas`, and *the margin* $\Delta$ (`delta`), and returns the value of the cost function for this sample. Recall from the course recap that it is given by:
$$ L_i = \sum_{j \neq y^{(i)}} \max(0, \theta_j^Tx^{(i)} - \theta_{y^{(i)}}^T x^{(i)} + \Delta) $$

In [3]:
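def cost_function(x, y, thetas, delta):
    # y indexes thetas, so it must be an integer class index (0-based)
    thetayTx = predict(x, thetas[y]) # score of the correct class
    loss = 0
    k = len(thetas) # number of classes
    for j in range(k): # iterate over the classes, not the features
        if j != y: # skip the correct class
            thetajTx = predict(x, thetas[j])
            loss += max(0, thetajTx - thetayTx + delta)
    return loss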
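
A small worked example may help check an implementation by hand. The sketch below restates the two functions compactly so that it runs on its own. The three-class model `thetas` and the sample `x` are made-up values, and labels are assumed to be 0-based integer class indices.

```python
def predict(x, theta):
    return sum(x_j * t_j for x_j, t_j in zip(x, theta))

def cost_function(x, y, thetas, delta):
    score_y = predict(x, thetas[y])  # score of the correct class
    return sum(max(0, predict(x, theta_j) - score_y + delta)
               for j, theta_j in enumerate(thetas) if j != y)

# Hypothetical model with k = 3 classes and d = 1 (intercept first).
thetas = [[0., 1.], [1., 0.], [0.5, 0.5]]
x = [1., 2.]  # class scores are 2.0, 1.0 and 1.5

loss_if_class_0 = cost_function(x, 0, thetas, delta=1.)  # 0 + 0.5
loss_if_class_1 = cost_function(x, 1, thetas, delta=1.)  # 2 + 1.5
print(loss_if_class_0, loss_if_class_1)  # 0.5 3.5
```

The loss is small when the true class has the highest score and grows when other classes outscore it.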

In [4]:
def cost_function_total(X, Y, thetas, delta):
    cost = 0 # initialize the cost with 0
    n = len(Y)
    for i in range(n): # iterate over the training set
        x = X[i] # get the ith feature vector
        y = Y[i] # get the ith label
        cost += cost_function(x, y, thetas, delta) # add the cost of the current sample to the total cost
    return cost
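
For larger training sets, the double loop can be replaced by array operations. The following is a sketch assuming NumPy is available. The name `cost_function_total_np`, the toy data, and the 0-based integer labels are assumptions rather than part of the lab.

```python
import numpy as np

def cost_function_total_np(X, Y, thetas, delta=1.0):
    X = np.asarray(X, dtype=float)            # shape (n, d + 1)
    Y = np.asarray(Y, dtype=int)              # shape (n,), 0-based classes
    thetas = np.asarray(thetas, dtype=float)  # shape (k, d + 1)
    rows = np.arange(X.shape[0])
    scores = X @ thetas.T                     # scores[i, j] = theta_j^T x^(i)
    correct = scores[rows, Y][:, None]        # true-class score per sample
    margins = np.maximum(0.0, scores - correct + delta)
    margins[rows, Y] = 0.0                    # drop the j == y^(i) terms
    return float(margins.sum())

# Hypothetical data: n = 2 samples, k = 3 classes, d = 1.
X_toy = [[1., 2.], [1., 0.]]
Y_toy = [0, 1]
thetas_toy = [[0., 1.], [1., 0.], [0.5, 0.5]]
total = cost_function_total_np(X_toy, Y_toy, thetas_toy, delta=1.0)
print(total)  # 0.5 + 0.5 = 1.0
```

Each sample contributes 0.5 here, which matches what the loop-based definition gives on the same inputs.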