In [None]:
### Hinge Loss

Given a classification problem, we would like to find a separating hyperplane that minimizes the hinge loss over all of the data. For a hyperplane $\{a : a^Tx = 0\}$ with normal vector $x$ and a data-label pair $(a, b)$ with $b \in \{-1, 1\}$, the hinge loss (also called SVM loss) is:

\begin{equation}
L(a, b) = \begin{cases}
\max(1 - a^T x, 0) & b = 1 \\
\max(1 + a^T x, 0) & b = -1
\end{cases}
\end{equation}

Because $b \in \{-1, 1\}$, the two cases collapse into a single expression:

\begin{equation}
L(a, b) = \max(1 - ba^Tx, 0)
\end{equation}

We sum this over all data-label pairs $(a, b)$ to get the overall hinge loss.
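
As a quick numerical check of the compact form, the following NumPy sketch evaluates the summed hinge loss on a tiny hand-made dataset. The function name `total_hinge_loss` and the sample values are illustrative assumptions, not part of the notebook's code.

```python
import numpy as np

def total_hinge_loss(x, A, b):
    """Sum of max(1 - b_i * a_i^T x, 0) over the rows a_i of A."""
    margins = b * (A @ x)
    return float(np.sum(np.maximum(1.0 - margins, 0.0)))

# Hypothetical data: three points in R^2 with labels in {-1, 1}.
A = np.array([[2.0, 0.0],
              [0.5, 0.0],
              [1.0, 0.0]])
b = np.array([1.0, 1.0, -1.0])
x = np.array([1.0, 0.0])

loss = total_hinge_loss(x, A, b)
print(loss)  # 2.5
```

The first point is classified correctly with margin at least 1 and contributes 0, the second is correct but inside the margin and contributes 0.5, and the third is misclassified and contributes 2.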

#### L1 Regularization
We regularize the choice of parameters $x$ with an extra $\lambda \|x\|_1$ term. Our final problem is:

\begin{equation*}
  \begin{aligned}
    &\text{minimize} && \sum_{i = 1}^m \max(1 - b_ia_i^Tx, 0) + \lambda \|x\|_1 \\
  \end{aligned}
\end{equation*}

with variable $x$ and labelled data $(a_1, b_1), \ldots, (a_m, b_m)$.
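
One way to see that this problem is a linear program is the standard epigraph reformulation sketched here. The auxiliary variables $t \in \mathbf{R}^m$ and $s \in \mathbf{R}^n$ are introduced for illustration only and do not appear in the code:

\begin{equation*}
  \begin{aligned}
    &\text{minimize} && \sum_{i = 1}^m t_i + \lambda \sum_{j = 1}^n s_j \\
    &\text{subject to} && t_i \geq 1 - b_ia_i^Tx, \quad t_i \geq 0, \quad i = 1, \ldots, m \\
    & && -s_j \leq x_j \leq s_j, \quad j = 1, \ldots, n
  \end{aligned}
\end{equation*}

For $\lambda > 0$, any optimal point has $t_i = \max(1 - b_ia_i^Tx, 0)$ and $s_j = |x_j|$, so both formulations attain the same optimal value.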

#### L2 Regularization
We regularize the choice of parameters $x$ with an extra $\lambda \|x\|_2^2$ term. Our final problem is:

\begin{equation*}
  \begin{aligned}
    &\text{minimize} && \sum_{i = 1}^m \max(1 - b_ia_i^Tx, 0) + \lambda \|x\|_2^2 \\
  \end{aligned}
\end{equation*}

with variable $x$ and labelled data $(a_1, b_1), \ldots, (a_m, b_m)$.
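
To make the difference between the two penalties concrete, this NumPy sketch evaluates both regularized objectives at the same point. The function names, the data, and the value of `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def hinge_l1_objective(x, A, b, lam):
    """Hinge loss plus lam * ||x||_1."""
    hinge = np.sum(np.maximum(1.0 - b * (A @ x), 0.0))
    return float(hinge + lam * np.sum(np.abs(x)))

def hinge_l2_objective(x, A, b, lam):
    """Hinge loss plus lam * ||x||_2^2."""
    hinge = np.sum(np.maximum(1.0 - b * (A @ x), 0.0))
    return float(hinge + lam * np.dot(x, x))

# Hypothetical data: two points in R^2 with labels in {-1, 1}.
A = np.array([[1.0, 2.0],
              [-1.0, 0.5]])
b = np.array([1.0, -1.0])
x = np.array([0.5, -2.0])
lam = 0.1

obj_l1 = hinge_l1_objective(x, A, b, lam)
obj_l2 = hinge_l2_objective(x, A, b, lam)
print(obj_l1, obj_l2)
```

Both objectives share the same hinge term (4.5 here). The L1 penalty charges each entry in proportion to its magnitude, while the squared L2 penalty charges the large entry $-2$ quadratically.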


In [2]:
import cvxpy as cp
import numpy as np
import scipy.sparse as sps

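def normalized_data_matrix(m, n, mu):
    """Return a random m-by-n matrix with unit-norm columns and density mu."""
    if mu == 1:
        # Dense case: scale each column to unit Euclidean norm.
        A = np.random.randn(m, n)
        A /= np.sqrt(np.sum(A**2, 0))
    else:
        # Sparse case: random sparsity pattern with Gaussian nonzeros.
        A = sps.rand(m, n, mu)
        A.data = np.random.randn(A.nnz)
        # N holds the squared entries, so its column sums are squared norms.
        N = A.copy()
        N.data = N.data**2
        # Right-multiplying by a diagonal matrix rescales the columns.
        A = A*sps.diags([1 / np.sqrt(np.ravel(N.sum(axis=0)))], [0])

    return A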

def create_classification(m, n, rho=1, mu=1, sigma=0.05):
    """Create a random classification problem."""
    A = normalized_data_matrix(m, n, mu)
    # Ground-truth weights x0 with density rho and Gaussian nonzeros.
    x0 = sps.rand(n, 1, rho)
    x0.data = np.random.randn(x0.nnz)
    x0 = x0.toarray().ravel()

    # Labels are the signs of noisy linear measurements of x0.
    b = np.sign(A.dot(x0) + sigma*np.random.randn(m))
    return A, b

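def hinge_loss(theta, X, y):
    """Sum of hinge losses max(1 - y_i * X_i theta, 0) as a CVXPY expression."""
    # Every label must be -1 or 1; this also rejects zeros produced by np.sign.
    if not np.all(np.isin(y, [-1, 1])):
        raise ValueError("y must have binary labels in {-1,1}")
    # CVXPY 1.0 renamed sum_entries and max_elemwise to sum and maximum.
    return cp.sum(cp.maximum(1 - cp.multiply(y, X @ theta), 0))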

np.random.seed(0)

m = 1500
n = 50000
rho = 0.01
sigma = 0.05
mu = 0.1
lam = 0.5*sigma*np.sqrt(m*np.log(mu*n))

A, b = create_classification(m, n, rho=rho, mu=mu, sigma=sigma)


x = cp.Variable(A.shape[1])
f_1 = hinge_loss(x, A, b) + lam*cp.norm1(x)
f_2 = hinge_loss(x, A, b) + lam*cp.sum_squares(x)

# Problem construction
prob1 = cp.Problem(cp.Minimize(f_1))
prob2 = cp.Problem(cp.Minimize(f_2))

opt_val = None

# Problem collection

# Single problem collection
problem1Dict = {
    "problemID" : "hinge_l1_0",
    "problem"   : prob1,
    "opt_val"   : None
}
problem2Dict = {
    "problemID" : "hinge_l2_0",
    "problem"   : prob2,
    "opt_val"   : None
}
problems = [problem1Dict, problem2Dict]



# For debugging individual problems:
if __name__ == "__main__":
    def printResults(problemID = "", problem = None, opt_val = None):
        print(problemID)
        problem.solve()
        print("\tstatus: {}".format(problem.status))
        print("\toptimal value: {}".format(problem.value))
        print("\ttrue optimal value: {}".format(opt_val))
    printResults(**problems[0])
    printResults(**problems[1])

hinge_l1_0
	status: optimal
	optimal value: 1376.44474166
	true optimal value: None
hinge_l2_0


SolverError: Solver 'ECOS' failed. Try another solver.