Example 1
=========

Let us test ``cgn`` on a simple unconstrained linear least-squares problem. As an example, we use problem 32 from the article

Moré, J., Garbow, B. and Hillstrom, K. "Testing Unconstrained Optimization Software", 1981.

The problem is as follows:

$
\begin{align}
\min_{x \in \mathbb R^n} & \|F(x)\|_2^2 + \|x\|_2^2, \\
\text{where} \quad & F(x)_j = x_j - \frac{2}{m} \sum_{i=1}^n x_i - 1, \quad \text{for } 1 \leq j \leq n, \\
& F(x)_j = - \frac{2}{m} \sum_{i=1}^n x_i - 1, \quad \text{for } n < j \leq m.
\end{align}
$

with $m \geq n$. Let us choose $n=200$ and $m=400$.
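
Because $F$ is affine, the minimizer can be worked out by hand, which gives a reference value for the solver. The following short derivation is a sketch that is not taken from the cited article; the symbols $A$ (the constant Jacobian of $F$), $e$ (the all-ones vector in $\mathbb R^n$) and $\mathbf 1_m$ (the all-ones vector in $\mathbb R^m$) are introduced here for illustration. Writing $F(x) = A x - \mathbf 1_m$, the upper $n \times n$ block of $A$ is $I_n - \frac{2}{m} e e^\top$, and each of the remaining $m - n$ rows equals $-\frac{2}{m} e^\top$. Then

$
\begin{align}
A^\top A &= \left(I_n - \tfrac{2}{m} e e^\top\right)^2 + (m-n) \tfrac{4}{m^2} e e^\top
          = I_n - \tfrac{4}{m} e e^\top + \tfrac{4n}{m^2} e e^\top + \tfrac{4(m-n)}{m^2} e e^\top = I_n, \\
A^\top \mathbf 1_m &= \left(1 - \tfrac{2n}{m}\right) e - \tfrac{2(m-n)}{m} e = -e, \\
(A^\top A + I_n) x^* &= A^\top \mathbf 1_m \quad \Longrightarrow \quad 2 x^* = -e \quad \Longrightarrow \quad x^* = -\tfrac{1}{2} e.
\end{align}
$

So, for any admissible $m$ and $n$, every component of the minimizer should equal $-0.5$.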

First, we implement the affine misfit function $F=F(x)$.

In [51]:
import numpy as np

m = 400
n = 200

def F(x):
    # shared term (2/m) * sum_i x_i, computed once
    s = 2.0 * np.sum(x) / m
    # rows n < j <= m: -(2/m) * sum_i x_i - 1
    z = np.full(m, -s - 1.0)
    # rows 1 <= j <= n additionally contain x_j
    z[:n] += x
    return z

We also have to implement its Jacobian:


In [89]:
# every entry of the Jacobian matrix contains the term -2/m
A = np.full((m, n), -2.0 / m)
# upper half of the Jacobian matrix: add the identity block
A[:n, :] += np.eye(n)
# lower half of the Jacobian matrix: stays at -2/m

def DF(x):
    # F is affine, so the Jacobian does not depend on x
    return A
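
Before handing ``F`` and ``DF`` to a solver, it can be worth checking that they agree. The sketch below is not part of ``cgn``; it rebuilds the misfit and the Jacobian with plain NumPy and uses the fact that, for an affine map, $F(x_0 + p) - F(x_0) = A p$ holds exactly up to rounding. The names ``x0``, ``p`` and ``jac_error`` are chosen here for illustration.

```python
import numpy as np

m, n = 400, 200

def F(x):
    # misfit of problem 32: shared term plus x_j in the first n rows
    s = 2.0 * np.sum(x) / m
    z = np.full(m, -s - 1.0)
    z[:n] += x
    return z

# constant Jacobian: -2/m everywhere, plus the identity in the upper block
A = np.full((m, n), -2.0 / m)
A[:n, :] += np.eye(n)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(n)  # arbitrary base point
p = rng.standard_normal(n)   # arbitrary direction

# For an affine F, the difference quotient with step 1 equals A @ p.
jac_error = np.max(np.abs(F(x0 + p) - F(x0) - A @ p))
print(f"max deviation between F(x0 + p) - F(x0) and A @ p: {jac_error:.2e}")
```

A deviation on the order of machine precision indicates that the Jacobian matches the misfit function.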

Let us now set up the ``cgn.Parameter`` object:

In [53]:
import cgn

x = cgn.Parameter(dim=n, name="x")

We set up the regularization term by setting $\beta = 1$. The regularization operator defaults to the identity, and the regularizing guess defaults to the zero vector.

In [54]:
x.beta = 1.

Next, we can initialize the ``cgn.Problem`` object:

In [55]:
problem = cgn.Problem(parameters=[x], fun=F, jac=DF)

Next, we initialize the solver...

In [56]:
solver = cgn.CGN()

and solve the problem. As the starting value, we simply use the vector of all ones.

In [101]:
x_start = np.ones(n)
solver.options.set_verbosity(2)
solution = solver.solve(problem=problem, starting_values=[x_start])



Starting the constrained Gauss-Newton method. Cost at starting value: 600.0
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
| Iteration | Cost                    | Constraint violation    | Stepsize (||p||)        | Steplength (h)          | Computation time [s]    |
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
|     1     |    150.00000000000017   |           0.0           |    21.213203435596412   |           1.0           |   0.008646965026855469  |
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
|     2     |    149.99999999999991   |           0.0     

Let us view the solution:

In [102]:
x_min = solution.minimizer("x")
x_min

array([-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
       -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0
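
The costs printed in the solver log can be reproduced by hand. The sketch below assumes that the reported cost is half of the objective, that is $\frac{1}{2}\|F(x)\|_2^2 + \frac{\beta}{2}\|x\|_2^2$; this convention is an assumption inferred from the starting cost of 600, not something stated in this example. The helper ``cost`` is hypothetical and not part of ``cgn``.

```python
import numpy as np

m, n = 400, 200
beta = 1.0

def F(x):
    # misfit of problem 32
    s = 2.0 * np.sum(x) / m
    z = np.full(m, -s - 1.0)
    z[:n] += x
    return z

def cost(x):
    # assumed convention: half of the regularized least-squares objective
    return 0.5 * np.sum(F(x) ** 2) + 0.5 * beta * np.sum(x ** 2)

cost_start = cost(np.ones(n))            # starting value: vector of all ones
cost_candidate = cost(np.full(n, -0.5))  # constant vector with entries -0.5

print(f"cost at starting value: {cost_start}")      # prints 600.0
print(f"cost at candidate:      {cost_candidate}")  # prints 150.0
```

Under this assumption, the value at the constant vector $-0.5$ agrees with the cost of roughly 150 reported after the first iteration.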

Let us compare this solution to the one obtained with ridge regression:

In [103]:
from sklearn.linear_model import Ridge
print(A)
clf = Ridge(alpha=1.)
y = np.ones(m)
clf.fit(X=A, y=y)
x_ridge = clf.coef_

difference_to_ridge = np.linalg.norm(x_min - x_ridge)
print(f"Difference to ridge: {difference_to_ridge}")
x_ridge

[[ 0.995 -0.005 -0.005 ... -0.005 -0.005 -0.005]
 [-0.005  0.995 -0.005 ... -0.005 -0.005 -0.005]
 [-0.005 -0.005  0.995 ... -0.005 -0.005 -0.005]
 ...
 [-0.005 -0.005 -0.005 ... -0.005 -0.005 -0.005]
 [-0.005 -0.005 -0.005 ... -0.005 -0.005 -0.005]
 [-0.005 -0.005 -0.005 ... -0.005 -0.005 -0.005]]
Difference to ridge: 7.071067811865471


array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
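
The nonzero difference above is most likely explained by the intercept rather than by the solver. By default, ``Ridge`` fits an unpenalized intercept (``fit_intercept=True``). Since ``y`` is constant, the intercept can absorb it completely, so all coefficients come out as zero. The problem solved with ``cgn`` has no intercept. The following sketch uses only NumPy and the closed-form normal equations, not scikit-learn itself, to compare both variants; the names ``w_no_intercept`` and ``w_intercept`` are chosen here for illustration.

```python
import numpy as np

m, n = 400, 200
alpha = 1.0

# Jacobian of the affine misfit F(x) = A x - 1
A = np.full((m, n), -2.0 / m)
A[:n, :] += np.eye(n)
y = np.ones(m)

# Ridge without intercept: minimize ||y - A w||^2 + alpha * ||w||^2
w_no_intercept = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

# Ridge with an unpenalized intercept: equivalent to centering A and y first.
# Here y is constant, so the centered target is zero and so are the coefficients.
A_c = A - A.mean(axis=0)
y_c = y - y.mean()
w_intercept = np.linalg.solve(A_c.T @ A_c + alpha * np.eye(n), A_c.T @ y_c)

print("without intercept:", w_no_intercept[:3])
print("with intercept:   ", w_intercept[:3])
```

The variant without intercept gives coefficients of about $-0.5$, which matches ``x_min``, while the variant with intercept reproduces the zero vector shown above. Passing ``fit_intercept=False`` to ``Ridge`` should correspond to the first variant.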