Example 2
=========

In this example, we are going to test ``cgn`` on a constrained version of the classic "Osborne problem" (taken from Wright and Holt, "Algorithms for Nonlinear Least Squares with Linear Inequality Constraints", SIAM Journal on Scientific and Statistical Computing, 1985):

$
\begin{align}
\min_{x \in \mathbb R^{11}} \quad & ||y - G(x)||_2^2 \\
\text{s. t.} \quad & C x \geq d, \\
\text{where } \quad & G_j(x) = x_1 e^{-t_j x_5} + x_2 e^{-(t_j - x_9)^2 x_6}
 + x3 e^{-(t_j - x_{10})^2 x_7} + x_4 e^{-(t_j - x_{11})^2 x_8}, \\
& t_j = \frac{j}{10}, \quad j=1,\ldots, 65, \\
& C = \left( \begin{matrix}
1 & 2 & 3 & 4 \\
1 & 1 & 0 & 0
\end{matrix} \right), \quad d = \left( \begin{matrix}
6.270063 \\ 1.741584
\end{matrix} \right)
\end{align}
$

This is a nonlinear least-squares problem with linear inequality constraints, and
its solution is known to satisfy $||y - G(x_\mathrm{min})||_2^2 = 0.0401377$.
The data vector $y$ is given by


In [52]:
import numpy as np

y = np.array([
    1.366, 1.191, 1.112, 1.013, 0.991, 0.885, 0.831, 0.847, 0.786, 0.725,
    0.746, 0.679, 0.608, 0.655, 0.616, 0.606, 0.602, 0.626, 0.651, 0.724,
    0.649, 0.649, 0.694, 0.644, 0.624, 0.661, 0.612, 0.558, 0.533, 0.495,
    0.500, 0.423, 0.395, 0.375, 0.372, 0.391, 0.396, 0.405, 0.428, 0.429,
    0.523, 0.562, 0.607, 0.653, 0.672, 0.708, 0.633, 0.668, 0.645, 0.632,
    0.591, 0.559, 0.597, 0.625, 0.739, 0.710, 0.729, 0.720, 0.636, 0.581,
    0.428, 0.292, 0.162, 0.098, 0.054
])

Let us start with implementation of the function $G$ and its Jacobian.

In [53]:
from math import exp

m = 65
n = 11

t = np.zeros(m)
for i in range(m):
    t[i] = float(i/10)

def G(x):
    z = np.ones(m)
    for i in range(m):
            z[i] = x[0]*exp(-x[4]*t[i]) + x[1]*exp(-x[5]*((t[i]-x[8])**2)) \
                + x[2]*exp(-x[6]*((t[i]-x[9])**2)) + x[3]*exp(-x[7]*((t[i]-x[10])**2))
    return z

def DG(x):
    jac = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            jac[i, 0] = exp(-x[4] * t[i])
            jac[i, 1] = exp(-x[5] * (t[i] - x[8]) ** 2)
            jac[i, 2] = exp(-x[7] * (t[i] - x[9]) ** 2)
            jac[i, 3] = exp(-x[7] * (t[i] - x[10]) ** 2)
            jac[i, 4] = - t[i] * x[0] * exp(-x[4] * t[i])
            jac[i, 5] = - (t[i] - x[8]) ** 2 * x[1] * exp(-x[5] * (t[i] - x[8]) ** 2)
            jac[i, 6] = - (t[i] - x[9]) ** 2 * x[2] * exp(-x[6] * (t[i] - x[9]) ** 2)
            jac[i, 7] = - (t[i] - x[10]) ** 2 * x[3] * exp(-x[7] * (t[i] - x[10]) ** 2)
            jac[i, 8] = 2.0 * x[5] * (t[i] - x[8]) * x[1] * exp(-x[5] * (t[i] - x[8]) ** 2)
            jac[i, 9] = 2.0 * x[6] * (t[i] - x[9]) * x[2] * exp(-x[6] * (t[i] - x[9]) ** 2)
            jac[i, 10] = 2.0 ** x[7] * (t[i] - x[10]) * x[3] * exp(-x[7] * (t[i] - x[10]) ** 2)
    return jac

Next, set up the matrix $C$ and the vector $d$ for the inequality constraint.


In [54]:
C = np.zeros((2, n))
C[0, 0] = 1.0
C[0, 1] = 2.0
C[0, 3] = 3.0
C[0, 4] = 4.0
C[1, 0] = 1.0
C[1, 2] = 1.0
d = np.array([6.270063, 1.741584])

As initial guess, we use the canonical starting value from the cited paper:

In [55]:
x_start = np.array([1.3, 0.65, 0.65, 0.7, 0.6, 3.0, 5.0, 7.0, 2.0, 4.5, 4.5])

With this, we have now all the ingredients to set up our ``cgn.Problem``.

In [56]:
import cgn

x = cgn.Parameter(start=x_start, name="x")
incon = cgn.LinearConstraint(parameters=[x], a=C, b=d, ctype="ineq")

# Define the misfit function.
def F(x):
    return G(x) - y
# Note that DF(x) = DG(x).
problem = cgn.Problem(parameters=[x], fun=F, jac=DG, constraints=[incon])

To stabilize the problem, we add very mild regularization to $x$:

In [57]:
x.beta = 1e-10

Now, we can solve this problem with ``cgn``:

In [58]:
solver = cgn.CGN()
# Let the solver return some output:
solver.options.set_verbosity(lvl=2)
solution = solver.solve(problem=problem)



Starting the constrained Gauss-Newton method. Cost at starting value: 3.5135292586674707
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
| Iteration | Cost                    | Constraint violation    | Stepsize (||p||)        | Steplength (h)          | Computation time [s]    |
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
|     1     |    2.019268940326675    |           0.0           |    51.54989970703435    |          0.125          |   0.00884699821472168   |
+-----------+-------------------------+-------------------------+-------------------------+-------------------------+-------------------------+
|     2     |    0.6954282969801624   |      

Let's check how our solution compares to the theoretical optimum:

In [59]:
theoretical_minimum = 4.01377e-2
solution.cost

0.020288130249285115

Wait, this is below the theoretical optimum! What's wrong? Well, recall that ``cgn`` uses the cost function
$
J(x) = \frac{1}{2} ||F(x)||_2^2 + \frac{\beta}{2} ||R(x-m)||_2^2.
$
Hence, we need to multiply this by $2$ to get the desired quantity:

In [60]:
2 * solution.cost

0.04057626049857023

This is satisfyingly close to the theoretical optimum.