# Nonlinear optimization: multiple choice questions

## Introduction to optimization and operations research

Michel Bierlaire


In [None]:

import matplotlib.pyplot as plt
import numpy as np


# Question 1
Suppose that $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a differentiable nonlinear function,
$x_k\in\mathbb{R}^n$ is a point and $d_k\in\mathbb{R}^n$ is a descent direction such that $\nabla f(x_k)^T d_k < 0$,
and that $f$ is bounded from below in the direction $d_k$. We write the two Wolfe conditions as
$$
f(x_k + \alpha d_k) \leq f(x_k) + \alpha\beta_1\nabla f(x_k)^T d_k
$$
and
$$
\frac{\nabla f(x_k + \alpha d_k)^T d_k}{\nabla f(x_k)^Td_k} \leq \beta_2.
$$
Which condition must the parameters $\beta_1$ and $\beta_2$ satisfy to guarantee that
there exists a step size $\alpha$ satisfying both Wolfe conditions?

1. Such a condition does not exist.
2. $0 < \beta_1 < \beta_2 < 1$.
3. $0 < \beta_2 < \beta_1 < 1$.
4. $0 < \beta_1 = \beta_2 < 1$.

Answer 2 is correct. A sufficient condition for the two Wolfe conditions to be compatible
is that $0 < \beta_1 < \beta_2 < 1$. This is Theorem 11.9 in the book.
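
To see why the ordering of $\beta_1$ and $\beta_2$ matters, the sketch below scans a grid of step sizes and keeps those that satisfy both Wolfe conditions. It is only an illustration under stated assumptions: the one-dimensional function `h`, the helper `wolfe_steps`, the grid, and the parameter values 0.1 and 0.9 are chosen for this example and do not come from the theorem.

```python
import numpy as np


def h(alpha):
    # Illustrative (assumed) function along the direction: h(alpha) = f(x_k + alpha d_k).
    return 2 * alpha**2 - 12 * alpha + 55


def dh(alpha):
    # Directional derivative: h'(alpha) = grad f(x_k + alpha d_k)^T d_k.
    return 4 * alpha - 12


def wolfe_steps(beta1, beta2, alphas):
    # Hypothetical helper: return the steps of the grid that satisfy both conditions.
    first = h(alphas) <= h(0) + alphas * beta1 * dh(0)  # sufficient decrease
    second = dh(alphas) / dh(0) <= beta2                # sufficient progress
    return alphas[first & second]


alphas = np.linspace(0.01, 6, 600)

ok = wolfe_steps(0.1, 0.9, alphas)       # beta1 < beta2
swapped = wolfe_steps(0.9, 0.1, alphas)  # beta1 > beta2

print("beta1 < beta2:", ok.size, "acceptable steps on the grid")
print("beta1 > beta2:", swapped.size, "acceptable steps on the grid")
```

For this particular function, the first condition with $\beta_1 = 0.1$ accepts steps up to about 5.4 and the second condition with $\beta_2 = 0.9$ accepts steps from about 0.3, so the two overlap. With the values swapped, the first condition requires $\alpha \leq 0.6$ and the second requires $\alpha \geq 2.7$, and no step is left. A single example does not prove the theorem; it only shows what can go wrong when $\beta_1 > \beta_2$.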

# Question 2
Suppose that $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a
differentiable nonlinear function, $x_k\in\mathbb{R}^n$ is a point
and $d_k\in\mathbb{R}^n$ is a descent direction. Let $\alpha^*$ be
the step that minimizes the function along $d_k$, that is,
$$
f(x_k + \alpha^* d_k ) \leq f(x_k + \alpha d_k), \quad \forall \alpha \geq 0.
$$
What can be said about $\alpha^*$?

1. $\alpha^*$ always verifies both Wolfe conditions.
2. $\alpha^*$ always verifies the first Wolfe condition, but not necessarily the second.
3. $\alpha^*$ always verifies the second Wolfe condition, but not necessarily the first.
4. Nothing can be said about the validity of the Wolfe conditions at $\alpha^*$ in the general case.

The second Wolfe condition is
$$
\frac{\nabla f(x_k + \alpha^* d_k)^T d_k}{\nabla f(x_k)^T d_k} \leq \beta_2.
$$
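As $d_k$ is a descent direction, $\alpha^* > 0$. As $\alpha^*$ minimizes
the function along $d_k$, the directional derivative at $\alpha^*$ is zero, that is,
$$
\nabla f(x_k + \alpha^* d_k)^Td_k = 0.
$$
The left-hand side of the second Wolfe condition is therefore zero and,
as $\beta_2 > 0$, the condition is verified.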

The first Wolfe condition is not necessarily verified. If the value
of $\beta_1$ is large, this condition allows only short steps. This
is illustrated by the figure below, where
$$
f(x_k + \alpha d_k) = 2 \alpha^2 - 12 \alpha + 55,
$$
and $\beta_1=\frac{2}{3}$.
The directional derivative at $\alpha=0$ is $\nabla f(x_k)^T d_k = -12$.
Therefore, the line defined by the first Wolfe condition is
$$
f(x_k) + \alpha \beta_1 \nabla f(x_k)^T d_k = 55 - 8\alpha.
$$
It crosses the curve at $\alpha_\text{max}=2$, before the minimum at $\alpha^*=3$. Only steps
no longer than $\alpha_\text{max}$ are accepted, so
the step $\alpha^*$, corresponding to the minimum, is rejected by the condition.

In [None]:


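# Step sizes alpha on the horizontal axis
x = np.linspace(0, 5, 400)
# Function along the direction: f(x_k + alpha d_k)
y1 = 2 * x**2 - 12 * x + 55
# Line of the first Wolfe condition with beta_1 = 2/3
y2 = 55 - 8 * x
plt.figure(figsize=(8, 6))
plt.plot(x, y1, label=r'$2 \alpha^2 - 12 \alpha + 55$', color='blue')
plt.plot(x, y2, label=r'$55 - 8\alpha$', color='red', linewidth=2)
plt.xlim(0, 5)
plt.ylim(35, 57)

plt.xlabel(r'$\alpha$')
# Mark the crossing point (alpha = 2) and the minimizer (alpha = 3)
plt.xticks([2, 3], [r'$\alpha_{\mathrm{max}}=2$', r'$\alpha^*=3$'])

plt.legend()
plt.grid(True)
plt.show()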
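
The numbers used in the answer to Question 2 can also be checked directly. The sketch below recomputes $\alpha^*$ and $\alpha_\text{max}$ for the quadratic of the example and evaluates both Wolfe conditions at $\alpha^*$. The names `a`, `b`, `c`, `h`, `dh` and `wolfe_line` are introduced here for illustration, and the value `beta2 = 0.9` is an assumption, since the example does not fix $\beta_2$ (any value in $(0, 1)$ gives the same conclusion).

```python
# Coefficients of h(alpha) = a alpha^2 + b alpha + c = f(x_k + alpha d_k)
a, b, c = 2.0, -12.0, 55.0
beta1 = 2 / 3
beta2 = 0.9  # assumed value, not given in the example


def h(alpha):
    return a * alpha**2 + b * alpha + c


def dh(alpha):
    # Directional derivative h'(alpha)
    return 2 * a * alpha + b


def wolfe_line(alpha):
    # Right-hand side of the first Wolfe condition
    return h(0) + alpha * beta1 * dh(0)


# Minimizer along the direction: h'(alpha) = 0
alpha_star = -b / (2 * a)
# Nonzero crossing of the curve and the line: a alpha^2 + b (1 - beta1) alpha = 0
alpha_max = -b * (1 - beta1) / a

first_ok = h(alpha_star) <= wolfe_line(alpha_star)
second_ok = dh(alpha_star) / dh(0) <= beta2

print("alpha_star =", alpha_star, " alpha_max =", alpha_max)
print("first Wolfe condition at alpha_star:", first_ok)
print("second Wolfe condition at alpha_star:", second_ok)
```

At $\alpha^* = 3$ the function value is 37 while the line is at 31, so the first condition fails, whereas the ratio of directional derivatives is zero and the second condition holds.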