# Nonlinear optimization: validity of Wolfe conditions

## Introduction to optimization and operations research

Michel Bierlaire


In [None]:

import matplotlib.pyplot as plt
import numpy as np


In this lab, you will test the **validity of the Wolfe conditions** in an inexact line search.
You will work with a univariate function g(α) = f(x + α d), compute g′(α), and check the
**first Wolfe condition** (sufficient decrease) and the **second Wolfe condition** (sufficient progress)
in the **correct order**. You will diagnose whether a trial step is “too long” or “too short”,
and visualize both conditions on the same plot to see how their thresholds delimit acceptable
step sizes. The goal is to connect the algebra (derivatives and inequalities) with practical
line‑search decisions, so you understand *why* algorithms shrink or accept a step and how to
propose α that satisfies **both** conditions.

We are performing a line search along a direction $d$ at a point
$x$. The function evaluated at $x+\alpha d$ is defined as
$$
g(\alpha) = f(x+\alpha d) = -1.833 \alpha^3 + 8.5 \alpha^2  - 10 \alpha + 5.
$$
Consider the step $\alpha=3$, the first Wolfe condition with
$\beta_1=0.15$, and the second Wolfe condition with $\beta_2=0.8$.

# Question 1
Is the step $\alpha=3$ too short, or too long?

As
$$
g(\alpha) = f(x+\alpha d) = -1.833 \alpha^3 + 8.5 \alpha^2  - 10 \alpha + 5,
$$
the directional derivative is
$$
g'(\alpha) = -3 \cdot 1.833 \alpha^2 + 2 \cdot 8.5 \alpha  - 10.
$$
The second Wolfe condition requires $g'(\alpha) \geq \beta_2 g'(0)$. As
$g'(0) = -10 < 0$, dividing by $g'(0)$ reverses the inequality, and
the condition for $\alpha=3$ is
\begin{align*}
\frac{g'(3)}{g'(0)}& \stackrel{?}{\leq} \beta_2, \\
\frac{-8.491}{-10} \approx 0.85 &\stackrel{?}{\leq} 0.8.
\end{align*}
It is violated. Therefore, it is tempting to conclude that the step is
too short.
The first Wolfe condition for $\alpha = 3$ is
\begin{align*}
g(3) &\stackrel{?}{\leq} g(0) + 3 \beta_1 g'(0), \\
2.009 &\stackrel{?}{\leq} 5 - 4.5 = 0.5.
\end{align*}
It is also violated. Therefore, it is tempting to conclude that the
step is too long.
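
The numbers used in the two checks above can be reproduced numerically. The following is a minimal sketch, assuming that `numpy.polynomial.Polynomial` is used to represent $g$ (the notebook itself does not use this class); the variable names `ratio` and `bound` are introduced only for illustration.

```python
from numpy.polynomial import Polynomial

# g(alpha), with coefficients listed by increasing degree.
g = Polynomial([5.0, -10.0, 8.5, -1.833])
dg = g.deriv()  # g'(alpha) = -10 + 17 alpha - 5.499 alpha^2

alpha, beta_1, beta_2 = 3.0, 0.15, 0.8

# Quantity compared with beta_2 in the second Wolfe condition.
ratio = dg(alpha) / dg(0.0)
# Right-hand side of the first Wolfe condition.
bound = g(0.0) + alpha * beta_1 * dg(0.0)

print(f"g'(0) = {dg(0.0):.3f}, g'(3) = {dg(alpha):.3f}, ratio = {ratio:.4f}")
print(f"g(3) = {g(alpha):.3f}, bound = {bound:.3f}")
print("second condition holds:", ratio <= beta_2)
print("first condition holds:", g(alpha) <= bound)
```

Taken separately, both tests fail at $\alpha=3$, which is exactly the apparent contradiction discussed next.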

However, a step cannot be both too short and too long. The two
conditions must be verified in the right order. The
first Wolfe condition must be checked first. If it is violated, we
conclude that the step is too long, and we do not need to consider the
second condition. The second Wolfe condition is checked only when the
step is not deemed too long by the first condition.
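
The ordering of the two tests can be sketched as a small function. This is an illustration only: the helper `diagnose_step` and its return strings are hypothetical names, not part of the notebook.

```python
def g(alpha):
    return -1.833 * alpha**3 + 8.5 * alpha**2 - 10 * alpha + 5


def dg(alpha):
    return -3 * 1.833 * alpha**2 + 2 * 8.5 * alpha - 10


def diagnose_step(alpha, beta_1=0.15, beta_2=0.8):
    """Hypothetical helper: classify a trial step, testing Wolfe 1 before Wolfe 2."""
    # First Wolfe condition (sufficient decrease) is always tested first.
    if g(alpha) > g(0) + alpha * beta_1 * dg(0):
        return "too long"
    # The second condition is reached only if the first one holds.
    # The ratio form is valid here because g'(0) < 0.
    if dg(alpha) / dg(0) > beta_2:
        return "too short"
    return "acceptable"


for alpha in (0.05, 1.0, 3.0):
    print(alpha, diagnose_step(alpha))
```

With this ordering, each step receives exactly one diagnosis: $\alpha=3$ is reported as too long, and the second condition is never evaluated for it.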

In the case described above, the step is too long because it violates
the first condition, and the algorithm should shorten it.
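
To illustrate how a step is shortened or lengthened in practice, here is a minimal sketch of a bracketing line search. It is one common scheme, assumed here for illustration and not necessarily the one used in the course: a step that is too long becomes the upper end of the bracket, a step that is too short becomes the lower end, and the next trial is the midpoint of the bracket (or `lam` times the step while no upper end is known). The names `line_search`, `lam` and `max_iter` are hypothetical.

```python
import math


def g(alpha):
    return -1.833 * alpha**3 + 8.5 * alpha**2 - 10 * alpha + 5


def dg(alpha):
    return -3 * 1.833 * alpha**2 + 2 * 8.5 * alpha - 10


def line_search(alpha, beta_1=0.15, beta_2=0.8, lam=2.0, max_iter=50):
    """Sketch of a bracketing line search (assumed scheme, for illustration)."""
    lower, upper = 0.0, math.inf
    for _ in range(max_iter):
        if g(alpha) > g(0) + alpha * beta_1 * dg(0):
            # First Wolfe condition violated: the step is too long.
            upper = alpha
            alpha = (lower + upper) / 2
        elif dg(alpha) / dg(0) > beta_2:
            # Second Wolfe condition violated: the step is too short.
            lower = alpha
            alpha = (lower + upper) / 2 if math.isfinite(upper) else lam * alpha
        else:
            return alpha  # both conditions hold
    raise RuntimeError("no acceptable step found within max_iter iterations")


print(line_search(3.0))
```

Starting from $\alpha=3$, this sketch tries $3$, then $1.5$ (still too long), then $0.75$, which satisfies both conditions.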

In the figure below, the value of $g(\alpha)$ is plotted in blue, the
line
$$
g(0) + \alpha \beta_1 g'(0)
$$
characterizing the first Wolfe condition is plotted in red, and the
ratio
$$
\frac{g'(\alpha)}{g'(0)}
$$
characterizing the second Wolfe condition is plotted in green, together
with the threshold value $\beta_2=0.8$. The dotted vertical line
corresponds to $\alpha=3$.
Let's draw the function.

Define the functions

In [None]:
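def function_g(x):
    """Value of g at the step x."""
    return -1.833 * x**3 + 8.5 * x**2 - 10 * x + 5


def wolfe_1(x):
    """Right-hand side of the first Wolfe condition: g(0) + x * beta_1 * g'(0)."""
    return 5 - 0.15 * 10 * x


def ratio_wolfe_2(x):
    """Ratio g'(x) / g'(0), to be compared with beta_2 in the second Wolfe condition."""
    return (-1.833 * 3 * x**2 + 2 * 8.5 * x - 10) / -10
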

Prepare the figure.

In [None]:
x = np.linspace(0, 4, 400)

plt.figure(figsize=(8, 6))
plt.plot(
    x,
    function_g(x),
    color='blue',
    linewidth=2,
    label=r'$-1.833x^3 + 8.5x^2 - 10x + 5$',
)
plt.plot(x, wolfe_1(x), color='red', linewidth=2, label=r'$\beta_1=0.15$')
plt.plot(x, ratio_wolfe_2(x), color='green', linewidth=2, label=r'$\beta_2=0.8$')

plt.axvline(x=3, color='black', linestyle=':', linewidth=1)
plt.axvline(x=0.1225, color='blue', linestyle='--', linewidth=1)
plt.axvline(x=1.45912, color='blue', linestyle='--', linewidth=1)

plt.axhline(y=0.8, color='green', linestyle='-', linewidth=1)

plt.xlabel(r'$\alpha$', fontsize=14)
plt.ylabel(r'$g(\alpha)$', fontsize=14)

plt.xlim(0, 4)
plt.ylim(-1, 5)

plt.text(
    1.2, wolfe_1(1.2), r'$\beta_1=0.15$', color='red', fontsize=12, rotation=0
)
plt.text(2.5, 0.9, r'$\beta_2=0.8$', color='green', fontsize=12, rotation=0)

plt.grid(False)
plt.gca().spines['right'].set_color('none')
plt.gca().spines['top'].set_color('none')
plt.gca().xaxis.set_ticks_position('bottom')
plt.gca().yaxis.set_ticks_position('left')
plt.gca().spines['bottom'].set_position('zero')
plt.gca().spines['left'].set_position('zero')

plt.show()


# Question 2
Propose a value of $\alpha$ that verifies both Wolfe conditions.

All values of $\alpha$ such that
$$
0.12251 \leq \alpha \leq 1.45912
$$
satisfy both conditions. The two thresholds are identified by the two
dashed blue vertical lines in the figure above.
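
The two thresholds can be recovered numerically. The sketch below assumes that `numpy.roots` is applied to the quadratics obtained by turning each Wolfe condition into an equality; the variable names are introduced for illustration only. For $\alpha > 0$, the first condition reduces to $-1.833\alpha^2 + 8.5\alpha - 10(1-\beta_1) \leq 0$, and the second one to $-5.499\alpha^2 + 17\alpha - 10(1-\beta_2) \geq 0$.

```python
import numpy as np

beta_1, beta_2 = 0.15, 0.8

# First Wolfe condition as an equality, after dividing by alpha > 0.
roots_1 = np.sort(np.roots([-1.833, 8.5, -10 * (1 - beta_1)]).real)
# Second Wolfe condition as an equality: g'(alpha) = beta_2 * g'(0).
roots_2 = np.sort(np.roots([-3 * 1.833, 2 * 8.5, -10 * (1 - beta_2)]).real)

# The smaller root of each quadratic delimits the interval of acceptable steps.
upper = roots_1[0]  # steps just above this value violate the first condition
lower = roots_2[0]  # steps below this value violate the second condition

print(f"lower threshold: {lower:.5f}")
print(f"upper threshold: {upper:.5f}")
```

The larger roots (about $2.969$ for the second condition and $3.178$ for the first) do not create another acceptable interval: beyond $3.178$ the first condition holds again, but the second one is already violated there.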