In [1]:
import numpy as np

The algorithm tests the current trial step. If it is too long (that is, if Wolfe 1 is violated), the step is shortened. If it is too short (that is, if Wolfe 2 is violated), the step is increased. A lower and an upper bound are maintained at each iteration to control the process.
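
For reference, the two conditions tested on a trial step $\alpha$ along a descent direction $d$ can be written as follows, using the standard form of the Wolfe conditions with parameters $0 < \beta_1 < \beta_2 < 1$:
\\[ f(x + \alpha d) \leq f(x) + \alpha \beta_1 \nabla f(x)^T d \qquad \text{(Wolfe 1)} \\]
\\[ \nabla f(x + \alpha d)^T d \geq \beta_2 \nabla f(x)^T d \qquad \text{(Wolfe 2)} \\]
Since $\nabla f(x)^T d < 0$ for a descent direction, Wolfe 1 requires a sufficient decrease of the objective and fails for steps that are too long, while Wolfe 2 requires the directional derivative to have increased sufficiently and fails for steps that are too short.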

In [2]:
def lineSearch(obj, x, d, alpha0, beta1, beta2, lbd):
    """
    :param obj: function returning the value of the objective function and its gradient.
    :type obj: f, g = fct(x)
    
    :param x: point where the line search starts.
    :type x: numpy array
    
    :param d: direction along which the line search is performed.
    :type d: numpy array (same dimension as x)
    
    :param alpha0: first trial for the step
    :type alpha0: float. Must be positive. 
    
    :param beta1: parameter for the first Wolfe condition. 
    :type beta1: float. Must be strictly between 0 and 1.
    
    :param beta2: parameter for the second Wolfe condition. 
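    :type beta2: float. Must be strictly between 0 and 1, and beta2 > beta1.
    
    :param lbd: expansion factor applied to the step while no upper bound is known.
    :type lbd: float. Must be > 1.
    
    :return: the step verifying both Wolfe conditions, and the list of iterations,
             each stored as [alpha, alphal, alphar, reason].
    :rtype: float, list
    """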
    if lbd <= 1:
        raise Exception(f'lambda is {lbd} and must be > 1')
    if alpha0 <= 0:
        raise Exception(f'alpha0 is {alpha0} and must be > 0')
    if beta1 <= 0 or beta1 >= 1:
        raise Exception(f'beta1 = {beta1} must be strictly between 0 and 1')
    if beta2 >= 1:
        raise Exception(f'beta2 = {beta2} must be strictly less than 1')
    if beta1 >= beta2:
        raise Exception(f'Incompatible Wolfe cond. parameters: beta1={beta1} is greater than or equal to beta2={beta2}')
        
    f, g = obj(x)
    deriv = np.inner(g,d)
    if deriv >= 0:
        raise Exception(f'd is not a descent direction: {deriv} >= 0')
    alpha = alpha0
    # The lower bound alphal is initialized to 0.
    alphal = 0
    # The upper bound alphar is initialized to "infinity", that is, the largest floating point number
    # representable in the machine. 
    alphar = np.finfo(np.float64).max
    finished = False
    iters = list()
    while not finished:
        xnew = x + alpha * d
        fnew, gnew = obj(xnew)
        # First Wolfe condition
        if fnew > f + alpha * beta1 * deriv:
            reason = "too long"
            alphar = alpha
            alpha = (alphal + alphar) / 2.0
        # Second Wolfe condition
        elif np.inner(gnew, d) < beta2 * deriv:
            reason = "too short"
            alphal = alpha 
            if alphar == np.finfo(np.float64).max:
                alpha = lbd * alpha 
            else:
                alpha = (alphal + alphar) / 2.0
        else:
            reason = "ok"
            finished = True
        # Record the (possibly updated) step, the bounds, and the diagnosis of the step just tested.
        iters.append([alpha, alphal, alphar, reason])
    return alpha, iters

Consider the function \\[ f(x) = \frac{1}{2} x_1^2 + \frac{9}{2} x_2^2  \\]
for which the gradient is \\[ \nabla f(x) = \left( \begin{array}{c} x_1 \\ 9x_2 \end{array} \right).\\]
Consider the vector $x= \left( \begin{array}{c} 10 \\ 1 \end{array} \right)$ and the descent direction $d= \left( \begin{array}{c} -2/\sqrt{5} \\ 1/\sqrt{5} \end{array} \right).$

Define $\alpha_0=10^{-3}$, $\beta_1=0.3$, $\beta_2=0.7$ and $\lambda=20$.

We apply the line search algorithm in order to find a step $\alpha^*$ such that the Wolfe conditions are satisfied.
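
As a side calculation (not part of the algorithm itself), the objective restricted to the line can be written in closed form for this quadratic, which helps interpret the steps reported below. Here $H$ denotes the constant Hessian $\operatorname{diag}(1, 9)$ of $f$:
\\[ f(x + \alpha d) = f(x) + \alpha \nabla f(x)^T d + \frac{1}{2} \alpha^2 d^T H d = \frac{109}{2} - \frac{11}{\sqrt{5}} \alpha + \frac{13}{10} \alpha^2. \\]
The directional derivative at $\alpha = 0$ is $\nabla f(x)^T d = -11/\sqrt{5} \approx -4.92 < 0$, which confirms that $d$ is a descent direction. The curvature term comes from $d^T H d = \frac{4}{5} + \frac{9}{5} = \frac{13}{5}$, so the restricted function is minimized at $\alpha = \frac{11}{\sqrt{5}} \cdot \frac{5}{13} \approx 1.89$.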

In [3]:
# Note that, in Python, the numbering of arrays starts at 0.
def func_grad(x):
    f = 0.5 * x[0] * x[0] + 4.5 * x[1] * x[1]
    g = np.array([x[0], 9 * x[1]])
    # The Hessian [[1, 0], [0, 9]] is constant; the line search does not need it.
    return f, g

In [4]:
x = np.array([10, 1])
d = np.array([-2 / np.sqrt(5), 1 / np.sqrt(5)])
alpha0 = 1.0e-3
beta1 = 0.3
beta2 = 0.7
lbd = 20
alpha, iters = lineSearch(func_grad, x, d, alpha0, beta1, beta2, lbd)
print(f'The step is {alpha} and the number of iterations of the algorithm is {len(iters)-1}')

The step is 2.3000000000000003 and the number of iterations of the algorithm is 5


In [5]:
print("alpha\t\talpha_l\t\talpha_u\t\tReason")
for it in iters:
    print("{:+E}\t{:+E}\t{:+13E}\t{:}".format(*it))

alpha		alpha_l		alpha_u		Reason
+2.000000E-02	+1.000000E-03	+1.797693E+308	too short
+4.000000E-01	+2.000000E-02	+1.797693E+308	too short
+8.000000E+00	+4.000000E-01	+1.797693E+308	too short
+4.200000E+00	+4.000000E-01	+8.000000E+00	too long
+2.300000E+00	+4.000000E-01	+4.200000E+00	too long
+2.300000E+00	+4.000000E-01	+4.200000E+00	ok
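
The diagnoses in the table can be checked independently by evaluating the two Wolfe conditions at a given step. The following is a minimal sketch: `quad_obj` and `wolfe_check` are hypothetical helper names introduced here for illustration, and `quad_obj` simply restates the example objective so that the block runs on its own.

```python
import numpy as np


def quad_obj(x):
    """Hypothetical stand-in for the example objective: returns value and gradient."""
    f = 0.5 * x[0] ** 2 + 4.5 * x[1] ** 2
    g = np.array([x[0], 9.0 * x[1]])
    return f, g


def wolfe_check(obj, x, d, alpha, beta1, beta2):
    """Illustrative helper: return (wolfe1, wolfe2) as booleans for the step alpha."""
    f, g = obj(x)
    deriv = np.inner(g, d)
    fnew, gnew = obj(x + alpha * d)
    # Wolfe 1: sufficient decrease of the objective (violated when the step is too long).
    wolfe1 = bool(fnew <= f + alpha * beta1 * deriv)
    # Wolfe 2: sufficient increase of the directional derivative (violated when too short).
    wolfe2 = bool(np.inner(gnew, d) >= beta2 * deriv)
    return wolfe1, wolfe2


x = np.array([10.0, 1.0])
d = np.array([-2.0, 1.0]) / np.sqrt(5)
for step in (1.0e-3, 8.0, 2.3):
    print(step, wolfe_check(quad_obj, x, d, step, 0.3, 0.7))
```

With $\beta_1=0.3$ and $\beta_2=0.7$, the initial step $10^{-3}$ violates only Wolfe 2, the step $8$ violates only Wolfe 1, and the final step $2.3$ verifies both, in agreement with the reasons listed above.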


The values of the parameters used in this example were chosen to illustrate all the cases and are not appropriate in practice. We next use more appropriate values: $\alpha_0=1$, $\beta_1=10^{-4}$, $\beta_2=0.99$ and $\lambda=2$.

In [6]:
x = np.array([10, 1])
d = np.array([-2 / np.sqrt(5), 1 / np.sqrt(5)])
alpha0 = 1
beta1bis = 1.0e-4
beta2bis = 0.99
lbdbis = 2
alpha2, iters2 = lineSearch(func_grad, x, d, alpha0, beta1bis, beta2bis, lbdbis)
print(f'The step is {alpha2} and the number of iterations of the algorithm is {len(iters2)-1}')

The step is 1 and the number of iterations of the algorithm is 0


In [7]:
print("alpha\t\talpha_l\t\talpha_u\t\tReason")
for it in iters2:
    print("{:+E}\t{:+E}\t{:+13E}\t{:}".format(*it))

alpha		alpha_l		alpha_u		Reason
+1.000000E+00	+0.000000E+00	+1.797693E+308	ok
