### Parameter Update
Let $k \in \mathbb{N}^+$ denote the iteration index of a parent optimization problem,
$x_k \in \mathbb{R}^n$ its free parameters,
$r_k \in \mathbb{R}^n$ the search (or change) direction,
and $\alpha_k \in \{ \alpha \in \mathbb{R} \mid \alpha>0\}$ the optimal step width found by a line search algorithm.
The next parameter update $u_k \in \mathbb{R}^n$ is

$$
u_k = x_{k+1} - x_k = \alpha_k \, r_k
$$

### Line Search Problem
Let $f$ be the target function to be minimized.
The optimal $\alpha_k$ of the $k$-th iteration (of the parent optimization problem) is then determined by the following line search optimization problem

$$
\alpha_k = \arg\min_\alpha \, f(x_k + \alpha\, r_k)
$$
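To make the $\arg\min$ concrete, the following sketch approximates it by brute force over a grid of candidate step widths. This is only an illustration and not a practical solver; the function name `line_search_grid`, the target function, and the grid are assumptions that do not appear in the text above.

```python
def line_search_grid(f, x, r, alphas):
    """Brute-force approximation of argmin_alpha f(x + alpha * r) over the given candidates."""
    def phi(alpha):
        # f restricted to the line through x in direction r
        return f([xi + alpha * ri for xi, ri in zip(x, r)])
    return min(alphas, key=phi)


# Illustrative target function (assumption): f(x) = x_1^2 + x_2^2
def f(x):
    return sum(xi * xi for xi in x)


x = [1.0, 1.0]
r = [-2.0, -2.0]                             # negative gradient of f at x
alphas = [i / 100 for i in range(1, 101)]    # candidates 0.01, 0.02, ..., 1.00
alpha_best = line_search_grid(f, x, r, alphas)
```

Along this direction $f(x + \alpha r) = 2(1 - 2\alpha)^2$, so the grid minimum coincides with the exact minimizer $\alpha = 0.5$.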

All $k$-th variables (e.g. variable $x_k$, change direction $r_k$, gradient $g_k$, etc.) are constant inputs to a line search solver. In the following, the subscript $k$ is dropped because it is not relevant for the line search.

### Armijo Condition (1st Wolfe Condition)
The variables $x \in \mathbb{R}^n$,
their gradient $g \in \mathbb{R}^n$ (i.e. $g = \nabla f(x)$),
the change direction $r \in \mathbb{R}^n$,
and the scaling factor $\gamma \in \{ c \in \mathbb{R} \mid 0 < c < 1\}$ (hyperparameter)
are the input parameters of the Armijo condition, which is checked for a trial step width $\alpha_i$

$$
f(x + \alpha_i\, r) < f(x) + \gamma \, \alpha_i \, g^T \, r
$$
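The condition translates almost directly into code. The following is a minimal sketch rather than a reference implementation: the function name `armijo_holds`, the plain-list vector representation, the value of `gamma`, and the example function are all assumptions made for illustration.

```python
def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def armijo_holds(f, x, r, g, alpha, gamma):
    """Return True if f(x + alpha * r) < f(x) + gamma * alpha * g^T r."""
    x_new = [xi + alpha * ri for xi, ri in zip(x, r)]
    return f(x_new) < f(x) + gamma * alpha * dot(g, r)


# Illustrative target function (assumption): f(x) = x_1^2 + x_2^2 with gradient 2x
def f(x):
    return sum(xi * xi for xi in x)


x = [1.0, 1.0]
g = [2.0 * xi for xi in x]     # gradient of f at x
r = [-gi for gi in g]          # steepest descent direction
gamma = 1e-4                   # illustrative value, not prescribed by the text

small_step_ok = armijo_holds(f, x, r, g, alpha=0.1, gamma=gamma)
large_step_ok = armijo_holds(f, x, r, g, alpha=2.0, gamma=gamma)
```

In this example the small step satisfies the condition, while the large step overshoots the minimum and violates it.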


### Descent Direction
The term $\gamma \, \alpha \, g^T r$ must be negative, which requires $r$ to be a **descent direction**

$$
g^T r < 0
$$

If $g^T r > 0$, then the new $f(x+\alpha_i r)$ might be worse than the given $f(x)$, and the optimizer would keep iterating down this wrong path.
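A line search routine can guard against this case by checking the sign of $g^T r$ before it starts. A small sketch, assuming vectors are plain Python sequences; the helper name `is_descent_direction` is hypothetical:

```python
def is_descent_direction(g, r):
    """Return True if g^T r < 0, i.e. f decreases locally along r."""
    return sum(gi * ri for gi, ri in zip(g, r)) < 0.0


g = [2.0, -4.0]                # illustrative gradient
steepest = [-gi for gi in g]   # r = -g gives g^T r = -||g||^2
uphill = list(g)               # r = +g points uphill

steepest_ok = is_descent_direction(g, steepest)
uphill_ok = is_descent_direction(g, uphill)
```

For any nonzero gradient, the steepest descent choice $r = -g$ passes the check because $g^T r = -\|g\|^2 < 0$.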

### It's a variable tolerance
The term $\gamma \alpha g^T r$ is a kind of variable tolerance $VarTol(\alpha)$

$$
\begin{aligned}
-VarTol(\alpha) &= \gamma \, \alpha \, g^T r \\
f(x + \alpha\, r) &< f(x) - VarTol(\alpha) \\
f(x + \alpha\, r) + VarTol(\alpha) &< f(x)
\end{aligned}
$$

and the line search algorithm will stop if 

$$
f(x + \alpha\, r) + VarTol(\alpha) \geq f(x)
$$

holds, which is similar to the stopping criterion of gradient descent (see [stopping_gd](https://github.com/hcnn/stopping_gd/blob/master/README.ipynb)).

To ensure that $VarTol(\alpha) \geq 0$, positive values of $g^T r$ can be clipped by preprocessing

$$
VarTol(\alpha) = - \gamma \, \alpha \, \min(0, g^T r)
$$

To ensure $VarTol(\alpha)>0$ (i.e. to avoid $VarTol(\alpha)=0$),
use a threshold $TolDir>0$

$$
\begin{aligned}
VarTol(\alpha) &= - \gamma \, \alpha \, \min(-TolDir, g^T r) \\
&= \gamma \, \alpha \, \max(TolDir, -g^T r)
\end{aligned}
$$

I suggest using $TolDir = 10^{-8} / \gamma$, which guarantees $VarTol(\alpha) \geq 10^{-8} \, \alpha$.

### armijo_prep
The function `armijo_prep` precomputes the value

$$
{\rm ggr} = \gamma \, \max(TolDir, -g^T r)
$$

which can then be used to compute

$$
VarTol(\alpha) = \alpha \cdot {\rm ggr} 
$$

during line search.
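The implementation of `armijo_prep` is not shown here, so the following is only a sketch of what such a function could look like. The signature, the `tol_dir` default, and the sample values are assumptions; only the formula for ${\rm ggr}$ and the suggested $TolDir = 10^{-8}/\gamma$ come from the text.

```python
def armijo_prep(g, r, gamma, tol_dir=None):
    """Sketch: return ggr = gamma * max(TolDir, -g^T r).

    Hypothetical signature; the original function may differ.
    """
    if tol_dir is None:
        tol_dir = 1e-8 / gamma      # suggested value from the text
    gtr = sum(gi * ri for gi, ri in zip(g, r))
    return gamma * max(tol_dir, -gtr)


gamma = 1e-4                        # illustrative value
g = [2.0, 2.0]

# Descent direction: ggr is computed once and reused for every step width
ggr = armijo_prep(g, r=[-2.0, -2.0], gamma=gamma)
tolerances = [alpha * ggr for alpha in (1.0, 0.5, 0.25)]

# Ascent direction: g^T r > 0 is clipped, so ggr falls back to gamma * TolDir
ggr_floor = armijo_prep(g, r=[2.0, 2.0], gamma=gamma)
```

Because ${\rm ggr}$ does not depend on $\alpha$, each line search iteration needs only one multiplication to obtain $VarTol(\alpha)$.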