Here we will define $g(x) = f(x) - x$. The general approach is to solve $g(x)$ with a rootfinder: the $x$ that provides this root will be a fixed point of $f$.
FixedPointAcceleration.jl approximates the derivative from the two most recent iterates:

$g'(x_{i+1}) \approx \frac{g(x_{i+1}) - g(x_i)}{x_{i+1} - x_i}$

and then uses Newton's method to suggest a new point:

$x_{i+2} = x_{i+1} - \frac{g(x_{i+1})}{g'(x_{i+1})}$
The implementation of the Newton method in this package uses this formula to predict the fixed point given two previous iterates.[1] This method is designed for use with scalar functions.
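To make the step concrete, here is a short Python sketch of the secant-style Newton update above. It is illustrative only, not FixedPointAcceleration.jl's actual (Julia) implementation; the function name and loop structure are ours.

```python
import math

def newton_accelerate(f, x0, iters=8):
    """Iterate f, proposing each new point via a secant-Newton step on g(x) = f(x) - x."""
    x_prev = x0
    x_curr = f(x0)
    for _ in range(iters):
        if x_curr == x_prev:      # already converged; avoid a zero denominator
            break
        g_prev = f(x_prev) - x_prev
        g_curr = f(x_curr) - x_curr
        slope = (g_curr - g_prev) / (x_curr - x_prev)   # finite-difference g'(x_{i+1})
        x_curr, x_prev = x_curr - g_curr / slope, x_curr  # Newton step
    return x_curr

print(newton_accelerate(math.cos, 1.0))  # ≈ 0.7390851, the fixed point of cos
```

Note that each step reuses the previous function evaluations, so only one new call to $f$ is needed per proposed point.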
Consider a sequence of scalars $\{x_i\}_{i=0}^{\infty}$ that converges linearly to its fixed point $\hat{x}$. This implies that, for large $i$:

$\frac{\hat{x} - x_{i+1}}{\hat{x} - x_i} \approx \frac{\hat{x} - x_{i+2}}{\hat{x} - x_{i+1}}$
For a concrete example, consider that every iteration halves the distance between the current iterate and the fixed point. In this case both the left hand side and the right hand side will equal one half.
The above expression can be rearranged to give a formula predicting the fixed point, which is used as the subsequent iterate:

$x_{i+3} = x_i - \frac{(x_{i+1} - x_i)^2}{x_{i+2} - 2x_{i+1} + x_i}$
The implementation of the Aitken method in FixedPointAcceleration.jl uses this formula to predict the fixed point given two previous iterates. This method is designed for use with scalar functions. If it is used with higher dimensional functions that take and return vectors then it will be used elementwise.
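The rearranged formula can be sketched in Python as below. This is an illustrative snippet, not the package's code; the function name is ours.

```python
import math

def aitken_next(x0, x1, x2):
    """Predict the fixed point from three consecutive iterates (Aitken extrapolation)."""
    denom = x2 - 2 * x1 + x0
    if denom == 0:            # sequence already (numerically) converged
        return x2
    return x0 - (x1 - x0) ** 2 / denom

# Two iterations of cos starting from a guess of 1.0:
x0 = 1.0
x1 = math.cos(x0)
x2 = math.cos(x1)
print(aitken_next(x0, x1, x2))  # ≈ 0.7280, much closer to 0.7390851 than x2
```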
The epsilon algorithms introduced by Wynn (1962) provide an alternate method to extrapolate to a fixed point. This documentation will present a brief numerical example and refer readers to Wynn (1962) or Smith, Ford & Sidi (1987) for a mathematical explanation of why it works.
The basic epsilon algorithm starts with a column of iterates (column B in the below figure). If $i$ iterates have been performed then this column will have a length of $i+1$ (the initial guess plus the results of the $i$ iterations). A series of further columns is then generated by means of the below equation:

$\epsilon_{c+1}^{r} = \epsilon_{c-1}^{r+1} + (\epsilon_{c}^{r+1} - \epsilon_{c}^{r})^{-1}$

where $c$ is a column index and $r$ is a row index. The algorithm starts with the $\epsilon_{-1}$ column being all zeros and the $\epsilon_{0}$ column being the column of iterates.
This can be seen in the below table which uses an epsilon method to find the fixed point of $\cos(x)$ with an initial guess of 1.
In this figure B1 is the initial guess of the fixed point. Then we have the iterates $B2 = \cos(B1)$, $B3 = \cos(B2)$ and so on. Moving to the next column we have $C1 = A2 + 1/(B2 - B1)$, $C2 = A3 + 1/(B3 - B2)$ and so on; each subsequent column is generated in the same way from the two columns to its left.
Note that the values in columns C and E are poor extrapolations. Only the even columns D,F provide reasonable extrapolation values. For this reason an even number of iterates (an odd number of values including the starting guess) should be used for extrapolation. FixedPointAcceleration.jl's functions will enforce this by throwing away the first iterate provided if necessary to get an even number of iterates.
In the vector case this algorithm can be visualised by considering each entry in the above table to contain a vector going into the page. In this case the complication emerges from the inverse: there is no clear interpretation of $(\epsilon_{c}^{r+1} - \epsilon_{c}^{r})^{-1}$ when the difference is a vector. The Scalar Epsilon Algorithm (SEA) takes elementwise inverses, while the Vector Epsilon Algorithm (VEA) uses the Samelson inverse of a vector, $v^{-1} = \frac{v}{v \cdot v}$.
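The scalar epsilon table can be sketched as below. This is an illustrative Python sketch of the recurrence, with our own naming; it is not the package's API, and it simply walks the table column by column and returns the value from the final (even) column.

```python
import math

def epsilon_extrapolate(iterates):
    """Run the scalar epsilon algorithm on a list of 2k+1 values (guess + 2k iterates)."""
    prev = [0.0] * (len(iterates) + 1)   # the epsilon_{-1} column of zeros
    curr = list(iterates)                # the epsilon_0 column (column B)
    while len(curr) > 1:
        # epsilon_{c+1}^r = epsilon_{c-1}^{r+1} + (epsilon_c^{r+1} - epsilon_c^r)^{-1}
        nxt = [prev[r + 1] + 1.0 / (curr[r + 1] - curr[r])
               for r in range(len(curr) - 1)]
        prev, curr = curr, nxt
    return curr[0]

xs = [1.0]
for _ in range(4):                 # an even number of iterates of cos
    xs.append(math.cos(xs[-1]))
print(epsilon_extrapolate(xs))     # noticeably closer to 0.7390851 than xs[-1]
```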
FixedPointAcceleration.jl implements two minimal polynomial algorithms, Minimal Polynomial Extrapolation (MPE) and Reduced Rank Extrapolation (RRE). This section will simply present the main equations but an interested reader is directed to Cabay & Jackson (1976) or Smith, Ford & Sidi (1987) for a detailed explanation.
To first define notation, each vector (the initial guess and subsequent iterates) is defined by $x_0, x_1, \ldots$. The first differences are denoted $u_j = x_{j+1} - x_j$ and the second differences by $v_j = u_{j+1} - u_j$. If $k$ iterates are used in the extrapolation then we form the matrices of first and second differences $U = [u_0, u_1, \ldots, u_{k-1}]$ and $V = [v_0, v_1, \ldots, v_{k-1}]$.
For the MPE method the extrapolated vector, $s$, is found by:

$s = \frac{\sum_{j=0}^{k} c_j x_j}{\sum_{j=0}^{k} c_j}$
where the coefficient vector is found by $c = -U^{+} u_k$ with $c_k$ set to one, and $U^{+}$ denotes the Moore-Penrose generalised inverse of the $U$ matrix. For the RRE method the extrapolated vector, $s$, is instead found by:

$s = x_0 - U V^{+} u_0$
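Assuming NumPy's `pinv` for the generalised inverse, the MPE update can be sketched as below. This is an illustrative Python snippet (the function name is ours, not the package's Julia API); for a linear map the minimal polynomial has degree at most the dimension, so MPE recovers the fixed point exactly.

```python
import numpy as np

def mpe(x):
    """x: list of k+2 iterate vectors x_0 .. x_{k+1}; returns the extrapolated vector s."""
    u = [x[j + 1] - x[j] for j in range(len(x) - 1)]   # first differences u_0 .. u_k
    U = np.column_stack(u[:-1])                        # U = [u_0 ... u_{k-1}]
    c = -np.linalg.pinv(U) @ u[-1]                     # c_0 .. c_{k-1}
    c = np.append(c, 1.0)                              # c_k = 1
    return sum(cj * xj for cj, xj in zip(c, x[:-1])) / c.sum()

# Linear example: f(x) = A x + b has fixed point (I - A)^{-1} b.
A = np.array([[0.5, 0.2], [0.1, 0.4]])
b = np.array([1.0, 2.0])
xs = [np.zeros(2)]
for _ in range(3):
    xs.append(A @ xs[-1] + b)
print(mpe(xs))   # ≈ [3.5714, 3.9286], the exact fixed point of this linear map
```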
Anderson (1965) acceleration is an acceleration algorithm that is well suited to functions of vectors. Similarly to the minimal polynomial algorithms it takes a weighted average of previous iterates. It differs from these algorithms (and from the VEA algorithm), however, in that the previous iterates used need not be sequential: any previous iterates can be used.
Consider that we have previously run an N-dimensional function M times. We can define a matrix $G_i = [g_{i-M}, g_{i-M+1}, \ldots, g_i]$ where $g(x_j) = f(x_j) - x_j$. Each column of this matrix can be interpreted as giving the amount of ``movement'' that occurred in a run of the function.
Now Anderson acceleration defines a weight to apply to each column of the matrix. This weight vector is M-dimensional and can be denoted $\alpha = \{\alpha_0, \alpha_1, \ldots, \alpha_M\}$. These weights are determined by means of the following optimisation:

$\min_{\alpha} \| G_i \alpha \|_2 \quad \text{s.t.} \quad \sum_{j=0}^{M} \alpha_j = 1$
Thus we choose the weights that will be predicted to create the lowest ``movement'' in an iteration.
With these weights we can then create the expression for the next iterate as:

$x_{i+1} = \sum_{j=0}^{M} \alpha_j f(x_{i-M+j})$
The actual implementation differs from the preceding description by recasting the optimisation problem as an unconstrained least squares problem (see Fang & Saad 2009 or Walker & Ni 2011) but in practical terms is identical.
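One way this unconstrained recasting works can be sketched as below: writing the last weight as one minus the sum of the others makes the constraint automatic, leaving a plain least squares problem. This is an illustrative Python sketch with hypothetical names, not the package's Julia implementation.

```python
import numpy as np

def anderson_step(f, history):
    """history: list of previous iterate vectors; returns the next iterate."""
    g = [f(x) - x for x in history]            # the residual ("movement") of each run
    G = np.column_stack(g)                     # the matrix G_i from above
    g_last = G[:, -1]
    # Substitute alpha_last = 1 - sum(gamma): minimising ||G alpha|| subject to
    # sum(alpha) == 1 becomes the unconstrained problem  min_gamma ||B gamma + g_last||.
    B = G[:, :-1] - g_last[:, None]
    gamma, *_ = np.linalg.lstsq(B, -g_last, rcond=None)
    alpha = np.append(gamma, 1.0 - gamma.sum())
    return sum(a * f(x) for a, x in zip(alpha, history))

# Hypothetical linear example: f(x) = A x + b with fixed point (I - A)^{-1} b.
A = np.array([[0.7, 0.2], [0.1, 0.6]])
b = np.array([1.0, 1.0])
f = lambda x: A @ x + b
xs = [np.zeros(2), f(np.zeros(2)), f(f(np.zeros(2)))]
print(anderson_step(f, xs))   # for a linear f with enough history this is the fixed point
```

With three history vectors in two dimensions the least squares problem is square and generically solvable exactly, which is why one Anderson step lands on the fixed point of a linear map.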
Footnotes

[1] Only two rather than three iterates are required because we can use $x_{i+1} = f(x_i)$ and $x_{i+2} = f(x_{i+1})$.