Implement the extrapolation from the past algorithm (Popov, 1980). A good and modern source is Gidel et al. (2019).
This is an algorithm for computing parameter updates, similar to extragradient: it computes the update direction based on a "lookahead step". It is computationally cheaper than extragradient and enjoys similar convergence guarantees for some classes of problems (Gidel et al., 2019).
Motivation
Whereas extragradient requires two gradient computations per parameter update, extrapolation from the past stores the gradient from the previous extrapolation step and re-uses it in the current one, so only one new gradient is computed per update. This reduced gradient cost may be helpful in some settings.
However, storing the previous gradient still incurs a storage overhead compared to plain gradient descent-ascent.
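The update rule above can be sketched as follows. This is a minimal NumPy sketch, not a proposed implementation for this repository; it assumes the min-max problem is represented by its joint vector field `v(w)` (the gradient for the min player stacked with the negated gradient for the max player), and all function and parameter names are illustrative:

```python
import numpy as np

def extrapolation_from_the_past(v, w0, lr=0.1, steps=500):
    """Popov (1980) / Gidel et al. (2019), one vector-field evaluation per step:

        w_{t+1/2} = w_t - lr * g_{t-1/2}   # lookahead using the STORED gradient
        g_{t+1/2} = v(w_{t+1/2})           # the only new evaluation this step
        w_{t+1}   = w_t - lr * g_{t+1/2}
    """
    w = np.asarray(w0, dtype=float)
    g_prev = v(w)  # initialize the stored gradient
    for _ in range(steps):
        w_lookahead = w - lr * g_prev      # extrapolation step re-using g_prev
        g_prev = v(w_lookahead)            # stored for the next extrapolation
        w = w - lr * g_prev                # actual parameter update
    return w

# Illustrative usage: the bilinear game min_x max_y x*y, whose vector field is
# v(x, y) = (y, -x). Plain gradient descent-ascent diverges here, while
# extrapolation from the past converges to the saddle point (0, 0).
w_final = extrapolation_from_the_past(
    lambda w: np.array([w[1], -w[0]]), w0=[1.0, 1.0], lr=0.1, steps=500
)
```

Note that, compared with extragradient, the lookahead point is computed from the stored `g_prev` rather than from a fresh evaluation at `w`, which is exactly where the per-step saving comes from.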
References
G. Gidel, H. Berard, G. Vignoud, P. Vincent, S. Lacoste-Julien. A Variational Inequality Perspective on Generative Adversarial Networks. In ICLR, 2019.
L. D. Popov. A Modification of the Arrow-Hurwicz Method for Search of Saddle Points. Mathematical Notes of the Academy of Sciences of the USSR, 1980.