damaru2/convergence_analysis_hypergradient_descent

This is a Master's thesis submitted for the course in Mathematics and Foundations of Computer Science at the University of Oxford, academic year 2016/2017.

Abstract

This dissertation studies the convergence of an adaptive gradient descent method called Hypergradient Descent. We review several gradient descent methods and their proofs of convergence for smooth and strongly convex functions. We show that the proof of convergence of Adam, a popular modern gradient descent method, is incomplete. We derive a multiplicative rule for Hypergradient Descent and justify some of the choices made, among other things by proposing a gradient descent method and proving its convergence. The core of the dissertation is the convergence analysis of an instance of Hypergradient Descent. We prove convergence for quadratic functions and for a simple family of unidimensional functions. We also show that the method can diverge if certain properties are not assumed. We have implemented the algorithms reviewed in this dissertation, along with several Hypergradient Descent variants, and applied them to two large-scale optimization problems. We compare the runs and conclude that Hypergradient Descent is a good method in practice, especially because it does not require tuning of the learning rate.
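For readers unfamiliar with the method: Hypergradient Descent adapts the learning rate online by taking a gradient step on the learning rate itself, using the fact that the derivative of the objective with respect to the previous step size involves the dot product of consecutive gradients. The sketch below is an illustrative example only, not the code in this repository; it uses the basic additive hypergradient rule, whereas the dissertation derives a multiplicative rule for the step-size update. The function names, the quadratic test problem, and the hyperparameter values are assumptions made for illustration.

```python
import numpy as np

def hypergradient_descent(grad, x0, alpha0=1e-3, beta=1e-4, iters=100):
    """Illustrative sketch of additive Hypergradient Descent.

    The step size alpha is itself updated by a gradient step on the
    objective with respect to alpha; for plain gradient descent this
    hypergradient is -grad(x_t) . grad(x_{t-1}).
    """
    x = np.asarray(x0, dtype=float)
    alpha = alpha0
    prev_grad = np.zeros_like(x)   # no hypergradient signal on the first step
    for _ in range(iters):
        g = grad(x)
        # Additive update of the learning rate using the hypergradient.
        alpha = alpha + beta * np.dot(g, prev_grad)
        # Usual gradient descent step with the adapted learning rate.
        x = x - alpha * g
        prev_grad = g
    return x, alpha

# Example (assumed for illustration): quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0])
grad_f = lambda x: A @ x
x_final, alpha_final = hypergradient_descent(grad_f, x0=[5.0, 5.0])
print(x_final, alpha_final)
```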

