___ ___ ___ ___ ___ ___
/__/\ / /\ / /\ / /\ ___ / /\ /__/\
\ \:\ / /:/_ / /:/_ / /:/_ / /\ / /::\ \ \:\
\__\:\ / /:/ /\ / /:/ /\ / /:/ /\ / /:/ / /:/\:\ \ \:\
___ / /::\ / /:/ /:/_ / /:/ /::\ / /:/ /::\ /__/::\ / /:/~/::\ _____\__\:\
/__/\ /:/\:\/__/:/ /:/ /\/__/:/ /:/\:\/__/:/ /:/\:\\__\/\:\__ /__/:/ /:/\:\/__/::::::::\
\ \:\/:/__\/\ \:\/:/ /:/\ \:\/:/~/:/\ \:\/:/~/:/ \ \:\/\\ \:\/:/__\/\ \:\~~\~~\/
\ \::/ \ \::/ /:/ \ \::/ /:/ \ \::/ /:/ \__\::/ \ \::/ \ \:\ ~~~
\ \:\ \ \:\/:/ \__\/ /:/ \__\/ /:/ /__/:/ \ \:\ \ \:\
\ \:\ \ \::/ /__/:/ /__/:/ \__\/ \ \:\ \ \:\
\__\/ \__\/ \__\/ \__\/ \__\/ \__\/
___ ___ ___ ___
/ /\ / /\ / /\ /__/\
/ /:/_ / /::\ / /::\ \ \:\
___ ___ / /:/ /\ / /:/\:\ / /:/\:\ \ \:\
/__/\ / /\ / /:/ /:/_ / /:/~/::\ / /:/~/:/ _____\__\:\
\ \:\ / /://__/:/ /:/ /\/__/:/ /:/\:\/__/:/ /:/___/__/::::::::\
\ \:\ /:/ \ \:\/:/ /:/\ \:\/:/__\/\ \:\/:::::/\ \:\~~\~~\/
\ \:\/:/ \ \::/ /:/ \ \::/ \ \::/~~~~ \ \:\ ~~~
\ \::/ \ \:\/:/ \ \:\ \ \:\ \ \:\
\__\/ \ \::/ \ \:\ \ \:\ \ \:\
\__\/ \__\/ \__\/ \__\/
- Examples of CIFAR10 and CIFAR100 classification from a pre-trained ImageNet ResNet50 model are in transfer_learning/
- The pre-trained model serves as a well-conditioned initial guess for transfer learning. In this setting Newton methods perform well due to their excellent local convergence properties. Low Rank Saddle Free Newton (LRSFN) is able to zero in on highly generalizable local minimizers, bypassing indefinite regions. Below are validation accuracies for the best choices of fixed step length for Adam, SGD, and LRSFN with a fixed rank of 40. Two illustrative sketches of this setup and of the LRSFN step follow.
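A minimal sketch of what such a transfer-learning setup can look like, written with the standard tf.keras ResNet50 and CIFAR10 loaders rather than the driver scripts in transfer_learning/. The optimizer choice below (plain SGD with a fixed step length) is only a stand-in for the Adam/SGD/LRSFN comparison described above; hyperparameters are illustrative assumptions, not the values used in the examples.

```python
import tensorflow as tf

# CIFAR10: 50k training / 10k test images, 32x32x3, integer labels 0-9
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# ImageNet-pretrained ResNet50 as the well-conditioned initial guess;
# include_top=False drops the 1000-way ImageNet classifier head.
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(32, 32, 3), pooling="avg",
)
base.trainable = True  # fine-tune all weights, not just the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation="softmax"),  # CIFAR10 head
])

# Fixed-step-length first-order baseline (SGD or Adam); in the library's
# own examples LRSFN would take the place of this optimizer.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.fit(x_train, y_train, batch_size=128, epochs=5,
          validation_data=(x_test, y_test))
```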
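And a self-contained sketch of the saddle-free Newton step itself, assuming the leading Hessian eigenpairs have already been computed (in practice they come from matrix-free Hessian-vector products); the function and variable names, the damping parameter, and the toy problem are illustrative assumptions, not the library's API.

```python
import numpy as np

def lrsfn_step(grad, eigvals, eigvecs, gamma=1e-3):
    """One Low Rank Saddle Free Newton step (illustrative sketch).

    grad    : gradient vector, shape (n,)
    eigvals : leading Hessian eigenvalues, shape (r,)
    eigvecs : corresponding orthonormal eigenvectors, shape (n, r)
    gamma   : Levenberg-Marquardt style damping (assumed here)

    The Hessian is approximated by V diag(lam) V^T; the saddle-free
    modification replaces lam by |lam|, so negative-curvature directions
    are descended instead of ascended.
    """
    abs_lam = np.abs(eigvals)
    coeffs = eigvecs.T @ grad                       # gradient in the low-rank subspace
    # (V |Lam| V^T + gamma I)^{-1} g, split into the subspace ...
    in_span = eigvecs @ (coeffs / (abs_lam + gamma))
    # ... and its orthogonal complement, where only the damping acts
    out_of_span = (grad - eigvecs @ coeffs) / gamma
    return -(in_span + out_of_span)

# Toy usage on an indefinite quadratic f(w) = 0.5 w^T A w (a saddle point at 0)
rng = np.random.default_rng(0)
A = np.diag([4.0, 1.0, -2.0])                       # one negative eigenvalue
w = rng.standard_normal(3)
grad = A @ w
lam, V = np.linalg.eigh(A)
idx = np.argsort(-np.abs(lam))[:2]                  # keep the 2 largest |eigenvalues|
step = lrsfn_step(grad, lam[idx], V[:, idx])
w = w + step                                        # decreases f even along negative curvature
```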
- For more information see the following manuscript:
- [2] O'Leary-Roseberry, T., Alger, N., Ghattas, O.,
Low Rank Saddle Free Newton: A Scalable Method for Stochastic Nonconvex Optimization.
arXiv:2002.02881.
([Download](https://arxiv.org/abs/2002.02881))
BibTeX
@article{OLearyRoseberryAlgerGhattas2020,
  title={Low Rank Saddle Free Newton: Algorithm and Analysis},
  author={O'Leary-Roseberry, Thomas and Alger, Nick and Ghattas, Omar},
  journal={arXiv preprint arXiv:2002.02881},
  year={2020}
}