hessian free optimizer needed #2682

Closed
rajarsheem opened this issue Jun 6, 2016 · 14 comments
Labels
stat:contribution welcome Status - Contributions welcome type:feature Feature requests

Comments

@rajarsheem

Hessian-free optimizers have been successfully applied to neural networks (especially RNNs). TensorFlow currently needs one!

@girving
Contributor

girving commented Jun 7, 2016

PRs welcome!

@girving girving added stat:contribution welcome Status - Contributions welcome triaged labels Jun 7, 2016
@ajaybhat

I would like to work on this.

@girving
Contributor

girving commented Jun 10, 2016

@ajaybhat Let us know if you have questions or issues!

@ajaybhat

@girving Could you let me know which classes to look at as a reference for implementing optimizers? I'm having a bit of trouble finding them.

@girving
Contributor

girving commented Jun 13, 2016

@ajaybhat Search for classes that inherit from Optimizer. However, note that for Hessian-free methods the standard split into compute_gradients and apply_gradients won't work as-is, since you need to compute partial second-order gradients. There are a few different ways one could handle that; the simplest would be to compute gradients as normal during compute_gradients and do the higher-order work in apply_gradients, as sketched below.
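A minimal sketch of that split, assuming the TF 1.x tf.train.Optimizer API. The class name HessianFreeOptimizer and the _second_order_direction helper are hypothetical placeholders, not part of TensorFlow; sparse (IndexedSlices) gradients and resource variables are not handled here.

```python
import tensorflow as tf

class HessianFreeOptimizer(tf.train.Optimizer):
    """Skeleton only: the second-order logic is left as a placeholder."""

    def __init__(self, learning_rate=0.1, use_locking=False, name="HessianFree"):
        super(HessianFreeOptimizer, self).__init__(use_locking, name)
        self._lr = learning_rate

    # compute_gradients is inherited unchanged: it returns ordinary
    # first-order (gradient, variable) pairs via backprop.

    def apply_gradients(self, grads_and_vars, global_step=None, name=None):
        # Hook the higher-order work in here, as suggested above: e.g.
        # run conjugate gradient on Hessian-vector products to turn each
        # first-order gradient into a second-order update direction.
        updated = [(self._second_order_direction(g, v), v)
                   for g, v in grads_and_vars if g is not None]
        return super(HessianFreeOptimizer, self).apply_gradients(
            updated, global_step=global_step, name=name)

    def _second_order_direction(self, grad, var):
        # Placeholder: a real HF optimizer would return the CG solution
        # of (H + damping * I) p = -g; here the gradient passes through.
        return grad

    def _apply_dense(self, grad, var):
        # Plain step along the (already transformed) update direction.
        return tf.assign_sub(var, self._lr * grad)
```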

@ajaybhat

Thanks! I'll let you know if I have any more questions.

@Fhrozen

Fhrozen commented Jun 14, 2016

I also tried to implement Hessian-free (HF) optimization in another framework, but did not succeed. You might be able to use this repo (https://github.com/drasmuss/hessianfree) as a reference or ask its author for help. The implementation is already in Python and CUDA, but it does not cover convolutional layers. If you have code to test, let me know so I can compare it with my other results.

@aselle aselle removed the triaged label Jul 28, 2016
@aselle aselle added type:feature Feature requests and removed enhancement labels Feb 9, 2017
@wangzt2012

Is anyone still trying to implement HF optimization in the TensorFlow framework?

@WihanB

WihanB commented Apr 20, 2017

I was attempting to implement HF optimisation and Saddle Free Newton. These algorithms are doable in an FFN framework. Unfortunately, for RNNs, as discussed in #5985, it is currently not possible to calculate second-order derivatives through DynamicRNNs due to their use of a while loop. The current workaround is to use StaticRNN. This while-loop second-derivative limitation remains the major obstacle to implementing a general Hessian-free optimisation algorithm, or any other general second-order method that requires Hessian-vector products.
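For reference, a sketch of the standard double-backprop construction of the Hessian-vector product mentioned above, assuming the TF 1.x graph API; `loss`, `params`, and `vec` are placeholder names. This is exactly the computation that fails through tf.while_loop, because it differentiates a gradient graph a second time.

```python
import tensorflow as tf

def hessian_vector_product(loss, params, vec):
    """Compute H*v via double backprop: H*v = d(g . v)/dw, where g = dL/dw."""
    # First backward pass: gradients of the loss w.r.t. the parameters.
    grads = tf.gradients(loss, params)
    # Inner product g . v, with v held constant so it is not differentiated.
    gv = tf.add_n([tf.reduce_sum(g * tf.stop_gradient(v))
                   for g, v in zip(grads, vec)])
    # Second backward pass: differentiating g . v w.r.t. w yields H*v.
    return tf.gradients(gv, params)
```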

@itsmeolivia
Contributor

Automatically closing due to lack of recent activity. Since this issue is old at this point, please reopen it if the request still applies with the latest version of TensorFlow. Thank you.

@alberduris

It's been more than a year since the issue was closed, TF 1.3 was recently released, and I still think a Hessian-free optimizer implementation for TF would be great.

Consider reopening?

@lixilinx

lixilinx commented Apr 11, 2018

Pardon me for promoting my own second-order optimization methods here. If you are interested, please check my TensorFlow package at https://github.com/lixilinx/psgd_tf
It provides second-order optimization with five different preconditioners, along with RNN/CNN examples, and it works for both FNNs and RNNs with while loops. tf.while_loop still does not support second-order derivatives; to work around this, you can use a perturbation of the gradient to approximate the Hessian-vector product you need, as in the sketch below.
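A minimal sketch of that finite-difference workaround, written in plain NumPy for clarity; `grad_fn` is an assumed callable returning the gradient at a given parameter vector, not part of the linked package.

```python
import numpy as np

def approx_hessian_vector_product(grad_fn, w, v, eps=1e-4):
    """Approximate H @ v by a finite difference of two gradient evaluations:
    H @ v ~= (g(w + eps*v) - g(w)) / eps."""
    g0 = grad_fn(w)             # gradient at the current parameters
    g1 = grad_fn(w + eps * v)   # gradient at the perturbed parameters
    return (g1 - g0) / eps      # secant approximation of the HVP
```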

As for HF optimization, its damping factor and line-search step size are obtained by trial and error, which could cause further trouble for a TensorFlow implementation. A second-order method without line search is preferable, and the methods in the link above are such examples.

@dave-fernandes

Here is an implementation of the Saddle Free method:
https://github.com/dave-fernandes/SaddleFreeOptimizer

@JaeDukSeo

amazing
