A Hessian Free Neural Networks Training Algorithm with Curvature Scaled Adaptive Momentum

This repository provides HF-CSAM, a Hessian-free neural network training algorithm with curvature-scaled adaptive momentum.

Installation

Install via

git clone https://github.com/flo3003/HF-CSAM.git
cd HF-CSAM/hfcsam
python setup.py install

hfcsam requires a TensorFlow and Keras installation (the current code has been tested with releases 1.6-1.8), but this is not enforced in setup.py, so that either the CPU or the GPU version of TensorFlow can be used.
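For example, TensorFlow and Keras can be installed separately via pip; the version pins below are only an illustration within the tested 1.6-1.8 range:

pip install tensorflow==1.8.0   # or tensorflow-gpu==1.8.0 for the GPU version
pip install keras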

Usage

The hfcsam module contains the class HFCSAM, which inherits from the Keras optimizer base class and can be used as a direct drop-in replacement for Keras's built-in optimizers.

import tensorflow as tf
from hfcsam import HFCSAM

loss = ...  # any scalar TensorFlow loss tensor
opt = HFCSAM(dP=0.07, xi=0.99)
step = opt.minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run([loss, step])

HF-CSAM has two hyper-parameters: dP and xi. The dP parameter controls the step size and can vary depending on the problem; for the MNIST and CIFAR datasets, values in the range 0.05 < dP < 0.5 work well. The xi parameter should lie in 0.5 < xi < 0.99 (the default value xi=0.99 should work for most problems).
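Since HFCSAM follows the Keras optimizer interface, it can also be passed directly to model.compile. Below is a minimal sketch assuming a standard Keras classification model; the model itself is illustrative and not taken from this repository:

from keras.models import Sequential
from keras.layers import Dense
from hfcsam import HFCSAM

# Any Keras model works the same way; this two-layer classifier is just an example.
model = Sequential([Dense(128, activation='relu', input_shape=(784,)),
                    Dense(10, activation='softmax')])

# HFCSAM replaces a built-in optimizer such as SGD or Adam.
model.compile(loss='categorical_crossentropy',
              optimizer=HFCSAM(dP=0.07, xi=0.99),
              metrics=['accuracy'])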

Short Description of HF-CSAM

We give a short description of the algorithm, ignoring various details. Please refer to the [paper][1] for a complete description.

The algorithm's weight update rule is similar to SGD with momentum but with two main differences arising from the formulation of the training task as a constrained optimization problem: (i) the momentum term is scaled with curvature information (in the form of the Hessian); (ii) the coefficients for the learning rate and the scaled momentum term are adaptively determined.

The objective is to reach a minimum of the cost function $L_t$ with respect to the synaptic weights, and simultaneously to maximize incrementally at each epoch the following quantity:

$$\Phi_t = \Delta w_t^\top H_t \,\Delta w_{t-1}$$

where $\Delta w_t$ are the weight updates at the current time step, $\Delta w_{t-1}$ are the weight updates at the previous time step and $H_t$ is the Hessian of the cost function $L_t$.

At each epoch $t$ of the learning process, the weight vector will be incremented by $\Delta w_t$, so that:

$$\|\Delta w_t\|^2 = (\delta P)^2$$

And the objective function must be decremented by a quantity $\delta Q$, so that:

$$\Delta L_t = \nabla L_t^\top \Delta w_t = -\,\delta Q$$

The learning rule can be derived by solving the following constrained optimization problem:

Maximize $\Phi_t = \Delta w_t^\top H_t \,\Delta w_{t-1}$

subject to the constraints

$$\|\Delta w_t\|^2 = (\delta P)^2$$

and

$$\nabla L_t^\top \Delta w_t = -\,\delta Q$$
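For intuition, the update rule below follows from the Lagrangian of this problem (the multiplier sign convention here is our assumption; see the paper for the exact derivation):

$$\mathcal{L} = \Delta w_t^\top H_t \,\Delta w_{t-1} - \lambda_1\left(\nabla L_t^\top \Delta w_t + \delta Q\right) - \lambda_2\left(\|\Delta w_t\|^2 - (\delta P)^2\right)$$

Setting the derivative with respect to $\Delta w_t$ to zero gives $H_t \Delta w_{t-1} - \lambda_1 \nabla L_t - 2\lambda_2 \Delta w_t = 0$, which rearranges to the update rule below.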

Hence, by solving this constrained optimization problem analytically, we get the following update rule:

$$\Delta w_t = -\frac{\lambda_1}{2\lambda_2}\,\nabla L_t + \frac{1}{2\lambda_2}\,H_t\,\Delta w_{t-1}$$

where $\nabla L_t$ is the gradient of the network's loss/cost function $L_t$, and $\lambda_1$, $\lambda_2$ are the adaptively determined Lagrange multipliers.
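Note that the curvature enters the update only through the product $H_t \Delta w_{t-1}$, never through the full Hessian, which is what makes the algorithm Hessian-free. As a rough sketch, such a Hessian-vector product can be obtained with double backpropagation in TensorFlow 1.x (loss, weights and prev_updates are illustrative names, not identifiers from this repository's code):

import tensorflow as tf

# Gradient of the loss with respect to the weights: the "nabla L_t" term
grads = tf.gradients(loss, weights)

# Scalar inner product between the gradient and the previous update direction
grad_dot_v = tf.add_n([tf.reduce_sum(g * v)
                       for g, v in zip(grads, prev_updates)])

# Differentiating the scalar inner product again yields H_t * dw_{t-1}
# without ever materialising the full Hessian matrix.
hessian_vector_product = tf.gradients(grad_dot_v, weights)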

Feedback

If you have any questions or suggestions regarding this implementation, please open an issue in flo3003/HF-CSAM. Apart from that, we welcome any feedback regarding the performance of HF-CSAM on your training problems (mail to flwra.sakketoy@gmail.com).

Citation

If you use HF-CSAM for your research, please cite the [paper][1].

[1]: A Hessian Free Neural Networks Training Algorithm with Curvature Scaled Adaptive Momentum (under review)
