DOWNHILL

The downhill package provides algorithms for minimizing scalar loss functions that are defined using Theano.

Several optimization algorithms are included; a sketch of selecting one by name follows this list:

  • First-order stochastic gradient descent: SGD and NAG.
  • First-order stochastic techniques with adaptive learning rates: RProp, RMSProp, Equilibrated SGD, Adam, and ADADELTA.
  • Wrappers for several algorithms from scipy.optimize.minimize.
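
As a rough illustration of how one of these algorithms is selected, the sketch below assumes that downhill.minimize accepts the algorithm's short name through an algo keyword argument; the toy loss and the name strings shown here (e.g. 'adam') are placeholders, and the exact keyword and available names may differ between versions, so consult the documentation.

import numpy as np
import theano
import theano.tensor as TT
import downhill

# Toy problem: find the 3-vector w closest (on average) to the rows
# of a random data matrix.
x = TT.matrix('x')
w = theano.shared(np.zeros(3, dtype='f'), name='w')
loss = TT.sqr(x - w).mean()

data = np.random.randn(200, 3).astype('f')

downhill.minimize(
    loss=loss,
    params=[w],
    inputs=[x],
    train=[data],      # one numpy array per symbolic input
    algo='adam',       # assumed keyword for choosing the algorithm by name
    batch_size=20,
)

(The library also provides a way to build an optimizer object by name and drive its iterations yourself; see the documentation.)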

Example Code

Let's say you have 100 samples of 1000-dimensional data, and you want to represent each sample using 10 coefficients over a learned basis of 10 vectors. This is straightforward to model in Theano with a matrix multiplication: a coefficient matrix u times a basis matrix v reconstructs the data, a squared-error term measures the reconstruction quality, and regularizers encourage small coefficients. Once you have constructed an expression for the loss, you can optimize it using downhill:

import climate
import numpy as np
import theano
import theano.tensor as TT

import downhill
import my_data_set

climate.enable_default_logging()

A, B, K = 100, 1000, 10

# Symbolic input: a minibatch of A samples, each B-dimensional.
x = TT.matrix('x')

# Shared parameters: per-sample coefficients u and a basis of K vectors v.
u = theano.shared(np.random.randn(A, K).astype('f'), name='u')
v = theano.shared(np.random.randn(K, B).astype('f'), name='v')

# Element-wise squared reconstruction error of the low-rank model dot(u, v).
err = TT.sqr(x - TT.dot(u, v))

# Minimize the mean error plus sparsity penalties on u and v.
downhill.minimize(
    loss=err.mean() + abs(u).mean() + (v * v).mean(),
    params=[u, v],
    inputs=[x],
    train=my_data_set.training,
    valid=my_data_set.validation,
    batch_size=A,
    monitors=(
        ('u<0.1', 100 * (abs(u) < 0.1).mean()),
        ('v<0.1', 100 * (abs(v) < 0.1).mean()),
    ),
)
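
The my_data_set module in this example stands in for your own data loading code; its training and validation attributes are placeholders, not part of downhill. As a minimal sketch, assuming downhill accepts plain numpy arrays for the training and validation sets (it also supports other dataset formats; see the documentation), they could look like this:

import numpy as np

# Hypothetical stand-in for my_data_set: arrays whose second dimension
# matches the B = 1000 columns expected by the symbolic input x.
rng = np.random.RandomState(13)
training = rng.randn(100, 1000).astype('f')
validation = rng.randn(100, 1000).astype('f')

Roughly speaking, downhill slices these arrays into minibatches of batch_size rows and feeds each batch to x during optimization.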

More Information

Source: http://github.com/lmjohns3/downhill

Documentation: http://downhill.readthedocs.org

Mailing list: https://groups.google.com/forum/#!forum/downhill-users