FAQ

wernight edited this page Apr 23, 2012 · 4 revisions

Can I add my own questions here?

Sure. This is a wiki! If you ask the same question twice, we count that as “frequent”. So: can I add my own questions here?

BackpropTrainer

What are good values for `learningrate` and `momentum` ?

The larger the learning rate, the larger the weight changes on each epoch and the quicker the network learns. However, the learning rate also influences whether the network reaches a stable solution: if it becomes too large, the weight changes no longer approximate a gradient descent procedure (true gradient descent requires infinitesimal steps), and oscillation of the weights is often the result. Ideally, then, we would like to use the largest learning rate possible without triggering oscillation. This offers the most rapid learning and the least amount of time spent waiting at the computer for the network to train. One method that has been proposed is a slight modification of the backpropagation algorithm so that it includes a momentum term.

Applied to backpropagation, the concept of momentum is that previous changes in the weights should influence the current direction of movement in weight space. With momentum, once the weights start moving in a particular direction in weight space, they tend to continue moving in that direction. Imagine a ball rolling down a hill that gets stuck in a depression halfway down. If the ball has enough momentum, it will be able to roll through the depression and continue down the hill. Similarly, when applied to weights in a network, momentum can help the network "roll past" a local minimum, as well as speed learning (especially along long, flat error surfaces).

In BrainWave, the default learning rate is 0.25 and the default momentum parameter is 0.9. When applying backpropagation to a range of problems, you will often want to use much smaller values than these. For especially difficult tasks, a learning rate of 0.01 is not uncommon.
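The interplay between learning rate and momentum described above can be sketched in plain Python on a toy one-weight problem. This is an illustrative sketch only, not PyBrain's internal implementation; the function and parameter names are chosen here for clarity (though `learningrate` and `momentum` mirror `BackpropTrainer`'s keyword arguments):

```python
def grad(w):
    """Gradient of the toy error surface f(w) = w**2."""
    return 2.0 * w

def train(learningrate=0.25, momentum=0.9, epochs=100):
    """Gradient descent with momentum on f(w) = w**2.

    Each step blends the current gradient with the previous weight
    change, so the weights keep some 'velocity' from earlier epochs.
    """
    w = 5.0       # initial weight
    delta = 0.0   # previous weight change
    for _ in range(epochs):
        delta = -learningrate * grad(w) + momentum * delta
        w += delta
    return w
```

With a moderate learning rate and momentum, `train(0.25, 0.9)` spirals in toward the minimum at `w = 0`; with a learning rate that is too large (e.g. `train(1.1, 0.0)`), each step overshoots and the weight oscillates with growing magnitude, illustrating the instability described above.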

— See http://itee.uq.edu.au/~cogs2010/cmc/chapters/BackProp/index2.html

How to choose a Reinforcement Learner?

NFQ can be faster or slower than Q-learning depending on the problem. Because it uses a neural network, NFQ is able to generalize and can solve more complex tasks. SARSA is another good learner that plays safer than Q-learning. In the tabular setting, Q-learning and SARSA should always converge, while NFQ may diverge because of its function approximation.

See:

  • RL SARSA vs Q learning with an applet: http://www.cse.unsw.edu.au/~cs9417ml/RL1/algorithms.html
  • PyBrain RL explained: http://simontechblog.blogspot.co.uk/2010/08/pybrain-reinforcement-learning-tutorial.html
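The core difference between Q-learning (off-policy) and SARSA (on-policy) is just which next-step value each one bootstraps from. The single-update sketch below is illustrative plain Python, not PyBrain's API (PyBrain wraps these updates behind its learner and agent classes); `Q` is assumed to be a dict mapping `(state, action)` pairs to values:

```python
def q_learning_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    """Q-learning: off-policy. Bootstrap from the *greedy* action in s2,
    regardless of which action the agent actually takes next."""
    best_next = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """SARSA: on-policy. Bootstrap from the action a2 the agent
    actually took in s2, exploration included."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])
```

Because SARSA's target includes the value of exploratory actions, it accounts for the cost of its own exploration and tends to learn the "safer" policy mentioned above, while Q-learning learns the greedy policy's values directly.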