Recursive least-squares learning in Nengo #133
Great. Yup, this is as it should be, like you mentioned :) Good summary of differences/strengths/weaknesses. best, .c
…On Wed, Jan 17, 2018 at 5:43 PM, Aaron Russell Voelker < ***@***.***> wrote:
Note that, from a Nengo user's perspective, this is as simple as
substituting nengo.PES(...) with nengolib.RLS(...). They both take the
same parameters, use the same sign on the error signal, and have the same
overall effect of minimizing that error signal over time!
Wow, this is super cool! Nice work @arvoelke 👍
```python
def step_simbcm():
    # Note: dt is not used in learning rule
    rP = r.T.dot(P)
    P[...] -= P.dot(np.outer(r, rP)) / (1 + rP.dot(r))
```
Note to self: `P[...] -= np.outer(P.dot(r), rP) / (1 + rP.dot(r))` should be more efficient, since it avoids a matrix-matrix multiply.
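A quick NumPy check of this equivalence (variable names and sizes here are illustrative stand-ins for the notebook's): by associativity, multiplying `P` into the outer product `r rP^T` is the same as taking the outer product of `P.dot(r)` with `rP`, but the latter replaces an `n x n` matrix-matrix product (O(n^3)) with a matrix-vector product plus an outer product (O(n^2)).

```python
import numpy as np

# Sketch comparing the two update expressions; P and r are illustrative
# stand-ins for the inverse-correlation matrix and activity vector.
rng = np.random.RandomState(0)
n = 6
A = rng.randn(n, n)
P = A.dot(A.T)  # symmetric, like a running estimate of inv(gamma)
r = rng.randn(n)

rP = r.T.dot(P)
# Original form: n x n matrix-matrix product, O(n^3)
slow = P.dot(np.outer(r, rP)) / (1 + rP.dot(r))
# Suggested form: matrix-vector product plus outer product, O(n^2)
fast = np.outer(P.dot(r), rP) / (1 + rP.dot(r))
assert np.allclose(slow, fast)
```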
```python
    # Note: dt is not used in learning rule
    rP = r.T.dot(P)
    P[...] -= P.dot(np.outer(r, rP)) / (1 + rP.dot(r))
    delta[...] = - error * P.dot(r)
```
This should be `delta[...] = -np.outer(error, P.dot(r))` to properly handle multi-dimensional error signals.
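A small sketch of why the outer product is needed (shapes and names are illustrative): with a `d`-dimensional error signal, the decoder update must have one row per output dimension, and for `d == 1` the outer-product form reduces to the original scalar expression.

```python
import numpy as np

# Illustrative shapes: n neurons, d output dimensions.
rng = np.random.RandomState(1)
n, d = 5, 3
P = np.eye(n)
r = rng.randn(n)
error = rng.randn(d)

delta = -np.outer(error, P.dot(r))  # shape (d, n): one row per error dimension
assert delta.shape == (d, n)

# With a 1-D error this reduces to the original scalar form.
e1 = np.array([0.5])
assert np.allclose(-np.outer(e1, P.dot(r)).ravel(), -e1[0] * P.dot(r))
```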
TODO:
The documentation now includes a side-by-side comparison of PES versus RLS on a scalar spiking communication channel. You won't be able to see this until release, unless you build the docs yourself, so I've copied it below. This PR also contains a notebook example that shows how to construct both spiking FORCE and full-FORCE networks in Nengo. Again, the rendered version will be visible upon the next release (
Codecov Report
```diff
@@          Coverage Diff           @@
##           master    #133   +/-  ##
======================================
  Coverage     100%     100%
======================================
  Files          29       29
  Lines        1373     1374    +1
  Branches      157      157
======================================
+ Hits         1373     1374    +1
```
Continue to review full report at Codecov.
@psipeter @celiasmith
This shows how to implement recursive least-squares (RLS) as a learning rule in Nengo. The equations come from the Sussillo and Abbott (2009) FORCE paper. See the committed notebook for details.
It appears to work extremely well, even with spiking neurons:
This is learning a communication channel. The error signal is disabled after one period of the sine wave, and the network gives the correct answer thereafter. The gamma matrix it finds (see above) is essentially identical to the default computed offline by Nengo. This makes sense given that both are doing least-squares optimization, but it is still remarkable considering that we're using spiking neurons, providing only one oscillation, and doing this online.
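As a self-contained illustration of the rule's behavior, here is a minimal NumPy sketch of the same per-step updates used in the notebook, applied outside Nengo to a toy rate model. The names (`E`, `rates`, `c_true`) are illustrative, not part of the PR; the target is chosen to be realizable as a linear readout so that convergence can be seen directly.

```python
import numpy as np

# Hedged sketch of the per-step RLS updates on a toy rate model.
rng = np.random.RandomState(0)
n = 40
E = rng.randn(n, 1)                 # fixed random encoders (illustrative)

def rates(x):
    return np.tanh(E.dot(x))        # toy activity vector r(x)

c_true = rng.randn(n) / np.sqrt(n)  # hidden decoders defining the target

d = np.zeros(n)                     # decoders being learned
P = np.eye(n)                       # running estimate of inv(gamma)

for _ in range(300):
    x = rng.uniform(-1, 1, size=1)
    r = rates(x)
    error = d.dot(r) - c_true.dot(r)  # same error sign convention as PES
    rP = r.dot(P)
    P -= np.outer(P.dot(r), rP) / (1 + rP.dot(r))
    d -= error * P.dot(r)

# After training, the learned readout tracks the target closely.
errs = []
for _ in range(50):
    r = rates(rng.uniform(-1, 1, size=1))
    errs.append(abs(d.dot(r) - c_true.dot(r)))
mean_err = np.mean(errs)
assert mean_err < 0.1
```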
You can think of this as an alternative to using `PES`, with the following important differences:

- RLS requires `n^2` extra variables in memory (in particular, a running estimate of `inv(gamma)`, where `gamma = A.T.dot(A)` is the exact same matrix computed by Nengo's L2 solvers).

In other words, this should consistently outperform PES, but it is not biologically plausible and requires extra compute / memory. If the online aspect is not required, then just stick to Nengo's default (offline) L2-optimization. I think this will be most useful when doing FORCE-style learning.
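The claim that `P` tracks `inv(gamma)` can be checked directly: with `P` initialized to the identity, the rank-1 RLS update is exactly the Sherman-Morrison formula for `inv(I + A.T.dot(A))`, where the rows of `A` are the activity vectors seen so far. A small sketch (names and sizes illustrative):

```python
import numpy as np

# Sketch verifying P == inv(I + gamma) under the rank-1 RLS update,
# where gamma accumulates outer products of the activity vectors.
rng = np.random.RandomState(3)
n, steps = 10, 25
P = np.eye(n)
G = np.eye(n)  # I + gamma, accumulated directly for comparison

for _ in range(steps):
    r = rng.randn(n)
    rP = r.dot(P)
    P -= np.outer(P.dot(r), rP) / (1 + rP.dot(r))  # Sherman-Morrison step
    G += np.outer(r, r)

assert np.allclose(P, np.linalg.inv(G))
```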