
Fix NNLS weight solvers #1027

Merged: 5 commits merged into master from fix-weight-solvers, Jun 11, 2016
Conversation

@hunse (Collaborator) commented Apr 22, 2016

Fixes #1019, as well as some other small issues.

@hunse hunse added this to the 2.1.1 release milestone Apr 25, 2016
@hunse (Collaborator, Author) commented May 24, 2016

@tcstewar, can you review, since you filed the original #1019 bug that this addresses?

@tcstewar (Contributor) commented:
This looks good to me, and fixes the original problem. I also confirmed that using Nnls with weights=True gives the same result as weights=False (when the encoders are all positive).

However, oddly, if we use normal encoders and weights=True, the results are worse than I'd expect. For example:

import numpy as np
import pylab  # matplotlib.pylab

import nengo

model = nengo.Network()
with model:
    stim = nengo.Node(lambda t: np.sin(t*np.pi*2))
    a = nengo.Ensemble(50, 1)
    b = nengo.Ensemble(50, 1)  # , encoders=nengo.dists.Choice([[1]])

    nengo.Connection(stim, a)
    nengo.Connection(a, b, solver=nengo.solvers.Nnls(weights=True))

    p = nengo.Probe(b, synapse=0.03)
    p_stim = nengo.Probe(stim, synapse=0.03)

sim = nengo.Simulator(model)
sim.run(2)

pylab.plot(sim.trange(), sim.data[p_stim])
pylab.plot(sim.trange(), sim.data[p])
pylab.show()

produces this:

[Plot: the decoded output of `b` has noticeably smaller amplitude than the input sine wave.]

It looks to me like it could have gotten a better fit by just multiplying all the weights by ~1.4. Why didn't it find that solution?

@hunse (Collaborator, Author) commented May 30, 2016

That is a good question. Here's my theory:

  • We pose our normal decoder solving problem as dot(A, X) = Y where A is our activation, X is the decoders, and Y is the targets (in the case of representation, equal to our eval points).
  • To solve for the weights, I take the targets and multiply them by the encoders E, so we get dot(A, W) = dot(Y, E), where W is the weights we want to solve for.
  • If we're doing NNLS, all the weights must be non-negative, and so are the activities, so their product is non-negative. dot(Y, E) can be quite negative, though.
  • Solving the NNLS problem minimizes the RMS amplitude of dot(A, W) - dot(Y, E). Since there are negative terms that we can never get close to, the best we can hope to do is zero for those terms, so those terms end up contributing a lot of the error.
  • All this is being done in the input space of the post-synaptic neurons, and doesn't account for the bias and nonlinearity of those neurons. So in the example above, doubling the weights helps, but only because many of the neurons are already silent and are unaffected by the weight change. If all the output neurons were linear, doubling the weights wouldn't help. So in the eyes of the solver, it has found the best solution to the linear problem it was given. (I don't fully understand this yet, though.)
  • Choosing all non-zero intercepts allows the solver to perform almost perfectly. i.e. intercepts=nengo.dists.Uniform(0, 1) in the output population. I don't fully understand this yet either, but basically my thinking is that this makes the active (i.e. firing, non-silent) range of the neuron the same as the controllable range of the neuron. If you have a neuron with a negative intercept, this means that you need a negative current to make the neuron silent, and we can't get a negative current with non-negative weights. So essentially we always have some off and some on neurons fighting against each other, making the amplitude of the result too small. Here's the result with non-negative intercepts:
    [Plot: with intercepts drawn from Uniform(0, 1), the decoded output closely tracks the input sine.]
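The formulation in the bullets above can be sketched as follows. This is a minimal illustration using scipy.optimize.nnls, not Nengo's actual solver code; the matrix sizes and random data are made up. Each column of dot(Y, E) is the target for one NNLS problem, so both the activities and the solved weights are non-negative.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.RandomState(0)
n_eval, n_pre, n_post = 200, 20, 15

A = np.maximum(rng.randn(n_eval, n_pre), 0)      # non-negative activities
Y = np.sin(np.linspace(-1, 1, n_eval))[:, None]  # 1-D targets
E = rng.randn(1, n_post)                         # encoders of the post population

YE = Y.dot(E)  # targets projected into the post population's input space

# Solve dot(A, W) = dot(Y, E) one column (one post neuron) at a time,
# with the constraint that every weight is non-negative.
W = np.column_stack([nnls(A, YE[:, j])[0] for j in range(n_post)])

assert np.all(W >= 0)  # NNLS guarantees non-negative weights
```

Note that wherever a column of dot(Y, E) is mostly negative, the corresponding column of W is driven toward zero, which is exactly the "best we can hope to do is zero for those terms" behavior described above.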

@tcstewar (Contributor) commented:
Ah, that makes total sense. For about half the neurons, it's finding a pretty perfect set of weights. For the other half, it's supposed to find weights that give a negative input current, but it can't, so it just gets the current as small as possible. That still skews the decode, giving a smaller value than it should, since the neurons receiving too much input always push the decoded value towards zero. When I did the "just multiply by 1.4" thing, I was actually making the input current wrong for all the neurons, but in a way that roughly cancelled as far as the decode was concerned. In this scenario, though, the weights that are best at giving each neuron the right input current are not the weights that are best at getting the right decoded value out of the population.

It'd be nice to toss this observation into the NNLS documentation, along with your suggestion for the intercepts. But that's a separate PR from this one. Now that I (think I) understand what's happening here, this PR looks great to me.

@hunse (Collaborator, Author) commented May 30, 2016

There's still a few things I'm working on adding, so I can throw that bit of documentation in as well.

@hunse hunse force-pushed the fix-weight-solvers branch 2 times, most recently from fc19682 to 8373d0c Compare May 30, 2016 14:52
@hunse (Collaborator, Author) commented May 30, 2016

Ok @tcstewar, two more little commits you can look at. There was a strange bug in the NnlsL2 solver: I couldn't figure out why it wasn't working when it called out to the Nnls solver for the core of the solve, so I just copied the code over and now it works.

Also, one thing worth mentioning is that Nnls solves the least-squares problem directly, whereas NnlsL2 solves the Gram system (i.e. the normal equations). These give slightly different answers: NnlsL2(weights=True, reg=0.0) does slightly worse than Nnls(weights=True) (0.215 vs. 0.186 RMSE). However, NnlsL2 is considerably faster, at least for the system I tried it on, because the Gram system has much smaller matrices to deal with. We should probably document this somewhere, though I'm not sure whether the docstring or a notebook is the best place. I'm already not totally happy with the redundancy between the docstrings for the different Nnls solvers (i.e. I put that comment about non-negative intercepts in all of them).
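The two formulations mentioned above can be sketched like this (an illustration, not the PR's code, with made-up sizes and data). The direct problem works on the full tall matrix; the Gram (normal-equation) form works on a much smaller square system. For a reachable non-negative target the two agree; for unreachable targets they can differ slightly, which is the RMSE gap noted above.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.RandomState(1)
A = np.maximum(rng.randn(1000, 30), 0)  # tall: 1000 eval points, 30 neurons
w_true = rng.rand(30)                   # non-negative "true" weights
b = A.dot(w_true)                       # a reachable non-negative target

# Direct formulation (Nnls-style): solve on the full 1000 x 30 system.
w_direct, _ = nnls(A, b)

# Gram / normal-equation formulation (NnlsL2-style): only a 30 x 30 system.
G = A.T.dot(A)
c = A.T.dot(b)
w_gram, _ = nnls(G, c)

assert np.allclose(w_direct, w_gram, atol=1e-5)  # same answer when the target is reachable
```

The speed difference comes from the problem size: NNLS cost grows with the number of rows, and the Gram system replaces the (eval points x neurons) matrix with a (neurons x neurons) one.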

Y, m, n, _, matrix_in = format_system(A, Y)
Y = self.mul_encoders(Y, E, copy=True)
d = Y.shape[1]
Y[Y < 0] = 0  # makes the final RMS error more reasonable
@tcstewar (Contributor) commented on this diff:
I'm not sure we should be doing this. The RMS error is what it is -- this is returning the RMS error where any negative target value is treated as if its target value was 0, so it's not really the RMS error any more. Having this happen silently seems a bit odd to me...

@tcstewar (Contributor) commented:
Err, never mind -- I misinterpreted what was happening here (I thought it was just about giving a more reasonable RMS report out of the solver, rather than something that affects the actual decoders)

@hunse (Collaborator, Author) commented:
Oh no, you're right, this is bad. I liked it because it helped to compare different solvers when I was just playing around, but you're right that it misrepresents things. It could also be problematic if we ever have neurons that can have negative activities.

@hunse (Collaborator, Author) commented:
Oh that's so strange. It does make a difference in the Gram system (if you do it before multiplying Y by the transposed activities), but not in the regular system. I have no idea why that is.

@tcstewar (Contributor) commented:
Looks good to me. I think putting the extra documentation in a notebook makes the most sense, as there are many different ways of making use of this solver, and it's not something people have done much with yet.

@hunse (Collaborator, Author) commented May 30, 2016

Ok, I redid those last two commits so they don't affect the RMSE. We just clip Y to be non-negative in the Gram system, because this helps for some reason.

@tcstewar (Contributor) commented:
Cool. That looks great to me. Thanks! (and thanks for adding the TODO about figuring out what's happening there in the future... ) :)

@hunse (Collaborator, Author) commented May 30, 2016

Yeah, I mean it kind of makes sense. Making Y non-negative doesn't change the original system, because the solution can never achieve those negative values; it just does what it can to minimize the error on the positive terms and keeps the negative ones at zero. The value that minimizes f(x) + large_number is the same one that minimizes f(x). That said, clipping things to zero in the original system might change the balance of things: for example, if the target value is -10, the squared-error difference between returning 0 and returning 1 is larger than if the target value is 0.

In the Gram system, if we don't clip Y, then when we multiply by A transpose, some of the positives and negatives might cancel out, and we end up with a different system than if we clip Y first. So it makes sense that things change, and the whole Gram system/normal equation idea is based on things being linear (i.e. linear least squares), and when we have the non-negative constraint that's no longer the case.
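The cancellation described above can be shown with a tiny made-up example (not Nengo code). In the Gram system the right-hand side is A^T Y, so if Y is not clipped first, positive and negative entries of Y mix when multiplied by A^T, producing a different system than when Y is clipped to be non-negative beforehand.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 2.0]])  # non-negative activities
Y = np.array([2.0, -3.0])   # one unreachable (negative) target

# Clip first, then form the Gram right-hand side:
rhs_clipped = A.T.dot(np.maximum(Y, 0))  # array([2., 0.])

# Form the right-hand side without clipping: the +2 and -3 partially
# cancel in the first component.
rhs_raw = A.T.dot(Y)                     # array([-1., -6.])

# The two right-hand sides differ, so the resulting NNLS problems differ.
assert not np.allclose(rhs_clipped, rhs_raw)
```

In the direct system no such mixing happens, since each negative target stays its own (unreachable) row of the residual, which is why clipping matters only for the Gram formulation.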

@tcstewar tcstewar removed their assignment May 31, 2016
@hunse (Collaborator, Author) commented Jun 7, 2016

Added changelog entries. I think this is ready for merge.

@tbekolay (Member) commented Jun 9, 2016

I'll look at this shortly as it's the last thing for 2.1.1!

@tbekolay (Member) commented:
We found a minor issue which Eric fixed; history is clean now so I'll merge this tomorrow morning unless there are objections!

Commit messages:

- Also fix up regularization to be more accurate.
- `ravel` is more efficient as it does not make copies.
- Fixes a bug where Nnls was returning weights of the wrong dimensionality when `weights=True` (fixes #1019). Also fixes some solvers to be safer and faster in flattening the weights they return.
- This was not working properly because we were not clipping `Y` values to be non-negative before forming the Gram system. Now, `Nnls(weights=True)` and `NnlsL2(weights=True, reg=0)` give almost identical results.
@tbekolay tbekolay merged commit bdcb2c0 into master Jun 11, 2016
@tbekolay tbekolay deleted the fix-weight-solvers branch June 11, 2016 15:27
@tbekolay tbekolay removed their assignment Oct 16, 2017