
Improve product network accuracy. #651

Merged: 5 commits merged into master on Aug 18, 2015

Conversation

@jgosmann (Collaborator) commented Feb 9, 2015

More precise product network based on the results of my technical report.

I am also working on a benchmark to include in Nengo based on PR #647.
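
For readers skimming the thread, a minimal sketch of the idea in Nengo terms (not the PR's actual code; neuron counts and radii are illustrative): the product a*b is computed as ((a+b)^2 - (a-b)^2)/4, so each 1D ensemble only ever has to decode a square.

```python
import nengo

# Sketch of the squaring-based product: a*b = ((a+b)^2 - (a-b)^2) / 4.
with nengo.Network() as model:
    a = nengo.Node(output=0.5)   # placeholder inputs
    b = nengo.Node(output=-0.3)

    # One ensemble represents a+b, the other a-b. With |a|, |b| <= 1,
    # both combinations lie in [-2, 2], hence the radius of 2.
    sum_ens = nengo.Ensemble(n_neurons=100, dimensions=1, radius=2)
    diff_ens = nengo.Ensemble(n_neurons=100, dimensions=1, radius=2)
    nengo.Connection(a, sum_ens)
    nengo.Connection(b, sum_ens)
    nengo.Connection(a, diff_ens)
    nengo.Connection(b, diff_ens, transform=-1)

    # Decode the squares and take the scaled difference.
    product = nengo.Node(size_in=1)
    nengo.Connection(sum_ens, product,
                     function=lambda x: x ** 2, transform=0.25)
    nengo.Connection(diff_ens, product,
                     function=lambda x: x ** 2, transform=-0.25)
```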

@hunse (Collaborator) commented Feb 12, 2015

So this actually provides an improvement over using diagonal encoders? I had tried this once, and found it was better than random encoders, but worse than diagonal encoders. My explanation was that diagonal encoders basically force this kind of squaring multiplication anyway, but rather than computing the square terms individually and then taking the difference, it computes the difference directly, which allows decoders to be chosen so that errors in one square term cancel errors in the other square term.
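
For context, a hedged sketch of the diagonal-encoder approach described above (encoder choices and sizes are illustrative, not hunse's actual setup): every neuron in a single 2D ensemble gets an encoder along one of the (±1, ±1) diagonals, so each neuron effectively responds to a+b or a-b, while the product decoders are optimized over the whole population at once.

```python
import numpy as np
import nengo

with nengo.Network():
    diagonals = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / np.sqrt(2.0)
    ens = nengo.Ensemble(
        n_neurons=200,
        dimensions=2,
        radius=np.sqrt(2.0),  # keeps (a, b) with |a|, |b| <= 1 representable
        encoders=nengo.dists.Choice(diagonals),  # sample from the diagonals
    )
    out = nengo.Node(size_in=1)
    # The product is decoded from the whole population at once, so errors
    # in the implicit (a+b)^2 and (a-b)^2 terms can cancel each other.
    nengo.Connection(ens, out, function=lambda x: x[0] * x[1])
```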

@jgosmann (Collaborator, Author)

From my report linked above:

diagonal encoders vs. alternative network:
Improvement by 8% (p < 0.001).

RMSE values for diagonal encoders:

mean RMSE: 0.0481133982306
median RMSE: 0.0477134127208
variance of RMSE: 2.244052918e-05

RMSE values for alternative network:

mean RMSE: 0.0444385504631
median RMSE: 0.0445897354704
variance of RMSE: 1.59544830317e-05

Sample size: 50

Definitely not a huge difference, and to detect it, I believe it is necessary to ensure that the benchmark actually covers the complete input space (it is easy to miss the corners, which actually contribute the largest error).

@hunse (Collaborator) commented Feb 12, 2015

I think you're right: it matters a lot what your input is. I was probably doing something like that differently. I'm still curious why computing the squares separately is better; I also wondered if it could have to do with intercepts.

A couple minor comments on the notebook: I would report standard deviations instead of variances, since they're in the same units as means. Also, you could consider using relative RMSEs, i.e. the RMS of the error divided by the RMS of the correct output. I find it makes the RMSE a bit easier to interpret. Finally, make sure all the scales in the final spiking plots are the same. Right now, it's hard to tell if the "alternative" network actually results in less noisy outputs than the diagonal encoders, or if this is just because it has a larger scale.

@drasmuss (Member)

Also re: the notebook, in the alternative network benchmark you end up having effectively double the evaluation points, right? Did you compare the accuracy if you halve the evaluation points in each population so the total is the same as the simple network?

@hunse (Collaborator) commented Feb 12, 2015

There's an explanation in the notebook as to why @jgosmann does not halve the evaluation points. To summarize: in the simple network you have one 2D population, so the N evaluation points get projected onto each dimension, meaning you have N evaluation points per dimension. So it makes sense to keep N evaluation points in each dimension (now each its own population) in the alternative network.
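
In Nengo terms, a tiny sketch of that argument (N and the neuron counts are hypothetical):

```python
import nengo

N = 1000  # hypothetical number of evaluation points

with nengo.Network():
    # Simple network: one 2D population; its N evaluation points project
    # to N points along each of the two dimensions.
    simple = nengo.Ensemble(100, dimensions=2, n_eval_points=N)

    # Alternative network: each 1D population keeps N evaluation points,
    # matching the per-dimension density of the simple network.
    sq1 = nengo.Ensemble(100, dimensions=1, n_eval_points=N)
    sq2 = nengo.Ensemble(100, dimensions=1, n_eval_points=N)
```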

@drasmuss (Member)

Ah yes, that makes sense 🌈

@jgosmann (Collaborator, Author)

@hunse Those are good points. But I am not sure I understand how to calculate the relative RMSE (and therefore whether it makes sense): if I divide by the RMS of the correct output, wouldn't that give extremely large errors for values close to zero?

@jgosmann (Collaborator, Author)

Updated the notebook (use standard deviation and the same scale in all plots).

@hunse (Collaborator) commented Feb 12, 2015

I wasn't suggesting you compute the relative RMSE on a term-by-term basis; if you did that you would have the problem you described. Compute the RMSE exactly how you did, but then divide the final number by the RMS of the correct signal. Does that make sense?
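
In code, the suggestion amounts to something like this (a sketch; the function name and arguments are mine, not from the notebook):

```python
import numpy as np

def relative_rmse(actual, target):
    """RMSE of the whole error signal, divided by the RMS of the correct signal."""
    rmse = np.sqrt(np.mean((actual - target) ** 2))
    return rmse / np.sqrt(np.mean(target ** 2))
```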

@jgosmann (Collaborator, Author)

Version of the notebook also showing the relative RMSE. (not sure yet if I'll merge it into master)

@hunse force-pushed the better-product branch 2 times, most recently from 6c9d9cd to 0c80331 on February 18, 2015 at 22:40
@jgosmann (Collaborator, Author)

Based on PR #647 and #657 I implemented a benchmark for the product network in PR #658.

Surprisingly, it shows a much larger improvement than I expected:
Multiplication improvement by 0.310131 (84%, p < 0.001)

@hunse (Collaborator) commented Feb 23, 2015

My two concerns are 1) from a pedagogical point of view, this makes the Product code more opaque, and doesn't highlight the power of neurons to compute arbitrary nonlinear functions, and 2) it has not been well tested in all situations.

With regards to (1), I don't think this is a good reason not to have this new Product network, but maybe just take a few steps to make it clear that it's using a complex technique to get better results. We have a good notebook on basic multiplication (multiplication.ipynb), so maybe we can point to that in the docstring of Product. We could then also point to the tech report notebook in the same docstring as an in-depth look at the advanced techniques used in the new Product network.

With regards to (2), it would be good to test improvements on common networks that use Product, for example CircularConvolution. Also, if we have some larger models that make use of Product, it might be nice to test them, too, just to make sure that everything still works. The accuracy tests in the tech report notebook are great, but they do assume particular distributions of input values, and I just want to make sure that our models haven't made other assumptions that result in better performance of the old Product network.

If these are too much for this PR, we can always add the new product network beside the old one, so that everything will keep using the old one by default, but it's easy to get the new one.
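
For the record, CircularConvolution builds on Product internally, so exercising it directly is cheap. A sketch (neuron count and dimensionality are illustrative, not the benchmark's settings):

```python
import nengo

# CircularConvolution instantiates Product networks internally, so it
# makes a convenient integration test for the new implementation.
with nengo.Network():
    cconv = nengo.networks.CircularConvolution(n_neurons=200, dimensions=16)
    # The network exposes its input and output nodes as attributes
    # (A, B, and output in this era of Nengo), which can be probed.
```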

@jgosmann (Collaborator, Author)

I will soon rerun my spaopt tests, which test circular convolution and dot products. It won't be much work to make another run testing the modified product network. Of course, there are always some assumptions in writing a benchmark, but I think the tech report uses the best default assumptions in case no additional information about the distribution of the factors is given. Those are also the right assumptions for the dot product and circular convolution. But sure, there might be models based on different assumptions which are optimized for the old product network.

@tbekolay added this to the 2.1.0 release milestone on Mar 3, 2015
@tbekolay (Member)

  • modify docstring to point to multiplication.ipynb
  • run benchmark from Product benchmark #713
  • add tutorial for making subnetwork with simple product

@jgosmann (Collaborator, Author)

Circular convolution and dot product benchmarks submitted to the computing cluster.

@jgosmann (Collaborator, Author)

Benchmarks are done:

Legend

  • def = default/current implementation
  • spaopt = spaopt-v3 implementation (optimized radius), not part of this PR ⚠️
  • prod = using the improved product networks (this PR)

Thus, compare def to def + prod for the change introduced with this PR.

The plots show the distribution of errors along the y-axis; turned by 90 degrees, each plot reads like a probability density function.

Circular convolution

[plot: circular convolution error distributions (cconv)]

For the circular convolution, the improvement with this PR is even better than the spaopt optimizations! 😮 Together, they give the best result.

Dot product

[plot: dot product error distributions (dot)]

The improvement is also clearly there for the dot product, though in this case spaopt has a larger effect.

Methods

Data are based on 20 simulations with different seeds, each run for 10 seconds with the first 0.5 s discarded.

@jgosmann (Collaborator, Author)

Also ran the benchmark from #713. Here's the relevant line:

Multiplication improvement by 0.310131 (84%, p < 0.001)

@jgosmann (Collaborator, Author)

I found a problem when setting n_neurons=1, which I fixed, and I added a test for it. Is the usage of nengo.Direct() in that test safe with regard to different backends, or should this be done in some other way?
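
For reference, a sketch of what such a regression test could look like (not the PR's actual test; the setup below is illustrative):

```python
import nengo
from nengo.networks import Product

# n_neurons=1 regression check run in Direct mode, so the result does
# not depend on the (degenerate) one-neuron population.
with nengo.Network() as model:
    model.config[nengo.Ensemble].neuron_type = nengo.Direct()
    prod = Product(n_neurons=1, dimensions=1)

sim = nengo.Simulator(model)
sim.run(0.01)  # just verify that building and running does not fail
```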

@jgosmann (Collaborator, Author)

> add tutorial for making subnetwork with simple product

There is examples/basic/multiplication.ipynb. Doesn't that suffice?

@tbekolay (Member)

> Benchmarks are done

Nice, very impressive! I'm sold on using this as our Product implementation.

> There is examples/basic/multiplication.ipynb. Doesn't that suffice?

I think what would need to be added is a short section at the end showing how you can put that model creation code in a short function in order to make a multiplication subnetwork. The notebook could also point to the Product network as an optimized way to do exactly the same thing (synergy!)

Also, I think the docstring should point to examples/basic/multiplication.ipynb in addition to the tech report.
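
Something along these lines, say (a hypothetical wrapper; the name, arguments, and radius scaling are illustrative, not the notebook's final code):

```python
import numpy as np
import nengo

def Multiplication(n_neurons, radius=1.0):
    """Hypothetical reusable multiplication subnetwork, built like
    the model in examples/basic/multiplication.ipynb."""
    net = nengo.Network(label="Multiplication")
    with net:
        net.input_a = nengo.Node(size_in=1)
        net.input_b = nengo.Node(size_in=1)
        # Represent (a, b) jointly; the sqrt(2) factor keeps the corners
        # of the input square inside the ensemble's radius.
        combined = nengo.Ensemble(n_neurons, dimensions=2,
                                  radius=radius * np.sqrt(2.0))
        nengo.Connection(net.input_a, combined[0])
        nengo.Connection(net.input_b, combined[1])
        net.output = nengo.Node(size_in=1)
        nengo.Connection(combined, net.output,
                         function=lambda x: x[0] * x[1])
    return net
```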

@xchoo (Member) commented Aug 12, 2015

So.. I may be a bit math slow (and the workbook kinda took a round-about way of doing the derivation), but the function being computed by this new network is:

0.25 * (a+b)^2 - 0.25 * (a-b)^2
  = 0.25*a^2 + 0.5*a*b + 0.25*b^2 - (0.25*a^2 - 0.5*a*b + 0.25*b^2)
  = (0.25*a^2 - 0.25*a^2) + (0.25*b^2 - 0.25*b^2) + 0.5*a*b + 0.5*a*b
  = a*b!

Neato.
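
(A quick numeric check of the identity, with hypothetical inputs:)

```python
a, b = 0.7, -0.4
assert abs(0.25 * (a + b) ** 2 - 0.25 * (a - b) ** 2 - a * b) < 1e-12
```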

@tcstewar (Contributor)

> the function being computed by this new network is

Cute. And that also nicely explains why the diagonal encoders are the right choice in the normal version of the product network: you just need to be able to represent a+b and a-b.

@hunse (Collaborator) commented Aug 13, 2015

Any idea why adding in spaopt increases the maximum error? Namely, "spaopt+prod" has a higher maximum error than "def+prod", especially in the middle case. Not a big concern, but I'm curious. Other than that, the results look pretty definitive to me. Also, did you make those plots with matplotlib?

@jgosmann (Collaborator, Author)

Yes, spaopt decreases the radius so that the majority of values can be represented more accurately, but this implies that a few rare vectors fall outside of the radius and incur a larger error.

The plots were done with Seaborn.

@tbekolay (Member)

I just pushed some commits to add the subnetwork to the multiplication example, and a few style suggestions. Will merge on @jgosmann's +1!

@jgosmann (Collaborator, Author)

LGTM 🍰

@tbekolay merged commit 54ac9ab into master on Aug 18, 2015
@tbekolay deleted the better-product branch on August 18, 2015 at 19:03
@jgosmann (Collaborator, Author)

We probably should have added a changelog entry for this, shouldn't we?

@tbekolay (Member)

Yeah, I made a branch for it, changelogs. I'll make a PR.
