Non-deterministic undocumented Node behaviour #716

Closed
xchoo opened this issue Apr 29, 2015 · 17 comments · Fixed by #722

xchoo (Member) commented Apr 29, 2015

For Nodes with size_in > 0, storing a reference to the input argument of the node function can lead to unexpected behaviour.

This code illustrates the problem:

import numpy as np
import matplotlib.pyplot as plt

import nengo
from nengo.utils.functions import piecewise


class DirectBuffer(object):
    def __init__(self, dim=1):
        self.x = np.zeros(dim)   # preallocated buffer

    def set_x(self, t, x):
        self.x = x               # Rebinds self.x to the input array; causes a problem
        # self.x = np.array(x)   # Copies the data; causes no problems

    def get_x(self, t):
        return self.x


with nengo.Network() as model:
    stim = nengo.Node(piecewise({0: 1, 0.5: 0.5, 1: -1}))

    db = DirectBuffer(1)
    n1 = nengo.Node(output=db.set_x, size_in=1)
    n2 = nengo.Node(output=db.get_x)

    nengo.Connection(stim, n1, synapse=0.005)

    p1 = nengo.Probe(stim)
    p2 = nengo.Probe(n2)

for r in range(10):
    sim = nengo.Simulator(model)
    sim.run(1.5)

    plt.subplot(2, 1, 1)
    plt.plot(sim.trange(), sim.data[p1])
    plt.ylim([-1.05, 1.05])
    plt.title('stim')
    plt.subplot(2, 1, 2)
    plt.plot(sim.trange(), sim.data[p2])
    plt.ylim([-1.05, 1.05])
    plt.title('out')
    plt.show()

In the ideal case, the output of n2 matches the signal generated by stim. However, in a majority of the runs, the output of n2 is 0.

hunse (Collaborator) commented Apr 29, 2015

This seems like expected behaviour. When you do self.x = x, you're making self.x point to x, so the original buffer that you allocated is lost, and you're left with a view onto some internal simulator signal.

I think what you want is self.x[...] = x.
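The distinction hunse is drawing can be sketched in a few lines of NumPy (this example is not from the thread; `sig` stands in for the simulator's internal signal):

```python
import numpy as np

buf = np.zeros(1)      # preallocated buffer
sig = np.array([1.0])  # stands in for an internal simulator signal

# Rebinding: buf now aliases sig, and the preallocated buffer is lost
buf = sig
sig[0] = -1.0
assert buf[0] == -1.0  # buf follows every simulator write

# In-place copy (the self.x[...] = x idiom): the buffer keeps its own memory
buf = np.zeros(1)
buf[...] = sig
sig[0] = 2.0
assert buf[0] == -1.0  # buf retains the value from copy time
```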

tcstewar (Contributor) commented

> This seems like expected behaviour.

Yup. Or, at least, expected if we have giant documentation saying NEVER EVER EVER KEEP THE VARIABLE PASSED IN TO A NODE. (in other words, if you want to keep it, then make a copy).

Part of me wants to just change things so we always pass in a copy to the Node, as I keep running into this confusion. But then part of me cringes at the overhead that causes.

hunse (Collaborator) commented Apr 29, 2015

Once the .nengorc file is a little more developed, it might not be a bad idea to have the default be to pass in a copy, and advanced users can turn this off for all their models if they want. Or we could just do it with the config system.

I'm guessing that there won't be much cost associated with the copy for all but the largest models (and even then, copies are really cheap compared to any other operation). Actually, the cost is a lot greater if there are a lot of small copies, since there's some Python overhead with making a new array or view onto an existing array (we avoid this by just making the view once, during the build).

xchoo (Member, Author) commented Apr 29, 2015

Yup. It is expected behaviour, but very un-intuitive behaviour (you have to know how the internals of the simulator work to understand why this behaviour happens).

hunse (Collaborator) commented Apr 29, 2015

I don't think it's necessarily unintuitive. If I receive an object as a parameter to the function, I would assume that that object is being passed by reference, not by value, and thus could change later. In fact, you seem to have had this view in your original code, too, since you allocated memory to store the incoming value. You just forgot to copy the data into the buffer you had created.

xchoo (Member, Author) commented Apr 29, 2015

Well, the code posted was after an evening of debugging the odd behaviour in @Seanny123's code. What made the problem even more un-intuitive was that he was using 1D signals (so you are expecting a float, but you don't get a float). (Oh, and the constructor should have been self.x = 0 haha. I was testing other things as well XD)

@tbekolay tbekolay changed the title Non-deterministic undocumented nengo.Node behaviour Non-deterministic undocumented Node behaviour Apr 29, 2015
hunse (Collaborator) commented Apr 29, 2015

Yeah, the 1D case is probably the least intuitive, because then you might expect to get a float instead of a 1D array, but it also bugs me when software does stuff like that and you have to write special case code for the 1D case versus the N-D case.
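The 1D case can be illustrated with a short sketch (not from the thread): a size_in=1 node function receives a length-1 array, not a float, so the aliasing problem applies there too.

```python
import numpy as np

x = np.array([0.5])  # what a size_in=1 Node function receives
assert x.shape == (1,) and not np.isscalar(x)

# Storing it directly keeps a live reference to the simulator's memory...
stored = x
x[0] = 0.0
assert stored[0] == 0.0

# ...while x.item() (or x.copy()) snapshots the value instead
x[0] = 0.5
snapshot = x.item()
x[0] = 0.0
assert snapshot == 0.5
```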

Here's the original issue #626 and the fix #628. For some reason, despite most people leaning towards copying, I ended up doing a readonly view. @drasmuss is just so convincing! So maybe we should just go implement the copy like we talked about? My results at the time showed it didn't seem like much overhead.

jgosmann (Collaborator) commented

It would be nice to have some numbers on how much more overhead it is.

It's definitely quite hard to spot the problem. Even after realizing that it has to be either a race condition or shared memory, it didn't occur to me why the memory was shared.

xchoo (Member, Author) commented Apr 29, 2015

Oh! There's that issue. I was trying to find that. 😄
I remember reading about that issue, and that was how I figured out what the problem was. haha

hunse (Collaborator) commented Apr 29, 2015

See #626 for my timing results. But remember that in my solution, I figured out that if we're using a view, we don't have to re-make it each time step, so there's no overhead. But the results do show how much it costs to copy arrays of different sizes (and it's pretty cheap).

How much it affects an actual network will depend on the network, of course, but I'm sure that as soon as you have any neural computations in the model, the cost of copying will be greatly overshadowed by the costs of encoder and decoder dot products, and the like. Generally, copying seems to be measured on the order of microseconds, whereas dot products are on the order of milliseconds.
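hunse's microseconds-versus-milliseconds claim can be checked with a quick, machine-dependent timing sketch (not from the thread; n = 500 is an arbitrary choice, and a 500x500 matrix-vector product stands in for an encoder/decoder dot product):

```python
import timeit
import numpy as np

n = 500  # arbitrary size; real node dimensions are often much smaller
x = np.random.randn(n)
A = np.random.randn(n, n)

reps = 2000
copy_t = timeit.timeit(lambda: x.copy(), number=reps) / reps
dot_t = timeit.timeit(lambda: A.dot(x), number=reps) / reps
print("copy: %.2f us, matvec: %.2f us" % (copy_t * 1e6, dot_t * 1e6))
```

On typical hardware the copy is orders of magnitude cheaper than the matrix-vector product, which is the point being made about overall model cost.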

jgosmann (Collaborator) commented

Interestingly, a copy seems to be cheaper than a RO view for n < 1000. Does anyone use nodes with such high-dimensional input a lot? It seems the input would usually be much smaller. Also, I agree that nodes probably won't be the main bottleneck in most simulations.

As the saying goes, developer time is more expensive than computation time. So I'm in favor of copies. We can still add an rc option for the experts who need every little bit of speed.

tcstewar (Contributor) commented

> As the saying goes, developer time is more expensive than computation time. So I'm in favor of copies.

+1 from me as well, as long as that cost is as minimal as it looks.

@xchoo xchoo closed this as completed Apr 29, 2015
@xchoo xchoo reopened this Apr 29, 2015
hunse (Collaborator) commented Apr 29, 2015

> We can still add an rc option for the experts who need every little bit of speed.

Ahh, but is it worth your developer time to add that option 😄

xchoo (Member, Author) commented Apr 29, 2015

Oops! Wrong button. 😃

drasmuss (Member) commented

I don't think I had any preference for readonly vs copying, I'd just do whichever one is faster.

Seanny123 (Contributor) commented

+1 for copying (if you have to write an all-caps warning in the documentation, it's usually a sign it's a bad feature). +1 for writing down (but not implementing) the read-only option as something to add to nengorc someday, and instead spending the time on making Nengo OCL faster.

@jgosmann jgosmann self-assigned this May 11, 2015
@jgosmann jgosmann added this to the 2.1.0 release milestone May 11, 2015
jgosmann added a commit that referenced this issue May 27, 2015
@jgosmann jgosmann removed their assignment May 27, 2015
hunse (Collaborator) commented May 27, 2015

> Interestingly, a copy seems to be cheaper than a RO view for n < 1000.

In practice, though, we don't have to make the view every timestep (just at the beginning), whereas we have to make the copy every timestep. So the copy is more expensive. But still pretty cheap in the grand scheme of things.
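The build-once-view versus copy-every-step trade-off can be sketched like this (not from the thread; `signal` stands in for an internal simulator signal):

```python
import numpy as np

signal = np.zeros(3)  # stands in for an internal simulator signal

# View approach: create the (readonly) view once, as at build time,
# then reuse it every step with no per-step work
view = signal.view()
view.flags.writeable = False
for step in range(5):
    signal[:] = step        # simulator writes new values each timestep
    assert view[0] == step  # the same view tracks them

# Copy approach: a fresh allocation every step, but safe to hold on to
held = []
for step in range(5):
    signal[:] = step
    held.append(signal.copy())
signal[:] = -1.0            # later writes don't touch the stored copies
assert [h[0] for h in held] == [0.0, 1.0, 2.0, 3.0, 4.0]
```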

Another benefit to the copy is that it makes the reference implementation consistent with backends, which will typically be copying off a device to pass to a node function and thus passing a copy.

jgosmann added a commit that referenced this issue May 27, 2015
jgosmann added a commit that referenced this issue May 28, 2015