Potentially confusing Node behaviour #626

drasmuss · 2015-01-19T21:48:05Z

Pulling a Travis here and submitting bugs in my code as Nengo bugs, but just spent a little while tracking this down and thought I should document it. Here's a minimal example:

with nengo.Network() as net:
    class Test:
        tmp = 0.0

        def __call__(self, t):
            if t == 0.5:
                self.tmp = t
            print self.tmp

    a = nengo.Node(output=Test())

sim = nengo.Simulator(net, dt=0.25)
sim.run(1.0)

I think most people would expect that to print out "0.0, 0.0, 0.5, 0.5, 0.5, ...", but in reality it prints out "0.0, 0.0, 0.5, 0.75, 1.0, ...". Clearly (in hindsight) the problem is that t is an ndarray, not a float, so tmp and t get aliased. But the problem is that t looks and acts, in basically all other respects, as if it is a float (e.g., t == 0.5 passes, t[0] fails, etc.).

Potential solutions

cast t to a float before passing it to the Node function (cleanest solution, but pay the cost of type cast)
document this clearly in Node (hope people read it)
this is desired behaviour, make fun of Daniel for making CS 101 errors

The text was updated successfully, but these errors were encountered:

jgosmann · 2015-01-19T22:18:51Z

Pass in a copy of t as ndarray. (1 might be faster, but not sure)

tcstewar · 2015-01-20T01:05:24Z

Cute. I've run into the same problem with the x value being passed in. This also give initially unexpected behaviour for the same reason:

data = []
def node_function(t, x):
    data.append(x)
node = nengo.Node(node_function, size_in=1)

I've gotten into the habit of doing this instead:

data = []
def node_function(t, x):
    data.append(np.copy(x))
node = nengo.Node(node_function, size_in=1)

Not sure what to do about it in general. I still forget and get annoyed by this now and then. My instinct is that the overhead involved in doing (1) or (4) is pretty high, but it's something that probably should be benchmarked and checked. If it is actually a significant cost, then my vote is (2).

hunse · 2015-01-20T01:12:36Z

I would probably have an option on the node whether to copy t and x or not (we could have one option for each, but I think one for both is fine). This would be settable with the config system, and default to copying. That way, people really concerned about speed can turn it off. I'm guessing the time cost will be relatively small in the grand scheme of things.

tcstewar · 2015-01-20T01:22:54Z

I would probably have an option on the node whether to copy t and x or not (we could have one option for each, but I think one for both is fine).

One thing to keep in mind is whether this is a backend-specific option or something that all backends support. Right now in other backends, I've never run into this problem with t, but x behaves similarly to the reference backend.

I'd lean towards checking the speed hit, and if it is negligible, then just always passing in a copy. (fewer parameters)

hunse · 2015-01-20T01:28:01Z

One thing to keep in mind is whether this is a backend-specific option or something that all backends support.

I guess I was assuming that backends can always make a copy if they want, or at least wrap the Python function so that a copy is passed in. Since this is just a performance question, it's not a problem if they have to make a copy (i.e. they don't need to support the no-copy option, I'd even support silent failure in that situation).

I'd lean towards checking the speed hit, and if it is negligible, then just always passing in a copy.

Another benefit to this is people can't accidentally change something inside their simulation (e.g. by setting x[:] = 0).

tcstewar · 2015-01-20T01:31:32Z

I guess I was assuming that backends can always make a copy if they want, or at least wrap the Python function so that a copy is passed in. Since this is just a performance question, it's not a problem if they have to make a copy.

I meant that if there's a parameter that specifies which behaviour to use, then that would mean that backends would also have to support the non-copy option, with its strange side effects. That could be awkward for the backend to implement.

Another benefit to this is people can't accidentally change something inside their simulation (e.g. by setting x[:] = 0).

Yikes. Hadn't even thought of that. That makes me even more convinced that the right answer is "always copy".

hunse · 2015-01-20T01:37:03Z

that would mean that backends would also have to support the non-copy option, with its strange side effects.

I guess I had assumed that this was all undesirable behaviour, so that it wouldn't matter if it was replicated. But I guess that could be the source of annoying bugs, if someone expects t or x to not be copied.

I'm fine with always copying. As I said, I'm guessing the speed hit will be minimal (though perhaps someone should check that).

tbekolay · 2015-01-20T15:53:28Z

+1 copying t (or casting to float, whichever is faster), but I'm not sure about x, as that might incur more cost. We could, instead of copying, set x's flags such that x is readonly instead, if we're worried about people overwriting it (but since that value should only be used by the node anyhow, I don't think overwriting it would affect the rest of the model).

hunse · 2015-01-20T22:10:21Z

I wrote the following script to test:

import sys
import timeit

import numpy as np

import nengo

args = sys.argv[1:]
copy_tx = True if len(args) > 0 and '--copy' in args else False

nx = 10
nnode = 100
rng = np.random.RandomState(9)

def node_fn(t, x):
    if copy_tx:
        t = np.array(t)
        x = np.array(x)
    return x + t

probes = []
with nengo.Network() as model:
    u = nengo.Node(output=rng.uniform(-1, 1, size=nx))
    for i in range(nnode):
        a = nengo.Node(output=node_fn, size_in=nx)
        nengo.Connection(u, a)
        probes.append(nengo.Probe(a))

sim = nengo.Simulator(model)

print(copy_tx)
print(timeit.timeit('sim.run(10.)', 'from __main__ import sim', number=3))

And got the following results for with copying:

True
Simulation finished in 0:00:09.                                                 
Simulation finished in 0:00:09.                                                 
Simulation finished in 0:00:09.                                                 
26.1397881508

and without copying

False
Simulation finished in 0:00:08.                                                 
Simulation finished in 0:00:08.                                                 
Simulation finished in 0:00:08.                                                 
23.938906908

So the difference is about a 9% slowdown, for this simulation of only nodes.

For a similar script but with a 10-neuron 10-dimensional ensemble for each node, the slowdown is 3%. If I make them 100-neuron populations, then it was actually faster when I copied (but just because of random fluctuations, I'm sure, but this indicates to me that the difference was negligible).

If I only copied t, the difference was negligible even in the all-node situation. So copying t is something we should do no matter what. Copying x is a bit more expensive; I'm guessing in most realistic simulations, it would be small enough that it would be negligible. The question is whether we want to risk it. Someone could pass images into a node to do something with them; do we want to have that extra copy?

I looked into the readonly thing, and it seems like for arrays of size 10, which is what I was testing with, it's just as fast to actually copy the array as it is to make a view and set writeable = False. Even trying a few larger sizes (100, 1000), it doesn't seem to be significantly faster to make a read-only view than to copy.

One final thing to note is that array copying time is not linear in array size. There is a lot of overhead that seems to account for most of the time.

tbekolay · 2015-01-20T22:20:19Z

Interesting results! Especially about setting the readonly flag. Copying t seems like a no-brainer for sure; I would say that it's not worth copying x. The potential bugs that we would prevent are pretty subtle (probably not encountered by your average user), and the workaround of copying x inside the function is pretty straightforward. Adding something to the docs seems appropriate, though!

hunse · 2015-01-20T22:25:03Z

For arrays above 1000 elements, the copying time seems to become linear, and it quickly becomes much more efficient to make a read-only view.

EDIT: made a less noisy version for posterity's sake.

drasmuss · 2015-01-20T22:40:41Z

I think not copying x is reasonable. People are used to the idea that they need to worry about aliasing with arrays. I think it would also be fine to just set the read-only tag based on @hunse's latest results, no strong feelings either way.

Node functions now receive `t` as a float (as opposed to a zero-dimensional array), and `x` as a readonly view onto the simulator data. This addresses #626.

hunse · 2015-01-21T20:42:06Z

I realized putting together a PR that we can make the readonly view onto x at build time; we don't have to make it each time step. So it seems like we might as well make it readonly.

tbekolay · 2015-01-21T20:50:51Z

We can make the readonly view onto x at build time

But then we can't modify it on each timestep? Oh, or I suppose the builder has a view that allows for writing. Nice 🌈

Node functions now receive `t` as a float (as opposed to a zero-dimensional array), and `x` as a readonly view onto the simulator data. This addresses #626.

tbekolay · 2015-01-27T22:11:07Z

@drasmuss, did #628 address these concerns fully? Please close if it did!

hunse added a commit that referenced this issue Jan 21, 2015

Node functions get t as float, x as readonly

e1968fb

Node functions now receive `t` as a float (as opposed to a zero-dimensional array), and `x` as a readonly view onto the simulator data. This addresses #626.

hunse mentioned this issue Jan 21, 2015

Node functions get t as float, x as readonly #628

Merged

hunse added a commit that referenced this issue Jan 21, 2015

Node functions get t as float, x as readonly

c6bfe72

Node functions now receive `t` as a float (as opposed to a zero-dimensional array), and `x` as a readonly view onto the simulator data. This addresses #626.

hunse added a commit that referenced this issue Jan 21, 2015

Node functions get t as float, x as readonly

c43016e

Node functions now receive `t` as a float (as opposed to a zero-dimensional array), and `x` as a readonly view onto the simulator data. This addresses #626.

drasmuss closed this as completed Jan 27, 2015

hunse mentioned this issue Apr 29, 2015

Non-deterministic undocumented Node behaviour #716

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Potentially confusing Node behaviour #626

Potentially confusing Node behaviour #626

drasmuss commented Jan 19, 2015

jgosmann commented Jan 19, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tbekolay commented Jan 20, 2015

hunse commented Jan 20, 2015

tbekolay commented Jan 20, 2015

hunse commented Jan 20, 2015

drasmuss commented Jan 20, 2015

hunse commented Jan 21, 2015

tbekolay commented Jan 21, 2015

tbekolay commented Jan 27, 2015

Potentially confusing Node behaviour #626

Potentially confusing Node behaviour #626

Comments

drasmuss commented Jan 19, 2015

jgosmann commented Jan 19, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tcstewar commented Jan 20, 2015

hunse commented Jan 20, 2015

tbekolay commented Jan 20, 2015

hunse commented Jan 20, 2015

tbekolay commented Jan 20, 2015

hunse commented Jan 20, 2015

drasmuss commented Jan 20, 2015

hunse commented Jan 21, 2015

tbekolay commented Jan 21, 2015

tbekolay commented Jan 27, 2015