Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially confusing Node behaviour #626

Closed
drasmuss opened this issue Jan 19, 2015 · 15 comments
Closed

Potentially confusing Node behaviour #626

drasmuss opened this issue Jan 19, 2015 · 15 comments

Comments

@drasmuss
Copy link
Member

Pulling a Travis here and submitting bugs in my code as Nengo bugs, but just spent a little while tracking this down and thought I should document it. Here's a minimal example:

with nengo.Network() as net:
    class Test:
        tmp = 0.0

        def __call__(self, t):
            if t == 0.5:
                self.tmp = t
            print self.tmp

    a = nengo.Node(output=Test())

sim = nengo.Simulator(net, dt=0.25)
sim.run(1.0)

I think most people would expect that to print out "0.0, 0.0, 0.5, 0.5, 0.5, ...", but in reality it prints out "0.0, 0.0, 0.5, 0.75, 1.0, ...". Clearly (in hindsight) the problem is that t is an ndarray, not a float, so tmp and t get aliased. But the problem is that t looks and acts, in basically all other respects, as if it is a float (e.g., t == 0.5 passes, t[0] fails, etc.).

Potential solutions

  1. cast t to a float before passing it to the Node function (cleanest solution, but pay the cost of type cast)
  2. document this clearly in Node (hope people read it)
  3. this is desired behaviour, make fun of Daniel for making CS 101 errors
@jgosmann
Copy link
Collaborator

  1. Pass in a copy of t as ndarray. (1 might be faster, but not sure)

@tcstewar
Copy link
Contributor

Cute. I've run into the same problem with the x value being passed in. This also give initially unexpected behaviour for the same reason:

data = []
def node_function(t, x):
    data.append(x)
node = nengo.Node(node_function, size_in=1)

I've gotten into the habit of doing this instead:

data = []
def node_function(t, x):
    data.append(np.copy(x))
node = nengo.Node(node_function, size_in=1)

Not sure what to do about it in general. I still forget and get annoyed by this now and then. My instinct is that the overhead involved in doing (1) or (4) is pretty high, but it's something that probably should be benchmarked and checked. If it is actually a significant cost, then my vote is (2).

@hunse
Copy link
Collaborator

hunse commented Jan 20, 2015

I would probably have an option on the node whether to copy t and x or not (we could have one option for each, but I think one for both is fine). This would be settable with the config system, and default to copying. That way, people really concerned about speed can turn it off. I'm guessing the time cost will be relatively small in the grand scheme of things.

@tcstewar
Copy link
Contributor

I would probably have an option on the node whether to copy t and x or not (we could have one option for each, but I think one for both is fine).

One thing to keep in mind is whether this is a backend-specific option or something that all backends support. Right now in other backends, I've never run into this problem with t, but x behaves similarly to the reference backend.

I'd lean towards checking the speed hit, and if it is negligible, then just always passing in a copy. (fewer parameters)

@hunse
Copy link
Collaborator

hunse commented Jan 20, 2015

One thing to keep in mind is whether this is a backend-specific option or something that all backends support.

I guess I was assuming that backends can always make a copy if they want, or at least wrap the Python function so that a copy is passed in. Since this is just a performance question, it's not a problem if they have to make a copy (i.e. they don't need to support the no-copy option, I'd even support silent failure in that situation).

I'd lean towards checking the speed hit, and if it is negligible, then just always passing in a copy.

Another benefit to this is people can't accidentally change something inside their simulation (e.g. by setting x[:] = 0).

@tcstewar
Copy link
Contributor

I guess I was assuming that backends can always make a copy if they want, or at least wrap the Python function so that a copy is passed in. Since this is just a performance question, it's not a problem if they have to make a copy.

I meant that if there's a parameter that specifies which behaviour to use, then that would mean that backends would also have to support the non-copy option, with its strange side effects. That could be awkward for the backend to implement.

Another benefit to this is people can't accidentally change something inside their simulation (e.g. by setting x[:] = 0).

Yikes. Hadn't even thought of that. That makes me even more convinced that the right answer is "always copy".

@hunse
Copy link
Collaborator

hunse commented Jan 20, 2015

that would mean that backends would also have to support the non-copy option, with its strange side effects.

I guess I had assumed that this was all undesirable behaviour, so that it wouldn't matter if it was replicated. But I guess that could be the source of annoying bugs, if someone expects t or x to not be copied.

I'm fine with always copying. As I said, I'm guessing the speed hit will be minimal (though perhaps someone should check that).

@tbekolay
Copy link
Member

+1 copying t (or casting to float, whichever is faster), but I'm not sure about x, as that might incur more cost. We could, instead of copying, set x's flags such that x is readonly instead, if we're worried about people overwriting it (but since that value should only be used by the node anyhow, I don't think overwriting it would affect the rest of the model).

@hunse
Copy link
Collaborator

hunse commented Jan 20, 2015

I wrote the following script to test:

import sys
import timeit

import numpy as np

import nengo

args = sys.argv[1:]
copy_tx = True if len(args) > 0 and '--copy' in args else False

nx = 10
nnode = 100
rng = np.random.RandomState(9)

def node_fn(t, x):
    if copy_tx:
        t = np.array(t)
        x = np.array(x)
    return x + t

probes = []
with nengo.Network() as model:
    u = nengo.Node(output=rng.uniform(-1, 1, size=nx))
    for i in range(nnode):
        a = nengo.Node(output=node_fn, size_in=nx)
        nengo.Connection(u, a)
        probes.append(nengo.Probe(a))

sim = nengo.Simulator(model)

print(copy_tx)
print(timeit.timeit('sim.run(10.)', 'from __main__ import sim', number=3))

And got the following results for with copying:

True
Simulation finished in 0:00:09.                                                 
Simulation finished in 0:00:09.                                                 
Simulation finished in 0:00:09.                                                 
26.1397881508

and without copying

False
Simulation finished in 0:00:08.                                                 
Simulation finished in 0:00:08.                                                 
Simulation finished in 0:00:08.                                                 
23.938906908

So the difference is about a 9% slowdown, for this simulation of only nodes.

For a similar script but with a 10-neuron 10-dimensional ensemble for each node, the slowdown is 3%. If I make them 100-neuron populations, then it was actually faster when I copied (but just because of random fluctuations, I'm sure, but this indicates to me that the difference was negligible).

If I only copied t, the difference was negligible even in the all-node situation. So copying t is something we should do no matter what. Copying x is a bit more expensive; I'm guessing in most realistic simulations, it would be small enough that it would be negligible. The question is whether we want to risk it. Someone could pass images into a node to do something with them; do we want to have that extra copy?

I looked into the readonly thing, and it seems like for arrays of size 10, which is what I was testing with, it's just as fast to actually copy the array as it is to make a view and set writeable = False. Even trying a few larger sizes (100, 1000), it doesn't seem to be significantly faster to make a read-only view than to copy.

One final thing to note is that array copying time is not linear in array size. There is a lot of overhead that seems to account for most of the time.

@tbekolay
Copy link
Member

Interesting results! Especially about setting the readonly flag. Copying t seems like a no-brainer for sure; I would say that it's not worth copying x. The potential bugs that we would prevent are pretty subtle (probably not encountered by your average user), and the workaround of copying x inside the function is pretty straightforward. Adding something to the docs seems appropriate, though!

@hunse
Copy link
Collaborator

hunse commented Jan 20, 2015

For arrays above 1000 elements, the copying time seems to become linear, and it quickly becomes much more efficient to make a read-only view.
test_arraycopy

EDIT: made a less noisy version for posterity's sake.

@drasmuss
Copy link
Member Author

I think not copying x is reasonable. People are used to the idea that they need to worry about aliasing with arrays. I think it would also be fine to just set the read-only tag based on @hunse's latest results, no strong feelings either way.

hunse added a commit that referenced this issue Jan 21, 2015
Node functions now receive `t` as a float (as opposed to a
zero-dimensional array), and `x` as a readonly view onto the
simulator data. This addresses #626.
@hunse
Copy link
Collaborator

hunse commented Jan 21, 2015

I realized putting together a PR that we can make the readonly view onto x at build time; we don't have to make it each time step. So it seems like we might as well make it readonly.

@tbekolay
Copy link
Member

We can make the readonly view onto x at build time

But then we can't modify it on each timestep? Oh, or I suppose the builder has a view that allows for writing. Nice 🌈

hunse added a commit that referenced this issue Jan 21, 2015
Node functions now receive `t` as a float (as opposed to a
zero-dimensional array), and `x` as a readonly view onto the
simulator data. This addresses #626.
hunse added a commit that referenced this issue Jan 21, 2015
Node functions now receive `t` as a float (as opposed to a
zero-dimensional array), and `x` as a readonly view onto the
simulator data. This addresses #626.
@tbekolay
Copy link
Member

@drasmuss, did #628 address these concerns fully? Please close if it did!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants