Fixes bug where multiple identical indexed slices are ignored #1361

AllenHW · 2017-09-21T23:19:05Z

Motivation and context:
As described in #947, we have a bug where advanced indexing works when the indices are all different but not when the same index is repeated multiple times: the repeated increments are ignored. Eric found the .at() API (e.g. np.add.at), and I use it to fix the bug.
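
For illustration, a minimal sketch of the behaviour described above, using plain NumPy arrays rather than Nengo signals (the variable names are made up for this example and are not taken from the PR):

```python
import numpy as np

dst = np.zeros(3)
idx = [0, 0, 2]                    # index 0 appears twice
src = np.array([1.0, 1.0, 1.0])

# Buffered fancy-index increment: dst[idx] is read once, added to, and
# written back, so the repeated index only receives one increment.
dst[idx] += src
buffered = dst.copy()

# Unbuffered ufunc.at: every occurrence of an index is accumulated.
dst[:] = 0
np.add.at(dst, idx, src)
accumulated = dst.copy()

print(buffered)      # [1. 0. 1.]
print(accumulated)   # [2. 0. 1.]
```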

How has this been tested?
I added a test test_set_weight_solver in tests/test_connection.py.

How long should this take to review?

Quick (less than 40 lines changed or changes are straightforward)

Types of changes:

Bug fix (non-breaking change which fixes an issue)

Checklist:

I have read the CONTRIBUTING.rst document.
I have updated the documentation accordingly.
I have included a changelog entry.
I have added tests to cover my changes.
All new and existing tests passed.

Still to do:

AllenHW · 2017-09-22T00:23:13Z

Some of the tests are failing, but I can't reproduce the failures locally because I don't have the same NumPy version (I should set up a virtualenv). I'm not sure why the NumPy 1.8 case is failing, because .at() is supported by that version. Will look into it.

AllenHW · 2017-09-22T01:28:46Z

It looks like the issue is that in NumPy 1.8, .at() does not accept Ellipsis as an index. For instance, np.add.at(np.array([1,2]), Ellipsis, np.array([1,1])) doesn't work. Hence I had to add a silly branch: if the slice is Ellipsis, we fall back to the previous method.
For NumPy >= 1.9 this is not an issue, so the branch condition is only there to support NumPy 1.8.
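
A rough sketch of the fallback described above. The helper name increment is hypothetical and only for illustration; it is not the code in nengo/builder/operator.py:

```python
import numpy as np


def increment(dst, dst_slice, values):
    """Add values into dst[dst_slice], accumulating repeated indices."""
    if dst_slice is Ellipsis:
        # The whole array is addressed, so no index can repeat. Plain +=
        # is enough and avoids passing Ellipsis to ufunc.at, which the
        # comment above reports as failing on NumPy 1.8.
        dst[...] += values
    else:
        np.add.at(dst, dst_slice, values)


a = np.array([1.0, 2.0])
increment(a, Ellipsis, np.array([1.0, 1.0]))  # a is now [2., 3.]
increment(a, [0, 0], np.array([1.0, 1.0]))    # a is now [4., 3.]
print(a)
```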

AllenHW · 2017-09-22T07:03:22Z

nengo/builder/operator.py

@@ -388,7 +388,10 @@ def make_step(self, signals, dt, rng):

        if inc:
            def step_copy():
-                dst[dst_slice] += src[src_slice]
+                if type(dst_slice) == type(Ellipsis):


dst_slice is Ellipsis if no slice is provided, so we'd expect dst[dst_slice] to be the same as dst.
Yet writing dst += src[src_slice] inside step_copy fails with "dst referenced before assignment". This looks like Python scoping rather than a NumPy internal: the augmented assignment makes dst a local variable of step_copy.
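
The "referenced before assignment" error can be reproduced outside of Nengo. The sketch below uses hypothetical factory functions (make_step_broken, make_step_ok) to show that an augmented assignment to a bare name inside a nested function makes that name local, whereas an indexed assignment mutates the enclosing array:

```python
import numpy as np


def make_step_broken(dst, src):
    def step_copy():
        dst += src        # `dst` becomes a local name of step_copy
    return step_copy


def make_step_ok(dst, src):
    def step_copy():
        dst[...] += src   # item assignment updates the enclosing array
    return step_copy


dst = np.zeros(2)
try:
    make_step_broken(dst, np.ones(2))()
    error = None
except UnboundLocalError as exc:
    error = exc

make_step_ok(dst, np.ones(2))()
print(type(error).__name__, dst)
```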

jgosmann · 2017-09-22

I think isinstance(dst_slice, type(Ellipsis)) is usually preferred for type checks (except when derived types should be explicitly excluded, which is very rare).

Also, the slices should not change during the simulation. That should allow moving the if-branch outside of the step_copy function. That might have some performance impact, but I have no idea whether it's significant.

I'm also curious whether there is a performance impact from replacing += with np.add.at.

tbekolay · 2017-09-22

I agree with Jan, the slices won't change during the simulation so the if branch should be outside the step_copy function. The step functions returned in the builder are what are actually run when you do sim.run, so they should be written to be as fast as possible.

Additionally, it should only be necessary to use np.add.at when dst_slice or src_slice are lists (or NumPy arrays) of integers, correct? You could also check for that before defining the step function, so that we only use np.add.at when we need to.

hunse · 2017-09-22

If we really want to be specific, the only time we need np.add.at is when dst_slice is a list or ndarray with repeated indices. We could check for this with something like len(np.unique(dst_slice)) < len(dst_slice).
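
One way the suggestions in this thread could fit together, as a sketch under stated assumptions: has_repeated_indices and make_inc_step are hypothetical names, the index is assumed to be one-dimensional, and this is not necessarily how the merged code is structured. The branch is evaluated once when the step function is built, so the function that runs on every timestep contains no conditional.

```python
import numpy as np


def has_repeated_indices(index):
    """True only for integer list/array indices with a repeated entry."""
    if isinstance(index, (list, np.ndarray)):
        arr = np.asarray(index)
        # Boolean masks, slices and Ellipsis can never address an
        # element twice, so only integer indices need checking.
        if arr.dtype.kind in "iu":
            return len(np.unique(arr)) < arr.size
    return False


def make_inc_step(dst, dst_slice, src, src_slice):
    # Decide once, at build time, which implementation to use.
    if has_repeated_indices(dst_slice):
        def step_copy():
            np.add.at(dst, dst_slice, src[src_slice])
    else:
        def step_copy():
            dst[dst_slice] += src[src_slice]
    return step_copy


dst = np.zeros(3)
src = np.ones(2)

step = make_inc_step(dst, [1, 1], src, Ellipsis)  # uses np.add.at
step()
step()

fast = make_inc_step(dst, slice(0, 1), src, slice(0, 1))  # uses +=
fast()
print(dst)
```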

tbekolay

Looks good! Made some inline comments for minor changes.

tbekolay · 2017-09-22T15:41:43Z

nengo/builder/operator.py

@@ -388,7 +388,10 @@ def make_step(self, signals, dt, rng):

        if inc:
            def step_copy():
-                dst[dst_slice] += src[src_slice]
+                if type(dst_slice) == type(Ellipsis):


I agree with Jan, the slices won't change during the simulation so the if branch should be outside the step_copy function. The step functions returned in the builder are what are actually run when you do sim.run, so they should be written to be as fast as possible.

Additionally, it should only be necessary to use np.add.at when dst_slice or src_slice are lists (or NumPy arrays) of integers, correct? You could also check for that before defining the step function, so that we only use np.add.at when we need to.

tbekolay · 2017-09-22T15:43:14Z

nengo/tests/test_connection.py

+    d_data = sim.data[d_probe]
+
+    plt.plot(t, a_data)
+    assert np.allclose(a_data[100:], [0], atol=0.3)


These tolerances feel pretty big to me. I'll look at the plots when I re-review, but could you try a larger synapse value on the probes to see if the decoded value is more stable?

tbekolay · 2017-09-22T15:44:27Z

nengo/tests/test_connection.py

+def test_advanced_indexing(Simulator, plt, seed):
+    N = 100
+
+    model = nengo.Network()


You should pass the seed into the Network (Network(seed=seed)) so that the test becomes deterministic; otherwise, we'll get random parameters each time and potentially random test failures! 😄

tbekolay · 2017-09-22T15:45:56Z

nengo/tests/test_connection.py

+    plt.plot(t, c_data)
+    assert np.allclose(c_data[100:], [-1,1], atol=0.3)
+    plt.plot(t, d_data)
+    assert np.allclose(d_data[100:], [1,1], atol=0.3)


This test is great, thanks!

We mentioned in the dev meeting that it would also be nice to support boolean arrays for indices. Have you tried testing with boolean indices? If not, it might be worth trying it out to see if it works... if it doesn't, it makes sense to implement it in another PR, but if it just works then it'd be good to add another test for that here!

hunse · 2017-09-22

I tried this and it seems to work. I'll add a test for it.
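
As a small illustration of why boolean indices are expected to behave the same either way (plain NumPy, with made-up variable names): a boolean mask selects each position at most once, so += and np.add.at should agree.

```python
import numpy as np

mask = np.array([True, False, True])
src = np.array([1.0, 2.0])

# Ordinary in-place increment through the mask.
via_inc = np.zeros(3)
via_inc[mask] += src

# Unbuffered increment through the same mask.
via_at = np.zeros(3)
np.add.at(via_at, mask, src)

print(via_inc, via_at)
```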

hunse · 2017-09-22T16:54:50Z

I did a test, and np.add.at seems to be considerably slower than += (at least three times slower, and much more if dst_slice is a slice or Ellipsis or something). So I think we should only use np.add.at if dst_slice has repeated indices.

Here's a script if anyone wants to play around.

import timeit
import numpy as np

m = int(1e6)

# Change a 0 to 1 to choose which kind of index is benchmarked.
if 0:
    # no indexing: the whole array
    n = m
    inds = Ellipsis
elif 0:
    # contiguous slice
    n = int(0.9*m)
    inds = slice(5, 5+n)
else:
    # unique integer indices in random order
    n = int(0.9*m)
    inds = np.random.permutation(m)[:n]
    # inds = np.sort(inds)

a = np.random.uniform(-1, 1, size=n)
b = np.zeros(m)

# Best of three single runs; the setup string resets b before each run.
t_inc = min(timeit.repeat(
    'b[inds] += a', 'from __main__ import *; b[:] = 0', repeat=3, number=1))
t_at = min(timeit.repeat(
    'np.add.at(b, inds, a)', 'from __main__ import *; b[:] = 0', repeat=3, number=1))

print(t_inc, t_at)
print(t_at/t_inc)

hunse · 2017-09-22T17:08:43Z

nengo/builder/operator.py

+                if type(dst_slice) == type(Ellipsis):
+                    dst[dst_slice] += src[src_slice]
+                else:
+                    np.add.at(dst, dst_slice, src[src_slice])
        else:
            def step_copy():
                dst[dst_slice] = src[src_slice]


In this case, if there are repeated indices in dst_slice, then the behaviour is kind of undefined. I think we should have an assertion here to make sure that's not the case. I don't think a user should ever be able to trip that assertion, but it could save a developer some trouble.
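
A sketch of what such an assertion might look like. The names assert_no_repeats and make_set_step are hypothetical, the index is assumed to be one-dimensional, and the check runs once at build time rather than inside the step function:

```python
import numpy as np


def assert_no_repeats(index):
    """Fail if an integer list/array index addresses an element twice."""
    if isinstance(index, (list, np.ndarray)):
        arr = np.asarray(index)
        if arr.dtype.kind in "iu":
            assert len(np.unique(arr)) == arr.size, (
                "repeated indices in dst_slice")


def make_set_step(dst, dst_slice, src, src_slice):
    # With repeated destination indices it is unclear which source value
    # should win, so reject that case before building the step function.
    assert_no_repeats(dst_slice)

    def step_copy():
        dst[dst_slice] = src[src_slice]
    return step_copy


dst = np.zeros(3)
make_set_step(dst, [0, 2], np.array([5.0, 7.0]), Ellipsis)()

try:
    make_set_step(dst, [0, 0], np.array([5.0, 7.0]), Ellipsis)
    raised = False
except AssertionError:
    raised = True
print(dst, raised)
```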

hunse · 2017-09-22T17:09:07Z

I'm going to go through and make some of the changes I suggested.

AllenHW · 2017-09-23T00:07:06Z

The script below suggests that using np.add.at with repeated indices is faster than, say, creating multiple connections.

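import timeit
import numpy as np

m = int(100)

b = np.zeros(m)
c = np.zeros(m)
# m copies of index 0 (np.int was removed from NumPy; plain int is equivalent)
inds = np.zeros(m, dtype=int)

# m separate single-element increments, standing in for m connections
t_inc = min(timeit.repeat(
    'b[[0]] += [1]', 'from __main__ import *;', repeat=m, number=m))
# one np.add.at call that applies all m increments to index 0
t_at = min(timeit.repeat(
    'np.add.at(c, inds, [1])', 'from __main__ import *;', repeat=1, number=1))

print(t_inc, t_at)
print(t_at/t_inc)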

LGTM

hunse

This all LGTM, assuming that @tbekolay is able to tighten up those tolerances when he goes through it.

AllenHW self-assigned this Sep 21, 2017

AllenHW requested review from tbekolay, hunse and tcstewar September 21, 2017 23:19

AllenHW commented Sep 22, 2017

tbekolay requested changes Sep 22, 2017

hunse reviewed Sep 22, 2017

hunse approved these changes Sep 23, 2017

tbekolay force-pushed the advanced-index branch from 76fc4e1 to 2720c4b Compare September 25, 2017 15:58

tbekolay approved these changes Sep 25, 2017

tbekolay force-pushed the advanced-index branch 2 times, most recently from b36bdb7 to 9bc1e22 Compare September 25, 2017 16:20

AllenHW and others added 2 commits September 25, 2017 12:43

Fix bug with multiple indices in post ObjView

f4bbdc4

Also test that boolean indexing works. With this, advanced indexing is fully supported for ObjViews in connections. Fixes #947.

Bump NumPy requirement to 1.8

7a9e768

tbekolay force-pushed the advanced-index branch from 9bc1e22 to 7a9e768 Compare September 25, 2017 16:43

tbekolay merged commit 7a9e768 into master Sep 25, 2017

tbekolay deleted the advanced-index branch September 25, 2017 17:26

tbekolay unassigned AllenHW Oct 6, 2017

tbekolay mentioned this pull request Oct 6, 2017

Release v1.3.0 nengo/nengo-ocl#148

Merged
