Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1206] Support NDArray indexing with None and Ellipsis #13143

Merged
merged 48 commits into from Aug 8, 2019
Merged

[MXNET-1206] Support NDArray indexing with None and Ellipsis #13143

merged 48 commits into from Aug 8, 2019

Conversation

kohr-h
Copy link
Contributor

@kohr-h kohr-h commented Nov 6, 2018

Description

This PR adds the capability of indexing NDArray with None to create new axes, and with ... (Ellipsis) as a shorthand for an adequate number of : (slice(None)).

Closes #13134
Closes #13166

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • NDArray.__getitem__ and NDArray.__setitem__ now first expand None and Ellipsis entries in indices.
  • __setitem__ ignores None entries, which corresponds to the behavior of NumPy.
  • If the indexed portion is contiguous, __getitem__ now always returns a view.

Comments

  • I've noted that indexing is not as copy-free as it could be. For instance:
    >>> x = nd.ones(3)
    >>> y = x[:2]  # same as y = x[slice(2)]
    >>> y[:] = 0
    >>> x  # as expected
    [0. 0. 1.]
    <NDArray 3 @cpu(0)>
    >>> y = x[(slice(2),)]  # Should be equivalent
    >>> y[:] = -17
    >>> x  # unchanged!
    [0. 0. 1.]
    <NDArray 3 @cpu(0)>
  • Another nifty case is this one:
    >>> x = nd.ones((2, 3))
    >>> y = x[:2]
    >>> y[:] = 0
    >>> x  # as expected
    [[0. 0. 0.]
     [1. 1. 1.]]
    <NDArray 2x3 @cpu(0)>
    >>> y = x[:2, :]
    >>> y[:] = -17
    >>> x  # unchanged!
    [[0. 0. 0.]
     [1. 1. 1.]]
    <NDArray 2x3 @cpu(0)>

I think that the implementation does not live up to the promise of the documentation:

Returns a sliced view of this array if the elements fetched are contiguous in memory;
otherwise, returns a newly created NDArray.

In particular, the "detection" of contiguousness is not good.

I'll make that an extra issue, but it's related to this PR, so I mention it here.

Remaining TODOs

  • Fix error in File "/work/mxnet/tests/python/unittest/test_dgl_graph.py", line 64, in check_compact (from test_uniform_sample)
  • Fix failure in File "/work/mxnet/tests/python/unittest/test_operator.py", line 4195, in test_empty_tensor (from test_tile)
  • Fix failure in File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4290, in test_empty_indices (from test_one_hot)
  • Fix linter errors/warnings

@kohr-h kohr-h requested a review from szha as a code owner November 6, 2018 22:14
Copy link
Member

@anirudhacharya anirudhacharya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kohr-h Thanks for the contribution!

Can you please add tests for this change.

python/mxnet/ndarray/ndarray.py Show resolved Hide resolved
@kohr-h
Copy link
Contributor Author

kohr-h commented Nov 6, 2018

Should I open a JIRA ticket as well?

@anirudhacharya
Copy link
Member

@mxnet-label-bot [NDArray, pr-awaiting-review]

@anirudhacharya
Copy link
Member

@marcoabreu marcoabreu added NDArray pr-awaiting-review PR is waiting for code review labels Nov 6, 2018
@kohr-h kohr-h changed the title [Python] Support NDArray indexing with None and Ellipsis [MXNET-1206] Support NDArray indexing with None and Ellipsis Nov 7, 2018
@anirudhacharya
Copy link
Member

@kohr-h please ping here once you think the PR is ready for review.

@kohr-h
Copy link
Contributor Author

kohr-h commented Nov 20, 2018

Yes, will do. The whole thing is more complex than anticipated (as usual), in particular with advanced indexing. So it's still work-in-progress. But I'm determined to push it through.

@vandanavk
Copy link
Contributor

@mxnet-label-bot update[NDArray, pr-work-in-progress]

@roywei
Copy link
Member

roywei commented Dec 11, 2018

@kohr-h Thanks for the contribution! Here are some resources if you need help:

  1. you can take a look at MXNet wiki for developers to understand the design
  2. Ask questions on our forum and discuss if you have any difficulty.
  3. For API changes, we can discuss on devlist and slack channel

@kohr-h
Copy link
Contributor Author

kohr-h commented Dec 13, 2018

Getting closer. Advanced indexing with __getitem__ should work now, but I'll add test cases for that.

@Roshrini
Copy link
Member

Roshrini commented Jan 8, 2019

@kohr-h Thanks for working on this! can you look into failing CI builds?

@kohr-h
Copy link
Contributor Author

kohr-h commented Jan 8, 2019

@kohr-h Thanks for working on this! can you look into failing CI builds?

@Roshrini Yes will do. I haven't had much time during Holiday and the next time will also be a bit on and off, but I'll see what I can do.

@stu1130
Copy link
Contributor

stu1130 commented Jan 16, 2019

@mxnet-label-bot update [pr-work-in-progress, NDArray]

@marcoabreu marcoabreu added pr-work-in-progress PR is still work in progress NDArray and removed NDArray pr-awaiting-review PR is waiting for code review labels Jan 16, 2019
@vandanavk
Copy link
Contributor

@kohr-h Could add "Fixes #13166" in the PR description so that #13166 is closed too.

Also could you re-trigger CI

@vandanavk
Copy link
Contributor

@mxnet-label-bot update [pr-awaiting-testing, NDArray]

@marcoabreu marcoabreu added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Mar 11, 2019
@kohr-h
Copy link
Contributor Author

kohr-h commented Mar 19, 2019

This is almost finished. There are 2 failing tests for which I have no idea why they're failing and how to fix them. I'd appreciate some help.

Here are the stacktraces (ignore the _gpu part, they also fail on CPU):

======================================================================
FAIL: test_operator_gpu.test_tile
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/lib/python3.6/site-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4246, in test_tile
    test_empty_tensor()
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4195, in test_empty_tensor
    assert same(a_tiled, b_tiled)
AssertionError: 
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=2048974122 to reproduce.
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: test_operator_gpu.test_one_hot
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/usr/lib/python3.6/site-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/work/mxnet/tests/python/gpu/../unittest/common.py", line 177, in test_new
    orig_test(*args, **kwargs)
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4306, in test_one_hot
    test_empty_indices()
  File "/work/mxnet/tests/python/gpu/../unittest/test_operator.py", line 4290, in test_empty_indices
    assert same(expected_array, mx_one_hot_array)
AssertionError: 
-------------------- >> begin captured logging << --------------------
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=776351319 to reproduce.
--------------------- >> end captured logging << ---------------------

Both tests fail because tensors with 0s in their shapes are handled incorrectly in the two failing functions tile and one_hot. I didn't change any of those, and the closest thing I changed (my only change in the backend) is this one in slice_forward. But I couldn't find a connection to the other ops.

@karan6181
Copy link
Contributor

@szha and @anirudh2290 Could you please have a took at this PR? Thank you!

@abhinavs95
Copy link
Contributor

@kohr-h There are 7 CI tests failing currently, could you have a look? Thanks!

@reminisce reminisce merged commit b77f524 into apache:master Aug 8, 2019
@kohr-h kohr-h deleted the indexing_none_ellipsis branch August 12, 2019 08:59
@kohr-h
Copy link
Contributor Author

kohr-h commented Aug 12, 2019

Great to see this PR merged, thanks for the "last mile" effort @reminisce and @zoeygxy.

Regarding the performance differences, the trend you observed is somewhat to be expected. The checks performed in basic indexing are more comprehensive than before, and in more cases than before, views are being returned. Thus, the cases where views have always been returned are a bit slower because of extra overhead, but the cases that turned from copy to view should amortize easily.

The advanced indexing performance could have gone either way, so I'm happy about the improvement there.

To see the effect more clearly, I think it would make sense to test very small arrays (pure overhead) and very large arrays (purely copy vs. view).

How much are you worried about the performance hit for the view cases? If you really want to be on par with the old speed, you could re-introduce the same special-casing as in the old code, but IMO that would make the code more complex. My implementation strives to be as generic as possible and takes the small performance hit of manipulating tuples and slice objects a bit too much for simple cases.

@reminisce
Copy link
Contributor

@kohr-h Thanks for the response. Since returning views is a very frequent operation in many dataset loaders in MXNet, we have to keep the performance on a par with the previous version. @zoeygxy will keep working on improving the performance of the cases where views are returned. Although such special handling may lead code to look a little bit disorganized, the performance gain is still worth the effort because we don't want to introduce performance regression in training jobs.

@DickJC123
Copy link
Contributor

@kohr-h
A git show of this PR's commit includes:

diff --git a/python/mxnet/test_utils.py b/python/mxnet/test_utils.py
index 2275f4d5d..89013f439 100644
--- a/python/mxnet/test_utils.py
+++ b/python/mxnet/test_utils.py
@@ -1074,7 +1074,11 @@ def check_symbolic_forward(sym, location, expected, rtol=1E-4, atol=None,
 
     executor = sym.bind(ctx=ctx, args=location, args_grad=args_grad_data, aux_states=aux_states)
     for g in executor.grad_arrays:
-        g[:] = 0
+        print(g.shape)
+        if g.ndim == 0:
+            g[()] = 0
+        else:
+            g[:] = 0
 

Please confirm that the line print(g.shape) was a bit of debugging code inadvertently left in, and so should be removed.

@kohr-h
Copy link
Contributor Author

kohr-h commented Aug 16, 2019

@DickJC123 It's part of fe6336d so I'll bounce the question to @reminisce.

@kohr-h
Copy link
Contributor Author

kohr-h commented Aug 16, 2019

@kohr-h Thanks for the response. Since returning views is a very frequent operation in many dataset loaders in MXNet, we have to keep the performance on a par with the previous version. @zoeygxy will keep working on improving the performance of the cases where views are returned. Although such special handling may lead code to look a little bit disorganized, the performance gain is still worth the effort because we don't want to introduce performance regression in training jobs.

@reminisce Makes sense as a trade-off.

@zoeygxy
Copy link
Contributor

zoeygxy commented Aug 19, 2019

A git show of this PR's commit includes:

diff --git a/python/mxnet/test_utils.py b/python/mxnet/test_utils.py
index 2275f4d5d..89013f439 100644
--- a/python/mxnet/test_utils.py
+++ b/python/mxnet/test_utils.py
@@ -1074,7 +1074,11 @@ def check_symbolic_forward(sym, location, expected, rtol=1E-4, atol=None,
 
     executor = sym.bind(ctx=ctx, args=location, args_grad=args_grad_data, aux_states=aux_states)
     for g in executor.grad_arrays:
-        g[:] = 0
+        print(g.shape)
+        if g.ndim == 0:
+            g[()] = 0
+        else:
+            g[:] = 0
 

Please confirm that the line print(g.shape) was a bit of debugging code inadvertently left in, and so should be removed.

@DickJC123 I believe so. Thanks for pointing out! I will take care of it.

anirudhacharya pushed a commit to anirudhacharya/mxnet that referenced this pull request Aug 20, 2019
…13143)

* Support NDArray indexing with None and Ellipsis

* Update NDArray.__setitem__ docs with None and Ellipsis

* Fix boolean flag in NDArray.__getitem__, add doctests

* Add setitem test for None and Ellipsis

* Fix wrong slice used, add cases to test_indexing

* Revamp NDArray.__getitem__ and __setitem__

* Fix typo in error message of SetSliceOpOutputDimSize

* Fix setting of array with integer indices

* Fix basic __setitem__ for all test cases

* WIP: fixing advanced indexing

* REMOVE: printing in tests

* Re-implement advanced indexing with None and Ellipsis

* Fix lint errors

* WIP: fix basic indexing

* WIP: fix basic indexing

* TEMP: print statements in tests

* Fix op.slice with step<0 and end==-1

* Implement copy-free general contiguous indexing

* Improve doctest of __getitem__

* Fix missing staticmethod

* Remove superfluous _at and _slice

* Fix lint errors

* WIP: basic indexing

* Remove print statements from tests

* Fix call into op.slice in basic indexing, add doctest

* Print failing index in setitem tests

* Simplify implementation of advanced index broadcasting

* Better printing for failing setitem tests

* Remove list indexing restriction, fix value shape broadcasting

* Fix bad string formatting

* Fix bug in test_uniform

* Test mutability of sliced array if contiguous

* Fix whitespace error in matrix_op-inl.h

* "Fix" pylint complaints

* Temporarily disable failing unittests

* Silence another pylint complaint

* Fix size-0 array creation

* Make scalar tensor assignment test check for IndexError

* Re-activate operator tests with 0-size arrays

* Use np.compat in tests with zeros in shape or empty shape

* Change comment in autograd indexing test

* Add more None-containing index tuples to indexing test

* Disable None in advanced indexing test since it has not been supported

* Fix sanity

* Fix ci

* Fix unit test failure

* Fix __getitem__
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Sep 25, 2019
…13143)

* Support NDArray indexing with None and Ellipsis

* Update NDArray.__setitem__ docs with None and Ellipsis

* Fix boolean flag in NDArray.__getitem__, add doctests

* Add setitem test for None and Ellipsis

* Fix wrong slice used, add cases to test_indexing

* Revamp NDArray.__getitem__ and __setitem__

* Fix typo in error message of SetSliceOpOutputDimSize

* Fix setting of array with integer indices

* Fix basic __setitem__ for all test cases

* WIP: fixing advanced indexing

* REMOVE: printing in tests

* Re-implement advanced indexing with None and Ellipsis

* Fix lint errors

* WIP: fix basic indexing

* WIP: fix basic indexing

* TEMP: print statements in tests

* Fix op.slice with step<0 and end==-1

* Implement copy-free general contiguous indexing

* Improve doctest of __getitem__

* Fix missing staticmethod

* Remove superfluous _at and _slice

* Fix lint errors

* WIP: basic indexing

* Remove print statements from tests

* Fix call into op.slice in basic indexing, add doctest

* Print failing index in setitem tests

* Simplify implementation of advanced index broadcasting

* Better printing for failing setitem tests

* Remove list indexing restriction, fix value shape broadcasting

* Fix bad string formatting

* Fix bug in test_uniform

* Test mutability of sliced array if contiguous

* Fix whitespace error in matrix_op-inl.h

* "Fix" pylint complaints

* Temporarily disable failing unittests

* Silence another pylint complaint

* Fix size-0 array creation

* Make scalar tensor assignment test check for IndexError

* Re-activate operator tests with 0-size arrays

* Use np.compat in tests with zeros in shape or empty shape

* Change comment in autograd indexing test

* Add more None-containing index tuples to indexing test

* Disable None in advanced indexing test since it has not been supported

* Fix sanity

* Fix ci

* Fix unit test failure

* Fix __getitem__
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Sep 25, 2019
…13143)

* Support NDArray indexing with None and Ellipsis

* Update NDArray.__setitem__ docs with None and Ellipsis

* Fix boolean flag in NDArray.__getitem__, add doctests

* Add setitem test for None and Ellipsis

* Fix wrong slice used, add cases to test_indexing

* Revamp NDArray.__getitem__ and __setitem__

* Fix typo in error message of SetSliceOpOutputDimSize

* Fix setting of array with integer indices

* Fix basic __setitem__ for all test cases

* WIP: fixing advanced indexing

* REMOVE: printing in tests

* Re-implement advanced indexing with None and Ellipsis

* Fix lint errors

* WIP: fix basic indexing

* WIP: fix basic indexing

* TEMP: print statements in tests

* Fix op.slice with step<0 and end==-1

* Implement copy-free general contiguous indexing

* Improve doctest of __getitem__

* Fix missing staticmethod

* Remove superfluous _at and _slice

* Fix lint errors

* WIP: basic indexing

* Remove print statements from tests

* Fix call into op.slice in basic indexing, add doctest

* Print failing index in setitem tests

* Simplify implementation of advanced index broadcasting

* Better printing for failing setitem tests

* Remove list indexing restriction, fix value shape broadcasting

* Fix bad string formatting

* Fix bug in test_uniform

* Test mutability of sliced array if contiguous

* Fix whitespace error in matrix_op-inl.h

* "Fix" pylint complaints

* Temporarily disable failing unittests

* Silence another pylint complaint

* Fix size-0 array creation

* Make scalar tensor assignment test check for IndexError

* Re-activate operator tests with 0-size arrays

* Use np.compat in tests with zeros in shape or empty shape

* Change comment in autograd indexing test

* Add more None-containing index tuples to indexing test

* Disable None in advanced indexing test since it has not been supported

* Fix sanity

* Fix ci

* Fix unit test failure

* Fix __getitem__
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Sep 25, 2019
…13143)

* Support NDArray indexing with None and Ellipsis

* Update NDArray.__setitem__ docs with None and Ellipsis

* Fix boolean flag in NDArray.__getitem__, add doctests

* Add setitem test for None and Ellipsis

* Fix wrong slice used, add cases to test_indexing

* Revamp NDArray.__getitem__ and __setitem__

* Fix typo in error message of SetSliceOpOutputDimSize

* Fix setting of array with integer indices

* Fix basic __setitem__ for all test cases

* WIP: fixing advanced indexing

* REMOVE: printing in tests

* Re-implement advanced indexing with None and Ellipsis

* Fix lint errors

* WIP: fix basic indexing

* WIP: fix basic indexing

* TEMP: print statements in tests

* Fix op.slice with step<0 and end==-1

* Implement copy-free general contiguous indexing

* Improve doctest of __getitem__

* Fix missing staticmethod

* Remove superfluous _at and _slice

* Fix lint errors

* WIP: basic indexing

* Remove print statements from tests

* Fix call into op.slice in basic indexing, add doctest

* Print failing index in setitem tests

* Simplify implementation of advanced index broadcasting

* Better printing for failing setitem tests

* Remove list indexing restriction, fix value shape broadcasting

* Fix bad string formatting

* Fix bug in test_uniform

* Test mutability of sliced array if contiguous

* Fix whitespace error in matrix_op-inl.h

* "Fix" pylint complaints

* Temporarily disable failing unittests

* Silence another pylint complaint

* Fix size-0 array creation

* Make scalar tensor assignment test check for IndexError

* Re-activate operator tests with 0-size arrays

* Use np.compat in tests with zeros in shape or empty shape

* Change comment in autograd indexing test

* Add more None-containing index tuples to indexing test

* Disable None in advanced indexing test since it has not been supported

* Fix sanity

* Fix ci

* Fix unit test failure

* Fix __getitem__
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
NDArray pr-awaiting-testing PR is reviewed and waiting CI build and test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Python] __getitem__ copies too often [Python]: Indexing with None and Ellipsis