Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tensor indexing #3195

Merged
merged 24 commits into from Aug 3, 2021
Merged

Tensor indexing #3195

merged 24 commits into from Aug 3, 2021

Conversation

mzient
Copy link
Contributor

@mzient mzient commented Jul 28, 2021

Signed-off-by: Michał Zientkiewicz mzient@gmail.com

Description

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactoring (Redesign of existing code that doesn't affect functionality)
  • Other (e.g. Documentation, Tests, Configuration)

What happened in this PR

Added NumPy-like tensor indexing operator:

  • the operator supports up to 32 dimensions with at, lo, hi (and in the future also step) arguments.
  • the arguments are passed separately for dimensions
  • adds __getitem__ to DataNode which instantiates subscript operator
  • adds dali.newaxis and dali.newaxis(name)
  • fixes a bug in TensorVector: stale type in sample views when only the type changes in Resize
  • fixes a bug in ExpandDims: incorrenct layout handling for 0D input

Additional information

Affected modules and functionalities:
  • DataNode python object
  • Operators: added a new operator
  • ExpandDims - bug fix
  • TensorVector - bug fix
Key points relevant for the review:
  • Indexing corner cases.

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: SUBSCR.* except SUBSCR.06

JIRA TASK: 1617

@mzient mzient force-pushed the SubscriptOp branch 2 times, most recently from b2f5f28 to 5223763 Compare July 28, 2021 23:14
@mzient mzient marked this pull request as ready for review July 28, 2021 23:40
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2659393]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2659393]: BUILD FAILED

@JanuszL JanuszL self-assigned this Jul 29, 2021
@klecki klecki self-assigned this Jul 29, 2021
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2661776]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2661776]: BUILD PASSED

Comment on lines +464 to +465
} else if (IsValidType(tl_->type())) {
tensors_[idx]->set_type(tl_->type());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kept stale type in the views if the pointer or size didn't change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have Reset above, so it happened only with empty TV that got a type?

Copy link
Contributor Author

@mzient mzient Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happend with something that had non-empty shape and type but 0 volume. And it was obviously a problem in TensorVector, since it had internally inconsistent type (TV had no_type and TLV had a proper type).

@mzient mzient requested a review from JanuszL July 29, 2021 13:47
mzient added 11 commits July 29, 2021 15:51
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
* Add layout handling
* Add Python front-end for slicing

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
… set on views when pointer or size is not changed.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…istent args.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Comment on lines 102 to 103
DALI_FAIL(make_string("The subscriptg for dimension ", i,
" is specified both as an index and as a range."));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DALI_FAIL(make_string("The subscriptg for dimension ", i,
" is specified both as an index and as a range."));
DALI_FAIL(make_string("The subscriptg for dimension ", i,
" should not be specified both as an index and as a range."));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

must not ;)

}
if (step.IsDefined()) {
DALI_FAIL(make_string("The subscriptg for dimension ", i,
" is specified as an index - it cannot have a step."));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
" is specified as an index - it cannot have a step."));
" should not be specified as an index and have have a step."));



if (s.IsDefined()) {
num_subscripts = i + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
num_subscripts = i + 1;
max_subscripts = i + 1;

As it is not a number because some subscripts may be not defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixed feelings here. It's used as the number (see subscripts_.resize(num_subscripts);). I would gladly use input ndim here if I had it - but it's not available at construction.

void GetRanges(const workspace_t<Backend> &ws, const TensorListShape<> &in_shape) {
int nsub = subscripts_.size();
int ndim = in_shape.sample_dim();
DALI_ENFORCE(ndim >= nsub, "Too many indices.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DALI_ENFORCE(ndim >= nsub, "Too many indices.");
DALI_ENFORCE(ndim >= nsub, make_string("Too many indices. Received: ", nsub, ", while input has only " ndim, " dimensions.");

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@lgtm-com
Copy link
Contributor

lgtm-com bot commented Aug 2, 2021

This pull request introduces 1 alert when merging 981923a into db01018 - view on LGTM.com

new alerts:

  • 1 for Unused import

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2681926]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2681926]: BUILD FAILED

@@ -1609,6 +1609,21 @@ def my_pipe():

p = my_pipe(device_id=0, seed=1234, num_threads=3, set_affinity=True, py_num_workers=3)

def test_not_iterable():
import nvidia.dali._utils.hacks as hacks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why hacks? Workarounds?

assert np.array_equal(ref, cpu.at(i))
assert np.array_equal(ref, gpu.as_cpu().at(i))

def test_constant_ranges():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it repeats L56 and hides it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some copy paste... I'll take a look.

import nvidia.dali.fn
if len(slice_args) == 0:
if len(new_axes) == 0 or new_axes[-1] < len(idxs):
print("Adding dim check for ", len(idxs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these prints?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely not. These are remnants from initial testing - I wanted to be 100% sure that the code follows the intended path and that the right checks fire when expected.

* Fix some tests.
* Fix review issues.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2687330]: BUILD STARTED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2687741]: BUILD STARTED

@NVIDIA NVIDIA deleted a comment from dali-automaton Aug 3, 2021
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2687330]: BUILD FAILED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2687788]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [2687788]: BUILD PASSED

@mzient mzient merged commit df1e6c3 into NVIDIA:main Aug 3, 2021
@JanuszL JanuszL mentioned this pull request Sep 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants