Added Stride to Subscript and Slice Kernel #5007

5had3z · 2023-08-20T03:41:49Z

Category:

New Features

Description:

Added striding to the subscript operator by implementing a step in the slice kernel and adding the corresponding slice parameters.

Additional information:

Affected modules and functionalities:

The main changes are to the slice operator and kernels, as well as the test suite.

Key points relevant for the review:

Removed dimension flattening and pre-anchoring of the slice kernel, as keeping them would further complicate the logic or result in more duplicated "specialised" kernels. I would conjecture that for a memory-bandwidth-limited kernel such as rearranging/copying data, a little more integer logic shouldn't have too much of an adverse effect.
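To make the trade-off concrete, here is a minimal Python sketch of a flat-index strided copy. It is not the DALI kernel: the function names `row_major_strides` and `strided_slice_flat` are hypothetical, and a dense row-major input is assumed. Each input stride is multiplied by the step and the anchor is folded into a base offset, so the extra integer work per element is one divmod and one multiply-add per dimension.

```python
from math import prod


def row_major_strides(shape):
    """Element strides of a dense row-major array with the given shape."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def strided_slice_flat(data, in_shape, anchor, out_shape, step):
    """Copy a strided slice out of a flat row-major buffer (illustrative only).

    anchor[d] is the first input index taken along dimension d,
    step[d] is the (possibly negative) distance between taken indices,
    out_shape[d] is the number of elements taken along dimension d.
    """
    in_strides = row_major_strides(in_shape)
    out_strides = row_major_strides(out_shape)
    # Multiply the step into the input stride once, outside the element loop.
    eff_strides = [s * st for s, st in zip(in_strides, step)]
    # The anchor contributes a constant base offset.
    base = sum(a * s for a, s in zip(anchor, in_strides))

    out = []
    for flat in range(prod(out_shape)):
        offset, rem = base, flat
        for d in range(len(out_shape)):
            coord, rem = divmod(rem, out_strides[d])
            offset += coord * eff_strides[d]
        out.append(data[offset])
    return out


# Rows 3 and 1, columns 0, 2 and 4 of a 4x5 array stored flat.
picked = strided_slice_flat(list(range(20)), (4, 5), (3, 0), (2, 3), (-2, 2))
```

With a negative step the anchor is the highest index taken along that dimension, so every offset stays inside the buffer.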

Check whether test_constant_ranges has sufficient coverage of stride permutations, rather than focusing on the logic in subscript.h (maybe I should've looked at numpy's source code for their range conditions, but I seem to match it on final value).
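As a reference point for the range conditions, the sketch below computes how many elements a start:end:step range selects along one axis. It relies on Python's built-in `slice.indices` for clamping, which follows the same rules as indexing a Python sequence. The helper name `strided_extent` is hypothetical and the closed-form expressions are a restatement for illustration, not the code in subscript.h.

```python
def strided_extent(start, stop, step, size):
    """Number of elements selected by start:stop:step on an axis of length size."""
    if step == 0:
        raise ValueError("slice step cannot be zero")
    # slice.indices clamps start/stop to the axis and resolves None and negatives.
    lo, hi, st = slice(start, stop, step).indices(size)
    if st > 0:
        # ceil((hi - lo) / st), clamped at zero for empty ranges
        return max(0, (hi - lo + st - 1) // st)
    # ceil((lo - hi) / -st), clamped at zero for empty ranges
    return max(0, (lo - hi - st - 1) // (-st))


# ::-2 on an axis of length 5 selects indices 4, 2 and 0.
count = strided_extent(None, None, -2, 5)
```

A brute-force comparison against `len(range(size)[start:stop:step])` over a grid of start, stop, step and size values is a cheap way to check coverage of the sign and clamping permutations.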

All the "formatting" changes were done with clang-format and black.

Tests:

Added tests with different start : end : step in operator_2.test_subscript and checked for parity with numpy on both gpu and cpu mats. All operator_2 tests pass.

Checklist

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: N/A

5had3z · 2023-08-20T03:54:42Z

If there are performance concerns with the simplification of the gpu kernel, could you please instruct me how to efficiently use nsight to analyse kernels in this project?

mzient · 2023-08-21T09:52:45Z

Hello @5had3z
Thank you for your contribution!
I have one request regarding this PR - please separate the concerns and create a standalone PR with the .devcontainer. This will streamline the review process.

5had3z · 2023-08-21T10:28:10Z

Done, reverted all changes not pertaining to striding/stepping, will open new one for devcontainer.

dali/kernels/slice/slice_gpu.cuh

jantonguirao

Looks good to me. Thank you very much for your contribution!

dali/kernels/slice/slice_gpu.cuh

dali-automaton · 2023-08-22T09:55:25Z

CI MESSAGE: [9465267]: BUILD STARTED

dali-automaton · 2023-08-22T11:34:22Z

CI MESSAGE: [9465267]: BUILD FAILED

dali/kernels/slice/slice_kernel_utils.h

mzient

The tests fail due to incorrectly initialized step.

mzient · 2023-08-22T12:54:58Z

@5had3z Failing tests aside, could you also rebase your branch on latest main?

jantonguirao · 2023-08-22T12:58:39Z

@5had3z to reproduce the test failures, you can run:

./dali/python/nvidia/dali/test/dali_kernel_test.bin --gtest_filter=*SliceGPU*

Thanks

… 1 slicekernel uses strides to iterate Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

… anchor and stride modification), fix regression subscript.cc bad types, fix hi/lo anchor logic for negative stride, passes operator_2 tests, temp ommit gpu indexing test Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

…work properly, all tests passing Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

5had3z · 2023-08-22T14:08:39Z

@mzient I realised that UnitCubeShape wasn't actually a thing after I tried to pull and build, so I added TensorShape<>::filled_shape (and changed ::empty_shape to call it with value=0). I don't want to break the remote by synchronising my local, so I'm re-cloning, making the changes and pushing again. Will take a while to rebuild and test.

mzient · 2023-08-22T14:20:12Z

> @mzient I realised that UnitCubeShape wasn't actually a thing after I tried to pull and build, so I added TensorShape<>::filled_shape (and changed ::empty_shape to call it with value=0). I don't want to break the remote by synchronising my local, so I'm re-cloning, making the changes and pushing again. Will take a while to rebuild and test.

If you don't have new local changes, you can get the latest remote by doing:

# assuming that your fork's remote name is 5had3z
git fetch 5had3z
# assuming that your current branch is feat/strided-slice
git reset --hard 5had3z/feat/strided-slice

It's easier than re-cloning and won't trigger quite as lengthy rebuilds.

5had3z · 2023-08-22T14:30:26Z

> > @mzient I realised that UnitCubeShape wasn't actually a thing after I tried to pull and build, so I added TensorShape<>::filled_shape (and changed ::empty_shape to call it with value=0). I don't want to break the remote by synchronising my local, so I'm re-cloning, making the changes and pushing again. Will take a while to rebuild and test.
>
> If you don't have new local changes, you can get the latest remote by doing:
>
> # assuming that your fork's remote name is 5had3z
> git fetch 5had3z
> # assuming that your current branch is feat/strided-slice
> git reset --hard 5had3z/feat/strided-slice
>
> It's easier than re-cloning and won't trigger quite as lengthy rebuilds.

It's alright, I'm trying to watch The Sopranos at the same time 😂. Build isn't too bad on a 5950X if you restrict nvcc to sm86. Running tests now.

5had3z · 2023-08-22T14:34:02Z

> if you restrict nvcc to sm86

Speaking of, upping CMake to >=3.24 to take advantage of easily switching between CUDA arch "all" and "native" could be useful.
https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html

mzient · 2023-08-22T14:41:31Z

> if you restrict nvcc to sm86
>
> Speaking of, upping CMake to >=3.24 to take advantage of easily switching between CUDA arch "all" and "native" could be useful. https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html

We already do have a CMake switch for it, e.g. -DCUDA_TARGET_ARCHS="75;80". AFAIK we populate a native CMake argument based on that list, with -real for all listed and -virtual for the oldest one.

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

dali-automaton · 2023-08-22T15:39:00Z

CI MESSAGE: [9467364]: BUILD FAILED

include/dali/core/tensor_shape.h

mzient · 2023-08-22T16:52:10Z

include/dali/core/tensor_shape.h

    assert(dim > 0);
    TensorShape<> result;
    result.resize(dim);
+    // TODO: should use std::fill but lack .begin() and .end()


Please remove this comment. Firstly, TensorShape does have begin and end, as clearly indicated by the following ranged for. Secondly, at least in my opinion, assigning elements in a ranged for reads better than the very verbose call to std::fill (STL ranges change that but we're not there yet with the standard support).

My bad, LSP complained it didn't exist and I wasn't thinking carefully (classic 1am moment).
I plan to play with trying to compile with cxx_std_23 later.

include/dali/core/tensor_shape.h

mzient

@5had3z Sorry, you've touched a very fundamental header (tensor_shape.h) so the changes need to be made very carefully. On the bright side, I've discovered some long-standing bugs and we have an opportunity to fix them now.

…ary template for dynamic specialisation, remove wrong comment Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

include/dali/core/tensor_shape.h

dali-automaton · 2023-08-23T08:47:07Z

CI MESSAGE: [9479881]: BUILD STARTED

dali-automaton · 2023-08-23T10:26:40Z

CI MESSAGE: [9479881]: BUILD FAILED

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

dali-automaton · 2023-08-23T13:03:57Z

CI MESSAGE: [9482171]: BUILD STARTED

dali-automaton · 2023-08-23T14:51:59Z

CI MESSAGE: [9482171]: BUILD PASSED

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

dali-automaton · 2023-08-23T16:23:34Z

CI MESSAGE: [9484228]: BUILD STARTED

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

dali-automaton · 2023-08-23T16:58:20Z

CI MESSAGE: [9484655]: BUILD STARTED

dali-automaton · 2023-08-23T18:57:58Z

CI MESSAGE: [9484655]: BUILD PASSED

Enables numpy style slicing with strides to tensor subscript operator by supporting a `steps` member to slice params. Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au> Co-authored-by: Michal Zientkiewicz <michalz@nvidia.com>

jantonguirao assigned jantonguirao and mzient Aug 21, 2023

5had3z changed the title ~~Added Stride to Subscript and Slice Kernel + .devcontainer~~ Added Stride to Subscript and Slice Kernel Aug 21, 2023

5had3z mentioned this pull request Aug 21, 2023

Added Devcontainer #5010

Closed

18 tasks

mzient reviewed Aug 21, 2023

dali/kernels/slice/slice_gpu.cuh (outdated)

mzient reviewed Aug 21, 2023

dali/kernels/slice/slice_gpu.cuh (outdated)

mzient reviewed Aug 21, 2023

dali/kernels/slice/slice_gpu.cuh

jantonguirao reviewed Aug 21, 2023

dali/kernels/slice/slice_gpu.cuh

jantonguirao approved these changes Aug 21, 2023

mzient reviewed Aug 21, 2023

dali/kernels/slice/slice_gpu.cuh

mzient reviewed Aug 22, 2023

dali/kernels/slice/slice_kernel_utils.h (outdated)

mzient requested changes Aug 22, 2023

5had3z added 11 commits August 22, 2023 15:32

add docker-build folder to gitignore, clang-format slice_cpu.h, level…

0152190

… 1 slicekernel uses strides to iterate Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

clang-format + black format, removed slice notimpl errors

a9f7387

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

add step to slice args, multiply in stride by step, clang-format

898ba89

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

update defauly pyver and add new runtime images

d2cc8e6

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

fix build script

a936d16

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

added .devcontainer and dockerfile

c84adfb

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

remove deps post-compile, move pre-commit install

53fce55

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

step > 1 works (-ve not), add nsight to devctr

734f5cd

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

Add nvjpeg2k and nvcomp to image

170df2f

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

remove dimension inlining and anchor embedding to enable stepping to …

996d88d

…work properly, all tests passing Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

added helper function to TensorShape to create filled tensor

948639d

Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

mzient reviewed Aug 22, 2023

include/dali/core/tensor_shape.h (outdated)

mzient reviewed Aug 22, 2023

include/dali/core/tensor_shape.h (outdated)

mzient reviewed Aug 22, 2023

include/dali/core/tensor_shape.h (outdated)

mzient reviewed Aug 22, 2023

include/dali/core/tensor_shape.h (outdated)

mzient reviewed Aug 22, 2023

include/dali/core/tensor_shape.h (outdated)

mzient requested changes Aug 22, 2023

fix assertions, fix missing template param for ndim, removed unnessec…

fe0ac97

…ary template for dynamic specialisation, remove wrong comment Signed-off-by: Bryce Ferenczi <frenzi@hotmail.com.au>

5had3z commented Aug 23, 2023

include/dali/core/tensor_shape.h

mzient approved these changes Aug 23, 2023

Simplify step alongside anchor and shape.

c9ac7a8

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Add a targetted test for collapsing untouched dims.

8e372b2

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

Restore formatting and comments.

6d0fcc0

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>

jantonguirao merged commit bc133ec into NVIDIA:main Aug 24, 2023
2 of 3 checks passed

5had3z deleted the feat/strided-slice branch August 24, 2023 09:50
