
TensorJoin kernel for CPU #2301

Merged
merged 9 commits into from
Oct 6, 2020
Conversation

@szalpal (Member) commented Sep 28, 2020

Why do we need this PR?

  • It adds a new feature: stacking or concatenating tensors

The concatenation operation creates a new tensor with values joined along an existing dimension, e.g.

arr0 = [[1, 2, 4, 2], [1, 1, 7, 6], [6, 8, 8, 4]]
shape = (3, 4)

arr1 = [[3, 8, 8, 6], [8, 1, 5, 7], [6, 2, 7, 5]]
shape = (3, 4)

concatenate([arr0, arr1], axis=1) ->
[[1, 2, 4, 2, 3, 8, 8, 6],
 [1, 1, 7, 6, 8, 1, 5, 7],
 [6, 8, 8, 4, 6, 2, 7, 5]]
shape = (3, 8)

Stacking, on the other hand, creates a new tensor with an added dimension at the position indicated by axis.

stack([arr0, arr1], axis=1) ->
[[[1, 2, 4, 2],
  [3, 8, 8, 6]],

 [[1, 1, 7, 6],
  [8, 1, 5, 7]],

 [[6, 8, 8, 4],
  [6, 2, 7, 5]]]
shape = (3, 2, 4)

Note that the memory layout is the same in both modes; only the output shape changes.
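The layout claim can be checked directly with NumPy (used here purely for illustration; the PR itself implements this as a DALI CPU kernel):

```python
# Illustration of the claim above using NumPy (not part of the PR):
# concatenating along axis=1 and stacking at axis=1 differ only in shape,
# not in the underlying (row-major) memory layout.
import numpy as np

arr0 = np.array([[1, 2, 4, 2], [1, 1, 7, 6], [6, 8, 8, 4]])
arr1 = np.array([[3, 8, 8, 6], [8, 1, 5, 7], [6, 2, 7, 5]])

cat = np.concatenate([arr0, arr1], axis=1)  # shape (3, 8)
stk = np.stack([arr0, arr1], axis=1)        # shape (3, 2, 4)

# Same flat data, different shape.
print(np.array_equal(cat.ravel(), stk.ravel()))  # True
```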

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal changed the title from "[WIP] TensorJoin kernel for CPU" to "TensorJoin kernel for CPU" Sep 28, 2020
@szalpal (Member Author) commented Sep 28, 2020

!build

@dali-automaton (Collaborator):

CI MESSAGE: [1659611]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [1659611]: BUILD FAILED

void Run(KernelContext &ctx, const OutTensorCPU<T, output_dims> &out,
         span<const InTensorCPU<T, dims>> in) {
  if (in.size() != n_input_tensors_) {
Contributor:

How likely is that? Do we need to check it?

Member Author:

In the event the user provides an input to Run that has more tensors than were provided in Setup, we'll get a segfault. That's why I wanted to add this check.

Contributor:

I'd say that it is assumed that this won't happen in a well-written program. Should we make it an assert instead? (just a suggestion though)

Member Author:

Again, I agree that in a well-written program it wouldn't happen. But in case the program is not well-written, a segfault might occur, possibly in some really strange situation.

I guess that's an open question for us: how much error-checking we would like to have between Setup and Run calls (I don't remember discussing it before).

Contributor:

That's why we have asserts: to check logic errors. If we were to check every error this way, we would end up with extremely verbose code.
I think that assuming the same input for Setup and Run is OK, and an assert would suffice for error checking.
I think we have this assumption all around our code base.
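The contract being discussed — full validation in Setup, a debug-only assert in Run — can be sketched like this (a hypothetical Python illustration; the real kernel is C++ and the names here are invented for the sketch):

```python
# Hypothetical sketch of the Setup/Run contract discussed above (the real
# kernel is C++; the class and method names are illustrative only).

class TensorJoinKernelSketch:
    def setup(self, in_shapes):
        # Setup validates inputs and remembers how many tensors to expect.
        self.n_input_tensors = len(in_shapes)

    def run(self, inputs):
        # A logic error (different inputs passed to Setup and Run) is caught
        # by an assert rather than a user-facing runtime check.
        assert len(inputs) == self.n_input_tensors, \
            "Run must be called with the same inputs as Setup"
```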

Comment on lines +64 to +65
vector<vector<int>> arr = {{6, 8, 5, 1, 3, 5, 1, 6, 8, 3, 7, 5},
{4, 5, 1, 8, 4, 4, 1, 4, 1, 7, 6, 6}};
Contributor:

Suggested change
vector<vector<int>> arr = {{6, 8, 5, 1, 3, 5, 1, 6, 8, 3, 7, 5},
{4, 5, 1, 8, 4, 4, 1, 4, 1, 7, 6, 6}};
vector<vector<int>> arr = {{
100, 101, 102, 103,
104, 105, 106, 107,
108, 109, 110, 112,
}, {
200, 201, 202, 203,
204, 205, 206, 207,
208, 209, 210, 212,
}};

Comment on lines +131 to +132
vector<vector<int>> arr = {{100, 101, 102, 110, 111, 112, 120, 121, 122, 130, 131, 132},
{200, 201, 202, 210, 211, 212, 220, 221, 222, 230, 231, 232}};
Contributor:

Sorry for being nitpicky, but... can't you format these into 3x4 arrays? It would be so much easier to read.

KernelRequirements Setup(KernelContext &ctx, span<const TensorShape<dims>> in_shapes, int axis) {
  n_input_tensors_ = in_shapes.size();
  auto ndims = in_shapes[0].sample_dim();
  DALI_ENFORCE(axis >= -ndims + !new_axis && axis <= ndims - !new_axis,
@mzient (Contributor) Oct 5, 2020:

Suggested change
DALI_ENFORCE(axis >= -ndims + !new_axis && axis <= ndims - !new_axis,
DALI_ENFORCE(axis >= 0 && axis <= ndims - !new_axis

Member Author:

Done

Comment on lines 119 to 120
make_string("Incorrect axis. Actual: ", axis, ". Expected in [",
-ndims + !new_axis, ", ", ndims - !new_axis, "] interval (",
@mzient (Contributor) Oct 5, 2020:

Suggested change
make_string("Incorrect axis. Actual: ", axis, ". Expected in [",
-ndims + !new_axis, ", ", ndims - !new_axis, "] interval (",
make_string("Incorrect axis index: ", axis, ". Must be between 0 and ", ndims - !new_axis, "."));

Member Author:

Corrected

  make_string("Incorrect axis. Actual: ", axis, ". Expected in [",
              -ndims + !new_axis, ", ", ndims - !new_axis, "] interval (",
              new_axis ? "STACK" : "CONCAT", " mode)"));
  axis_ = axis >= 0 ? axis : ndims + axis + new_axis;
Contributor:

This kind of Python logic should not make its way to the kernels.

Member Author:

Done

}


int axis_, n_input_tensors_;
Contributor:

Suggested change
int axis_, n_input_tensors_;
int axis_ = -1, n_input_tensors_ = -1;

Member Author:

Done

}
///@}

static constexpr int output_dims = (dims == DynamicDimensions ? DynamicDimensions :
Contributor:

I'd move this definition to the top (line 109)

Member Author:

I've intentionally put it here, following the "put your definition closest to the point of use" policy.

Contributor:

Normally we don't write definitions in between member functions. IMHO it makes this definition hard to find. Anyway, not pushing.

KernelRequirements Setup(KernelContext &ctx, span<const TensorShape<dims>> in_shapes, int axis) {
  n_input_tensors_ = in_shapes.size();
  auto ndims = in_shapes[0].sample_dim();
  DALI_ENFORCE(axis >= -ndims + !new_axis && axis <= ndims - !new_axis,
Contributor:

As discussed in Slack, I believe that this kind of negative-indexing logic doesn't belong in the kernel layer. I'd make the kernel accept:
[0, ndims] for STACK
[0, ndims) for CONCAT
and leave negative indexing to the operator.

Member Author:

Done

axis_ = axis >= 0 ? axis : ndims + axis + new_axis;
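The operator-layer normalization agreed on in this thread — the kernel accepts only non-negative axes, and the operator translates negative ones — might look like this sketch (Python for brevity; the function name is hypothetical):

```python
def normalize_axis(axis, ndims, new_axis):
    # Valid normalized positions: [0, ndims] for STACK (new_axis=True),
    # because stacking inserts a dimension; [0, ndims) for CONCAT.
    span = ndims + 1 if new_axis else ndims
    if axis < 0:
        axis += span
    if not 0 <= axis < span:
        raise ValueError(f"Incorrect axis index: {axis}. "
                         f"Must be between 0 and {span - 1}.")
    return axis
```

For example, normalize_axis(-1, 2, new_axis=True) maps to 2 (a new trailing dimension), while normalize_axis(-1, 2, new_axis=False) maps to 1 (the last existing dimension).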

{
const auto &ref = in_shapes[0];
Contributor:

I'd remove this scope, and move this line to 117 (so that you can take auto ndims = ref.sample_dim())

Member Author:

I'd prefer to retain the scope; it gives the function better structure. This way it's apparent which part of the function does error checking and which does the actual processing.

for (int j = 0; j < ref.size(); j++) {
  if (!new_axis) {
    DALI_ENFORCE(in_shapes[i][j] == ref.shape[j] || j == axis_, make_string(
        "Number of samples in every dimension (except the concatenated one) "
Contributor:

Suggested change
"Number of samples in every dimension (except the concatenated one) "
"CONCAT: Number of samples in every dimension (except the concatenated one) "

Member Author:

IMHO, the error would be clearer if the mode info were at the end. Also, a sole CONCAT or STACK isn't entirely clear; it should be accompanied by "mode" to remove any ambiguity. Given that, I'd prefer to stay with what I wrote in the first place, if you don't mind?

" has dimension ", in_shapes[i][j]));
} else {
DALI_ENFORCE(in_shapes[i][j] == ref.shape[j], make_string(
"Number of samples in every dimension must be the same (STACK mode). "
Contributor:

Suggested change
"Number of samples in every dimension must be the same (STACK mode). "
"STACK: Number of samples in every dimension must be the same. "


TensorShape<> sh1 = {4, 10, 7, 8};
TensorShape<> sh2 = {4, 5, 14, 8};
TensorShape<> sh3 = {4, 5, 7, 16};
EXPECT_EQ(impl::DetermineShape<false>(make_span(shin), 0), sh0);
Contributor:

Suggested change
EXPECT_EQ(impl::DetermineShape<false>(make_span(shin), 0), sh0);
EXPECT_EQ(impl::DetermineShape<false>(make_span(shin), 0), {8, 5, 7, 8});

and so on would read a bit easier (just a suggestion)

Member Author:

I wanted to emphasise which reference data goes with which dimension.
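For reference, the shape logic these tests exercise can be sketched as follows (a Python illustration, not the actual DetermineShape implementation; the input shapes here are inferred from the expected outputs in the test):

```python
def determine_shape(in_shapes, axis, new_axis):
    # CONCAT: the extent along `axis` is the sum over all inputs.
    # STACK: a new dimension of size len(in_shapes) is inserted at `axis`.
    ref = list(in_shapes[0])
    if new_axis:
        return ref[:axis] + [len(in_shapes)] + ref[axis:]
    ref[axis] = sum(s[axis] for s in in_shapes)
    return ref

# Two inputs of shape (4, 5, 7, 8), concatenated along axis 1:
print(determine_shape([(4, 5, 7, 8), (4, 5, 7, 8)], 1, False))  # [4, 10, 7, 8]
```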



TEST(TensorJoinCpuTest, ConcatenateTensorsTest) {
using namespace std; // NOLINT
Contributor:

Suggested change
using namespace std; // NOLINT
using std::vector;

If you really must

Member Author:

Done



TEST(TensorJoinCpuTest, StackKernelTest) {
using namespace std; // NOLINT
Contributor:

Suggested change
using namespace std; // NOLINT
using std::vector;

Member Author:

Done

@szalpal (Member Author) commented Oct 6, 2020

!build

@dali-automaton (Collaborator):

CI MESSAGE: [1679344]: BUILD STARTED



@mzient (Contributor) commented Oct 6, 2020

!build

@dali-automaton (Collaborator):

CI MESSAGE: [1679489]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [1679344]: BUILD FAILED

@dali-automaton (Collaborator):

CI MESSAGE: [1679489]: BUILD FAILED

@dali-automaton (Collaborator):

CI MESSAGE: [1679489]: BUILD PASSED

@szalpal merged commit e30b577 into NVIDIA:master Oct 6, 2020