Linear transformation GPU kernel #1262

szalpal · 2019-09-16T16:12:54Z

Why do we need this PR?

It adds a new feature that is needed for refactoring the HSV manipulation operator.

What happened in this PR?

Added the LinearTransformation kernel, plus minor refactoring.

szalpal · 2019-09-16T16:43:40Z

!build

dali-automaton · 2019-09-16T16:49:35Z

CI MESSAGE: [902327]: BUILD STARTED

dali-automaton · 2019-09-16T18:38:48Z

CI MESSAGE: [902327]: BUILD PASSED

dali/kernels/algebra/linear_transformation_gpu.h

banasraf · 2019-09-17T11:15:17Z

dali/kernels/algebra/linear_transformation_gpu.h

+        for (int i = 0; i < M; i++) {
+          val += sample.transformation_matrix.at(o, i) * in(x, y, i);
+        }
+        out(x, y, o) = val;


I'm not sure you need to reimplement the matrix-vector multiplication.
Maybe something like this:

auto *in_vec = reinterpret_cast<vec<InputType, M>*>(&in(x, y, 0));
auto *out_vec = reinterpret_cast<vec<OutputType, N>*>(&out(x, y, 0));
*out_vec = sample.transformation_matrix * (*in_vec);

Or, instead of these reinterpret_casts, you could just copy the memory to temporary vecs.

@banasraf

This would only work with a dense HWC layout.

We can reinterpret the input, but for the output we need to apply a proper conversion anyway.
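
For illustration, the "copy to temporaries" variant discussed above could look roughly like the sketch below. It is plain host-side C++ rather than the kernel from this PR, and `load_pixel`, `mat_vec`, `store_pixel` and the `channel_stride` parameter are hypothetical names, not DALI API. The stride models a layout where the channels of a pixel are not adjacent (the case in which a reinterpret_cast would not work), and the store step shows the conversion that the output needs regardless.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not DALI API: gathers the M channels of one pixel into a
// temporary array. channel_stride is 1 for dense HWC; any other value models a
// layout where the channels of a pixel are not adjacent in memory.
template <std::size_t M, typename In>
std::array<float, M> load_pixel(const In *pixel, std::ptrdiff_t channel_stride) {
  assert(pixel != nullptr);
  std::array<float, M> v{};
  for (std::size_t i = 0; i < M; i++)
    v[i] = static_cast<float>(pixel[static_cast<std::ptrdiff_t>(i) * channel_stride]);
  return v;
}

// Plain N x M matrix-vector product, accumulated in float.
template <std::size_t N, std::size_t M>
std::array<float, N> mat_vec(const float (&mat)[N][M], const std::array<float, M> &in) {
  std::array<float, N> out{};
  for (std::size_t o = 0; o < N; o++)
    for (std::size_t i = 0; i < M; i++)
      out[o] += mat[o][i] * in[i];
  return out;
}

// Writes the result back channel by channel, rounding and clamping to uint8_t.
// This is the conversion step that a reinterpret_cast of the output would skip.
template <std::size_t N>
void store_pixel(uint8_t *pixel, std::ptrdiff_t channel_stride,
                 const std::array<float, N> &v) {
  assert(pixel != nullptr);
  for (std::size_t o = 0; o < N; o++) {
    float r = std::round(v[o]);
    r = r < 0.f ? 0.f : (r > 255.f ? 255.f : r);
    pixel[static_cast<std::ptrdiff_t>(o) * channel_stride] = static_cast<uint8_t>(r);
  }
}
```

A caller would chain the three steps per pixel, e.g. `store_pixel(out_ptr, stride, mat_vec(mat, load_pixel<3>(in_ptr, stride)))`, where `in_ptr`, `out_ptr` and `stride` are whatever the surrounding loop provides.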

dali/kernels/CMakeLists.txt

dali/kernels/algebra/linear_transformation_gpu.h

include/dali/core/dev_array.h

mzient

Many issues.

- Incomplete functionality (M*x + B - missing B)
- Implementation bugs:
  - input pointers don't have proper ROI offsets
  - number of channels should be fixed, based on matrix shape
  - loss of data when accumulating to OutputType (should be float)
- Misleading namespace - "algebra" suggests something more than a pixelwise operation

...and more problems.
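
To make two of the points above concrete (the missing B term and the loss of data when accumulating in OutputType), here is a minimal host-side sketch. It is not the kernel from this PR; `convert_sat`, `affine_pixel` and `affine_pixel_lossy` are hypothetical names chosen for the example, and the channel counts are taken from the matrix shape via template parameters.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical conversion helper: rounds and clamps for integral outputs, plain
// cast for floating-point ones. Intended for types whose range float can
// represent exactly (e.g. 8- and 16-bit integers).
template <typename Out>
Out convert_sat(float v) {
  if (std::is_integral<Out>::value) {
    const float lo = static_cast<float>(std::numeric_limits<Out>::min());
    const float hi = static_cast<float>(std::numeric_limits<Out>::max());
    v = std::round(v);
    v = v < lo ? lo : (v > hi ? hi : v);
  }
  return static_cast<Out>(v);
}

// Affine transform of one pixel: out = mat * in + bias.
// N and M are deduced from the matrix shape, so the channel counts are fixed.
template <typename Out, typename In, std::size_t N, std::size_t M>
void affine_pixel(Out (&out)[N], const In (&in)[M],
                  const float (&mat)[N][M], const float (&bias)[N]) {
  for (std::size_t o = 0; o < N; o++) {
    float acc = bias[o];  // accumulate in float, not in Out
    for (std::size_t i = 0; i < M; i++)
      acc += mat[o][i] * static_cast<float>(in[i]);
    out[o] = convert_sat<Out>(acc);  // convert once, at the very end
  }
}

// Counter-example: accumulating in Out truncates every partial sum.
// Only meaningful while all partial sums stay within the range of Out.
template <typename Out, typename In, std::size_t N, std::size_t M>
void affine_pixel_lossy(Out (&out)[N], const In (&in)[M],
                        const float (&mat)[N][M], const float (&bias)[N]) {
  for (std::size_t o = 0; o < N; o++) {
    Out acc = static_cast<Out>(bias[o]);
    for (std::size_t i = 0; i < M; i++)
      acc = static_cast<Out>(static_cast<float>(acc) +
                             mat[o][i] * static_cast<float>(in[i]));
    out[o] = acc;
  }
}
```

With a uint8_t pixel {1, 1, 1}, a single matrix row {0.4, 0.4, 0.4} and a bias of 0.5, the float accumulator reaches about 1.7 and is stored as 2, while the lossy variant truncates each partial sum to 0 and stores 0.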

szalpal · 2019-09-18T14:04:46Z

!build

dali-automaton · 2019-09-18T14:10:05Z

CI MESSAGE: [905830]: BUILD STARTED

dali-automaton · 2019-09-18T15:32:40Z

CI MESSAGE: [905830]: BUILD PASSED

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>

mzient · 2019-09-27T09:04:06Z

dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu

+  ASSERT_EQ(mat.rows * mat.cols * mat.channels(), res.first.num_elements())
+                        << "Number of elements doesn't match";
+  auto ptr = reinterpret_cast<typename TypeParam::Out *>(mat.data);
+  for (int i = 0; i < res.first.num_elements(); i++) {


I really don't like this loop: you're flattening the shape, which is not a good idea when considering ROIs.
Why don't you create a TensorView and use the Check function, which will print the failing coordinates and limit the output size in case of some total disaster?

I used OpenCV to calculate the proper ROI. AFAIK, there's no easy way to create a BB-ed TensorView (which would be a good idea, but not for this PR).

You can create a TensorView back from the cv::Mat and use Check function.
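
As a rough illustration of the kind of comparison being asked for here (coordinates of the failing elements, capped output), the sketch below uses a small stand-in view type. `View3D` and `check_equal` are hypothetical names for this example only; they are not DALI's TensorView or its Check function, whose actual signatures are not shown in this thread. The explicit row stride is an assumption that lets an ROI inside a larger buffer be compared without flattening it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a tensor view: HWC data with an explicit row
// stride, so a region of interest inside a larger image needs no copy.
template <typename T>
struct View3D {
  const T *data;
  int h, w, c;
  std::ptrdiff_t row_stride;  // elements between the starts of consecutive rows

  const T &at(int y, int x, int ch) const {
    return data[y * row_stride + static_cast<std::ptrdiff_t>(x) * c + ch];
  }
};

// Compares two views element by element. Prints the coordinates of at most
// max_reports mismatches and returns the total number of mismatches,
// or -1 if the shapes differ.
template <typename A, typename B>
int check_equal(const View3D<A> &a, const View3D<B> &b, int max_reports = 10) {
  if (a.h != b.h || a.w != b.w || a.c != b.c) {
    std::fprintf(stderr, "Shape mismatch\n");
    return -1;
  }
  int errors = 0;
  for (int y = 0; y < a.h; y++) {
    for (int x = 0; x < a.w; x++) {
      for (int ch = 0; ch < a.c; ch++) {
        if (a.at(y, x, ch) != b.at(y, x, ch)) {
          if (errors < max_reports)
            std::fprintf(stderr, "Mismatch at (y=%d, x=%d, c=%d)\n", y, x, ch);
          errors++;
        }
      }
    }
  }
  return errors;
}
```

In a test, the returned count could be asserted to be zero, so a failure reports where the data differs instead of only a flat element index.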

dali/kernels/imgproc/roi.h

dali/kernels/imgproc/pointwise/linear_transformation_gpu.h

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>

dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu

dali/kernels/imgproc/pointwise/linear_transformation_gpu.h

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>

dali/kernels/imgproc/pointwise/linear_transformation_gpu.h

dali/kernels/test/tensor_test_utils.h

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>

mzient · 2019-10-01T13:54:54Z

dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu

+  Out *output_;
+  std::vector<In> input_host_;
+  std::vector<float> ref_output_;
+  std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}};


Why not:

Suggested change

-  std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}};
+  TensorListShape<kNDims> in_shapes_ = {{{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}}};

dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>

szalpal · 2019-10-01T14:18:36Z

!build

jantonguirao · 2019-10-01T14:20:23Z

dali/kernels/imgproc/pointwise/linear_transformation_gpu.h

+
+namespace dali {
+namespace kernels {
+namespace lin_trans {


lin_trans doesn't tell me anything. Use linear_transform at least.

dali-automaton · 2019-10-01T14:20:24Z

CI MESSAGE: [925087]: BUILD STARTED

jantonguirao · 2019-10-01T14:25:54Z

dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu

+  }
+};
+
+using TestTypes = std::tuple<float>;


Please add at least one uint type to the tested types.

dali-automaton · 2019-10-01T20:34:57Z

CI MESSAGE: [925087]: BUILD PASSED

* Linear transformation GPU kernel

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
Signed-off-by: Jianjun Liu <00liujj@163.com>

szalpal changed the title ~~[WIP] Linear transformation GPU kernel~~ Linear transformation GPU kernel Sep 16, 2019