Linear transformation GPU kernel #1262

Merged
merged 110 commits into from
Oct 2, 2019

Conversation

szalpal
Member

@szalpal szalpal commented Sep 16, 2019

Why do we need this PR?

  • It adds a new feature required by the refactoring of the HSV manipulation operator.

What happened in this PR?

  • Added the LinearTransformation kernel, plus minor refactoring.

@szalpal szalpal changed the title from "[WIP] Linear transformation GPU kernel" to "Linear transformation GPU kernel" Sep 16, 2019
@szalpal
Member Author

szalpal commented Sep 16, 2019

!build

@dali-automaton
Collaborator

CI MESSAGE: [902327]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [902327]: BUILD PASSED

for (int i = 0; i < M; i++) {
  val += sample.transformation_matrix.at(o, i) * in(x, y, i);
}
out(x, y, o) = val;

Collaborator

I don't think you need to reimplement the matrix-vector multiplication.
Maybe something like this:

auto *in_vec = reinterpret_cast<vec<InputType, M>*>(&in(x, y, 0));
auto *out_vec = reinterpret_cast<vec<OutputType, N>*>(&out(x, y, 0));
*out_vec = sample.transformation_matrix * (*in_vec);

Or, instead of these reinterpret_casts, you could just copy the memory to temporary vecs.

Contributor

@banasraf

  1. This would only work with dense HWC layout.
  2. We can reinterpret input, but for the output we need to apply proper conversion anyway.
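
To illustrate the two points above, here is a minimal host-side sketch. It is not DALI code and assumes uint8 data and a caller-supplied channel stride; `TransformPixel` and `ToU8Saturated` are hypothetical names. Gathering the pixel into a temporary array works for dense HWC (channel stride 1) as well as planar layouts, and the result goes through an explicit rounding and clamping step, which a reinterpret_cast of the output cannot provide.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round to nearest and clamp to the uint8_t range.
inline std::uint8_t ToU8Saturated(float v) {
  if (!(v > 0.0f)) return 0;      // negative values, zero and NaN map to 0
  if (v >= 255.0f) return 255;    // clamp the upper end
  return static_cast<std::uint8_t>(std::lround(v));
}

// Transforms a single pixel: out = M * in.
// A channel stride is the distance, in elements, between consecutive channels
// of the same pixel: 1 for dense HWC, H * W for planar CHW (assumed layouts).
template <std::size_t kOut, std::size_t kIn>
void TransformPixel(std::uint8_t *out, std::ptrdiff_t out_channel_stride,
                    const std::uint8_t *in, std::ptrdiff_t in_channel_stride,
                    const float (&M)[kOut][kIn]) {
  float v[kIn];
  // Gather the channels into a temporary vector instead of reinterpreting memory.
  for (std::size_t i = 0; i < kIn; i++)
    v[i] = in[static_cast<std::ptrdiff_t>(i) * in_channel_stride];
  for (std::size_t o = 0; o < kOut; o++) {
    float acc = 0.0f;  // accumulate in float, convert only at the end
    for (std::size_t i = 0; i < kIn; i++)
      acc += M[o][i] * v[i];
    out[static_cast<std::ptrdiff_t>(o) * out_channel_stride] = ToU8Saturated(acc);
  }
}
```

For a dense HWC image both strides are 1; for a planar image they equal the number of pixels per plane.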

dali/kernels/CMakeLists.txt (outdated, resolved)
Contributor

@mzient left a comment

Many issues.

  1. Incomplete functionality (M*x + B - missing B)
  2. Implementation bugs
       • input pointers don't have proper ROI offsets
       • number of channels should be fixed, based on matrix shape
       • loss of data when accumulating to OutputType (should be float)
  3. Misleading namespace - "algebra" suggests something more than a pixelwise operation

...and more problems.
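
The first two review points can be sketched on the host side as follows. This is an illustration under stated assumptions, not the kernel merged in this PR: `Roi` and `AffinePerPixel` are hypothetical names, the input is assumed to be a dense HWC image, and the output is assumed to cover only the ROI. The sketch adds the missing B term, offsets the input pointer by the ROI origin, takes the channel counts from the matrix shape, and accumulates in float regardless of the input type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ROI type: half-open box [x0, x1) x [y0, y1), in pixels.
struct Roi {
  int x0, y0, x1, y1;
};

// Applies out = M * in + B to every pixel inside `roi`.
// `in` is a dense HWC image with `width` pixels per row and kIn channels;
// `out` is a dense HWC image of the ROI size with kOut channels.
template <typename In, std::size_t kOut, std::size_t kIn>
void AffinePerPixel(float *out, const In *in, int width, int height, Roi roi,
                    const float (&M)[kOut][kIn], const float (&B)[kOut]) {
  assert(0 <= roi.x0 && roi.x0 <= roi.x1 && roi.x1 <= width);
  assert(0 <= roi.y0 && roi.y0 <= roi.y1 && roi.y1 <= height);
  const std::ptrdiff_t roi_w = roi.x1 - roi.x0;
  for (int y = roi.y0; y < roi.y1; y++) {
    for (int x = roi.x0; x < roi.x1; x++) {
      // Input pointer includes the ROI offset; output is indexed relative to the ROI.
      const In *px = in + (static_cast<std::ptrdiff_t>(y) * width + x) *
                              static_cast<std::ptrdiff_t>(kIn);
      float *dst = out + ((y - roi.y0) * roi_w + (x - roi.x0)) *
                             static_cast<std::ptrdiff_t>(kOut);
      for (std::size_t o = 0; o < kOut; o++) {
        float acc = B[o];  // start from the translation term, accumulate in float
        for (std::size_t i = 0; i < kIn; i++)
          acc += M[o][i] * static_cast<float>(px[i]);
        dst[o] = acc;
      }
    }
  }
}
```

Because kOut and kIn are deduced from the matrix argument, a mismatch between the matrix and the B vector is rejected at compile time.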

@szalpal
Member Author

szalpal commented Sep 18, 2019

!build

@dali-automaton
Collaborator

CI MESSAGE: [905830]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [905830]: BUILD PASSED

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
ASSERT_EQ(mat.rows * mat.cols * mat.channels(), res.first.num_elements())
    << "Number of elements doesn't match";
auto ptr = reinterpret_cast<typename TypeParam::Out *>(mat.data);
for (int i = 0; i < res.first.num_elements(); i++) {

Contributor

I really don't like this loop - you're flattening the shape, which is not a good idea when ROIs are involved.
Why not create a TensorView and use the Check function, which prints the failing coordinates and limits the output size in case of a total disaster?

Member Author

I used OpenCV to calculate the proper ROI. AFAIK, there's no easy way to create a bounding-boxed TensorView (which would be a good idea, but not for this PR).

Contributor

You can create a TensorView back from the cv::Mat and use the Check function.

Member Author

Done
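
The idea discussed in this thread, comparing by coordinates instead of over a flattened index, can be sketched as below. This is not DALI's Check function; `CountMismatches` is a hypothetical helper that assumes float HWC data and row strides given in elements, so that an ROI inside a larger image can be compared against a dense reference. It reports the failing coordinates and caps the number of printed messages.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Compares two h x w x c float images and returns the number of mismatches.
// Row strides are in elements, so either argument may be a view into a larger image.
// At most `max_reports` mismatches are printed, to keep the output readable.
inline int CountMismatches(const float *a, std::ptrdiff_t a_row_stride,
                           const float *b, std::ptrdiff_t b_row_stride,
                           int h, int w, int c,
                           float eps = 1e-5f, int max_reports = 10) {
  int errors = 0;
  for (int y = 0; y < h; y++) {
    const float *row_a = a + y * a_row_stride;
    const float *row_b = b + y * b_row_stride;
    for (int x = 0; x < w; x++) {
      for (int ch = 0; ch < c; ch++) {
        const float va = row_a[x * c + ch];
        const float vb = row_b[x * c + ch];
        if (!(std::fabs(va - vb) <= eps)) {  // written this way so NaN counts as a mismatch
          if (errors < max_reports)
            std::fprintf(stderr, "Mismatch at (y=%d, x=%d, c=%d): %g vs %g\n",
                         y, x, ch, va, vb);
          errors++;
        }
      }
    }
  }
  return errors;
}
```

A test would then assert that the returned count is zero, and the printed coordinates point directly at the failing pixels.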

dali/kernels/imgproc/roi.h (outdated, resolved)
Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
@mzient mzient self-requested a review September 30, 2019 14:18
Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
Out *output_;
std::vector<In> input_host_;
std::vector<float> ref_output_;
std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}};

Contributor

Why not:

Suggested change
- std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}};
+ TensorListShape<kNDims> in_shapes_ = {{{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}}};

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
@szalpal
Member Author

szalpal commented Oct 1, 2019

!build


namespace dali {
namespace kernels {
namespace lin_trans {

Contributor

lin_trans doesn't tell me anything. Use linear_transform at least.

@dali-automaton
Collaborator

CI MESSAGE: [925087]: BUILD STARTED

}
};

using TestTypes = std::tuple<float>;

Contributor

Add at least one uint type.

@dali-automaton
Collaborator

CI MESSAGE: [925087]: BUILD PASSED

@szalpal szalpal merged commit ec869f4 into NVIDIA:master Oct 2, 2019
00liujj pushed a commit to 00liujj/DALI that referenced this pull request Oct 10, 2019
* Linear transformation GPU kernel

Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
Signed-off-by: Jianjun Liu <00liujj@163.com>
@szalpal szalpal deleted the hsv_gpu branch November 20, 2019 11:19