-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linear transformation GPU kernel #1262
Conversation
!build |
CI MESSAGE: [902327]: BUILD STARTED |
CI MESSAGE: [902327]: BUILD PASSED |
for (int i = 0; i < M; i++) { | ||
val += sample.transformation_matrix.at(o, i) * in(x, y, i); | ||
} | ||
out(x, y, o) = val; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if you need to reimplement the matrix - vector multiplication.
Maybe something like that:
auto *in_vec = reinterpret_cast<vec<InputType, M>*>(&in(x, y, 0));
auto *out_vec = reinterpret_cast<vec<OutputType, N>*>(&out(x, y, 0));
*out_vec = sample.transformation_matrix * (*in_vec);
Or instead of these reinterpret_casts you could just copy memory to temporary vecs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This would only work with dense HWC layout.
- We can reinterpret input, but for the output we need to apply proper conversion anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many issues.
- Incomplete functionality (M*x + B - missing B)
- Implementation bugs
- input pointers don't have proper ROI offsets
- number of channels should be fixed, based on matrix shape
- loss of data when accumulating to OutputType (should be float)
- Misleading namespace - "algebra" suggests something more than a pixelwise operation
...and more problems.
!build |
CI MESSAGE: [905830]: BUILD STARTED |
CI MESSAGE: [905830]: BUILD PASSED |
Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
ASSERT_EQ(mat.rows * mat.cols * mat.channels(), res.first.num_elements()) | ||
<< "Number of elements doesn't match"; | ||
auto ptr = reinterpret_cast<typename TypeParam::Out *>(mat.data); | ||
for (int i = 0; i < res.first.num_elements(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like this loop - you're flattening the shape and it's not a good idea when considering ROIs.
Why don't you have a TensorView and use Check function that will print failing coordintates and limit the output size in case of some total disaster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the OpenCV to calculate proper ROI. AFAIK, there's no easy way to create BB-ed TensorView (which would be good idea, but not for this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can create a TensorView back from the cv::Mat and use Check function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dali/kernels/imgproc/pointwise/linear_transformation_gpu_test.cu
Outdated
Show resolved
Hide resolved
Out *output_; | ||
std::vector<In> input_host_; | ||
std::vector<float> ref_output_; | ||
std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not:
std::vector<TensorShape<kNDims>> in_shapes_ = {{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}}; | |
TensorListShape<kNDims> in_shapes_ = {{{4, 3, kNChannelsIn}, {4, 3, kNChannelsIn}}}; |
Signed-off-by: Michał Szołucha <mszolucha@nvidia.com>
!build |
|
||
namespace dali { | ||
namespace kernels { | ||
namespace lin_trans { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lin_trans
doesn't tell me anything. linear_transform
at least
CI MESSAGE: [925087]: BUILD STARTED |
} | ||
}; | ||
|
||
using TestTypes = std::tuple<float>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at least one uint type
CI MESSAGE: [925087]: BUILD PASSED |
* Linear transformation GPU kernel Signed-off-by: Michał Szołucha <mszolucha@nvidia.com> Signed-off-by: Jianjun Liu <00liujj@163.com>
Why we need this PR?
What happened in this PR?