Work around a compiler problem that caused an "Invalid device function" error. #2656
Conversation
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
!build
CI MESSAGE: [2044971]: BUILD STARTED
  dl_type.bits = sizeof(T) * 8;
  dl_type.lanes = 1;
- if (std::is_floating_point<T>::value) {
+ if (dali::is_fp_or_half<T>::value) {
Bad idea. I'm not sure if fp16 is supported by DLPack at all. It supports bfloat but not fp16. Now you would put fp16 inside and claim it to be float.
OK. The Slice tests check this indirectly.
For the record: DLPack capsules created this way are compatible with PyTorch. My guess is that it's a de facto standard for float16 DLPack tensors now.
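For context, here is a minimal sketch of the dtype mapping being discussed, assuming CUDA's `__half` stands in for the 16-bit float type; the `DLDataType` fields come from the public `dlpack.h` header, while `fill_dl_type` is a hypothetical helper, not DALI's actual code:

```cpp
#include <dlpack/dlpack.h>
#include <type_traits>
#include <cuda_fp16.h>

// Hypothetical helper: any "float or half" type gets code kDLFloat, so fp16
// becomes {code=kDLFloat, bits=16, lanes=1}, which is the encoding PyTorch
// uses for float16 DLPack tensors.
template <typename T>
DLDataType fill_dl_type() {
  DLDataType dl_type;
  dl_type.bits = sizeof(T) * 8;
  dl_type.lanes = 1;
  if (std::is_floating_point<T>::value || std::is_same<T, __half>::value) {
    dl_type.code = kDLFloat;
  } else if (std::is_unsigned<T>::value) {
    dl_type.code = kDLUInt;
  } else {
    dl_type.code = kDLInt;
  }
  return dl_type;
}
```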
Looks OK, but I've had a bad experience with the commented constructor being used.
@@ -256,7 +256,7 @@ class SliceGPU {
  for (int i = 0; i < in.size(); i++) {
    if (default_fill_values_) {
      assert(nfill_values_ == 1);
-     fill_values_cpu[i] = static_cast<OutputType>(0.f);
+     fill_values_cpu[i] = OutputType{};
Can you check if this works as intended? When I was writing GaussianBlur, the compiler did some weird things when the output was half, and I had to initialize the zero using the conversion from float (that one worked without problems).
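As a hedged illustration of the two initialization forms under discussion, assuming CUDA's `__half` stands in for a half-precision `OutputType` (DALI's actual fp16 type may differ):

```cpp
#include <cuda_fp16.h>
#include <vector>

template <typename OutputType>
void FillWithZeros(std::vector<OutputType> &fill_values_cpu) {
  for (auto &v : fill_values_cpu) {
    // Value-initialization, as in this PR; for built-in arithmetic types this
    // is zero. For __half, whether it yields a well-defined zero depends on
    // how its default constructor is declared, which is part of the concern.
    v = OutputType{};
    // Alternative mentioned in the review (used in GaussianBlur) that avoided
    // compiler issues when OutputType was half:
    // v = static_cast<OutputType>(0.f);
  }
}
```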
CI MESSAGE: [2044971]: BUILD PASSED
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Why do we need this PR?
It works around a compiler problem that caused an "Invalid device function" error.

What happened in this PR?
Used DALI_TYPE_SWITCH_WITH_FP16 and is_fp_or_half in DLTensor.

JIRA TASK: DALI-1831
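A rough sketch of what a trait like `is_fp_or_half` can look like, assuming `__half` represents the half type; DALI's real definition and its `DALI_TYPE_SWITCH_WITH_FP16` macro live in the library and may differ:

```cpp
#include <type_traits>
#include <cuda_fp16.h>

// Hypothetical reimplementation: true for regular floating-point types and
// for the 16-bit half type, false otherwise.
template <typename T>
struct is_fp_or_half
    : std::integral_constant<bool, std::is_floating_point<T>::value ||
                                   std::is_same<T, __half>::value> {};

static_assert(is_fp_or_half<float>::value, "float is floating point");
static_assert(is_fp_or_half<__half>::value, "half counts as floating point here");
static_assert(!is_fp_or_half<int>::value, "int does not");
```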