Fix fp16 bug from #1129 and add fp16 test case #1160

Merged
merged 1 commit into from
Aug 12, 2019

Contributor

@Kh4L Kh4L commented Aug 12, 2019

Signed-off-by: Serge Panev <spanev@nvidia.com>

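Why do we need this PR?

  • It fixes the fp16 bug from #1129: initializing the member variable
    OutputType padding_value = 0; for __half in the host code produced wrong results.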

What happened in this PR?

  • Explain solution of the problem, new feature added.
    padding_val is now stored as fp32 and cast to OutputType inside the CUDA kernel:
    static_cast<OutputType>(padding_value)
  • What is the most important part that reviewers should focus on?
    The CUDA kernel changes
  • Was this PR tested? How?
    Yes, by adding an fp16 test case
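
For readers who want to see the pattern in isolation, here is a minimal host-side C++ sketch of the idea described above. It is not the actual DALI code: `HalfLike`, `PadArgs`, and `PadAndConvert` are hypothetical names made up for illustration, and `HalfLike` is only a stand-in for `__half` so that the sketch compiles without the CUDA toolkit. The point it illustrates is that the arguments struct carries the padding value as a plain `float`, and the conversion to `OutputType` happens inside the templated routine (the kernel, in the real code).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a reduced-precision type such as __half.
// It is only constructible from float through an explicit conversion.
struct HalfLike {
  float value;  // a real half type would store 16 bits; float keeps the sketch simple
  explicit HalfLike(float v) : value(v) {}
  explicit operator float() const { return value; }
};

// Hypothetical arguments struct. The padding value is always fp32,
// regardless of the output type the routine is instantiated with.
struct PadArgs {
  std::size_t in_size;
  std::size_t out_size;
  float padding_value;
};

// Copies `in` to `out`, converting each element, and fills the remainder
// with the padding value. The cast to OutputType happens here, not where
// the arguments are set up.
template <typename OutputType, typename InputType>
void PadAndConvert(std::vector<OutputType>& out, const InputType* in,
                   const PadArgs& args) {
  assert(args.in_size <= args.out_size);
  const OutputType pad = static_cast<OutputType>(args.padding_value);
  out.assign(args.out_size, pad);
  for (std::size_t i = 0; i < args.in_size; ++i) {
    out[i] = static_cast<OutputType>(in[i]);
  }
}
```

With this layout the same `PadArgs` can be used for every output type, and no reduced-precision value is ever initialized in host code.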

JIRA TASK: [DALI-1011]

Signed-off-by: Serge Panev <spanev@nvidia.com>
@jantonguirao
Contributor

!build

@dali-automaton
Collaborator

CI MESSAGE: [852854]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [852854]: BUILD FAILED

@klecki klecki merged commit 4c77eb2 into NVIDIA:master Aug 12, 2019
klecki pushed a commit that referenced this pull request Aug 12, 2019
Initializing the member variable `OutputType padding_value = 0;` for `__half` in the host code produced wrong results.

`padding_val` is now stored as fp32 and cast to `OutputType` inside the CUDA kernel.


Signed-off-by: Serge Panev <spanev@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [852854]: BUILD PASSED


6 participants