Conversation

igorsugak
Contributor

Summary:
When NVCC or Clang drives CUDA compilation, many math functions are declared by default, with a small difference: Clang marks them `__device__` only, while NVCC marks them both `__host__` and `__device__`. As a result, every unqualified `min` or `max` call from a `__host__` function produces a compile error when Clang is used.

Fix the errors by using `std::min` and `std::max` from `<algorithm>`: since C++14 they are `constexpr`, so they can also be used in `__device__` code [1].

  1. https://llvm.org/docs/CompileCudaWithLLVM.html#algorithm

Test Plan:

```lang=bash
buck build mode/opt -c fbcode.cuda_use_clang=true //fblearner/flow/projects/dper:workflow
buck build mode/opt //fblearner/flow/projects/dper:workflow
```

Differential Revision: D20005795

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D20005795

@dr-ci

dr-ci bot commented Feb 20, 2020

💊 CircleCI build failures summary and remediations

As of commit 3e27b69:

Commit 3e27b69 was recently pushed. Waiting for builds...


This comment was automatically generated by Dr. CI.

Summary:
Pull Request resolved: pytorch#33563


Test Plan:
```lang=bash
buck build mode/opt -c fbcode.cuda_use_clang=true //fblearner/flow/projects/dper:workflow
buck build mode/opt //fblearner/flow/projects/dper:workflow
```

Differential Revision: D20005795

fbshipit-source-id: 67f46a361b7d312337741f39d0946b851f0fd5db
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D20005795

@facebook-github-bot
Contributor

This pull request has been merged in 5dde8cd.

ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020
Summary:
Pull Request resolved: pytorch#33563


Test Plan:
```lang=bash
buck build mode/opt -c fbcode.cuda_use_clang=true //fblearner/flow/projects/dper:workflow
buck build mode/opt //fblearner/flow/projects/dper:workflow
```
Execute tests on devgpu:
```
buck test mode/dev-nosan -j 8 //caffe2/caffe2/python/operator_test/... //caffe2/test:cuda
```

Reviewed By: ngimel

Differential Revision: D20005795

fbshipit-source-id: 98a3f35e8a96c15d3ad3d2066396591f5cca1696