Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support of std::atomic for hip-clang #764

Closed
wants to merge 1 commit into from

Conversation

yxsamliu
Copy link
Contributor

No description provided.

*/

/* HIT_START
* BUILD: %t %s ../test_common.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need EXCLUDE_HIP_PLATFORM nvcc here since calling std::atomic* is not supported by nvcc I believe.

@AlexVlx
Copy link
Contributor

AlexVlx commented Nov 26, 2018

@yxsamliu unless there is an application ask for this, I would avoid doing it since it violates ODR (we're seldom if ever symmetric in what regards lockfree-ness with the host e.g. we don't do sub-DWORD atomics) and since it will stellarly break for non-lockfree cases, since we absolutely cannot get to the sharded lock table. As far as I know CUDA does not do this, so why are we trying to add this to hip-clang? Has there been some recent change in CUDA?

@yxsamliu
Copy link
Contributor Author

There are some recent changes in TensorFlow which requires a framework called rccl, which uses std::atomic in device code. HCC is able to compile that. This change just allows hip-clang to do the same thing like HCC.

@AlexVlx
Copy link
Contributor

AlexVlx commented Nov 26, 2018

@yxsamliu being able to compile something doesn't mean it is correct, or should be done. HCC compiling that is an error and a hack which was put in place a long time ago and ignores all of the issues I described above. Since we (AMD) own RCCL, we should make sure it doesn't do silly things such as use std::atomic in device code. So I would strongly suggest that instead of going down this particular rabbit hole which can only lead to madness and despair, we just tell the RCCL devs to use CUDA / HIP atomics, just like everybody else.

@yxsamliu
Copy link
Contributor Author

I will try the latest tensorflow with rccl and come back to this.

@AlexVlx
Copy link
Contributor

AlexVlx commented Nov 27, 2018

@yxsamliu I don't think that anything will change unless RCCL stops using std::atomic, which it should do.

@yxsamliu
Copy link
Contributor Author

It seems the latest rccl does not use std::atomic. I just want to make sure if they already removed those.

@AlexVlx
Copy link
Contributor

AlexVlx commented Nov 27, 2018

This suggests that, unfortunately, it still does: https://github.com/ROCmSoftwarePlatform/rccl/search?q=std%3A%3Aatomic&unscoped_q=std%3A%3Aatomic, but hopefully I am missing something.

@yxsamliu
Copy link
Contributor Author

I opened issue ROCm/rccl#54 for rccl. Let's see their response.

@yxsamliu
Copy link
Contributor Author

close this since rccl agrees to replace std::atomics with hip atomics.

@yxsamliu yxsamliu closed this Nov 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants