Size and alignment of 3-component built-in vector types are incompatible with CUDA #706
Comments
|
Hi, can anyone point out workarounds or fixes for this issue? |
|
I use this workaround:
template <> __forceinline__ __device__
void gather_force_store<float3>(const float fx, const float fy, const float fz,
                                const int stride, const int pos,
                                float3* force) {
    // Store into non-strided "float3" array
#if defined(__HIP_PLATFORM_HCC__)
    // Workaround: unlike CUDA, HIP-hcc has sizeof(float3) != sizeof(CudaForce) (and == sizeof(float4))
    // TODO-HIP: Remove when https://github.com/ROCm-Developer-Tools/HIP/issues/706 is fixed
    reinterpret_cast<float*>(force)[pos * 3 + 0] = fx;
    reinterpret_cast<float*>(force)[pos * 3 + 1] = fy;
    reinterpret_cast<float*>(force)[pos * 3 + 2] = fz;
#else
    force[pos].x = fx;
    force[pos].y = fy;
    force[pos].z = fz;
#endif
}
(not sure that it can be applied to your situation) |
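The idea behind the workaround above can be sketched in plain host-side C++: index the buffer as scalars so that element `pos` always starts at float index `pos * 3`, whatever size the compiler gives the vector type. This is only an illustration, not the NAMD code; `store_packed3` and `load_x` are hypothetical helper names, and the sketch assumes the buffer really is laid out as tightly packed floats.

```cpp
#include <cassert>

// Hypothetical helper: write one 3-component value into a tightly packed
// float buffer. Element `pos` occupies float indices pos*3 .. pos*3+2,
// independent of sizeof() of any vector type.
inline void store_packed3(float* base, int pos, float fx, float fy, float fz) {
    assert(base != nullptr && pos >= 0);
    base[pos * 3 + 0] = fx;
    base[pos * 3 + 1] = fy;
    base[pos * 3 + 2] = fz;
}

// Hypothetical helper: read back the x component of element `pos`.
inline float load_x(const float* base, int pos) {
    assert(base != nullptr && pos >= 0);
    return base[pos * 3 + 0];
}
```

The trade-off is that both the writer and every reader of the buffer have to agree on the packed scalar layout; code that reads the same memory through a padded vector type would see shifted values.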
|
@ex-rzr Thanks for your response, but I guess I don't have access to that repo (it gives a 404 error). |
|
Oh, sorry. The previous message has been updated. |
|
@ex-rzr Hi Anton, could you give us access to view the NAMD repository? |
|
@NEELMCW I don't have permissions to change the repo's settings. |
|
This issue was solved in ROCm 5.1.0. |
HIP 3-component vectors in fact have the same size and alignment as 4-component vectors.
CUDA 3-component vectors are packed:
sizeof(T3) = 3 * sizeof(T) and alignof(T3) = alignof(T). See https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#built-in-vector-types
HIP (hcc):
CUDA
If this is the correct behavior, then it should be mentioned in the docs (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md#short-vector-types).
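To make the layout difference concrete, here is a minimal host-side C++ sketch. `Packed3` and `Padded3` are hypothetical stand-ins, not the real `float3` from either toolchain: the first mimics the packed layout described for CUDA, the second mimics a 3-component vector given the size and alignment of a 4-component one. It assumes a platform where `sizeof(float)` is 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a packed 3-component vector:
// sizeof == 3 * sizeof(T), alignof == alignof(T).
template <typename T>
struct Packed3 { T x, y, z; };

// Hypothetical stand-in for a 3-component vector that takes the size and
// alignment of the matching 4-component vector.
template <typename T>
struct alignas(4 * sizeof(T)) Padded3 { T x, y, z; };

static_assert(sizeof(Packed3<float>) == 3 * sizeof(float), "packed size");
static_assert(alignof(Packed3<float>) == alignof(float), "packed alignment");
static_assert(sizeof(Padded3<float>) == 4 * sizeof(float), "padded size");
static_assert(alignof(Padded3<float>) == 4 * sizeof(float), "padded alignment");

// Byte offset of element `i` in an array of V.
template <typename V>
constexpr std::size_t element_offset(std::size_t i) { return i * sizeof(V); }
```

With these stand-ins, element 2 of a packed array starts at byte 24 while element 2 of a padded array starts at byte 32, which is why a buffer filled under one layout cannot safely be read under the other.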