rocm-5.0.2 async_copy test failed on 6700xt #233

Closed
littlewu2508 opened this issue May 2, 2022 · 11 comments

@littlewu2508

Summary

I compiled rocThrust-rocm-5.0.2 for gfx1031 and found that 1 of 113 tests (async_copy) failed. All other tests passed.

Environment

| Hardware | description |
|---|---|
| GPU | Navy_flounder [Radeon RX 6700XT] |
| CPU | AMD Ryzen 9 5950X |

| Software | version |
|---|---|
| Linux | 5.17.3 |
| Distribution | Gentoo |
| ROCK | Upstream Kernel |
| ROCR | v5.0.2 |
| Host Compiler | gcc-11.2 |
| Device Compiler | hipcc-5.0.2 |

Log

Command: "/ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-5.0.2_build/test/async_copy.hip"
Directory: /ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-5.0.2_build/test
"async_copy.hip" start time: May 02 15:50 CST
Output:
----------------------------------------------------------
Running main() from /opt/build/portage/dev-cpp/gtest-1.11.0/work/googletest-aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e/googletest/src/gtest_main.cc
[==========] Running 32 tests from 8 test suites.
[----------] Global test environment set-up.
[----------] 4 tests from AsyncCopyTests/0, where TypeParam = Params<short>
[ RUN      ] AsyncCopyTests/0.TestAsyncTriviallyRelocatableElementsHostToDevice
/ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-rocm-5.0.2/test/test_async_copy.cpp:78: Failure
Expected equality of these values:
  h0
    Which is: { -32768 }
  d0
    Which is: { 0 }
Google Test trace:
/ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-rocm-5.0.2/test/test_async_copy.cpp:66: with seed= 1
/ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-rocm-5.0.2/test/test_async_copy.cpp:63: with size = 1
/ext4-disk/build/portage/sci-libs/rocThrust-5.0.2/work/rocThrust-rocm-5.0.2/test/test_async_copy.cpp:85: with device_id= 0
[  FAILED  ] AsyncCopyTests/0.TestAsyncTriviallyRelocatableElementsHostToDevice, where TypeParam = Params<short> (17005 ms)
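
To check whether the failure in the log above is deterministic, the single failing case can be re-run in isolation. This is a minimal sketch, not part of the original report: the `TEST_BIN` default is an assumption about where the test binary sits relative to the build directory, while `--gtest_filter` and `--gtest_repeat` are standard GoogleTest options.

```shell
#!/bin/sh
# Assumed location of the test binary inside the build tree; adjust as needed.
TEST_BIN="${TEST_BIN:-./test/async_copy.hip}"
# Name of the failing test case, copied from the log.
FILTER='AsyncCopyTests/0.TestAsyncTriviallyRelocatableElementsHostToDevice'

if [ -x "$TEST_BIN" ]; then
    # Run only the failing case, five times, to see whether it fails every time.
    "$TEST_BIN" --gtest_filter="$FILTER" --gtest_repeat=5
else
    echo "test binary not found: $TEST_BIN" >&2
fi
```

If the case fails on every repetition, a timing-dependent cause becomes less likely, although that alone does not identify the root cause.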
littlewu2508 added a commit to littlewu2508/gentoo that referenced this issue May 2, 2022
Observed one failed test on gfx1031 though:
ROCm/rocThrust#233

Package-Manager: Portage-3.0.30, Repoman-3.0.3
Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com>
gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue May 3, 2022
Observed one failed test on gfx1031 though:
ROCm/rocThrust#233

Package-Manager: Portage-3.0.30, Repoman-3.0.3
Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com>
Signed-off-by: Benda Xu <heroxbd@gentoo.org>
@doctorcolinsmith
Collaborator

@mfep Please take a look at this.

@doctorcolinsmith
Collaborator

Also, note that the 6700XT is not officially supported in the ROCm stack at this time.

@klausbu

klausbu commented Jan 17, 2023

Has this issue been resolved?

@littlewu2508
Author

littlewu2508 commented Jan 17, 2023 via email

@littlewu2508
Author

littlewu2508 commented Jan 19, 2023

With rocThrust-5.1.3 (hip-5.3.3, clang-15.0.7), all tests pass on the 6700XT.

There are problems compiling rocThrust-5.3.3. I'll report those in a separate issue.

@klausbu

klausbu commented Jan 19, 2023

There's the other open question: How can I compile rocThrust-5.1.3 with my local, modified version of rocPRIM supporting gfx1031 instead of triggering the download of the unmodified version as defined in the install script?

@littlewu2508
Author

> There's the other open question: How can I compile rocThrust-5.1.3 with my local, modified version of rocPRIM supporting gfx1031 instead of triggering the download of the unmodified version as defined in the install script?

I think that if your rocprim-config.cmake is in one of the default search locations (such as /usr/lib64/cmake), find_package will automatically pick up the local installation.

I actually packaged rocThrust and rocPRIM for Gentoo, applying some patches to enable gfx1031 and removing the code that downloads dependencies.
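
The find_package behaviour described above can also be steered explicitly when rocPRIM is installed outside the default search locations. The following is a sketch under stated assumptions: `ROCPRIM_PREFIX`, `ROCTHRUST_SRC` and the helper `find_rocprim_config` are hypothetical names, and `CMAKE_PREFIX_PATH` is the standard CMake variable consulted by config-mode `find_package`. Whether this fully bypasses the download step in a particular rocThrust release is not confirmed here.

```shell
#!/bin/sh
# Hypothetical install prefix of the locally modified rocPRIM.
ROCPRIM_PREFIX="${ROCPRIM_PREFIX:-$HOME/rocprim-install}"
# Hypothetical location of the rocThrust source tree.
ROCTHRUST_SRC="${ROCTHRUST_SRC:-$PWD}"

# Print the first rocprim-config.cmake found under the given prefix, if any.
find_rocprim_config() {
    find "$1" -name 'rocprim-config.cmake' -print 2>/dev/null | head -n 1
}

config=$(find_rocprim_config "$ROCPRIM_PREFIX")
if [ -n "$config" ] && command -v cmake >/dev/null 2>&1; then
    echo "using $config"
    # Point find_package at the local prefix before any default location.
    CXX=hipcc cmake -S "$ROCTHRUST_SRC" -B build \
        -DCMAKE_PREFIX_PATH="$ROCPRIM_PREFIX"
else
    echo "no rocprim-config.cmake under $ROCPRIM_PREFIX (or cmake missing)" >&2
fi
```

Checking for the config file first makes it obvious when the prefix is wrong, rather than letting the configure step silently fall back to another copy.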

@Snektron
Collaborator

See #268

@klausbu

klausbu commented Jan 28, 2023

Where can I find the changes/patches you applied to rocThrust and rocPRIM for the gentoo implementation?

@littlewu2508
Author

> Where can I find the changes/patches you applied to rocThrust and rocPRIM for the gentoo implementation?

You can check out the ebuilds at rocPRIM-5.1.3.ebuild and rocThrust-5.1.3.ebuild, where I applied some sed commands to remove the downloading of external dependencies and the non-FHS install scripts (which shouldn't matter anymore, because ROCm now supports turning off the backward-compatible, non-FHS install).

As far as I remember, no dedicated patches are applied for gfx1031. The key is to pass -DAMDGPU_TARGETS=gfx1031 when configuring with cmake.
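
A configure-and-test run with that flag might look like the sketch below. It is illustrative only: `GPU_TARGET` and `BUILD_DIR` are hypothetical variable names, and `BUILD_TEST=ON` is assumed to be the option that enables the test suite, so check the project's CMake options before relying on it.

```shell
#!/bin/sh
# Target architecture from the comment above; override via the environment.
GPU_TARGET="${GPU_TARGET:-gfx1031}"
# Hypothetical build directory name.
BUILD_DIR="${BUILD_DIR:-build}"

# Only configure when run from a source tree with the toolchain available.
if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1 \
        && command -v hipcc >/dev/null 2>&1; then
    # BUILD_TEST=ON is an assumption about the option that enables tests.
    CXX=hipcc cmake -S . -B "$BUILD_DIR" \
        -DAMDGPU_TARGETS="$GPU_TARGET" \
        -DBUILD_TEST=ON
    cmake --build "$BUILD_DIR"
    # Run the suite and print the output of any failing test.
    (cd "$BUILD_DIR" && ctest --output-on-failure)
else
    echo "skipping: need CMakeLists.txt, cmake and hipcc" >&2
fi
```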

@klausbu

klausbu commented Jan 28, 2023

Setting the target works. My problem is a compile error related to rocPRIM, even though the rocPRIM installation and tests went well:

...
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/detail/scan.inl:29:
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/system/detail/adl/scan_by_key.h:44:
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/system/hip/detail/scan_by_key.h:36:
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/system/hip/execution_policy.h:81:
/home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/system/hip/detail/set_operations.h:956:61: error: no member named 'init_offset_scan_state_kernel' in namespace 'rocprim::detail'; did you mean 'init_lookback_scan_state_kernel'?
hipLaunchKernelGGL(HIP_KERNEL_NAME(rocprim::detail::init_offset_scan_state_kernel),
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init_lookback_scan_state_kernel
/opt/rocm-5.4.0/include/hip/amd_detail/amd_hip_runtime.h:199:30: note: expanded from macro 'HIP_KERNEL_NAME'
#define HIP_KERNEL_NAME(...) __VA_ARGS__
^~~~~~~~~~~
/opt/rocm-5.4.0/include/hip/amd_detail/amd_hip_runtime.h:251:74: note: expanded from macro 'hipLaunchKernelGGL'
#define hipLaunchKernelGGL(kernelName, ...) hipLaunchKernelGGLInternal((kernelName), __VA_ARGS__)
^~~~~~~~~~
/opt/rocm-5.4.0/include/hip/amd_detail/amd_hip_runtime.h:248:9: note: expanded from macro 'hipLaunchKernelGGLInternal'
kernelName<<<(numBlocks), (numThreads), (memPerBlock), (streamId)>>>(__VA_ARGS__); \
^~~~~~~~~~
/opt/rocm-5.4.0/include/rocprim/device/detail/device_scan_common.hpp:76:60: note: 'init_lookback_scan_state_kernel' declared here
__launch_bounds__(ROCPRIM_DEFAULT_MAX_BLOCK_SIZE) void init_lookback_scan_state_kernel(
^
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/test/test_zip_iterator.cpp:18:
In file included from /home/klaus/Programme/rocalution_install/rocThrust-rocm-5.1.3/thrust/../thrust/copy.h:512:
...
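
The log shows rocThrust-rocm-5.1.3 sources being compiled against rocPRIM headers under /opt/rocm-5.4.0, so one plausible but unconfirmed reading is a version mismatch between the two. A quick way to test that reading is to check which of the two kernel names the rocPRIM headers on the include path actually mention. In this sketch, `header_mentions` and `ROCPRIM_INCLUDE` are hypothetical names, and the default path is taken from the log.

```shell
#!/bin/sh
# Include directory that the compiler picked up, according to the log.
ROCPRIM_INCLUDE="${ROCPRIM_INCLUDE:-/opt/rocm-5.4.0/include/rocprim}"

# Succeed if any file under directory $1 contains the string $2.
header_mentions() {
    grep -rqs -- "$2" "$1"
}

for sym in init_offset_scan_state_kernel init_lookback_scan_state_kernel; do
    if header_mentions "$ROCPRIM_INCLUDE" "$sym"; then
        echo "$sym: found in $ROCPRIM_INCLUDE"
    else
        echo "$sym: not found in $ROCPRIM_INCLUDE"
    fi
done
```

Running the same check against the include directory of the locally built rocPRIM would show whether the compiler is picking up a different copy than intended.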
