ROCm 5.0 Segmentation fault #1686
Comments
Is your configuration an APU (Cezanne or Renoir) + RX6800M? If you want to run a HIP application on the RX6800M, you first need its device index. Assume it is 0. Then run the application like this: $ ROCR_VISIBLE_DEVICES=0 ./helloworld
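For anyone who would rather not export the variable by hand on every run, here is a minimal sketch of doing the same thing from inside the program. `ensureVisibleDevices` is a hypothetical helper, not part of HIP or ROCr. It relies on an assumption this thread does not confirm: that the runtime reads ROCR_VISIBLE_DEVICES when it initializes on the first HIP call, so the helper has to run before that call.

```cpp
#include <cstdlib>
#include <stdlib.h>  // POSIX setenv
#include <string>

// Hypothetical helper (not a HIP/ROCr API): makes sure ROCR_VISIBLE_DEVICES
// has a value, without overriding one the user already exported.
// Returns the value that is in effect afterwards.
// Assumption: the runtime reads the variable at initialization, which is
// expected to happen later, on the first HIP call.
std::string ensureVisibleDevices(const std::string& fallbackIndex) {
    // overwrite = 0: keep an existing value, set the fallback otherwise.
    ::setenv("ROCR_VISIBLE_DEVICES", fallbackIndex.c_str(), 0);
    const char* value = std::getenv("ROCR_VISIBLE_DEVICES");
    return value ? std::string(value) : std::string();
}
```

It would be called as the first statement of main, for example `ensureVisibleDevices("0");`, before hipGetDeviceProperties. A value given on the command line still takes precedence.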
Hi, Yes, I have a laptop with 5900HX + 6800M. Thank you! Running with ROCR_VISIBLE_DEVICES=0 worked! BTW: Apparently it is 0, but how can I identify the 6800M's device index? Is it visible somewhere in the rocminfo output (I did not see such a field)?
You can use the Node and Device Type fields in the rocminfo output, for example Node: 1. If Device Type is GPU, then the device index should be Node - 1 (so Node: 1 gives device index 0).
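The Node - 1 rule can be sketched as a small parser over saved rocminfo text. This is only an illustration: `fieldValue` and `gpuDeviceIndices` are hypothetical names, and the sketch assumes each agent prints one "Node:" line and one "Device Type:" line (in either order), which may not hold for every rocminfo version.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the trimmed text after `key` if the line starts with `key`
// (ignoring leading whitespace), otherwise an empty string.
static std::string fieldValue(const std::string& line, const std::string& key) {
    const char* ws = " \t\r";
    std::size_t start = line.find_first_not_of(ws);
    if (start == std::string::npos) return "";
    if (line.compare(start, key.size(), key) != 0) return "";
    std::size_t first = line.find_first_not_of(ws, start + key.size());
    if (first == std::string::npos) return "";
    std::size_t last = line.find_last_not_of(ws);
    return line.substr(first, last - first + 1);
}

// Applies the rule from the comment above: for every agent whose
// Device Type is GPU, the device index is Node - 1.
std::vector<int> gpuDeviceIndices(const std::string& rocminfoText) {
    std::vector<int> indices;
    std::istringstream in(rocminfoText);
    std::string line, node, type;
    while (std::getline(in, line)) {
        std::string v = fieldValue(line, "Node:");
        if (!v.empty()) node = v;
        v = fieldValue(line, "Device Type:");
        if (!v.empty()) type = v;
        // Once both fields of one agent are known, emit and reset.
        if (!node.empty() && !type.empty()) {
            if (type == "GPU") indices.push_back(std::stoi(node) - 1);
            node.clear();
            type.clear();
        }
    }
    return indices;
}
```

Fed the captured output of /opt/rocm/bin/rocminfo, the returned values are candidates for ROCR_VISIBLE_DEVICES. Which index belongs to the 6800M still has to be checked against the agent's name in the same output.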
Thanks @sampie for reaching out.
Yes, the main issue is resolved and the 6800M seems to work under ROCm with all the code I have run on it so far. However, if possible, it would be good to fix ROCm so that it does not crash when ROCR_VISIBLE_DEVICES is not set. rocminfo works just fine, so I wonder why hipGetDeviceProperties crashes.
Hi
/opt/rocm/bin/rocminfo works just fine and shows GPUs, but when I try to make and run a simple helloworld program, I get a segmentation fault.
I have an RDNA2 GPU (6800M). What could be the problem? I am not running the kernel module from the ROCm 5.0 package; instead I am running the vanilla Linux kernel 5.16.9. Has the ROCm 5.0 kernel module already been incorporated into the upstream Linux kernel?
--- The simple helloworld.cpp ---
#include <iostream>
#include <hip/hip_runtime.h>
int main(int argc, char* argv[])
{
    std::cout << "Before." << std::endl;
    hipDeviceProp_t devProp;
    hipGetDeviceProperties(&devProp, 0);
    std::cout << "After." << std::endl;
    return 0;
}
--- GDB showing the crash dump ---
Core was generated by `./helloworld'.
Program terminated with signal SIGSEGV, Segmentation fault.
warning: Section `.reg-xstate/887' in core file too small.
#0 0x00007f79e0257688 in rocr::image::ImageRuntime::GetImageInfoMaxDimension(hsa_agent_s, hsa_agent_info_t, void*) () from /opt/rocm-5.0.0/hip/lib/../../lib/libhsa-runtime64.so.1
[Current thread is 1 (Thread 0x7f79e00fdec0 (LWP 887))]
(gdb) bt
#0 0x00007f79e0257688 in rocr::image::ImageRuntime::GetImageInfoMaxDimension(hsa_agent_s, hsa_agent_info_t, void*) () from /opt/rocm-5.0.0/hip/lib/../../lib/libhsa-runtime64.so.1
#1 0x00007f79e02567b1 in rocr::image::hsa_amd_image_get_info_max_dim(hsa_agent_s, hsa_agent_info_t, void*) () from /opt/rocm-5.0.0/hip/lib/../../lib/libhsa-runtime64.so.1
#2 0x00007f79e0194679 in rocr::AMD::GpuAgent::GetInfo(hsa_agent_info_t, void*) const () from /opt/rocm-5.0.0/hip/lib/../../lib/libhsa-runtime64.so.1
#3 0x00007f79e01ba0ff in rocr::HSA::hsa_agent_get_info(hsa_agent_s, hsa_agent_info_t, void*) () from /opt/rocm-5.0.0/hip/lib/../../lib/libhsa-runtime64.so.1
#4 0x00007f79e0c45bb6 in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#5 0x00007f79e0c4738d in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#6 0x00007f79e0c47c79 in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#7 0x00007f79e0c02fde in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#8 0x00007f79e0c3af56 in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#9 0x00007f79e0a7bdfe in ?? () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#10 0x00007f79e0a9c365 in hipGetDeviceProperties () from /opt/rocm-5.0.0/hip/lib/libamdhip64.so.5
#11 0x0000000000201b09 in main (argc=<optimized out>, argv=<optimized out>) at helloworld.cpp:10