core dump in polaris with rocm opencl version 6.0 #95

Closed
BishopWolf opened this issue Feb 10, 2024 · 6 comments

@BishopWolf

[alex@alex-b450aoruselite ~]$ clinfo -v
clinfo version 3.0.21.02.21
-------------------------------------------------------------------------------------------------------------------------------- 09:46:00
[alex@alex-b450aoruselite ~]$ clinfo -a
free(): invalid pointer
Aborted (core dumped)

Additional info

[alex@alex-b450aoruselite ~]$ rocminfo
ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 7 2700X Eight-Core Processor
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 7 2700X Eight-Core Processor
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3700                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            16                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    32785788(0x1f4457c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32785788(0x1f4457c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    32785788(0x1f4457c) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx803                             
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 570 Series           
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 26591(0x67df)                      
  ASIC Revision:           1(0x1)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   1286                               
  BDFID:                   2048                               
  Internal Node ID:        1                                  
  Compute Unit:            32                                 
  SIMDs per CU:            4                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   1                                  
  WatchPts on Addr. Ranges:4                                  
  Coherent Host Access:    FALSE                              
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          64(0x40)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        40(0x28)                           
  Max Work-item Per CU:    2560(0xa00)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 730                                
  SDMA engine uCode::      58                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    8388608(0x800000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    8388608(0x800000) KB               
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx803          
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***             
@Oblomov
Owner

Oblomov commented Feb 10, 2024

Can you please run clinfo under gdb to see whether the issue is happening inside clinfo or in the AMD driver?

@BishopWolf
Author

@BishopWolf
Author

I am on Manjaro, btw.

@Oblomov
Owner

Oblomov commented Feb 10, 2024

I'm sorry, that's not the debug info I was looking for. Can you try `gdb --args clinfo`, then type `run` inside gdb, and when the program aborts type `bt` to show the backtrace? That should show where the issue happens.
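
For reference, the same gdb steps can probably be run non-interactively, which makes the backtrace easier to paste into an issue. This is only a sketch: `capture_backtrace` and `backtrace.txt` are hypothetical names chosen for the example, and it assumes `gdb` and `clinfo` are both on the `PATH`.

```shell
#!/bin/sh
# capture_backtrace is a hypothetical helper name for this sketch.
# -batch makes gdb exit once the -ex commands have run, in order:
# "run" starts the program, "bt" prints the backtrace after it stops.
# backtrace.txt is an arbitrary output file name.
capture_backtrace() {
    gdb -batch -ex run -ex bt --args "$@" 2>&1 | tee backtrace.txt
}

# Only attempt the run when both tools are installed.
if command -v gdb >/dev/null 2>&1 && command -v clinfo >/dev/null 2>&1; then
    capture_backtrace clinfo
else
    echo "gdb or clinfo not found; skipping" >&2
fi
```

The output is shown on the terminal and also saved to `backtrace.txt`, so the whole file can be attached as-is.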

@BishopWolf
Author

BishopWolf commented Feb 10, 2024

Here it is

$ gdb --args clinfo
GNU gdb (GDB) 14.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from clinfo...

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for /usr/bin/clinfo
(No debugging symbols found in clinfo)                                                                                                   
(gdb) run
Starting program: /usr/bin/clinfo 
Downloading separate debug info for /lib64/ld-linux-x86-64.so.2
Downloading separate debug info for system-supplied DSO at 0x7ffff7fc6000                                                                
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4processes-92d9f07e.so         
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4geometry-0c687114.so          
Downloading separate debug info for /opt/rocm/lib/libOpenCL.so.1                                                                         
Downloading separate debug info for /usr/lib/libdl.so.2                                                                                  
Downloading separate debug info for /usr/lib/libc.so.6       
[Thread debugging using libthread_db enabled]                                                                                            
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Downloading separate debug info for /usr/lib/libexpat.so.1
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4analysis-58c18685.so          
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4digits_hits-49ed91d8.so       
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4track-0f17a637.so             
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4materials-0a3fe0e7.so         
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4zlib-8a92f4d5.so              
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4particles-a2ab1706.so         
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4graphics_reps-f3db01e8.so     
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4intercoms-d770af8d.so         
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4global-a19c55f2.so            
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4clhep-170e676c.so             
Downloading separate debug info for /home/alex/.local/lib/python3.11/site-packages/opengate_core.libs/libG4ptl-124ac18e.so.2.3.3         
Downloading separate debug info for /usr/lib/libm.so.6                                                                                   
Downloading separate debug info for /usr/lib/libpthread.so.0                                                                             
Downloading separate debug info for /opt/rocm/lib/libamdocl64.so                                                                         
Downloading separate debug info for /opt/rocm/lib/libamd_comgr.so.2                                                                      
Downloading separate debug info for /opt/rocm/lib/libhsa-runtime64.so.1                                                                  
Downloading separate debug info for /usr/lib/libnuma.so.1                                                                                
Downloading separate debug info for /usr/lib/libz.so.1                                                                                   
Downloading separate debug info for /usr/lib/libzstd.so.1                                                                                
Downloading separate debug info for /usr/lib/libncursesw.so.6                                                                            
Downloading separate debug info for /opt/rocm/lib/libhsakmt.so.1                                                                         
Downloading separate debug info for /usr/lib/libelf.so.1                                                                                 
Downloading separate debug info for /usr/lib/libdrm.so.2                                                                                 
Downloading separate debug info for /usr/lib/libdrm_amdgpu.so.1                                                                          
[New Thread 0x7fffe9bff6c0 (LWP 839783)]                                                                                                 
[New Thread 0x7fffe93fe6c0 (LWP 839784)]
free(): invalid pointer
[Thread 0x7fffe93fe6c0 (LWP 839784) exited]

Thread 1 "clinfo" received signal SIGABRT, Aborted.
0x00007ffff5aab32c in ?? () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff5aab32c in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff5a5a6c8 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff5a424b8 in abort () from /usr/lib/libc.so.6
#3  0x00007ffff5a43395 in ?? () from /usr/lib/libc.so.6
#4  0x00007ffff5ab52a7 in ?? () from /usr/lib/libc.so.6
#5  0x00007ffff5ab75b4 in ?? () from /usr/lib/libc.so.6
#6  0x00007ffff5ab9e93 in free () from /usr/lib/libc.so.6
#7  0x00007ffff48aeaba in operator delete (ptr=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/del_op.cc:49
#8  0x00007ffff48aeaea in operator delete[] (ptr=<optimized out>) at /usr/src/debug/gcc/gcc/libstdc++-v3/libsupc++/del_opv.cc:35
#9  0x00007ffff451faad in roc::Device::~Device (this=<optimized out>, this=<optimized out>)
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/rocclr/device/rocm/rocdevice.cpp:279
#10 0x00007ffff45a127d in roc::Device::~Device (this=<optimized out>, this=<optimized out>)
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/rocclr/device/rocm/rocdevice.cpp:290
#11 std::default_delete<roc::Device>::operator() (__ptr=0x5555556f43a0, this=<optimized out>)
    at /usr/include/c++/13.2.1/bits/unique_ptr.h:99
#12 std::unique_ptr<roc::Device, std::default_delete<roc::Device> >::~unique_ptr (this=<optimized out>, this=<optimized out>)
    at /usr/include/c++/13.2.1/bits/unique_ptr.h:404
#13 roc::Device::init () at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/rocclr/device/rocm/rocdevice.cpp:530
#14 amd::Device::init () at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/rocclr/device/device.cpp:488
#15 amd::Runtime::init() [clone .isra.0] () at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/rocclr/platform/runtime.cpp:75
#16 0x00007ffff44ef2cd in std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<clIcdGetPlatformIDsKHR::{lambda()#1}>(std::once_flag&, clIcdGetPlatformIDsKHR::{lambda()#1}&&)::{lambda()#1}>(clIcdGetPlatformIDsKHR::{lambda()#1}&)::{lambda()#1}::_FUN() ()
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/amdocl/cl_icd.cpp:224
#17 0x00007ffff5aae6af in ?? () from /usr/lib/libc.so.6
#18 0x00007ffff44ef100 in __gthread_once (__func=<optimized out>, __once=0x7ffff45f2f70 <clIcdGetPlatformIDsKHR::initOnce>)
    at /usr/include/c++/13.2.1/x86_64-pc-linux-gnu/bits/gthr-default.h:700
#19 std::call_once<clIcdGetPlatformIDsKHR(cl_uint, _cl_platform_id**, cl_uint*)::<lambda()> > (__once=..., __f=...)
    at /usr/include/c++/13.2.1/mutex:907
#20 clIcdGetPlatformIDsKHR (num_entries=num_entries@entry=0, platforms=platforms@entry=0x0, 
    num_platforms=num_platforms@entry=0x7fffffffd4dc) at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/amdocl/cl_icd.cpp:274
#21 0x00007ffff7fb94d4 in khrIcdVendorAdd (libraryName=0x5555555eec20 "/opt/rocm/lib/libamdocl64.so")
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/khronos/icd/loader/icd.c:107
#22 0x00007ffff7fb9980 in khrIcdOsVendorsEnumerate ()
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/khronos/icd/loader/linux/icd_linux.c:129
#23 0x00007ffff5aae6af in ?? () from /usr/lib/libc.so.6
#24 0x00007ffff7fb9a45 in khrIcdOsVendorsEnumerateOnce ()
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/khronos/icd/loader/linux/icd_linux.c:163
#25 khrIcdInitialize () at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/khronos/icd/loader/icd.c:35
#26 clGetPlatformIDs (num_entries=0, platforms=0x0, num_platforms=0x7fffffffd630)
    at /usr/src/debug/rocm-opencl-runtime/clr-rocm-6.0.0/opencl/khronos/icd/loader/icd_dispatch.c:38
#27 0x000055555555b1e6 in ?? ()
#28 0x00007ffff5a43cd0 in ?? () from /usr/lib/libc.so.6
#29 0x00007ffff5a43d8a in __libc_start_main () from /usr/lib/libc.so.6
#30 0x000055555555b8be in ?? ()
(gdb) Quit
       

BishopWolf changed the title from "core dump in polaris with version 6.0" to "core dump in polaris with rocm opencl version 6.0" on Feb 10, 2024
@Oblomov
Owner

Oblomov commented Feb 10, 2024

From the backtrace, the issue is in the ROCm driver: the roc::Device destructor, called from roc::Device::init, is trying to delete an array that was probably never initialized (possibly because the GPU is not supported for OpenCL and the driver doesn't properly account for that; I don't know). You should report this issue to ROCm. You could try exporting the environment variable ROC_ENABLE_PRE_VEGA=1 to see if this forces the ROCm driver to use the Polaris GPU. Either way, this isn't an issue in clinfo but in the AMD OpenCL platform. There's not much I can do on this side, I'm afraid.
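
As a rough sketch of the suggestion above, the variable can be set for a single run instead of being exported for the whole session. `with_pre_vega` is a hypothetical helper name, not part of ROCm or clinfo, and whether the variable actually changes the behaviour on this GPU is untested here.

```shell
#!/bin/sh
# with_pre_vega is a hypothetical helper name for this sketch.
# env sets ROC_ENABLE_PRE_VEGA=1 for the child process only, so the
# current shell's environment is left unchanged.
with_pre_vega() {
    env ROC_ENABLE_PRE_VEGA=1 "$@"
}

# Only attempt the run when clinfo is installed; report a non-zero exit
# status (for example after an abort) instead of failing silently.
if command -v clinfo >/dev/null 2>&1; then
    with_pre_vega clinfo || echo "clinfo exited with status $?" >&2
else
    echo "clinfo not found; skipping" >&2
fi
```

If clinfo still aborts with the variable set, that result is worth including in the report to ROCm.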
