hiprtcCompileProgram(): HIPRTC_ERROR_COMPILATION #3918

Closed
Demontager opened this issue Dec 9, 2023 · 6 comments

@Demontager

Hi, I tried to run the current latest hashcat (v6.2.6-846-g4d412c8e0) on Ubuntu 20.04 and 22.04 and hit the same problem on both when running benchmarks.

$ sudo ./hashcat -m 5600 -b
hashcat (v6.2.6-846-g4d412c8e0) starting in benchmark mode

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

HIP API (HIP 5.7.31921)
=======================
* Device #1: AMD Radeon RX 6800, 16334/16368 MB, 30MCU

OpenCL API (OpenCL 2.1 AMD-APP (3590.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #2: AMD Radeon RX 6800, skipped

Benchmark relevant options:
===========================
* --backend-devices-virtual=1
* --optimized-kernel-enable

----------------------------
* Hash-Mode 5600 (NetNTLMv2)
----------------------------

hiprtcCompileProgram(): HIPRTC_ERROR_COMPILATION

lld: error: undefined hidden symbol: __ockl_get_group_id
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_size
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_id
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/dem/Documents/hashcat/comgr-d28482/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

* Device #1: Kernel /home/dem/Documents/hashcat/OpenCL/shared.cl build failed.

* Device #1: Kernel /home/dem/Documents/hashcat/OpenCL/shared.cl build failed.

Started: Sat Dec  9 17:57:13 2023
Stopped: Sat Dec  9 17:57:17 2023
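
All three undefined symbols (`__ockl_get_group_id`, `__ockl_get_local_size`, `__ockl_get_local_id`) are work-item query functions from the ROCm device libraries (OCKL), so the failure is in the link step rather than in hashcat's kernel source. As a minimal sketch for summarizing a log like the one above, the helper below prints each missing symbol once. The function name `undefined_symbols` and the file name `hashcat.log` are made up for illustration.

```shell
# Hypothetical helper: read hashcat output on stdin and print each
# symbol that lld reported as undefined, de-duplicated and sorted.
undefined_symbols() {
  sed -n 's/^lld: error: undefined hidden symbol: //p' | sort -u
}

# Usage (assumes the failing run was saved to hashcat.log):
#   undefined_symbols < hashcat.log
```

For the log above this would list the three `__ockl_*` names, which is a compact thing to paste when comparing against another ROCm version.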

To Reproduce
wget https://repo.radeon.com/amdgpu-install/5.7.2/ubuntu/focal/amdgpu-install_5.7.50702-1_all.deb
sudo dpkg -i amdgpu-install_5.7.50702-1_all.deb
sudo amdgpu-install --opencl=rocr
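
Since these steps pass only `--opencl=rocr` to the installer, it may help to record which ROCm-related packages actually ended up on the system before comparing with a working machine. This is a rough sketch for a Debian-based system; the function name and the package-name prefixes in the pattern are assumptions and may need adjusting.

```shell
# Hypothetical helper: filter "name version" lines on stdin down to
# packages whose names start with prefixes commonly used by ROCm.
# The prefix list is an assumption, not taken from this thread.
rocm_packages() {
  grep -E '^(comgr|hip-|hsa-rocr|rocm-)' | sort
}

# Usage:
#   dpkg-query -W -f='${Package} ${Version}\n' | rocm_packages
```

Running the same command on the failing and the working setup gives two short lists that can be compared directly.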

Hardware/Compute device:
AMD Radeon RX 6800

Hashcat version:

  • OS: Linux
  • Distribution: Ubuntu 20.04
  • Version: v6.2.6-846-g4d412c8e0

Diagnostic output compute devices:

$ sudo rocminfo 
ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3600                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    65845376(0x3ecb880) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65845376(0x3ecb880) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    65845376(0x3ecb880) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3600                               
  BDFID:                   0                                  
  Internal Node ID:        1                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    66033928(0x3ef9908) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    66033928(0x3ef9908) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    66033928(0x3ef9908) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 3                  
*******                  
  Name:                    gfx1030                            
  Uuid:                    GPU-fbceebc280595e18               
  Marketing Name:          AMD Radeon RX 6800                 
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 29631(0x73bf)                      
  ASIC Revision:           1(0x1)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2475                               
  BDFID:                   1024                               
  Internal Node ID:        2                                  
  Compute Unit:            60                                 
  SIMDs per CU:            2                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 115                                
  SDMA engine uCode::      83                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS:                     
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1030         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***             

$ clinfo
Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.1 AMD-APP (3590.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 0
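
`clinfo` was run without `sudo` here and reports 0 devices, while the `sudo` runs of `rocminfo` and hashcat do see the GPU. One possible explanation, not confirmed in this thread, is device-node permissions: AMD's ROCm install guide asks for the user to be in the `render` and `video` groups. A small sketch of that check follows; the function name is hypothetical.

```shell
# Hypothetical helper: take a space-separated group list (as printed by
# `id -nG`) and report whether both `render` and `video` are present.
has_gpu_groups() {
  local list=" $1 "
  local g
  for g in render video; do
    case "$list" in
      *" $g "*) ;;                                  # group present
      *) echo "missing group: $g"; return 1 ;;      # first missing group
    esac
  done
  echo "ok"
}

# Usage:
#   has_gpu_groups "$(id -nG)"
```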

$ sudo ./hashcat -I 
hashcat (v6.2.6-846-g4d412c8e0) starting in backend information mode

HIP Info:
=========

HIP.Version.: 5.7.31921

Backend Device ID #1 (Alias: #2)
  Name...........: AMD Radeon RX 6800
  Processor(s)...: 30
  Clock..........: 2475
  Memory.Total...: 16368 MB
  Memory.Free....: 16334 MB
  Local.Memory...: 64 KB
  PCI.Addr.BDFe..: 0000:04:00.0

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: Advanced Micro Devices, Inc.
  Name....: AMD Accelerated Parallel Processing
  Version.: OpenCL 2.1 AMD-APP (3590.0)

  Backend Device ID #2 (Alias: #1)
    Type...........: GPU
    Vendor.ID......: 1
    Vendor.........: Advanced Micro Devices, Inc.
    Name...........: AMD Radeon RX 6800
    Version........: OpenCL 2.0 
    Processor(s)...: 30
    Clock..........: 2475
    Memory.Total...: 16368 MB (limited to 13912 MB allocatable in one block)
    Memory.Free....: 16256 MB
    Local.Memory...: 64 KB
    OpenCL.Version.: OpenCL C 2.0 
    Driver.Version.: 3590.0 (HSA1.1,LC)
    PCI.Addr.BDF...: 04:00.0
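
The alias fields above show that HIP device #1 and OpenCL device #2 are the same physical GPU, which is why a benchmark run uses one and marks the other as skipped. A minimal sketch for pulling that mapping out of saved `hashcat -I` output is below. The function name is hypothetical and the parsing assumes the line format shown above.

```shell
# Hypothetical helper: read `hashcat -I` output on stdin and print one
# line per backend device: "<API> <device id> alias-of <alias id>".
backend_aliases() {
  awk '
    /^HIP Info:/    { api = "HIP" }
    /^OpenCL Info:/ { api = "OpenCL" }
    /Backend Device ID #[0-9]+ \(Alias: #[0-9]+\)/ {
      id = $4                       # e.g. "#1"
      alias = $6                    # e.g. "#2)"
      sub(/\)$/, "", alias)         # drop the closing parenthesis
      print api, id, "alias-of", alias
    }
  '
}

# Usage:
#   ./hashcat -I | backend_aliases
```

hashcat also has `--backend-ignore-hip` and `--backend-ignore-opencl` options, which may be useful for isolating whether the failure is specific to the HIP backend.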
@Demontager
Author

I tried a different Ubuntu Server 22.04.3 LTS setup and it works now.
I used these commands to install the AMD drivers and ROCr:

wget https://repo.radeon.com/amdgpu-install/5.7.2/ubuntu/jammy/amdgpu-install_5.7.50702-1_all.deb
sudo apt install ./amdgpu-install_5.7.50702-1_all.deb 
sudo amdgpu-install --opencl=rocr --no-dkms --no-32

Kernel

uname -a
Linux xeon 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

rocminfo

~$ sudo rocminfo 
ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
DMAbuf Support:          NO

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3600                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    65893880(0x3ed75f8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65893880(0x3ed75f8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    65893880(0x3ed75f8) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      32768(0x8000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   3600                               
  BDFID:                   0                                  
  Internal Node ID:        1                                  
  Compute Unit:            20                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    65985328(0x3eedb30) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65985328(0x3eedb30) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    65985328(0x3eedb30) KB             
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 3                  
*******                  
  Name:                    gfx1030                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 6800                 
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    2                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 29631(0x73bf)                      
  ASIC Revision:           1(0x1)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2475                               
  BDFID:                   1024                               
  Internal Node ID:        2                                  
  Compute Unit:            60                                 
  SIMDs per CU:            2                                  
  Shader Engines:          4                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Packet Processor uCode:: 94                                 
  SDMA engine uCode::      81                                 
  IOMMU Support::          None                               
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GLOBAL; FLAGS:                     
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 3                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1030         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                                 
*** Done ***             
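
Compared with the first `rocminfo` capture, this one differs in a few fields: DMAbuf Support is NO instead of YES, Packet Processor uCode is 94 instead of 115, and SDMA engine uCode is 81 instead of 83. Whether any of these relate to the compile failure is not established in this thread. As a sketch for spotting such differences between two saved captures, with hypothetical function and file names:

```shell
# Hypothetical helper: print selected rocminfo fields from a saved
# capture, with runs of whitespace collapsed so lines compare cleanly.
rocminfo_fields() {
  grep -E 'DMAbuf Support|Packet Processor uCode|SDMA engine uCode' "$1" |
    sed -e 's/[[:space:]]\{1,\}/ /g' -e 's/^ //' -e 's/ $//'
}

# Hypothetical helper: show the selected fields of two captures, or say
# that they match.
rocminfo_diff() {
  local a b
  a=$(rocminfo_fields "$1")
  b=$(rocminfo_fields "$2")
  if [ "$a" = "$b" ]; then
    echo "no differences"
  else
    printf 'first:\n%s\nsecond:\n%s\n' "$a" "$b"
  fi
}

# Usage (assumes both outputs were saved beforehand):
#   rocminfo_diff rocminfo-failing.txt rocminfo-working.txt
```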

Benchmark

./hashcat -b
hashcat (v6.2.6-846-g4d412c8e0) starting in benchmark mode

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

hipMemGetInfo(): 1

HIP API (HIP 5.7.31921)
=======================
* Device #1: AMD Radeon RX 6800, skipped

OpenCL API (OpenCL 2.1 AMD-APP (3590.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #2: AMD Radeon RX 6800, 16256/16368 MB (13912 MB allocatable), 30MCU

Benchmark relevant options:
===========================
* --backend-devices-virtual=1
* --optimized-kernel-enable

-------------------
* Hash-Mode 0 (MD5)
-------------------

Speed.#2.........: 42241.8 MH/s (23.28ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

----------------------
* Hash-Mode 100 (SHA1)
----------------------

Speed.#2.........: 16414.5 MH/s (60.77ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

---------------------------
* Hash-Mode 1400 (SHA2-256)
---------------------------

Speed.#2.........:  7015.9 MH/s (71.11ms) @ Accel:64 Loops:1024 Thr:256 Vec:1

---------------------------
* Hash-Mode 1700 (SHA2-512)
---------------------------

Speed.#2.........:  1638.0 MH/s (76.20ms) @ Accel:32 Loops:512 Thr:256 Vec:1

-------------------------------------------------------------
* Hash-Mode 22000 (WPA-PBKDF2-PMKID+EAPOL) [Iterations: 4095]
-------------------------------------------------------------

Speed.#2.........:   832.9 kH/s (72.22ms) @ Accel:64 Loops:512 Thr:256 Vec:1

-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------

Speed.#2.........: 75808.8 MH/s (12.70ms) @ Accel:256 Loops:1024 Thr:128 Vec:1

---------------------
* Hash-Mode 3000 (LM)
---------------------

Speed.#2.........: 42900.4 MH/s (22.84ms) @ Accel:256 Loops:1024 Thr:128 Vec:1

--------------------------------------------
* Hash-Mode 5500 (NetNTLMv1 / NetNTLMv1+ESS)
--------------------------------------------

Speed.#2.........: 48369.9 MH/s (20.28ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

----------------------------
* Hash-Mode 5600 (NetNTLMv2)
----------------------------

Speed.#2.........:  2926.1 MH/s (85.36ms) @ Accel:64 Loops:512 Thr:256 Vec:1

--------------------------------------------------------
* Hash-Mode 1500 (descrypt, DES (Unix), Traditional DES)
--------------------------------------------------------

Speed.#2.........:  1607.1 MH/s (77.62ms) @ Accel:16 Loops:1024 Thr:256 Vec:1

------------------------------------------------------------------------------
* Hash-Mode 500 (md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5)) [Iterations: 1000]
------------------------------------------------------------------------------

Speed.#2.........: 11888.6 kH/s (75.71ms) @ Accel:128 Loops:1000 Thr:256 Vec:1

----------------------------------------------------------------
* Hash-Mode 3200 (bcrypt $2*$, Blowfish (Unix)) [Iterations: 32]
----------------------------------------------------------------

Speed.#2.........:    42646 H/s (86.13ms) @ Accel:32 Loops:8 Thr:16 Vec:1

--------------------------------------------------------------------
* Hash-Mode 1800 (sha512crypt $6$, SHA512 (Unix)) [Iterations: 5000]
--------------------------------------------------------------------

Speed.#2.........:   292.6 kH/s (88.57ms) @ Accel:2048 Loops:512 Thr:128 Vec:1

--------------------------------------------------------
* Hash-Mode 7500 (Kerberos 5, etype 23, AS-REQ Pre-Auth)
--------------------------------------------------------

Speed.#2.........:   836.8 MH/s (74.61ms) @ Accel:512 Loops:128 Thr:32 Vec:1

-------------------------------------------------
* Hash-Mode 13100 (Kerberos 5, etype 23, TGS-REP)
-------------------------------------------------

Speed.#2.........:   807.0 MH/s (77.39ms) @ Accel:64 Loops:1024 Thr:32 Vec:1

---------------------------------------------------------------------------------
* Hash-Mode 15300 (DPAPI masterkey file v1 (context 1 and 2)) [Iterations: 23999]
---------------------------------------------------------------------------------

Speed.#2.........:   143.7 kH/s (72.07ms) @ Accel:64 Loops:512 Thr:256 Vec:1

---------------------------------------------------------------------------------
* Hash-Mode 15900 (DPAPI masterkey file v2 (context 1 and 2)) [Iterations: 12899]
---------------------------------------------------------------------------------

Speed.#2.........:    61672 H/s (78.45ms) @ Accel:128 Loops:128 Thr:128 Vec:1

------------------------------------------------------------------
* Hash-Mode 7100 (macOS v10.8+ (PBKDF2-SHA512)) [Iterations: 1023]
------------------------------------------------------------------

Speed.#2.........:   773.7 kH/s (73.40ms) @ Accel:128 Loops:63 Thr:256 Vec:1

---------------------------------------------
* Hash-Mode 11600 (7-Zip) [Iterations: 16384]
---------------------------------------------

Speed.#2.........:   842.8 kH/s (67.10ms) @ Accel:32 Loops:4096 Thr:256 Vec:1

------------------------------------------------
* Hash-Mode 12500 (RAR3-hp) [Iterations: 262144]
------------------------------------------------

Speed.#2.........:   122.1 kH/s (61.62ms) @ Accel:16 Loops:16384 Thr:256 Vec:1

--------------------------------------------
* Hash-Mode 13000 (RAR5) [Iterations: 32799]
--------------------------------------------

Speed.#2.........:    84181 H/s (90.55ms) @ Accel:64 Loops:512 Thr:256 Vec:1

--------------------------------------------------------------------------------
* Hash-Mode 6211 (TrueCrypt RIPEMD160 + XTS 512 bit (legacy)) [Iterations: 1999]
--------------------------------------------------------------------------------

Speed.#2.........:   535.1 kH/s (55.87ms) @ Accel:128 Loops:128 Thr:128 Vec:1

-----------------------------------------------------------------------------------
* Hash-Mode 13400 (KeePass 1 (AES/Twofish) and KeePass 2 (AES)) [Iterations: 24569]
-----------------------------------------------------------------------------------

Speed.#2.........:    70559 H/s (69.44ms) @ Accel:512 Loops:128 Thr:64 Vec:1

-------------------------------------------------------------------
* Hash-Mode 6800 (LastPass + LastPass sniffed) [Iterations: 100099]
-------------------------------------------------------------------

Speed.#2.........:    28060 H/s (89.07ms) @ Accel:64 Loops:512 Thr:256 Vec:1

--------------------------------------------------------------------
* Hash-Mode 11300 (Bitcoin/Litecoin wallet.dat) [Iterations: 200459]
--------------------------------------------------------------------

Speed.#2.........:     8179 H/s (76.17ms) @ Accel:64 Loops:256 Thr:256 Vec:1

Started: Sat Dec  9 19:16:29 2023
Stopped: Sat Dec  9 19:23:20 2023

@sjnewbury

@Demontager if you look carefully, in your first example the OpenCL backend was skipped, and in the second the HIP backend was. So this issue is still valid: it is the HIP backend that fails to work.
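
To compare the two cases directly, one backend can be excluded per run. A minimal sketch, assuming a hashcat build that provides the --backend-ignore-hip and --backend-ignore-opencl options and that the binary sits at ./hashcat as in the report. The HASHCAT variable and the bench_cmd helper are made up for illustration, and the script only prints the commands rather than running them:

```shell
#!/bin/sh
# Path to the hashcat binary; ./hashcat is an assumption taken from the report.
HASHCAT="${HASHCAT:-./hashcat}"

# bench_cmd is a hypothetical helper: it prints (does not run) a benchmark
# command for hash-mode 5600 with one backend excluded.
# $1: backend to ignore, "hip" or "opencl"
bench_cmd() {
  printf '%s -b -m 5600 --backend-ignore-%s\n' "$HASHCAT" "$1"
}

bench_cmd opencl   # OpenCL excluded, so the HIP backend is exercised
bench_cmd hip      # HIP excluded, so the OpenCL backend is exercised
```

Running the two printed commands one after the other should show whether the compile error is tied to the HIP backend alone.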

The "problem" is that the __ockl_get_group_id and __ockl_get_group_size functions are defined in ockl.bc from the amdgcn device libs, and the linker cannot resolve them. Or maybe the real problem is that ockl is now needed at all? The patch below fixes hashcat by linking that library.

--- a/src/backend.c
+++ b/src/backend.c
@@ -8754,8 +8754,8 @@ static bool load_kernel (hashcat_ctx_t *hashcat_ctx, hc_device_param_t *device_p
       hiprtc_options[5] = "-I";
       */
 
-      hiprtc_options[1] = "-nocudainc";
-      hiprtc_options[2] = "-nocudalib";
+      hiprtc_options[1] = "-lockl";
+      hiprtc_options[2] = "";
       hiprtc_options[3] = "";
       hiprtc_options[4] = "";
 

That said, I'm not sure why this is needed now. I opened an issue in ROCm-Device-Libs, but they consider it a hashcat issue.
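
Before patching, it may be worth confirming that ockl.bc is actually installed. A rough sketch, assuming ROCm lives under /opt/rocm or wherever ROCM_PATH points (both are assumptions about the install, and find_ockl is a hypothetical helper, not part of hashcat or ROCm):

```shell
#!/bin/sh
# find_ockl: print the first ockl.bc found under the given prefix, if any.
# $1: ROCm install prefix to search
find_ockl() {
  # -H makes find follow the prefix itself when it is a symlink
  # (e.g. a versioned install directory behind /opt/rocm).
  find -H "$1" -name 'ockl.bc' 2>/dev/null | head -n 1
}

# Default prefix is an assumption; override with ROCM_PATH if needed.
prefix="${ROCM_PATH:-/opt/rocm}"
ockl_path="$(find_ockl "$prefix")"

if [ -n "$ockl_path" ]; then
  echo "found device library: $ockl_path"
else
  echo "ockl.bc not found under $prefix"
fi
```

If the file is missing, the undefined __ockl_* symbols would be expected regardless of the options hashcat passes, so the device libs package would be the first thing to check.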

@fxzjshm

fxzjshm commented Feb 19, 2024

Also encountered this on ROCm 6.0.0. @sjnewbury, could you please make a PR for this?

@matrix
Member

matrix commented Feb 21, 2024

I am using ROCm 6.0.2 on Linux and everything is working perfectly.

@fxzjshm

fxzjshm commented Feb 22, 2024

Upgraded to ROCm 6.0.2 and this issue is fixed. Thanks for the info.

@MathiasDeWeerdt

MathiasDeWeerdt commented Mar 29, 2024

I have a similar issue.

hiprtcCompileProgram(): HIPRTC_ERROR_COMPILATION

lld: error: undefined hidden symbol: __ockl_get_group_id
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_size
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_id
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /home/mathias/.local/share/hashcat/comgr-22df78/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

* Device #1: Kernel /usr/local/share/hashcat/OpenCL/shared.cl build failed.

* Device #1: Kernel /usr/local/share/hashcat/OpenCL/shared.cl build failed.

System info

Linux mathias-ubuntu-desktop-pc 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
===================================== ROCm System Management Interface =====================================
=============================================== Concise Info ===============================================
Device  [Model : Revision]    Temp    Power  Partitions      SCLK    MCLK   Fan  Perf  PwrCap  VRAM%  GPU%  
        Name (20 chars)       (Edge)  (Avg)  (Mem, Compute)                                                 
============================================================================================================
0       [0x440e : 0xc0]       39.0°C  14.0W  N/A, N/A        500Mhz  96Mhz  0%   auto  289.0W    3%   2%    
        Navi 21 [Radeon RX 6                                                                                
============================================================================================================
=========================================== End of ROCm SMI Log ============================================
amdgpu-dkms version: 1:6.3.6.60002-1718217.22.04
rocm version: 6.0.2.60002-115~22.04
hashcat version: hashcat (v6.2.6-850-gfafb277e0) 
hashcat (v6.2.6-850-gfafb277e0) starting in backend information mode

The device #2 specifically listed was skipped because it is an alias of device #1

HIP Info:
=========

HIP.Version.: 6.0.32831

Backend Device ID #1 (Alias: #2)
  Name...........: AMD Radeon RX 6900 XT
  Processor(s)...: 40
  Clock..........: 2660
  Memory.Total...: 16368 MB
  Memory.Free....: 16312 MB
  Local.Memory...: 64 KB
  PCI.Addr.BDFe..: 0000:03:00.0

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: Advanced Micro Devices, Inc.
  Name....: AMD Accelerated Parallel Processing
  Version.: OpenCL 2.1 AMD-APP (3602.0)

  Backend Device ID #2 (Alias: #1)
    Type...........: GPU
    Vendor.ID......: 1
    Vendor.........: Advanced Micro Devices, Inc.
    Name...........: AMD Radeon RX 6900 XT
    Version........: OpenCL 2.0 
    Processor(s)...: 40
    Clock..........: 2660
    Memory.Total...: 16368 MB (limited to 13912 MB allocatable in one block)
    Memory.Free....: 16256 MB
    Local.Memory...: 64 KB
    OpenCL.Version.: OpenCL C 2.0 
    Driver.Version.: 3602.0 (HSA1.1,LC)
    PCI.Addr.BDF...: 03:00.0
